Speech Synthesizer 5.0 Serial Key


TMS5220
Pinout
Installation inside the PE-box
Internal structure
Speech synthesis logic
D/A conversion


Speech ROMs
Pinout
Operation
Speech ROMs in the TI module

Operating the Speech Synthesizer
Commands
Speech encoding
Lookup tables

Timing diagrams
Electrical characteristics

Introduction

The speech synthesizer module is a stand-alone unit that fits in between the console and the peripheral connection cable (if any). It contains a TMS5220 speech synthesis chip and two TMS6100 serial ROMs that hold a fairly limited vocabulary. Here are some pictures.

The TMS5220 synthesizer chip can receive speech data either from the serial ROMs or directly from the CPU. It contains a 16-byte parallel-in / 128-bit serial-out FIFO buffer for the latter purpose. It interacts with the CPU via three registers: the Command register (input), the Data register (output) and the Status register (output).

The speech synthesis logic uses the serial data to generate digital speech that is accessible on the I/O pin. This signal is fed to an internal digital-to-analog converter to produce an analog signal on pin SPEAKER, which can be used to drive a speaker. In the case of the TI-99/4A, the analog signal is sent to the TMS9919 sound chip inside the console.


Pinout

Power supply
Vdd -5V (drain supply voltage)
Vss +5V (substrate supply voltage)
Vref 0V (ground reference voltage)

Upon power-up, internal circuitry ensures a clear condition 95% of the time, provided Vss-Vdd reaches +10 volts in less than 2 milliseconds. This is done by issuing an internal 'Reset' command, which lasts 15 milliseconds. To ensure a 100% clear reset condition, the software can send nine >FF bytes to the synthesizer, followed by a 'Reset' command (>Fx).

System interface
D0-D7 Data bus. D0 is the most significant bit (weight >80), D7 is the least significant bit (weight >01). In the TI module, these lines are connected to the data bus present on the side port, pins #34-40 and 43.

RS* Read select. This input pin goes low when the CPU wants to read data from the synthesizer.

WS* Write select. This input pin goes low when the CPU wants to write data to the synthesizer.

If both RS* and WS* are high, the synthesizer outputs are in a high-impedance state.

If both RS* and WS* go low, results are unpredictable. This never occurs in the TI module, as the RS* and WS* signals come from a 74LS138 decoder, which by definition can never bring more than one output low.

The TI-99/4A console generates a signal on pin #2 of the side port, to be used when the speech synthesizer is accessed. This signal decodes A0-A5, A15, DBIN and MEMEN*, and is active (high) for read operations at even addresses in the range >9000-93FE and write operations at even addresses in the range >9400-97FE.
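
The selection rule quoted above can be written as a small predicate. This is only a minimal Python sketch of the address ranges given in the text, not a model of the actual gate-level decoding; the function name `speech_selected` and its arguments are hypothetical.

```python
def speech_selected(address: int, is_read: bool) -> bool:
    """Return True if a memory access would select the speech synthesizer.

    Models only the ranges quoted in the text: reads at even addresses
    in >9000->93FE, writes at even addresses in >9400->97FE.
    """
    if address & 1:
        # Odd addresses never select the device.
        return False
    if is_read:
        return 0x9000 <= address <= 0x93FE
    return 0x9400 <= address <= 0x97FE
```

For example, `speech_selected(0x9000, True)` is true (a status read), while `speech_selected(0x9400, True)` is false because reads do not reach >9400.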

Inside the synthesizer module, there is a 74LS138 decoder that combines this signal with A15/CRUOUT to react only to even addresses (although this was already taken care of in the console) and with the RESET* line. The decoder uses address line A5 to distinguish read operations from write operations and triggers WS* or RS* on the TMS5220.

READY* The speech synthesizer is a very slow device. All I/O operations require halting the CPU until the synthesizer is done with reading/writing data on D0-D7. The READY* pin is used for that purpose: it goes high 100 ns after RS* or WS* goes low, to signal that the device is not ready. In the TI module, the READY* pin controls a 2N3904 transistor that connects the READY* line of the side port (pin #12) to ground when conducting.

INT* This output pin goes active low when the TS (talk status) status bit turns zero. If TS turns one during a read cycle, the INT* pin will go low after the cycle has been completed. It will also go low during a Speak External command if the BL (buffer low) or BE (buffer empty) bits turn zero. This pin is not connected in the TI module.

Speech memory interface
ADD1-ADD8/DATA These four output pins are used to send an address to the external speech memories. The address is sent as 5 nibbles, and includes a chip selection code. ADD8/DATA also serves as an input pin to read data from the memory.

M0, M1 These two output pins carry command bits to the speech ROMs. M1 is pulsed high five times to pass an address to the ROMs, M0 is pulsed high to read subsequent bits from the ROM. Pulsing both M0 and M1 high causes an internal read-and-branch in the ROM.

ROMCLK This output pin is used to synchronize operations when accessing the speech ROMs. It is derived from the OSC signal, divided by 4 (ROMCLK corresponds to phi2).

Sound interface
SPEAKER This is the output of the digital-to-analog converter. It carries an analog sound signal from 0 to 1.5 mA, with a resolution of 5.9 uA. In the TI module, this pin is grounded via a 1.8K resistor (so as to generate a voltage across it), and is filtered with a parallel 0.22 uF cap. It also goes to the side port, pin #44, via a 1 uF series cap. Inside the console, it goes through a 330 Ohm resistor to the AUDIOIN pin of the TMS9919 sound chip.

I/O The digital sound data, upstream of the D/A converter, can be read on this pin. The data is in the form of a signed 10-bit value, synchronized by ROMCLK. This pin is not connected in the TI module.

T11 This pin signals that data will be available on the I/O pin. The signal remains active (high) for two pulses, and data appears on the I/O pin on the next pulse after T11 goes low. This pin is not connected in the TI module.


Clock interface
OSC This input pin provides the clock signal used by the synthesizer. It is generally connected to an RC circuit tuned to 640 kHz (which provides an internal sample rate of 8 kHz and a ROMCLK signal of 160 kHz) or to 800 kHz (which results in a sample rate of 10 kHz and a ROMCLK signal of 200 kHz). The manual recommends an 80-100 kOhm resistor to select 10 kHz and a 120-200 kOhm resistor to select 8 kHz. It also advises a 10 pF shunt capacitor in parallel with the resistor to filter out noise. In the TI speech module, there are 3 resistors in parallel that can be cut out to select the proper frequency (and no shunt cap). In my module one of the resistors has been removed and the remaining two add up to a resistance of 209 kOhm.

The OSC signal is internally divided by 4 to generate four phase clocks: PHI1 (major phase), PHI2 (major phase, ROMCLK), PHI3 (pre-load for PHI1) and PHI4 (pre-load for PHI2).

Alternatively, OSC could be connected to a 320 kHz ceramic resonator, whose other pin is connected to Vss (-5 Volts). However, this option must be enabled during manufacture of the device and is therefore not accessible to us. It is possible, however, to feed a 320 kHz square-wave (0 / +5 volts) clock signal to OSC if PROMOUT is connected to Vss.

PROMOUT This pin is for test purposes and is normally not connected. If it is forced to -5 V, it disables the internal oscillator, so that OSC accepts an external clock signal.

TEST This pin is for test purposes and must not be connected.


Putting the synthesizer inside the PE-box

Having the speech synthesizer and the 'firehose' PE-box cable daisy-chained on the console side port is a bit tricky. First it takes space, then it's prone to poor contact that may result in losing all your work if you accidentally bump this assembly.

For this reason, a quick hack was designed by Joe Spiegel to let you install the speech synthesizer board inside the PE-box. It does require some additional circuitry though, since the PE-box bus does not exactly match the console side port.

First, the power supply is unregulated in the PE-box. So the connection card must carry two voltage regulators: a 7805 provides +5 volts (also used by the additional TTL chips) and a 7905 provides -5 volts.

Then the PE-box bus does not contain the SBE selection signal that indicates access to the speech synthesizer. The adapter board must thus carry the necessary logic.

Contrary to TI requirements, many address lines are not buffered in the adapter board. This is because they only drive one chip, so it doesn't make any difference whether this chip is a buffer or something else... Lines that are used for more than one purpose (A5, A15 and DBIN) are buffered by three gates of a 74LS367 tri-state buffer. Obviously, these gates are permanently enabled. Finally, some lines are not used by the adapter board and go directly to the synthesizer: READY, RESET* and AUDIOIN.

In the schematics below, a >-- denotes a line from the PE-box bus, whereas a --< denotes a pin of the speech synthesizer board.

Address decoding is performed by two 74LS138 decoders that react to memory operations in the range >90xx-94xx. A15 is included to make sure that only even addresses are taken into account. A5 and DBIN are combined with a 74LS00 NAND gate to make sure that read operations do not reach >9400. This exactly mimics the circuitry found inside the console, and described above.

Two outputs of the second decoder react to >90xx and >94xx respectively. They are combined with a 74LS00 NAND gate to provide the active-high selection signal SBE to the speech synthesizer. Two 1N914 diodes mounted as a 'wired-and' play the same role to provide an active-low signal that enables the 74LS245 data bus buffer (whose direction is set by DBIN) and a 74LS367 tri-state buffer. This buffer makes the DRBENA* line go low to activate the data bus buffers in the connection card and cable.

I haven't tried this circuit myself, but I was told that it works...


Internal structure

General organisation


The status register
This output register contains only 3 relevant bits:

TS  BL  BE  .  .  .  .  .


TS Talk Status. Bit 0, weight >80. This bit is 1 when the synthesizer is processing data. This occurs immediately after a 'Speak' command or 50 usec after nine bytes were loaded by a 'Speak-External' command. TS goes back to 0 after the stop code (energy=1111) is encountered, when the FIFO becomes empty or in case of a reset. In the first two cases, the audio output is interpolating toward zero during the current frame and will only terminate at the next frame.

BL Buffer Low. Bit 1, weight >40. This bit becomes 1 during a 'Speak-External' command, when the number of bytes in the FIFO decreases below 8. It reverts to zero 50 usec after a ninth byte is written into the FIFO.

BE Buffer Empty. Bit 2, weight >20. This bit becomes 1 during a 'Speak-External' command, when there are no more bytes in the FIFO. This clears TS, terminates speech (at some abnormal point) and redirects incoming bytes to the command register.
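
As an illustration of the bit weights listed above, here is a minimal Python sketch that unpacks a status byte. The names `TS`, `BL`, `BE` and `decode_status` are chosen for this example only; the masks are the weights given in the text.

```python
TS = 0x80  # Talk Status  (bit 0 in TI numbering, weight >80)
BL = 0x40  # Buffer Low   (bit 1, weight >40)
BE = 0x20  # Buffer Empty (bit 2, weight >20)


def decode_status(status: int) -> dict:
    """Split a status byte into its three relevant flags."""
    return {
        "TS": bool(status & TS),
        "BL": bool(status & BL),
        "BE": bool(status & BE),
    }
```

The five low-order bits are ignored, since only the top three carry information.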

The data register
This output register is organised as a serial-in/parallel-out buffer. It serves to hold data transferred from the speech ROM and pass it to the CPU as a single byte ('Read' command). The last bit transferred is the rightmost, least significant one.

The FIFO buffer
This input buffer is organized as a 16-byte parallel-in/128-bit serial-out stack, obeying first-in, first-out logic. It is used to hold data passed bytewise by the CPU for the 'Speak-External' command. The synthesizer shifts out bits as it needs them to create speech. Once 8 bits have been used, the stack ripples down by one byte and begins shifting bits out of the second 'first in' byte. An internal stack pointer keeps track of the 'last in' byte, so that the synthesizer knows where to put incoming bytes. The position of this pointer within the stack is reflected in the status register BL and BE bits.

The command register
This input register is used to internally latch the command passed by the CPU. There are 7 possible commands: Reset, Load-address, Read-byte, Read-and-branch, Speak, Speak-external and Load-frame-rate. The latter is only available on the more advanced TMS5220C synthesizer; on the TMS5220 it is considered a NOP (no operation) command.


Speech synthesis logic

  1. Speech data is sent to the speech synthesis logic in the form of coded parameters, either from the FIFO (data sent by the CPU) or from the ROM control logic (data fetched from the ROMs).
  2. Data are fed serially into the parameter input register.
  3. Data are unpacked and various tests are performed: is the repeat bit set? Is pitch 0? Is energy 0?
  4. Unpacked parameters are stored in a parameter RAM.
  5. Saved parameters are used as indexes to fetch the appropriate 10-bit values from the lookup ROMs.
  6. The outputs of the lookup ROMs are the target values for the interpolation logic to reach during this frame period. A frame period lasts 25 msec with a 640 kHz oscillator, 20 msec at 800 kHz.
  7. The interpolator reaches the new values in eight steps. Each one of these eight interpolation periods lasts 25 sample periods, which is 3.125 msec (assuming a 640 kHz oscillator). Speech data is read (1) on the first of the eight.
  8. After each interpolation period, the interpolator sends new pitch and energy parameters to the signal generator, and new reflection parameters (K1-K10) to the LPC lattice network.
  9. The signal generator produces the filter excitation sequence for both voiced and unvoiced frames.
  10. At the end of each sample period, digitized speech data are available on the I/O pin and to the D/A converter. A sample period is defined as 20 ROMCLK periods, which is 125 usec with a 640 kHz oscillator. This corresponds to a sampling rate of 8 kHz.

Timing               10 kHz       8 kHz
Oscillator rate      800 kHz      640 kHz
Osc period           1.25 usec    1.5625 usec
ROMCLK rate          200 kHz      160 kHz
ROMCLK period        5 usec       6.25 usec
Sample rate          10 kHz       8 kHz
Sample period        100 usec     125 usec
Interpolation rate   400 Hz       320 Hz
Interpol. period     2.5 msec     3.125 msec
Frame rate           50 Hz        40 Hz
Frame period         20 msec      25 msec
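
The figures in the timing table all follow from the oscillator frequency by fixed divisions. The Python sketch below reproduces them; the divider ratios (4, 20, 25 and 8) are inferred from the table values rather than quoted from the data manual, and the function name `tms5220_timing` is made up for this example.

```python
def tms5220_timing(osc_hz: float) -> dict:
    """Derive the clock rates of the timing table from the oscillator rate."""
    romclk_hz = osc_hz / 4        # OSC is divided by 4 to give ROMCLK
    sample_hz = romclk_hz / 20    # ratio inferred: 160 kHz -> 8 kHz
    interp_hz = sample_hz / 25    # ratio inferred: 8 kHz -> 320 Hz
    frame_hz = interp_hz / 8      # eight interpolation steps per frame
    return {
        "romclk_hz": romclk_hz,
        "sample_hz": sample_hz,
        "interp_hz": interp_hz,
        "frame_hz": frame_hz,
        "frame_period_ms": 1000.0 / frame_hz,
    }
```

Calling `tms5220_timing(640_000)` gives the 8 kHz column, and `tms5220_timing(800_000)` the 10 kHz column.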

D/A conversion

The digital speech signal available on the I/O pin is converted into an analog signal available on the SPEAKER pin by an internal D/A converter with a 2% LSB linearity resolution. Every sample period (125 usec with a 640 kHz oscillator), the most-significant 10 bits of the 14-bit LPC lattice network output are sampled. The seven least-significant bits are sent directly to the D/A converter, together with the sign bit (most significant bit). The remaining two bits, YC and YB, are combined with the sign bit and used to clip the driver to a 'full-on' or 'full-off' condition.

The resulting output is a current from 0 to 1.5 milliamps with a resolution of 5.9 microamps, which is optimal to drive the TMS9919 sound chip. The 1.8K resistor to ground causes the SPEAKER pin to deliver 2.7 volts when the lattice output is less than -127. When the lattice output is greater than 128, SPEAKER is clipped to 0 volts. When no speech takes place, the lattice output is -1, which causes the SPEAKER output to drive 750 microamps; this is meant for an AC-coupled speaker.

            LPC lattice outputs           D/A input   Analog out
Value       YD-YB  YA-Y3     Y2-Y0                    (uA)
> +127      011    xxxxxxxx  xxx          11111111    0
> +127      010    xxxxxxxx  xxx          11111111    0
> +127      001    xxxxxxxx  xxx          11111111    0
+127        000    11111111  xxx          11111111    0
+126        000    11111110  xxx          11111110    5.86
etc.        000    ........  xxx          ......
+1          000    00000001  xxx          10000001    738.0
0           000    00000000  xxx          10000000    744.0
-1 (off)    111    11111111  xxx          01111111    750.0
-2          111    11111110  xxx          01111110    755.8
etc.        111    ........  xxx          ......
-128        111    00000000  xxx          00000000    1500
< -128      110    xxxxxxxx  xxx          00000000    1500
< -128      101    xxxxxxxx  xxx          00000000    1500
< -128      100    xxxxxxxx  xxx          00000000    1500
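
The analog-out column is close to a straight line: each step down from +127 adds roughly 5.86 uA. The Python sketch below approximates that relationship and the voltage it produces across the 1.8K resistor. It is a rough fit to the table values, not a description of the converter circuit; the step size is read off the +126 row, and the names `speaker_current_ua` and `speaker_voltage` are invented for this example.

```python
STEP_UA = 5.86      # current per step, taken from the +126 row of the table
LOAD_OHMS = 1800    # the 1.8K resistor between SPEAKER and ground


def speaker_current_ua(lattice_value: int) -> float:
    """Approximate SPEAKER current (in microamps) for a lattice output."""
    # Values outside -128..+127 are clipped to full-on or full-off.
    clipped = max(-128, min(127, lattice_value))
    return (127 - clipped) * STEP_UA


def speaker_voltage(lattice_value: int) -> float:
    """Approximate voltage across the 1.8K load resistor, in volts."""
    return speaker_current_ua(lattice_value) * 1e-6 * LOAD_OHMS
```

With these numbers the idle value -1 gives about 750 uA, and full scale comes out near 1.5 mA, or about 2.7 volts across the resistor, in line with the figures quoted in the text.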


Speech ROMs

The data manual for the TMS5220C specifies that the serial memories can be TMS6100, TMS6125 or custom speech ROMs or EPROMs. In the TI module, there are two TMS6100 custom chips that are piggy-backed, i.e. mounted on top of each other with each and every pin on one chip connected to the corresponding pin on the other.

Internally, the TMS6100 is organized as 16 Kbytes, but some logic was added inside the chip so that it appears as 128 Kbits. Firstly, the address can be latched into an internal counter. Since the bus is only 4 bits wide, the address is passed as five consecutive chunks, for a total of 20 bits. The logic inside the ROM remembers how many chunks were passed and where to place the next nibble. Reading operations reset this logic.

Bits are read one at a time from the speech ROM. To this end, the ROM contains a data register into which it copies the currently accessed byte. Each successive read operation causes a register shift and a bit of data is sent out. An internal counter keeps track of the read operations and loads the next byte when needed. This automatically updates the address latch.

Pinout

Power supply
Vdd -5 V
Vss +5 V

Synthesizer interface
CS* Chip select. In the TI module this pin is connected to Vss, thus constantly active. Selection is achieved by means of a special selection code sent on the ADD1-ADD8 pins, after the address.

ADD1-ADD8 Address pins. To be connected to the corresponding pins on the synthesizer. ADD8 also serves as the output pin for the serial data. The address is passed as 14 bits to the ADD1-ADD8 pins. It is followed by a chip selection code of 4 bits. Each chip has its own internal selection mask (programmed during manufacture for ROMs). If the selection code matches the mask, the chip will be selected. This allows selecting up to 16 chips without the need for an external decoder.

M0, M1 Control pins. There are four possible combinations of states for these two pins:
M0=0, M1=0: Idle state. The ROM is passive.
M0=0, M1=1: Load address. The next 4 bits of address are loaded into the ROM's address latch.
M0=1, M1=0: Read. The next bit from the current byte is sent on ADD8. If necessary, the ROM fetches the next byte.
M0=1, M1=1: Read-and-branch. The ROM fetches 2 bytes at the current address and uses the least significant 14 bits as a new address.

Non-connected pins
Although not connected internally, many of these pins are connected externally in the TI module. Pins #2 and #9 are connected to Vss (+5V), pins #15 through 28 are connected together, probably for mechanical support.


Speech ROM operation

As explained above, the TMS6100 speech ROMs latch an address as 20 bits, passed as five nibbles of 4 bits (since the bus is 4 bits wide). To load an address, the speech synthesizer places a nibble on the bus, toggles M1 high for the duration of one ROMCLK period and repeats this operation five times. The first 14 bits make up an address inside the ROM chip. The next four bits form a selection code that must match the internal chip code to select this chip. This allows parallel connection of up to 16 chips and provides about 30 minutes of speech. The last 2 bits are ignored (d.c. = don't care bits).

Once the address is latched, the synthesizer toggles M0 high for two ROMCLK pulses. This is known as a 'dummy read' operation. No data will be transferred; the goal is only to reset the internal load pointer so that the next 'load-address' operation loads the first 4 bits. The synthesizer must wait at least 80 usec afterwards, before attempting to read data from the speech ROMs.

From that point on, the synthesizer can read data one bit at a time, by pulsing M0 high for the duration of one ROMCLK period. On the next ROMCLK pulse, the selected ROM will send a bit on ADD8, and increment its internal bit counter. Once 8 bits have been read, the ROM increments the address counter, fetches the next byte, and passes its first bit.

In summary, for all practical purposes, the speech ROM can be considered as a continuum of 128K bits that can be addressed starting at every 8th bit.


Here is the timing diagram of a 'load address' operation:


Now the synthesizer waits at least 80 microseconds before it starts reading bits:

In this example, 7 bits are read (D0 to D6), but the operation could continue for as many bits as desired.


Speech ROMs in the TI module

The TI Speech Synthesizer module has two custom ROM chips. They contain a 'binary-tree' list of words in plain ASCII, with the addresses where to find speech data for each word. If you want to view the content of these ROMs, you can use my ModuleExplorer program, and set the memory type as 's'.

Texas Instruments originally intended to release more ROMs to extend the vocabulary of the module. That's what the little door on the top of the module is for. However, they realized that speech produced merely by concatenating words is of poor quality: After. All. Nobody. Speaks. Like. That! To some extent it is possible to improve it by reading speech data, modifying a frame here and there (e.g. stripping the last frame of a word to link it to the next), and feeding the result to the synthesizer. But even like this, the result is poor.

TI thus created a much more versatile program, that was integrated into the Terminal Emulator module (don't ask me why). The module GROM contains speech data for a list of allophones, i.e. all possible sounds in English. The module ROM contains two subprograms: the first one breaks plain English sentences into a list of allophones, the second creates the speech data from those allophones, adds accentuation and voice inflexion, and passes it to the speech synthesizer via the Speak-External command. This provides a much more satisfactory discourse.


Operating the Speech Synthesizer

The CPU controls the speech synthesizer by writing commands to the command register. If the command is 'Speak-External', all following write operations will be redirected to a FIFO buffer (First In-First Out). This buffer can accept 16 bytes of data and feed them serially as 128 bits to the speech synthesis circuits.

All read operations return the content of the status register, except immediately after a 'Read-Byte' command: in this case the next byte of data comes from the data register. This is a serial-in, parallel-out register that accepts 8 bits from the speech memory and passes them as 1 byte to the CPU.

In the TI-99/4A, the speech synthesizer maps at >9000 for read operations and at >9400 for write operations.


Commands

Only one command at a time can be passed to the command register. If the CPU tries to send a second byte to the synthesizer before the current command is terminated, the synthesizer will activate the READY line to stall the CPU, until it is ready to accept the next byte. Note that this is not true for read operations.


There are seven possible commands. Most are one byte long, but some may require additional bytes.


Load-Address
This command sends a 14-bit address and a 4-bit chip select code to the speech memories. The chip whose internal selection mask matches the selection code will set its internal pointer to the specified address. Since there are only 4 address lines (ADD1, ADD2, ADD4 and ADD8), we'll need five successive Load-Address commands to complete the operation. The following table explains how to compose the five required bytes:

Speak
This command causes the synthesizer to read data from the serial memory (through the ADD8/DATA line) and to use it to generate speech. The TS bit becomes 1 in the status register and speech begins on the next frame boundary. Speech continues until the stop code (energy = >F) is received, at which point the synthesizer interpolates towards zero. Speech will be terminated and the TS bit reset at the next frame. Alternatively, the Reset command can be used to interrupt speech and reset TS immediately, without interpolation to zero.

Speak-External
This command allows the CPU to supply its own speech data, rather than reading it from the speech ROMs. Upon reception of the command, the FIFO is cleared (which sets the BL and BE status bits and activates the INT* pin) and all subsequent data from the CPU is sent to the FIFO. Nothing happens until the ninth byte is received. At that time, BL becomes 0, TS becomes 1 and speech begins 50 microseconds later. Speech continues until the stop code is processed (a frame with the energy set as 1111), until the buffer is empty, or until the synthesizer is externally reset (by bringing RS* and WS* low together). The Reset command cannot be used to stop speech as it will be mistaken for speech data.

Although this is not mentioned in the manual, I found out by experimenting with the Speak-External command that it is not possible to overflow the FIFO with the TI-99/4A: when it becomes full it activates the READY line and stalls the CPU. Therefore, a speech program only has to ensure that the FIFO is never empty (by monitoring BL), and doesn't need to care about passing bytes too fast.
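
The feeding strategy described above (fill the FIFO, then top it up whenever BL is set) can be sketched in Python against a toy stand-in for the chip. Everything here is hypothetical: `FakeSynth` is a crude simulation that "speaks" one byte per status read, its rule for raising BL (8 bytes or fewer left) is a simplification, and the command code is passed in by the caller because the text does not give it. Real code would instead write to >9400 and read from >9000.

```python
BL = 0x40  # Buffer Low status bit (weight >40)


class FakeSynth:
    """Toy model: the first write is the command, later writes fill the FIFO."""

    def __init__(self):
        self.command = None
        self.fifo = []
        self.spoken = []
        self.max_fill = 0

    def write(self, byte):
        if self.command is None:
            self.command = byte
            return
        self.fifo.append(byte)
        self.max_fill = max(self.max_fill, len(self.fifo))

    def read_status(self):
        # Simulate time passing: one byte is consumed per status read.
        if self.fifo:
            self.spoken.append(self.fifo.pop(0))
        return BL if len(self.fifo) <= 8 else 0


def feed_speak_external(synth, data, command):
    """Send the command, prime the FIFO, then refill 8 bytes whenever BL is set."""
    synth.write(command)
    for byte in data[:16]:
        synth.write(byte)
    pos = min(16, len(data))
    while pos < len(data):
        if synth.read_status() & BL:
            chunk = data[pos:pos + 8]
            for byte in chunk:
                synth.write(byte)
            pos += len(chunk)
```

Refilling in blocks of 8 is a design choice for this sketch: when BL is set at most half of the 16-byte buffer is occupied, so 8 more bytes always fit.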

Read-Byte
This command is used by the CPU to read the next 8 bits from the speech ROMs. This assumes that a Load-Address command has been previously issued. Only the next byte will be speech data; all subsequent read operations access the status register, until another Read-Byte command is issued.

Read-and-Branch
This command causes the synthesizer to read two bytes at the current address in the ROM memory. These two bytes are used to select a new address in the current ROM, i.e. they constitute a pointer. There is a 240 usec delay before the new address is set and the synthesizer is ready for a Speak command.

This command is useful because it makes it possible to create different versions of the ROMs that can be operated by the same program. For instance, imagine we want to create a French version of the Speech Synthesizer module. It is very unlikely that speech data for the word 'Bonjour' will be at the same address in the French ROMs as data for 'Hello' in the English version. Thus a program using the Load-Address and Speak commands will produce gibberish, unless it is completely rewritten. If on the other hand the ROMs contain a vector table (i.e. a table with the addresses of the speech data for the various words), all we have to do is to make sure that the vector for 'Bonjour' is at the same address as the vector for 'Hello' in the English version. The calling program will just use Read-and-Branch without knowing which language the ROMs contain.

Load-Frame-Rate
This command is used to tell a TMS5220C synthesizer whether it should use a fixed frame rate, or use the first two bits of each frame as a frame rate (it is ignored by the older TMS5220). The command byte has the form: x0x0xvrr where v is the frame mode and rr the fixed frame rate. If v=0, rr will be used as a frame rate until further notice. If v=1, rr is irrelevant and the frame rate must be passed with each frame. The four possible frame rates are:
00: 200 samples/sec
01: 150 samples/sec
10: 100 samples/sec
11: 50 samples/sec

Reset
This command is used to interrupt the current Speak command. The synthesizer stops speaking abruptly, without interpolation to zero, the TS status bit is reset and the FIFO is cleared (which sets the BL and BE bits and activates the INT* pin). This command also sends a 'load address' command to the speech ROMs (using a dummy address), followed by a dummy read pulse.


Speech encoding

I don't have the pretension to understand all the electronic mumbo-jumbo in the TMS5220C manual, therefore I'll just quote excerpts here and let you figure it out by yourself.

'Linear Predictive Coding (LPC) synthesizes human speech by recovering from the original speech enough data to construct a time-varying digital filter model of the vocal tract. This filter is excited with a digital representation of either glottal air impulses (voiced sound) or the rush of air, which produces unvoiced sound. The output of this filter model is passed through an 8-bit digital-to-analog (D/A) converter to produce a synthetic speech waveform.'

'The LPC analysis program begins with a set of digitized speech samples. These digitized samples are usually derived with an analog-to-digital (A/D) converter by sampling an analog waveform at a rate of 8 or 10 kHz. Consecutive samples are grouped together to form a 'frame' of digitized samples. The frame may contain 50 to 400 samples, but usually contains 200. The LPC analysis routine operates on these digitized samples, a frame at a time, by preemphasizing the samples, calculating the energy, pitch, and the spectral coefficients (K-parameters). Next, each value is coded according to a pre-selected coding table.'

'This coded speech parameter data is fed serially from either the speech memory or the FIFO buffer to the parameter input register. Here the Controller unpacks the data and performs various tests (i.e., is the repeat bit set, is pitch zero, is energy zero). Once unpacked, the coded parameter data is stored in RAM to be used as the index value to select the appropriate value from the Parameter Look-Up ROM. The outputs of the Parameter Look-Up ROM are the target values for the interpolation logic to reach in this frame period. During each of the interpolation periods the interpolation logic sends new parameter values to the LPC lattice network which makes available a new value of digitized speech to the D/A converter.'

'The LPC method of speech encoding reduces the speech data rate from approximately 100,000 bits/sec (raw digitized speech) to about 4800 bits/sec. The analyzer reduces this rate further (to 2000 bits/sec or less) by encoding each of the 20-bit speech parameters as 3 to 6-bit codes. These coded values select a 10-bit parameter from the parameter look-up ROM in the processor. Depending on the influence of the parameter on speech quality, between 8 and 64 possible values are stored in the Look-Up ROM for decoding and use in synthesis calculation. Note that the parameter ROM in the TMS5220C is mask programmable and not touchable or alterable by the user'.

What this boils down to is:

  1. Special hardware is needed to produce LPC-encoded speech, and we don't have a good description of it.
  2. Speech is coded by: Pitch, Energy and ten reflection parameters K1-K10.
  3. However, the value of these parameters is meaningless to us because it is an index into look-up tables buried inside the synthesizer. These tables contain the real 10-bit parameter values, but we cannot alter them. Thus we are stuck with the predefined parameter values.

I once wrote a test program that fetches an allophone from the Terminal Emulator module, displays the frames it contains, then lets me modify their parameters and listen to the result. My goal was to derive a set of allophones for French. I spent endless hours trying to understand the effect of a given K parameter and finally had to give up. I wasn't even able to come up with a single allophone: the 'an' sound (as in 'en passant') that I was using as a test.

I recently discovered a free speech-generation Windows program, WSDS, on the Texas Instruments website. Unfortunately, WSDS only works with a dedicated sound board (which is not free...). There may be a way around it, as the Speech Editor, which is part of the package, can read .RAW sound files. So it may be feasible to record words in .RAW files, read them with the Speech Editor, and pass them to WSDS to create the LPC data. I haven't tried yet.

Now, here is how speech is encoded:

Energy sets the volume for a given sound, or part of sound.
Pitch is the frequency for that sound.
Rpt is the repeat bit.
K1-K10 are the reflection parameters index values.
With the TMS5220C, we don't have to use a fixed frame rate: we can also specify the rate for each frame, using two more bits at the beginning of each frame to set the frame rate (see the Load-Frame-Rate command).

As you can see, only voiced frames (i.e. vowels) require the whole set of parameters, which add up to 50 bits. Unvoiced frames (consonants and whispers) can dispense with K5-K10, and only require 29 bits per frame. The synthesizer recognises unvoiced frames because they have a pitch value of 000000.

Since the human vocal tract changes shape relatively slowly, it is often necessary to repeat the same frame several times, possibly with different pitches. To save memory space, a special repeat bit has been introduced between energy and pitch: if this bit is set, the previous K parameters are retained and the resulting frame only needs 11 bits of data.

Finally, there are two special cases: silence, which has an energy of zero and is required for interword or intersyllable pauses, and the stop code, which signals the synthesizer to stop speaking and has an arbitrary energy value of 1111. Both these frames are only 4 bits long (6 bits with the two leading frame-rate bits).
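
The frame rules above can be sketched as a small decoder. This is only an illustration under stated assumptions: the bit stream is given as a string of '0'/'1' characters read left to right (the real chip has its own bit ordering inside each byte, which I ignore here), the field widths are inferred from the sizes of the lookup tables (16 energies, 64 pitches, 32 values for K1-K2, 16 for K3-K7, 8 for K8-K10), and unvoiced frames are assumed to carry K1-K4 only. The names `read_frame`, `K_BITS` and `UNVOICED_K` are mine.

```python
# Field widths in bits, inferred from the lookup-table sizes.
ENERGY_BITS, REPEAT_BITS, PITCH_BITS = 4, 1, 6
K_BITS = [5, 5, 4, 4, 4, 4, 4, 3, 3, 3]   # K1..K10
UNVOICED_K = 4        # assumption: unvoiced frames carry K1-K4 only
SILENCE, STOP = 0x0, 0xF


def read_frame(bits, pos=0, rate_bits=0):
    """Decode one frame from a '0'/'1' string, starting at offset pos.

    rate_bits is 2 when each frame starts with a frame-rate field, else 0.
    Returns (frame, next_pos); frame["length"] is the frame size in bits.
    """
    def take(n):
        nonlocal pos
        chunk = bits[pos:pos + n]
        if len(chunk) < n:
            raise ValueError("bit stream ended inside a frame")
        pos += n
        return int(chunk, 2)

    start = pos
    frame = {}
    if rate_bits:
        frame["rate"] = take(rate_bits)
    frame["energy"] = take(ENERGY_BITS)
    if frame["energy"] == SILENCE:
        frame["kind"] = "silence"
    elif frame["energy"] == STOP:
        frame["kind"] = "stop"
    else:
        frame["repeat"] = take(REPEAT_BITS)
        frame["pitch"] = take(PITCH_BITS)
        if frame["repeat"]:
            frame["kind"] = "repeat"      # previous K values are kept
        else:
            voiced = frame["pitch"] != 0
            frame["kind"] = "voiced" if voiced else "unvoiced"
            count = len(K_BITS) if voiced else UNVOICED_K
            frame["k"] = [take(width) for width in K_BITS[:count]]
    frame["length"] = pos - start
    return frame, pos
```

Calling `read_frame` repeatedly with the returned position walks through a word frame by frame until a frame of kind "stop" comes up. The values it returns are still table indexes, not real parameters.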


Synthesizer lookup tables

Below are the values hidden in the parameter lookup tables used by the TMS5220.

Energy

Value   RMS
>00       0
>01      52
>02      87
>03     123
>04     174
>05     246
>06     348
>07     491
>08     694
>09     981
>0A    1385
>0B    1957
>0C    2764
>0D    3904
>0E    5514
>0F    7789
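
A quick way to get a feel for the energy table is to look at the ratio between successive entries. The sketch below copies the RMS values from the table above and converts each step into decibels; `step_db` is just a helper name I made up.

```python
import math

# RMS values copied from the Energy lookup table (indices >00..>0F).
ENERGY_RMS = [0, 52, 87, 123, 174, 246, 348, 491,
              694, 981, 1385, 1957, 2764, 3904, 5514, 7789]


def step_db(index):
    """Gain in dB when going from energy index-1 to index (index >= 2)."""
    if index < 2:
        raise ValueError("index >00 is silence, so there is no ratio to take")
    return 20 * math.log10(ENERGY_RMS[index] / ENERGY_RMS[index - 1])


if __name__ == "__main__":
    for i in range(2, len(ENERGY_RMS)):
        print(f">{i:02X}: {ENERGY_RMS[i]:5d}  {step_db(i):+.2f} dB")
```

Apart from the first step, every entry is about 1.41 times the previous one, i.e. roughly 3 dB louder.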


Pitch

Value  Pitch   Value  Pitch   Value  Pitch   Value  Pitch
>00      0     >10     30     >20     50     >30     91
>01     15     >11     31     >21     52     >31     94
>02     16     >12     32     >22     53     >32     98
>03     17     >13     33     >23     56     >33    101
>04     18     >14     34     >24     58     >34    105
>05     19     >15     35     >25     60     >35    109
>06     20     >16     36     >26     62     >36    114
>07     21     >17     37     >27     65     >37    118
>08     22     >18     38     >28     68     >38    122
>09     23     >19     39     >29     70     >39    127
>0A     24     >1A     40     >2A     72     >3A    132
>0B     25     >1B     41     >2B     76     >3B    137
>0C     26     >1C     42     >2C     78     >3C    142
>0D     27     >1D     44     >2D     80     >3D    148
>0E     28     >1E     46     >2E     84     >3E    153
>0F     29     >1F     48     >2F     86     >3F    159
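
The numbers in the pitch table grow as the voice gets deeper, which suggests they are pitch periods rather than frequencies. The sketch below converts them to Hertz under two assumptions that this section does not state: that each entry is a period counted in output samples, and that the chip produces 8000 samples per second. Treat both as hypotheses to check against the data manual; `pitch_hz` is my own helper name.

```python
SAMPLE_RATE_HZ = 8000   # assumption, not stated in this section

# Values copied from the Pitch lookup table, indices >00..>3F.
PITCH_PERIOD = [
    0, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
    30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 44, 46, 48,
    50, 52, 53, 56, 58, 60, 62, 65, 68, 70, 72, 76, 78, 80, 84, 86,
    91, 94, 98, 101, 105, 109, 114, 118, 122, 127, 132, 137, 142, 148, 153, 159,
]


def pitch_hz(value, sample_rate=SAMPLE_RATE_HZ):
    """Return the frequency for a pitch index, or None for an unvoiced frame."""
    period = PITCH_PERIOD[value]
    if period == 0:
        return None          # pitch index >00 marks an unvoiced frame
    return sample_rate / period


if __name__ == "__main__":
    for value in (0x01, 0x2D, 0x3F):
        print(f">{value:02X}: {pitch_hz(value):.1f} Hz")
```

If these assumptions hold, the table spans roughly 50 Hz to 533 Hz.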


Reflection coefficients

Value  K1        K2        K3        K4        K5        K6        K7        K8        K9        K10
>00    -0.97850  -0.64000  -0.86000  -0.64000  -0.64000  -0.50000  -0.60000  -0.50000  -0.50000  -0.40000
>01    -0.97270  -0.58999  -0.75467  -0.53145  -0.54933  -0.41333  -0.50667  -0.31429  -0.34286  -0.25714
>02    -0.97070  -0.53500  -0.64933  -0.42289  -0.45867  -0.32667  -0.41333  -0.12857  -0.18571  -0.11429
>03    -0.96680  -0.47507  -0.54400  -0.31434  -0.36800  -0.24000  -0.32000   0.05714  -0.02857   0.02857
>04    -0.96290  -0.41039  -0.43867  -0.20579  -0.27733  -0.15333  -0.22667   0.24286   0.12857   0.17143
>05    -0.95900  -0.34129  -0.33333  -0.09723  -0.18667  -0.06667  -0.13333   0.42857   0.28571   0.31429
>06    -0.95310  -0.26830  -0.22800   0.01132  -0.09600   0.02000  -0.04000   0.61429   0.44286   0.45714
>07    -0.94140  -0.19209  -0.12267   0.11987  -0.00533   0.10667   0.05333   0.80000   0.60000   0.60000
>08    -0.93360  -0.11350  -0.01733   0.22843   0.08533   0.19333   0.14667
>09    -0.92580  -0.03345   0.08800   0.33698   0.17600   0.28000   0.24000
>0A    -0.91600   0.04702   0.19333   0.44553   0.26667   0.36667   0.33333
>0B    -0.90620   0.12690   0.29867   0.55409   0.35733   0.45333   0.42667
>0C    -0.89650   0.20515   0.40400   0.66264   0.44800   0.54000   0.52000
>0D    -0.88280   0.28087   0.50933   0.77119   0.53867   0.62667   0.61333
>0E    -0.86910   0.35325   0.61467   0.87975   0.62933   0.71333   0.70667
>0F    -0.85350   0.42163   0.72000   0.98830   0.72000   0.80000   0.80000
>10    -0.80420   0.48553
>11    -0.74058   0.54464
>12    -0.66019   0.59878
>13    -0.56116   0.64796
>14    -0.44296   0.69227
>15    -0.30706   0.73190
>16    -0.15735   0.76714
>17    -0.00005   0.79828
>18     0.15725   0.82567
>19     0.30696   0.84965
>1A     0.44288   0.87057
>1B     0.56109   0.88875
>1C     0.66013   0.90451
>1D     0.74054   0.91813
>1E     0.80416   0.92988
>1F     0.85350   0.98830


Timing diagrams

(For the CPU interface. See above for the speech ROM interface.)

Write cycle for commands and speech data


Read cycle for status transfer


Read-Byte sequence


Electrical characteristics

Absolute maximum ratings

Any pin with respect to Vss...............-20V to +0.3V
Power dissipation...............................600 mW


Recommended operating conditions

Parameter               Min      Nom  Max    Unit
Vdd                     -5.5     -5   -4.5   Volts
Vss                     4.75     5    5.5    Volts
Vref                    -        0    -      Volts
High level input        Vss-0.6  -    Vss    Volts
Low level input         Vdd      0    Vss-4  Volts
Free-air temperature    0        -    70     °C
Storage temperature     -40      -    70     °C
Operational freq (RC)   576      640  704    kHz


Electrical characteristics under recommended conditions

Parameter                      Test conditions  Min       Nom  Max       Unit
High level output voltage      0.1 mA           Vss-0.5   -    Vss       Volts
Ditto for D0-D7,WS*,RS*,INT*   0.4 mA           2.4       -    Vss       Volts
Low level output voltage       0.1 mA           -         -    Vss-4.5   Volts
Ditto for D0-D7,WS*,RS*,INT*   1.6 mA           Vref-0.5  0    Vref+0.5  Volts
Supply current from Vref       Ref to Vss       -         3    5         mA
Supply current from Vdd        Ref to Vss       -         10   35        mA
Data bus load capacitance      -                25        -    300       pF
Other pins input capacitance   -                -         15   -         pF
Other pins output capacitance  -                -         15   -         pF

Revision 1. 1/31/99. OK to release.
Revision 2. 3/30/99. Polishing.
Revision 3. 4/1/00. Got the TMS5220 manual! Added ROM tables, block diagrams, schematics, etc.
Revision 4. 8/16/00. Added intro, link to picture.
Revision 5. 8/26/01. Added link to a picture page.

Revision 6. 5/14/02. Added installation inside PE-box.

