Hello friends and Tamil aficionados, it is winter in Boston and the start of three months of bitter cold but beautiful skylines. Here is a little treat for surviving the low-sunlight hours and gray, cold days.
What is a TTS?
A TTS (text-to-speech) engine converts text to speech. That was easy. The hard part is how the computer does it. We will investigate how the language structure of Tamil can be used to find a suitable algorithm: a set of rules that, when applied systematically, generates speech from text. But first we need to understand what a phonetic language is.
The key is to recognize that Tamil is a phonetic language: words are pronounced the way they are written, and there is a simple way to see this. If we write “முத்து”, we say it as “மு + த் + து”. English is not the same: “Muthu” is written with 5 letters but pronounced in 3 groupings, “Mu” + “th” + “u”. In Tamil, முத்து has 3 letters and 3 syllables, and the number of letters and syllables mostly stays the same.
OK, now how do we split a word into its constituent phonemes? Maybe by using open-tamil!
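To make the splitting step concrete, here is a minimal sketch that uses only the Python standard library. It assumes that a Tamil letter is a base character followed by any combining marks (vowel signs or the pulli), which holds for முத்து but not for special clusters such as க்ஷ or ஸ்ரீ. The function name `split_letters` is my own for illustration; open-tamil ships its own letter splitter (I believe it is `tamil.utf8.get_letters`), which is the better choice in practice.

```python
import unicodedata


def split_letters(word):
    """Group each base character with the combining marks that follow it.

    Assumption: one Tamil letter = one base character plus its trailing
    marks (Unicode categories Mn/Mc). Special clusters are not handled.
    """
    letters = []
    for ch in word:
        if letters and unicodedata.category(ch).startswith("M"):
            # Vowel sign or pulli: attach it to the preceding base character.
            letters[-1] += ch
        else:
            # A new base character starts a new letter.
            letters.append(ch)
    return letters


print(split_letters("முத்து"))  # ['மு', 'த்', 'து']
```

The word is stored as six Unicode code points, but it splits into the three letters we actually pronounce.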
So I propose a simple-minded algorithm to generate speech from text.
- Split the given Tamil text into words, and for each word apply the steps below.
- Split the word into syllables (for Tamil, each letter is treated as one syllable).
- For each syllable, find the corresponding phoneme (its pronunciation) in the form of a sound clip. In Tamil grammar, the duration of such a sound is measured in ‘மாத்திரை’.
- Concatenate all these sound clips to form the word, and apply a linear smoothing filter over a window (a moving average, for example). This is simple signal processing theory.
- Add a pause, or beat, based on word spacing or sentence/paragraph spacing.
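The steps above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: there are no recorded clips here, so the hypothetical `clip_for` returns a short synthetic tone per letter as a stand-in, the smoothing filter is a simple moving average, and only the word-level pause is implemented. All names (`split_letters`, `clip_for`, `smooth`, `synthesize`) and the constants are mine, not part of open-tamil.

```python
import math
import unicodedata

SAMPLE_RATE = 16000  # samples per second (assumed value)


def split_letters(word):
    """Base character plus trailing combining marks = one letter (assumption)."""
    letters = []
    for ch in word:
        if letters and unicodedata.category(ch).startswith("M"):
            letters[-1] += ch
        else:
            letters.append(ch)
    return letters


def clip_for(letter, duration=0.15):
    """Hypothetical stand-in for a recorded clip: a tone chosen per letter."""
    freq = 200 + sum(map(ord, letter)) % 200
    n = round(SAMPLE_RATE * duration)
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def silence(duration):
    """A pause of the given length in seconds."""
    return [0.0] * round(SAMPLE_RATE * duration)


def smooth(samples, window=5):
    """Moving average over a centered window (a linear smoothing filter)."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


def synthesize(text, word_pause=0.1):
    """Split into words, concatenate clips per word, smooth, add pauses."""
    audio = []
    for word in text.split():
        word_samples = []
        for letter in split_letters(word):
            word_samples.extend(clip_for(letter))
        audio.extend(smooth(word_samples))
        audio.extend(silence(word_pause))
    return audio


audio = synthesize("முத்து")
print(len(audio) / SAMPLE_RATE, "seconds of audio")
```

With real recordings, `clip_for` would look up a stored clip for each letter instead of generating a tone, and the resulting samples could be written out with the standard `wave` module.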
This algorithm is simplistic, but it does capture the essence of a text-to-speech engine. The output may sound fairly mechanical or ‘robotic’, so improving the sound quality is the next thing to work on.
Please share your comments and views. For those in cold countries, stay warm!