Authors: Muthiah Annamalai† <ezhillang@gmail.com>
†Corresponding Author:
Ezhil Language Foundation, Hayward, CA, USA
Abstract
We propose that Tamil number forms are equivalent, by isomorphism (a single rule mapping all numbers to their corresponding number forms), to those in Telugu, Kannada and Malayalam, the last of which is almost indistinguishable from Tamil except for prosody; this is based on intuition about digit forms [1]. We further contend that the algorithms for generating numerals in any of the four languages are structurally identical, owing to the equivalence of the numerals in an abstract sense. We propose a common algorithm for generating and parsing number forms [2] in these languages, to and from text, and for audio TTS generation. These can be used in various applications such as token-queue systems, spoken calculators, etc.
1 Introduction
It is well known that the digits, and even the number forms, of the Dravidian languages are roughly symmetric [1]. The structure of this symmetry is the subject of this paper: we demonstrate that it can be applied in a computer algorithm for generating and parsing numbers in these languages simultaneously. To the best of our knowledge, this is the first effort at such a linguistically motivated algorithm.
2 Evidence
Using publicly available corpora [3,4] and Google Translate [5], we have established parallel evidence for sample number forms across the four southern languages. By inspection one can ascertain a rough correspondence; however, the symmetry goes beyond rough correspondence. The authors' prior work [2] can be extended by parameterizing it with suitable suffixes and modifications for each language, and we posit that a single algorithmic routine can perform both parsing and generation of numbers.
Number value | Tamil | Malayalam | Kannada | Telugu |
1 | ஒன்று | onn | ondu | okaṭi |
2 | இரண்டு | raṇṭ | eraḍu | reṇḍu |
3 | மூன்று | mūnn | mūru | mūḍu |
4 | நான்கு | nāl | nālku | nālugu |
5 | ஐந்து | añc | aidu | aidu |
6 | ஆறு | āṟ | āru | āru |
7 | ஏழு | ēḻ | ēḷu | ēḍu |
8 | எட்டு | eṭṭ | eṇṭu | enimidi |
9 | ஒன்பது | ompat | ombattu | tom’midi |
10 | பத்து | patt | hattu | padi |
11 | பதினொன்று | patineānn | hannondu | padakoṇḍu |
12 | பன்னிரண்டு | pantraṇṭ | hanneraḍu | panneṇḍu |
15 | பதினைந்து | patinañc | hadinaidu | padihēnu |
19 | பத்தொன்பது | patteāmpat | hattombattu | pantom’midi |
20 | இருபது | irupat | ippattu | iravai |
30 | முப்பது | muppat | mūvattu | muppai |
40 | நாற்பது | nālpat | nalavattu | nalabhai |
50 | ஐம்பது | ampat | aivattu | yābhai |
75 | எழுபத்தைந்து | eḻupatti añcu | eppattaidu | ḍebbai aidu |
90 | தொன்னூறு | teāḷḷāyiraṁ | ombainūru | tom’midi vandalu |
99 | தொன்னூற்றொன்பது | teāḷḷāyiratti ompat | ombainūra ombattu | tom’midi vandala tom’midi |
100 | நூறு | nūṟ | ondu nūru | vanda |
200 | இருநூறு | irunnūṟ | innūru | reṇḍu vandalu |
500 | ஐநூறு | aññūṟ | aidu nūru | aidu vandalu |
1000 | ஆயிரம் | āyiraṁ | sāvira | veyyi |
5000 | ஐந்து ஆயிரம் | ayyāyiraṁ | aidu sāvira | aidu vēlu |
154999 | ஒரு இலட்சத்து ஐம்பத்து நான்கு ஆயிரத்து தொள்ளாயிரத்து தொன்னூற்றொன்பது | oru lakṣatti ampattinālāyiratti teāḷḷāyiratti ompat | nūra aivattanālku sāvirada ombhainūra ombattu | nūṭa yābhai nālugu vēla tom’midi vandala tom’midi |
Table 1: Parallel listing of number words in Tamil, Malayalam, Kannada and Telugu
3 Algorithms
We modify the algorithm first presented in [2], which generates non-negative integer and floating-point numbers for Tamil. Exploiting the symmetry among Tamil, Kannada, Malayalam and Telugu, together called the Dravidian languages (DL), we reorganize it as follows, applied piecemeal to each DL. In simpler terms, the algorithms of [2] are parameterized by the suffixes and prefixes specific to each DL, while the overall structure remains the same by the isomorphism property. The steps for joining sections must be handled in a language-specific way, yet the overall algorithm is invariant to the source language.
Tamil Internet Conference, 2022. Thanjavur, India.
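The parameterization described above can be illustrated with a minimal Python sketch. It is not the implementation of [2]: the names `DLTable`, `join_with_space`, `join_fused` and `two_digit_words` are hypothetical, the word lists are copied from the Kannada and Telugu columns of Table 1, the entries for 70 are inferred from the 75 row, and the Kannada joining rule is an assumption inferred from the forms "eppattaidu" and "aivattanālku" that appear in Table 1. The sketch covers only 20..99.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class DLTable:
    """Language-specific parameters; the routine that uses them is shared."""
    name: str
    units: Dict[int, str]            # 1..9
    tens: Dict[int, str]             # tens digit -> word for the round ten
    join: Callable[[str, str], str]  # language-specific joining rule


def join_with_space(tens_word: str, unit_word: str) -> str:
    # Telugu forms in Table 1 simply place the two words side by side.
    return f"{tens_word} {unit_word}"


def join_fused(tens_word: str, unit_word: str) -> str:
    # Assumed Kannada rule, inferred from Table 1: drop the final "u" of the
    # tens word, insert "a" before a consonant-initial unit, then fuse.
    stem = tens_word[:-1] if tens_word.endswith("u") else tens_word
    link = "" if unit_word[0] in "aeiouāēīōū" else "a"
    return stem + link + unit_word


KANNADA = DLTable(
    name="Kannada",
    units={1: "ondu", 2: "eraḍu", 3: "mūru", 4: "nālku", 5: "aidu",
           6: "āru", 7: "ēḷu", 8: "eṇṭu", 9: "ombattu"},
    # 70 is inferred from the 75 row of Table 1.
    tens={2: "ippattu", 3: "mūvattu", 4: "nalavattu", 5: "aivattu",
          7: "eppattu"},
    join=join_fused,
)

TELUGU = DLTable(
    name="Telugu",
    units={1: "okaṭi", 2: "reṇḍu", 3: "mūḍu", 4: "nālugu", 5: "aidu",
           6: "āru", 7: "ēḍu", 8: "enimidi", 9: "tom’midi"},
    # 70 is inferred from the 75 row of Table 1.
    tens={2: "iravai", 3: "muppai", 4: "nalabhai", 5: "yābhai", 7: "ḍebbai"},
    join=join_with_space,
)


def two_digit_words(n: int, table: DLTable) -> str:
    """Shared routine for 20..99; only the table differs per language."""
    if not 20 <= n <= 99:
        raise ValueError("this sketch covers 20..99 only")
    t, u = divmod(n, 10)
    if u == 0:
        return table.tens[t]
    return table.join(table.tens[t], table.units[u])


for table in (KANNADA, TELUGU):
    print(table.name, two_digit_words(75, table))
```

The same `two_digit_words` routine serves both languages; only the table and its joining rule change, which is the sense in which the algorithm is invariant to the language. Tens words missing from the tables (60, 80, 90) raise `KeyError` rather than being guessed.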
3.1 Algorithm for Generating Numbers
Input: floating-point number N
Output: string of DL words T
Algorithm:
1. Load the list of prefix and suffix strings for all DL number words (63 words in all).
2. Find the quotient Q and remainder R of N divided, in turn, by 1 crore, lakh, thousand, hundred, or ten.
3. If Q is zero, set N = R and continue from step 2.
4. Convert the quotient Q to words T.
   a. Take special care to handle the 90s, 900s and 9000s correctly.
   b. Take special care to handle numbers in the range 11-19.
5. Invoke the same algorithm recursively for the remainder R.
6. Concatenate the result of step 5 to T.
7. Return T.
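The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the routine of [2]: it handles positive integers only (no fractional part), it uses only the Telugu words that appear in Table 1 (with 70 inferred from the 75 row), it covers the thousand and hundred scales but not crore or lakh, and it ignores the oblique joining forms such as "vēla" and "vandala" seen in the 154999 row. The names `TELUGU_WORDS`, `TELUGU_SCALES` and `generate` are hypothetical.

```python
from typing import Dict, List, Tuple

# Direct lookups from the Telugu column of Table 1. Irregular forms
# (11-19 here; the 90s and 900s in Tamil) belong here as whole entries.
TELUGU_WORDS: Dict[int, str] = {
    1: "okaṭi", 2: "reṇḍu", 3: "mūḍu", 4: "nālugu", 5: "aidu",
    6: "āru", 7: "ēḍu", 8: "enimidi", 9: "tom’midi", 10: "padi",
    11: "padakoṇḍu", 12: "panneṇḍu", 15: "padihēnu", 19: "pantom’midi",
    20: "iravai", 30: "muppai", 40: "nalabhai", 50: "yābhai",
    70: "ḍebbai",  # inferred from the 75 row
}

# (scale value, word when the quotient is 1, word after a larger quotient),
# largest scale first. Crore and lakh rows would be added the same way.
TELUGU_SCALES: List[Tuple[int, str, str]] = [
    (1000, "veyyi", "vēlu"),
    (100, "vanda", "vandalu"),
]


def generate(n: int, words: Dict[int, str],
             scales: List[Tuple[int, str, str]]) -> str:
    """Words for a positive integer n; KeyError if a needed word is missing."""
    if n <= 0:
        raise ValueError("this sketch handles positive integers only")
    if n in words:                  # units, 11-19 and round tens
        return words[n]
    for value, one_word, many_word in scales:
        q, r = divmod(n, value)     # step 2: quotient and remainder
        if q == 0:
            continue                # step 3: try the next smaller scale
        # Step 4: convert the quotient, then attach the scale word.
        if q == 1:
            head = one_word
        else:
            head = f"{generate(q, words, scales)} {many_word}"
        if r == 0:
            return head
        # Steps 5 and 6: recurse on the remainder and concatenate.
        return f"{head} {generate(r, words, scales)}"
    if n < 20:
        raise KeyError(n)           # 11-19 must come from the table
    tens, unit = divmod(n, 10)
    return f"{words[tens * 10]} {words[unit]}"


for n in (75, 200, 5000, 5275):
    print(n, generate(n, TELUGU_WORDS, TELUGU_SCALES))
```

For 75, 200 and 5000 the sketch reproduces the Telugu entries of Table 1. Supporting another DL would mean swapping in that language's word table, scale list and joining rule, leaving the control flow unchanged.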
3.2 Algorithm for Parsing Numbers
Input: list of DL words T
Output: floating-point number N
Algorithm:
1. Load the list of prefix and suffix strings for all DL number words (63 words in all).
2. Initialize N to 0.
3. Create a temporary stack S.
4. FOR each word W in T:
5. IF W is a stop word (crores, lakhs, thousands, hundreds, tens):
   a. Convert the words in stack S into a value and scale it into the running result N, using a helper routine that handles input up to the value 100,0000.
   b. Empty the stack S.
6. ELSE: push W onto S.
7. END of the loop started at step 4.
8. Stack S is usually non-empty at this point; use the same helper routine as in step 5a to obtain the final portion of the number.
9. The correctly parsed value is stored in N.
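The stack-based procedure above can be sketched in Python as follows. This is an illustration under assumptions rather than the routine of [2]: it returns an integer, it uses only Telugu words from Table 1 (with 70 inferred from the 75 row), it assumes the words before each scale word amount to less than 100 (as they do in the crore, lakh, thousand, hundred system), and it treats a scale word with an empty stack as a quotient of 1. The names `TELUGU_VALUES`, `TELUGU_SCALE_WORDS`, `stack_value` and `parse` are hypothetical.

```python
from typing import Dict, List

# Word -> value, from the Telugu column of Table 1.
TELUGU_VALUES: Dict[str, int] = {
    "okaṭi": 1, "reṇḍu": 2, "mūḍu": 3, "nālugu": 4, "aidu": 5,
    "āru": 6, "ēḍu": 7, "enimidi": 8, "tom’midi": 9, "padi": 10,
    "padakoṇḍu": 11, "panneṇḍu": 12, "padihēnu": 15, "pantom’midi": 19,
    "iravai": 20, "muppai": 30, "nalabhai": 40, "yābhai": 50,
    "ḍebbai": 70,  # inferred from the 75 row
}

# The "stop words" of step 5: each one closes the group collected so far.
TELUGU_SCALE_WORDS: Dict[str, int] = {
    "veyyi": 1000, "vēlu": 1000, "vanda": 100, "vandalu": 100,
}


def stack_value(stack: List[str], values: Dict[str, int]) -> int:
    """Helper of step 5a: value of the words collected on the stack."""
    return sum(values[w] for w in stack)


def parse(words: List[str], values: Dict[str, int],
          scales: Dict[str, int]) -> int:
    n = 0                                   # step 2
    stack: List[str] = []                   # step 3
    for w in words:                         # step 4
        if w in scales:                     # step 5
            # Assumption: a bare scale word such as "vanda" means 1 x scale.
            q = stack_value(stack, values) if stack else 1
            n += q * scales[w]              # step 5a
            stack.clear()                   # step 5b
        elif w in values:
            stack.append(w)                 # step 6
        else:
            raise KeyError(f"unknown number word: {w}")
    n += stack_value(stack, values)         # step 8: leftover words
    return n                                # step 9


text = "aidu vēlu reṇḍu vandalu ḍebbai aidu"
print(parse(text.split(), TELUGU_VALUES, TELUGU_SCALE_WORDS))
```

Run on the words for five thousand, two hundred and seventy-five, the sketch returns 5275. Oblique forms such as "vēla" or "vandala" would need additional entries in the scale-word table.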
4 Applications
Similar to the applications presented in [2], the algorithms of Sec. 3.1 and Sec. 3.2 can be reused, parameterized by Dravidian language (DL), to enable TTS generation and calculator applications.

Fig 1: Organization of Speech input and Audio output calculator in each Dravidian Language (DL)
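As a rough illustration of how such a calculator might chain the two routines, the following Python sketch parses two spoken operands, applies an arithmetic operation, and generates the words of the result. Speech recognition and audio output are outside its scope and are replaced by plain strings. The function `spoken_calculator` and its parameters are hypothetical, and the stand-in parse and generate routines are simple lookups over the Kannada units 1..9 from Table 1, so results outside that range raise `KeyError`.

```python
import operator
from typing import Callable, Dict

# Kannada units from Table 1; a full system would plug in complete
# parsing and generation routines in place of these lookups.
KANNADA_UNITS: Dict[int, str] = {
    1: "ondu", 2: "eraḍu", 3: "mūru", 4: "nālku", 5: "aidu",
    6: "āru", 7: "ēḷu", 8: "eṇṭu", 9: "ombattu",
}
KANNADA_VALUES: Dict[str, int] = {w: v for v, w in KANNADA_UNITS.items()}

OPERATIONS: Dict[str, Callable[[int, int], int]] = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}


def spoken_calculator(left: str, op: str, right: str,
                      parse: Callable[[str], int],
                      generate: Callable[[int], str]) -> str:
    """Words in, words out: parse both operands, compute, generate."""
    result = OPERATIONS[op](parse(left), parse(right))
    return generate(result)


answer = spoken_calculator(
    "eraḍu", "+", "mūru",
    parse=KANNADA_VALUES.__getitem__,
    generate=KANNADA_UNITS.__getitem__,
)
print(answer)
```

Because the parse and generate routines are passed in as parameters, the same calculator function could serve any DL by supplying that language's pair of routines.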
5 Summary and Conclusion
We have established algorithms that exploit the symmetry of number-to-word forms in the Dravidian languages Tamil, Malayalam, Kannada and Telugu for various applications. We demonstrated a parameterized algorithm for generating and parsing number forms in these languages, and provided a framework for applications such as token-queue systems, spoken calculators, etc. that leverage these findings.
References:
[1] Wikipedia on Tamil Numeral Influence, https://en.wikipedia.org/wiki/Tamil_numerals#Influence (accessed Nov 14, 2022).
[2] M. Annamalai, S. Mahadevan, “Generation and Parsing of Number to Words in Tamil,” Tamil Internet Conference, 2020.
[3] Rao Vemuri, Freshman Lecture notes from “Learn Telugu and Its Culture,” at UC Davis, Fall 2006. https://www.cs.ucdavis.edu/~vemuri/classes/freshman/index.html
[4] Malayalam Numbers, https://www.learnentry.com/english-malayalam/vocabulary/numbers-in-malayalam/ (accessed Nov 22, 2022).
[5] Google Translate, https://translate.google.com (accessed Nov 22, 2022).