Symmetries in Number Forms of Tamil and Dravidian Languages

Authors: Muthiah Annamalai <>

†Corresponding Author:

Ezhil Language Foundation, Hayward, CA, USA


We propose that Tamil number forms are equivalent, by isomorphism (a single rule mapping every number to its corresponding number form), to those in Telugu, Kannada and Malayalam. Malayalam number forms are almost indistinguishable from Tamil except for prosody; this observation is based on intuition about digit forms [1]. We further contend that the algorithms for generating numerals in any of the four languages are structurally identical, owing to this abstract equivalence of numerals. We propose a common algorithm for generating and parsing number forms [2] in these languages, to and from text, and for audio generation via text-to-speech (TTS). These can be used in applications such as token-queue systems and spoken calculators.

1 Introduction

It is well known that the digits, and even the number forms, of the Dravidian languages are roughly symmetric [1]. The structure of this symmetry is the subject of this paper: we demonstrate that it can be applied to a computer algorithm that generates and parses numbers in these languages simultaneously. To the best of our knowledge, this is the first effort at such a linguistically motivated algorithm.

2 Evidence

Using publicly available corpora [3,4] and Google Translate [5], we have established parallel evidence of sample number forms across the four southern languages (Table 1). By inspection one can ascertain a rough correspondence. However, the symmetry goes beyond rough correspondence: the author's prior work [2] can be extended by parameterization with suitable suffixes and language-specific modifications, and we posit that a single algorithmic routine can perform both parsing and generation of numbers.


Number | Tamil | Malayalam | Kannada | Telugu
75 | எழுபத்தைந்து | eḻupatti añcu | eppattaidu | ḍebbai aidu
90 | தொன்னூறு | teāḷḷāyiraṁ | ombainūru | tom’midi vandalu
99 | தொன்னூற்றொன்பது | teāḷḷāyiratti ompat | ombainūra ombattu | tom’midi vandala tom’midi
100 | நூறு | nūṟ | ondu nūru | vanda
200 | இருநூறு | irunnūṟ | innūru | reṇḍu vandalu
500 | ஐநூறு | aññūṟ | aidu nūru | aidu vandalu
5000 | ஐந்து ஆயரம் | ayyāyiraṁ | aidu sāvira | aidu vēlu
154999 | ஒரு இலட்சத்து ஐம்பத்து நான்கு தொன்னூற்ற றொன்பது | oru lakṣatti ampattinālāyiratti teāḷḷāyiratti ompat | nūra aivattanālku sāvirada ombhainūra ombattu | nūṭa yābhai nālugu vēla tom’midi vandala tom’midi

Table 1: Parallel listing of number to words in Tamil, Malayalam, Kannada and Telugu

3 Algorithms

Tamil Internet Conference, 2022. Thanjavur, India.

We modify the algorithm first presented in [2] for generating non-negative integer and floating-point numbers in Tamil. In view of the symmetry among Tamil, Kannada, Malayalam and Telugu, together called the Dravidian languages (DL), we reorganize it so that it is applied piecemeal to each DL. In simpler terms, the algorithms of [2] are parameterized by the suffixes and prefixes specific to each DL, while the overall structure remains the same by the property of isomorphism. The steps for joining sections must be handled in a language-specific way, yet the overall algorithm is invariant to the source language.
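The parameterization described above can be pictured with a minimal Python sketch. The `Lexicon` class and the `simple_form` function are hypothetical names introduced here for illustration and are not taken from [2]; the word entries are only the few romanized Kannada and Telugu forms that appear in Table 1, and joining is simplified to a single space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexicon:
    """Per-language word tables (hypothetical structure for illustration)."""
    digits: dict  # digit value -> word
    scales: dict  # scale value -> word


# Entries transcribed from the Kannada and Telugu columns of Table 1.
KANNADA = Lexicon(digits={1: "ondu", 5: "aidu"},
                  scales={100: "nūru", 1000: "sāvira"})
TELUGU = Lexicon(digits={2: "reṇḍu", 5: "aidu"},
                 scales={100: "vandalu", 1000: "vēlu"})


def simple_form(digit: int, scale: int, lex: Lexicon) -> str:
    """One routine for every language; only the lexicon changes."""
    return f"{lex.digits[digit]} {lex.scales[scale]}"


print(simple_form(5, 100, KANNADA))   # aidu nūru
print(simple_form(5, 1000, TELUGU))   # aidu vēlu
```

The Tamil and Malayalam rows of Table 1 show fused forms (for example இருநூறு and irunnūṟ for 200), which a plain space join cannot produce; this is where the language-specific joining step would come in.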

3.1 Algorithm for Generating Numbers

Input: floating point number
Output: string of DL words
Algorithm:

  1. Load the list of prefix and suffix strings for all DL number words (63 words in all).
  2. Find the quotient Q and remainder R of N divided by 1 crore, lakh, thousand, hundred, or ten.
  3. If Q is zero, set N = R and continue to step 2.
  4. Convert the quotient Q to words T.
     a. Take special care to handle the 90s, 900s and 9000s correctly.
     b. Take special care to handle numbers in 11-19.
  5. Invoke the same algorithm recursively for the remainder R.
  6. Concatenate the result from step 5 to T.
  7. Return T.
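The recursive structure of the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the word tables are an Indian English stand-in, not the 63 DL words of step 1 (which the text does not list); the function name `number_to_words` is hypothetical; only non-negative integers are covered; and the language-specific handling of the 90s, 900s and 9000s in step 4a is not modeled.

```python
# Stand-in lexicon (Indian English), NOT actual Dravidian-language data.
UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {20: "twenty", 30: "thirty", 40: "forty", 50: "fifty",
        60: "sixty", 70: "seventy", 80: "eighty", 90: "ninety"}
# Scales in descending order: crore, lakh, thousand, hundred.
SCALES = [(10_000_000, "crore"), (100_000, "lakh"),
          (1_000, "thousand"), (100, "hundred")]


def number_to_words(n: int) -> str:
    """Recursively convert a non-negative integer to words."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    if n < 20:
        # Direct lookup covers 0-10 and the irregular 11-19 range (step 4b).
        return UNITS[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        word = TENS[tens * 10]
        return word if rest == 0 else f"{word} {UNITS[rest]}"
    for value, name in SCALES:
        q, r = divmod(n, value)                  # step 2: quotient, remainder
        if q == 0:
            continue                             # step 3: try a smaller scale
        head = f"{number_to_words(q)} {name}"    # step 4: quotient to words
        if r == 0:
            return head
        return f"{head} {number_to_words(r)}"    # steps 5-6: recurse and join
    raise AssertionError("unreachable for n >= 100")


print(number_to_words(154999))
```

Swapping the three tables for a DL lexicon, and replacing the space join with a language-specific joining routine, is where the per-language parameterization would enter.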

3.2 Algorithm for Parsing Numbers

Input: string of DL words
Output: floating point number
Algorithm:

  1. Load the list of prefix and suffix strings for all DL number words (63 words in all).
  2. Initialize N to 0.
  3. Create a temporary stack S.
  4. FOR each word W in the input T:
  5. IF W is a stop word (crores, lakhs, thousands, hundreds, tens):
     a. Convert the words in the stack into a value and scale the temporary result, using a helper routine which handles input up to the value 100,0000.
     b. Empty the stack S.
  6. ELSE: push W onto S.
  7. END of the loop started at step 4.
  8. The stack S is usually non-empty at this point; use the same helper routine as in step 5a to obtain the final portion of the number.
  9. The correctly parsed value is stored in N.
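The stack-and-flush idea in the steps above can be sketched in Python as below. Several things here are assumptions for illustration: the lexicon is an Indian English stand-in rather than DL data, the names `stack_value` and `words_to_number` are hypothetical, and "hundred" is folded into the helper instead of being treated as a stop word, a simplification that keeps inputs such as "one hundred crore" correct.

```python
# Stand-in lexicon (Indian English), NOT actual Dravidian-language data.
SMALL = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
SMALL.update({"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
              "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90})
# Scale words that trigger a flush of the stack.
STOP_WORDS = {"crore": 10_000_000, "lakh": 100_000, "thousand": 1_000}


def stack_value(stack: list) -> int:
    """Helper: value of a run of words below one thousand."""
    value = 0
    for word in stack:
        if word == "hundred":
            value = max(value, 1) * 100   # a bare "hundred" counts as 100
        else:
            value += SMALL[word]          # unknown words raise KeyError
    return value


def words_to_number(text: str) -> int:
    """Parse space-separated number words into an integer."""
    n = 0
    stack = []
    for word in text.split():
        if word in STOP_WORDS:
            # Flush: scale the stacked value and add it to the running total.
            n += max(stack_value(stack), 1) * STOP_WORDS[word]
            stack.clear()
        else:
            stack.append(word)
    # Whatever remains on the stack is the final, unscaled portion.
    return n + stack_value(stack)


print(words_to_number("one lakh fifty four thousand nine hundred ninety nine"))
```

The final call to the helper after the loop corresponds to the step that handles the leftover stack contents.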

4 Applications

As with the applications presented in [2], the algorithms of Sections 3.1 and 3.2 can be reused to build TTS generation and calculator applications, parameterized by Dravidian language (DL).

Fig 1: Organization of Speech input and Audio output calculator in each Dravidian Language (DL)
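A calculator of this kind can be sketched as a thin pipeline around a parser and a generator. In the Python sketch below, `spoken_calculate` is a hypothetical name, the speech recognition and audio stages are omitted, and the parser and generator passed in are toy single-digit English stand-ins rather than the DL routines of Section 3.

```python
import operator
from typing import Callable

# Supported operators (an assumption; the text does not enumerate them).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def spoken_calculate(left: str, op: str, right: str,
                     parse: Callable[[str], int],
                     generate: Callable[[int], str]) -> str:
    """Parse both operands, apply the operator, and render the result."""
    result = OPS[op](parse(left), parse(right))
    return generate(result)


# Toy stand-ins for a per-language parser and generator (digits 0-9 only).
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def toy_parse(words: str) -> int:
    return DIGITS.index(words)


def toy_generate(n: int) -> str:
    if not 0 <= n < len(DIGITS):
        raise ValueError("toy generator only covers 0-9")
    return DIGITS[n]


print(spoken_calculate("two", "+", "three", toy_parse, toy_generate))  # five
```

Passing the parser and generator as arguments keeps the calculator logic independent of the language, which mirrors the per-DL organization in the figure.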

5 Summary and Conclusion

We have established algorithms that exploit the symmetry of number-to-word forms in the Dravidian languages Tamil, Malayalam, Kannada and Telugu. We demonstrated a parameterized algorithm for generating and parsing number forms in these languages, and provided a framework for applications such as token-queue systems and spoken calculators that leverage these findings.


  1. Wikipedia on Tamil Numeral Influence, (accessed Nov 14, 2022)
  2. M. Annamalai, S. Mahadevan, “Generation and Parsing of Number to Words in Tamil,” Tamil Internet Conference, 2020.
  3. Rao Vemuri, Freshman Lecture notes from “Learn Telugu and Its Culture,” at UC Davis, Fall 2006.
  4. Malayalam Numbers, (accessed Nov 22, 2022)
  5. Google Translate, (accessed Nov 22, 2022)
