Java and Open-Tamil : Write Tamil Applications using Java

I’m sharing a small example; you can download the whole Java package from GitHub and include it in your desktop, mobile or web app, for free! Example gist follows.

Equivalence of Tamil Numerals to Telugu, Kannada and Malayalam counterparts

Tamil numerals are equivalent, by isomorphism (a single rule that maps every number to its corresponding numeral), to those of Telugu, Kannada and Malayalam, the last of these being almost indistinguishable from Tamil except for prosody. See: Wikipedia on Tamil Numeral Influence
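As a rough illustration of such a single-rule mapping, the sketch below converts decimal digits between the four scripts by shifting Unicode code points. It assumes the modern positional digits 0-9 in each script's Unicode block and does not cover the traditional Tamil signs for ten, hundred and thousand; the names `DIGIT_ZERO` and `convert_digits` are made up for this example and are not part of open-tamil.

```python
# Code point of the digit zero in each script's Unicode block.
# In every one of these blocks the digits 0-9 are laid out contiguously.
DIGIT_ZERO = {
    "tamil": 0x0BE6,
    "telugu": 0x0C66,
    "kannada": 0x0CE6,
    "malayalam": 0x0D66,
}


def convert_digits(text, source, target):
    """Replace every `source`-script digit in `text` with the `target` digit.

    Hypothetical helper for illustration; all other characters pass through.
    """
    table = {
        DIGIT_ZERO[source] + d: DIGIT_ZERO[target] + d
        for d in range(10)
    }
    return text.translate(table)


# Build "123" in Tamil digits from code points, then map it to Malayalam.
tamil_123 = "".join(chr(DIGIT_ZERO["tamil"] + d) for d in (1, 2, 3))
print(convert_digits(tamil_123, "tamil", "malayalam"))
```

The whole conversion is one fixed offset per script pair, which is what makes the mapping an isomorphism for positional digits.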

Tamil Text to Speech Synthesizers – Topical Overview

Several open-source Tamil text-to-speech (TTS) synthesizers are available to date. Google uses one of these behind the scenes; we may wonder which one. Some of these have previously been reviewed by others at Azhagi and elsewhere on the Tamil interwebs.

  1. eSpeak (rule-based synthesizer), GPL licensed (technique: formant / LPC analysis based)
  2. Festival (Univ. of Edinburgh; Tamil voice added in Feb 2015), BSD-like license.
    1. Research carried out by the IISc team led by Prof. A.G. Ramakrishnan
  3. tamil-tts by Prof. Vasu Renganathan, GPL (technique: unit selection based)
  4. Android-tts: hack the English TTS to speak Tamil by transliteration
    1. This is a hack suggested by yours truly;
    2. e.g. to speak the phrase “சும்மா இருக்கியா?” we use the English TTS via the transliteration “Summaa Irukkiyaa?”. Clearly this may be sub-optimal, but it works as a hack.
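The transliteration hack in item 4 can be sketched in a few lines. This is a minimal illustration, not the code of any Android TTS or of open-tamil: the letter tables are deliberately partial (just enough for the demo phrase), the Latin spellings are one possible choice, and the function name `transliterate` is hypothetical.

```python
PULLI = "\u0BCD"  # virama: removes the inherent 'a' from a consonant

# Partial, illustrative tables; a real table would cover every Tamil letter.
VOWELS = {"அ": "a", "ஆ": "aa", "இ": "i", "ஈ": "ii", "உ": "u", "ஊ": "uu"}
VOWEL_SIGNS = {
    "\u0BBE": "aa",  # sign for long a
    "\u0BBF": "i",
    "\u0BC0": "ii",
    "\u0BC1": "u",
    "\u0BC2": "uu",
}
CONSONANTS = {"க": "k", "ச": "s", "ம": "m", "ய": "y", "ர": "r"}


def transliterate(text):
    """Spell Tamil text in Latin letters so an English TTS can read it."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt == PULLI:
                i += 1                      # pure consonant, no vowel sound
            elif nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])
                i += 1                      # consume the vowel sign as well
            else:
                out.append("a")             # inherent vowel
        else:
            # Independent vowels are mapped; anything else passes through.
            out.append(VOWELS.get(ch, ch))
        i += 1
    return "".join(out)


print(transliterate("சும்மா இருக்கியா?").title())  # Summaa Irukkiyaa?
```

The resulting string would then be handed to whatever English TTS call the platform provides; that step is left out here because it depends on the platform.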

Opinion: While there are purported speech synthesizers in Tamil from academic, government (taxpayer) funded projects like the SSN-IIT collaboration, they are not publicly available or easily licensable. It is a situation that needs to be remedied.

Linguistic Aspects: One of the key factors that makes Tamil TTS a relatively easy goal is that the front-end is easily achieved, thanks to the phonemic orthography of the Tamil language; i.e. Tamil is itself a “phonetic language” where the written spelling and the phonetic/spoken forms are identical. This is not so in the case of English and European languages like French, where there are silent letters and exceptions to the rules almost all the time. Tamil has few exceptions, if any.

The back-end of a TTS engine is usually built on LPC analysis or other source-filter separation algorithms, which remain an exciting and continually engaging problem for signal-processing engineers, computer scientists, and computational linguists.
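To give a feel for what LPC analysis computes, here is a minimal sketch of the textbook autocorrelation method with the Levinson-Durbin recursion, in plain Python. It is not taken from eSpeak, Festival or any synthesizer listed above; the function names and the toy test frame are assumptions made for illustration.

```python
def autocorrelation(signal, max_lag):
    """Return [r(0), ..., r(max_lag)] for one finite frame of samples."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(signal, order):
    """Estimate LPC coefficients a[0..order], with a[0] == 1.

    Model: x[n] is predicted as -(a[1]*x[n-1] + ... + a[order]*x[n-order]).
    Returns (coefficients, residual_energy). A silent frame (all zeros)
    would divide by zero and needs to be skipped by the caller.
    """
    r = autocorrelation(signal, order)
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this order (Levinson-Durbin step).
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        previous = a[:]
        for j in range(1, i):
            a[j] = previous[j] + k * previous[i - j]
        a[i] = k
        error *= 1.0 - k * k
    return a, error


# Toy frame: a decaying exponential x[n] = 0.9**n, i.e. x[n] = 0.9 * x[n-1].
frame = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(frame, order=2)
print(coeffs)  # close to [1.0, -0.9, 0.0]
```

In a source-filter view, the coefficients describe the filter (the vocal tract), and what is left after prediction is the source (excitation); a synthesizer runs this relationship in the other direction.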

Further research on Tamil TTS should improve naturalness, and our ability to include these wonderful tools from the previous generation of engineers in our applications for mobile and desktop devices. Google Scholar is your friend; start here!

open-tamil v0.5

(Ref: https://www.flickr.com/photos/xynntii/15742278568/in/photolist-pZ6hY1-4o4BP2-3zDjQc-3zDcQ2-dzekRE-dzekMC-5ADdsd-7baUa2-7beHFC-7beH2u-7baUV8-7baUNR-7baUfg-7beHeU-7baUoK-na3VN2-7baUhM-7HPfMG-73kGgS-77koj2-2HQSyC-7peLBH-8QbbzJ-7fvoSB-9vYyEZ-4G3w5s-5AgGgv-7HPmEs-5AgGaD-5AkZT5-7fvmop-qgPK4y-3LdubW-5uDLC6-dhXtbU-dhXwbb-7baUAR-dhXmqB-7n5oG5-dhXnEY-dhXm5p-5EV9ke-dhXxHM-5EZs4m-7j54eY-dhXnJv-dhXuRM-pVdFZq-dhXwGN-dhXpi6)
Fall / Autumn season in Boston Photo Credit: Xynn Tii on Flickr. (Licensed CC)

New release v0.5

Since February 2015, when v0.4 was released, open-tamil has received contributions from Arulalan, Shrinivasan, Kumaran and me, with several bug fixes and changes to the code. I’m happy to release v0.5 on behalf of our team.

pip install open-tamil

Details follow in the later sections. Happy Fall to those of you in Western countries or the northern latitudes of the Northern Hemisphere: eat some pumpkin pie and have a good Autumn season.

As always, please support us with code, contributions and bug reports.

Happy Hacking.
-Muthu

The list of issues fixed/tracked on GitHub:

Date:   Wed Sep 30 23:20:43 2015 -0400

    #91 : one audio file synthesized for a given number.

commit 8a57d9441ab76466d23feadcafecace8bb80a807
--
Date:   Wed Aug 26 00:11:44 2015 -0400

    permutation generation with filtration predicate - #86

commit a3c00c93bfefa40178553d5fbdacfac22a146c3a
--
Date:   Wed Aug 5 22:24:11 2015 -0400

     Solthiruthi - framework - Trie data structure - pickle capability #37 ;
     Updates to wordgrid to output HTML data

--
Date:   Sat Aug 1 23:31:49 2015 -0400

    1)  Solthiruthi - framework - word play - comprehensive word_split #83
    2)  Solthiruthi - framework - word play - permutagrams #82
    3)  Update unittests
    4)  Add example function
--
Date:   Sat Aug 1 13:16:26 2015 -0400

    1) Redo WordSplitter algorithm #50 with greedy partitioning
    2) Add unittests for English and Tamil word splits
    3) Update Travis YML
--
    4) Unit tests for 2,3
    5) opentamil.__init__ tests will no longer print cruft; reduce cruft on the test points
    6) Redo :  WordSplitter algorithm #50

commit 5dd5432a1ee7d03e499e13a2a9e5e9420cee11ca
--
Date:   Thu Jul 30 22:03:48 2015 -0400

    1)  Solthiruthi - framework - dictionary for English #81
    2)   WordSplitter algorithm #50

commit eb752e4109b059ec5c5f09b34fbf4b513af6493f
--
Date:   Thu Jul 30 00:05:08 2015 -0400

     Solthiruthi - framework - dictionary file is not installed via setup.py #78

commit c6d7c8133e2a692cdef1fe057fd10bd8bea21e1a
--

    added tests to find out the list of rhyming words.
    Ref: Solthiruthi - framework - rhyming words - edhugai/monai #79

commit c66552210742470c1693e7c4e0809bc0beb5e94c
--
    3) Tamil Trie tests go into solthiruthi_tamil_datastore.py with some skip tests
    4) adding test letter_normalized.py
    change #2 is part of class file datastore.py

commit 826d83b29439cdb9971f33a1d4c88ab7b7eabad6
--
    Added predicates (2,3), and anagram function (1).
    Ref: Ref: open-tamil | Anagrams in Tamil VU word list post @ https://ezhillang.wordpress.com/2015/07/27/open-tamil-anagrams-in-tamil-vu-word-list/
    Ref:  Solthiruthi - anagram tools #76

commit 56a0def001d20fef57ae564b099b3f195955c041
--
Date:   Sat Jul 25 23:08:41 2015 -0400

    0) Solthiruthi - framework - word play #75 :
    1) decorator object for Python2.6 exception skip tests.DataDictionaryWithPredicate - dummy interface.
    2) DataDictionaryWithPredicate - dummy interface
--
Date:   Thu Jul 23 22:40:57 2015 -0400

     Solthiruthi - framework - canned dictionary #74

commit d8b203629d5845981c27b1cdd4e0797086cdc40b
--
Date:   Wed Jul 22 01:16:55 2015 -0400

    Solthiruthi framework - permutations / anagrams #73

commit c5e1d91baed1cef49e1a0161c4b35bfdfb395616
--
    2) 63 WAV files for female voice contributed by K. Priya
    3) tamil.numeral.num2str - has option to playback / synthesize a numeral into a audio piece including floating pt numbers.
    Ref:  open-tamil நேரம் படிக்கும் கெடியாரம் #71
    4) to_audio : tamil number / audio synthesizer

--
Date:   Tue Jun 30 21:25:06 2015 -0400

    Merge pull request #72 from atvKumar/master
    
    Added AnuFonts and ShreeLipi
--
Date:   Fri Jun 19 01:34:03 2015 -0400

    1)  Numerals - 1/2, 1/4, sadham (100) and support decimals #69
    2) unit tests for #1

commit dd4eb52322452e091b897cd1bc3d09d848da93d1
--

    0) add Tamil letters and get_letters tests for Java open-tamil packages
    1)  Basic Java package #63

commit e054e2c370813efe8ad4d6479830af7d0a5a5319
--

    1) utf8 - get_letters_elementary_iterable
    2) tests for #1
    get_letters_elementary : split word -> letters reducing uyirmei #62

commit 90e9d3a9b031c1e3000fb2109d76773aa4738e3f
--
Date:   Sat May 23 09:42:21 2015 -0400

    1)  Suffix removal function #47
    2)  tests for #1

commit c1d7ef4f06498f1152273d3c1bbd00df69ec15dc
--
Date:   Sat May 23 01:44:55 2015 -0400

    1) add BadIME tester  Bad IME checking rule #56;
    2) add tests for weird input behavior with get_letters
    3) add tests for #1
    4) Ruby gem push script version bump

--
Date:   Wed May 20 02:06:01 2015 -0400

    1) Tamil Gem - Pavalam #55
    2) first steps toward a Ruby version of open-tamil library
    3) make tamil.rb have a Tamil module with just static methods
--
    1) getAllWordsCount() API for N-gram data store;
    2) added new unittest for N-grams calculation of Gettysburg address. Honest Abe
    3) N-gram frequency analysis of corpus using Tries #54

commit 7e616d853d9c9332c12f43b87645aecaa046d36b
--
    1) add new API for Queue.isempty()
    2) solthiruthi DOM model tokenizer can use re.search and file.readlines() to generate tokens skipping NL and WhiteSpace characters
    3) DTrie -add a count attribute Trie #53

commit 65c6390d3593dc09688a1d7392daf005c2c960f3
--
    1) getAllWordsPrefix() API
    2) tests
    Ref: issue  Solthiruthi - write method getAllWordsPrefix starting with known prefix #44

commit 95ab8bf6e4aa354ef5c0d22c226b94b9bc69d55d
--
    4) MIT license for open-tamil
    
    Ref:  Solthiruthi - heuristic rules #45 (issue)

commit e6b203b7328c7b8e211d35ca45ad1fd416ab0763
--

    1) heuristic rules
    2) tests for #1
    3) dictionary loader utilities

--
    1) datastore - DTrie - in-memory store
    2) Unittests - blazing fast
    3) Solthiruthi - framework - iterable method for DTrie data structure #43
    4)  -do -  DTrie data structure for fast loading #36
    5) getAllWordsHelper fixup for TamilTrie

--
    1) 2721 male names, 1260 female names
     1.1) peyargal.txt clean data
    Ref: ->  Solthiruthi - peyarkal - 4000 கும் மேற்ப்பட்ட தமிழ் பெயர்கள் - Tamil Names #42

commit 7d79bf2ba4aab4c2c228711a137458cec0d441c8
--
    0) word filter by wordlength, words with spaces
    1) Ezhimai - level 0 checker
    Ref: Solthiruthi - framework Level 0 checker #27
    2) Introduce abstract base classes and methods for WordSpeller
    3) Move getidx() function to tamil.utf8 module
    4) bugfix for utf8.tamil_letters
    Ref:  utf8.tamil_letters has agaram repeated twice #41
    5) Add unittest for 4
    6) Add unittest for Ezhimai
--
    2) Wikipedia Word list - 1mil+
    3) Project Madurai word list - 1mil+
    Ref: Issue  solthiruthi - add wikipedia, project madurai wordlist #35

commit 96d8c8eae7aaab84ff2e7ea7a2137352028b10f5
--
    2) add English Trie builder method
    3) unittests for 1, 2.
    Ref: Issue -  Solthiruthi - framework - Trie data structure #36

commit c0917104cc1335e1b40ec05e33f3495a899889a1
--
    1) solthiruthi CLI
    2) tests for solthiruthi CLI using argparse
    Ref:  Solthiruthi - framework - command line interface #34

commit 4b8b187cdfe0503714a0f2de9ff5c737a6cabd57
--
    2) fix up issue where last category of wordlists was not saved
    3) added unittests + data
    Ref:  parser for data on proper nouns #33

commit 9c11e0c2523db5936063dc7e14b88c7d6c99d0e2
--

    1) proper nouns for solthiruthi.
     data on proper nouns - part 1 #32

commit 2eb4a3edf735caeae9a630553881bccc19e85e5c
--
Date:   Fri Apr 24 23:44:30 2015 -0400

    1)  contributing to open-tamil : pangalippugal #31

commit 7e50cb584efe72a320326af61bee8c69fe772488
--
Date:   Sat Apr 4 22:12:50 2015 -0400

    test C-api #3

commit 5ba5cd4ad1966994299ac91d28c6fdfdd75817de
--
Date:   Sat Apr 4 22:06:56 2015 -0400

    attempt #2 / breakout 2 sections

commit 249c7734327002931dff6f2bd31cc369814424d6
--
Date:   Sat Apr 4 22:02:52 2015 -0400

    attempt #2

commit dd0cced42cc7e9f28ddb6d61f60cfb1c66850bf1
--
Date:   Sat Apr 4 11:43:44 2015 -0400

     Fast Tamil Unicode page detection algorithm #23

commit cf8fc37f5a5fdca8766b82f0ca585c67f59ccb3b
--
    1) solthiruthi pkging issue fix
    
    2)  tamil.utf8.istamil_alnum - bug #21
    3)   tamil.utf8.compare_words_lexicographic - Python3 support is missing #20

commit 87a9cce569be56bd4c12de3c80f14d438f515918
--
Date:   Sat Feb 14 12:21:37 2015 -0500

     ஶ் - sanskrit letter missing in open-tamil #11

commit 35c0565d93deff38d5ba8e6bfe5843caf5649ea9
--
Date:   Sat Feb 14 11:57:57 2015 -0500

     ksha க்ஷ் series of Tamil grantha/sanskrit letters missing from tamil.utf8 module of open-tamil. #10

commit f35f77cfb4484b6c42734aa8dcf43842d45b19a4
--
Date:   Sat Feb 7 00:35:46 2015 -0500

    update Santhi tests; move file to right unittest.  Move unit tests to right folders #9

commit 60c5296eaa3b7430d04647886486becc72737883
--
    4) add num2tamilstr_american - to print numerals in American notation of mil, bi and trillion in 1000x multiples.
    5) Update num2tamilstr for subtle bugs.
    6) Added tests for #1, and #2
    7) README.md : Open-Tamil notes on tamil.numeral class and methods.
    8) Notes on Python 2-3 support as goal for project
--

    1) add Tamil regexp module for easier regular expression processing in Tamil
    2) add unit tests for #1. Move unittest test/santhi_rules.py ->
    tamil_regexp.py
    3) add documentation and updates for Tamil UTF-8 to include grantha letters more into the fold
--
Date:   Fri Dec 19 17:15:16 2014 -0800

    Merge pull request #8 from arulalant/master
    
    Doc and Grammar Utils - Initial Setup
--
Date:   Sat Aug 16 09:40:43 2014 -0400

    bump v#

commit b013f4601b22c6e4d99b475030c9a690d718db1a
--
Date:   Sat Aug 16 09:37:24 2014 -0400

    Merge pull request #6 from arulalant/master
    
    Discovery of import tamil error in pip module; this is not complete fix still.
--
Date:   Fri Aug 15 17:42:35 2014 -0400

    Merge pull request #4 from arulalant/master
    
    Added 5 new Tamil encodes in txt2unicode module
--
Date:   Fri Aug 15 14:36:54 2014 -0400

    update v.# make tag and set to release

commit d00b615c1af981c0df031ae5d823444be3523dc6
--
Date:   Mon Aug 11 00:23:04 2014 -0400

    Merge pull request #2 from arulalant/master
    
    Added txt2unicode, txt2ipa modules
--
Date:   Thu Jul 31 00:20:37 2014 -0400

    Merge pull request #1 from tshrinivasan/master
    
    added example file for tscii2utf8 convertor
--
Date:   Sun Jan 26 12:34:10 2014 -0500

    update v #

commit f15c273bbb6a1ad9827f7e6f1fc0e5678fbbdf1a
--
Date:   Tue Jan 14 22:32:34 2014 -0500

    flight manifest - like; add v # bump

commit e743f76d36d2989f9b2a0ee2f30fea2c99211f51
--
Date:   Tue Jan 14 22:26:39 2014 -0500

    bump v #

commit 7fa45f77409076d5906c547624975d0ee5f4c6a7
--
Date:   Tue Nov 12 22:16:53 2013 -0500

    # TSCII - Tamil ASCII, like CJK, wide-chars in Windows for Cyrillic, and other
    # encodings basically map the extended ASCII (128-255) range of the 8-byte
    # storage into various code points.
    #
    # Among various Tamil encodings TSCII was an advancement over font-table
    # encoding, in the pre-Unicode days.

commit 87cd05a2ec071b9f08908965e42019e0cc7c7f01

The full changelog follows:

commit 26035589f14563cd47cf24ae2b5a09ac3942ef71
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Oct 9 21:06:28 2015 -0400

    v0.5 : open-tamil

commit 9875fb7e002c2c40d20155e684999d07ca735edd
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Oct 9 20:56:59 2015 -0400

    1) units_10.wav for female voice
    2) num2tamilstr_american/num2tamilstr dont need the iteration
       which is misleading

commit c196e6ebad501ba33cfe15309ebb2f0d75e422e6
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Oct 9 20:56:35 2015 -0400

    1) units_10.wav for female voice
    2) num2tamilstr_american/num2tamilstr dont need the iteration
       which is misleading

commit 7985af9fc358aecd66f14674524fa541d5e096ba
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Oct 9 20:53:12 2015 -0400

    1) units_10.wav for female voice
    2) num2tamilstr_american/num2tamilstr dont need the iteration
       which is misleading

commit 44f6a520889d8978a541404134d39da043f9f744
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Sep 30 23:48:16 2015 -0400

    say time in numbers

commit 9ef355f8c0d735a42c84ffcc0ce9cec9ba180cbf
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Sep 30 23:20:43 2015 -0400

    #91 : one audio file synthesized for a given number.

commit 8a57d9441ab76466d23feadcafecace8bb80a807
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Sep 30 22:47:51 2015 -0400

    pesum kediyaram : update with correct copy of K Priya voiced pulli.wav

commit 50588d015aa2a6d557199eebf1ae1beb370bfd82
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Sep 12 07:37:50 2015 +0530

    0) additional tests

commit aa550b2012f6239cdcc04ff034f052dc605faf5d
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Sep 8 16:58:11 2015 +0530

    1) solthiruthi/morphological.py : add RemovePrefix class, and take it for a test ride
    2) tests/solthiruthi_suffixremoval.py : checks +ve test and neutral case for prefix removal
    3) new data on sirpangal to plural suffix removal.

commit d300fac29f1b9bd16cc410a8015644f12a9069a6
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Sep 8 15:51:19 2015 +0530

    1) solthiruthi/morphological.py : use a suffix remove + replacement if there is a suffix dictionary provided. e.g. class RemovePluralSuffix;
    2) Tests for item-1
    3) Rules are borrowed from Damodaran implementation of the Tamil-Stemmer rule for remove_plural_suffix.
    Commits on the road / India Sep 2015

commit 37e4a78e71e4545a3c46a93aee4f6331ae82d285
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Sep 4 22:41:30 2015 -0400

    1) Add plurals stripper
    2) Starting pieces of web based (Django) interface for solthiruthi

commit deb6ae342ea3df5626f418f07a117ee2d74c9ebb
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Aug 26 00:11:44 2015 -0400

    permutation generation with filtration predicate - #86

commit a3c00c93bfefa40178553d5fbdacfac22a146c3a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Aug 10 21:53:30 2015 -0400

    1) Update plaindrome code to iterate from 0:floor(N/2) only since we compare left end with right end.
    2) Add method *all_plaindromes* to wordutils module to find the list of words we want in a dictionary
    3) Update with Python 2-3 fix for Integer division

commit 873e44f05955e4186c28c85434537bc5c746237a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Aug 5 22:24:11 2015 -0400

     Solthiruthi - framework - Trie data structure - pickle capability #37 ;
     Updates to wordgrid to output HTML data

commit 4bc5cd185ea2e219ce2bc06fd5f1d4b855ad5bc7
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Aug 1 23:31:49 2015 -0400

    1)  Solthiruthi - framework - word play - comprehensive word_split #83
    2)  Solthiruthi - framework - word play - permutagrams #82
    3)  Update unittests
    4)  Add example function
    5) Unittest removes exception for string inputs

commit 3beeca74e700c07c0f31b46612ddeb68aa500bc6
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Aug 1 13:16:26 2015 -0400

    1) Redo WordSplitter algorithm #50 with greedy partitioning
    2) Add unittests for English and Tamil word splits
    3) Update Travis YML

commit 8201bd070ac9a500d331a9f5079422664e9f2841
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Jul 31 03:25:17 2015 -0400

    1) update wordutils.greedy_split : split words into right sized longest portions
       added unittests
    2) greedy_split required adding a hasWordPrefix() method to Trie data store element
    3) Dictionary provides new methods, getWordsStartingWith, and hasWordsStartingWith using Trie structure
    4) Unit tests for 2,3
    5) opentamil.__init__ tests will no longer print cruft; reduce cruft on the test points
    6) Redo :  WordSplitter algorithm #50

commit 5dd5432a1ee7d03e499e13a2a9e5e9420cee11ca
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Jul 30 22:03:48 2015 -0400

    1)  Solthiruthi - framework - dictionary for English #81
    2)   WordSplitter algorithm #50

commit eb752e4109b059ec5c5f09b34fbf4b513af6493f
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Jul 30 00:05:08 2015 -0400

     Solthiruthi - framework - dictionary file is not installed via setup.py #78

commit c6d7c8133e2a692cdef1fe057fd10bd8bea21e1a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Jul 29 00:17:38 2015 -0400

    added tests to find out the list of rhyming words.
    Ref: Solthiruthi - framework - rhyming words - edhugai/monai #79

commit c66552210742470c1693e7c4e0809bc0beb5e94c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jul 28 23:19:53 2015 -0400

    fixup Python 2.6, Python 3 related test issues.
    Update Travis.yml file;

commit 44675745e9e5f7f5820edec050ede7d48beb8b56
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jul 28 22:54:26 2015 -0400

    finalize Reverse Trie interface;
    data insertion and retrieval happens in regular word order, including getWordsEndingWith() interface; however the internal storage match in the reverse order in a Trie.

commit 3dc77e2b7fa05eab7ba244424645ec50ca5da175
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jul 28 21:37:51 2015 -0400

    1) add reverse trie for ends-with match
    2) more tests for reverse_word in utf8 package
    3) Tamil Trie tests go into solthiruthi_tamil_datastore.py with some skip tests
    4) adding test letter_normalized.py
    change #2 is part of class file datastore.py

commit 826d83b29439cdb9971f33a1d4c88ab7b7eabad6
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jul 28 09:18:10 2015 -0400

    Tamil / English suffix matching via tries

commit a044e37e39d3240d752030a5da9a0da9931ca5ff
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Jul 27 01:10:42 2015 -0400

    1) anagrams_in_dictionary
    2) is_anagram
    3) is_palindrome
    
    Added predicates (2,3), and anagram function (1).
    Ref: Ref: open-tamil | Anagrams in Tamil VU word list post @ https://ezhillang.wordpress.com/2015/07/27/open-tamil-anagrams-in-tamil-vu-word-list/
    Ref:  Solthiruthi - anagram tools #76

commit 56a0def001d20fef57ae564b099b3f195955c041
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Jul 26 23:30:15 2015 -0400

    is_palindrome, is_anagram API added to tamil.wordutils

commit a4e56bd7c0e1381f2c5895393994e32912a55b88
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Jul 26 15:31:02 2015 -0400

    savaal : blog post @ https://ezhillang.wordpress.com/2015/07/26/open-tamil-%E0%AE%9A%E0%AE%B5%E0%AE%BE%E0%AE%B5%E0%AE%BE%E0%AE%B2%E0%AF%8D%E0%AE%B5%E0%AE%BE%E0%AE%9A%E0%AE%B2%E0%AF%8D%E0%AE%9A%E0%AE%B5%E0%AE%BE%E0%AE%B2%E0%AF%8D/

commit b381489ff46c3815011b677a471af92e0471a809
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Jul 25 23:08:41 2015 -0400

    0) Solthiruthi - framework - word play #75 :
    1) decorator object for Python2.6 exception skip tests.DataDictionaryWithPredicate - dummy interface.
    2) DataDictionaryWithPredicate - dummy interface
    3) Cleanup anagrams interface, and tamil_permutations interface + tests
    4) datastore.py : move demo to a test
    5) anagrams / palindrome / combinations / generate all valid sub words

commit 36a4ee65445757e89d73abbcc33c675e6fa97eba
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Jul 23 22:40:57 2015 -0400

     Solthiruthi - framework - canned dictionary #74

commit d8b203629d5845981c27b1cdd4e0797086cdc40b
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Jul 22 01:16:55 2015 -0400

    Solthiruthi framework - permutations / anagrams #73

commit c5e1d91baed1cef49e1a0161c4b35bfdfb395616
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Jul 13 00:36:20 2015 -0400

    pulli audio file from K Priya

commit edaaac5dffe80cab18ee82cd7021921fbdb47bc7
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Jul 9 00:04:35 2015 -0400

    Python3 rounding issues fixup for testing

commit 23f4e9d910fc15f5ef253a340fc8107316d73159
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Jul 8 23:24:12 2015 -0400

    0) License info
    1) PYTHON2_6 test skip
    2) 63 WAV files for female voice contributed by K. Priya
    3) tamil.numeral.num2str - has option to playback / synthesize a numeral into a audio piece including floating pt numbers.
    Ref:  open-tamil நேரம் படிக்கும் கெடியாரம் #71
    4) to_audio : tamil number / audio synthesizer

commit 8fa80ec6c51870d5ada820893e183d8cac780dcd
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jul 7 00:33:29 2015 -0400

    Python2.6 should pass tests since @unittest.skipIf decorator is not present in it

commit 18175a7a81cc0e13d492c6328b358dc9641c4c10
Merge: a3a5203 0dd9b68
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jun 30 21:54:16 2015 -0400

    Merge branch 'master' of https://github.com/arcturusannamalai/open-tamil

commit a3a5203db9833a207b2ad27a8646dc0e0cba52b1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Jun 30 21:50:50 2015 -0400

    Male voice

commit ee76e796f51f8ee3b49699a04da4e6ef3601ce5d
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Jun 29 00:59:06 2015 -0400

    1) Added audio files for the read-out clock
    2) sorkal, to_audio
    3) to_audio / winsound.PlaySound with right flags on Windows
    4) pizhai_patterns

commit 0dd9b6847adf50d6322f64ecb467f442eda15230
Merge: 69aa0fb 8c3f0a6
Author: Muthiah Annamalai <arcturusannamalai@gmail.com>
Date:   Tue Jun 30 21:25:06 2015 -0400

    Merge pull request #72 from atvKumar/master
    
    Added AnuFonts and ShreeLipi

commit 8c3f0a684954c92780d69bcdec181fd517202428
Author: Kumaran <atv.kumar@gmail.com>
Date:   Mon Jun 29 17:24:10 2015 +0530

    Added AnuFonts and ShreeLipi
    
    Added Support for AnuFonts (Win/Mac) & ShreeLipi (AVID - Win)

commit 69aa0fb40a6685a8d63cf6d770d3d43e180a70ec
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Jun 28 22:57:18 2015 -0400

    words for pesum_kediyaram - 63 words totally

commit 5e59c056956ccb0eca8df7c1fb091cc599fbad34
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Jun 28 22:48:04 2015 -0400

    init-update;
    pesum_kediyaram/sorkal.py, and README

commit fb76cd73491977ac1f1232516755d2c93876644d
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Jun 19 01:34:03 2015 -0400

    1)  Numerals - 1/2, 1/4, sadham (100) and support decimals #69
    2) unit tests for #1

commit dd4eb52322452e091b897cd1bc3d09d848da93d1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Jun 18 09:29:49 2015 -0400

    1) numeral - handle negative numbers.
    2) test 1, and test exceptions for first time

commit 0b30a0ed6911ac8c90379d65bbc0f5eb12a8c128
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Jun 7 13:43:39 2015 -0400

    Add new test for "கட்டளை" word length in Java, following the blog post @ https://ezhillang.wordpress.com/2015/06/07/open-tamil-java-usage/

commit 5cdfd9a916b20409ee8b01ed981dff6139d22f8c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Jun 6 20:36:55 2015 -0400

    0) add Tamil letters and get_letters tests for Java open-tamil packages
    1)  Basic Java package #63

commit e054e2c370813efe8ad4d6479830af7d0a5a5319
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Jun 6 16:35:46 2015 -0400

    add open-tamil for Java

commit 9c8093e17960327b5ff70a5a77e03bef1894b78c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Jun 4 00:36:26 2015 -0400

    1) utf8 - get_letters_elementary_iterable
    2) tests for #1
    get_letters_elementary : split word -> letters reducing uyirmei #62

commit 90e9d3a9b031c1e3000fb2109d76773aa4738e3f
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 27 01:42:03 2015 -0400

    While Ruby 1.93 works locally on Ubuntu it does not on Travis-CI server. So unbundling the server for unitt-testing

commit a142a796777602cfa4e56a1ddb6853b2ae2d53a0
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 27 01:24:45 2015 -0400

    utf-8 encoding for Ruby 1.93 or later

commit 261f9848929c7db40fd0c6adc4c8bf337783396a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 27 01:17:43 2015 -0400

    ruby package now works and tests with Ruby 1.93 or later

commit 7c8bdad2e74c5d731ef042169cbd00c92fd80d19
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 27 01:12:08 2015 -0400

    travis.yml to run Ruby tests

commit b40105b68a9df376dcd140db4642e43a24e973a2
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat May 23 09:42:21 2015 -0400

    1)  Suffix removal function #47
    2)  tests for #1

commit c1d7ef4f06498f1152273d3c1bbd00df69ec15dc
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat May 23 01:44:55 2015 -0400

    1) add BadIME tester  Bad IME checking rule #56;
    2) add tests for weird input behavior with get_letters
    3) add tests for #1
    4) Ruby gem push script version bump

commit d02cc9f09575d14baccb4c1d771979570a2344be
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 20 21:11:45 2015 -0400

    updated gemspec file, and deleted binary gem

commit 59ab2e10db2819a5ca89f0e465b413b6f43bacab
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 20 02:06:01 2015 -0400

    1) Tamil Gem - Pavalam #55
    2) first steps toward a Ruby version of open-tamil library
    3) make tamil.rb have a Tamil module with just static methods
    4) Tamil.get_letters made to work for Ruby.
    5) Added unittests
    6) Fixup other Module Ruby-isms

commit 1bc9378272a638a95d69cd66be9759a87c146e93
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon May 18 20:45:27 2015 -0400

    1) getAllWordsCount() API for N-gram data store;
    2) added new unittest for N-grams calculation of Gettysburg address. Honest Abe
    3) N-gram frequency analysis of corpus using Tries #54

commit 7e616d853d9c9332c12f43b87645aecaa046d36b
Merge: 773eb56 65c6390
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat May 16 20:53:16 2015 -0400

    Merge branch 'master' of https://github.com/arcturusannamalai/open-tamil

commit 773eb56b25ab6d29ae770e2736d08c4e08e2fad0
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu May 14 08:54:42 2015 -0400

    1) add new API for Queue.isempty()
    2) solthiruthi DOM model tokenizer can use re.search and file.readlines() to generate tokens skipping NL and WhiteSpace characters
    3) DTrie -add a count attribute Trie #53

commit 65c6390d3593dc09688a1d7392daf005c2c960f3
Merge: b0fca1c 3f85383
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri May 15 19:48:48 2015 -0400

    Merge branch 'master' of https://github.com/arcturusannamalai/open-tamil

commit b0fca1cd99bb96a3c6702806b6e6f8cdfcf9bf1b
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri May 15 19:48:43 2015 -0400

    agarathi - class interface tuning

commit 3f85383abf6c877a5678460c4c0449f88989944b
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu May 14 22:07:16 2015 -0400

    flake8 recommended  trim cruft lines on datastore.py

commit 5ddc312536cab1c657dcbc3332c5380953f6d0b1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed May 13 07:26:24 2015 -0400

    1) Queue - FIFO class implemented on a list by controlled interface
    2) Tests for Queue. Head of queue is [0], Tail of Queue is [LEN-1] or [-1]
    3) Queue peek @ head of queue; API.
    4) New tests for peek() method, and words inserted into Queue
    
    solthiruthi - DOM -
    1) Entity as abstract class with WordEntity, NonEntity as concrete objects
    2) DOM model uses a Document class with Queue of tokens of type derived from Entity
    3) Basic test into a new file solthiruthi_dom.py
    4) solthiruthi/dom.py - make Document inherited from Queue
    5) tests/solthiruthi_dom.py - add test for Document and isWord() method of WordEntity

commit 2c48e90ed1af22a392e0e21c817f04121dcba740
Merge: addb26b c1a40d7
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon May 11 06:23:03 2015 -0400

    Merge branch 'master' of https://github.com/arcturusannamalai/open-tamil

commit c1a40d7f80cc49dd9b26f93e2b47464b997c493e
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun May 10 11:13:28 2015 -0400

    frequency counter instead of just rep-2 detector

commit 710e32690ef84cce1107c1ae28eb454de7ea698c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu May 7 03:44:13 2015 -0400

    1) getAllWordsPrefix() API
    2) tests
    Ref: issue  Solthiruthi - write method getAllWordsPrefix starting with known prefix #44

commit 95ab8bf6e4aa354ef5c0d22c226b94b9bc69d55d
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu May 7 01:34:53 2015 -0400

    1) added rules for RepeatedLetters, AdjacentConsonants
    2) refactored code for all hierarchy/Rules derived classes
    3) updated tests
    4) MIT license for open-tamil
    
    Ref:  Solthiruthi - heuristic rules #45 (issue)

commit e6b203b7328c7b8e211d35ca45ad1fd416ab0763
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu May 7 00:49:04 2015 -0400

    1) heuristic rules
    2) tests for #1
    3) dictionary loader utilities

commit 4677f071e283e566a2d6069e0b52dc7711d347d9
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat May 2 12:14:53 2015 -0400

    tamilvu
    projmad
    wikipedia
    
    use names of dictionary resource.

commit addb26bda4846efcf71fc02c60928458baff1a30
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Apr 30 18:35:09 2015 -0400

    1) use frozenset/set in get_letters for faster search
    2) update unittests

commit e9f22aca048d05f2405c5534bc7220ddb9449572
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Apr 29 23:15:35 2015 -0400

    1) use iterators in Python 3 instead of map which is iterable
    2) update tests in similar way for test data to be list instead

commit 7853dcee76cbdf6db961437e4a7bf59fc80f961a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Apr 29 22:58:24 2015 -0400

    1) tamil/utf8.py - donot print random text for errors; use exceptions instead
    2) python3 compatibility for word frequency data

commit 6e6371105c040d13b5d86cb25a61028cb8eff35d
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Apr 29 01:51:53 2015 -0400

    1) datastore - DTrie - in-memory store
    2) Unittests - blazing fast
    3) Solthiruthi - framework - iterable method for DTrie data structure #43
    4)  -do -  DTrie data structure for fast loading #36
    5) getAllWordsHelper fixup for TamilTrie

commit 200cc8b2e5c3f87eb934badea8e2fc6f1ba21fc5
Merge: 0278ece 55bea09
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Apr 28 23:44:58 2015 -0400

    Merge branch 'master' of https://github.com/arcturusannamalai/open-tamil

commit 0278ecebc1c5e5320559b721cae6898c66d8adfc
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Apr 28 23:42:35 2015 -0400

    verb classes data structures from Dr. Rajams collection

commit 55bea094bec93e72feb916b31952b3a2feb4895d
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Apr 28 20:54:37 2015 -0400

    2.6 compliance

commit df2c865d311928f41796f587417675ea06c6fe4c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Apr 28 20:45:14 2015 -0400

    1) unpack dictionary data for development

commit ba69384b2d2bebf967f71557ecd4f89ff8820c8b
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Tue Apr 28 20:38:59 2015 -0400

    1) use resource class for refering to data
    2) solthiruthi.resources has dictionary pointing from name -> data.

commit 2ca48e5b1af1eaf351ba4000592546b6fc55d57a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Apr 27 21:07:15 2015 -0400

    1) 2721 male names, 1260 female names
     1.1) peyargal.txt clean data
    Ref: ->  Solthiruthi - peyarkal - 4000 கும் மேற்ப்பட்ட தமிழ் பெயர்கள் - Tamil Names #42

commit 7d79bf2ba4aab4c2c228711a137458cec0d441c8
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 26 21:44:45 2015 -0400

    0) word filter by wordlength, words with spaces
    1) Ezhimai - level 0 checker
    Ref: Solthiruthi - framework Level 0 checker #27
    2) Introduce abstract base classes and methods for WordSpeller
    3) Move getidx() function to tamil.utf8 module
    4) bugfix for utf8.tamil_letters
    Ref:  utf8.tamil_letters has agaram repeated twice #41
    5) Add unittest for 4
    6) Add unittest for Ezhimai
    7) Ezhimai rel import of WordSpeller interface

commit 485488a86fe108168a4af1f254925354942b1c37
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 26 20:25:52 2015 -0400

    1) Tamil VU Dictionary wordlist - 63896
    2) Wikipedia Word list - 1mil+
    3) Project Madurai word list - 1mil+
    Ref: Issue  solthiruthi - add wikipedia, project madurai wordlist #35

commit 96d8c8eae7aaab84ff2e7ea7a2137352028b10f5
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 26 16:21:01 2015 -0400

    1) add Tamil Trie test point.
    2) print() statement fixes for Windows terminal

commit f71e0188e70bbc28ff8a023967d9ee18cc8e0205
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 26 14:47:36 2015 -0400

    1) add method to list words in Trie, following DFS
    2) add English Trie builder method
    3) unittests for 1, 2.
    Ref: Issue -  Solthiruthi - framework - Trie data structure #36

commit c0917104cc1335e1b40ec05e33f3495a899889a1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 26 05:08:14 2015 -0400

    Trie data store and validation on large wordlist quickly

commit 9a9bd5c5a860d3ca64fba919f58d726ec4b8fefe
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 26 01:31:38 2015 -0400

    1) solthiruthi CLI
    2) tests for solthiruthi CLI using argparse
    Ref:  Solthiruthi - framework - command line interface #34

commit 4b8b187cdfe0503714a0f2de9ff5c737a6cabd57
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 25 15:07:00 2015 -0400

    1) update data_parser algorithm to work in file with 1 category only
    2) fix up issue where last category of wordlists was not saved
    3) added unittests + data
    Ref:  parser for data on proper nouns #33

commit 9c11e0c2523db5936063dc7e14b88c7d6c99d0e2
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 25 11:19:17 2015 -0400

    1) data for proper nouns compiled from sources
    2) gitignore local data/spreadsheets

commit 74e335f48dd5a358e878f42ef0803d02921e9a31
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 25 00:57:35 2015 -0400

    1) proper nouns for solthiruthi.
     data on proper nouns - part 1 #32

commit 2eb4a3edf735caeae9a630553881bccc19e85e5c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Apr 24 23:44:30 2015 -0400

    1)  contributing to open-tamil : pangalippugal #31

commit 7e50cb584efe72a320326af61bee8c69fe772488
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Apr 24 23:18:47 2015 -0400

    1) move refs to solthiruthi/notes;
    2) make data directory for word lists - wikipedia word list, proj mad etc

commit bb556ff4c9b172ca1d8181eb91d9ffe2cc35f782
Author: Muthiah Annamalai <arcturusannamalai@gmail.com>
Date:   Wed Apr 22 22:27:02 2015 -0400

    Update references.txt
    
    misc references

commit 94924859960fa68f23e58a1471e5cb89a53a68a0
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Apr 22 22:13:13 2015 -0400

    notes on solthiruthi

commit e92035ae7a55489affea2603e3c2f35619f8f6f7
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Apr 20 23:09:21 2015 -0400

    update .travis.yml

commit d31a00b70f75fa653f4accf7d7a3ad2f22b98d67
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Apr 20 22:35:13 2015 -0400

    1) add coverage testing\
    2) integration with coverall/coverall.io
    3) coverage info

commit 7632b96488a90a61ad8a90db82163db598c8a59a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Apr 12 14:08:43 2015 -0400

    1. to_unicode_repr : 2-3 compatible
    2. update algorithm for get_words, get_words_iterable, get_tamil_words;
    3. update unittests
    4. remove deadcode/comment it out in the examples

commit 5a612b21a74f225715176c2fe87f19526d557f37
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Apr 6 22:55:03 2015 -0400

    1) runnosetests with nose pkg
    2) runcoverage for particular test points with coverage pkg
    3) update demo tests DemoTest.py to pass on Windows too
    4) update of ngram tests with explicit imports
    5) more coverage for tamil.utf8

commit f8e0363dee57cfb7e2c8debbe8cc37aa6c1bb760
Merge: cf8fc37 fc46cbb
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Apr 6 21:18:28 2015 -0400

    Merge branch 'master' of https://github.com/arcturusannamalai/open-tamil

commit fc46cbb74cee6f4dc3a9917444db759a16a285d4
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 22:12:50 2015 -0400

    test C-api #3

commit 5ba5cd4ad1966994299ac91d28c6fdfdd75817de
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 22:06:56 2015 -0400

    attempt #2 / breakout 2 sections

commit 249c7734327002931dff6f2bd31cc369814424d6
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 22:02:52 2015 -0400

    attempt #2

commit dd0cced42cc7e9f28ddb6d61f60cfb1c66850bf1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 21:57:10 2015 -0400

    travis-YML try to run C-tests

commit 376108ea0743d4696673e6638201e9d00cad1ec9
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 21:56:19 2015 -0400

    travis-YML try to run C-tests

commit 86e0e560d3b64e815eb4c4cf5352a441accd04d6
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 21:51:53 2015 -0400

    travis-YML try to run C-tests

commit 5a593f5c93b352a6074a1637be045a1ec8ad4544
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 19:16:43 2015 -0400

    2.6 tests

commit 253275b8d22e90b14763fc1acc3ca356a7425b38
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 11:56:12 2015 -0400

    1) solpattiyal to use generators for better memory footprint
    2) solpattiyal - exception handling in batch job
    3) TODO - parallelizing, and filter by frequency / wordlengths or both
    4) TODO - parallelize using multiprocessing/pools

commit 50fa14ed0fdc4a76c38ef629256ea5fd3c34df62
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 11:47:40 2015 -0400

    @msathia

commit a32028e3bf883e707a13d41f832a64ea6c1d547a
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Apr 4 11:43:44 2015 -0400

     Fast Tamil Unicode page detection algorithm #23

commit cf8fc37f5a5fdca8766b82f0ca585c67f59ccb3b
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Apr 1 23:48:06 2015 -0400

    tamil.js : format keys to be modern

commit 8dc3742fb904fe84070d1b070748e2fd451ce1f1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Mon Mar 30 01:13:46 2015 -0400

    solpattiyal - documentation string
    wordlist - add python3 compatibility, add richmond.txt as demo file

commit 65d02329bdd76ab8dbbd3c3db11e246a6960ec10
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Mar 28 22:37:21 2015 -0400

    install from pip

commit c367dad3d0305c27b5c531e8f46be257ef362c45
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Fri Mar 27 02:12:33 2015 -0400

    0) Add (C) MA
    1) ngram - word to ngram generator
    2) add tests in arunchorporul

commit c6bde119cf3651f94416b3b71d5a119e7837e8a1
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Thu Mar 26 21:11:35 2015 -0400

    1) solthiruthi pkging issue fix
    
    2)  tamil.utf8.istamil_alnum - bug #21
    3)   tamil.utf8.compare_words_lexicographic - Python3 support is missing #20

commit 87a9cce569be56bd4c12de3c80f14d438f515918
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Mar 22 01:29:35 2015 -0400

    1) spelling algorithm - edit distances
    2) exception handling done correctly - fname non-existent
    3) solthiruthi package - norvig_suggestor
    
    4) norvig_suggestor : test point for item 3

commit 74f9d801324232f5ccb8753837b8c27d9c6ab8ed
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sat Mar 21 16:40:39 2015 -0400

    1) Edit distance - Dice Sorenson metric for similarity calculation
    
    2) Jaccard distance - the actual metric - inverse of Dice-Sorenson coeff
    
    3) Levensteins edit distance.
    4) Tests for Levenshtein edit distance with equal weights

commit bb39235bfd1301658459051180f72de0ff190d4e
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Mar 8 17:57:36 2015 -0400

    0) VERSION - test pt update
    1) add new test for TSCII conversion.
    2) move demo -> test.
    3) update test point in letter_tests for python3

commit a1319e298b05e11156c1a768f90b2ed5dfbe17e0
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Sun Mar 8 17:18:24 2015 -0400

    add VERSION information

commit e9650053e060077603aba315ee6f4bd35190826c
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Feb 25 09:45:47 2015 -0500

    1) tscii2utf8.py : works for both Python3, Python2, and redirects to file on console as UTF-8 data.
    2) add get_letters test on frog mating rituals
    3) add test for updated TSCI -> UTF-8 converter

commit 8397f0ebad6e13c77cc39a171882c69780e017ad
Author: Muthiah Annamalai <ezhillang@gmail.com>
Date:   Wed Feb 25 08:24:28 2015 -0500

    1. Update TSCII list redefine to list extend.
    2. More unittest - arivuri

A Collection of Algorithms Books in Tamil

We need to create a collection of books on algorithms in Tamil. Engineers from the community should contribute to it.

Alan M. Turing : Father of Modern Computing Theory ( http://en.wikipedia.org/wiki/Alan_Turing )

It should cover the following topics.

0. GCD, Factorial
1. Binary Search
2. Sorting
3. Recursion
4. Graph notation
5. DFS
6. BFS
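
As a taste of what the first few chapters might contain, here is a minimal sketch of GCD, factorial and binary search in Python. The function names are my own choice for illustration; they are not part of open-tamil or of any existing book.

```python
def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm."""
    while b:
        # Replace (a, b) by (b, a mod b) until the remainder is zero.
        a, b = b, a % b
    return abs(a)


def factorial(n):
    """Product 1 * 2 * ... * n, computed iteratively."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1
```

Binary search assumes the input list is already sorted; on unsorted input the result is meaningless.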

It should also cover data structures.

0. Stacks
1. Queues
2. Linked lists
3. Binary Trees
4. Graphs
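
For the first two items, a short sketch built on a plain Python list could look like the following. The class and method names here are hypothetical and chosen for illustration only; this is not the Queue class from the open-tamil sources. The queue keeps its head at index 0 and its tail at index -1.

```python
class Stack:
    """Last-in, first-out container built on a list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        # Removes and returns the most recently pushed item.
        return self._items.pop()

    def __len__(self):
        return len(self._items)


class Queue:
    """First-in, first-out container; head is [0], tail is [-1]."""

    def __init__(self):
        self._items = []

    def insert(self, item):
        # New items join at the tail.
        self._items.append(item)

    def remove(self):
        # Items leave from the head.
        return self._items.pop(0)

    def peek(self):
        # Look at the head without removing it.
        return self._items[0]

    def __len__(self):
        return len(self._items)
```

Removing from the front of a list costs time proportional to its length, so for large queues `collections.deque` from the standard library is the better choice; the list version is kept here because it is easier to explain.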

We can choose either GitHub or Wikibooks as the site to host it: https://github.com/thamizha/ezhil-book

The examples can also be written in the Ezhil language.