Open-Tamil user commands

We have often run into the same problem with open-tamil: it has many utilities, but almost none of them are available as ready-to-use functions or commands out of the box. It has very much been a developer tool rather than a tool for users or informed laypeople.

A quick fix is to add the example Python scripts to the default install paths along with the open-tamil installation [which is still as simple as '$ pip install --upgrade open-tamil']. This makes the following commands available:

1. tamilphonetic - convert English input to Tamil text
2. tamilwordfilter - keep only the Tamil words from all the input text data
3. tamilurlfilter - keep only the Tamil text from the input website data
4. tamiltscii2utf8 - convert the encoding of an input file from TSCII to UTF-8
5. tamilwordgrid - generate a crossword from Tamil input text and write it to an output.html file
6. tamilwordcount - like the UNIX wc program, but for Tamil
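
To give a feel for what a filter and a word count over Tamil text involve, here is a minimal sketch using only the Python standard library. It is not the open-tamil implementation: the function names tamil_words and tamil_word_count are hypothetical, and the sketch simply assumes that a Tamil word is a run of characters from the Tamil Unicode block (U+0B80 to U+0BFF).

```python
import re

# Assumption: a "Tamil word" is any unbroken run of code points from the
# Tamil Unicode block, U+0B80..U+0BFF. The real tools may use other rules.
TAMIL_RUN = re.compile(r"[\u0B80-\u0BFF]+")


def tamil_words(text):
    """Return the Tamil-block runs found in text, in order (hypothetical helper)."""
    return TAMIL_RUN.findall(text)


def tamil_word_count(text):
    """Count the Tamil-block runs in text (hypothetical helper)."""
    return len(tamil_words(text))


if __name__ == "__main__":
    sample = "hello தமிழ் world வணக்கம்"
    print(tamil_words(sample))
    print(tamil_word_count(sample))
```

Matching on the Unicode block keeps the sketch short, but it also counts Tamil digits and symbols as word characters, so treat it as a rough approximation.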

All these commands will be made available in version 0.7 of open-tamil, to be released soon. They have currently landed in the development branch through commit 02810461bef216df56b10ebf09818b94dfc75574.

The next step should be to bundle these tools into a binary executable for various platforms. Also of note: the tamilwordcount command was contributed by a new member of the Open-Tamil group, Mr. Surendhar. Thanks much, and welcome!


GNU iconv – convert from UTF-8 to TSCII and back

iconv, a GNU utility, converts text documents back and forth between various encoding schemes. It is of particular interest to us Tamil-speaking folks because it can convert from UTF-8 to TSCII and back.

If you wanted to convert a file, hello.utf8, from UTF-8 encoding into TSCII, you could use it as follows:

$ iconv -f utf-8 -t tscii hello.utf8 > hello.tscii

Here the Linux shell redirects the output of iconv, which is written to standard output by default, into the TSCII-encoded file hello.tscii.
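
The return trip, from TSCII back to UTF-8, uses the same two flags with the encodings swapped. For those who prefer to script it, here is a minimal Python sketch that calls the iconv program through the standard subprocess module. The helper names iconv_command and convert_file are hypothetical, and the sketch assumes that iconv is on your PATH and was built with TSCII support.

```python
import subprocess


def iconv_command(src_path, from_enc, to_enc):
    """Build the iconv argument list: -f is the source encoding, -t the target."""
    return ["iconv", "-f", from_enc, "-t", to_enc, src_path]


def convert_file(src_path, dst_path, from_enc="tscii", to_enc="utf-8"):
    """Run iconv and write its standard output to dst_path (hypothetical helper).

    Assumes the iconv executable is installed and knows both encodings;
    check=True raises CalledProcessError if iconv reports a failure.
    """
    with open(dst_path, "wb") as out:
        subprocess.run(iconv_command(src_path, from_enc, to_enc),
                       stdout=out, check=True)
```

For example, convert_file("hello.tscii", "hello.utf8") would do the reverse of the shell command above, writing the UTF-8 text into hello.utf8.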

Developers: Someday I hope volunteers will add more historical Tamil encodings, primarily TAM, TAB, and other font-based encoding schemes, to libiconv. Please start development using the git repository at GNU sources.