Thursday, December 6, 2012

Smart input


The early development of software for emailing, web-surfing and texting in Asian languages was hindered by the variety and complexity of writing systems. Asia has by far the largest share of the world’s scripts. These include ideographs (traditional Chinese, simplified Chinese, simplified Japanese); syllabaries (Japanese, Korean); complex alphabets (Tamil, Thai); alphabets written from right to left (Arabic, Hebrew) or top to bottom (Mongolian); and adaptations of the Roman alphabet with special diacritics (Vietnamese). Several languages have changed their scripts, such as Kazakh, which used Arabic characters until these were replaced by Cyrillic under the Soviet Union, and is now being written more and more in Roman letters. Different software programmes encoded scripts in different ways, and so even if you could write your native language on one computer you could not necessarily do it on another.

This problem has largely been solved by the development of Unicode, which assigns a unique number to each character. Adopted by global companies such as Adobe, Apple, Google, IBM and Microsoft, it is supported by most web-browsers and important programming languages like Java. Thanks to Unicode, Balinese and Cambodians can word-process in their traditional scripts. Characters added in 2007 include the Tibetan /rra/, archaic digits found in Sinhala texts, and the Sundanese script. Bhutanese can write Dzongkha using Tibetan-based software.
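To make the idea of "a unique number for each character" a little more concrete, here is a minimal sketch in Python. The helper name `describe` and the four sample characters are my own choices for illustration; the numbers and names themselves come from Python's built-in Unicode tables.

```python
import unicodedata

def describe(char):
    """Return the code point (written as U+XXXX) and the Unicode name of one character."""
    return f"U+{ord(char):04X}", unicodedata.name(char)

# One Roman letter, one Thai letter, one Japanese kana and one Chinese character.
for ch in ["A", "ก", "あ", "中"]:
    code, name = describe(ch)
    print(ch, code, name)
```

Whatever the script, each character ends up with exactly one number, which is why a text written on one computer can be read on another.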

Although it is now possible to write almost any Asian language, some people still feel that to use a computer it is necessary to know the Roman alphabet, and preferably, the English language. A few years ago I met a student in Lahore in Pakistan, for example, who regretted not being able to email his father down in Karachi because the latter did not know English.


One reason for the continued association between English and computing is the way people originally learnt to use technology. I have several Thai friends who started using computers when the only available programmes were in English, and they became so used to using English for emails that some of them continue to do so, even when writing to Thais. Of course they all know that it is possible to write Thai on a computer, but some of them tell me that it is still quicker for them to use English. Perhaps it is because we only need about 30 keys for English (the 26 letters and the space-bar, return key etc.) whereas to write the 70+ letters of the Thai alphabet nearly every key on the keyboard must be used both in lower case and in upper case mode.

However, even where people decide to use Roman-script input, difficulties may arise because of all the different romanisation systems. Whereas these are relatively minor in Japanese (sho vs syo, for example, or tsu vs tu), the pinyin favoured by China and the Wade-Giles system preferred in Taiwan can produce discrepancies such as Gaoxiong vs Kaohsiung and gong fu vs kung fu. The name pronounced /ri/ in Korean can be Li, Lee or Rhee. In Thailand almost every language school seems to have its own way of romanising the national language – which is very frustrating for foreigners trying to learn it. Japanese has only 50 syllabic characters (in two forms of kana: hiragana and katakana) and the keyboard of most Japanese computers (including the one I am writing this on) can easily cover them if we include the number keys in shift mode. Nevertheless, all the Japanese I know input their writing in Roman letters, even though the output appears in a mixture of Chinese characters and kana. It is the same with Chinese. Nearly all Chinese computer users input pinyin (the Roman alphabet adapted to Chinese sounds) to make a large choice of characters appear on the screen. We can therefore say that even if Thais and Koreans can do without Roman letters, Chinese and Japanese cannot.
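As a rough illustration of how Roman-letter input can turn into kana, and of why the sho/syo and tsu/tu differences matter so little in practice, here is a small Python sketch. The table `ROMAJI_TO_KANA` and the function `romaji_to_kana` are hypothetical and cover only a handful of syllables; real input software is far more elaborate and also converts kana into Chinese characters, which this sketch does not attempt.

```python
# A tiny, hypothetical table: both spellings of a syllable point to the same kana.
ROMAJI_TO_KANA = {
    "sho": "しょ", "syo": "しょ",  # Hepburn and Kunrei-style spellings
    "tsu": "つ", "tu": "つ",
    "ka": "か", "na": "な", "ku": "く", "su": "す", "ri": "り",
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
}

def romaji_to_kana(text):
    """Convert lower-case romaji to hiragana using greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            # Anything not in the table is passed through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

print(romaji_to_kana("kusuri"))
print(romaji_to_kana("sho") == romaji_to_kana("syo"))
```

Because the table accepts both spellings, the person typing never has to choose between romanisation systems; the same cannot be said of someone reading romanised place names on a map.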


As for writing messages and emails on telephones, this has changed dramatically since the spread of smartphones, which have keyboards similar to computers. A few years ago there used to be competitions for writing messages using the number pads only, and some people – especially young women – seemed to be amazingly fast. Their speed was enhanced by predictive spelling, which anticipates and tries to complete what you are about to write before you have finished it, and this software was especially advanced for Japanese. If I write Japanese on my old phone it is much quicker for me to use Roman-character input than kana input (but of course it is quicker still if I can write in English). In fact, even though I have been using a smartphone for over a year, I still carry my old phone sometimes (for example, when I go jogging), and I think I can input just as fast on its number pad as I can on my smartphone’s keyboard – especially since I find the latter too small for my fingers and often make mistakes!

Many of my older Thai friends use Roman script input on their phones even if they write Thai. Some of them who are using older phones say that it is almost impossible to input Thai characters on a phone because every number key needs to support about seven different characters. Even some Thais with smartphones input using Roman script for the same reason that they do so on their computers – they find an alphabet of 26 letters easier than one of over 70. One friend told me that another reason for inputting in Roman letters is that it is cheaper. He added that since most Thais cannot read Thai written in Roman letters, he tends just to write in English. Another Thai friend told me that even though she usually writes in Thai and she uses Thai character input to do so, she mixes her Thai messages with a lot of English words and expressions. However, I hear that a lot of younger Thais are now texting using Thai characters. I don't know if they add Roman characters (e.g. btw, CUl8r) or English expressions (YES, OK), but I do know that this is quite common among young Chinese and Japanese.
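The "about seven characters per key" figure is simple arithmetic, which can be sketched as follows. I am assuming a round 70 characters spread evenly over the ten number keys, and, for comparison, the 26 English letters spread over keys 2 to 9 as on a standard number pad; the function name is my own.

```python
import math

def characters_per_key(alphabet_size, number_of_keys):
    """Smallest number of characters each key must carry to cover the whole alphabet."""
    return math.ceil(alphabet_size / number_of_keys)

print(characters_per_key(70, 10))  # Thai: 70 characters on 10 keys gives 7
print(characters_per_key(26, 8))   # English: 26 letters on keys 2 to 9 gives 4
```

On a multi-tap phone, seven characters on a key means up to seven presses to reach the last one, against at most four for English.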

There is no doubt that it is now much easier than before to use Asian scripts on computers and phones. But this brings up a question: what happens when people whose language is written in a complex script such as Chinese characters do not have their computer or phone available? I know in my own case that although it is pretty easy for me to write in Japanese on my computer, if I have to write a letter by hand, or write something on the blackboard, I often find that I suddenly cannot remember how to write certain characters – even quite easy ones that I would never forget how to read. Of course I did not learn Japanese when I was young, so it is inevitable that the characters are not as firmly embedded in my head as they are for Japanese people. But some Japanese friends tell me that they too sometimes forget how to write a character when they don’t have their computer. However, since Japanese is written in a mixture of Chinese characters and kana (and even Roman script), if you don’t know a character you can generally write it phonetically.

But as far as I know, if you forget a Chinese character it doesn't look very good if you just write it in pinyin. So I wonder what Chinese people do when they forget a character? Or maybe they never forget?