Tibetan Unicode
Latest revision as of 10:23, 10 June 2009
Introduction
Before the Unicode Standard, there were hundreds of different encoding systems, standardized and non-standardized, for the characters of different writing systems. No single character encoding had enough code positions to cover all the writing systems of the world. Even for Western European languages, which use a comparatively simple writing system, the 7-bit and 8-bit character sets used for the Roman script, such as ASCII and ISO 8859-1, were inadequate for all the letters, punctuation, and technical symbols in common use.
The situation for Tibetan was particularly anarchic. There was no recognized standard for encoding Tibetan script characters. Word-processing applications and add-ins for Tibetan used non-standardized, proprietary font-based encodings, which mapped the Tibetan glyphs in their fonts onto character sets originally designed for Roman or Chinese characters. Since each Tibetan system used its own encoding, one of the greatest obstacles to using electronic Tibetan data was that files could not easily be shared between different Tibetan word-processing programs, or with other applications, without first being converted from one encoding scheme to another.
Tibetan in the Unicode Standard
Chart of Tibetan script characters in the Unicode Standard (Unicode.org chart, PDF: http://www.unicode.org/charts/PDF/U0F00.pdf)
A hyphen marks an unassigned code point (shaded grey in the original chart).
       | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F
U+0F0x | ༀ | ༁ | ༂ | ༃ | ༄ | ༅ | ༆ | ༇ | ༈ | ༉ | ༊ | ་ | ༌ | ། | ༎ | ༏
U+0F1x | ༐ | ༑ | ༒ | ༓ | ༔ | ༕ | ༖ | ༗ | ༘ | ༙ | ༚ | ༛ | ༜ | ༝ | ༞ | ༟
U+0F2x | ༠ | ༡ | ༢ | ༣ | ༤ | ༥ | ༦ | ༧ | ༨ | ༩ | ༪ | ༫ | ༬ | ༭ | ༮ | ༯
U+0F3x | ༰ | ༱ | ༲ | ༳ | ༴ | ༵ | ༶ | ༷ | ༸ | ༹ | ༺ | ༻ | ༼ | ༽ | ༾ | ༿
U+0F4x | ཀ | ཁ | ག | གྷ | ང | ཅ | ཆ | ཇ | - | ཉ | ཊ | ཋ | ཌ | ཌྷ | ཎ | ཏ
U+0F5x | ཐ | ད | དྷ | ན | པ | ཕ | བ | བྷ | མ | ཙ | ཚ | ཛ | ཛྷ | ཝ | ཞ | ཟ
U+0F6x | འ | ཡ | ར | ལ | ཤ | ཥ | ས | ཧ | ཨ | ཀྵ | ཪ | ཫ | ཬ | - | - | -
U+0F7x | - | ཱ | ི | ཱི | ུ | ཱུ | ྲྀ | ཷ | ླྀ | ཹ | ེ | ཻ | ོ | ཽ | ཾ | ཿ
U+0F8x | ྀ | ཱྀ | ྂ | ྃ | ྄ | ྅ | ྆ | ྇ | ྈ | ྉ | ྊ | ྋ | - | - | - | -
U+0F9x | ྐ | ྑ | ྒ | ྒྷ | ྔ | ྕ | ྖ | ྗ | - | ྙ | ྚ | ྛ | ྜ | ྜྷ | ྞ | ྟ
U+0FAx | ྠ | ྡ | ྡྷ | ྣ | ྤ | ྥ | ྦ | ྦྷ | ྨ | ྩ | ྪ | ྫ | ྫྷ | ྭ | ྮ | ྯ
U+0FBx | ྰ | ྱ | ྲ | ླ | ྴ | ྵ | ྶ | ྷ | ྸ | ྐྵ | ྺ | ྻ | ྼ | - | ྾ | ྿
U+0FCx | ࿀ | ࿁ | ࿂ | ࿃ | ࿄ | ࿅ | ࿆ | ࿇ | ࿈ | ࿉ | ࿊ | ࿋ | ࿌ | - | ࿎ | ࿏
U+0FDx | ࿐ | ࿑ | ࿒ | ࿓ | ࿔ | - | - | - | - | - | - | - | - | - | - | -
U+0FEx | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
U+0FFx | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
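Not every code point in the Tibetan block (U+0F00 to U+0FFF) is assigned. As a rough sketch, the assigned and unassigned positions can be listed with Python's standard `unicodedata` module. The function name `tibetan_block_rows` is only illustrative, and the result depends on the Unicode version bundled with the Python release, so it may show more assigned characters than the 2009 chart.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x0F00, 0x0FFF  # range of the Tibetan block


def tibetan_block_rows():
    """Map each row label (e.g. 'U+0F4x') to a list of 16 entries.

    Each entry is the character name, or None if the code point
    is unassigned in this Python's Unicode database.
    """
    rows = {}
    for base in range(BLOCK_START, BLOCK_END + 1, 16):
        label = "U+%03Xx" % (base >> 4)
        rows[label] = [unicodedata.name(chr(base + i), None) for i in range(16)]
    return rows


if __name__ == "__main__":
    for label, names in tibetan_block_rows().items():
        assigned = sum(name is not None for name in names)
        print(label, "assigned:", assigned, "unassigned:", 16 - assigned)
```

Using `unicodedata.name` with a default of `None` avoids the `ValueError` that is otherwise raised for unassigned code points.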
Characters and Glyphs
The Unicode Standard encodes characters, not glyphs[1] or letter-forms. Complex Tibetan combinations or ligatures are encoded in text as the sequence of individual characters representing their parts.
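To make this concrete, here is a minimal sketch in Python, assuming the standard `unicodedata` module. It takes one written stack (SA with GA and RA joined beneath it, plus the vowel sign U) and lists the separate characters that encode it. The helper name `describe` and the choice of example stack are illustrative only.

```python
import unicodedata


def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in text]


# A single stack as seen on the page, stored as four separate characters:
# head letter SA, subjoined GA, subjoined RA, vowel sign U.
stack = "\u0F66\u0F92\u0FB2\u0F74"

if __name__ == "__main__":
    for code, name in describe(stack):
        print(code, name)
```

The font and rendering engine are responsible for drawing these four characters as one stacked glyph; the stored text contains only the component characters.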
How to install Tibetan Unicode software support
See Tibetan Unicode Installation for information on how to install the required software support for Tibetan Unicode on Windows, Linux and Mac OS X.
Current Limitations of Unicode Tibetan
- The main limitation is lack of support in older operating systems and applications.
- Support for complex Indic scripts, including Tibetan, is currently lacking in pre-press / DTP software used by printers and publishers, such as Adobe InDesign and Scribus.
See also
- Tibetan Fonts
- Legacy Tibetan Software & Character Encoding
External links
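- What is Unicode? (http://www.unicode.org/standard/WhatIsUnicode.html)
- Encoding model of the Tibetan script in the UCS, by Christopher Fynn. Explains how Tibetan characters are encoded in the ISO 10646 / Unicode Standard. (http://www.thlib.org/tools/#wiki=/access/wiki/site/26a34146-33a6-48ce-001e-f16ce7908a6a/encoding%20model%20of%20the%20tibetan%20script%20in%20the%20ucs.html)
- Tibetan Block of The Unicode Standard, code chart (http://www.unicode.org/charts/PDF/U0F00.pdf)
- Enabling Complex Script Text Support for Indic Scripts, on Wikipedia; includes Tibetan (http://en.wikipedia.org/wiki/Wikipedia:Enabling_complex_text_support_for_Indic_scripts)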