Supported Scripts
The Unicode Standard encodes scripts rather than
languages. When writing systems for more than one language
share sets of graphical symbols that have historically related
derivations, the union of all of those graphical symbols is treated
as a single collection of characters for encoding and is identified
as a single script. Each script then serves as an inventory of
graphical symbols, which are drawn upon for the writing systems of
particular languages. In many cases, a single script, such as the
Latin script, is used to write tens or even hundreds of languages.
In other cases, only one language employs a particular script:
Hangul, for example, is typically used only to write the Korean
language. The writing systems for some languages
may also use more than one script; for example, Japanese
traditionally makes use of the Han (Kanji), Hiragana, and Katakana
scripts, and modern Japanese usage commonly mixes in the Latin
script as well.
The scripts supported by the Unicode Standard include all of those listed in the following table. The table is ordered by the version of the Unicode Standard in which each script was first encoded. In many instances, supplemental characters for a script have been encoded in later versions of the standard, after the script's initial addition. Details about most of these scripts can be found on the ScriptSource website.
| Version (Year) | Scripts Added | Totals |
| --- | --- | --- |
| 1.1 (1993) | Arabic, Armenian, Bengali, Bopomofo, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Tamil, Telugu, Thai | 23 |
| 2.0 (1996) | Tibetan | +1, = 24 |
| 3.0 (1999) | Braille (patterns), Canadian Syllabics, Cherokee, Ethiopic, Khmer, Mongolian, Myanmar, Ogham, Runic, Sinhala, Syriac, Thaana, Yi | +13, = 37 |
| 3.1 (2001) | Deseret, Gothic, Old Italic | +3, = 40 |
| 3.2 (2002) | Buhid, Hanunóo, Tagalog, Tagbanwa | +4, = 44 |
| 4.0 (2003) | Cypriot, Limbu, Linear B, Osmanya, Shavian, Tai Le, Ugaritic | +7, = 51 |
| 4.1 (2005) | Buginese, Coptic, Glagolitic, Kharoshthi, New Tai Lue, Old Persian Cuneiform, Syloti Nagri, Tifinagh | +8, = 59 |
| 5.0 (2006) | Balinese, N'Ko, Phags-pa, Phoenician, Sumero-Akkadian Cuneiform | +5, = 64 |
| 5.1 (2008) | Carian, Cham, Kayah Li, Lepcha, Lycian, Lydian, Ol Chiki, Rejang, Saurashtra, Sundanese, Vai | +11, = 75 |
| 5.2 (2009) | Avestan, Bamum, Egyptian Hieroglyphs, Imperial Aramaic, Inscriptional Pahlavi, Inscriptional Parthian, Javanese, Kaithi, Lisu, Meetei Mayek, Old South Arabian, Old Turkic, Samaritan, Tai Tham, Tai Viet | +15, = 90 |
| 6.0 (2010) | Batak, Brahmi, Mandaic | +3, = 93 |
| 6.1 (2012) | Chakma, Meroitic Cursive, Meroitic Hieroglyphs, Miao, Sharada, Sora Sompeng, Takri | +7, = 100 |
| 7.0 (2014) | Bassa Vah, Caucasian Albanian, Duployan (shorthand), Elbasan, Grantha, Khojki, Khudawadi, Linear A, Mahajani, Manichaean, Mende Kikakui, Modi, Mro, Nabataean, Old North Arabian, Old Permic, Pahawh Hmong, Palmyrene, Pau Cin Hau, Psalter Pahlavi, Siddham, Tirhuta, Warang Citi | +23, = 123 |
| 8.0 (2015) | Ahom, Anatolian Hieroglyphs, Hatran, Multani, Old Hungarian, Sutton SignWriting | +6, = 129 |
| 9.0 (2016) | Adlam, Bhaiksuki, Marchen, Newa, Osage, Tangut | +6, = 135 |
| 10.0 (2017) | Masaram Gondi, Nushu, Soyombo, Zanabazar Square | +4, = 139 |
| 11.0 (2018) | Dogra, Gunjala Gondi, Hanifi Rohingya, Makasar, Medefaidrin, Old Sogdian, Sogdian | +7, = 146 |
| 12.0 (2019) | Elymaic, Nandinagari, Nyiakeng Puachue Hmong, Wancho | +4, = 150 |
| 13.0 (2020) | Chorasmian, Dives Akuru, Khitan Small Script, Yezidi | +4, = 154 |
| 14.0 (2021) | Cypro-Minoan, Old Uyghur, Tangsa, Toto, Vithkuqi | +5, = 159 |
| 15.0 (2022) | Kawi, Nag Mundari | +2, = 161 |
| 16.0 (2024) | Garay, Gurung Khema, Kirat Rai, Ol Onal, Sunuwar, Todhri, Tulu-Tigalari | +7, = 168 |
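The Totals column is a running sum of the number of scripts added in each version. As a quick consistency check, the following sketch recomputes that running sum from the per-version counts given in the table; the names `SCRIPTS_ADDED` and `running_totals` are illustrative and not part of any Unicode data file.

```python
from itertools import accumulate

# (version, number of scripts first encoded in that version),
# copied from the table; 1.1 is the starting inventory.
SCRIPTS_ADDED = [
    ("1.1", 23), ("2.0", 1), ("3.0", 13), ("3.1", 3), ("3.2", 4),
    ("4.0", 7), ("4.1", 8), ("5.0", 5), ("5.1", 11), ("5.2", 15),
    ("6.0", 3), ("6.1", 7), ("7.0", 23), ("8.0", 6), ("9.0", 6),
    ("10.0", 4), ("11.0", 7), ("12.0", 4), ("13.0", 4), ("14.0", 5),
    ("15.0", 2), ("16.0", 7),
]

versions = [version for version, _ in SCRIPTS_ADDED]
counts = [count for _, count in SCRIPTS_ADDED]

# Cumulative number of scripts encoded as of each version.
running_totals = dict(zip(versions, accumulate(counts)))

for version in ("2.0", "7.0", "16.0"):
    print(version, running_totals[version])
# 2.0 24
# 7.0 123
# 16.0 168
```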
In addition to the scripts listed above, a large number of other collections
of characters are also encoded by Unicode. These collections include
the following:
- Numbers
- General Diacritics
- General Punctuation
- General Symbols
- Mathematical Symbols (Western and Arabic)
- Musical Symbols (Western, Byzantine, Ancient Greek, and other)
- Technical Symbols
- Emoji: For details, see Emoji Versions
- Dingbats
- Arrows, Blocks, Box Drawing Forms, and Geometric Shapes
- Game Symbols
- Miscellaneous Symbols
- Presentation Forms
- Kangxi and other CJK radicals