Technical Notes | |
Version | 4 |
Authors | Ken Whistler, Rick McGowan |
Date | 2024-08-13 |
This Version | https://www.unicode.org/notes/tn33/tn33-4.html |
Previous Version | https://www.unicode.org/notes/tn33/tn33-3.html |
Latest Version | https://www.unicode.org/notes/tn33/ |
This document provides a list of danda characters in the Unicode Standard.
This document is a Unicode Technical Note. Sole responsibility for its contents rests with the author(s). Publication does not imply any endorsement by the Unicode Consortium.
For information on Unicode Technical Notes, including criteria for acceptance, see https://www.unicode.org/notes/.
Dandas are punctuation characters commonly seen in the typographic traditions of writing systems of South and Southeast Asia. While they occur in many scripts, they are primarily found in traditional materials written in scripts historically derived from the Brahmi script.
The typical appearance of a danda is simply a vertical bar. Two vertical bars may also be paired together in a corresponding punctuation mark known as a double danda. Tripled forms may also occur, but are much less common. Although forms based on a simple vertical bar are typical, in some scripts more elaborate forms have developed, and in some cases—such as Tibetan, in which the danda is termed a shad—the danda mark may accrue additional adornments.
Dandas generally delimit phrase-, sentence-, or section-level divisions in text. When both a single and a double danda occur, the double danda is used to demarcate larger units of text than the single danda. This usage is roughly comparable to the use of commas and full stops in Western typography, although dandas typically mark larger phrasal units than what might be separated by commas in Western typography. In many traditional materials, dandas and double dandas delimit what might be best termed verses or sections, and do not map easily onto concepts such as "sentence". Usage may also vary by script, by language, and by corpus.
Many South and Southeast Asian scripts in modern usage have adopted Western typographic practice in varying degrees. In such contexts dandas are often supplanted by common-use Western punctuation marks.
Many of the danda characters encoded in the Unicode Standard have the word "DANDA" in their names. However, many punctuation marks that are historically and functionally dandas are encoded with distinct names specific to a particular script. For example, in Tibetan and scripts influenced by Tibetan, these marks have "SHAD" rather than "DANDA" in their names. Also, because danda characters do not all have simple vertical bar shapes, they are not always easy to find when searching the code charts.
To make it easier to identify danda characters in the Unicode Standard, this Technical Note includes a specific list of known danda characters as of Unicode 16.0. This list may be periodically updated in the future, if further danda characters are added to the Unicode Standard.
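The limits of searching by character name can be illustrated with a short Python sketch. It relies only on the standard `unicodedata` module; the helper name `has_danda_in_name` and the small sample of code points are illustrative assumptions, not part of this Technical Note's data.

```python
import unicodedata


def has_danda_in_name(cp: int) -> bool:
    """Return True if the character name contains the word DANDA.

    Characters unknown to the running Python's Unicode database get an
    empty name and are therefore reported as False.
    """
    return "DANDA" in unicodedata.name(chr(cp), "")


# A few code points that all appear in the danda list of this note
# (hypothetical sample chosen for illustration).
SAMPLE = [0x0964, 0x0965, 0x0E5A, 0x0F0D, 0x104A, 0x17D4]

found = [cp for cp in SAMPLE if has_danda_in_name(cp)]
missed = [cp for cp in SAMPLE if not has_danda_in_name(cp)]

for cp in missed:
    # These are dandas by function, but a name search would skip them.
    print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unknown>')}")
```

In this sample only the two Devanagari characters are found by name; the Thai, Tibetan, Myanmar, and Khmer marks are missed, which is why an explicit list is more reliable than a name search.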
The table below is in the usual Unicode data file format: semicolon-delimited fields, optionally followed by "#" and a comment. It lists the characters in the Unicode Standard that are dandas. The first field is a code point or code point range. The second field is the General_Category property value of the character. The comment gives the name of a single character or, for a code point range, the number of code points in brackets followed by the names of the first and last characters in the range.
# Dandas
# [Not derivable]

0964..0965    ; Po # [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0E5A          ; Po # THAI CHARACTER ANGKHANKHU
0F08          ; Po # TIBETAN MARK SBRUL SHAD
0F0D..0F12    ; Po # [6] TIBETAN MARK SHAD..TIBETAN MARK RGYA GRAM SHAD
104A..104B    ; Po # [2] MYANMAR SIGN LITTLE SECTION..MYANMAR SIGN SECTION
1735..1736    ; Po # [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
17D4..17D5    ; Po # [2] KHMER SIGN KHAN..KHMER SIGN BARIYOOSAN
1AA8..1AAB    ; Po # [4] TAI THAM SIGN KAAN..TAI THAM SIGN SATKAANKUU
1B5E..1B5F    ; Po # [2] BALINESE CARIK SIKI..BALINESE CARIK PAREREN
1C3B..1C3C    ; Po # [2] LEPCHA PUNCTUATION TA-ROL..LEPCHA PUNCTUATION NYET THYOOM TA-ROL
1C7E..1C7F    ; Po # [2] OL CHIKI PUNCTUATION MUCAAD..OL CHIKI PUNCTUATION DOUBLE MUCAAD
A876..A877    ; Po # [2] PHAGS-PA MARK SHAD..PHAGS-PA MARK DOUBLE SHAD
A8CE..A8CF    ; Po # [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA
A92F          ; Po # KAYAH LI SIGN SHYA
A9C8..A9C9    ; Po # [2] JAVANESE PADA LINGSA..JAVANESE PADA LUNGSI
AA5D..AA5F    ; Po # [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
AAF0          ; Po # MEETEI MAYEK CHEIKHAN
ABEB          ; Po # MEETEI MAYEK CHEIKHEI
10A56..10A57  ; Po # [2] KHAROSHTHI PUNCTUATION DANDA..KHAROSHTHI PUNCTUATION DOUBLE DANDA
11047..11048  ; Po # [2] BRAHMI DANDA..BRAHMI DOUBLE DANDA
110C0..110C1  ; Po # [2] KAITHI DANDA..KAITHI DOUBLE DANDA
11141..11142  ; Po # [2] CHAKMA DANDA..CHAKMA DOUBLE DANDA
11175         ; Po # MAHAJANI SECTION MARK
111C5..111C6  ; Po # [2] SHARADA DANDA..SHARADA DOUBLE DANDA
11238..11239  ; Po # [2] KHOJKI DANDA..KHOJKI DOUBLE DANDA
112A9         ; Po # MULTANI SECTION MARK
113D4..113D5  ; Po # [2] TULU-TIGALARI DANDA..TULU-TIGALARI DOUBLE DANDA
1144B..1144C  ; Po # [2] NEWA DANDA..NEWA DOUBLE DANDA
115C2..115C3  ; Po # [2] SIDDHAM DANDA..SIDDHAM DOUBLE DANDA
11641..11642  ; Po # [2] MODI DANDA..MODI DOUBLE DANDA
1173C..1173D  ; Po # [2] AHOM SIGN SMALL SECTION..AHOM SIGN SECTION
11944         ; Po # DIVES AKURU DOUBLE DANDA
11A42..11A43  ; Po # [2] ZANABAZAR SQUARE MARK SHAD..ZANABAZAR SQUARE MARK DOUBLE SHAD
11A9B..11A9C  ; Po # [2] SOYOMBO MARK SHAD..SOYOMBO MARK DOUBLE SHAD
11C41..11C42  ; Po # [2] BHAIKSUKI DANDA..BHAIKSUKI DOUBLE DANDA
11F43..11F46  ; Po # [4] KAWI DANDA..KAWI PUNCTUATION ALTERNATE SECTION MARKER
16A6E..16A6F  ; Po # [2] MRO DANDA..MRO DOUBLE DANDA
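A table in this format is straightforward to read programmatically. The following Python sketch shows one possible approach, assuming the table has been saved as plain text with one entry per line. The names `Entry`, `parse_lines`, and `is_danda` are hypothetical helpers introduced here for illustration, and `SAMPLE` contains only three entries copied from the list.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Entry(NamedTuple):
    first: int      # first code point of the range
    last: int       # last code point (equal to first for a single character)
    category: str   # General_Category value, e.g. "Po"
    comment: str    # text after "#", with surrounding whitespace removed


def parse_lines(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per data line; skip blank and comment-only lines."""
    for raw in lines:
        data, _, comment = raw.partition("#")
        data = data.strip()
        if not data:
            continue  # the line held only a comment or whitespace
        fields = [field.strip() for field in data.split(";")]
        code_points, category = fields[0], fields[1]
        first, _, last = code_points.partition("..")
        yield Entry(int(first, 16), int(last or first, 16),
                    category, comment.strip())


def is_danda(ch: str, entries: List[Entry]) -> bool:
    """Return True if the single character ch falls within a listed range."""
    cp = ord(ch)
    return any(e.first <= cp <= e.last for e in entries)


# Three entries copied from the list, for demonstration only.
SAMPLE = """\
# Dandas
# [Not derivable]
0964..0965    ; Po # [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0E5A          ; Po # THAI CHARACTER ANGKHANKHU
AA5D..AA5F    ; Po # [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
"""

entries = list(parse_lines(SAMPLE.splitlines()))
```

With the sample loaded, `is_danda("\u0965", entries)` is true for DEVANAGARI DOUBLE DANDA, while `is_danda("|", entries)` is false for the ASCII vertical line, which is not in the list. A linear scan is adequate for a list of this size; a larger table might call for a sorted range lookup instead.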
[Glossary] | Unicode Glossary https://www.unicode.org/glossary/ For explanations of terminology used in this and other documents. |
[UCD] | Unicode Character Database https://www.unicode.org/ucd/ For detailed documentation about the Unicode Character Database, see Unicode Standard Annex #44: Unicode Character Database https://www.unicode.org/reports/tr44/ |
[Unicode] | The Unicode Standard For the latest version, see: https://www.unicode.org/versions/latest/ |
The following summarizes modifications from the previous version of this document.
4
3
2
1
© 2010–2024 Ken Whistler, Rick McGowan. This publication is protected by copyright, and permission must be obtained from the author and Unicode, Inc. prior to any reproduction, modification, or other use not permitted by the Terms of Use.
Use of this publication is governed by the Unicode Terms of Use. The authors, contributors, and publishers have taken care in the preparation of this publication, but make no express or implied representation or warranty of any kind and assume no responsibility or liability for errors or omissions or for consequential or incidental damages that may arise therefrom. This publication is provided “AS-IS” without charge as a convenience to users.
Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries.