This page contains the definitive listing of all errata of record
since the publication of The Unicode Standard, Version 5.1 and
considered resolved by the release of Unicode Version 5.2. These
errata are listed by date in the table below. For prior errata
resolved in Unicode 5.1 and earlier, see
Errata Fixed in Unicode 5.1.
For errata still pending subsequent to the release of Unicode
5.2.0, see the list of current
Updates and Errata.
Date |
Summary |
2009-Jul-02 |
In Unicode 5.2 a new character U+19DA NEW TAI LUE THAM DIGIT ONE
was added to the New Tai Lue script, to represent a commonly
occurring "tham" style alternate digit for one. As part of
this change, three other existing New Tai Lue digits U+19D1,
U+19D2, and U+19D4 have
had their glyphs adjusted to correctly reflect the "hora"
style used for the basic set of New Tai Lue digits.
The "tham" style representative glyphs in
the charts for Unicode 5.0 and 5.1 are shown on the left. The
corrected "hora" style glyphs shown on the right will be used in
future versions of the code charts.
|
2009-May-26 |
On p. 126 of The Unicode Standard, Version 5.0, there
is an error in the bullets after definition D128.
Replace the existing text:
- Any string that is not isCased consists entirely of characters that do not case map to themselves.
- For example, isCased("abc") is true, and isCased("123") is false.
With this revised text:
- Any isCased string contains at least one character that does not case map to itself.
- For example, isCased("123") is false because all the characters in "123" case map to themselves, while isCased("abc") and isCased("A12") are both true.
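The revised bullets can be checked with a minimal Python sketch. It assumes a simplified reading of the definition, in which a string is cased when its lowercase, uppercase, or titlecase mapping differs from its NFD form. The function name `is_cased` is hypothetical, and Python's `str` case methods only approximate the case mappings of the standard.

```python
import unicodedata


def is_cased(s: str) -> bool:
    """Return True if at least one case mapping changes the NFD form of s.

    Assumption: Python's lower(), upper(), and title() stand in for the
    toLowercase, toUppercase, and toTitlecase operations of the standard.
    """
    y = unicodedata.normalize("NFD", s)
    return y.lower() != y or y.upper() != y or y.title() != y


# Every character in "123" case maps to itself, so the string is not cased.
print(is_cased("123"))  # False
# "abc" and "A12" each contain a character that does not case map to itself.
print(is_cased("abc"))  # True
print(is_cased("A12"))  # True
```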
|
2009-Feb-12 |
The representative glyph for the character U+135F
ETHIOPIC COMBINING GEMINATION MARK is displayed
incorrectly without a dotted circle in the code charts
for Unicode 5.1. The glyph will be corrected to show
the dotted circle in future versions of the code
charts (and was correct in earlier versions). The
figure below shows the incorrect glyph on the left and
the correct glyph on the right:
|
2009-Jan-16 |
U+5446 and U+7343 are incorrectly listed as simplified variants of each other in Version 5.1.0 of
UAX #38,
"Unicode Han Database (Unihan)". |
2008-Dec-22 |
In the charts for Unicode 5.0 and 5.1, the representative
glyphs for U+075E and U+075F were inadvertently swapped. The left
image below shows the incorrect glyphs, and the right image
below shows the correct glyphs. The representative glyphs will be corrected in the next version of the charts. (Note that the glyphs were displayed correctly in the Unicode 4.1 charts.)
|
2008-Dec-15 |
The documentation for newly added mathematical symbols in Unicode 5.1.0 incorrectly documents a character
U+27CB MATHEMATICAL SPACING LONG SOLIDUS OVERLAY. That
was a proposed character, but it was never actually approved for the standard. The erroneous text will be deleted in future revisions of the standard.
|
2008-Sep-19 |
On p. 507 of The Unicode Standard, Version 5.0, there is an error in the paragraph on "Change in
Representative Glyphs for U+2278 and U+2279", which states that
variant glyphs with vertical strokes for these two characters
can be requested using variation sequences. Actually, variation
sequences for these two characters do not exist, as was
clarified by a late note which was added to Version 3.2 of the standard.
Replace the existing text:
Using U+2278 or U+2279 with VS1 will request these variants
explicitly, as will using U+2276 LESS-THAN OR GREATER-THAN or
U+2277 GREATER-THAN OR LESS-THAN with U+20D2 COMBINING LONG
VERTICAL LINE OVERLAY. Unless fonts are created with the
intention to add support for both forms (via VS1 for the upright
forms), there is no need to revise the glyphs in existing fonts;
...
With this revised text:
Using U+2276 LESS-THAN OR GREATER-THAN or U+2277 GREATER-THAN
OR LESS-THAN with U+20D2 COMBINING LONG VERTICAL LINE OVERLAY
will display these variants explicitly. Unless fonts are created
with the intention to add support for both forms, there is no
need to revise the glyphs in existing fonts; ...
|
2008-Aug-21 |
In
UAX #31, "Unicode Identifier and Pattern Syntax" (Version 5.1.0), there is a mistake in the first bullet of A2 in Section 2.3, Layout and Format Control Characters.
The text ", followed by a Letter" should be deleted from that bullet, to make the textual description consistent with the regular expression description in the following bullet. |
2008-Jun-06 |
In the code charts for Unicode Versions 5.1 and earlier,
the representative glyphs for the case pairs of two
Cyrillic letters for Abkhaz, Abkhazian Ha (04A8/04A9)
and Abkhazian Che (04BE/04BF), are shown in an old style
that is no longer preferred. The glyphs are being updated
to reflect modern preferences. The old glyphs are
shown below on the left; the new, preferred forms on the
right:
|
2008-May-30 |
U+1680 OGHAM SPACE MARK is displayed differently depending on
the design of an Ogham font. Ogham fonts with stemlines (the norm)
show U+1680 as a visible stemline; Ogham fonts without stemlines
show it as a blank. The representative glyph in the standard is
being
updated to reflect this variability and for consistency with
the representative glyphs used for
other whitespace characters in the standard. The old representative
glyph is shown on the left, and the updated glyph on the right.
|
2008-May-27 |
In the XML representation of the UCD, Version 5.1.0, some attributes for the character U+A788 MODIFIER LETTER LOW CIRCUMFLEX ACCENT are incorrect. The gc attribute should be
"Lm" rather than "Sk"; the Alpha, IDS, XIDS, IDC, and XIDC attributes should be
"Y" rather than "N"; and the WB and SB attributes should be "LE" rather than
"XX". |
2008-May-27 |
In the XML representation of the UCD, Version 5.1.0, the
characters U+0000..U+001F and U+007F..U+009F have the incorrect value
for the na attribute. It should be the empty string, rather than the
string "<control>". |
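The two corrections to the XML representation above can be cross-checked against another copy of the character database. The sketch below uses Python's `unicodedata` module, which bundles its own UCD version and does not read the XML files. The helper name `character_name` is hypothetical, and `str.isidentifier()` is used only as a rough proxy for the XIDS and XIDC attributes.

```python
import unicodedata

ch = "\uA788"  # MODIFIER LETTER LOW CIRCUMFLEX ACCENT

# General_Category, corresponding to the corrected gc attribute ("Lm").
gc = unicodedata.category(ch)

# str.isidentifier() is based on XID_Start and XID_Continue, so these two
# checks approximate the corrected XIDS and XIDC values.
starts_identifier = ch.isidentifier()
continues_identifier = ("a" + ch).isidentifier()

CONTROL_RANGES = [(0x0000, 0x001F), (0x007F, 0x009F)]


def character_name(cp: int) -> str:
    """Return the character name, or the empty string if there is none."""
    return unicodedata.name(chr(cp), "")


# Collect any control code point that unexpectedly carries a name.
named_controls = [
    cp
    for first, last in CONTROL_RANGES
    for cp in range(first, last + 1)
    if character_name(cp) != ""
]

print(unicodedata.unidata_version, gc, starts_identifier, continues_identifier)
print(named_controls)
```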
2008-May-7 |
In UAX #31, "Unicode Identifier and Pattern Syntax" (Version 5.1.0), there is a typo in the description for (X)ID_Start in Table 2, Lexical Classes for Identifiers. "letter numbers (Lu)" should be corrected to read "letter numbers (Nl)". |
2008-April-29 |
In UAX #29, "Unicode Text Segmentation" (Version 5.1.0), there is a typo in the definition of Prepend
in Table 2, Grapheme_Cluster_Break Property Values. The correct definition is: "Logical_Order_Exception=True". |
2008-February-12 |
On p. 124 of The Unicode Standard, Version 5.0, there is an error in the Regular Expressions column for
"More_Above", in the third row of
Table 3-14, Context Specification for Casing.
The corrected regular expression should be:
[^\p{ccc=230}\p{ccc=0}]* [\p{ccc=230}]
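As an illustration, the corrected expression can be read as "skip any characters whose combining class is neither 230 nor 0, then require a character with combining class 230". The Python sketch below applies that reading to the text following a given position. The function name `more_above` and its signature are hypothetical, and the combining classes come from Python's bundled `unicodedata` module.

```python
import unicodedata


def more_above(text: str, index: int) -> bool:
    """Return True if text[index] is followed by a ccc=230 character with
    no intervening character of combining class 0 or 230."""
    for ch in text[index + 1:]:
        ccc = unicodedata.combining(ch)
        if ccc == 230:
            return True   # matches the final [\p{ccc=230}]
        if ccc == 0:
            return False  # a starter ends the combining sequence
        # any other combining class matches [^\p{ccc=230}\p{ccc=0}]*
    return False


print(more_above("i\u0307", 0))        # True: U+0307 has ccc=230
print(more_above("i\u0323\u0307", 0))  # True: U+0323 (ccc=220) is skipped
print(more_above("ia\u0307", 0))       # False: "a" (ccc=0) intervenes
```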
|
2007-June-4 |
In Section 12.1, Han on p. 424 of The Unicode Standard, Version 5.0, the last paragraph states that U+FA70 to U+FAD9 are "included in the Unicode Standard to provide full round-trip compatibility with the ideographic repertoire of PKS 5700 parts 1, 2, and 3." However, the Korean standard listed is incorrect, and the text should be corrected to "... the ideographic repertoire of KPS 10721-2000."
|
2007-May-24 |
On p. 479 of The Unicode Standard, Version 5.0, the
subheading for Linear B Ideograms lists the range as
"U+10080--U+108FF". That should be corrected to
"U+10080--U+100FF". |
2007-January-11 |
There is an error in the entry for "Trailing Consonant" on page
1147 in the glossary of The Unicode Standard, Version 5.0.
"Vowel_Jamo" should be "Trailing_Jamo" in definition (1), thus reading "(1) In Korean, a jamo character with the Hangul_Syllable_Type property value
Trailing_Jamo (in the range U+11A8..U+11F9)." |
2007-January-5 |
There is an error in the sample code in Section 5.17 on page
182 of The Unicode Standard, Version 5.0. The entry 0x2F in the second row of the rotate table should instead be 0x1F. |
2007-January-4 |
On page 411 of The Unicode Standard, Version 5.0, Table 12-2 incorrectly states the extent of the CJK Unified Ideographs Extension A block. The correct range is U+3400 to U+4DBF. In particular, the Yijing Hexagram
Symbols starting at U+4DC0 are not part of Extension A. |
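The corrected block boundary can be expressed as a simple range check. This is a minimal sketch; the names `EXTENSION_A` and `in_extension_a` are hypothetical, and the character name lookup relies on Python's bundled `unicodedata` module.

```python
import unicodedata

# Corrected extent of the CJK Unified Ideographs Extension A block.
EXTENSION_A = range(0x3400, 0x4DBF + 1)


def in_extension_a(cp: int) -> bool:
    """Return True if the code point lies in U+3400..U+4DBF."""
    return cp in EXTENSION_A


print(in_extension_a(0x3400))  # True: first code point of the block
print(in_extension_a(0x4DBF))  # True: last code point of the block
print(in_extension_a(0x4DC0))  # False: first Yijing hexagram symbol
print(unicodedata.name(chr(0x4DC0)))
```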
2007-January-2 |
Due to a printing error, the Unified Canadian Aboriginal
Syllabics glyphs at U+1424, U+1426, and U+1487 are missing in the code charts
and names list on pages 684 and 687-88 of The Unicode Standard,
Version 5.0.
These glyphs were correctly represented in the online charts and can be viewed at
http://www.unicode.org/charts/PDF/U1400.pdf. |
2007-January-2 |
The file UNIHAN/FullRSIndex.pdf on the Unicode 5.0 CD-ROM is missing a final page
with the last half of the entry for 211 (tooth) and the complete entries for 212 (dragon), 213 (turtle), and 214 (flute). The
missing page is available
here
as a PDF. |
2006-December-21 |
Table 11-16 in The Unicode Standard, Version 5.0 shows "kyu" twice:
once at the top of the part on page 402 and once at the top of the
part on page 403. The repetition is an error and the second
instance should be removed. |