Errata fixed in Unicode 5.1.0

Errata Fixed in Unicode 5.1.0

This page contains the definitive listing of all errata of record since the publication of The Unicode Standard, Version 5.0 and considered resolved by the release of Unicode Version 5.1. These errata are listed by date in the table below. For prior errata resolved in Unicode 5.0 and earlier, see Errata Fixed in Unicode 5.0.

For errata still pending subsequent to the release of Unicode 5.1.0, see the list of current Updates and Errata.

Date  Summary 
2008-March-07 The representative glyph for U+1D81 in the Unicode 5.0 chart has an extraneous line running from the lower right to upper left side of the glyph. It is most visible at high resolutions. The incorrect glyph is shown on the left, and a corrected glyph on the right.
incorrect glyph for U+1D81 correct glyph for U+1D81
2007-November-20 The representative glyph for U+1E9A LATIN SMALL LETTER A WITH RIGHT HALF RING in Unicode 2.0 has the ring well to the right. The representative glyph in Unicode 3.0 and later incorrectly had the right half ring over the base letter. The incorrect glyph is shown on the left; the corrected glyph is shown on the right.

incorrect glyph for U+1E9A correct glyph for U+1E9A

2007-August-23 In the code charts for Unicode Version 5.0, the glyphs for U+0333 and U+0347 are incorrect. The glyph for U+0333 should be longer. The glyph for U+0347 should be shorter. The glyphs were not merely swapped: the correct glyph for U+0333 should be longer than the incorrect glyph for U+0347. Below are shown incorrect glyphs on the left and corrected glyphs on the right:

old glyph for U+0333 new glyph for U+0333

old glyph for U+0347 new glyph for U+0347

2007-July-30 In the code charts for Unicode Versions 5.0 and earlier, the representative glyphs for U+0460 and U+047E are shown with "broad omega" shaped glyphs. These are being corrected to show "W"-shaped glyphs for the uppercase letters, matching the shapes of their lowercase counterparts. The incorrect glyphs are shown on the left; the corrected glyphs are shown on the right.

old glyph for U+0460 new glyph for U+0460
old glyph for U+047E new glyph for U+047E

2007-June-07 In the 5.0 code charts, the names for U+075E and U+075F are correct, but the glyphs should be swapped.

2007-April-19 In the code charts for Unicode Versions 5.0 and earlier, the representative glyphs for U+0478 and U+0479 are shown in an Old Church Slavonic (OCS) style typeface. The decision to encode a monograph uk character for OCS has made that style choice inappropriate for these characters. The incorrect glyphs are shown on the left; the corrected glyphs are shown on the right.

old glyph for U+0478/0479 new glyph for U+0478/0479

2007-April-12 In the code charts for Unicode Versions 5.0 and earlier, the lower bar on the glyph for U+2626 ORTHODOX CROSS is slanted downward in the wrong direction. The incorrect glyph is shown on the left; the corrected glyph is shown on the right.

old glyph for U+2626 new glyph for U+2626

2007-March-14 In UAX #15, Unicode Normalization Forms, for Unicode 5.0, there is an erroneous statement in the last paragraph of Section 14, Detecting Normalization Forms. The text currently states:
"...that no string when decomposed with NFD expands to more than 3x in length (measured in code units)."
That text should be corrected to state:
"...that no string when normalized to NFC expands to more than 3x in length (measured in code units)."
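
The corrected bound can be spot-checked with a short script. The following is a minimal sketch, not part of UAX #15: the helper names `code_units` and `nfc_expansion` are hypothetical, and Python's `unicodedata` module uses whatever Unicode version ships with the interpreter rather than Unicode 5.0, so results may differ for characters added later.

```python
import unicodedata

# Codec name and bytes per code unit for each encoding form.
# Little-endian codecs are used so that no byte order mark is counted.
_FORMS = {
    "utf-8": ("utf-8", 1),
    "utf-16": ("utf-16-le", 2),
    "utf-32": ("utf-32-le", 4),
}


def code_units(text, form="utf-8"):
    """Return the length of text in code units of the given encoding form."""
    codec, unit_size = _FORMS[form]
    return len(text.encode(codec)) // unit_size


def nfc_expansion(text, form="utf-8"):
    """Return len(NFC(text)) / len(text), both measured in code units."""
    if not text:
        raise ValueError("text must not be empty")
    before = code_units(text, form)
    after = code_units(unicodedata.normalize("NFC", text), form)
    return after / before


# Example: U+1D160 MUSICAL SYMBOL EIGHTH NOTE has a canonical decomposition
# whose parts are excluded from recomposition, so NFC leaves it expanded.
ratio = nfc_expansion("\U0001D160", "utf-8")
```

Calling `nfc_expansion` on candidate strings and checking that the ratio never exceeds 3 is a quick sanity test, not a proof of the bound.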
2007-February-14 In the code charts for Unicode Versions 5.0 and earlier, the representative glyphs for U+047C and U+047D represent an incorrect understanding of the nature of the character that was encoded ("beautiful omega"). The incorrect glyphs are shown on the left; the corrected glyphs are shown on the right.

old glyph for U+047C-047D new glyph for U+047C-047D

2007-February-02 The sample code in Section 7 of UAX #14 does not handle leading spaces correctly. Adding the following code before the loop provides a fix:

// treat SP at start of input as if it followed WJ
if (cls == SP)
      cls = WJ;

2007-January-25 In the file DerivedCoreProperties.txt in the Version 5.0 Unicode Character Database, the stated rule in the comments for the generation of the Default_Ignorable_Code_Point property is incomplete. The rule should include all characters with the Variation_Selector property, so that the complete statement of the rule is:

Other_Default_Ignorable_Code_Point + Cf + Cc + Cs
+ Noncharacter_Code_Point + Variation_Selector - White_Space
- FFF9..FFFB (Annotation Characters)

The actual listing of characters in the data file with the Default_Ignorable_Code_Point property is correct.

Note that the stated rule was further updated for Version 5.1 of the standard, so the correction in this erratum notice applies only to the Version 5.0 data file.
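
The corrected Version 5.0 rule is plain set arithmetic and can be sketched as below. This is an illustration under stated assumptions, not the tool that generates DerivedCoreProperties.txt: the function name `default_ignorable_5_0` is hypothetical, the four property sets must be supplied by the caller (for example, parsed from PropList.txt), and the general categories come from Python's `unicodedata`, which reflects the interpreter's Unicode version rather than 5.0.

```python
import unicodedata

# FFF9..FFFB (Annotation Characters) are subtracted by the stated rule.
ANNOTATION_CHARACTERS = set(range(0xFFF9, 0xFFFB + 1))


def default_ignorable_5_0(other_default_ignorable, noncharacters,
                          variation_selectors, white_space,
                          code_points=range(0x110000)):
    """Apply the corrected 5.0 rule to code_points.

    All four property arguments are sets of integer code points
    supplied by the caller (assumed inputs, not built in here).
    """
    result = set()
    for cp in code_points:
        included = (
            cp in other_default_ignorable
            or unicodedata.category(chr(cp)) in ("Cf", "Cc", "Cs")
            or cp in noncharacters
            or cp in variation_selectors  # the term added by this erratum
        )
        excluded = cp in white_space or cp in ANNOTATION_CHARACTERS
        if included and not excluded:
            result.add(cp)
    return result
```

Because the erratum says the listing in the data file was already correct, a sketch like this is mainly useful for checking that the stated rule and the listed code points agree.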

2007-January-22 The code point U+00A0 was supposed to have the Sentence_Break property value Sp in the Unicode Character Database for Version 5.0, but that change was overlooked in the updating of SentenceBreakProperty.txt. This will be corrected in a subsequent version of the standard.
2006-September-11 In the code charts for Unicode Version 5.0, the representative glyph for U+1031 was incorrectly imaged on the wrong side of the dotted circle. The incorrect glyph is shown on the left; the corrected glyph is shown on the right.

U+1031 (old)  U+1031 (new)

2006-September-10 The Index.txt file in version 5.0.0 of the Unicode Character Database is not valid UTF-8. The following substitutions will fix the file:

Replace byte 0x92 in line 74 by U+00FC [ü] LATIN SMALL LETTER U WITH DIAERESIS. Replace byte 0xe1 in lines 854 and 1549 by a space.
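
The substitutions above could be applied with a small script along these lines. This is a hedged sketch: the names `INDEX_FIXES`, `is_valid_utf8`, and `fix_index_bytes` are hypothetical, line numbers are assumed to be 1-based and separated by LF, and every occurrence of the offending byte on a listed line is replaced.

```python
# Substitutions from the erratum: 1-based line number -> (bad byte, replacement).
INDEX_FIXES = {
    74: (b"\x92", "\u00fc".encode("utf-8")),  # U+00FC encodes as C3 BC
    854: (b"\xe1", b" "),
    1549: (b"\xe1", b" "),
}


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def fix_index_bytes(data, fixes=INDEX_FIXES):
    """Return a copy of data with the listed per-line substitutions applied."""
    lines = data.split(b"\n")
    for line_number, (bad, good) in fixes.items():
        # Skip fixes that point past the end of the input.
        if line_number <= len(lines):
            index = line_number - 1
            lines[index] = lines[index].replace(bad, good)
    return b"\n".join(lines)
```

In use, the file would be read in binary mode, passed through `fix_index_bytes`, and checked with `is_valid_utf8` before being written back. The location of the local copy of Index.txt is left to the reader.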