Errata Fixed in Unicode 4.1.0
This page contains the definitive listing of all errata of
record since the publication of The Unicode Standard, Version
4.0 and considered resolved by the release of Unicode Version
4.1. These errata are listed by date in the table below. For prior
errata resolved in Unicode 4.0 and earlier, see
Errata Fixed in Unicode 4.0.0.
For errata still pending subsequent to the release of Unicode 4.1.0, see the list of current
Updates and Errata.
Date |
Summary |
2005-February-17 |
The following 7 Unified Han Ideographs are shown with
incorrect representative glyphs in the printed code charts for Unicode 4.0. The representative glyph on the left below
shows each character as it appeared in the earlier versions of
the code charts; the glyph on the right shows each character as
it should appear.
|
2004-November-15 |
In the character code charts for Unicode 4.0 (http://www.unicode.org/charts/PDF/Unicode-4.0/U40-2B00.pdf)
the following characters are shown with incorrect representative glyphs, which contradict their character
names: U+2B00, U+2B01, U+2B08, and U+2B09. The incorrect glyphs for each pair are shown on the left; the corrected
glyphs are shown on the right. The correction ensures that the glyphs match the character identity as defined by the
character names.
|
2004-November-15 |
In the character code charts for Unicode 4.0 (http://www.unicode.org/charts/PDF/Unicode-4.0/U40-2B00.pdf)
the character U+01B3 LATIN CAPITAL LETTER Y WITH HOOK is shown with a representative glyph
that is not the preferred form. The preferred form, with the hook on the right, is shown here:
|
2004-July-02 |
In the 4.0.0 and 4.0.1 versions of
UAX #14
an update to the rules for handling the WJ and GL classes was
omitted. The pair table, including the annotations that
reflect which rules are invoked for each pair, was updated
correctly. However, the text of the rules should have been
updated as follows to split WJ off from GL and to relax the rules for GL
so that SPACE can override the non-breaking nature of GL:
Word Joiner:
LB 11b Don’t break before or after WORD JOINER
and related characters
× WJ
WJ ×
Spaces:
LB 12 Break after spaces SP ÷
Many existing implementations reverse the order of precedence
between rules LB11b and LB12.
Non-breaking characters:
LB 13 Don’t break before or after NBSP, and related
characters
× GL
GL ×
The change in rule 12 affects only the comment. The
modification section should have read:
- Several changes to the rules. Moved rule 15b to 18b,
added 14b, split rule 13 and moved WJ from 13
to 11b. Split rule 6 into 6a and 7b and split rule 3a into
3a and 3b. Restated rule 7a and added rule 7c.
|
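The effect of the rule order in the UAX #14 erratum can be illustrated with a minimal Python sketch. It models only rules LB 11b, LB 12, and LB 13, tried in that order; the class names WJ, GL, and SP are from UAX #14, while the function name and return convention are assumptions made for this example, and all other rules of the algorithm are left out.

```python
# Sketch only: three rules from the erratum, tried in order of precedence.
# The function name and the None convention are illustrative assumptions.

NO_BREAK = "×"
BREAK = "÷"


def decide(before, after):
    """Return the action at the boundary between two line-breaking classes.

    Returns None when none of the three modeled rules applies, meaning
    that other rules (not modeled here) would decide.
    """
    # LB 11b: do not break before or after WJ.
    if before == "WJ" or after == "WJ":
        return NO_BREAK
    # LB 12: break after spaces.
    if before == "SP":
        return BREAK
    # LB 13: do not break before or after GL.
    if before == "GL" or after == "GL":
        return NO_BREAK
    return None
```

With this ordering, decide("SP", "GL") yields ÷ while decide("SP", "WJ") yields ×: a space overrides the non-breaking nature of GL but not that of WJ.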
2004-April-22 |
In the 4.0.1 version of
UAX #29
in Table 1, Default Grapheme Cluster Boundaries, there is
a mistake in an explanation: the property value is correct, but the
example is not. The following:
Hangul_Syllable_Type=L, e.g.:
U+1100 (ᄀ) HANGUL CHOSEONG KIYEOK
..U+115F (ᅟ) HANGUL CHOSEONG FILLER
should have been:
Hangul_Syllable_Type=L, e.g.:
U+1100 (ᄀ) HANGUL CHOSEONG KIYEOK
..U+1159 (ᅙ) HANGUL CHOSEONG YEORINHIEUH
U+115F (ᅟ) HANGUL CHOSEONG FILLER
Also in Table 1, Default Grapheme Cluster Boundaries, the definition
of the value Control is incorrect. It should have been
adjusted for the change in status of the joiner characters. After the line:
and not U+000A LINE FEED (LF)
the following text is missing:
and not U+200C ZERO WIDTH NON-JOINER (ZWNJ)
and not U+200D ZERO WIDTH JOINER (ZWJ) |
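The corrected Control definition can be sketched in Python as follows. This is an illustration under stated assumptions: the erratum quotes only the exclusion lines, so the base set (General_Category Zl, Zp, Cc, or Cf, with CR excluded alongside LF) is assumed here, and the function name is hypothetical. The standard library's unicodedata module reflects the Unicode version bundled with the interpreter, not 4.0.1.

```python
import unicodedata

# Assumption: apart from the exclusions quoted in the erratum, Control is
# taken to mean General_Category Zl, Zp, Cc, or Cf.
CONTROL_CATEGORIES = {"Zl", "Zp", "Cc", "Cf"}

# CR is assumed to be excluded alongside LF; ZWNJ and ZWJ are the two
# exclusions that the erratum adds.
EXCLUDED = {"\u000D", "\u000A", "\u200C", "\u200D"}


def is_gcb_control(ch):
    """Return True if ch would fall under Control in this sketch."""
    return unicodedata.category(ch) in CONTROL_CATEGORIES and ch not in EXCLUDED
```

Both joiners have General_Category Cf, so without the two added exclusions they would be classified as Control.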
The UTC has committed to having the two properties
Numeric_Type:Decimal and General_Category:Decimal_Number in the Unicode Character Database
encompass exactly the same characters. In 4.0.1, a production
error broke this equivalence.
The following lines in
UnicodeData.txt:
1369;ETHIOPIC DIGIT ONE;Nd;0;L;;;1;1;N;;;;;
...
1371;ETHIOPIC DIGIT NINE;Nd;0;L;;;9;9;N;;;;;
should have been:
1369;ETHIOPIC DIGIT ONE;Nd;0;L;;1;1;1;N;;;;;
...
1371;ETHIOPIC DIGIT NINE;Nd;0;L;;9;9;9;N;;;;;
In
DerivedNumericTypes.txt the following line:
1369..1371 ; Digit # Nd [9] ETHIOPIC DIGIT ONE..ETHIOPIC DIGIT NINE
should have been:
1369..1371 ; Decimal # Nd [9] ETHIOPIC DIGIT ONE..ETHIOPIC DIGIT NINE
The precise numeric properties of these characters are under
review and the noted inconsistency will be resolved in the next
version of the standard. |
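The inconsistency can be checked mechanically. The following is a minimal Python sketch, assuming the usual UnicodeData.txt layout in which fields 6, 7, and 8 (counting from 0) hold the decimal digit value, the digit value, and the numeric value. The function names are illustrative and not part of the UCD.

```python
# The two forms of the ETHIOPIC DIGIT ONE line quoted in the erratum.
BROKEN = "1369;ETHIOPIC DIGIT ONE;Nd;0;L;;;1;1;N;;;;;"
FIXED = "1369;ETHIOPIC DIGIT ONE;Nd;0;L;;1;1;1;N;;;;;"


def numeric_type(line):
    """Derive a Numeric_Type label from one UnicodeData.txt line."""
    fields = line.split(";")
    decimal, digit, numeric = fields[6], fields[7], fields[8]
    if decimal:
        return "Decimal"
    if digit:
        return "Digit"
    if numeric:
        return "Numeric"
    return "None"


def is_consistent(line):
    """True if General_Category Nd and Numeric_Type Decimal agree."""
    general_category = line.split(";")[2]
    return (general_category == "Nd") == (numeric_type(line) == "Decimal")
```

Here numeric_type(BROKEN) gives "Digit" because field 6 is empty, so is_consistent(BROKEN) is False; the corrected line gives "Decimal" and passes the check.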
In DerivedCoreProperties.txt in 4.0.1, the comment line with the derivation for Default_Ignorable_Code_Point is in error. The following:
# Generated from Other_Default_Ignorable_Code_Point + Cf + Cc + Cs
+ Noncharacters - White_Space - Annotation_characters
should have been:
# Generated from Other_Default_Ignorable_Code_Point + Cf + Cc + Cs
# + Variation_Selector + Noncharacter_Code_Point
# - White_Space - Annotation_characters |
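The corrected derivation is plain set arithmetic, which the following Python sketch spells out. The function name is hypothetical, and the sample code points are hand-picked for illustration only; they are not a complete or authoritative listing of any property in 4.0.1.

```python
def default_ignorable(other_di, cf, cc, cs, variation_selector,
                      noncharacter, white_space, annotation):
    """Apply the corrected derivation to sets of code points."""
    included = other_di | cf | cc | cs | variation_selector | noncharacter
    return included - white_space - annotation


# Illustrative sample data only (a few code points per input set).
sample = default_ignorable(
    other_di={0x034F},
    cf={0x00AD, 0xFFF9},         # U+FFF9 is also an annotation character
    cc={0x0009},                 # U+0009 is also White_Space
    cs={0xD800},
    variation_selector={0xFE00},
    noncharacter={0xFFFE},
    white_space={0x0009},
    annotation={0xFFF9},
)
```

In this sample, U+0009 and U+FFF9 are removed by the two subtractions, while U+FE00 is included only because of the Variation_Selector term that the erroneous comment omitted.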
In the 4.0.1 version of UCD.html under BIDIClass, above the row:
L Otherwise
the following default properties for BN should have been added:
BN |
U+2064..U+2069, U+FDD0..U+FDEF, U+FFFE..U+FFFF, U+1FFFE..U+1FFFF,
U+2FFFE..U+2FFFF, U+3FFFE..U+3FFFF, U+4FFFE..U+4FFFF, U+5FFFE..U+5FFFF,
U+6FFFE..U+6FFFF,
U+7FFFE..U+7FFFF, U+8FFFE..U+8FFFF, U+9FFFE..U+9FFFF, U+AFFFE..U+AFFFF,
U+BFFFE..U+BFFFF, U+CFFFE..U+CFFFF, U+DFFFE..U+E0000, U+E0002..U+E001F,
U+E0080..U+E00FF,
U+E01F0..U+E0FFF, U+EFFFE..U+EFFFF, U+FFFFE..U+FFFFF,
U+10FFFE..U+10FFFF |
|
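For implementers, the default BN ranges in the erratum can be transcribed into a lookup table. This is a minimal Python sketch; the names BN_DEFAULT_RANGES and in_bn_default_range are assumptions for this example, and the table covers only the default ranges quoted in the erratum, not characters that are assigned BN explicitly in the data files.

```python
# Inclusive (first, last) code point ranges, transcribed from the erratum.
BN_DEFAULT_RANGES = [
    (0x2064, 0x2069), (0xFDD0, 0xFDEF), (0xFFFE, 0xFFFF),
    (0x1FFFE, 0x1FFFF), (0x2FFFE, 0x2FFFF), (0x3FFFE, 0x3FFFF),
    (0x4FFFE, 0x4FFFF), (0x5FFFE, 0x5FFFF), (0x6FFFE, 0x6FFFF),
    (0x7FFFE, 0x7FFFF), (0x8FFFE, 0x8FFFF), (0x9FFFE, 0x9FFFF),
    (0xAFFFE, 0xAFFFF), (0xBFFFE, 0xBFFFF), (0xCFFFE, 0xCFFFF),
    (0xDFFFE, 0xE0000), (0xE0002, 0xE001F), (0xE0080, 0xE00FF),
    (0xE01F0, 0xE0FFF), (0xEFFFE, 0xEFFFF), (0xFFFFE, 0xFFFFF),
    (0x10FFFE, 0x10FFFF),
]


def in_bn_default_range(code_point):
    """Return True if code_point lies in one of the listed ranges."""
    return any(first <= code_point <= last
               for first, last in BN_DEFAULT_RANGES)
```

Note that U+E0001 falls in the gap between U+DFFFE..U+E0000 and U+E0002..U+E001F, so in_bn_default_range(0xE0001) is False.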
2004-March-7 |
3396 SQUARE ML: The
representative glyph on the left below shows the character as it
appeared in some versions of previous code charts; the glyph on the
right shows the character as it should appear (with lower case
'm').
The representative glyph was inadvertently shown with an upper
case 'M' in Unicode
4.0 and Unicode 2.0, but was shown correctly in all other versions.
|
2004-March-7 |
In the character code charts for Unicode 4.0 (http://www.unicode.org/charts/PDF/Unicode-4.0/U40-1D300.pdf)
the following characters are shown with incorrect
representative glyphs: U+1D301, U+1D302, and U+1D303. The
incorrect glyphs are shown on the left; the corrected
glyphs are shown on the right.
|
2003-December-01 |
The following Unified Han Ideographs are shown
with incorrect representative glyphs in the online code charts
for Unicode 3.1 (http://www.unicode.org/charts/PDF/Unicode-3.1/U31-20000.pdf)
and the printed code charts for Unicode 4.0: U+2384F, U+25D0E,
U+27CF1, and U+2890F. The representative glyph on the left below
shows each character as it appeared in the earlier versions of
the code charts; the glyph on the right shows each character as
it should appear.
|
2003-August-25 |
The annotation for 200B in the Unicode code
charts should read:
* This character is intended for line break control. It has
no width, but its presence between two characters does not
prevent increased letter spacing in justification. |
2003-May-23 |
031A COMBINING LEFT ANGLE ABOVE: The
representative glyph on the left below shows the character as it
appeared in some versions of previous charts; the glyph on the
right shows the character as it should appear (with left angle
aligned over right shoulder of base character, not centered).
The representative glyph was inadvertently centered in Unicode
3.0, but was shown correctly in earlier versions.
|