Unicode® Statistics
This page provides various statistics regarding the Unicode Standard and related specifications.
Last updated: September 11, 2024
Character Counts
One of the most basic questions about the Unicode Standard is, "How many characters are encoded?"
The answer to that question is surprisingly complicated, because there are so many different
types of characters (and code points) involved in the architecture and maintenance of
the universal character encoding.
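As a rough illustration of why a single number is hard to give, the following sketch tallies every code point by its General_Category value using Python's standard `unicodedata` module. It is not the methodology behind the official counts: the results depend on the Unicode version bundled with the Python interpreter (reported by `unicodedata.unidata_version`), and the helper name `count_by_category` is a hypothetical one chosen for this example.

```python
import unicodedata
from collections import Counter

# Size of the Unicode codespace: U+0000..U+10FFFF.
CODESPACE_SIZE = 0x110000


def count_by_category():
    """Tally all code points by General_Category (hypothetical helper)."""
    counts = Counter()
    for cp in range(CODESPACE_SIZE):
        counts[unicodedata.category(chr(cp))] += 1
    return counts


counts = count_by_category()
total = sum(counts.values())

print("Unicode data version:", unicodedata.unidata_version)
print("Code points in codespace:", total)
# Categories that are not ordinary assigned characters:
print("Surrogates (Cs):", counts["Cs"])
print("Private use (Co):", counts["Co"])
print("Control (Cc):", counts["Cc"])
print("Format (Cf):", counts["Cf"])
# Cn covers both noncharacters and reserved (unassigned) code points.
print("Unassigned or noncharacter (Cn):", counts["Cn"])
```

The surrogate, private-use, and control totals are fixed by the architecture of the standard, while the other totals change from version to version.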
Over the years, conventions have been developed for how to track the number of encoded
characters of various types in the Unicode Standard. The counts were traditionally
published in
Appendix D, Version History of the Standard
in each new version. That practice continued up to Unicode 12.0. Since then,
the information has been restructured and presented here to make it more accessible. For an explanation of
terminology related to code point types mentioned in these tables, see
Section 2.4.1, Types of Code Points in the core specification. For information about some of the odder types
of characters in Unicode, see also
the Private-Use Characters and Noncharacters FAQ.
Raw Character Counts by Unicode Version
To help in visualizing the growth of the Unicode Standard over time, the following simple charts
show some important raw character counts by year.
Charts for Characters Added by Year
Emoji Counts
Counting emoji in the Unicode Standard is a special challenge, because the full
definition of emoji includes many different kinds of character sequences, each of which is
presented to an end user as a single emoji glyph. An obvious example is a sequence of
two regional indicator characters, which is interpreted and displayed as a single,
distinct "flag emoji". Tables have been compiled enumerating all the different kinds of emoji
for different versions dating back to Version 3.0 of UTS #51, Unicode Emoji.
Note that Emoji Version 3.0 is the earliest version with meaningful emoji counts.
Emoji Version numbers prior to Version 11.0 were not tightly synchronized with versions
of the Unicode Standard. For information about which emoji characters were part of the
Unicode Standard earlier than Unicode 9.0, see Emoji Versions.
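The regional indicator example mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name `flag_sequence` is hypothetical, and the sketch does not check whether the two-letter code is a valid region, which is what determines whether the pair actually counts as a flag emoji.

```python
# U+1F1E6 is REGIONAL INDICATOR SYMBOL LETTER A; B..Z follow contiguously.
REGIONAL_INDICATOR_A = 0x1F1E6


def flag_sequence(region_code):
    """Map a two-letter ASCII code to a pair of regional indicator symbols.

    Hypothetical helper for illustration; it does not validate that the
    code is an actual region with a corresponding flag emoji.
    """
    if len(region_code) != 2 or not region_code.isascii() or not region_code.isalpha():
        raise ValueError("expected two ASCII letters")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A"))
        for letter in region_code.upper()
    )


flag = flag_sequence("FR")
code_points = [f"U+{ord(ch):04X}" for ch in flag]

print(len(flag))     # 2 code points in the string
print(code_points)   # ['U+1F1EB', 'U+1F1F7']
```

A user sees one flag glyph where rendering support exists, yet the string holds two encoded characters, which is why emoji are counted separately from characters.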
Emoji Counts by Emoji Version
Number of Scripts
As the Unicode Standard has expanded over the years, the number of scripts supported by the standard
has also increased dramatically. The version-by-version additions are documented on
the Supported Scripts page.
For convenience, that table also tracks a running total of the number of scripts in the standard.