Saturday, September 1, 2012

The Most Common Chinese Characters



The list was compiled from statistical lists of Chinese characters and a number of large dictionaries. All characters are presented in descending order of frequency. Pronunciations are given in Pinyin, and several possible pronunciations are listed for some characters. Examples of common words are given for most characters, though with no guarantee that all of the most common words are listed or that the given examples are particularly common. Some of the listed pronunciations are used less often than other pronunciations of the same character, and in those cases translations and examples may be missing. Some additional comments are given.

The current edition of the list "The Most Common Chinese Characters in order of frequency" lists more than 2,700 characters. The list is complete for the 2,400 most common characters; beyond that, it contains a number of gaps. The document is encoded in GB2312. Its size is 337 kB, and a printout will fill more than 100 pages. New versions containing more characters and additional details are published from time to time.
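Since the list is encoded in GB2312, a reader working with it in a script has to decode it explicitly. The following is a minimal sketch in Python, assuming the file has been saved locally; the function names and the path argument are hypothetical and not part of the original list.

```python
def decode_gb2312(raw: bytes) -> str:
    """Decode GB2312-encoded bytes into a Python string."""
    # errors="replace" substitutes U+FFFD for any byte sequence
    # that is not valid GB2312, instead of raising an exception.
    return raw.decode("gb2312", errors="replace")


def read_character_list(path: str) -> str:
    """Read a GB2312-encoded text file (hypothetical local copy of the list)."""
    with open(path, "rb") as handle:
        return decode_gb2312(handle.read())


# Round-trip a few very common characters to show the encoding at work.
sample = "的一是".encode("gb2312")
print(len(sample))            # each of these characters takes two bytes
print(decode_gb2312(sample))
```

Reading in binary mode and decoding afterwards keeps the choice of error handling in one place; passing `encoding="gb2312"` to `open` would work as well.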

Statistics

Both Jun Da and Chih-Hao Tsai present detailed statistics on the use of Chinese characters on their websites. In my experience, Jun Da's statistics are quite reliable, so my list of the most common Chinese characters is based on his research.

According to the statistics, knowing a given number of the most common characters should result in the following estimated understanding of the Chinese language:



 100 characters → 42% understanding    1600 characters → 95.0% understanding
 200 characters → 55% understanding    1700 characters → 95.5% understanding
 300 characters → 64% understanding    1800 characters → 96.0% understanding
 400 characters → 70% understanding    1900 characters → 96.5% understanding
 500 characters → 75% understanding    2000 characters → 97.0% understanding
 600 characters → 79% understanding    2100 characters → 97.4% understanding
 700 characters → 82% understanding    2200 characters → 97.7% understanding
 800 characters → 85% understanding    2300 characters → 98.0% understanding
 900 characters → 87% understanding    2400 characters → 98.3% understanding
1000 characters → 89% understanding    2500 characters → 98.5% understanding
1100 characters → 90% understanding    2600 characters → 98.7% understanding
1200 characters → 91% understanding    2700 characters → 98.9% understanding
1300 characters → 92% understanding    2800 characters → 99.0% understanding
1400 characters → 93% understanding    2900 characters → 99.1% understanding
1500 characters → 94% understanding    3000 characters → 99.2% understanding
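One plausible reading of the figures above is that each percentage is a cumulative frequency: the share of all character occurrences in a body of text that is accounted for by the N most common characters. The source does not spell out the method, so the following Python sketch is only an illustration under that assumption; the function name and the toy text are made up for the example and are not taken from Jun Da's data.

```python
from collections import Counter


def cumulative_coverage(text: str, top_n: int) -> float:
    """Share of character occurrences covered by the top_n most frequent characters."""
    # Count every non-whitespace character in the text.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum the occurrences of the top_n most frequent characters.
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


# Toy text: 的 appears 4 times, 一 3 times, 是 twice, 不 once (10 in total).
toy_text = "的的的的一一一是是不"
for n in (1, 2, 3, 4):
    print(n, "characters ->", round(100 * cumulative_coverage(toy_text, n)), "%")
```

On a real corpus the curve flattens quickly, which is the pattern visible in the table: the first few hundred characters account for most occurrences, and each additional hundred adds less.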

Ref: http://www.zein.se/patrick/3000en.html
