The list was compiled from statistical lists of Chinese characters and a number of large dictionaries. All characters are presented in descending order of frequency. Pronunciations are given in Pinyin, and for some characters several possible pronunciations are listed. Examples of common words are given for most characters, though with no guarantee that all the most common words are listed or that the given examples are particularly common. Some of the listed pronunciations are less used than other pronunciations for the same character, and in those cases translations and examples may be missing. Some additional comments are given.
The current edition of the list "The Most Common Chinese Characters in order of frequency" lists more than 2,700 characters. The list is complete for the 2,400 most common characters; beyond that, it contains a number of gaps. The document is encoded in GB2312. It is 337 kB in size, and a printout fills more than 100 pages. New versions containing more characters and additional details are published from time to time.
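Because the document is encoded in GB2312 rather than UTF-8, it has to be decoded explicitly before the characters display correctly. The following is a minimal Python sketch of that step. The sample bytes and the helper name `read_gb2312` are assumptions made for illustration; they are not part of the original list.

```python
# Hypothetical sample: five common characters encoded as GB2312 bytes,
# standing in for the contents of a GB2312-encoded document.
sample_bytes = "的一是不了".encode("gb2312")

# Decode the raw bytes back into text. errors="replace" substitutes a
# placeholder for any byte sequence that is not valid GB2312.
text = sample_bytes.decode("gb2312", errors="replace")


def read_gb2312(path):
    """Read a whole GB2312-encoded text file (helper name is illustrative)."""
    with open(path, encoding="gb2312", errors="replace") as f:
        return f.read()


print(text)
print(len(sample_bytes), "bytes for", len(text), "characters")
```

Each Chinese character in GB2312 occupies two bytes, which is why the byte count in the sketch is twice the character count.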
Statistics
Both Jun Da and Chih-Hao Tsai present detailed statistics on the use of Chinese characters on their websites. In my experience, Jun Da's statistics are quite reliable, so my list of the most common Chinese characters is based on his research.
According to the statistics, knowing a given number of the most common characters should result in the following estimated understanding of the Chinese language:
100 characters → 42% understanding | 1600 characters → 95.0% understanding |
200 characters → 55% understanding | 1700 characters → 95.5% understanding |
300 characters → 64% understanding | 1800 characters → 96.0% understanding |
400 characters → 70% understanding | 1900 characters → 96.5% understanding |
500 characters → 75% understanding | 2000 characters → 97.0% understanding |
600 characters → 79% understanding | 2100 characters → 97.4% understanding |
700 characters → 82% understanding | 2200 characters → 97.7% understanding |
800 characters → 85% understanding | 2300 characters → 98.0% understanding |
900 characters → 87% understanding | 2400 characters → 98.3% understanding |
1000 characters → 89% understanding | 2500 characters → 98.5% understanding |
1100 characters → 90% understanding | 2600 characters → 98.7% understanding |
1200 characters → 91% understanding | 2700 characters → 98.9% understanding |
1300 characters → 92% understanding | 2800 characters → 99.0% understanding |
1400 characters → 93% understanding | 2900 characters → 99.1% understanding |
1500 characters → 94% understanding | 3000 characters → 99.2% understanding |
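Figures like those in the table are typically derived by counting how often each character occurs in a large body of text and accumulating the shares in descending order of frequency. The Python sketch below illustrates that calculation. It assumes that "understanding" is measured as the share of characters in running text that are covered, and it uses a tiny invented sample rather than Jun Da's data. The function names `cumulative_coverage` and `chars_needed` are hypothetical.

```python
from collections import Counter
from itertools import accumulate


def cumulative_coverage(text):
    """Return (character, cumulative share) pairs, most frequent first."""
    # Count every non-whitespace character in the text.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    ranked = counts.most_common()
    # Running total of occurrences, in descending order of frequency.
    running = accumulate(n for _, n in ranked)
    return [(ch, cum / total) for (ch, _), cum in zip(ranked, running)]


def chars_needed(coverage, target):
    """Smallest number of top-ranked characters whose share reaches target."""
    for rank, (_, share) in enumerate(coverage, start=1):
        if share >= target:
            return rank
    return None  # the target is never reached


# Invented toy sample, not real corpus data.
sample = "的的的的一一一是是不"
coverage = cumulative_coverage(sample)
for ch, share in coverage:
    print(f"{ch}: {share:.0%}")
print("characters needed for 75% coverage:", chars_needed(coverage, 0.75))
```

In this toy sample the most frequent character alone covers 40% of the text, and three characters are enough to pass 75%. The same shape appears in the table: the first few hundred characters account for most of the coverage, and each additional hundred adds less.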
Ref: http://www.zein.se/patrick/3000en.html