I stumbled across this tool years ago but never got around to actually testing the kids. Once we got over the first 1000 characters and they were reading, my obsession with character count kind of went out the window.
In the meantime, the tool came out of research and went into implementation by Taiwanese schools via 天下, a local book publisher. You can now test the kids electronically online! And it looks like even overseas Chinese schools can ask to have online accounts set up.
But alas, they did not make it easy for overseas parents to take the test. I can only download the old version and do it on paper. One reason I didn’t use it till this year!
The test definitely has its uses. Much like the San Diego Reading Assessment test for assessing English reading level, you can use this to guesstimate the number of characters they know.
The test works by sampling characters from a frequency list and then extrapolating, on a straight ratio, from how many of the sampled characters you got right to how many characters you know overall.
How to Give the Test
- Have the child take the student version of the test. Have them read each character from left to right while making a word from it. For example, if they see 歡, they say 歡喜的歡.
- On your teacher version of the test, mark down the characters they can’t make words for.
- Under each character there is a shape (square, circle, triangle). Count up how many of each type of shape there are and calculate the score.
The author divided the frequency list into 3 sets: the first 1500, the second 1500, and the third 2000. There are 3 test levels, and each level draws more heavily from its own part of the list. E.g., the beginner version has 48 characters from the first 3000 characters but only 12 from the last 2000, while the advanced test has 36 characters from #3001-#5000 of the frequency list.
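To make the "straight ratio" idea concrete, here is a minimal sketch of how one test character can stand in for a chunk of the frequency list. The function name is my own, and I am assuming the weight is simply the size of the set divided by the number of characters sampled from it. Using the sample counts quoted above (12 and 36 characters from the last 2000), this comes out close to the numbers used in the scoring formulas.

```python
def per_character_weight(band_size: int, sampled: int) -> float:
    """How many characters of a frequency set one sampled test character represents.

    Assumption: a plain ratio of set size to sample count.
    """
    if sampled <= 0:
        raise ValueError("sampled must be a positive count")
    return band_size / sampled


# Sample counts quoted in the post for the last 2000 characters of the list.
beginner_hard = per_character_weight(2000, 12)  # beginner test: 12 characters
advanced_hard = per_character_weight(2000, 36)  # advanced test: 36 characters

print(round(beginner_hard, 1), round(advanced_hard, 1))  # prints: 166.7 55.6
```

So on the beginner test, each hard character the child gets right counts for roughly 167 characters, while on the advanced test it only counts for about 56, because that set is sampled much more densely.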
For each of the 3 levels, there are 10 tests you can choose from.
A picture is worth a thousand words.
Interpreting the Test
The Excel spreadsheet to calculate the estimated number of characters is available at the end of the post.
Basically, the estimated number of characters you know is the number you got correct in each difficulty group, multiplied by the weight for that group:
- Beginner Test – 62.5*easy + 62.5*medium + 167*hard
- Intermediate Test – 167*easy + 62.5*medium + 62.5*hard
- Advanced Test – 250*easy + 83.3*medium + 55.6*hard
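If you would rather not open the spreadsheet, the same arithmetic can be sketched in a few lines of Python. The weights below are copied as listed above; the function name and the sample scores are made up for illustration, and I am assuming easy, medium, and hard correspond to the three shapes on the teacher sheet.

```python
# Weight per correct answer, copied from the formulas in this post: (easy, medium, hard).
WEIGHTS = {
    "beginner": (62.5, 62.5, 167.0),
    "intermediate": (167.0, 62.5, 62.5),
    "advanced": (250.0, 83.3, 55.6),
}


def estimate_characters(level: str, easy: int, medium: int, hard: int) -> float:
    """Estimate characters known from the number answered correctly in each group."""
    w_easy, w_medium, w_hard = WEIGHTS[level]
    return w_easy * easy + w_medium * medium + w_hard * hard


# Hypothetical beginner result: 20 easy, 10 medium, and 2 hard characters correct.
print(estimate_characters("beginner", 20, 10, 2))  # prints: 2209.0
```

Remember to pass in the number the child got right in each group, not the number you marked as missed.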
According to this website, which lists the research paper results, here is the range of character recognition by grade level.
Grade | Avg Characters Recognized | Range for 95% |
1st | 713 | 371-1053 |
2nd | 1248 | 971-1527 |
3rd | 2108 | 1410-2806 |
4th | 2660 | 1779-3543 |
5th | 3142 | 2425-3859 |
6th | 3340 | 2531-4149 |
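As a rough way to read a score against this table, here is a small sketch that looks up which grades' 95% ranges contain an estimate and which grade average is closest. The numbers are copied from the table; the function names are my own, and the "closest average" rule is just one simple way to pick a single grade, not something from the research.

```python
# (grade, average, low end of 95% range, high end of 95% range), copied from the table.
GRADE_RANGES = [
    ("1st", 713, 371, 1053),
    ("2nd", 1248, 971, 1527),
    ("3rd", 2108, 1410, 2806),
    ("4th", 2660, 1779, 3543),
    ("5th", 3142, 2425, 3859),
    ("6th", 3340, 2531, 4149),
]


def grades_containing(estimate: float) -> list:
    """Grades whose 95% range includes the estimated character count."""
    return [grade for grade, _avg, low, high in GRADE_RANGES if low <= estimate <= high]


def closest_grade_by_average(estimate: float) -> str:
    """Grade whose average is nearest to the estimate (a simplifying assumption)."""
    return min(GRADE_RANGES, key=lambda row: abs(row[1] - estimate))[0]


print(grades_containing(2709))         # prints: ['3rd', '4th', '5th', '6th']
print(closest_grade_by_average(2709))  # prints: 4th
```

The ranges overlap a lot, so one score can plausibly sit in several grades at once. That is why I treat the result as a ballpark rather than an exact grade.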
What’s interesting from this website is that it says if you know 1558 characters, that covers 95% of words you see in most texts, and 2709 characters covers 99%. 2709 characters is around a 4th grade level.
Their conclusion is that knowing 2709 characters gets you out of illiteracy.
However, then it also said most magazines and newspapers have around 5200-5500 different characters. This is different from normal day to day texts. I guess if you want academic Chinese, you’d need to know a lot. But for day to day, it’s not that many.
I wrote a separate post years ago on Chinese Characters by Grade Level, which covers the number of characters the government wants you to recognize by grade. Basically, it is 400 per grade level.
As you can see, this is way different than what kids would actually recognize in Taiwan. On a day to day basis, they definitely are exposed to way more characters than that minimum 400 a year.
Looking back at the kids I know, the above table sounds about right. It roughly matches the reading levels I keep in my head. When the kids first learn to read in 1st and 2nd grade, they need about 1000+ characters to get through those Early Readers.
As they get up to 3rd and 4th grade and start dropping zhuyin, I would have pegged them at knowing 2000+ characters, or at least being able to guess that many.
Again, it’s really more about comprehension at higher levels. And for us overseas Chinese learners, many kids start lagging behind after 3rd grade because they can’t keep up with comprehension.
So I wonder if the goal should be to know enough characters and have enough comprehension to be able to read 4th grade books by the end of 2nd grade?
How Did We Do?
One of my kids tested at around the 5th and 6th grade level per the chart above, whereas I know roughly 4500 characters.
To me this is pretty accurate. We’ve been stuck at the 5th and 6th grade level for the last few years. She can read middle school and adult texts fine, but ask her to make words or tell you the meaning and she cannot. She mispronounces many characters that I feel are common knowledge.
For me, I could make vocabulary words for way more of the difficult characters and I can also pronounce them. If you need 5200+ to read magazines and newspapers, that is exactly my level. I can read adult magazines just fine. However, since I’m slightly below 5200, I do have trouble with more advanced characters and in general, feel like I can’t quite understand what I’m reading compared with English text.
Caveats
My kid did not like me marking words she couldn’t make! So make sure you have a teacher and student version available.
The test isn’t necessarily suited for someone who learned their difficult characters by reading, like me. I noticed that I recognize a lot of characters by sight but can’t pronounce them. Or I couldn’t really make a word out of a character, but I knew its meaning and could recognize it when it appeared next to other characters in a word, as that was how I learned to recognize it. Taking the character out of a word actually makes it harder for me to recognize, especially because so many characters differ by just one radical!
So the test would work better for people who study Chinese the traditional way. They study characters and make vocabulary words from them.
That said, I still felt the test was helpful in pinpointing holes in our Chinese learning, namely not knowing how to pronounce some characters. At the same time, I was pleasantly surprised at the vocabulary the kids could make, as we never studied characters the traditional way! They learned most of it from reading (input). I find that when you learn from reading, you can’t output (speak or write) as well, even though you comprehend just fine.