Text Analysis in the Wild


The Globe and Mail on November 13th had an interesting example of text analysis in the wild. Spanning pages A10 and A11, they ran a box listing the high-frequency words in the old citizenship guide and in the new one, with a word cloud in the middle. Here is what the description says:

Discover Canada, a different look at the country

The new citizenship guide, Discover Canada, is a much more comprehensive look at Canada’s history and system of government than its predecessor, A Look at Canada, which was produced under the Liberals in 1995. It’s longer (17,536 words to 10,433), with 10 pages devoted to Canadian history, compared to two in the previous version. Its emphasis also differs, with more attention paid to the military, the Crown and Quebec, and less to the environment.

>> Below is a graphic representation of the most frequently used words in the new citizenship guide. The bigger the word the more often it appears.

I had to fold the page to scan it, as it is longer than my scanner, but you get the idea. The PDF is here. I would have preferred the two lists at either edge of the box to be closer together so that we could compare them. Note the small print: they used Many Eyes and WriteWords, which has a word frequency counting tool.
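For readers curious about what a word frequency count involves, here is a minimal Python sketch. It is not the method used by Many Eyes or WriteWords, whose internals I don't know. The function names, the tiny stopword list, and the two sample sentences are all made up for illustration and are not taken from either citizenship guide.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real tool would use a longer one.
STOPWORDS = frozenset({"the", "and", "of", "to", "a", "in", "is"})


def word_frequencies(text):
    """Return a Counter of lower-cased words found in the text."""
    # Treat any run of letters or apostrophes as a word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def top_words(text, n=10, stopwords=frozenset()):
    """Return the n most frequent words, skipping any stopwords."""
    counts = word_frequencies(text)
    for word in stopwords:
        counts.pop(word, None)
    return counts.most_common(n)


if __name__ == "__main__":
    # Invented sample sentences standing in for the old and new guides.
    old_sample = "The environment and the land shape the life of the country."
    new_sample = "The Crown and the military shaped the history of the country."

    print("old:", top_words(old_sample, n=5, stopwords=STOPWORDS))
    print("new:", top_words(new_sample, n=5, stopwords=STOPWORDS))
```

Running the same function over both texts yields two lists that can be printed side by side, which is the comparison the Globe's layout made awkward. A word cloud is then just a matter of scaling each word's font size by its count.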