Reading John B. Smith’s “Computer Criticism” (Style, Vol. XII, No. 4), I came across a reference to a content analysis program from the 1960s called The General Inquirer. The program still has a following and has been rewritten in Java. See the Inquirer Home Page. There is also a web version where you can try it here (DO NOT USE A LARGE TEXT).
The General Inquirer “maps” a text to a thesaurus of categories, disambiguating word senses along the way. The web page about How the General Inquirer is used describes what it does thus:
The General Inquirer is basically a mapping tool. It maps each text file with counts on dictionary-supplied categories. The currently distributed version combines the “Harvard IV-4” dictionary content-analysis categories, the “Lasswell” dictionary content-analysis categories, and five categories based on the social cognition work of Semin and Fiedler, making for 182 categories in all. Each category is a list of words and word senses. A category such as “self references” may contain only a dozen entries, mostly pronouns. Currently, the category “negative” is our largest with 2291 entries. Users can also add additional categories of any size.
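To make the mapping idea concrete, here is a minimal Python sketch of counting a text against dictionary-supplied categories. The category names and word lists are invented for illustration and are not taken from the Harvard IV-4 or Lasswell dictionaries. The sketch also skips word-sense disambiguation entirely, so it should be read as a rough picture of the idea, not as how the General Inquirer is actually implemented.

```python
import re
from collections import Counter

# Hypothetical toy dictionary (category -> set of words).
# These entries are made up; the real categories are far larger
# and list word senses, not just surface words.
TOY_CATEGORIES = {
    "self_references": {"i", "me", "my", "mine", "myself"},
    "negative": {"bad", "fear", "loss", "angry"},
}


def count_categories(text, categories=TOY_CATEGORIES):
    """Count how many tokens of `text` fall into each category."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Start every category at zero so absent categories still appear.
    counts = Counter({name: 0 for name in categories})
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return counts


sample = "I fear my bad luck will cost me dearly."
print(count_categories(sample))
```

For the sample sentence this gives 3 for the toy “self_references” category (I, my, me) and 2 for the toy “negative” category (fear, bad). A word that belongs to several categories is counted once in each.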
As the developers say later on that page, their categories were developed for “social-science content-analysis research applications” and not for other uses like literary study. The original developer published a book on the tool in 1966:
Philip J. Stone, The General Inquirer: A Computer Approach to Content Analysis. (Cambridge: M. I. T. Press, 1966).