H. P. Luhn, KWIC and the Concordance

We all know that the Google display comes indirectly from the concordance, but I have found in Luhn’s 1966 “Keyword-in-Context Index for Technical Literature (KWIC Index)” an explicit recognition of the link and the reason for drawing on the concordance.

the significance of such single keywords could, in most instances, be determined only by referring to the statement from which the keyword had been chosen. This somewhat tedious procedure may be alleviated to a significant degree by listing selected keywords together with surrounding words that act as modifiers pointing up the more specific sense in which a keyword has been applied. This method of indexing words is well established in the process of compiling concordances of important works of literature of the past. The added degree of information conveyed by such keyword-in-context indexes, or “KWIC Indexes” for short, can readily be provided by automatic processing. (p. 161)

The problem for Luhn is that simply retrieving words doesn’t give you a sense of their use. His solution, first shown in the late 1950s, was to provide some context (hence “keyword-in-context”) so that readers can disambiguate for themselves and decide which index items to follow. It is from the KWIC that we ultimately get the concordance features of the Google display, though it should be noted that Luhn was proposing KWIC as a way of printing automatically generated literature indexes in which the keywords were drawn from the titles. In this quote Luhn explicitly acknowledges that this is a method well established in concordances.
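To make the idea concrete, here is a minimal sketch in Python of a keyword-in-context index built from titles. It is not Luhn's own procedure: the function name kwic_index, the short stop list, the width parameter and the sample titles are all assumptions made for illustration. Each keyword that is not on the stop list gets one line, with the keyword aligned in a fixed column and the surrounding words of the title on either side.

```python
import re

# Hypothetical stop list for illustration; a real index would use a longer one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_index(titles, width=30, stop_words=STOP_WORDS):
    """Return keyword-in-context lines, sorted alphabetically by keyword.

    Every line has `width` characters of left context (right-aligned),
    followed by the keyword and up to `width` characters starting at it,
    so the keywords line up in a single column.
    """
    entries = []
    for title in titles:
        # Treat each run of letters (with internal ' or -) as a candidate keyword.
        for match in re.finditer(r"[A-Za-z][A-Za-z'-]*", title):
            keyword = match.group()
            if keyword.lower() in stop_words:
                continue
            left = title[:match.start()][-width:]   # context before the keyword
            right = title[match.start():][:width]   # keyword plus context after
            entries.append((keyword.lower(), f"{left:>{width}}{right}"))
    # Sort on the keyword only; the sort is stable, so ties keep input order.
    entries.sort(key=lambda entry: entry[0])
    return [line for _, line in entries]


if __name__ == "__main__":
    sample_titles = [
        "Automatic Indexing of Technical Literature",
        "The Concordance in Literature",
    ]
    for line in kwic_index(sample_titles):
        print(line)
```

Reading down the keyword column, a word like “Literature” appears once per title, each time with different neighbours, which is the disambiguating context Luhn describes.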

There is also a link between Luhn and Father Busa. According to Black, quoted in Marguerite Fischer, “The KWIC Index Concept: A Retrospective View”,

the Pontifical Faculty of Philosophy in Milan decided that they would make an analytical index and concordance to the Summa Theologica of St. Thomas Aquinas, and approached IBM about the possibility of having the operations performed on Data Processing. Experience gained in this project contributed towards the development of the KWIC Index. (This is a quote on page 123 from Black, J. D., 1962, “The Keyword: Its Use in Abstracting, Indexing, and Retrieving Information”.)

From the concordance to KWIC through to Google?

For some historical notes on Luhn, see H. P. Luhn and Automatic Indexing.