Greg Crane: What Do You Do with a Million Books?

In “What Do You Do with a Million Books?”, Gregory Crane discusses the implications of large-scale book scanning projects like the Google Print project, which is scanning tens of millions of books. He introduces an interesting term, “recombinant documents”, to describe both how software (like what they have at Perseus) can add intelligent connections to documents and how documents can be reorganized and combined into “concordances” or hybrid documents. This is similar, I think, to what Mark Olsen was talking about in “Toward meaningful computing”. Crane’s answer to the question in his title, drawn from the DARPA Global Autonomous Language Exploitation (GALE) project, is three core functions:

  1. Analog to text (digitizing speech and print)
  2. Machine translation (from language to language)
  3. Information extraction (mining for linkable dates, names and so on)
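
To make the third function a little more concrete, here is a toy sketch of what mining text for linkable dates and names could look like. It is only an illustration under loud assumptions: the patterns `YEAR` and `NAME` and the function `extract_entities` are hypothetical names of my own, not anything from Crane's article, Perseus, or GALE, and real systems use far richer models than two regular expressions.

```python
import re

# Hypothetical patterns, for illustration only (not from Crane, Perseus, or GALE).
# YEAR: a four-digit year between 1000 and 2099.
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
# NAME: two or more consecutive capitalized words, a crude stand-in for a personal name.
NAME = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_entities(text):
    """Return candidate years and multi-word capitalized names found in text."""
    return {
        "years": YEAR.findall(text),
        "names": NAME.findall(text),
    }


sample = "Gregory Crane wrote about the Perseus project in 2006."
result = extract_entities(sample)
print(result)
```

Each extracted string is a candidate anchor that could be linked to other documents mentioning the same name or date, which is roughly the kind of connection a recombinant document would depend on.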

Thanks to Mark Olsen for this link.