{"id":1166,"date":"2006-03-29T09:25:02","date_gmt":"2006-03-29T13:25:02","guid":{"rendered":"http:\/\/www.theoreti.ca\/?p=1166"},"modified":"2009-03-31T00:50:00","modified_gmt":"2009-03-31T05:50:00","slug":"greg-crane-what-do-you-do-with-a-million-books","status":"publish","type":"post","link":"https:\/\/theoreti.ca\/?p=1166","title":{"rendered":"Greg Crane: What Do You Do with a Million Books?"},"content":{"rendered":"<p><a title=\"What Do You Do with a Million Books?\" href=\"http:\/\/www.dlib.org\/dlib\/march06\/crane\/03crane.html\">What Do You Do with a Million Books?<\/a>, by Gregory Crane, talks about the implications of large-scale book scanning projects like the Google Print project, which is scanning tens of millions of books. He introduces an interesting term, &#8220;recombinant documents&#8221;, to describe how software (like what they have at Perseus) can add intelligent connections to documents, and also how documents can be reorganized and combined into &#8220;concordances&#8221; or hybrid documents. This is similar, I think, to what Mark Olsen was talking about in <a title=\"Toward meaningful computing\" href=\"http:\/\/www.theoreti.ca\/?p=1164\">Toward meaningful computing<\/a>. Crane&#8217;s answer, drawn from the <a title=\"DARPA Global Autonomous Language Exploitation\" href=\"http:\/\/www.theoreti.ca\/wp-content\/uploads\/notes\/000908.html\">DARPA Global Autonomous Language Exploitation<\/a> (GALE) project, is three core functions:<\/p>\n<ol>\n<li>Analog to text (digitizing speech and print)<\/li>\n<li>Machine translation (from language to language)<\/li>\n<li>Information extraction (mining for linkable dates, names, and so on)<\/li>\n<\/ol>\n<p>Thanks to Mark Olsen for this link.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What Do You Do with a Million Books?, by Gregory Crane, talks about the implications of large-scale book scanning projects like the Google Print project, which is scanning tens of millions of books. 
He introduces an interesting term, &#8220;recombinant documents&#8221;, to describe how software (like what they have at Perseus) can add intelligent connections to &hellip; <a href=\"https:\/\/theoreti.ca\/?p=1166\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Greg Crane: What Do You Do with a Million Books?<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[],"class_list":["post-1166","post","type-post","status-publish","format-standard","hentry","category-markup-and-text-representation"],"_links":{"self":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/1166","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1166"}],"version-history":[{"count":0,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/1166\/revisions"}],"wp:attachment":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1166"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1166"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1166"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}