{"id":2598,"date":"2009-08-22T10:41:17","date_gmt":"2009-08-22T15:41:17","guid":{"rendered":"http:\/\/www.theoreti.ca\/?p=2598"},"modified":"2009-08-22T10:41:17","modified_gmt":"2009-08-22T15:41:17","slug":"looking-into-a-google-book","status":"publish","type":"post","link":"https:\/\/theoreti.ca\/?p=2598","title":{"rendered":"Looking into a Google Book"},"content":{"rendered":"<p>When I was in the USA using <a href=\"http:\/\/books.google.ca\">Google Books<\/a> not long ago, I downloaded a PDF of the public domain work <a href=\"http:\/\/books.google.com\/books?id=cJ_nkyjUxNgC&amp;dq=plato+the+apology+of+socrates\">The apology of Socrates (translated by D. F. Nevill)<\/a>. (Note that you can&#8217;t get the PDF in Canada, probably because of different copyright laws concerning public domain.) I was double-checking that Google didn&#8217;t give you easy access to the whole plain text. You either get the PDF (without text) or you can see the Plain Text one page at a time. This makes it hard to run analytical tools for research on public domain texts from Google Books.<\/p>\n<p>What intrigued me, however, were the traces of the scanning in this PDF. For whatever reason there are a number of images of pages partly turned. 
These images are the trace of a process.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-2600\" title=\"Picture 9\" src=\"http:\/\/www.theoreti.ca\/wp-content\/uploads\/2009\/08\/Picture-9-300x233.png\" alt=\"Picture 9\" width=\"300\" height=\"233\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-2601\" title=\"Picture 8\" src=\"http:\/\/www.theoreti.ca\/wp-content\/uploads\/2009\/08\/Picture-8-300x235.png\" alt=\"Picture 8\" width=\"300\" height=\"235\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-2602\" title=\"Picture 7\" src=\"http:\/\/www.theoreti.ca\/wp-content\/uploads\/2009\/08\/Picture-7-300x236.png\" alt=\"Picture 7\" width=\"300\" height=\"236\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-2603\" title=\"Picture 5\" src=\"http:\/\/www.theoreti.ca\/wp-content\/uploads\/2009\/08\/Picture-5-300x236.png\" alt=\"Picture 5\" width=\"300\" height=\"236\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-2604\" title=\"Picture 2\" src=\"http:\/\/www.theoreti.ca\/wp-content\/uploads\/2009\/08\/Picture-2-200x300.png\" alt=\"Picture 2\" width=\"200\" height=\"300\" \/><\/p>\n<p>The last one is from the back where the librarians stamp the book when returned. What do we know about the Google scanning technology? CNet has an interesting story, <a href=\"http:\/\/news.cnet.com\/8301-11386_3-10232931-76.html\">Patent reveals Google&#8217;s book-scanning advantage<\/a>, about how they use infrared to compensate for the curvature of pages. 
There are also stories, like TechCrunch&#8217;s <a href=\"http:\/\/www.techcrunch.com\/2007\/12\/06\/google-books-adds-hand-scans\/\">Google Books Adds Hand Scans<\/a>, that suggest humans are part of the process.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When I was in the USA using Google Books not long ago, I downloaded a PDF of the public domain work The apology of Socrates (translated by D. F. Nevill). (Note that you can&#8217;t get the PDF in Canada, probably because of different copyright laws concerning public domain.) I was double-checking that Google didn&#8217;t give &hellip; <a href=\"https:\/\/theoreti.ca\/?p=2598\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Looking into a Google Book<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[],"class_list":["post-2598","post","type-post","status-publish","format-standard","hentry","category-markup-and-text-representation"],"_links":{"self":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/2598","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2598"}],"version-history":[{"count":2,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/2598\/revisions"}],"predecessor-version":[{"id":2605,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/2598\/revisions\/2605"}],"wp:attachment":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2598"}],"wp:term":[{"taxonomy":"categ
ory","embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2598"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2598"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}