Google Book Search Settlement

The Google Book Search Settlement, if approved by Judge Chin, may be a turning point in textual research. In principle, if the settlement goes through, Google will release the full 7-10 million books for research (“non-consumptive”) use. Should we get even the 500,000 public domain books for research, we will have a historic corpus far larger than anything else. To quote the Greg Crane D-Lib article, “What can you do with a million books?” and “What effect will millions of books have on the textual disciplines?”

There are understandably many concerns about the settlement, especially about the ownership of orphan works. The American Library Association has a web site on the settlement, as do others. I think we also need to start talking about how to develop a research infrastructure that allows the millions of books to be used effectively. What would it look like? What could we do? Some ideas:

  • To be usable only by researchers, there would have to be some sort of reasonable firewall.
  • It would be nice if it were truly multilingual/multicultural from the start. The books are, after all.
  • It would be nice if there was a mechanism for researchers to correct the OCRed text where they see typos. Why couldn’t we clean up the plain text together?
  • It would be nice if there was an open architecture search engine scaled to handle the collection and usable by research tools.

Update: Matt pointed me to an article in the Wall Street Journal on Tech’s Bigs Put Google’s Books Deal In Crosshairs.

NY Times: Is This the Future of the Digital Book?

Christian V. pointed me to a New York Times story, Is This the Future of the Digital Book? (Brad Stone, April 4, 2009).

Wattpad, based in Toronto, is among several start-ups soliciting the work of unpublished authors, giving them a route around the big book companies and then distributing their writing on the Web and on mobile phones. Wattpad draws its revenue from advertising and, for now at least, does not pay the authors.

Smart phones now have larger and brighter screens so it is possible people will use them to read, but I’m not yet convinced. Will weaving multimedia into the book save it?

Beatrice Warde: The Crystal Goblet

Reading the book on Canadian book design, The Surface of Meaning, I came across a reference to Beatrice Warde’s The Crystal Goblet. This was given as a lecture in London in 1930 with the title “Printing Should be Invisible” and was printed in the 1950s. It is a clear and apparently influential statement of the modernist view that book design should be transparent, letting the ideas shine through. It starts with a metaphor of the book as a goblet.

Imagine that you have before you a flagon of wine. You may choose your own favourite vintage for this imaginary demonstration, so that it be a deep shimmering crimson in colour. You have two goblets before you. One is of solid gold, wrought in the most exquisite patterns. The other is of crystal-clear glass, thin as a bubble, and as transparent. …

Bear with me in this long-winded and fragrant metaphor; for you will find that almost all the virtues of the perfect wine-glass have a parallel in typography. …

Now the man who first chose glass instead of clay or metal to hold his wine was a ‘modernist’ in the sense in which I am going to use that term. That is, the first thing he asked of his particular object was not ‘How should it look?’ but ‘What must it do?’ and to that extent all good typography is modernist. …

It is sheer magic that I should be able to hold a one-sided conversation by means of black marks on paper with an unknown person half-way across the world. Talking, broadcasting, writing, and printing are all quite literally forms of thought transference, and it is the ability and eagerness to transfer and receive the contents of the mind that is almost alone responsible for human civilization. …

the most important thing about printing is that it conveys thought, ideas, images, from one mind to other minds. This statement is what you might call the front door of the science of typography. …

Type well used is invisible as type, just as the perfect talking voice is the unnoticed vehicle for the transmission of words, ideas. …

it is mischievous to call any printed piece a work of art, especially fine art: because that would imply that its first purpose was to exist as an expression of beauty for its own sake and for the delectation of the senses. Calligraphy can almost be considered a fine art nowadays, because its primary economic and educational purpose has been taken away; but printing in English will not qualify as an art until the present English language no longer conveys ideas to future generations, and until printing itself hands its usefulness to some yet unimagined successor. …

The book typographer has the job of erecting a window between the reader inside the room and that landscape which is the author’s words. He may put up a stained-glass window of marvellous beauty, but a failure as a window; that is, he may use some rich superb type like text gothic that is something to be looked at, not through. Or he may work in what I call transparent or invisible typography. …

Printing demands a humility of mind, for the lack of which many of the fine arts are even now floundering in self-conscious and maudlin experiments. There is nothing simple or dull in achieving the transparent page. Vulgar ostentation is twice as easy as discipline.

Perhaps the time comes when printing reluctantly hands off its usefulness to a digital successor like the Kindle, which would explain the (re)discovery of pressing form into paper and other arts of the book. I am not making a prediction and especially not recommending this, just reflecting on how the book need not be so transparent as it was supposed to be. When the burden of usefulness is eased off the shoulders of books they can become post-modern chalices, opaque and shiny.

For another take see Book Design in Canada at Cardigan Industries.

Now for the wine within.

EMiC: Editing Modernism in Canada

Editing Modernism in Canada (EMiC) is a neat project that was funded by SSHRC as a Strategic Cluster. Dean Irvine at Dalhousie leads it and the idea is to produce “critically edited texts by modernist Canadian authors”, especially those out of print. It has a nice link to learning in that one reason for editing modernist Canadian texts is to make them available for teaching and learning. Deeper than that, however, is what seems to be an apprentice model of involving (graduate) students in editing. The network of partnerships is impressive too.

Tupi or not Tupi: the Cannibal Manifesto

At a Global Dialogue meeting Clarissa introduced us to Oswald de Andrade’s Cannibal Manifesto. This is one of those rare documents we should all read. The Manifesto Antropófago dates from 1928 and celebrates Brazilian remediation (such a stuffy word compared to “cannibalism”) of other literatures. The third line, which is in English in the original, captures the idea:

Tupi or not tupi that is the question.

The Tupi were an indigenous people of Brazil who were supposed to have ritually eaten their enemies. Not to belabor the point, but the joke eats Shakespeare and English into a modernist manifesto that simultaneously rejects Western patterns. The manifesto starts with:

Only Cannibalism unites us. Socially. Economically. Philosophically.

The unique law of the world. The disguised expression of all individualisms, all collectivisms. Of all religions. Of all peace treaties.

It could be the law of blogging that eats the web or the law of social media that eat their versions. Remediation with teeth.

University Affairs: Some graduates question thesis publication requirement

University Affairs has a story online titled Some graduates question thesis publication requirement. The article gives as examples students in creative writing programs, who obviously want to go on and publish their theses. They don’t mention the serious issue of the license that Theses Canada makes you sign. I wonder if it would be possible for a graduate student to edit the license before signing it?

CaSTA 2008: New Directions in Text Analysis

I am at the CaSTA 2008 New Directions in Text Analysis conference at the University of Saskatchewan in Saskatoon. The opening keynote by Meg Twycross was a thorough and excellent tour through manuscript digitization and forensic analysis techniques.

My notes are in a conference report (being written as it happens.)

University Libraries in Google Project to Offer Backup Digital Library – Chronicle.com

From Bethany I discovered this story by the Chronicle of Higher Education about the HathiTrust, titled University Libraries in Google Project to Offer Backup Digital Library (Jeffrey R. Young, Oct. 13, 2008). “Hathi” is the Hindi word for elephant, suggesting memory and size. Here is a quote from the HathiTrust site:

As a digital repository for the nation’s great research libraries, HathiTrust (pronounced hah-TEE) brings together the immense collections of partner institutions.

HathiTrust was conceived as a collaboration of the thirteen universities of the Committee on Institutional Cooperation and the University of California system to establish a repository for these universities to archive and share their digitized collections. Partnership is open to all who share this grand vision.

The repository, among other things, will pool the volumes digitized by Google in collaboration with the universities so there is a backup should Google lose interest. Large-scale search is being studied now and they expect to have a preview version available in November.

A Companion to Digital Literary Studies

A Companion to Digital Literary Studies, edited by Ray Siemens and Susan Schreibman, is available online in full text. This is a tremendous resource with too many excellent contributions to list individually. Chapters range from Reading on the Screen by Christian Vandendorpe to Algorithmic Criticism by Stephen Ramsay.

There is a good Annotated Overview of Selected Electronic Resources by Tanya Clement and Gretchen Gueguen with links to projects like TAPoR.

Theses Canada: What you (a graduate student) should know

Being on the Faculty of Graduate Studies and Research council, my attention was drawn to the issue of what happens to theses. In my day you bound a bunch of copies and one went off to Library and Archives Canada, where it was indexed but could not be read online. Since 1997 it looks like they have been digitizing the theses, working with contractors. Now they ask graduate students to sign a non-exclusive license that gives LAC remarkable rights. See the page for graduate students, What you should know – at the bottom is the link to the PDF of the license they have to sign, which includes the following language:

[I] hereby grant a non-exclusive, for the full term of copyright protection, royalty free license to Library and Archives Canada:

(a) to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell my thesis (the title of which is set forth above) worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats;

(b) to authorize, sub-license, sub-contract or procure any of the acts mentioned in paragraph (a).

I find this language too broad. I can understand why Theses Canada wants these rights in order to be able to run a genuinely useful service that makes Canadian research accessible, but this license is just too broad, especially when enforced by universities that require all graduate students to sign it. There is provision on the Theses Canada site for graduates delaying submission (if they want to register patents, for example) and I’m guessing that most universities would respect a student’s wish to not sign the license.

There is a separate issue around copyright. Part of the License includes this:

If third-party copyrighted material was included in my thesis, I have obtained written copyright permission from the copyright owners to do the acts mentioned in paragraph (a) above for the full term of copyright protection.

I wonder if the accessibility of theses online and the terms of the License might change the willingness of other copyright owners to grant permissions to graduate students.