Google Book Search Settlement

The Google Book Search Settlement, if approved by Judge Chin, may be a turning point in textual research. In principle, if the settlement goes through, Google will release the full 7-10 million books for research (“non-consumptive”) use. Should we get even the 500,000 public domain books for research, we will have a historical corpus far larger than anything else. To quote the Greg Crane D-Lib article, “What can you do with a million books?” and “What effect will millions of books have on the textual disciplines?”

There are understandably a lot of concerns about the settlement, especially about the ownership of orphan works. The American Library Association has a web site on the settlement, as do others. I think we also need to start talking about how to develop a research infrastructure that would allow the millions of books to be used effectively. What would it look like? What could we do? Some ideas:

  • To be usable only by researchers, there would have to be some sort of reasonable firewall.
  • It would be nice if it were truly multilingual/multicultural from the start. The books are, after all.
  • It would be nice if there were a mechanism for researchers to correct the OCRed text where they see typos. Why couldn’t we clean up the plain text together?
  • It would be nice if there were an open architecture search engine scaled to handle the collection and usable by research tools.

Update: Matt pointed me to a Wall Street Journal article, Tech’s Bigs Put Google’s Books Deal In Crosshairs.

Percussa AudioCubes

Garry pointed me to Percussa AudioCubes. These cubes use infrared to communicate distance information that you can use as input.

Location, orientation and distance information is passed to the software, while you interact with the cubes. The software can connect via MIDI or OpenSoundControl (OSC) to any software or hardware for music or visuals which you already have, or you can use it within your DAW (digital audio workstation) software as a VST plugin, or as a host for VST instruments to let you create sound immediately. (From How do they work?)

Research in Support of Digital Libraries at Xerox PARC Part II: Technology

I came across an interesting article in D-Lib that summarizes some of the work at Xerox PARC, Research in Support of Digital Libraries at Xerox PARC Part II: Technology. This is, as the title suggests, the second part of an extended survey. The article covers projects on subjects like “visualization of large text collections, summarization, and automatic detection of thematic structure.” There are some interesting examples of citation browsing tools like the Butterfly Citation Browser here.

Another humanities computing centre is dissolved

On Humanist there was an announcement that John Dawson, the Manager of the Literary and Linguistic Computing Centre of Cambridge (LLCC), was retiring and they were having a 45th year celebration conference and retirement party. What the announcement doesn’t say is that with the retirement of Dawson the Cambridge Computing Service is decommissioning the LLCC. I found this on a Computing Service page dedicated to the LLCC:

John Dawson, Manager of the centre will be retiring in October 2009. The LLCC will then cease to exist as a distinct unit, but Rosemary Rodd, current Deputy Manager, will continue to provide support for Humanities computing as a member of the Computing Service’s Technical User Services. She will be based on the New Museums Site.

It seems symptomatic of some shift.

Gaming as Actions: Students Playing a Mobile Educational Computer Game

The online journal Human IT has an issue on gaming with an interesting article about mobile gaming (or augmented reality gaming) for education. See Elisabet M. Nilsson & Gunilla Svingby: Gaming as Actions: Students Playing a Mobile Educational Computer Game. The article has a clear and short summary of the literature around serious games and education that points out that there isn’t yet much evidence for the theoretical claims.

The overall conclusion seems to be that even if several studies show effects on learning as well as on attitudes, empirical evidence is still lacking in support of the assumption that computer games are advantageous for use in educational settings. (p. 28-9)

The article touches on the problem we all have when we ask students to role play (whether as part of a game or simulation), which is how seriously they take it.

Some of the groups had a clear ironic touch on almost all of their utterances, at the same time as they were taking on the assignment with a serious attitude. When playing the game, they seemed to constantly oscillate back and forth between the imagined game world and their own reality. They played their alloted fictive role, and at the same time referred to their own personal experiences. (p. 43)

I’m convinced this irony has to do with how comfortable students feel playing roles before others. What does it mean in the web of class relationships to ask a student to act before others? Should they have a choice? Obviously they handle the discomfort with irony as a way of preserving their identity in the class. That they can do both (play a fictive role and their ironic self) at the same time is impressive. On page 53 the authors suggest that a context where students can alternate (motivations) could make for an “engaging learning experience.”

Almost Augmented Reality

Augmented reality is almost real according to a BBC story by Michael Fitzpatrick, Mobile phones get cyborg vision. Developers like Layar have made it possible to get real-time information about your surroundings overlaid on what your camera sees.

Launched this June in Amsterdam, residents and visitors can now see houses for sale, popular bars and shops, jobs on offer in the area, and a list of local doctors and ATMs by scanning the landscape with the software.

The social media implications are tremendous – imagine having a myPlace site where I can add meaning to locations that others can view. Historical tours, ghost stories, contextual music, political rants and so on could be added to real locations.

Thanks to Sean for this.

BBC links to other news sites: Moreover Technology

BBC News has an interesting feature where their stories link to other stories on the same subject from other news sources. See, for example, the story Chavez backer held over TV attack. On the right there are links to stories on the same subject from other news venues like the Philadelphia Inquirer. They even explain why the BBC links to other news sites.

How does it work?

The Newstracker system uses web search technology to identify content from other news websites that relates to a particular BBC story. A news aggregator like Google News or Yahoo News uses this type of technique to compare the text of stories and group similar ones together.

BBC News gets a constantly updating feed of stories from around 4000 different news websites. The feed is provided to us by Moreover Technologies. The company provides a similar service for other clients.

Our system takes the stories and compares their text with the text of our own stories. Where it finds a match, we can provide a link directly from our story to the story on the external site.

Because we do this comparison very regularly, our stories contain links to the most relevant and latest articles appearing on other sites.

Sounds like an interesting use of “real time” text analysis and an alternative to Google News. Could we implement something like that for blogs? The company that provides them with this is Moreover Technologies.
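
To get a feel for what the blog version might involve, here is a minimal Python sketch of the compare-and-link step. The BBC explanation doesn't say how Newstracker actually scores a match, so the TF-IDF weighting with cosine similarity is my assumption, as are the function names, the 0.2 threshold and the example.org URLs. Treat it as an illustration of the general technique, not a description of the BBC or Moreover system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed IDF so every weight stays positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_links(own_story, feed, threshold=0.2):
    """Compare our story with each (url, text) item in the feed.

    Returns (url, score) pairs scoring at or above the threshold,
    best match first. The threshold value is an arbitrary assumption.
    """
    vectors = tfidf_vectors([own_story] + [text for _, text in feed])
    own_vector, feed_vectors = vectors[0], vectors[1:]
    scored = [(url, cosine(own_vector, vec))
              for (url, _), vec in zip(feed, feed_vectors)]
    matches = [(url, score) for url, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical feed items; the URLs are placeholders.
    story = "Chavez backer held over attack on TV station in Venezuela"
    feed = [
        ("http://example.org/a",
         "Venezuela police hold Chavez backer after TV station attack"),
        ("http://example.org/b",
         "New cricket season opens with record crowds"),
    ]
    for url, score in related_links(story, feed):
        print(url, round(score, 3))
```

For blogs, the feed could be built from RSS entries, and rerunning the comparison whenever the feed updates would keep the links fresh, much as the BBC describes. A real system would probably also need stop-word handling and a tuned threshold.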