Digital Scholarship and Digital Libraries

At the beginning of November I was asked to give a keynote for a Digital Scholarship/Digital Libraries symposium at the beautiful Emory Conference Centre. My talk was titled “The Social Text: Mashing Electronic Texts and Tools” and my thesis was that we need to forge a closer relationship between scholarly projects and digital libraries. This is a two-fold call for change:

  1. Scholars developing new methods to analyze and study texts need deeper access to the digital libraries that hold the texts they want to study. On the one hand we need to be able to discover and aggregate study collections that span (often incompatible) digital library collections. On the other hand we need to be able to plug in our own tools instead of using the analytical tools built into the publishing engine. I proposed that we look seriously at OpenSocial as a model for hosting social applications.
  2. Scholars editing or creating digital texts need to be willing to accept a much more prescriptive set of encoding guidelines so that their texts can be brought into large digital library collections which then could make the discovery and gathering of study collections possible. Smaller scholarly craft projects will not scale or play well over time – that is a function digital libraries should lead.

A copy of the slides in PDF is up for FTP access. The file is 15 MB.

Flock – The Social Web Browser

I’ve been experimenting with Flock – The Social Web Browser. It seems to have integrated support for social network sites like Flickr and Facebook. The interface is confusing, perhaps because of everything it is trying to do, or because I am not getting it. Some of the things it does are:

  • Let you see your Flickr photos in a bar across the top so you can drag them into other services.
  • Upload to social network sites.
  • Track your Flickr and Facebook friends.
  • Read blog feeds.

Hitwise: Web Intelligence

On jill/text I came across an interesting graph about OpenSocial vs. Facebook showing the difference in market share. Hitwise provides statistics and analysis of internet usage. They get their data from ISPs, which sounds like it could be a privacy issue. See their Product Features for the services they provide that most of us can’t afford. See what they say about how they gather information in How We Do It, or here is a quote from their press release on Hannah Montana Most Searched for Halloween Costume:

Since 1997, Hitwise has pioneered a unique, network-based approach to Internet measurement. Through relationships with ISPs around the world, Hitwise’s patented methodology anonymously captures the online usage, search and conversion behavior of 25 million Internet users. This unprecedented volume of Internet usage data is seamlessly integrated into an easy to use, web-based service, designed to help marketers better plan, implement and report on a range of online marketing programs.

Their analysts, most of whom seem to be in the UK, write blogs with interesting notes about trends, like iTunes overtakes Free Music Downloads in Internet Searches.

The Mind Tool: Edward Vanhoutte’s Blog

Edward Vanhoutte, who has done some of the best work on the history of humanities computing (though much is not yet published), has started a blog. In his first entry, The Mind Tool: Edward Vanhoutte’s Blog, he summarizes early textbooks that were used to teach humanities computing. It would be interesting to look at how these 70s and 80s books conceive of the computer and how they differ from the 50s and 60s work like that of Booth.

OpenSocial – Google Code

Two days ago, on the day of All Hallows (All Saints), Google announced OpenSocial, a collection of APIs for embedded social applications. Actually, much of the online documentation, like the first OpenSocial API Blog entry, didn’t go up until early in the morning on November 2nd, after the Campfire talk. On November 1st they held their rather hokey Campfire One in one of the open spaces in the Googleplex. A sort of Halloween for older boys.

Screen from YouTube video. Note the campfire monitors.

OpenSocial is, however, important to tool development in the humanities. It provides an open model for the type of energetic development we saw in the summer after the Facebook Platform was launched. If it proves rich enough, it will provide a way for digital libraries and online e-text sites to open their interfaces to research tools developed in the community. It could allow us tool developers to create tools that researchers can easily add to their sites: tools that are social and can draw on remote sources of data to mash up with the local text. This could enable the open mashup of information that is at the heart of research. It also gives libraries a way to let in tools like the TAPoR Tool bar. For that matter, we might see creative tools coming from our students as they fiddle with the technology in ways we can’t imagine.

The key difference between OpenSocial and the Facebook Platform is that the latter, brilliant as it is, is limited to social applications for Facebook. OpenSocial can be used by any host container or social app builder. Some of the other host sites that have committed to using it are Ning and Slide. Speaking of Ning, Marc Andreessen has the best explanations of the significance of both the Facebook Platform phenomenon and OpenSocial’s potential in his blog, blog.pmarca.com (take a gander at the other stuff on Ning and OpenSocial too).

Republican Debate: Analyzing the Details – The New York Times

The New York Times has created another neat text visualization, this time for the Republican Debate. The visualization has two panels. One shows the video, a transcript, and sections. You can jump to points in the video using the transcript or section outline. The other is a “Transcript Analyzer” where you can see a rich prospect of the debate divided by speeches and you can search for words. What is missing is some sort of overview of what the high frequency words are and how they collocate.

So, I have created a public text for analysis in TAPoR and here are some results. Here is a list of high frequency words generated using the List Words tool. Some interesting words:

People (76), Think (66), Know (48), Giuliani (42), Clinton (33), Reagan (13), Democrats (16), Republicans (11)

Health (45), Government (35), Security (35), Country (25), Policy (16), Military (15), School (15)

Marriage (23), Insurance (23), Conservative (23), Private (22), Let (21), Gay (12)

Iraq (13), Iran (12), Turkey (7), Canada (2), Darn (2), Europe (5)

Immigrants (5), Citizens (2)

Man (7), Mean (7), Woman (4), Congressman (25)

Answer (10), Problem (10), Solution (5), War (12)
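
For anyone who wants to try this kind of counting on a transcript without TAPoR, here is a minimal sketch in Python of a word list and a simple collocation count. It is not TAPoR's code and will not reproduce the numbers above exactly: the function names, the tiny stop list, and the window size are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is",
              "that", "we", "i", "it", "for", "on", "our"}


def tokenize(text):
    """Lowercase the text and return its word tokens (keeps forms like don't)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def list_words(text, top=10):
    """Return the `top` most frequent non-stop words as (word, count) pairs."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(tokens).most_common(top)


def collocates(text, keyword, window=3, top=10):
    """Count words appearing within `window` tokens on either side of `keyword`."""
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = max(0, i - window)
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        # Skip stop words and further occurrences of the keyword itself.
        counts.update(n for n in neighbours
                      if n not in STOP_WORDS and n != keyword)
    return counts.most_common(top)
```

Calling `list_words(transcript)` on the debate text gives the frequency overview, and `collocates(transcript, "health")` shows which words cluster around a term of interest. The window is measured in tokens rather than sentences, which is the simplest choice but will pick up neighbours across sentence and speaker boundaries.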
