Mark Olsen: Toward meaningful computing

Mark Olsen and Shlomo Argamon have just published a viewpoint in the Communications of the ACM titled “Toward meaningful computing” that argues (among other things):

Current initiatives by Google, Yahoo, and a consortium of European research institutions to digitize the holdings of major research libraries worldwide promise to make the world’s knowledge accessible as never before. Yet in order to completely realize this promise, computer scientists must still develop systems that deal effectively with meaning, not just with data and information. This grand research and development challenge motivates our call here to improve collaboration between computer scientists and scholars in the humanities.

They set an ambitious but, I think, doable agenda for us.

Juxta

Juxta has just been released. This is an application for comparing and collating multiple witnesses to a single text. It is open source and has an elegant and clean interface. It was developed at the University of Virginia by Applied Research in Patacriticism with funding awarded to Jerome McGann from Mellon.
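
To give a feel for what collating two witnesses involves, here is a minimal sketch in Python. It is not how Juxta works internally (I have not looked at its code); it just uses the standard library's difflib to line up two versions of a passage word by word and list where they differ. The function name collate and the sample sentences are made up for illustration.

```python
import difflib


def collate(base, witness):
    """Return word-level variants between a base text and one witness.

    Each variant is a tuple (kind, base_reading, witness_reading), where
    kind is 'replace', 'delete' or 'insert' as reported by difflib.
    """
    base_words = base.split()
    witness_words = witness.split()
    # autojunk=False keeps difflib from ignoring very frequent words.
    matcher = difflib.SequenceMatcher(a=base_words, b=witness_words, autojunk=False)
    variants = []
    for kind, i1, i2, j1, j2 in matcher.get_opcodes():
        if kind == "equal":
            continue  # the witnesses agree here, nothing to report
        variants.append(
            (kind, " ".join(base_words[i1:i2]), " ".join(witness_words[j1:j2]))
        )
    return variants


# Hypothetical sample witnesses, invented for this example.
base = "the quick brown fox jumps over the lazy dog"
witness = "the quick red fox leaps over the dog"

for kind, base_reading, witness_reading in collate(base, witness):
    print(kind, repr(base_reading), "->", repr(witness_reading))
```

A real collation tool would also have to cope with punctuation, spelling normalization and more than two witnesses, which this sketch ignores.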

WuffWuffWare: Analyze Text

WuffWuffWare (yes, I’m serious) has a small text annotation tool for the Mac called AnalyzeText. It sounds like you can use it as a highlighting and annotating tool, but it also has a concordancer built in. But does it roll over when told to, the way my dog does? That’s about all the text analysis my dog Leo does.

This is thanks to Alex.
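
For anyone who has not used a concordancer, the basic idea is a keyword-in-context (KWIC) listing. The sketch below is my own minimal Python version, not AnalyzeText's implementation; the function name concordance, the width parameter and the sample sentence are assumptions for illustration.

```python
import re


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    width is the number of words of context kept on each side.
    Matching is case-insensitive and works on whole words only.
    """
    words = re.findall(r"\w+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits


# Invented sample text.
sample = "My dog rolls over when told. The dog next door does not."

for left, word, right in concordance(sample, "dog", width=2):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>20} [{word}] {right}")
```

The tokenizing here is deliberately crude (it drops punctuation and splits on apostrophes), so treat it as a starting point only.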

TagCloud

TagCloud is both a way of showing word or tag frequency and a tool for content analysis. TagCloud.com has a tool that I think will give you a tagcloud for placing in your blog. The words are sized by importance and link to lists of related entries. A cool idea for a content analysis interface that provides a dynamic folksonomy.
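
The sizing idea is easy to sketch. I don't know how TagCloud.com actually measures "importance", so the Python below simply assumes raw word frequency and scales font sizes linearly between a minimum and a maximum. The function name tag_sizes, its parameters and the sample text are all made up for illustration.

```python
import re
from collections import Counter


def tag_sizes(text, min_size=10, max_size=32, top=5):
    """Map the `top` most frequent words to font sizes (in points).

    Assumption: importance == raw frequency, scaled linearly so the
    most frequent word gets max_size and the least frequent min_size.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words).most_common(top)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    span = highest - lowest
    sizes = {}
    for word, n in counts:
        if span == 0:
            size = max_size  # every word is equally frequent
        else:
            size = min_size + (n - lowest) * (max_size - min_size) / span
        sizes[word] = round(size)
    return sizes


# Invented sample text.
sample = "text analysis text tools text analysis cloud"
print(tag_sizes(sample, top=3))
```

A usable version would at least filter out stop words such as "the" and "of", which would otherwise dominate any cloud built from running prose.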

TagCloud.com links to a good article on Folksonomy in the Wikipedia.

Web Crawler: Nutch

Nutch is “open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc.” There is a Nutch Wiki with links to news, presentations and articles on it.

Nutch is basically an open Google-like engine that indexes an intranet (or the web) and gives you search capability. This sort of tool could be useful if there were ways to adapt it to discipline-specific crawling.

Latent Semantic Analysis

LSA @ CU Boulder is a site at the University of Colorado at Boulder on Latent Semantic Analysis for education. The neat thing is they provide a web interface to different LSA tools. Could these techniques be used in research text analysis? Could we create them as web services?

A site they point to with a list of links to readings, projects and people is Readings in Latent Semantic Analysis, maintained by Lemaire and Dessus.

They also link to a Wired News article on LSA in education that explains how LSA can be used for automatic marking of essays; see Teachers of Tomorrow?

ATLAS.ti

ATLAS.ti! is a “Knowledge Workbench” for the qualitative analysis of texts, images, audio and video. It looks like a PC program that lets you annotate large quantities of materials for interpretation, coding, and clustering.

I saw this years ago, but it has matured and now handles multimedia. I should add that it is for sale, not free, though they have a trial version.