Announcing an interesting call for position statements for a Digital Tools Summit for Linguistics. The summit will run June 22nd to 23rd, 2006, at Michigan State University. The deadline for position statements is March 31st, 2006. For more information see the DTS-L web site.
Category: Text Analysis
Juxta
Juxta has just been released. This is an application for comparing and collating multiple witnesses to a single text. It is open source and has an elegant and clean interface. It was developed at the University of Virginia by Applied Research in Patacriticism with funding awarded to Jerome McGann from Mellon.
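To give a feel for what collating two witnesses involves, here is a minimal sketch using Python's standard `difflib`. This is not how Juxta itself works (I have not looked at its internals); the `collate` function and the two sample witnesses are made up for illustration.

```python
import difflib

def collate(base, witness):
    """List word-level variants between a base text and one witness."""
    a, b = base.split(), witness.split()
    # autojunk=False keeps frequent words from being ignored in longer texts.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    variants = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            # Record the kind of change and the differing words on each side.
            variants.append((op, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants

# Hypothetical witnesses, invented for this example.
base = "It was the best of times it was the worst of times"
witness = "It was the best of times and it was the worst of days"

for op, old, new in collate(base, witness):
    print(f"{op}: {old!r} -> {new!r}")
```

Counting the variants per passage would be one simple way to feed a display of where two witnesses diverge most.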
WuffWuffWare: Analyze Text
WuffWuffWare (yes, I’m serious) has a small text annotation tool for the Mac called AnalyzeText. It sounds like you can use it as a highlighting and annotating tool, but it also has a concordancer built in. But does it roll over when told to the way my dog does? That’s about all the text analysis my dog Leo does.
This is thanks to Alex.
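For anyone who hasn't met a concordancer, the core idea is a keyword-in-context (KWIC) listing. Below is a minimal sketch in Python; it says nothing about how AnalyzeText does it, and the `kwic` function, its `width` parameter, and the sample sentence are my own inventions.

```python
import re

def kwic(text, keyword, width=3):
    """Return keyword-in-context lines: `width` words either side of each hit."""
    words = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Bracket the keyword so hits are easy to scan down a listing.
            lines.append(f"{left} [{word}] {right}".strip())
    return lines

# Made-up sample text.
sample = "The dog rolls over. My dog Leo sleeps. No other dog analyzes text."
for line in kwic(sample, "dog"):
    print(line)
```

This simple version ignores sentence boundaries, so the context window can run across a full stop.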
The Gematriculator
Is Gematria text analysis? Alex has pointed me to a Gematriculator that seems to poke fun at the idea by letting you provide a URL or paste text in. I have no idea what the affiliation is of the homokaasu sect. (The FAQ says “homokaasu is Finnish for ‘gay gas’.”)
TagCloud
TagCloud is both a way of showing word or tag frequency and a tool for content analysis. TagCloud.com has a tool that I think will give you a tag cloud for placing in your blog. The words are sized by importance and link to lists of related entries. A cool idea for a content analysis interface that provides a dynamic folksonomy.
TagCloud.com links to a good article on Folksonomy in the Wikipedia.
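The sizing idea behind a tag cloud is simple enough to sketch. The snippet below assumes "importance" just means raw word frequency, which may not be how TagCloud.com weights things; the `tag_cloud` function, the stopword list, and the font-size range are all hypothetical choices of mine.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real tool would use a much longer one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "it"}

def tag_cloud(text, top=5, min_size=10, max_size=32):
    """Map the `top` most frequent words to font sizes scaled linearly by count."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    counts = Counter(words).most_common(top)
    if not counts:
        return {}
    high, low = counts[0][1], counts[-1][1]
    span = max(high - low, 1)  # avoid dividing by zero when all counts are equal
    return {
        word: round(min_size + (count - low) * (max_size - min_size) / span)
        for word, count in counts
    }

cloud = tag_cloud("text text text analysis analysis tools")
for word, size in cloud.items():
    print(f"{word}: {size}pt")
```

Each word would then be rendered at its size and linked to the entries that use it.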
Web Crawler: Nutch
Nutch is “open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc.” There is a Nutch Wiki with links to news, presentations and articles on it.
Nutch is basically an open Google-like engine that indexes an intranet (or the web) and gives you search capability. This sort of tool could be useful if there were ways to adapt it to discipline-specific crawling.
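To make "discipline-specific crawling" a bit more concrete, here is a rough sketch of one step a crawler performs: pulling links out of a page and keeping only those on an approved list of hosts. Nutch is written in Java and this is not its API; the class and function names, the example hosts, and the sample HTML are all assumptions for illustration, and nothing here fetches anything from the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

def outlinks(base_url, html):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

def in_scope(urls, allowed_hosts):
    """Keep only URLs whose host is on the (hypothetical) discipline list."""
    return [u for u in urls if urlparse(u).hostname in allowed_hosts]

# Made-up page and hosts.
page = '<p><a href="/tools">Tools</a> <a href="http://other.example.com/x">X</a></p>'
links = outlinks("http://dh.example.org/index.html", page)
print(in_scope(links, {"dh.example.org"}))
```

A real crawler would add fetching, politeness delays, robots.txt handling and a queue of pages still to visit.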
Latent Semantic Analysis
LSA @ CU Boulder is a site at the University of Colorado at Boulder on Latent Semantic Analysis for education. The neat thing is they provide a web interface to different LSA tools. Could these techniques be used in research text analysis? Could we create them as web services?
A site they point to with a list of links to readings, projects and people is Readings in Latent Semantic Analysis, maintained by Lemaire and Dessus.
They also link to a Wired News article on LSA in education that explains how LSA can be used for automatic marking of essays; see Teachers of Tomorrow?
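For the curious, the basic LSA recipe is to build a term-document matrix, reduce it with a truncated singular value decomposition, and compare documents in the reduced space. The sketch below uses NumPy on a three-sentence toy corpus I made up; it is not the code behind the Boulder tools, and real applications use large corpora, term weighting and a few hundred dimensions rather than two.

```python
import numpy as np

# Invented toy corpus: two sentences about essays, one about something else.
docs = [
    "the student wrote an essay about whales",
    "the essay on whales was written by a student",
    "stock markets fell sharply on monday",
]

vocab = sorted({w for d in docs for w in d.split()})
# Term-document matrix of raw counts: one row per term, one column per document.
A = np.array([[d.split().count(t) for d in docs] for t in vocab], dtype=float)

# Truncated SVD: keep only the k largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dimensional row per document

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("essay vs essay:", cosine(doc_vecs[0], doc_vecs[1]))
print("essay vs stocks:", cosine(doc_vecs[0], doc_vecs[2]))
```

Automatic essay marking, roughly speaking, compares a student's essay to reference texts in this kind of reduced space.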
ATLAS.ti
ATLAS.ti! is a “Knowledge Workbench” for the qualitative analysis of texts, images, audio and video. It looks like a PC program that lets you annotate large quantities of materials for interpretation, coding, and clustering.
I saw this years ago, but it has matured and now handles multimedia. I should add that it is for sale, not free, though they have a trial version.
History Tools
The Center for History and New Media at George Mason University has developed a collection of tools, including H-Bot (an automated historical fact finder that uses Google to try to answer “when did …” questions) and the ToolCenter, which is a wiki about tools.
Juxta
The NINES project is developing some neat tools. One of the most interesting is Juxta, which is a collation tool with a brilliant interface.
It has a histogram of differences between texts.