Copernic is a company that has licensed text summarization technology from the Institute for Information Technology at the National Research Council. They have agent and summarizer tools that can help with searching the web and managing results. The Copernic Summarizer, in particular, looks like an interesting application of summarization for everyday use, including the ability to summarize web pages in real time. Neat!
Continue reading Copernic: NRC Summarizing Tools
XML for Overlapping Structures (XfOS) using a non XML Data Model by Alexander Czmiel was an interesting paper at the 2004 ALLC/ACH on implementing systems with overlapping hierarchies.
While overlapping hierarchies would seem to be an obscure or advanced issue in markup, I think it is important to opening up markup practices to match existing intellectual practices, especially exploratory practices.
LMNL (Layered Markup and Annotation Language) is what Alexander ended up using, and his paper gave me an introduction to this fascinating language developed by Wendell Piez. LMNL looks like it could be used for exploratory markup and then built up into sophisticated interpretations of text.
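To make the idea of overlapping hierarchies concrete, here is a minimal sketch of a range-based model in Python. It is not LMNL syntax or Alexander's implementation; the `Range` class, the `overlaps` and `ranges_at` helpers, and the sample text are all hypothetical names chosen for illustration. The point is only that markup stored as named character ranges over a text can overlap freely, where a single XML tree cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A named span over the text (hypothetical model, not LMNL itself)."""
    name: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def overlaps(a: Range, b: Range) -> bool:
    """True when a and b intersect but neither fully contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


def ranges_at(ranges, offset):
    """Names of every range covering a given character offset."""
    return [r.name for r in ranges if r.start <= offset < r.end]


text = "To be, or not to be, that is the question"
ranges = [
    Range("line", 0, 20),     # "To be, or not to be,"
    Range("phrase", 14, 28),  # "to be, that is" (crosses the line boundary)
    Range("word", 0, 2),      # "To" (nested inside the line)
]

for r in ranges:
    print(r.name, repr(text[r.start:r.end]))
print("line/phrase overlap:", overlaps(ranges[0], ranges[1]))
```

Because ranges are independent of each other, an exploratory reading can add a new layer without restructuring the layers already there.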
Continue reading LMNL and exploratory markup
One way to ask about the place of computing in the humanities is to ask about method. I am reading Plato and the Good by my old prof Rosemary Desjardins. The second chapter nicely teases out Platonic dialectic from the Philebus in a way that fits what I am going to call neon-baroque theories of folded interruption. Dialectic involves division of the stuff of the continuum into threads (analysis or digitization) and then the weaving of these threads into a fabric (synthesis or processing.) The problem with dialectic that Rosemary teases out is the problem I have with Deleuze’s interruption of the flow – how do you get a flow to divide in the first place?
To the weaver, therefore, we now put our question: what must be the case in order that she be able first to pick out the appropriate fleece, secondly to measure off the divisions that will yield the threads of warp and woof, and then finally to interweave those threads so as to produce the web of the finished fabric? (p. 42)
Method is not just analysis and synthesis of a continuum, just as humanities computing is not just digitizing and processing the analog. Method, from meta (above, after) + hodos (way, path), involves a capacity to foresee the form you want to generate in the confusion. This is a looking back (after the way) so as to look forward (above the path.) You need to have an idea of what you want to weave before you start dividing (pro-video) and that comes from a recollection of what has been done. Thus Rosemary connects dialectic to Socratic recollection. Method in the humanities is circular – it involves a re-searching – a looking back to look forward. To analyze the flow into discrete digits you need to pull a flow out of chaos – you need to create a particular continuum for sampling, whether it be a flow of sound or colour.
How does this help us with computing in the humanities? Well … let's go slow here and leave that to later.
Continue reading Method and Technology
Another great paper at the Brown conference was by Domenico Fiormonte on “Textual genesis and the writing process: The Magrelli Genetic Machine”. After giving us a background on philology and textual criticism in Italy, he showed a Flash variant machine that allows one to see manuscript and text interact. Domenico led the development of the Digital Variants site at the University of Edinburgh which has information about tools, theory, texts, and projects.
Continue reading Fiormonte: Genetic Machines
The Listening Post is a networked installation that culls text from online sources, displays it, and synthesizes it.
This anticipates a project on the sonification of text that I am working on with Bill Farkas, who has developed some cool sonification systems.
A9.com (http://a9.com/) is a new search engine site from Amazon that lets you search inside books in addition to searching the web. There is supposed to be a feature to allow you to link notes to what you find and you can, if you get an account, keep information about your search history.
Remember when people speculated that Netscape could become your OS? As Google and other (pseudo) portals add features we are returning to the possibility of a network portal OS. My kids use MSN for more and more, I use Google for more and more – at what point do I ditch the “personal” computer for an environment available through any networked device?
As always, someone else has already implemented any good idea. WebCorp: The Web as Corpus is an aggregator like the TAPoRware Googlizer that we are developing. We do more on the post-processing; theirs has other strengths. What can we learn from this tool? (Thanks to Ian Lancashire for this.)
Daypop Top News Bursts is a site that lists clusters of news stories around word bursts. They don’t give the algorithm, but it seems to do something like what Google News does – provides clusters of stories that have similar subjects and which have a “heightened useage of certain words…”
Can this be used on a text? Could you treat a text with paragraphs as if each paragraph were a story in time? Sentences could be pulled that best show the heightened usage of words. Something like that…
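One way to try this, as a rough sketch in Python: Daypop does not publish its algorithm, so what follows is a simple stand-in rather than their method. It assumes a word "bursts" in a paragraph when its relative frequency there is higher than in the text as a whole, and it only considers words that occur more than once in the paragraph. The function names (`tokenize`, `bursty_words`, `best_sentence`) and the sample paragraphs are made up for the example.

```python
import re
from collections import Counter


def tokenize(s):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", s.lower())


def bursty_words(paragraphs, top_n=3):
    """For each paragraph, rank repeated words by how much their relative
    frequency in the paragraph exceeds their relative frequency overall."""
    para_tokens = [tokenize(p) for p in paragraphs]
    total = Counter(t for toks in para_tokens for t in toks)
    total_count = sum(total.values())
    results = []
    for toks in para_tokens:
        local = Counter(toks)
        n = len(toks) or 1
        scores = {
            w: (c / n) / (total[w] / total_count)
            for w, c in local.items()
            if c > 1  # assumption: ignore words seen only once
        }
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        results.append(ranked[:top_n])
    return results


def best_sentence(paragraph, words):
    """Return the sentence containing the most occurrences of the given words."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return max(sentences, key=lambda s: sum(tokenize(s).count(w) for w in words))


paragraphs = [
    "The weaver picks the fleece. The weaver measures the fleece.",
    "The reader scans the page. The page scrolls by the reader.",
]
for para, words in zip(paragraphs, bursty_words(paragraphs, top_n=2)):
    print(words, "->", best_sentence(para, words))
```

A real version would need a stopword list and some smoothing for short paragraphs, but this is enough to see whether paragraph-level bursts pick out anything interesting.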
Rob of isagen and I were talking about different types of visualization and sonification of texts. One idea is to have a sonification of a text where keywords are whispered from different directions. A text would be processed into a short sonoric summary. Rob has built scrollers that show the news scrolling by in a window as a way of allowing the user to keep an eye on a (changing) text. I came across this Speed Reader, which does something like this.
What if one ran a process that summarized a text and the summary (a list of frequency sorted words) was then played back through such a reader?
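As a sketch of what that process might look like in Python: the stopword list, the function names, and the words-per-minute setting are all assumptions for illustration, and the "playback" here is just a timed schedule that a reader or scroller could consume, not the Speed Reader's actual interface.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "it", "that"}


def frequency_summary(text, n=10, stopwords=STOPWORDS):
    """Return the n most frequent non-stopwords as (word, count) pairs,
    sorted by descending count and then alphabetically."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]
    counts = Counter(words)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def playback_schedule(summary, words_per_minute=300):
    """Pair each summary word with the time (in seconds) at which a
    one-word-at-a-time reader would show it."""
    delay = 60.0 / words_per_minute
    return [(i * delay, word) for i, (word, _count) in enumerate(summary)]


sample = "The thread and the warp and the thread of the fabric"
summary = frequency_summary(sample, n=2)
for seconds, word in playback_schedule(summary, words_per_minute=120):
    print(f"{seconds:.1f}s  {word}")
```

The same schedule could drive a sonification instead of a display, with each word whispered at its time slot.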
The Gender Genie will try to guess the gender of an author based on 500 words of text. It is a form of playful text analysis based on an algorithm developed by Koppel and Argamon.
Continue reading Gender Guessing