Happy Holidays

[Image: ornament]

Happy Holidays to all my readers. If you’re still looking for gifts, you can always check out the U.S. Government Gifts and Memorabilia for Sale page at Firstgov.gov. There is even a web site for the Drug Enforcement Administration shop with holiday ornaments (see image) that come with an “interpretative insert”.

Or you can do text analysis on a small corpus I put together from pages found by Google on the subject of “happy holidays”. See Geoffrey Rockwell’s Links (Holidays) and click on Analyze.

Chirag Mehta: US Presidential Speeches Tag Cloud

US Presidential Speeches Tag Cloud is a sliding series of word clouds for “speeches, official documents, declarations, and letters written by the Presidents of the US between 1776 – 2006 AD.” Chirag Mehta has organized the clouds with a time slider so you can move from the latest, “2006-01-31: State of the Union Address” by George W. Bush, back to John Adams, “1776-01-15: Foundation of Government”. By moving the slider you can see changes in the high-frequency words. Bill Clinton, surprisingly (to me), talked a lot more about families than Bush.

Mehta describes how he generated this at the bottom of the page. Thanks to Gord for this.

The Dictionary of Words in the Wild

[Image: word cloud]

The Dictionary of Words in the Wild is an experiment in public textuality that I’m leading. Andrew MacDonald has done the programming and is contributing images (along with others). You can get an account and upload pictures of words or phrases. We have an application programming interface that you can then use to create web applications that call the dictionary. Join, sample, load! We need pictures.

James pointed me to a similar experiment, The Visual Dictionary – a visual exploration of words in the real world. This focuses on single words and has a ranking/rating system. It doesn’t, however, have the API we have. I wonder how we can interoperate? Can such dictionaries be a movement?

Meditation on Electronic Tools

TAPoR Try It

A tool would have a handle with grooves to hold tight. It’s easy to swing into place.

List Words Results

It would have an inhuman steel end. An end unlike my soft flesh. Perhaps the nail dead at the end of the digit.

Tool Broker

Googlizer Results

A tool scratches out its world. A tool outreaches, extends the hand in sight, and where it doesn’t fit (so often), it scrapes a groove. It claws what it can afford.

Visual Collocator

And when it’s finished there’s a pop, a clunk, a ping, and a burr to be swept away. When it’s left, the palm is open to stroke the surface of the craft. A satisfaction puts the tool away.

Error Message

So few parts of the world fit this tool, other than my hand. Perhaps they are not made for work but for the stroking, the holding, and the gripping turn.

Workbench

Which is why I need so many of them, within reach, laid out in frames, carried in bags, on belts, and ready-at-hand and unforeseen.

Analyze Text

Then, I’ll pause in the workshop and not do anything at all. I’ll hold these tools in my mind which is not how to use them.

Images all from the TAPoR portal and TAPoRware.

Ask E.T.: Sparklines: theory and practice

[Image: sparkline of US deficit over time]

Sparklines: theory and practice is a thread in Edward Tufte’s Ask E.T. forum (which is a great place to follow discussions on design issues). The thread starts with images of some pages on sparklines from Tufte’s new book, Beautiful Evidence (2006), where they are defined as “intense, simple, word-sized graphics”. The sparkline at the beginning of this entry is from the Sparkline PHP Graphing Library. Another source of sparkline tools is Bissantz sparkline tools. Thanks to Shawn for this link.

So how can sparklines be woven into text analysis environments? Small distribution graphs could be included with lists of words or KWIC displays in tools like the TAPoRware tools.
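
To make the idea concrete, here is a minimal sketch in Python of a word-sized distribution graph that could sit beside an entry in a word list. It is only an illustration: the function names (`distribution`, `sparkline`), the ten-segment split, and the use of Unicode block characters are my assumptions, not how TAPoRware or the Sparkline PHP Graphing Library work.

```python
import re

# Eight block characters, from lowest to highest bar.
BARS = "▁▂▃▄▅▆▇█"

def distribution(text, word, segments=10):
    """Count how often `word` occurs in each of `segments` equal slices of the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = [0] * segments
    for i, tok in enumerate(tokens):
        if tok == word:
            # Map the token position to the slice it falls in.
            counts[i * segments // len(tokens)] += 1
    return counts

def sparkline(counts):
    """Render counts as a one-line string of bars scaled to the largest count."""
    top = max(counts, default=0) or 1  # avoid dividing by zero when the word never occurs
    return "".join(BARS[c * (len(BARS) - 1) // top] for c in counts)

if __name__ == "__main__":
    sample = "happy holidays to all and happy new year to all my happy readers"
    print("happy", sparkline(distribution(sample, "happy", segments=4)))
```

A list-words display could print one such string next to each word, so the reader sees where in the text a word clusters without leaving the list.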

Download Pertinence Summarizer – Text Mining Solutions

[Image: selection from a “Connivence Map” of world politics]

PertinenceMining.com is a French company that has a number of neat text processing products built on their KENiA or “Knowledge Extraction and Notification Architecture.” One of their products is Connivences.info, which produces maps of “actors” in the news with weighted lines to indicate relationships.

Another interesting tool is their Google + Pertinence Summarizer, which enhances the results from Google with a “Summarize” button that splits the linked page into sentences and tries to rank their pertinence to the document so you can choose to see only the most pertinent. The interface took me a while to figure out, and I’m not sure it works.
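
I don’t know how Pertinence actually scores sentences, but the general approach of splitting a page into sentences and ranking them can be sketched in a few lines of Python. The scoring below (average document-wide frequency of a sentence’s words) is a stand-in I chose for illustration, and `summarize` is a hypothetical name.

```python
import re
from collections import Counter

def summarize(text, keep=2):
    """Return the `keep` highest-scoring sentences, in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole document.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        # Assumed measure of pertinence: mean document frequency of the words.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:keep]
    return [s for s in sentences if s in top]

if __name__ == "__main__":
    page = "Cats chase mice. Cats sleep. Dogs bark loudly."
    print(summarize(page, keep=1))
```

A real summarizer would need stop-word handling and a better sentence splitter; this only shows the shape of the pipeline.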

DadaDodo: Exterminate All Rational Thought

DadaDodo is a text generator or “travesty generator” like Dissociated Press. The code is available and, unlike programs that randomly cut up text, it “scans bodies of text, and builds a probability tree expressing how frequently word B tends to occur after word A, and various other statistics; then it generates sentences based on those probabilities.” DadaDodo is described by its creator Jamie Zawinski thus:

DadaDodo is a program that analyses texts for word probabilities, and then generates random sentences based on that. Sometimes these sentences are nonsense; but sometimes they cut right through to the heart of the matter, and reveal hidden meanings.

Zawinski’s page has a “cut up” look with downloadable code and interesting links, many of which are no longer active, alas. The effects of DadaDodo are hard to interpret without knowing what corpus it starts with. I am tempted to create a TAPoRware version so that it can be used on existing web pages.
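
The “word B after word A” idea quoted above can be sketched as a first-order word chain. This is a minimal Python illustration, not Zawinski’s code: `build_table` and `generate` are names I made up, and the “various other statistics” DadaDodo gathers are not reproduced here.

```python
import random
from collections import defaultdict

def build_table(text):
    """Map each word to the list of words that follow it in the text.

    Repeated followers are kept, so a random choice from the list
    is weighted by how often word B occurs after word A.
    """
    words = text.split()
    table = defaultdict(list)
    for a, b in zip(words, words[1:]):
        table[a].append(b)
    return table

def generate(table, start, max_words=20, rng=random):
    """Walk the table from `start`, stopping at a dead end or at max_words."""
    out = [start]
    while len(out) < max_words and table.get(out[-1]):
        out.append(rng.choice(table[out[-1]]))
    return " ".join(out)

if __name__ == "__main__":
    corpus = "the tool scratches out its world and the tool extends the hand"
    print(generate(build_table(corpus), "the"))
```

Because the output depends entirely on the followers recorded in the table, the same generator gives very different results on different corpora, which is why knowing the starting corpus matters for interpretation.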

Communications From Elsewhere

Communications From Elsewhere is a journal (not blog!) by Josh Larios with some interesting text generators including a Postmodernism Generator which randomly generates “completely meaningless” essays using a modified version of The Dada Engine written by Andrew C. Bulhak.

For more on The Dada Engine see the technical report from Monash University, On the simulation of postmodernism and mental debility using recursive transition networks. The Abstract reads:

Recursive transition networks are an abstraction related to context-free grammars and finite-state automata. It is possible, to generate random, meaningless and yet realistic-looking text in genres defined using recursive transition networks, often with quite amusing results. One genre in which this has been accomplished is that of academic papers on postmodernism.
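
As a rough illustration of the abstraction the abstract describes, here is a small Python sketch that expands a toy grammar recursively into random text. It uses a plain context-free grammar, which the abstract names as a related abstraction; the grammar, the `expand` function, and the depth cap are my own assumptions and do not reflect The Dada Engine’s script syntax or implementation.

```python
import random

# Toy grammar: upper-case keys are rules, everything else is literal text.
GRAMMAR = {
    "SENTENCE": [["The", "NOUN", "of", "NOUN", "VERB", "NOUN", "."]],
    "NOUN": [["discourse"], ["narrative"], ["ADJ", "NOUN"]],
    "ADJ": [["textual"], ["subcultural"], ["neocapitalist"]],
    "VERB": [["implies"], ["deconstructs"], ["suggests"]],
}

def expand(symbol, grammar, rng=random, depth=0, max_depth=8):
    """Recursively replace a rule name with one randomly chosen expansion."""
    if symbol not in grammar:
        return [symbol]  # literal word
    options = grammar[symbol]
    if depth >= max_depth:
        # Force termination by taking the shortest option, which in this
        # toy grammar is always a literal word.
        options = [min(options, key=len)]
    out = []
    for part in rng.choice(options):
        out.extend(expand(part, grammar, rng, depth + 1, max_depth))
    return out

if __name__ == "__main__":
    print(" ".join(expand("SENTENCE", GRAMMAR)))
```

The recursion in the NOUN rule is what lets a tiny grammar produce arbitrarily stacked adjectives, which goes some way toward explaining the “realistic-looking” effect.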

Josh has collected and connected different “Text Generators” to his journal, including an Adolescent Poetry Corner and a Time Cube screed generator. (For an explanation of Gene Ray’s Time Cube theory see DmitryBrant.com » On Time Cube. The Time Cube site is another story.)