EPIC: Carnivore Documents

[Image: Omnivore Source Code FOIA Document]
Did the FBI use text analysis for network-tapping? I found an interesting page at the Electronic Privacy Information Center about Carnivore and Omnivore (its predecessor), two Internet monitoring systems created by the FBI. EPIC has an EPIC Carnivore Page with a summary and scans of documents received through Freedom of Information requests. See also EPIC Carnivore FOIA Documents. The documents are fascinating, given all the blacked-out lines whose contents you can try to guess. There is a beauty to these documents, with their heavy black regions and “Secret” crossed out all over. Note how EPIC uses this aesthetic in their annual report.

Text Analysis and Alzheimer’s

Both The Globe and Mail and CBC ran stories about researchers who compared word lists from Iris Murdoch’s books to look at word variety. See CBC News: Iris Murdoch novel may be evidence of Alzheimer’s. Now that computers index our files (a feature in Mac OS X Tiger, for example), could we get them to warn us when our word variety goes down? Could my e-mail client or blog be fitted to alert me to changes in my use of language?
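As a rough illustration of what such a warning might compute, here is a minimal sketch using the type-token ratio (distinct words divided by total words). This is an assumed measure chosen for simplicity, not necessarily the one the researchers used; the function names, the `sample_size` cutoff, and the `threshold` value are all hypothetical. Because the ratio falls as texts get longer, the sketch only compares the first `sample_size` words of each text.

```python
import re


def type_token_ratio(text, sample_size=1000):
    """Distinct words / total words over the first `sample_size` words.

    Returns 0.0 for text that contains no words.
    """
    words = re.findall(r"[a-z']+", text.lower())[:sample_size]
    return len(set(words)) / len(words) if words else 0.0


def variety_dropped(earlier, later, threshold=0.8, sample_size=1000):
    """True if the later text's ratio is below `threshold` times the earlier one's.

    `threshold` is an arbitrary illustrative value, not a clinical cutoff.
    """
    before = type_token_ratio(earlier, sample_size)
    after = type_token_ratio(later, sample_size)
    return after < threshold * before


if __name__ == "__main__":
    old_post = "the quick brown fox jumps over a lazy dog"
    new_post = "the dog and the dog and the dog"
    print(type_token_ratio(old_post), type_token_ratio(new_post))
    print(variety_dropped(old_post, new_post))
```

An e-mail client or blog could run something like this over a sliding window of recent writing, though any real alert would need a far more careful measure and baseline.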

Comparison Engine and Clustering Engine

Antonio Gulli has two interesting tools up on the web. The first is a Rank Comparison Engine, which queries a bunch of search engines, gets their lists of hits, and builds a table of points (pills) showing which hits are unique to which index and which are shared. The results are interactive, allowing you to mouse over points to see the short description.
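The unique-versus-shared comparison can be sketched with plain sets. This is only an illustration of the idea, not how Gulli's engine is implemented; the engine names and URLs are made-up sample data, and the function names are hypothetical.

```python
from collections import defaultdict


def engines_by_hit(results):
    """Map each hit (URL) to the set of engines that returned it."""
    index = defaultdict(set)
    for engine, urls in results.items():
        for url in urls:
            index[url].add(engine)
    return dict(index)


def unique_to(engine, index):
    """Hits returned by `engine` and by no other engine."""
    return sorted(url for url, engines in index.items() if engines == {engine})


def shared(index):
    """Hits returned by more than one engine."""
    return sorted(url for url, engines in index.items() if len(engines) > 1)


if __name__ == "__main__":
    # Hypothetical result lists standing in for real search engine queries.
    results = {
        "engine_a": ["example.org/1", "example.org/2"],
        "engine_b": ["example.org/2", "example.org/3"],
    }
    index = engines_by_hit(results)
    print("only engine_a:", unique_to("engine_a", index))
    print("shared:", shared(index))
```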
The second is the SnakeT Clustering Engine (SNippet Aggregation for Knowledge ExTraction). It searches various indexes and builds a list of high-frequency words that cluster with the query word. You can then navigate by the co-occurring words. Neat use of text analysis for concept exploration.
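To give a feel for the co-occurrence idea, here is a minimal sketch that counts which words appear in the same snippets as a query word. It is a much simpler stand-in than whatever SnakeT actually does; the stopword list, the function name, and the sample snippets are assumptions for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real tool would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def cooccurring_words(query, snippets, top_n=5):
    """Return the words that most often share a snippet with `query`.

    Each word is counted at most once per snippet, and only snippets
    that contain the query word are considered.
    """
    query = query.lower()
    counts = Counter()
    for snippet in snippets:
        words = set(re.findall(r"[a-z]+", snippet.lower()))
        if query in words:
            counts.update(words - STOPWORDS - {query})
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Hypothetical snippets standing in for search engine results.
    snippets = [
        "Jaguar is a big cat",
        "The jaguar car is fast",
        "Jaguar cat habitat",
    ]
    print(cooccurring_words("jaguar", snippets, top_n=3))
```

Each high-count word could then become a link that reruns the search with that word added, which is roughly the navigation the clustering engine offers.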
My one complaint is the design – he needs a graphic designer to make these sing.