The Index Thomisticus as Project

Monday, March 14th, 2016

This is a story from early in the technological revolution, when the application was out searching for the hardware, from a time before the Internet, a time before the PC, before the chip, before the mainframe. From a time even before programming itself. (Winter 1999, 3)


Father Busa is rightly honoured as one of the first humanists to use computing for a humanities research task. He is considered the founder of humanities computing for his innovative application of information technology and for the considerable influence of his project and methods, not to mention his generosity to others. He not only worked out how to use the information technology of the late 1940s and 1950s, but he also pioneered a relationship with IBM around language engineering and, with their support, shared his knowledge widely. Ironically, while we have all heard his name and the origin story of his research into presence in Aquinas, we know relatively little about what actually occupied his time – the planning and implementation of what was for its time one of the major research computing projects, the Index Thomisticus.

This blog essay is an attempt to outline some of the features of the Index Thomisticus as a large-scale information technology project as a way of opening a discussion on the historiography of computing in the humanities. This essay follows from a two-day visit to the Busa Archives at the Università Cattolica del Sacro Cuore. This visit was made possible by Marco Carlo Passarotti, who directs the “Index Thomisticus” Treebank project in CIRCSE (Centro Interdisciplinare di Ricerche per la Computerizzazione dei Segni dell’Espressione – Interdisciplinary Centre for Research into the Computerization of Expressive Signs), which evolved out of GIRCSE (Gruppo not Centro – or Group not Centre), the group that Father Busa helped form in the 1980s. Passarotti not only introduced me to the archives, he also helped correct this blog, as he is himself an archive of stories and details. He grew up in Gallarate, where his family knew Busa; he studied under Busa, took over the project, and is one of the few who can read Busa’s handwriting.


Original GIRCSE Plaque kept by Passarotti


Paolo Sordi: I blog therefore I am

Wednesday, February 24th, 2016

On the ethos of digital presence: I participated today in a panel launching the Italian version of Paolo Sordi’s book I Am: Remix Your Web Identity. (The Italian title is Bloggo Con WordPress Dunque Sono.) The panel included people like Domenico Fiormonte, Luisa Capelli, Daniela Guardamangna, Raul Mordenti, and, of course, Paolo Sordi.


Untangling the Tale of Ada Lovelace

Sunday, December 13th, 2015

Stephen Wolfram has written a nice long blog essay on Untangling the Tale of Ada Lovelace. He tackles the question of whether Ada really contributed or was overestimated. He provides a biography of both Ada and Babbage. He speculates about what they were like and could have been. He believes Ada saw the big picture in a way Babbage didn’t and was able to communicate it.

Ada Lovelace was an intelligent woman who became friends with Babbage (there’s zero evidence they were ever romantically involved). As something of a favor to Babbage, she wrote an exposition of the Analytical Engine, and in doing so she developed a more abstract understanding of it than Babbage had—and got a glimpse of the incredibly powerful idea of universal computation.

The essay reflects on what might have happened if Ada had not died prematurely. Wolfram thinks they would have finished the Analytical Engine and possibly explored building an electromechanical version.

We will never know what Ada could have become. Another Mary Somerville, famous Victorian expositor of science? A Steve-Jobs-like figure who would lead the vision of the Analytical Engine? Or an Alan Turing, understanding the abstract idea of universal computation?
That Ada touched what would become a defining intellectual idea of our time was good fortune. Babbage did not know what he had; Ada started to see glimpses and successfully described them.

Literary Analysis and the Wolfram Language

Wednesday, November 4th, 2015


Lately I’ve been trying Wolfram Mathematica more and more for analytics. I was introduced to Mathematica by Bill Turkel and Ian Graham, who have done some impressive stuff with it. Bill Turkel has now created an open access, open content, and open source textbook, Digital Research Methods with Mathematica. The text is itself a Mathematica notebook so, if you have Mathematica, you can actually use the text to do analytics on the spot.

Wolfram has also posted an interesting blog entry on Literary Analysis and the Wolfram Language: Jumping Down a Reading Rabbit Hole. They show how you can generate word clouds and sentiment analysis graphs easily.
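
To give a sense of what sits underneath a word cloud, here is a minimal sketch of the word-frequency count that such a graphic is typically built from. It is written in Python rather than the Wolfram Language, it is not taken from the Wolfram post, and the function name `word_frequencies` and the tiny stopword list are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "it"}

def word_frequencies(text, top_n=5):
    """Return the top_n most frequent non-stopword tokens in text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

if __name__ == "__main__":
    sample = "The rabbit ran and the rabbit hid in a hole."
    for word, count in word_frequencies(sample):
        print(word, count)
```

A word cloud then simply scales each word by its count; a real analysis would want a fuller stopword list and probably some lemmatization.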

While I am still learning Mathematica, some of the features that make it attractive include:

  • It uses a “literate programming” model where you write notebooks meant to be read by humans with embedded code rather than writing code with awkward comments embedded.
  • It has a lot of convenient Web, Language, and Visualization functions that let you do things we want to do in the digital humanities.
  • You can call on Wolfram Alpha in a notebook to get real world knowledge like capital cities or maps or language information.


Thursday, January 15th, 2015

Tributary is an interactive visualization programming environment for JavaScript (and D3). It lets you rapidly prototype visual code and reminds me of the old Design By Numbers, which was both a book by John Maeda and a site that similarly let you program and see the visual results.

Games with Purpose: Untrusted

Monday, May 12th, 2014

From Alex I discovered the serious game Untrusted by Alex Nisnevich. This puzzle game asks you to edit JavaScript in order to solve puzzles as a way of learning to think like a programmer. You can read about the game at Games with Purpose – a site that is gathering serious games.

The Programming Historian 2: Cleaning OCR’d text

Wednesday, August 28th, 2013

The Programming Historian 2 is producing some very useful tutorials, including some on Cleaning OCR’d Text with Regular Expressions. This was started by William J Turkel and others and is now supported by the Center for History and New Media. The tutorials are released under a Creative Commons license so they can be copied and adapted.
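
As a rough illustration of the kind of clean-up such a tutorial covers, here is a minimal Python sketch that uses regular expressions on OCR output. It is not taken from the tutorial itself; the function name `clean_ocr` and the three substitutions are assumptions chosen as common examples.

```python
import re

def clean_ocr(text):
    """Apply a few common OCR clean-up substitutions (illustrative only)."""
    # Rejoin words hyphenated across a line break: "compu-\nting" -> "computing".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop spaces left dangling at the end of a line.
    text = re.sub(r" +\n", "\n", text)
    return text.strip()

if __name__ == "__main__":
    raw = "compu-\nting   in the\thumanities  \n"
    print(clean_ocr(raw))
```

Real OCR output usually needs patterns tuned to the particular source, so these rules are a starting point rather than a recipe.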

Visualizing Collaboration

Saturday, April 27th, 2013

Ofer showed me an interactive visualization of the collaboration around a Wikipedia article. The visualization shows the edits (deletions/insertions) over time in different ways. It allows one to study distributed collaborations (or lack thereof) around things like a Wikipedia article. The ideas can be applied to visualizing any collaboration for which you have data (as often happens when the collaboration happens through digital tools that record activity).

His hypothesis is that theories about how site-specific teams collaborate don’t apply to distributed teams. Office teams have been studied, but there isn’t a lot of research on how voluntary and distributed teams work.
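
The raw data behind a visualization like this can be approximated by counting insertions and deletions between successive revisions. The following Python sketch uses the standard `difflib` module; it is not Ofer's method, and the function name `edit_counts` and the word-level granularity are assumptions made for illustration.

```python
import difflib

def edit_counts(old, new):
    """Count words inserted and deleted between two revisions of a text."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    inserted = deleted = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" counts as both a deletion and an insertion.
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            inserted += j2 - j1
    return inserted, deleted

if __name__ == "__main__":
    # Hypothetical revision history, oldest first.
    revisions = ["a b c", "a b c d", "a c d"]
    for before, after in zip(revisions, revisions[1:]):
        print(edit_counts(before, after))
```

Plotting these counts per revision, grouped by editor or by date, gives a simple view of how a collaboration unfolds over time.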

Sample on Randomness

Friday, January 11th, 2013

Mark Sample has posted his gem of an MLA paper on An Account of Randomness in Literary Computing. I wish I could write papers quite so clear and evocative. He brings interesting historical examples to bear on a question that crosses all sorts of disciplines – that of randomness. He shows how the importance of randomness connects to poetic experiments in computing.

I would recommend reading the article immediately, but I discovered that, as with many good works, I ended up spending a lot of time following up the links and reading stuff on sites like the MIT 150 Exhibition, which has a section on Analog/Digital MIT with online exhibits on subjects like the MIT Project Athena and the TX-0. Instead I will warn – beware of reading interesting things!

A Thousand Words

Saturday, December 15th, 2012

The Texas Advanced Computing Center has created an Advanced Visualization for the Humanities tool for a project called A Thousand Words. The tool, called Most Pixels Ever: Cluster Edition, is a library that extends Processing and is designed for large-scale tiled displays. Very neat.