ALLC/ACH 2004, Nerbonne Plenary

It is the last day of ALLC/ACH 2004 in Gothenburg, and there is too much to blog.
The Opening Plenary was by John Nerbonne on “The Data Deluge: Developments and Delights”. He argued that “The challenge of Humanities Computing is to further Humanities scholarship by confronting lots of data scientifically.” He gave examples of questions in linguistics (dialectology) that his team had been able to “answer” through computing methods and large data sets.
I wonder if that approach from questions will work in other fields in the humanities.

An advantage of the focus on large data sets is the renewed engagement it enables with traditional humanities questions. We are even *now* answering older questions with new methods.

He asked us to think about “What traditional Humanities problems are we solving?”

What humanities question is posed? Who wants to know?
Is the computer essential? Is lots of data involved?
Do we really have an answer, or just an analysis?
Suppose you succeed, what will you be in a position to do?

Is there a problem with starting with Humanities questions?
1. Do all Humanities disciplines have questions? Are there disciplines that don’t present questions to themselves or others?
2. Are there disciplines that don’t present questions amenable to solution through lots of data? Compare ethics with the sociology of ethics.
3. What is wrong with using humanities methods on digital culture – should HC limit itself to traditional questions?
4. What is wrong with generating new questions? (In fairness, John noted that new questions emerge from the confrontation with disciplinary questions.)
5. Do we understand the place of posing questions and answering them in the practices of a discipline?
6. Are there types of questions (rhetorical ones, for example) that are not meant to be answered through scientific inquiry, but are meant to provoke thinking or rapid theorizing?

The deluge of data is not a problem, but an opportunity. John is right that this is going to make a difference. Mark Olsen has noted that we can now do corpus text analysis, something that couldn’t be done before. Likewise, the complexity of certain TEI projects means we have a deluge of complexity as well as data.