Digging Into Data, Day 2: Making Tools and Using Them

I just discovered (thanks to the Digging Into Data site) that the Chronicle of Higher Education Wired Campus Blog has a nice story on the Digging Into Data Challenge Conference (2011) that talks about the Criminal Intent project I am on. See Digging Into Data, Day 2: Making Tools and Using Them. The article nicely summarizes the remarks of Steve Ramsay, who was our respondent:

Mr. Ramsay’s talk celebrated how this kind of Big Data work can enhance rather than diminish the humanities’ traditional engagement with human experience. “The Old Bailey, like the Naked City, has eight million stories. Accessing those stories involves understanding trial length, numbers of instances of poisoning, and rates of bigamy,” he said in his response. “But being stories, they find their more salient expression in the weightier motifs of the human condition: justice, revenge, dishonor, loss, trial. This is what the humanities are about. This is the only reason for an historian to fire up Mathematica or for a student trained in French literature to get into Java.”

The article is by Jennifer Howard and was published June 12, 2011. It contrasts nicely with the Nature article on the event, which focused on the culturomics keynote by Erez Lieberman-Aiden & JB Michel from Harvard rather than the serious work of digging into data. You can see my earlier post on this conference (with a link to my conference report) here.

Academic Amazon Machine Images (AMIs)

From Twitter I learned about James Smithies’ Academic Amazon Machine Images (AMIs). These are images for setting up cloud services on Amazon. The two that he now provides are for Omeka and Open Journal Systems. They are not for the technically challenged, but they could be a way for a digital humanities center/project to be set up in the cloud for those who don’t have good university server support. The day may come when you don’t need university infrastructure, but can set up your own. For that matter, this blog is on my private site which gives me a bunch of tools (like WordPress and a wiki) for about $7 a month.

Digging Into Data Conference

I’m now at the Digging Into Data Challenge Conference. This conference brings together investigators from the first round of the Digging Into Data Challenge. Thursday morning we had a meeting with the folks from CLIR who are evaluating the program. See my Conference Notes. The major issues I see are:

  • Gender representation is an issue. In the Challenge and in the digital humanities in general we need to work harder to involve women researchers, especially as leaders. We run the risk of DH being seen as the last bastion of old men in the humanities.
  • Representation by new scholars is also an issue. The Challenge should bring together the graduate students and the new faculty – they need to be encouraged to meet up and they need the validation of attention from the research councils.
  • Supporting international research. One of the innovations of Digging is that it has one review process that crossed national boundaries. If your project was approved all the national partners got funded. We should see this model generalized beyond the digital humanities.
  • Encouraging research mashups. Another benefit of Digging is that it encouraged established projects to interoperate. The project I’m on (Datamining with Criminal Intent) built interoperability between the Old Bailey project, Zotero and Voyeur.
  • Encouraging ambitious projects. Digging encouraged ambitious and “blue-sky” proposals that experimented with large datasets.

We had a wide-ranging conversation about the challenge of the digital humanities in general. Many of the usual issues came up. Can the Digging Into Data Challenge be a visible advocate for some of the changes we have been grousing about for years?

Alberts: On Becoming a Digital Humanist

This week I was invited to give a number of talks at the University of North Dakota. Dr. Crystal Alberts organized the talks (along with others). At UND I spoke on:

  • Incorporating the digital in the humanities. This talk was about incorporating the digital into humanities teaching.
  • Supporting the Digital Humanities. This talk was for librarians and discussed mostly how libraries can support our work.
  • Cyberinfrastructure for the Humanities. This talk was delivered by videoconference and went out to a larger state audience discussing cyberinfrastructure in North Dakota.

Crystal has a nice long blog post on participation and inclusion in the digital humanities. The post, On Becoming a Digital Humanist, talks about Steve Ramsay’s MLA comments and what I wrote on inclusion.

Simon Norfolk: Supercomputers

From Humanist I found Simon Norfolk’s web site, which includes a photographic series on “The Supercomputers: ‘I’m sorry Dave, I’m afraid I can’t do that’” (click enter, then click the title of the collection, and then click photographs).

The photographs pick out details of HPC installations that are visually arresting. They are without people as if these spaces were silent. In reality when you are in these spaces (at least the computer rooms I’ve been in) they are noisy with cooling systems and there are people nursing the beasts.

IBM’s “Watson” Computing System to Challenge All Time Greatest Jeopardy! Champions

Richard drew my attention to the upcoming competition between IBM’s Watson deep question and answer system and top Jeopardy! champions; see IBM’s “Watson” Computing System to Challenge All Time Greatest Jeopardy! Champions. I’d blogged on Watson before – it’s a custom system designed to mine large collections of data for answers to questions. Here is what IBM says its applications are:

Beyond Jeopardy!, the technology behind Watson can be adapted to solve problems and drive progress in various fields. The computer has the ability to sift through vast amounts of data and return precise answers, ranking its confidence in its answers. The technology could be applied in areas such as healthcare, to help accurately diagnose patients, to improve online self-service help desks, to provide tourists and citizens with specific information regarding cities, prompt customer support via phone, and much more.

Compute Canada’s “Strategic” Plan Isn’t

I have become involved in Compute Canada (in addition to being a grateful user of WestGrid.) I came across this critique of their recent strategic plan, Compute Canada’s “Strategic” Plan Isn’t. What interests me is that education is the main issue for Software Carpentry. One of the things in the plan is to establish a virtual HPC centre of excellence for the humanities and social sciences which would run workshops and seminars to help bring people on board.

My sense from the comments is that for funders it is more attractive to develop new technologies than to educate people to use those in place. What government brags about workshops? Petascale computing, on the other hand, sounds special.

IEEE Spectrum: Ray Kurzweil’s Slippery Futurism

From Slashdot I was led to a great critique of Kurzweil’s futurism, see the IEEE Spectrum: Ray Kurzweil’s Slippery Futurism. I’ve tried to tackle Kurzweil in previous posts here (on Singularity University), but never quite nailed his form of prediction the way John Rennie does.

Therein lie the frustrations of Kurzweil’s brand of tech punditry. On close examination, his clearest and most successful predictions often lack originality or profundity. And most of his predictions come with so many loopholes that they border on the unfalsifiable. Yet he continues to be taken seriously enough as an oracle of technology to command very impressive speaker fees at pricey conferences, to author best-selling books, and to have cofounded Singularity University, where executives and others are paying quite handsomely to learn how to plan for the not-too-distant day when those disappearing computers will make humans both obsolete and immortal.