Using Zotero and TAPOR on the Old Bailey Proceedings

The Digging Into Data programme commissioned CLIR (Council on Library and Information Resources) to study and report on the first round of the programme. The report includes case studies on the eight initial projects, including one on our Criminal Intent project titled Using Zotero and TAPOR on the Old Bailey Proceedings: Data Mining with Criminal Intent (DMCI). More interesting are some of the reflections on big data and research in the humanities that the authors make:

1. One Culture. As the title hints, one of the conclusions is that in digital research the lines between disciplines and sectors have been blurred to the point where it is more accurate to say there is one culture of e-research. This is obviously a play on C. P. Snow’s Two Cultures. The two cultures of the sciences and the humanities, which have been alienated from each other for a century or two, are now coming back together around big data.

Rather than working in silos bounded by disciplinary methods, participants in this project have created a single culture of e-research that encompasses what have been called the e-sciences as well as the digital humanities: not a choice between the scientific and humanistic visions of the world, but a coherent amalgam of people and organizations embracing both. (p. 1)

2. Collaborate. A clear message of the report is that to do this sort of e-research people need to learn to collaborate and by that they don’t just mean learning to get along. They mean deliberate collaboration that is managed. I know our team had to consciously develop patterns of collaboration to get things done across 3 countries and many more universities. It also means collaborating across disciplines and this is where the “one culture” of the report is aspirational – something the report both announces and encourages. Without saying so, the report also serves as a warning that we could end up with a different polarization just as the separation of scientific and humanistic culture is healed. We could end up with polarization between those who work on big data (of any sort) using computational techniques and those who work with theory and criticism in the small. We could find humanists and scientists who use statistical and empirical methods in one culture while humanists and scientists who use theory and modelling gather as a different culture. One culture always spawns two and so on.

3. Expand Concepts. The recommendations push the idea that all sorts of people/stakeholders need to expand their ideas about research. We need to expand our ideas about what constitutes research evidence, what constitutes research activity, what constitutes research deliverables and who should be doing research in what configurations. The humanities and other interpretative fields should stop thinking of research as a process that turns the reading of books and articles into the writing of more books and articles. The new scale of data calls for a new scale of concepts and a new scale of organization.

It is interesting how this report follows the creation of the Digging Into Data program. It is a validation of the act of creating the programme and creating it as it was. The funding agencies, led by Brett Bobley, ran a consultation and then gambled on a programme designed to encourage and foreground certain types of research. By and large their design had the effect they wanted. To some extent CLIR reports that research is becoming what Digging encouraged us to think it should be. Digging took seriously Greg Crane’s question, “what can you do with a million books”, but they abstracted it to “what can you do with gigabytes of data?” and created incentives (funding) to get us to come up with compelling examples, which in turn legitimize the program’s hypothesis that this is important.

In other words we should acknowledge and respect the politics of granting. Digging set out to create the conditions where a certain type of research thrived and got attention. The first round of the programme was, for this reason, widely advertised, heavily promoted, and now carefully studied and reported on. All the teams had to participate in a small conference in Washington that got significant press coverage. Digging is an example of how granting councils can be creative and change the research culture.

The Digging into Data Challenge presents us with a new paradigm: a digital ecology of data, algorithms, metadata, analytical and visualization tools, and new forms of scholarly expression that result from this research. The implications of these projects and their digital milieu for the economics and management of higher education, as well as for the practices of research, teaching, and learning, are profound, not only for researchers engaged in computationally intensive work but also for college and university administrations, scholarly societies, funding agencies, research libraries, academic publishers, and students. (p. 2)

The word “presents” can mean many things here. The new paradigm is both a creation of the programme and a result of changes in the research environment. The very presentation of research is changed by the scale of data. Visualizations replace quotations as the favored way into the data. And, of course, granting councils commission reports that re-present a heady mix of new paradigms and case studies.

 

 

Digital Infrastructure Summit 2012

A couple of weeks ago I gave a talk at Digital Infrastructure Summit 2012 which was hosted by the Canadian University Council of Chief Information Officers (CUCCIO). This short conference was very different from any other I’ve been at. CUCCIO, by its nature, is a group of people (university CIOs) who are used to doing things. They seemed committed to defining a common research infrastructure for Canadian universities and trying to prototype it. It seemed all the right people were there to start moving in the same direction.

For this talk I prepared a set of questions for auditing whether a university has good support for digital research in the humanities. See Check IT Out!. The idea is that anyone from a researcher to an administrator can use these questions to check out the IT support for humanists.

My conference notes are here.

Dissertation for Sale: A Cautionary Tale

The other day, while browsing around looking for books to read on my iPad, I noticed what looked like a dissertation for sale. I was wondering how dissertations could get into e-book stores when I remembered the license that graduate students are being asked to sign these days by Theses Canada. The system here encourages students to give a license to Library and Archives Canada that includes the right,

(a) to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell my thesis (the title of which is set forth above) worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats;

I have now come across a cautionary story in the Chronicle of Higher Education, Dissertation for Sale: A Cautionary Tale. It seems the practice is also allowed in the US.

Getting Hacked

For those wondering why I haven’t been blogging and why Theoreti.ca seems to be unavailable, the answer is that the blog has been hacked and I’m trying to solve the problem. My ISP rightly freezes things when the blog seems to send spam. Sorry about all this!

Check IT Out!

I posted on 4Humanities a questionnaire that I call Check IT Out!. The idea is to give administrators and researchers a tool for checking out the research information technology (IT) that they have at their university. I developed it for a talk I give tomorrow at the Digital Infrastructure Summit 2012 in Saskatoon. I’m on the “Reality Check Panel” that presents realities faced by researchers. Check IT Out! is meant to address the issue of getting basic computing support and infrastructure for research. It is often sexier to build something new than to make sure that researchers have the basics. That raises the question of what the basics are, which is why I thought I would frame Check IT Out! as a series of questions, not assertions. Often people in computing services know the answers to these, but our colleagues don’t even know how to frame the question.

Save Library and Archives Canada

The Canadian Association of University Teachers has a campaign to Save Library and Archives Canada from the “Badly conceived restructuring, a redefinition of its mandate, and financial cutbacks (that) are undermining LAC’s ability to acquire, preserve and make publicly available Canada’s full documentary heritage.” The issue is not just cuts, but how LAC is dealing with the cuts.

Daniel Caron, Librarian and Archivist of Canada, has announced that “the new environment is totally decentralized and our monopoly as stewards of the national documentary heritage is over.”

LAC will be decentralizing a large portion of its collections to both public and private institutions. LAC documents refer to this voluntary group of “memory institutions” as a “coalition of the willing.”

Go to the site now, read up on the issues, and consider taking action!

New ‘Digital Divide’ Seen in Wasting Time Online

From @nowviskie a New York Times article on the New ‘Digital Divide’ Seen in Wasting Time Online.

As access to devices has spread, children in poorer families are spending considerably more time than children from more well-off families using their television and gadgets to watch shows and videos, play games and connect on social networking sites, studies show.

This fits in interesting ways with research I’ve come across in two other contexts. First, it fits with what Valerie Steeves talked about at the GRAND 2012 conference I went to. (See my conference notes.) She reported on her Young Canadians in an Online World research – she has been interviewing young Canadians, their parents and teachers over the years. Between 2000 and now there has been a shift in attitude towards the internet from believing it was good for learning to thinking of it as a minefield.

The other context is a cool book I’m reading on keitai or mobile phones in Japan. Personal, Portable, Pedestrian is a collection edited by Mizuko Ito, Daisuke Okabe and Misa Matsuda about the cell phone phenomenon in Japan. They point out in passing how there are significant national/cultural differences in how technologies are picked up and used.

In the case of the PC Internet, differences in adoption were most often couched in terms of a digital divide, of haves and have-nots in relation to a universally desirable technological resource. By contrast, mobile media are frequently characterized as having different attractions depending on local contexts and cultures. The discourse of the digital divide has been mobilized in relation to Japanese keitai Internet access (see chapter 1) and is implicit in the discourse suggesting that the United States needs to catch up to Japanese keitai cultures. (p. 6)

While we need to be aware of differences in access to technology, we also should be critical of the assumptions underlying the discourse of divides. Why do we assume that the Internet is good and mobiles less so? Why did the Japanese discourse switch from viewing keitai as promoting youth rudeness and isolation to arguing for Japanese technonationalist exceptionalism (we use mobiles more because there is something exceptional about Japanese culture/spirit)?

Which reminds me of a TechCrunch article on How The Future of Mobile Lies in the Developing World. Cell phones for us are one more gadget with which to access the Internet. In the developing world they are revolutionary in that they leapfrogged the problems of physical infrastructure (phone wires) and now provide connectivity for many who had none. It is no wonder that the growth in the cell market is in the developing world.

For many communities, simple voice and text connections have brought about revolutions in access to financial, health, agricultural and education services and opportunities for employment. For example, many farmers in rural areas in Africa and Asia use SMS services to find out the daily prices of agricultural commodities. This information allows them to improve their bargaining position when taking their goods to market, and also allows them to switch between end markets.

War and Peace gets Nookd

From Slashdot I found this blog entry, Ocracoke Island Journal: Nookd, about how a Nook version of War and Peace had the word “kindle” replaced by “nook” as in “It was as if a light has been Nooked (kindled) in a carved and painted lantern…” It seems that the company that ported the Kindle version over to the Nook ran a search and replace that swapped every occurrence of the word Kindle for Nook.
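A blanket substitution of this kind is easy to reproduce. The following is only a sketch of how such a slip could happen, not a reconstruction of what the porting company actually ran; the function names naive_replace and careful_replace are my own, and the sample sentence is the one quoted in the blog entry. A case-insensitive replace with no word boundaries turns “kindled” into “Nookd”, while a case-sensitive, whole-word pattern leaves Tolstoy alone.

```python
import re

def naive_replace(text):
    # Assumed failure mode: match "kindle" anywhere, ignoring case,
    # so the verb "kindled" is caught along with the product name.
    return re.sub(r"kindle", "Nook", text, flags=re.IGNORECASE)

def careful_replace(text):
    # Safer variant: only the capitalized product name as a whole word.
    return re.sub(r"\bKindle\b", "Nook", text)

sentence = "It was as if a light has been kindled in a carved and painted lantern"
print(naive_replace(sentence))
print(careful_replace(sentence))
```

Even the careful version would still rewrite a sentence that opens with the imperative “Kindle”, so a pattern alone is no substitute for limiting the replacement to the front matter where the product name belongs.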

I think this should be turned into a game. We should create an e-reader that plays with the text in various ways. We could adapt some of Steve Ramsay’s algorithmic ideas (reversing lines of poetry). Readers could score points by clicking on the words they think were replaced and guessing the correct one.
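Here is a minimal sketch of the scoring side of such a game, under my own assumptions: the text is already split into words, the decoy words are supplied by hand, and a reader earns one point for spotting a swapped position and a second point for naming the original word. The names make_puzzle and score are hypothetical, and nothing here comes from Ramsay’s work.

```python
import random

def make_puzzle(words, decoys, swaps, seed=0):
    """Swap `swaps` randomly chosen words for decoys.

    Returns the altered word list and an answer key mapping each
    swapped position to the original word.
    """
    rng = random.Random(seed)
    positions = rng.sample(range(len(words)), swaps)
    altered = list(words)
    answers = {}
    for pos in positions:
        answers[pos] = words[pos]
        # Only use decoys that differ from the word being replaced.
        choices = [d for d in decoys if d != words[pos]]
        altered[pos] = rng.choice(choices)
    return altered, answers

def score(answers, guesses):
    """One point per swapped position clicked, one more for the right original."""
    points = 0
    for pos, guess in guesses.items():
        if pos in answers:
            points += 1
            if guess.lower() == answers[pos].lower():
                points += 1
    return points

words = "It was as if a light has been kindled in a carved and painted lantern".split()
altered, answers = make_puzzle(words, ["Nooked", "Koboed"], swaps=2, seed=1)
print(" ".join(altered))
```

A reader’s clicks would arrive as a dictionary of position to guessed word, so score(answers, {7: "kindled"}) earns two points only if position 7 happened to be one of the swapped words.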

Bonfire of the Humanities

I’m sitting at Congress 2012 in the beer tent at Wilfrid Laurier. I’ve been writing a conference report of SDH/SEMI 2012. But in the beer tent they are talking about the ARG that Neil Randall (may have) started called Bonfire of the Humanities. Apparently the dean may have shut it down, but traces are left, see #bonfireofthehumanities. See also the YouTube video, Torch Institute Declares War Against University of Waterloo.

Because some may misunderstand, the Torch Institute is probably an Alternate Reality Game (ARG) satirizing the academy. With ARGs you never know what is real or not. The dean shutting things down, like the removal of the YouTube video above, may or may not be part of the game script. (You can see other Torch Institute videos here.) The guiding idea behind ARGs is TINAG (This is not a game). ARGs are supposed to be games only in so far as you play with what may or may not be the game. Who knows about the Torch Institute?