The New York Times has an interesting way of visualizing fashion that you can see in their article Front Row to Fashion Week – Interactive Feature. They have abstracted the colour hues to create small swatches of different designers who showed at the New York Fashion Week. These “sparklines” or sparkboxes are an interesting way to compare the shows by designers.
Archive for the ‘Methods’ Category
On July 11th and 12th I was at a conference in Saskatoon on Social Digital Scholarly Editing. This conference was organized by Peter Robinson and colleagues at the University of Saskatchewan. I kept conference notes here.
I gave a paper on “Social Texts and Social Tools.” My paper argued for text analysis tools as a “reader” of editions. I took the extreme case of big data text mining and what scraping/mining tools want in a text and don’t want in a text. I took this extreme view to challenge the scholarly editing view that the more interpretation you put into an edition the better. Big data wants to automate the process of gathering and mining texts – big data wants “clean” texts that don’t have markup, annotations, metadata and other interventions that can’t be easily removed. The variety of markup in digital humanities projects makes it very hard to clean them.
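To make the "clean text" point concrete, here is a minimal Python sketch of what a mining tool has to do before it can read an edition: keep the running text and discard editorial interventions. The element name `note`, the function name `clean_text` and the sample markup are assumptions for illustration, not taken from any particular edition, and real TEI files would also need namespace handling.

```python
import xml.etree.ElementTree as ET


def clean_text(xml_text, drop=("note",)):
    """Return the running text of an XML document, skipping dropped elements.

    `drop` lists element names treated as editorial interventions
    (hypothetical here; every project uses its own markup).
    """
    root = ET.fromstring(xml_text)
    parts = []

    def walk(elem):
        # Skip the whole subtree of an unwanted element.
        if elem.tag in drop:
            return
        if elem.text:
            parts.append(elem.text)
        for child in elem:
            walk(child)
            # Tail text belongs to the parent's flow, so keep it
            # even when the child itself was dropped.
            if child.tail:
                parts.append(child.tail)

    walk(root)
    # Collapse runs of whitespace left behind by the markup.
    return " ".join("".join(parts).split())


sample = (
    "<text><p>Big data wants <hi rend='italic'>clean</hi> texts"
    "<note>an editor's annotation</note>.</p></text>"
)
print(clean_text(sample))
```

The list of elements to drop has to be written by hand for each project, which is exactly why the variety of markup makes cleaning hard to automate.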
The response was appreciative of the provocation, but (thankfully) not convinced that big data was the audience of scholarly editors.
We are finally getting results in a long slow process of trying to study tool discourse in the digital humanities. Amy Dyrbe and Ryan Chartier are building a corpus of discourse around tools that includes tool reviews, articles about what people are doing with tools, web pages about tools and so on. We took the first coherent chunk and Ryan has been analyzing it with R. The graph above shows which years have the most characters. My hypothesis was that tool reviews and discourse dropped off in the 1990s as the web became more important. This seems to be wrong.
Here are the high-frequency words (with stop words removed). Note the modal verbs “can”, “will”, and “may.” They indicate the potentiality of tools.
“ii” 1514 (Not sure why)
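For readers who want to try this kind of count themselves, here is a minimal sketch in Python. It is not the R code Ryan used; the stop word list, the tokenizer and the sample sentence are all assumptions for illustration.

```python
import re
from collections import Counter

# A tiny, hypothetical stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "with", "that", "it"}


def top_words(text, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring stop words."""
    # Lowercase and keep runs of letters only; punctuation and digits are dropped.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)


sample = "The tool can parse text. A tool may fail, and the tool will retry."
print(top_words(sample, n=3))
```

Note that modal verbs like "can", "will" and "may" survive only if the stop word list leaves them out, as this one does. A letters-only tokenizer like this one also counts a token such as "ii" as a word.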
I have been working for a while on archiving the Globalization Compendium which I worked on. Yesterday I got it archived in two Institutional Repositories:
- McMaster University (where the project was developed) has an IR called DigitalCommons@McMaster. See it at Globalization Compendium Archive.
- The University of Alberta (where I am now) has an IR called ERA (Education and Research Archive). See it at Globalization Compendium Archive.
In both cases there is a Zip of a BagIt bag with the XML files, code and other documentation from the site. My first major deposit.
The New York Times now has an article on the Criminal Intent project I was part of. See, Old Bailey Trials Are Tabulated for Scholars Online. They quote a historian who is sceptical of the results of mining, though he appreciates the resource.
“The Old Bailey Online project has done a great service in making those sources widely (and costlessly) available,” Mr. Langbein wrote in an e-mail. But he complained that the claims about data mining have “a breathless quality: ‘you can expect big things from us,’ but as yet it’s all method and no results.” He said that the new findings belittle the work of a generation of scholars who focused on the 18th century as the turning point in the evolution of the criminal justice system.
Alas, he seems not to have read our report, only the summary in the Chronicle. It is easy to use cute phrases like “breathless quality”, but is he right? Time will tell, but I think the historians on our team have backed up the results found with mining and they never belittled the work of previous scholars – we saw ourselves building on it.
What can mining do? I think mining can give you a big picture so that you see the forest rather than the trees in a way that no one could before. Conclusions about the shape of the forest have to be checked against other evidence, but the results of mining are evidence that is not breathless even if it takes your breath away. As Bill Turkel put it,
Mr. Turkel, who developed some of the digital tools, said that data mining reveals unexpected trends and connections that no one would have thought to look for before. Previous scholars “tended to cherry-pick anecdotes without having a sense that it was possible to measure all of that text and treat the whole archive as a single unit,” he said.
Of course, if you then leverage traditional evidence to buttress your argument then the mining is forgotten or trivialized.
I had heard about Bill Turkel’s ‘super secret’ project and how he had decided to keep the idea of the project secret but share the method, which is the opposite of what we usually do. As I am now on research leave (sabbatical) and working on 5 books (ha!) I thought I should learn from Bill. Here is the link to his excellent research workflow, How To « William J Turkel. What I like is that it is all stuff you can do with off-the-shelf tools, though not necessarily free ones.
This one-day event is a chance for research projects that are digitizing evidence to meet up with each other and with units on campus that provide relevant research services. Projects that are creating digital archives of different sorts will give short presentations as will units on campus that support research.
The idea is to bring a lot of digitization projects together to learn about each other and what is happening on campus. My sense is that we have hit a critical mass on campus and now that we have a trusted digital repository ERA (Education and Research Archive) it is time to start talking and sharing knowledge. Each project should not have to reinvent itself.
From Slashdot I came across a story in GamePron about how Australian R18+ games rating gets govt support. In Australia any game that isn’t classified MA 15+ or below is refused classification and thus can’t be sold. (The Australian system is law, unlike the voluntary industry ESRB system.) The Australian government is now considering adding a new R 18+ designation based on government-supported studies and consultations.
Of particular interest is the Literature review on the impact of playing violent video games on aggression (PDF). This excellent review concludes that “research into the effects of VVGs (Violent Video Games) on aggression is contested and inconclusive.” (p. 5) This 50-page review by the Australian Government Attorney-General’s Department is a model of clarity and balance – it is worth quoting in greater detail:
There is some consensus in the research that some members of the community, such as people with psychotic personality traits, may be more affected by VVGs than others. However, there is mixed evidence as to whether VVGs have a greater impact on children.
A number of other findings of this review arguably reduce the policy relevance of VVG research.
- There is stronger evidence of short-term VVG effects than of long-term effects.
- The possibility that third variables (like aggressive personality, family and peer influence, socio-economic status) are behind the effect has not been well explored.
- Researchers who argue that VVGs cause aggression have not engaged with or disproved alternative theories propagated by their critics.
- There is little evidence that violent video games have a greater impact than other violent media. (p. 5)
I’m at the Methods Commons workshop and Kirsten Uszkalo presented the WEME project (Witches in Early Modern England.) She showed (for the first time) the Throwing Bones interface which allows one to search the database and survey results as small decks of cards. Each deck has a different set of cards depending on the features of the hit. (See an example below.) You can use these sets to explore the hits. Very neat!
Well my vacation is over and I’m facilitating a retreat on text methods across disciplines. (See Towards a Methods Commons.) With support from the ITST program at SSHRC we brought together 15 linguists, philosophers, historians, and literary scholars to discuss methods in a structured way. The goal is to sketch a commons that gathers “recipes” that show people how to do research things with electronic texts. Stay tuned for a draft web site in about 6 months.