What’s in a number? William Shakespeare’s legacy analysed


The Guardian published an article titled What’s in a number? William Shakespeare’s legacy analysed (April 22, 2016). This article is part of a Shakespeare 400 series in honour of the 400th anniversary of the Bard’s death. The article is introduced thus:

Shakespeare’s ability to distil human nature into an elegant turn of phrase is rightly exalted – much remains vivid four centuries after his death. Less scrutiny has been given to statistics about the playwright and his works, which tell a story in their own right. Here we analyse the numbers behind the Bard.

The authors offer a series of visualizations of statistics about Shakespeare that are rather more of a tease than anything really interesting. They also ignore the long history of using quantitative methods to study Shakespeare going back to Mendenhall’s study of authorship using word lengths.

Mendenhall, T. C. (1901). “A Mechanical Solution of a Literary Problem.” The Popular Science Monthly. LX(7): 97-105.

CAA and SAH Release Guidelines for the Evaluation of Digital Scholarship in Art and Architectural History

The College Art Association (CAA) and the Society of Architectural Historians (SAH) have released Guidelines for the Evaluation of Digital Scholarship in Art and Architectural History. The guidelines include attention to process:

A work of digital scholarship often requires developing or refining a methodology. That work should be evaluated as a contribution to scholarship, just as methodological innovations in traditional scholarship are given weight in assessments of achievement. By extension, digital scholarship may need to be evaluated by the process of analysis in addition to the results of the analysis. (p. 5)

The guidelines go on to discuss how to identify the importance of the process through things like project narratives. They also talk about how the “inadequacy of existing peer review for digital scholarship is directly related to the changing nature of publications. In many cases, peer review for a digital publication is little different from that of a print publication,…” It sounds like the arts are going through the same discussions as we are.

Literature Measured

I finally got around to reading the latest of the Pamphlets of the Stanford Literary Lab. This pamphlet, 12. Literature Measured (PDF), written by Franco Moretti, is a reflection on the Lab’s research practices and why they chose to publish pamphlets. It is apparently the introduction to a French edition of the pamphlets. The pamphlet makes some important points about their work and the digital humanities in general.

Images come first, in our pamphlets, because – by visualizing empirical findings – they constitute the specific object of study of computational criticism; they are our “text”; the counterpart to what a well-defined excerpt is to close reading. (p. 3)

I take this to mean that the image shows the empirical findings or the model drawn from the data. That model is studied through the visualization. The visualization is not an illustration or supplement.

By frustrating our expectations, failed experiments “estrange” our natural habits of thought, offering us a chance to transform them. (p. 4)

The pamphlet has a good section on failure and how that is not just a rhetorical ploy, but important to research. I would add that only certain types of failure are important in this way. There are dumb failures too. He then moves on to the question of successes in the digital humanities and ends with an interesting reflection on how the digital humanities and Marxist criticism don’t seem to have much to do with each other.

But he (Bourdieu) also stands for something less obvious, and rather perplexing: the near-absence from digital humanities, and from our own work as well, of that other sociological approach that is Marxist criticism (Raymond Williams, in “A Quantitative Literary History”, being the lone exception). This disjunction – perfectly mutual, as the indifference of Marxist criticism is only shaken by its occasional salvo against digital humanities as an accessory to the corporate attack on the university – is puzzling, considering the vast social horizon which digital archives could open to historical materialism, and the critical depth which the latter could inject into the “programming imagination”. It’s a strange state of affairs; and it’s not clear what, if anything, may eventually change it. For now, let’s just acknowledge that this is how things stand; and that – for the present writer – something needs to be done. It would be nice if, one day, big data could lead us back to big questions. (p. 7)

Information Geographies

Thanks to a note from Domenico Fiormonte to Humanist I came across the Information Geographies page at the Oxford Internet Institute. The OII has been producing interesting maps that show aspects of the internet. The one pictured above shows the distribution of Geographic Knowledge in Freebase. Given the importance of Freebase to Google’s Knowledge Graph it is important to understand the bias of its information to certain locations.

Geographic content in Freebase is largely clustered in certain regions of the world. The United States accounts for over 45% of the overall number of place names in the collection, despite covering about 2% of the Earth, less than 7% of the land surface, and less than 5% of the world population, and about 10% of Internet users. This results in a US density of one Freebase place name for every 1500 people, and far more place names referring to Massachusetts than referring to China.

Domenico Fiormonte’s email to Humanist (Humanist Discussion Group, Vol. 29, No. 824) argues that “It is our responsibility to preserve cultural diversity, and even relatively small players can make a difference by building more inclusive ‘representations’.” He argues that we need to be open about the cultural and linguistic biases of the tools and databases we build.

Godwin’s Bot: Recent stories on AI

Godwin’s Bot is a good essay from Misha Lepetic on 3QuarksDaily on artificial intelligence (AI). The essay reflects on the recent Microsoft debacle with @TayandYou, an AI chat bot that was “targeted at 18 to 24 year old in the US.” (About Tay & Privacy) For a New Yorker story on how Microsoft shut it down after Twitter trolls trained it to be offensive, see I’ve Seen the Greatest A.I. Minds of My Generation Destroyed By Twitter. Lepetic calls her Godwin’s Bot after Godwin’s Law, which asserts that in any online conversation there will eventually be a comparison to Hitler.

What is interesting about the essay is that it then moves to an interview with Stephen Wolfram on AI & The Future of Civilization where Wolfram distinguishes between inventing a goal, which is difficult to automate, and (once one can articulate a goal clearly) executing it, which can be automated.

How do we figure out goals for ourselves? How are goals defined? They tend to be defined for a given human by their own personal history, their cultural environment, the history of our civilization. Goals are something that are uniquely human.

Lepetic then asks if Tay had a goal or who had goals for Tay. Microsoft had a goal, and that had to do with “learning” from and about a demographic that uses social media. Lepetic sees it as a “vacuum cleaner for data.” In many ways the trolls did us a favor by misleading it.

Or … TayandYou was troll-bait to train a troll filter.

My question is whether anyone has done a good analysis of how the Tay campaign actually worked.

List of animals with fraudulent diplomas

Thanks to Twitter I came across this List of animals with fraudulent diplomas on Wikipedia. As others have pointed out, this is the best Wikipedia page (so far). Here is an example to whet your appetite:

Ben Goldacre, a UK-based physician and science journalist, wrote in 2004 that his cat, Henrietta, had obtained a diploma in nutrition from the American Association of Nutritional Consultants; Goldacre had been investigating allegations about the qualifications claimed by Gillian McKeith.

A Supreme Court Pioneer, Now Making Her Mark on Video Games

The New York Times has a nice story about how Sandra Day O’Connor, A Supreme Court Pioneer, Now Making Her Mark on Video Games. O’Connor is promoting a game called Win the White House from iCivics, an education group she started in 2009. She has also involved her Supreme Court colleague Sotomayor. The article mentions how such educational games are getting more respect.

The involvement of the two justices in digital educational games underscores a growing belief among educators that interactive tools may improve students’ engagement in their own learning. In January, Microsoft introduced an educational version of Minecraft, the hit game in which players use blocks to construct elaborate virtual worlds. Last fall, Google unveiled Expeditions, a virtual reality system for classroom use that takes students on simulated field trips around the world.

Deep Minds master the game of Go

My colleagues over in Computing Science at the University of Alberta are rightly proud of their supervision of the leads at Google who developed the AlphaGo AI that recently won at Go. Martin Mueller and colleagues have been working on AI and games for some time; see Deep Minds master the game of Go. Now another story has come out, Will humans lose out to AI in eSports too? This story highlights work by another team on AI commentating.