Text Mining The Novel 2015

On Thursday and Friday (Oct. 22nd and 23rd) I was at the 2nd workshop for the Text Mining the Novel project. My conference notes are here: Text Mining The Novel 2015. We had a number of great papers on the issue of genre (this year’s topic). Here are some general reflections:

  • The obvious weakness of text mining is that it operates on the novel as text, specifically digital text (or string). We need to find ways to also study the novel as a material object (thing), as a social object, as a performance (of the reader), and as an economic object in a marketplace. Then we also have to find ways to connect these.
  • So many analytical and mining processes depend on bags of words, from dictionaries to topics. Is this a problem or a limitation? Can we try to abstract characters, plot, or argument?
  • I was interested in the philosophical discussions around the epistemological in novels and philosophical claims about language and literature.

 

Dennis Cooper: Zac’s Haunted House (A Novel)

Dennis Cooper has created an interesting novel of looping animated gifs called Zac’s Haunted House (A Novel). The novel is published by Kiddiepunk. I’m not sure why he deliberately calls it a novel when it has so little language, though one can think of the animated gifs as some sort of linked visual language. Perhaps animated gifs are becoming the visual equivalent of words with which we can compose.

I found this courtesy of 3QuarksDaily.

Alain Resnais: Toute la mémoire du monde

Thanks to 3quarksdaily.com I came across the wonderful short film by Alain Resnais, Toute la mémoire du monde (1956). The short is about memory and the Bibliothèque nationale (of France). It starts at the roof of this fortress of knowledge and travels down through the architecture. It follows a book from when it arrives from a publisher to when it is shelved. It shows another book called by pneumatique to the reading room, where it crosses a boundary to be read. All of this comes with a philosophical narration on information and memory.

The short shows big analogue information infrastructure at its technological and rational best, before digital informatics disrupted the library.

HathiTrust Research Center Awards Three ACS Projects

An Advanced Collaborative Support project that I was part of was funded; see HathiTrust Research Center Awards Three ACS Projects. Our project, called The Trace of Theory, sets out first to see if we can identify subsets of the HathiTrust volumes that are “theoretical” and then to try to track “theory” through these subsets.

YOU: A Novel

I recently finished the science fiction novel YOU: A Novel by Austin Grossman. The novel is about Russell’s return to take a job with Black Arts, a game development studio. Russell had distanced himself from the high-school friends with whom he got into designing a role-playing computer game. After dropping out of the liberal-arts-to-law trajectory he had been on, he goes back and gets a job with the company his friends had created in the meantime. Grossman has worked in game design, so the book has an authentic feel. It is also good on how franchises are maintained, as the story turns around the creation of yet another sequel to a tired franchise, with flashbacks to when they created the earlier versions. There are a number of interesting ideas and enigmas in YOU, including:

  • YOU is about the dream of the ultimate game. The book starts and ends with the question of the ultimate game and how the imagining of that game is so important to gaming,

    Finally, there is the secret of the ultimate game, inscribed on a series of crumbling scrolls in a language that is no longer well understood. But partial translations suggest that the secret of the ultimate game is that you’re already in the ultimate game, all the time, forever. That the secret of the ultimate game is that the ultimate game is a paradox, because there’s no way to play a game without knowing you’re playing it. That games are already awesome, or else why are we making such a fuss? (p. 365)

  • Part of the issue of the ultimate game is the drive to greater realism, as if graphical realism were the only metric. The novel forces us to ask what realism is and whether better graphics are necessarily more immersive (compared with better stories or more open worlds).
  • Simon, the genius behind the Black Arts game engine, dies in an elevator before the novelistic present begins. It isn’t clear how he died, but it seems relevant at the beginning of the novel. Grossman answered questions about the novel here, including one about how Simon’s death is a “bait-and-switch.”
  • Finally, there is a really interesting reference to the story of love told in Plato’s Symposium that men and women are separated and want to join up again,

    To forestall any future threat, the gods decreed we should each be separated into halves, and each half hurled into a separate dimension. There was the human half, weak but endowed with thought and feeling, and the video game half, with glowing and immortal bodies that were mere empty shells lacking wills of their own. We became a fallen race and forgot our origins, but something in us longed to be whole again. And so we invented the video game, the apparatus that bridged the realms and joined us with our other selves again, through the sacred medium of the video game controller. The first devices were primitive, but every year the technology improved, and we saw and heard and sensed the other world more clearly. Soon enough we’d be able to feel and smell and taste and live entirely in our own bodies again. And on that day, he finished portentously, we’d challenge the gods once more.

    “First of all,” I said, “you ripped that whole thing off from that story in Plato. …” (p. 301)

WPA: Uses and Limitations of Automated Writing Evaluation

The Council of Writing Program Administrators has made available a very useful Research Bibliography on the Uses and Limitations of Automated Writing Evaluation Software (PDF). This is part of a set of WPA-ComPile Research Bibliographies. The paragraph-long summaries of the articles are quite useful.

What seems to be missing is an ethical discussion of automated evaluation. Do we need to tell people if we use automated evaluation? Writing for someone feels like a very personal act (even in a large class). What are the expectations of writers that their writing would be read?

Pentametron: With algorithms subtle and discrete

Scott sent me a link to the Pentametron: With algorithms subtle and discrete / I seek iambic writings to retweet. This site creates iambic pentameter poems from tweets by looking at the rhythm of words. It then tries to find rhyming last words to create an AABB rhyming scheme. You can see an article about it on Gawker titled Weird Internets: The Amazing Found-on-Twitter Sonnets of Pentametron.
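To get a feel for what such a bot has to do, here is a minimal sketch in Python. It is not Pentametron's actual code. The STRESS and RHYME tables are tiny hypothetical lexicons I made up for the site's own couplet; a real system would presumably look words up in a pronouncing dictionary. The sketch checks whether a line scans as ten syllables alternating unstressed and stressed, then pairs metrical lines whose last words share a rhyme key.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only).
# "1" = stressed syllable, "0" = unstressed,
# "x" = a monosyllable that can be read either way.
STRESS = {
    "with": "x", "algorithms": "1010", "subtle": "10", "and": "x",
    "discrete": "01", "i": "x", "seek": "x", "iambic": "010",
    "writings": "10", "to": "x", "retweet": "01",
}

# Hypothetical rhyme keys for line-final words.
RHYME = {"discrete": "EET", "retweet": "EET"}

IAMBIC_PENTAMETER = "0101010101"  # five unstressed/stressed pairs


def words(line):
    """Lowercase the line and split it into word tokens."""
    return re.findall(r"[a-z']+", line.lower())


def scan(line):
    """Return the line's stress pattern, or None if a word is unknown."""
    pattern = []
    for word in words(line):
        if word not in STRESS:
            return None
        pattern.append(STRESS[word])
    return "".join(pattern)


def is_iambic_pentameter(line):
    """True if the line has ten syllables that fit the iambic template."""
    pattern = scan(line)
    if pattern is None or len(pattern) != len(IAMBIC_PENTAMETER):
        return False
    return all(got in ("x", want)
               for got, want in zip(pattern, IAMBIC_PENTAMETER))


def couplets(lines):
    """Pair up metrical lines whose final words share a rhyme key."""
    by_rhyme = defaultdict(list)
    for line in lines:
        if is_iambic_pentameter(line):
            key = RHYME.get(words(line)[-1])
            if key is not None:
                by_rhyme[key].append(line)
    pairs = []
    for group in by_rhyme.values():
        for i in range(0, len(group) - 1, 2):
            pairs.append((group[i], group[i + 1]))
    return pairs
```

Feeding couplets() the two lines of the site's motto plus some non-metrical text should return the motto as a single rhymed pair; stringing such pairs together gives the AABB scheme. The hard part in practice is the lexicon, not the matching.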

Rowling and “Galbraith”: an authorial analysis

JK Rowling was recently uncovered as the author of The Cuckoo’s Calling, which was submitted under the name Robert Galbraith. The Sunday Times revealed this after a hint on Twitter and some forensic stylometry. Patrick Juola, one of the two people who did the analysis, has a guest blog post where he talks about what he did: Rowling and “Galbraith”: an authorial analysis. It is a great short description of an authorship attribution project.

Social Digital Scholarly Editing

On July 11th and 12th I was at a conference in Saskatoon on Social Digital Scholarly Editing. This conference was organized by Peter Robinson and colleagues at the University of Saskatchewan. I kept conference notes here.

I gave a paper on “Social Texts and Social Tools.” My paper argued for text analysis tools as a “reader” of editions. I took the extreme case of big data text mining and what scraping/mining tools want in a text and don’t want in a text. I took this extreme view to challenge the scholarly editing view that the more interpretation you put into an edition the better. Big data wants to automate the process of gathering and mining texts – big data wants “clean” texts that don’t have markup, annotations, metadata and other interventions that can’t be easily removed. The variety of markup in digital humanities projects makes it very hard to clean them.

The response was appreciative of the provocation, but (thankfully) not convinced that big data was the audience of scholarly editors.
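To make the "clean text" point concrete, here is a minimal sketch of what a scraping tool might do to an encoded edition. It assumes a small, well-formed XML fragment without namespaces; the sample markup and the clean_text helper are my own hypothetical illustration, not any project's actual pipeline. It keeps the running text, drops the markup, and throws away editorial notes entirely.

```python
import xml.etree.ElementTree as ET


def clean_text(xml_string, drop=("note",)):
    """Return the plain text of an XML fragment, skipping `drop` elements.

    Assumes well-formed XML with no namespaces (a simplifying assumption).
    """
    root = ET.fromstring(xml_string)
    parts = []

    def walk(element):
        if element.tag in drop:
            return  # discard the element and everything inside it
        if element.text:
            parts.append(element.text)
        for child in element:
            walk(child)
            # Text following a child belongs to the parent, so keep it
            # even when the child itself was dropped.
            if child.tail:
                parts.append(child.tail)

    walk(root)
    # Collapse runs of whitespace left behind by removed markup.
    return " ".join("".join(parts).split())


sample = (
    '<p>Call me <hi rend="italic">Ishmael</hi>.'
    '<note>Editor: the famous opening line.</note> Some years ago</p>'
)
print(clean_text(sample))
```

Even this toy case shows the problem: the tool has to know which elements are interpretation to be discarded (here a hypothetical note element), and every project encodes that differently.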