In response to unprecedented exigencies, more systemic solutions may be necessary and fully justifiable under fair use and fair dealing. This includes variants of controlled digital lending (CDL), in which books are scanned and lent in digital form, preserving the same one-to-one scarcity and time limits that would apply to lending their physical copies. Even before the new coronavirus, a growing number of libraries have implemented CDL for select physical collections.
To be honest, I am so tired of sitting on my butt that I plan to spend much more time walking to and browsing around the library at the University of Alberta. As much as digital access is a convenience, I’m missing the occasions for getting outside and walking that a library affords. Perhaps we should think of the library as a labyrinth – something deliberately difficult to navigate in order to give you an excuse to walk around.
Perhaps I need a book scanner on a standing desk at home to keep me on my feet.
Michael Sinatra invited me to a “show and tell” workshop at the new Université de Montréal campus where they have a long data wall. Sinatra is the Director of CRIHN (Centre de recherche interuniversitaire sur les humanitiés numériques) and kindly invited me to show what I am doing with Stéfan Sinclair and to see what others at CRIHN and in France are doing.
Exploring through Markup: Recovering COCOA. This paper looked at an experimental Voyant tool that allows one to use COCOA markup as a way of exploring a text in different ways. COCOA markup is a simple form of markup that was superseded by XML languages like those developed with the TEI. The paper recovered some of the history of markup and what we may have lost.
Designing for Sustainability: Maintaining TAPoR and Methodi.ca. This paper was presented by Holly Pickering and discussed the processes we have set up to maintain TAPoR and Methodi.ca.
Our team also had two posters, one on “Generative Ethics: Using AI to Generate” that showed a toy that generates statements about artificial intelligence and ethics. The other, “Discovering Digital Methods: An Exploration of Methodica for Humanists” showed what we are doing with Methodi.ca.
This directory contains 450 novels that appeared between 1770 and 1930 in German, French and English. It is designed for us in teaching and research.
Andrew Piper mentioned a corpus that he put together, txtlab Multilingual Novels. This corpus is of some 450 novels from the late 18th century to the early 20th (1920s). It has a gender mix and is not only English novels. This corpus was supported by SSHRC through the Text Mining the Novel project.
The obvious weakness of text mining is that it operates on the novel as text, specifically digital text (or string.) We need to find ways to also study the novel as material object (thing), as a social object, as a performance (of the reader), and as an economic object in a market place. Then we also have to find ways to connect these.
So many analytical and mining processes depend on bags of words from dictionaries to topics. Is this a problem or a limitation? Can we try to abstract characters, plot, or argument.
I was interested in the philosophical discussions around the epistemological in novels and philosophical claims about language and literature.
This project brings together researchers and partners from 21 different academic and non-academic institutions to produce the first large-scale quantitative history of the novel. Our aim is to bring new computational approaches in the field of text mining to the study of literature as well as bring the unique knowledge of literary studies to bear on larger debates about data mining and the place of information technology within society.
NovelTM is led by Andrew Piper at McGill University. At the University of Alberta I will be gathering a team that will share the resulting computing methods through TAPoR and developing recipes or tutorials so that others can try them.
Tyler Trkowski has written a Feature for NOISEY (Music by Vice) on Rap Game Riff Raff Textual Analysis. It is a neat example of text analysis outside the academy. He used Voyant and Many Eyes to analyze Riff Raff’s lyrical canon. (Riff Raff, or Horst Christian Simco, is an eccentric rapper.) What is neat is that they embedded a Voyant word cloud right into their essay along with Word Trees from Many Eyes. Riff Raff apparently “might” like “diamonds” and “versace”.
We are finally getting results in a long slow process of trying to study tool discourse in the digital humanities. Amy Dyrbe and Ryan Chartier are building a corpus of discourse around tools that includes tool reviews, articles about what people are doing with tools, web pages about tools and so on. We took the first coherent chunk and Ryan has been analyzing it with R. The graph above shows which years have the most characters. My hypothesis was that tool reviews and discourse dropped off in the 1990s as the web became more important. This seems to be wrong.
Here are the high-frequency words (with stop words removed). Note the modal verbs “can”, “will”, and “may.” They indicate the potentiality of tools.
Tim sent me a link to another news story on the Criminal Intent project that I am part of. This one is in Science News and is titled, Crime’s Digital Past. The article in by Bruce Bower and dated July 30th, 2011 (which, I know, is in the future.) One of the better stories.