Mt Fuji with the sun setting behind it
Sitting on a hill with a view of Mt. Fuji across the water is the Shonan Village Center, where I just finished a research retreat on Modelling Cultural Processes. This was organized by Mits Inaba, Martin Roth, and Gerhard Heyer from Ritsumeikan University and the University of Leipzig. It brought together people in computing, linguistics, game studies, political science, literary studies and the digital humanities. My conference notes are here.
Unlike a conference, much of the time was spent in working groups discussing issues like identity, shifting content, and constructions of culture. In our working groups we developed a useful model of the research process across the humanities and social sciences, one that lets us locate where shifts in content occur.
Mt Fuji in the distance across the water
Computer programming once had much better gender balance than it does today. What went wrong?
The New York Times has a nice long article, The Secret History of Women in Coding. We know a lot of the story from books like Campbell-Kelly’s From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry (2003), Chang’s Brotopia (2018), and Rankin’s A People’s History of Computing in the United States (2018).
The history is not the heroic story of personal computing that I was raised on. It is a story of how women were driven out of computing (both the academy and businesses) starting in the 1960s.
A group of us at the U of Alberta are working on archiving the work of Sally Sedelow, one of the forgotten pioneers of humanities computing. Dr. Sedelow got her PhD in English in 1960 and did important early work on text analysis systems.
Peter Robinson gave a talk on “Textual Communities: A Platform for Collaborative Scholarship on Manuscript Heritages” as part of the Singhmar Guest Speaker Program in the Faculty of Arts.
He started by talking about whether textual traditions had any relationship to the material world. How do texts relate to each other?
Today stemmata, as visualizations, are models that go beyond the manuscripts themselves to propose evolutionary hypotheses in visual form.
He then showed what he is doing with the Canterbury Tales Project and then talked about the challenges adapting the time-consuming transcription process to other manuscripts. There are lots of different transcription systems, but few that handle collation. There is also the problem of costs and involving a distributed network of people.
He then defined text:
A text is an act of (human) communication that is inscribed in a document.
I wondered how he would deal with Allen Renear’s argument that there are Real Abstract Objects which, like Platonic Forms, are real but have no material instance. When we talk, for example, of “Hamlet” we aren’t talking about a particular instance, but an abstract object. Likewise with things like “justice,” “history,” and “love.” Peter responded that the work doesn’t exist except as its instances.
He also argued that this is why stand-off markup doesn’t work: texts aren’t sets of linear objects. It is better to represent a text as a tree of leaves.
So, he launched Textual Communities – https://textualcommunities.org/
This is a distributed editing system that also has collation.
Paolo showed me a neat demonstration, Word2Vec Vis of Pride and Prejudice. Lynn Cherny trained a Word2Vec model on Jane Austen’s novels and then used it to find close matches for key words. She then showed the text of a novel with its words replaced by their closest matches in the language of Austen. It serves as a sort of demonstration of how Word2Vec works.
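Cherny’s actual pipeline isn’t reproduced here, but the substitution idea can be sketched in miniature. The sketch below stands in for a trained Word2Vec model with simple co-occurrence count vectors over a two-line invented corpus (gensim’s Word2Vec would replace the counting step in a real experiment); the function names and the toy sentences are my own for illustration.

```python
from collections import Counter, defaultdict
import math

def cooccurrence_vectors(sentences, window=2):
    """Count-based context vectors: a crude stand-in for Word2Vec embeddings."""
    vecs = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    vecs[w][sent[j]] += 1
    return vecs

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nearest(word, vecs):
    """Most similar other word by cosine over shared contexts."""
    return max((cosine(vecs[word], vecs[w]), w) for w in vecs if w != word)[1]

# Toy corpus; words used in identical contexts come out as nearest neighbours.
corpus = [
    "it is a truth universally acknowledged".split(),
    "it is a lie universally acknowledged".split(),
]
vecs = cooccurrence_vectors(corpus)
print(nearest("truth", vecs))  # → lie
```

Substituting each word of a text with its `nearest` match is then exactly the replacement trick the demo plays, only with real embeddings and real novels.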
Halloween Costume Names Generated by a Weird AI
Jingwei, a bright digital humanities student working as a research assistant, has been playing with generative AI approaches from aiweirdness.com – Letting neural networks be weird. Janelle Shane has made neural networks funny by using them to generate things like New My Little Ponies. Jingwei scraped paper titles from various digital humanities conference sites, trained a network on them, and generated new titles just waiting to be proposed as papers:
The Catalogue of the Cultural Heritage Parts
Automatic European Pathworks and Indexte Corpus and Mullisian Descriptions
Minimal Intellectual tools and Actorical Normiels: The Case study of the Digital Humanities Classics
Automatic European Periodical Mexico: The Case of the Digital Hour
TEIviv Industics – Representation dans le perfect textbook
Conceptions of the Digital Homer Centre
Preserving Critical Computational App thinking in DH Languages
DH Potential Works: US Work Film Translation Science
Translation Text Mining and GiS 2.0
DH Facilitating the RIATI of the Digital Scholar
Shape Comparing Data Creating and Scholarly Edition
DH Federation of the Digital Humanities: The Network in the Halleni building and Web Study of Digital Humanities in the Hid-Cloudy
The First Web Study of Build: A “Digitie-Game as the Moreliency of the Digital Humanities: The Case study of the Digital Hour: The Scale Text Story Minimalism: the Case of Public Australian Recognition Translation and Puradopase
The Computational Text of Contemporary Corpora
The Social Network of Linguosation in Data Washingtone
Designing formation of Data visualization
The Computational Text of Context: The Case of the World War and Athngr across Theory
The Film Translation Text Center: The Context of the Cultural Hermental Peripherents
The Social Infrastructure PPA: Artificial Data In a Digital Harl to Mexquise (1950-1936)
EMO Artificial Contributions of the Hauth Past Works of Warla Management Infriction
DAARRhK Platform for Data
Automatic Digital Harlocator and Scholar
Complex Networks of Computational Corpus
IMPArative Mining Trail with DH Portal
Pursour Auchese of the Social Flowchart of European Nation
The Stefanopology: The Digital Humanities
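Shane and Jingwei used recurrent neural networks, which aren’t shown here; a much simpler character-level Markov chain, sketched below on a handful of invented training titles, produces the same flavour of almost-plausible gibberish and shows the basic generate-one-character-at-a-time idea.

```python
import random
from collections import defaultdict

def train_markov(titles, order=3):
    """Map each character n-gram to the characters that followed it in training."""
    model = defaultdict(list)
    for title in titles:
        padded = "^" * order + title + "$"   # ^ marks start, $ marks end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=3, max_len=80):
    """Walk the chain from the start state until an end marker or max_len."""
    state = "^" * order
    out = []
    while len(out) < max_len:
        nxt = random.choice(model[state])
        if nxt == "$":
            break
        out.append(nxt)
        state = state[1:] + nxt
    return "".join(out)

# Invented sample titles standing in for the scraped conference data.
titles = [
    "Text Mining the Digital Archive",
    "Text Encoding and the Digital Edition",
    "Mining Digital Collections",
]
model = train_markov(titles)
random.seed(0)
print(generate(model))
```

With enough real titles the chain splices fragments of different papers together, which is where titles like the ones above come from; a neural network does the same job with a longer memory.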
What happens to old digital humanities projects? Most vanish without a trace. Some get archived like the work of John Burrows and others at the Centre For Literary And Linguistic Computing (CLLC). Dr. Alexis Antonia kept an archive of CLLC materials which is now available from the Centre For 21st Century Humanities.
Torn Apart is a curation and visualization of publicly available data concerning ICE, CBP facilities, and usages. Also lists of allied and pro-immigrant facilities.
At DH 2018 I heard Roopika Risam speak about the impressive critical digital humanities project Torn Apart / Separados that she is part of. (See my conference notes here.) The project is rightly getting attention. For example, Inside Higher Ed has a story on Digital Humanities for Social Good. This story presents Torn Apart / Separados as an answer to critiques that the digital humanities are not critical enough and/or lack interpretative value. (See Stanley Fish’s Stop Trying to Sell the Humanities.) The Inside Higher Ed article rightly points out that there have been socially engaged digital humanities projects for some time.
What I find impressive, and think is truly important, is how nimble the project is. It was imagined and implemented in “real” time – i.e. it was developed in response to events unfolding in the news. It was also developed without a grant and by a distributed team of volunteers. That’s what computing in the humanities should be – a way to think through issues critically, not a way to get funding.
This year we had busy CSDH and CGSA meetings at Congress 2018 in Regina. My conference notes are here. Some of the papers I was involved in include:
- “Code Notebooks: New Tools for Digital Humanists” was presented by Kynan Ly and made the case for notebook-style programming in the digital humanities.
- “Absorbing DiRT: Tool Discovery in the Digital Age” was presented by Kaitlyn Grant. The paper made the case for tool discovery registries and explained the merger of DiRT and TAPoR.
- “Splendid Isolation: Big Data, Correspondence Analysis and Visualization in France” was presented by me. The paper talked about FRANTEXT and correspondence analysis in France in the 1970s and 1980s. I made the case that the French were doing big data and text mining long before we were in the Anglophone world.
- “TATR: Using Content Analysis to Study Twitter Data” was a poster presented by Kynan Ly, Robert Budac, Jason Bradshaw and Anthony Owino. It showed IPython notebooks for analyzing Twitter data.
- “Climate Change and Academia – Joint Panel with ESAC” was a panel I was on that focused on alternatives to flying for academics.
- “Archiving an Untold History” was presented by Greg Whistance-Smith. He talked about our project to archive John Szczepaniak’s collection of interviews with Japanese game designers.
- “Using Salience to Study Twitter Corpora” was presented by Robert Budac who talked about different algorithms for finding salient words in a Twitter corpus.
- “Political Mobilization in the GG Community” was presented by ZP who talked about a study of a Twitter corpus that looked at the politics of the community.
Also, a PhD student I’m supervising, Sonja Sapach, won the CSDH-SCHN (Canadian Society for Digital Humanities) Ian Lancashire Award for Graduate Student Promise at CSDHSCHN18 at Congress. The Award “recognizes an outstanding presentation at our annual conference of original research in DH by a graduate student.” She won the award for a paper on “Tagging my Tears and Fears: Text-Mining the Autoethnography.” She is completing an interdisciplinary PhD in Sociology and Digital Humanities. Bravo Sonja!
A paper that Stéfan Sinclair and I wrote about H. P. Luhn and the Keyword-in-Context (KWIC) format has just been published by the Fudan Journal of the Humanities and Social Sciences: Too Much Information and the KWIC. The paper is part of a series that replicates important innovations in text technology, in this case the development of the KWIC by Luhn at IBM. We use that as a moment to reflect on the datafication of knowledge after WW II, drawing on Lyotard.
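A KWIC index simply lines up every occurrence of a keyword with a fixed window of surrounding context, so matches can be scanned down a column. The sketch below is my own minimal version, not the replication code from the paper; the sample sentence is invented.

```python
def kwic(text, keyword, width=20):
    """Print-ready KWIC lines: keyword centred with left/right context columns."""
    words = text.split()
    lines = []
    for i, w in enumerate(words):
        if w.lower().strip(".,;") == keyword.lower():
            left = " ".join(words[max(0, i - 4):i])
            right = " ".join(words[i + 1:i + 5])
            lines.append(f"{left[-width:]:>{width}}  {w}  {right[:width]}")
    return lines

sample = ("Too much information makes information retrieval hard, "
          "so Luhn built tools to index information automatically.")
for line in kwic(sample, "information"):
    print(line)
```

Right-aligning the left context is what gives a KWIC concordance its characteristic centred-keyword column.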
Project to digitise and publish his marginalia online will allow scholars to see his cutting remarks on Ralph Waldo Emerson
The Guardian has a story on an interesting digital humanities project, JS Mill scribbles reveal he was far from a chilly Victorian intellectual. The project, Mill Marginalia Online, is digitizing an estimated 40,000 comments, doodles, and other marks that John Stuart Mill wrote in his collection of 1,700 books, now at Somerville College, Oxford. His collection was donated to Somerville 30 years after his death in 1905 because the women of the college weren’t allowed to access the Oxford libraries at the time.
His comments are not just scholarly notes. For example, above is an image of the title page of Emerson’s Essays to which Mill added text in order to mock it. The title page, with Mill’s pencilled-in elaboration alongside the original, reads:
Sentimental Essays: in the art of
Sense and Nonsense:
R. W. Emerson,
of Concord, Massachusetts.
A clever + well organised youth brought up
in the old traditions.
In thought “all’s fish that comes to net.”
With Fog Preface
By Thomas Carlyle.
“Patent Divine-light Self-acting Foggometer”
To the Court of
Her mAJESTy Queen Vic.
A JEST indeed. The Daily Nous has an article on this with the title, Mill’s Myriad Marginalia: Mundane, Mysterious, Mocking.
All this from Humanist.