3quarksdaily, one of the better web sites for extracts of interesting essays, pointed me to the essay "Are Algorithms Building the New Infrastructure of Racism?" by Aaron M. Bornstein in Nautilus (Dec. 21, 2017). The article reviews some of the terrain covered by Cathy O’Neil’s book Weapons of Math Destruction, but it also points out how AIs are becoming infrastructure, and infrastructure with bias baked in is very hard to change, like the low bridges that Robert Moses built to make it hard for public transit to reach certain areas of NYC. Algorithmic decisions that are biased and visible can be studied and corrected. Decisions that get built into infrastructure disappear from view and become much harder to fix.
a fundamental question in algorithmic fairness is the degree to which algorithms can be made to understand the social and historical context of the data they use …
Just as important is paying attention to the data that is used to train the AIs in the first place. Historic data carries the biases of the generations that produced it, and those biases need to be questioned as they get woven into our infrastructure.
Domenico Fiormonte has recently blogged about an interesting document by Father Busa that relates to a difficult moment in the history of the digital humanities in Italy in 2002. The two-page “Conditional Agreement”, which I translate below, was given to Domenico and explained the terms under which Busa would agree to sign a letter to Minister (of Education and Research) Moratti in response to Moratti’s public statement about the uselessness of humanities informatics. The letter was being prepared for signature by a large number of Italian (and foreign) academics and explained the value of what we now call the digital humanities. Busa had the connections to get the letter published and taken seriously, so Domenico visited him to ask for his help. That help ended up being conditional on certain things being made clear, as laid out in the document. Domenico kept the two pages Busa wrote. As he points out in his blog, they are a mini-manifesto of Father Busa’s later views of the place and importance of what he called textual informatics. Domenico also points out how political the context of these notes, and of the letter eventually signed and published, is. Defining the digital humanities is often about positioning the field in the larger academic and public political spheres we operate in.
I’ve just come across some important blog essays by David Gaertner. One is Why We Need to Talk About Indigenous Literature in the Digital Humanities where he argues that colleagues from Indigenous literature are rightly skeptical of the digital humanities because DH hasn’t really taken to heart the concerns of Indigenous communities around the expropriation of data.
I was struck by the number of paper sessions on mapping projects. I don’t know if I have ever seen so many geospatial projects. Many of the papers talked about how mapping is a different way of analyzing the data, whether it is the location of eateries in Roman Pompeii or German construction projects before 1924.
I gave a paper on “Information Wants to Be Free, Or Does It? Ethics in the Digital Humanities.”
The folks at #dariah Teach have put together the first in a series of videos on My Digital Humanities. Despite my appearing in it, the video seems very nicely produced and there is a nice mix of people. Stéfan Sinclair and I were interviewed together, something that isn’t clear in the first part, but will presumably become clear later.
An article about authorship attribution led me to this nice site on Common Errors in English Usage. The site is for a book with that title, but the author, Paul Brians, has organized all the errors into a hypertext here. For example, here is the entry on why you shouldn’t use “enjoy to”.
I gave the first talk on “Tremendous Labour: Busa’s Methods” – a paper coming from the work Stéfan Sinclair and I are doing. I talked about the reconstruction of Busa’s Index project. I claimed that Busa and Tasman made two crucial innovations. The first was figuring out how to represent data on punched cards so that it could be processed (the data structures). The second was figuring out how to use the punched card machines at hand to tokenize unstructured text. I walked through what we know about their actual methods and talked about our attempts to replicate them.
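To make the two innovations a little more concrete, here is a small Python sketch of the general idea: break a line of running text into one record per word (a stand-in for a word card) that remembers where the word came from, then sort the records to get the raw material for an index. This is a modern analogy and not a reconstruction of what Busa and Tasman actually did on the machines; the `WordCard` fields, the function names, and the sample line are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class WordCard:
    """Hypothetical stand-in for one punched card holding a single word."""
    line_ref: str   # where the word came from (assumed reference scheme)
    position: int   # 1-based position of the word within that line
    word: str       # the word itself, lowercased for sorting


def tokenize_line(line_ref, text):
    """Split a line of running text into one WordCard per word."""
    # Runs of letters only; punctuation and digits are dropped.
    words = re.findall(r"[^\W\d_]+", text)
    return [WordCard(line_ref, i, w.lower()) for i, w in enumerate(words, start=1)]


def sort_cards(cards):
    """Order the cards alphabetically by word, then by location."""
    return sorted(cards, key=lambda c: (c.word, c.line_ref, c.position))


# Arbitrary sample line, chosen only for illustration.
cards = tokenize_line("L1", "In principio erat Verbum, et Verbum erat apud Deum")
index = sort_cards(cards)
```

Each card carries its own location, so once the cards are sorted by word every occurrence can still be traced back to its place in the text.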
The Canadian Writing Research Collaboratory (CWRC) today launched its Collaboratory. The Collaboratory is a distributed editing environment that allows projects to edit scholarly electronic texts (using CWRC Writer), manage editorial workflows, and publish collections. There are also links to other tools like CWRC Catalogue and Voyant (which I am involved in). There is an impressive set of projects already featured in CWRC, but it is open to new projects and designed to help them.
Susan Brown deserves a lot of credit for imagining this, writing the CFI (and other) proposals, leading the development and now managing the release. I hope it gets used as it is a fabulous layer of infrastructure designed by scholars for scholars.
One important component in CWRC is CWRC-Writer, an in-browser XML editor that can be hooked into content management systems like the CWRC back-end. It allows for stand-off markup and connects to entity databases for tagging entities in standardized ways.
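For readers who haven't met the term, stand-off markup keeps the annotations separate from the text and points into it by character offsets, instead of embedding tags in the text itself. The Python sketch below shows the general idea only; it is not CWRC-Writer's data model or API, and the `Annotation` class, the tag names, and the `example:` entity identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One stand-off annotation: offsets into the text plus a label."""
    start: int  # index of the first character covered
    end: int    # index just past the last character covered
    tag: str    # kind of entity (hypothetical tag name)
    ref: str    # hypothetical identifier in an entity database


def annotate(text, phrase, tag, ref):
    """Point at the first occurrence of phrase without altering the text."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), tag, ref)


def render_inline(text, annotations):
    """Merge non-overlapping annotations into the text as inline tags.

    No XML escaping is done here; this is only a sketch.
    """
    out = []
    pos = 0
    for a in sorted(annotations, key=lambda a: a.start):
        out.append(text[pos:a.start])
        out.append(f'<{a.tag} ref="{a.ref}">{text[a.start:a.end]}</{a.tag}>')
        pos = a.end
    out.append(text[pos:])
    return "".join(out)


text = "Susan Brown leads CWRC."
annotations = [
    annotate(text, "Susan Brown", "persName", "example:person/1"),
    annotate(text, "CWRC", "orgName", "example:org/1"),
]
inline = render_inline(text, annotations)
```

The source text is never modified, so different sets of annotations can be layered over the same text and only merged into inline tags when a tagged version is wanted.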
The week of the 11th to the 16th of July was Digital Humanities 2016 in Kraków. This conference was, in my opinion, the best organized DH conference I have attended (and I have attended most of them since the first joint ACH-ALLC conference in Toronto in 1989). Jan Rybicki and Maciej Eder deserve credit for a lovely conference.
Diversity. There was a lot of discussion of diversity of different sorts, and there were sessions dedicated to it. Real differences were aired, which I think most people felt was good.
Pedagogy. Perhaps it is what I attended, but it seemed that there was a new energy around pedagogical discussions. I was impressed by the creative approaches and also by the large-scale projects like the Dariah-EU working group on Training and Education.
Web Historiography. There were a number of talks/panels that drew on the web as evidence. I was pleased to see a discussion of the need to think historiographically about the web. What is archived? What is missing?
Some of the events and papers I was involved in include:
New Scholars Symposium, which was supported by CHCI and centerNet. I co-organized this with Rachel Hendry.
Innovations in Digital Humanities Pedagogy: Local, National, and International Training. I was part of a one day mini-conference on training and gave a short presentation on Visualization at the final panel on Publication Approaches Supporting DH Pedagogy.
CWRC & Voyant Tools: Text Repository Meets Text Analysis. I was one of three instructors on a workshop on CWRC and Voyant.
Curating Just-In-Time Datasets from the Web. I gave a paper, coauthored with Todd Suomela and Ryan Chartier, on a project that is scraping Twitter.
The Trace of Theory: Extracting Subsets from Large Collections. I introduced and gave one of the short papers on a panel of work we did as part of the Text Mining the Novel project with the HathiTrust Research Center.
Web Historiography – A New Challenge for Digital Humanities? I gave a short presentation on the Ethics of Scraping Twitter.