Domenico Fiormonte has recently blogged about an interesting document he has by Father Busa that relates to a difficult moment in the history of the digital humanities in Italy in 2002. The two-page “Conditional Agreement”, which I translate below, was given to Domenico and explains the terms under which Busa would agree to sign a letter to Minister (of Education and Research) Moratti in response to Moratti’s public statement about the uselessness of humanities informatics. A letter was being prepared, to be signed by a large number of Italian (and foreign) academics, explaining the value of what we now call the digital humanities. Busa had the connections to get the letter published and taken seriously, which is why Domenico visited him to ask for his help. That help ended up being conditional on certain things being made clear, as laid out in the document. Domenico kept the two pages Busa wrote. As he points out in his blog, these two pages are a mini-manifesto of Father Busa’s later views on the place and importance of what he called textual informatics. Domenico also points out how political the context was, both for these notes and for the letter eventually signed and published. Defining the digital humanities is often about positioning the field in the larger academic and public political spheres we operate in.
Pharos is an effort among 14 institutions to create a database that will eventually hold and make accessible 22 million images of artworks.
The New York Times has a story about a collaboration to develop the Pharos consortium photo archive, ‘Photo Archives Are Sleeping Beauties.’ Pharos Is Their Prince. The consortium has a number of interesting initiatives they are implementing in Pharos:
- They are applying the CIDOC Conceptual Reference Model.
The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.
- They have a visual search (which doesn’t seem to find anything at the moment).
- They are looking at Research Space (which uses CRM) for a research linked data environment.
I’ve just come across some important blog essays by David Gaertner. One is Why We Need to Talk About Indigenous Literature in the Digital Humanities where he argues that colleagues from Indigenous literature are rightly skeptical of the digital humanities because DH hasn’t really taken to heart the concerns of Indigenous communities around the expropriation of data.
The Cloud is an airily deceptive name connoting a floating world far removed from the physical realities of data.
The Gathering Cloud by J. R. Carpenter is a great interactive work that uses Luke Howard’s Essay on the Modification of Clouds from 1803 to meditate on the digital cloud. The work “is a hybrid print- and web-based work by J. R. Carpenter commissioned by NEoN Digital Arts Festival 2016.”
The Globe and Mail has been publishing a fabulous data-driven exposé on how the police categorize one out of five sexual assault reports as unfounded. They have a web essay, Will police believe you?, that summarizes the investigation. There is another article on How The Globe collected and analyzed sexual assault statistics to report on unfounded figures across Canada. While this isn’t big data, it shows the power of data in showing us that there is a problem and prodding police departments to start reviewing their practices.
I just came across a great French project called Transcrire. The Huma-Num Very Large Facility has built a system for the crowdsourcing of transcription of archival materials. It looks like they have built infrastructure for crowdsourcing (or citizen science) in the humanities. Having played around with it, I find it looks very professional.
Bill Robinson has penned a nice essay, Marking 70 years of eavesdropping in Canada. The essay gives the background of Canada’s signals intelligence unit, the Communications Security Establishment (CSE), which just marked its 70th anniversary (on Sept. 1st).
The original unit was the peacetime version of the Joint Discrimination Unit called the CBNRC (Communications Branch of the National Research Council). I can’t help wondering what was meant by “discrimination”?
Unable to read the Soviets’ most secret messages, the UKUSA allies resorted to plain-language (unencrypted) communications and traffic analysis, the study of the external features of messages such as sender, recipient, length, date and time of transmission—what today we call metadata. By compiling, sifting, and fusing a myriad of apparently unimportant facts from the huge volume of low-level Soviet civilian and military communications, it was possible to learn a great deal about the USSR’s armed forces, the Soviet economy, and other developments behind the Iron Curtain without breaking Soviet codes. Plain language and traffic analysis remained key sources of intelligence on the Soviet Bloc for much of the Cold War.
Robinson is particularly interesting on “The birth of metadata collection”, which came about as the Soviets developed encryption that couldn’t be broken.
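To make the idea of traffic analysis concrete, here is a minimal sketch in Python of what can be learned from the external features of messages alone. The records, station names and helper functions are all made up for illustration; nothing here comes from Robinson's essay or reflects how any agency actually works.

```python
from collections import Counter

# Hypothetical metadata records: (sender, recipient, length, hour_sent).
# No message content is used anywhere in this sketch.
messages = [
    ("unit_a", "hq", 120, 6),
    ("unit_a", "hq", 95, 6),
    ("unit_b", "hq", 300, 14),
    ("hq", "unit_a", 40, 7),
    ("unit_a", "depot", 210, 22),
]


def link_counts(records):
    """Count how often each (sender, recipient) pair communicates."""
    return Counter((sender, recipient) for sender, recipient, _, _ in records)


def volume_by_sender(records):
    """Total message length sent by each sender."""
    totals = Counter()
    for sender, _, length, _ in records:
        totals[sender] += length
    return totals


links = link_counts(messages)
volumes = volume_by_sender(messages)

# The busiest link and the chattiest sender hint at who reports to whom,
# even though we never read a single message.
print(links.most_common(1))
print(volumes.most_common(1))
```

Even on this toy data the counts suggest a structure (one station talks mostly to one other), which is the sort of inference the quoted passage describes being made at a much larger scale.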
At the European Summer University in Digital Humanities 2016 I was lucky to be able to attend some sessions on Stylometry run by Maciej Eder. In his historical review he mentioned people like Valla and Mendenhall, but also mentioned a fellow Pole, Wincenty Lutosławski, whose book The origin and growth of Plato’s logic; with an account of Plato’s style and of the chronology of his writings (1897) is the first to use the term “stylometry”. Lutosławski develops a Theory of Stylometry and reviews “500 peculiarities of Plato’s style” as part of his work on Plato’s logic. The nice thing is that the book is available through the Internet Archive.
Eder has a nice page about the work he and others in the Computational Stylistics Group are doing. In the workshop sessions I was able to attend he showed us how to set up and run his “stylo” package (PDF) that provides a simple user interface over R for doing stylometry. He also showed us how to then use Gephi for network visualization.
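For readers who have not seen stylometry in practice, here is a rough sketch of one common measure, Burrows's Delta, which compares texts by the z-scored frequencies of their most frequent words. This is plain Python, not the “stylo” package (which runs in R), and I am not claiming it matches what was run in the workshop. The sample texts, function names and the choice of five words are my own assumptions, and I use the population standard deviation to keep things simple.

```python
import math
from collections import Counter


def most_frequent_words(texts, n):
    """The n most frequent words across the whole (toy) corpus."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return [word for word, _ in counts.most_common(n)]


def relative_freqs(text, vocab):
    """Relative frequency of each vocabulary word in one text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on an empty text
    return [counts[w] / total for w in vocab]


def burrows_delta(texts, n_words=5):
    """Matrix of pairwise Delta distances between texts."""
    vocab = most_frequent_words(texts, n_words)
    freqs = [relative_freqs(t, vocab) for t in texts]
    z = [[0.0] * len(vocab) for _ in texts]
    for j in range(len(vocab)):
        column = [f[j] for f in freqs]
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
        for i in range(len(texts)):
            # A word used identically everywhere carries no signal.
            z[i][j] = (column[i] - mean) / sd if sd > 0 else 0.0

    def delta(a, b):
        # Mean absolute difference of z-scores over the word list.
        return sum(abs(x - y) for x, y in zip(z[a], z[b])) / len(vocab)

    size = len(texts)
    return [[delta(a, b) for b in range(size)] for a in range(size)]


# Made-up sample "texts"; real studies use whole books and hundreds of words.
texts = [
    "the cat sat on the mat and the dog sat on the rug",
    "the dog lay on the rug and the cat lay on the mat",
    "of arms and of the man i sing and of the fates",
]
distances = burrows_delta(texts, n_words=5)
for row in distances:
    print([round(d, 2) for d in row])
```

A distance matrix like this one is the kind of table that can then be turned into a network of texts and visualized in a tool like Gephi.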
Information is Beautiful has a great interactive on World’s Biggest Data Breaches & Hacks. The interactive shows how data breaches are getting worse, but it also lets you look at different types of breaches.
ProPublica has a great op-ed about Making Algorithms Accountable. The story starts from a decision from the Wisconsin Supreme Court on computer-generated risk (of recidivism) scores. The scores used in Wisconsin come from Northpointe, who provide the scores as a service based on a proprietary algorithm that seems biased against blacks and not that accurate. The story highlights the lack of any legislation regarding algorithms that can affect our lives.
Update: ProPublica has responded to a Northpointe critique of their findings.