Sean Gouglas Remembers Stéfan Sinclair

Sean Gouglas shared these memories of Stéfan Sinclair with me and asked me to post them. They are from when they started the Humanities Computing programme at the University of Alberta where I am lucky to now teach.

In the summer of 2001, two newly-minted PhDs started planning how they were going to build and then teach a new graduate program in Humanities Computing at the University of Alberta. This was the first such program in North America. To be absolutely honest, Stéfan Sinclair and I really had no idea what we were doing. The next few months were both exhausting and exhilarating. Working with Stéfan was a professional and personal treat, especially considering that he had an almost infinite capacity for hard work. I remember him coding up the first Humanities Computing website in about seven minutes — the first HuCo logo appearing like a rising sun on a dark blue background. It also had an unfortunate typo that neither of us noticed for years. 

It was an inspiration to work with Stéfan. He was kind and patient with students, demanding a lot from them but giving even more back. He promoted the program passionately at every conference, workshop, and seminar. Over the next three years, there was a lot of coffee, a lot of spicy food, a beer or two, some volleyball, some squash, and then he and Stephanie were off to McMaster for their next adventure. 

Our Digital Humanities program has changed a lot since then — new courses, new programs, new faculty, and even a new name. Through that change, the soul of the program remained the same and it was shaped and molded by the vision and hard work of Stéfan Sinclair. 

On the 6th of August, Stéfan died of cancer. The Canadian Society for Digital Humanities has a lovely tribute, which can be found here: https://csdh-schn.org/stefan-sinclair-in-memoriam/. It was written in part by Geoffrey Rockwell, who worked closely with Stéfan for more than two decades. 

Celebrating Stéfan Sinclair: A Dialogue from 2007

Sadly, last Thursday Stéfan Sinclair passed away. A group of us posted an obituary for CSDH-SCHN, Stéfan Sinclair, In Memoriam, and boy do I miss him already. While the obituary describes the arc of his career, I’ve been trying to think of how to celebrate how he loved to play with ideas and code. The obituary tells the what of his life but doesn’t show the how.

You see, Stéfan loved to toy with ideas of text through the development of software toys. The hermeneuti.ca project started with a one day text analysis vacation/hackathon. We decided to leave all the busy work of being an academic in our offices, and spend a day in the TAPoR lab at McMaster. We decided to mess around and try the analytical equivalent of extreme programming. That included a version of “pair programming” where we alternated one at the keyboard doing the analysis while the other would take notes and direct. We told ourselves we would just devote one day without interruptions to this folly and see if together we could take a project from conception to some sort of finished result in a day.

Little did we know we would still be at play right until a few weeks ago. We failed to finish that day, but we got far enough to know we enjoyed the fooling around enough to do it again and again. Those escapes into what we later called agile hermeneutics, to give it a serious name, eventually led to a monster of a project that reflected back on the play. The project culminated in the jointly authored book Hermeneutica (MIT Press, 2016) and Voyant 2.0, both of which tried to not only think-through some of the potential of the play, but also give others a way of making their own interpretative toys (which we called hermeneutica). But these too are perhaps too serious to commemorate Stéfan’s presence.

Which brings me to the dialogue we wrote and performed on “Reading Tools.” Thanks to Susan I was reminded of this script that we acted out at the University of Illinois, Urbana-Champaign in June of 2007. May it honour how Stéfan would want to be remembered. Imagine him smiling at the front of the room as he starts,

Sinclair: Why do we care so much for the opinions of other humanists? Why do we care so much whether they use computing in the humanities?

Rockwell: Let me tell you an old story. There was once a titan who invented an interpretative technology for his colleagues. No, … he wasn’t chained to a rock to have his liver chewed out daily. … Instead he did the smart thing and brought it to his dean, convinced the technology would free his colleagues from having to interpret texts and let them get back to the real work of thinking.

Sinclair: I imagine his dean told him that in the academy those who develop tools are not the best judges of their inventions and that he had to get his technology reviewed as if it were a book.

Rockwell: Exactly, and the dean said, “And in this instance, you who are the father of a text technology, from a paternal love of your own children have been led to attribute to them a quality which they cannot have; for this discovery of yours will create forgetfulness in the learners’ souls, because they will not study the old ways; they will trust to the external tools and not interpret for themselves. The technology which you have discovered is an aid not to interpretation, but to online publishing.”

Sinclair: Yes, Geoffrey, you can easily tell jokes about the academy, paraphrasing Socrates, but we aren’t outside the city walls of Athens, but in the middle of Urbana at a conference. We have a problem of audience – we are slavishly trying to please the other – that undigitized humanist – why don’t we build just for ourselves? …

Enjoy the full dialogue here: Reading Tools Script (PDF).

OSS advice on how to sabotage organizations or conferences

On Twitter someone posted a link to a 1944 OSS Simple Sabotage Field Manual. This includes simple, but brilliant advice on how to sabotage organizations or conferences.

This sounds a lot like what we academics normally do as a matter of principle. I particularly like the advice to “Make ‘speeches.'” I imagine many will recognize themselves in their less cooperative moments, or their committee meetings, in this list of actions.

The OSS (Office of Strategic Services) was the US office that turned into the CIA.

The Useless Web

The Useless Web Button… just press it, and find where it takes you.

Bettina pointed me to The Useless Web site. It sends you to a useless web site. Examples include The Passive Aggressive Password Machine and Always Judge a Book by its Cover, which shows real books with ridiculous titles (go ahead, follow the link and see if you agree).

My question is whether The Useless Web Button is one of the sites that you could be taken to?

Documenting the Now (and other social media tools/services)

Documenting the Now develops tools and builds community practices that support the ethical collection, use, and preservation of social media content.

I’ve been talking with the folks at MassMine (I’m on their Advisory Board) about tools that can gather information off the web and I was pointed to the Documenting the Now project that is based at the University of Maryland and the University of Virginia with support from Mellon. DocNow have developed tools and services around documenting the “now” using social media. DocNow itself is an “appraisal” tool for Twitter archiving. They also have a great catalog of Twitter archives that they and others have gathered, which looks like it would be great for teaching.

MassMine is at present a command-line tool that can gather different types of social media. They are building a web interface version that will make it easier to use and they are planning to connect it to Voyant so you can analyze results in Voyant. I’m looking forward to something easier to use than Python libraries.

Speaking of which, I found TAGS (Twitter Archiving Google Sheet), which is a plug-in for Google Sheets that can scrape smaller amounts of Twitter. Another accessible tool is Octoparse, which is designed to scrape different database-driven web sites. It is commercial, but has a 14 day trial.

One of the impressive features of the Documenting the Now project is that they are thinking about the ethics of scraping. They have a Social Labels set for people to indicate how data should be handled.

MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs

Vinay Prabhu, chief scientist at UnifyID, a privacy startup in Silicon Valley, and Abeba Birhane, a PhD candidate at University College Dublin in Ireland, pored over the MIT database and discovered thousands of images labelled with racist slurs for Black and Asian people, and derogatory terms used to describe women. They revealed their findings in a paper undergoing peer review for the 2021 Workshop on Applications of Computer Vision conference.

Another one of those “what were they thinking when they created the dataset” stories, this one from The Register, tells how MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs. The MIT Tiny Images dataset was created automatically using scripts that used the WordNet database of terms, which itself held derogatory terms. Nobody thought to check either the terms taken from WordNet or the resulting images scoured from the net. As a result there are not only lots of images for which permission was not secured, but also racist, sexist, and otherwise derogatory labels on the images, which in turn means that if you train an AI on these it will generate racist/sexist results.

The article also mentions a general problem with academic datasets. Companies like Facebook can afford to hire actors to pose for images and can thus secure permissions to use the images for training. Academic datasets (and some commercial ones like the Clearview AI database) tend to be scraped and therefore will not have the explicit permission of the copyright holders or people shown. In effect, academics are resorting to mass surveillance to generate training sets. One wonders if we could crowdsource a training set by and for people?

Internet Archive closes the National Emergency Library

Within a few days of the announcement that libraries, schools and colleges across the nation would be closing due to the COVID-19 global pandemic, we launched the temporary National Emergency Library to provide books to support emergency remote teaching, research activities, independent scholarship, and intellectual stimulation during the closures.  […]

According to the Internet Archive blog, the Temporary National Emergency Library is to close 2 weeks early, returning to traditional controlled digital lending. The National Emergency Library (NEL) was open to anyone in the world during a time when physical libraries were closed. It made books the IA had digitized available to read online. It was supposed to stay open until the end of June, but it is closing early because four commercial publishers decided to sue.

The blog entry points to what the HathiTrust is doing as part of their Emergency Temporary Access Service which lets libraries that are members (and the U of Alberta Library is one) provide access to digital copies of books they have corresponding physical copies of. This is only available to “member libraries that have experienced unexpected or involuntary, temporary disruption to normal operations, requiring it to be closed to the public”. 

It is a pity the IA NEL was discontinued. For a moment there it looked like large public service digital libraries might become normal. Instead it looks like we will have a mix of commercial e-book services and Controlled Digital Lending (CDL) offered by libraries that have the physical books and the digital resources to organize it. The IA blog entry goes on to note that even CDL is under attack. Here is a story from Plagiarism Today:

Though the National Emergency Library may have been what provoked the lawsuit, the complaint itself is much broader. Ultimately, it targets the entirety of the IA’s digital lending practices, including the scanning of physical books to create digital books to lend.

The IA has long held that its practices are covered under the concept of controlled digital lending (CDL). However, as the complaint notes, the idea has not been codified by a court and is, at best, very controversial. According to the complaint, the practice of scanning a physical book for digital lending, even when the number of copies is controlled, is an infringement.

Finland accepts the Demoscene on its national UNESCO list of intangible cultural heritage of humanity

“Demoskene is an international community focused on demos, programming, graphics and sound creatively real-time audiovisual performances. [..] Subculture is an empowering and important part of identity for its members.”

The Art of Coding has gotten Demoscene listed by Finland in the National Inventory of Living Heritage, as announced in Breakthrough of Digital Culture: Finland accepts the Demoscene on its national UNESCO list of intangible cultural heritage of humanity. This means that Demoscene may be the first form of digital culture put forward to UNESCO as a candidate intangible cultural heritage (ICH).

In a previous blog post I argued that ICH is a form of culture that would be hard to digitize by definition. I could be proved wrong with Demoscene. Or it could be that what makes Demoscene ICH is not the digital demos, but the intangible cultural scene, which is not digital.

Either way, it is interesting to see how digital practices are also becoming intangible culture that could disappear.

You can learn more about Demoscene from these links:

Digitization in an Emergency: Fair Use/Fair Dealing and How Libraries Are Adapting to the Pandemic

In response to unprecedented exigencies, more systemic solutions may be necessary and fully justifiable under fair use and fair dealing. This includes variants of controlled digital lending (CDL), in which books are scanned and lent in digital form, preserving the same one-to-one scarcity and time limits that would apply to lending their physical copies. Even before the new coronavirus, a growing number of libraries have implemented CDL for select physical collections.

The Association of Research Libraries has a blog entry on Digitization in an Emergency: Fair Use/Fair Dealing and How Libraries Are Adapting to the Pandemic by Ryan Clough (April 1, 2020) with good links. The closing of the physical libraries has accelerated a process of moving from a hybrid of physical and digital resources to an entirely digital library. Controlled digital lending (where only a limited number of patrons can read a digital asset at a time) seems a sensible way to go.
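For the programmers among us, the one-to-one rule behind CDL is simple enough to sketch in a few lines of Python. This is only my illustration of the idea as described in the ARL entry, not how any library or the IA actually implements it. The class name `CdlTitle`, the 14 day loan period, and the method names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class CdlTitle:
    """One title lent under a hypothetical owned-to-loaned rule."""
    owned_copies: int                              # physical copies the library holds
    loan_period: timedelta = timedelta(days=14)    # assumed loan length
    loans: dict = field(default_factory=dict)      # patron -> due time

    def _expire(self, now):
        # Drop loans whose time limit has passed, freeing those copies.
        self.loans = {p: due for p, due in self.loans.items() if due > now}

    def checkout(self, patron, now):
        """Lend a digital copy only while fewer copies are out than are owned."""
        self._expire(now)
        if patron in self.loans or len(self.loans) >= self.owned_copies:
            return False
        self.loans[patron] = now + self.loan_period
        return True

    def checkin(self, patron):
        # Returning early frees the copy for the next patron.
        self.loans.pop(patron, None)


# Usage: a library owning one physical copy can have one digital loan out.
title = CdlTitle(owned_copies=1)
start = datetime(2020, 4, 1)
first = title.checkout("patron-a", start)    # a copy is free
second = title.checkout("patron-b", start)   # refused, the only copy is out
```

The point of the sketch is that the digital copy inherits the scarcity of the physical one: the second patron has to wait until the first loan is returned or expires.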

To be honest, I am so tired of sitting on my butt that I plan to spend much more time walking to and browsing around the library at the University of Alberta. As much as digital access is a convenience, I’m missing the occasions for getting outside and walking that a library affords. Perhaps we should think of the library as a labyrinth – something deliberately difficult to navigate in order to give you an excuse to walk around.

Perhaps I need a book scanner on a standing desk at home to keep me on my feet.

Codecademy vs. The BBC Micro

The Computer Literacy Project, on the other hand, is what a bunch of producers and civil servants at the BBC thought would be the best way to educate the nation about computing. I admit that it is a bit elitist to suggest we should laud this group of people for teaching the masses what they were incapable of seeking out on their own. But I can’t help but think they got it right. Lots of people first learned about computing using a BBC Micro, and many of these people went on to become successful software developers or game designers.

I’ve just discovered Two-Bit History (0b10), a series of long and thorough blog essays on the history of computing by Sinclair Target. One essay is on Codecademy vs. The BBC Micro. The essay gives the background of the BBC Computer Literacy Project that led the BBC to commission a suitable microcomputer, the BBC Micro. He then uses this history to compare the way the BBC literacy project taught a nation (the UK) computing to the way Codecademy does now. The BBC project comes out better as it doesn’t drop immediately into programming without explanation, something Codecademy does.

I should add that the early 1980s was a period when many constituencies developed their own computer systems, not just the BBC. In Ontario the Ministry of Education launched a process that led to the ICON which was used in Ontario schools in the mid to late 1980s.