Modelling Cultural Processes

Mt Fuji with the setting behind

Sitting on a hill with a view of Mt. Fuji across the water is the Shonan Village Center where I just finished a research retreat on Modelling Cultural Processes. This was organized by Mits Inaba, Martin Roth, and Gehard Heyer from Ritsumeikan University and the University of Leipzig. It brought together people in computing, linguistics, game studies, political science, literary studies and the digital humanities. My conference notes are here.

Unlike a conference, much of the time was spent in working groups discussing issues like identity, shifting content, and constructions of culture. As part of our working groups we developed a useful model of the research process across the humanities and social sciences such that we can understand where there are shifts in content.

Mt Fuji in the distance across the water

A Critical Assessment of the Movement for Ethical Artificial Intelligence and Machine Learning

Greene, Hoffmann, and Stark have written a much needed conference paper on Better, Nicer, Clearer, Fairer: A Critical Assessment of the Movement for Ethical Artificial Intelligence and Machine Learning (PDF) for the Hawaii International Conference on System Sciences in Maui, HI. They look at a number of the important ethics statements/declarations out there and try to understand their “moral background.” Here is the abstract:

This paper uses frame analysis to examine recent high-profile values statements endorsing ethical design for artificial intelligence and machine learning (AI/ML). Guided by insights from values in design and the sociology of business ethics, we uncover the grounding assumptions and terms of debate that make some conversations about ethical design possible while forestalling alternative visions. Vision statements for ethical AI/ML co-opt the language of some critics, folding them into a limited, technologically deterministic, expert-driven view of what ethical AI/ML means and how it might work.

I get the feeling that various outfits (of experts) are trying to define what ethics in AI/ML is rather then engaging in a dialogue. There is a rush to be the expert on ethics. Perhaps we should imagine a different way of developing an ethical consensus.

For that matter, is there room for critical positions? What it would mean to call for a stop all research into AI/ML as unethical until proven otherwise? Is that even thinkable? Can we imagine another way that the discourse of ethics might play out?

This article is a great start.

Burrows and Antonia Archives: Centre For 21st Century Humanities

What happens to old digital humanities projects? Most vanish without a trace. Some get archived like the work of John Burrows and others at the Centre For Literary And Linguistic Computing (CLLC). Dr. Alexis Antonia kept an archive of CLLC materials which is now available from the Centre For 21st Century Humanities.

The structure of recent philosophy (II) · Visualizations

In this codebook we will investigate the macro-structure of philosophical literature. As a base for our investigation I have collected about fifty-thousand reco

Stéfan sent me a link to this interesting post, The structure of recent philosophy (II) · Visualizations. Maximilian Noichl has done a fascinating job using the Web of Science to develop a model of the field of Philosophy since the 1950s. In this post he describes his method and the resulting visualization of clusters (see above). In a later post (version III of the project) he gets a more nuanced visualization that seems more true to the breadth of what people do in philosophy. The version above is heavily weighted to anglo-american analytic philosophy while version III has more history of philosophy and continental philosophy.

Here is the final poster (PDF) for version III.

I can’t help wondering if his snowball approach doesn’t bias the results. What if one used full text of major journals?

AI Weirdness

I just came across a neat site called AI Weirdness. The site describes all sorts of “weird” experiments in learning neural networks. Some examples:

The site has a nice FAQ that describes her tools and how to learn how to do it.

Franken-algorithms: the deadly consequences of unpredictable code

The death of a woman hit by a self-driving car highlights an unfolding technological crisis, as code piled on code creates ‘a universe no one fully understands’

The Guardian has a good essay by Andrew Smith about Franken-algorithms: the deadly consequences of unpredictable code. The essay starts with the obvious problems of biased algorithms like those documented by Cathy O’Neil in Weapons of Math Destruction. It then goes further to talk about cases where algorithms are learning on the fly or are so complex that their behaviour becomes unpredictable. An example is high-frequency trading algorithms that trade on the stock market. These algorithmic traders try to outwit each other and learn which leads to unpredictable “flash crashes” when they go rogue.

The problem, he (George Dyson) tells me, is that we’re building systems that are beyond our intellectual means to control. We believe that if a system is deterministic (acting according to fixed rules, this being the definition of an algorithm) it is predictable – and that what is predictable can be controlled. Both assumptions turn out to be wrong.

The good news is that, according to one of the experts consulted this could lead to “a golden age for philosophy” as we try to sort out the ethics of these autonomous systems.

CSDH and CGSA 2018

This year we had busy CSDH and CGSA meetings at Congress 2018 in Regina. My conference notes are here. Some of the papers I was involved in include:


  • “Code Notebooks: New Tools for Digital Humanists” was presented by Kynan Ly and made the case for notebook-style programming in the digital humanities.
  • “Absorbing DiRT: Tool Discovery in the Digital Age” was presented by Kaitlyn Grant. The paper made the case for tool discovery registries and explained the merger of DiRT and TAPoR.
  • “Splendid Isolation: Big Data, Correspondence Analysis and Visualization in France” was presented by me. The paper talked about FRANTEXT and correspondence analysis in France in the 1970s and 1980s. I made the case that the French were doing big data and text mining long before we were in the Anglophone world.
  • “TATR: Using Content Analysis to Study Twitter Data” was a poster presented by Kynan Ly, Robert Budac, Jason Bradshaw and Anthony Owino. It showed IPython notebooks for analyzing Twitter data.
  • “Climate Change and Academia – Joint Panel with ESAC” was a panel I was on that focused on alternatives to flying for academics.


  • “Archiving an Untold History” was presented by Greg Whistance-Smith. He talked about our project to archive John Szczepaniak’s collection of interviews with Japanese game designers.
  • “Using Salience to Study Twitter Corpora” was presented by Robert Budac who talked about different algorithms for finding salient words in a Twitter corpus.
  • “Political Mobilization in the GG Community” was presented by ZP who talked about a study of a Twitter corpus that looked at the politics of the community.

Also, a PhD student I’m supervising, Sonja Sapach, won the CSDH-SCHN (Canadian Society for Digital Humanities) Ian Lancashire Award for Graduate Student Promise at CSDHSCHN18 at Congress. The Award “recognizes an outstanding presentation at our annual conference of original research in DH by a graduate student.” She won the award for a paper on “Tagging my Tears and Fears: Text-Mining the Autoethnography.” She is completing an interdisciplinary PhD in Sociology and Digital Humanities. Bravo Sonja!

Re-Imagining Education In An Automating World conference at George Brown

On May 25th I had a chance to attend a gem of a conference organized the Philosophy of Education (POE) committee at George Brown. They organized a conference with different modalities from conversations to formal talks to group work. The topic was Re-Imagining Education in An Automating World (see my conference notes here) and this conference is a seed for a larger one next year.

I gave a talk on Digital Citizenship at the end of the day where I tried to convince people that:

  • Data analytics are now a matter of citizenship (we all need to understand how we are being manipulated).
  • We therefore need to teach data literacy in the arts and humanities, so that
  • Students are prepared to contribute to and critique the ways analytics are used deployed.
  • This can be done by integrating data and analytical components in any course using field-appropriate data.


Too Much Information and the KWIC

A paper that Stéfan Sinclair and wrote about Peter Luhn and the Keyword-in-Context (KWIC) has just been published by the Fudan Journal of the Humanities and Social Sciences, Too Much Information and the KWIC | SpringerLink. The paper is part of a series that replicates important innovations in text technology, in this case, the development of the KWIC by Peter Luhn at IBM. We use that as a moment to reflect on the datafication of knowledge after WW II, drawing on Lyotard.

Duplex shows Google failing at ethical and creative AI design

Google CEO Sundar Pichai milked the woos from a clappy, home-turf developer crowd at its I/O conference in Mountain View this week with a demo of an in-the-works voice assistant feature that will e…

A number of venues, including TechCruch have discussed the recent Google demonstration of an intelligent agent Duplex who can make appointments. Many of the stories note how Duplex shows Google failing at ethical and creative AI design. The problem is that the agent didn’t (at least during the demo) identify as a robot. Instead it appeared to deceive the person it was talking to. As the TechCrunch article points out, there is really no good reason to deceive if the purpose is to make an appointment.

What I want to know is what are the ethics of dealing with a robot? Do we need to identify as human to the robot? Do we need to be polite and give them the courtesy that we would a fellow human? Would it be OK for me to hang up as I do on recorded telemarketing calls? Most of us have developed habits of courtesy when dealing with people, including strangers, that the telemarketers take advantage of in their scripts. Will the robots now take advantage of that? Or, to be more precise, will those that use the robots to save their time take advantage of us?

A second question is how Google considers the ethical implications of their research? It is easy to castigate them for this demonstration, but the demonstration tells us nothing about a line of research that has been going on for a while and what processes Google may have in place to check the ethics of what they do. As companies explore the possibilities for AI, how are they to check their ethics in the excitement of achievement?

I should note that Google’s parent Alphabet has apparently dropped the “Don’t be evil” motto from their code of conduct. There has also been news about how a number of employees quit over a Google program to apply machine learning to drone footage for the military.  This is after over 3000 Google employees signed a letter taking issue with the project. See also the Open Letter in Support of Google Employees and Tech Workers that researchers signed. As they say:

We are also deeply concerned about the possible integration of Google’s data on people’s everyday lives with military surveillance data, and its combined application to targeted killing. Google has moved into military work without subjecting itself to public debate or deliberation, either domestically or internationally. While Google regularly decides the future of technology without democratic public engagement, its entry into military technologies casts the problems of private control of information infrastructure into high relief.