Ethics in the Age of Smart Systems

Today was the third day of a symposium I helped organize on Ethics in the Age of Smart Systems. For this we experimented with first organizing a “dialogue” or informal paper and discussion on a topic around AI ethics once a month. These led into the symposium that ran over three days. We allowed for an ongoing conversation after the formal part of the event each day. We were also lucky that the keynotes were excellent.

  • Veena Dubal talked about Proposition 22 and how it has created a new employment category of those managed by algorithm (gig workers.) She talked about how this is a new racial wage code as most of the Uber/Lyft workers are people of colour or immigrants.
  • Virginia Dignum talked about how everyone is announcing their principles, but these principles are enough. She talked about how we need standards; advisory panels and ethics officers; assessment lists (checklists); public awareness; and participation.
  • Rafael Capurro gave a philosophical paper about the smart in smart living. He talked about metis (the Greek for cunning) and different forms of intelligence. He called for hesitation in the sense of taking time to think about smart systems. His point was that there are time regimes of hype and determinism around AI and we need to resist them and take time to think freely about technology.

Digital humanities – How data analysis can enrich the liberal arts

But despite data science’s exciting possibilities, plenty of other academics object to it

The Economist has a nice Christmas Special on the Digital humanities – How data analysis can enrich the liberal arts. The article tells a bit of our history (starting with Busa, of course) and gives examples of new work like that of Ted Underwood. The note criticism about how DH may be sucking up all the money or corrupting the humanities, but they also point out how little DH gets from the NEH pot (some $60m out of $16bn) which is hardly evidence of a take over. The truth is, as they note, that the humanities are under attack again and the digital humanities don’t make much of a difference either way. The neighboring fields that I see students moving to are media arts, communication studies and specializations like criminology. Those are the threats, but also sanctuaries for the humanities.


Gather is a video-calling space that lets multiple people hold separate conversations in parallel, walking in and out of those conversations just as easily as they would in real life.

Kisha introduced me to Gather, a cross between Second Life and Zoom. If you have a Gather account you can create a space – your own little classroom with different gathering spots. People then move around these 8-bit animated spaces and when they are in hearing distance they can video conference. Users can also read posters put up, or documents left around, or watch videos created for a space. It actually looks like a nice type of space for a class to use as an alternative to Zoom.

A Digital Project Handbook

A peer-reviewed, open resource filling the gap between platform-specific tutorials and disciplinary discourse in digital humanities.

From a list I am on I learned about Visualizing Objects, Places, and Spaces: A Digital Project Handbook. This is a highly modular text book that covers a lot of the basics about project management in the digital humanities. They have a call now for “case studies (research projects) and assignments that showcase archival, spatial, narrative, dimensional, and/or temporal approaches to digital pedagogy and scholarship.” The handbook is edited by Beth Fischer (Postdoctoral Fellow in Digital Humanities at the Williams College Museum of Art) and Hannah Jacobs (Digital Humanities Specialist, Wired! Lab, Duke University), but parts are authored by all sorts of people.

What I like about it is the way they have split up the modules and organized things by the type of project. They also have deadlines which seem to be for new iterations of materials and for completion of different parts. This could prove to be a great resource for teaching project management.

Why basing universities on digital platforms will lead to their demise – Infolet

I’m republishing here a blog essay originally in Italian that Domenico Fiormonte posted on Infolet that is worth reading,

Why basing universities on digital platforms will lead to their demise

By Domenico Fiormonte

(All links removed. They can be found in the original post – English Translation by Desmond Schmidt)

A group of professors from Italian universities have written an open letter on the consequences of using proprietary digital platforms in distance learning. They hope that a discussion on the future of education will begin as soon as possible and that the investments discussed in recent weeks will be used to create a public digital infrastructure for schools and universities.

Dear colleagues and students,

as you already know, since the COVID-19 emergency began, Italian schools and universities have relied on proprietary platforms and tools for distance learning (including exams), which are mostly produced by the “GAFAM” group of companies (Google, Apple, Facebook, Microsoft and Amazon). There are a few exceptions, such as the Politecnico di Torino, which has adopted instead its own custom-built solutions. However, on July 16, 2020 the European Court of Justice issued a very important ruling, which essentially says that US companies do not guarantee user privacy in accordance with the European General Data Protection Regulation (GDPR). As a result, all data transfers from the EU to the United States must be regarded as non-compliant with this regulation, and are therefore illegal.

A debate on this issue is currently underway in the EU, and the European Authority has explicitly invited “institutions, offices, agencies and organizations of the European Union to avoid transfers of personal data to the United States for new procedures or when securing new contracts with service providers.” In fact the Irish Authority has explicitly banned the transfer of Facebook user data to the United States. Finally, some studies underline how the majority of commercial platforms used during the “educational emergency” (primarily G-Suite) pose serious legal problems and represent a “systematic violation of the principles of transparency.”

In this difficult situation, various organizations, including (as stated below) some university professors, are trying to help Italian schools and universities comply with the ruling. They do so in the interests not only of the institutions themselves, but also of teachers and students, who have the right to study, teach and discuss without being surveilled, profiled and catalogued. The inherent risks in outsourcing teaching to multinational companies, who can do as they please with our data, are not only cultural or economic, but also legal: anyone, in this situation, could complain to the privacy authority to the detriment of the institution for which they are working.

However, the question goes beyond our own right, or that of our students, to privacy. In the renewed COVID emergency we know that there are enormous economic interests at stake, and the digital platforms, which in recent months have increased their turnover (see the study published in October by Mediobanca), now have the power to shape the future of education around the world. An example is what is happening in Italian schools with the national “Smart Class” project, financed with EU funds by the Ministry of Education. This is a package of “integrated teaching” where Pearson contributes the content for all the subjects, Google provides the software, and the hardware is the Acer Chromebook. (Incidentally, Pearson is the second largest publisher in the world, with a turnover of more than 4.5 billion euros in 2018.) And for the schools that join, it is not possible to buy other products.

Finally, although it may seem like science fiction, in addition to stabilizing proprietary distance learning as an “offer”, there is already talk of using artificial intelligence to “support” teachers in their work.

For all these reasons, a group of professors from various Italian universities decided to take action. Our initiative is not currently aimed at presenting an immediate complaint to the data protection officer, but at avoiding it, by allowing teachers and students to create spaces for discussion and encourage them to make choices that combine their freedom of teaching with their right to study. Only if the institutional response is insufficient or absent, we will register, as a last resort, a complaint to the national privacy authority. In this case the first step will be to exploit the “flaw” opened by the EU court ruling to push the Italian privacy authority to intervene (indeed, the former President, Antonello Soro, had already done so, but received no response). The purpose of these actions is certainly not to “block” the platforms that provide distance learning and those who use them, but to push the government to finally invest in the creation of a public infrastructure based on free software for scientific communication and teaching (on the model of what is proposed here and
which is already a reality for example in France, Spain and other European countries).

As we said above, before appealing to the national authority, a preliminary stage is necessary. Everyone must write to the data protection officer (DPO) requesting some information (attached here is the facsimile of the form for teachers we have prepared). If no response is received within thirty days, or if the response is considered unsatisfactory, we can proceed with the complaint to the national authority. At that point, the conversation will change, because the complaint to the national authority can be made not only by individuals, but also by groups or associations. It is important to emphasize that, even in this avoidable scenario, the question to the data controller is not necessarily a “protest” against the institution, but an attempt to turn it into a better working and study environment for everyone, conforming to European standards.

Guido Milanese: Filologia, letteratura, computer

Cover of the book "Filologia, Letteratura, Computer"
Philology, Literature, Computer: Ideas and instruments for humanistic informatics

Un manuale ampio ed esauriente che illustra tra teoria e prassi il tema dell’informatica umanistica per l’insegnamento e l’apprendimento universitario.

The publisher (Vita e Pensiero) kindly sent me a copy of Guido Milanese’s Filologia, letteratura, computer (Philology, Literature, Computer), an introduction to thinking about and thinking through the computer and texts. The book is designed to work as a text book that introduces students to the ideas and to key technologies, and then provides short guides to further ideas and readings.

The book focuses, as the title suggests, almost exclusively on digital filology or the computational study of texts. At the end Milanese has a short section on other media, but he is has chosen, rightly I think, to focus on set of technologies in depth rather than try a broad overview. In this he draws on an Italian tradition that goes back to Father Busa, but more importantly includes Tito Orlandi (who wrote the preface) and Numerico, Fiormonte, and Tomasi’s L’umanista digitale (this has been translated into English- see The digital humanist).

Milanese starts with the principle from Giambattista Vico that knowledge is made (verum ipsum factum.) Milanese believes that “reflection on the foundations identifies instruments and operations, and working with instruments and methods leads redefining the reflection on foundations.” (p. 9 – my rather free translation) This is virtuous circle in the digital humanities of theorizing and praxis where either one alone would be barren. Thus the book is not simply a list of tools and techniques one should know, but a series of reflections on humanistic knowledge and how that can be implemented in tools/techniques which in turn may challenge our ideas. This is what Stéfan Sinclair and I have been calling “thinking-through” where thinking through technology is both a way of learning about the thinking and about the technology.

An interesting example of this move from theory to praxis is in chapter 7 on “The Markup of Text.” (“La codifica del testo”) He moves from a discussion of adding metadata to the datafied raw text to Minsky’s idea of frames of knowledge as a way of understanding XML. I had never thought of Minsky’s ideas about articial intelligence contributing to the thinking behind XML, and perhaps Milanese is the first to do so, but it sort of works. The idea, as I understand it, goes something like this – human knowing, which Minsky wants to model for AI, brings frames of knowledge to any situation. If you enter a room that looks like a kitchen you have a frame of knowledge about how kitchens work that lets you infer things like “there must be a fridge somewhere which will have a snack for me.” Frames are Minsky’s way of trying to overcome the poverty of AI models based on collections of logical statements. It is a way of thinking about and actually representing the contextual or common sense knowledge that we bring to any situation such that we know a lot more than what is strictly in sight.

Frame systems are made up of frames and connections to other frames. The room frame connects hierarchically to the kitchen-as-a-type-of-room frame which connects to the fridge frame which then connects to the snack frame. The idea then is to find a way to represent frames of knowledge and their connections such that they can be used by AI systems. This is where Milanese slides over to XML as a hierarchical way of adding metadata to a text that enriches it with a frame of knowledge. I assume the frame (or Platonic form?) would be the DTD or Schema which then lets you do some limited forms of reasoning about an instance of an encoded text. The markup explicitly tells the computer something about the parts of the text like this (<author>Guido Milanese</author>) is the author.

The interesting thing is to refect on this application of Minsky’s theory. To begin, I wonder if it is historically true that the designers of XML (or its parent SGML) were thinking of Minsky’s frames. I doubt it, as SGML is descended from GML that predates Minsky’s 1974 Memo on “A Framework for Representing Knowledge.” That said, what I think Milanese is doing is using Minsky’s frames as a way of explaining what we do when modelling a phenomena like a text (and our knowledge of it.) Modelling is making explicit a particular frame of knowledge about a text. I know that certain blocks are paragraphs so I tag them as such. I also model in the sense of create a paradigmatic version of what my perspective on the text is. This would be the DTD or Schema which defines the parts and their potential relationships. Validating a marked up text would be a way of testing the instance against the model.

This nicely connects back to Vico’s knowing is making. We make digital knowledge not by objectively representing the world in digital form, but by creating frames or models for what can be digital known and then apply those frames to instances. It is a bit like object-oriented programming. You create classes that frame what can be represented about a type of object.

There is an attractive correspondence between the idea of knowledge as a hierarchy of frames and an XML representation of a text as a hierarchy of elements. There is a limit, however, to the move. Minsky was developing a theory of knowing such that knowledge could be artificially represented on a computer that could then do knowing (in the sense of complete AI tasks like image recognition.) Markup and marking up strike me as more limited activities of structuring. A paragraph tag doesn’t actually convey to the computer all that we know about paragraphs. It is just a label in a hierarchy of labels to which styles and processes can be attached. Perhaps the human modeller is thinking about texts in all their complexity, but they have to learn not to confuse what they know with what they can model for the computer. Perhaps a human reader of the XML can bring the frames of knowledge to reconstitute some of what the tagger meant, but the computer can’t.

Another way of thinking about this would be Searle’s Chinese room paradox. The XML is the bits of paper handed under the door in Chinese for the interpreter in the room. An appropriate use of XML will provoke the right operations to get something out (like a legible text on the screen) but won’t mean anything. Tagging a string with <paragraph> doesn’t make it a real paragraph in the fullness of what is known of paragraphs. It makes it a string of characters with associated metadata that may or may not be used by the computer.

Perhaps these limitations of computing is exactly what Milanese wants us to think about in modelling. Frames in the sense of picture frames are a device for limiting the view. For Minsky you can have many frames with which to make sense of any phenomena – each one is a different perspective that bears knowledge, sometimes contradictory. When modelling a text for the computer you have to decide what you want to represent and how to do it so that users can see the text through your frame. You aren’t helping the computer understand the text so much as representing your interpretation for other humans to use and, if they read the XML, re-interpret. This is making a knowing.


Milanese, G. (2020). Filologia, Letteratura, Computer: Idee e strumenti per l’informatica umanistica. Milan, Vita e Pensiero.

Minsky, M. (1974, June). A Framework for Representing Knowledge. MIT-AI Laboratory Memo 306. MIT.

Searle, J. R. (1980). “Minds, Brains and Programs.” Behavioral and Brain Sciences. 3:3. 417-457.

Conference: Artificial Intelligence for Information Accessibility

AI for Society and the Kule Institute for Advanced Research helped organize a conference on Artificial Intelligence for Information Accessibility (AI4IA) on September 28th, 2020. This conference was organized on the International Day for Universal Access to Information which is why the focus was on how AI can be important to access to information. An important partner in the conference was the UNESCO Information For All Programme (IFAP) Working Group on Information Accessibility (WGIA)

International Day for Universal Access to Information focused on the right to information in times of crisis and on the advantages of having constitutional, statutory and/or policy guarantees for public access to information to save lives, build trust and help the formulation of sustainable policies through and beyond the COVID-19 crisis. Speakers talked about how vital access to accurate information is in these pandemic times and the role artificial intelligence could play as we prepare for future crises. Tied to this was a discussion of the important role for international policy initiatives and shared regulation in ensuring that smaller countries, especially in the Global South, benefit from developments in AI. The worry is that some countries won’t have the digital literacy or cadre of experts to critically guide the introduction of AI.

The AI4S Associate Director, Geoffrey Rockwell, kept conference notes on the talks here,  Conference Notes on AI4AI 2020.

Ryan Cordell: Programmable Type: the Craft of Printing, the Craft of Code

A line of R code set in movable type

I want situate the kinds of programming typically practiced in digital humanities research and teaching in relation to practices more familiar to book historians and bibliographers, such as the work of compositors and printers working with moveable type.

Ryan Cordell sent me a link to a talk on  Programmable Type: the Craft of Printing, the Craft of Code. The talk looks at the “modes of thought and labor” of composing movable type and programming. He is careful to warns us about the simplistic story that has movable type and the computer as two information technologies that caused revolutions in how we think about knowledge. What is particularly interesting is how he weaves hands-on work into his course Technologies of Text. He asks students to not just read about printing, but to try doing it. Likewise for programming in R. There is a knowing that comes from doing something and attending to the labor of that doing. Replicating the making of texts gives students (and researchers) a sense of the materiality and contexts of media. It is a way of doing media archaeology.

In the essay, Cordell writes about the example of the visual poem “A Dude” and its many iterations composed with different type. I had blogged about “A Dude”, but hadn’t thought about how the poem would have been a way for the compositor to show of their craft much like a twitterbot might be a way for a programmer to show off theirs.

Cordell frames this discussion by considering the controversy around whether digital humanists should need to be able to code. He raises an interesting challenge – whether learning the craft of programming (or letterpress printing) might make it harder to view the craft critically. In committing time and labour to learning a craft does one get implicated or corrupted by the craft? Doesn’t one want end up valuing the craft simply because it is something one can now do, and to critique it would be to critique oneself.


Why Uber’s business model is doomed

Like other ridesharing companies, it made a big bet on an automated future that has failed to materialise, says Aaron Benanav, a researcher at Humboldt University

Aaron Benanav has an important opinion piece in The Guardian about Why Uber’s business model is doomed. Benanav argues that Uber and Lyft’s business model is to capture market share and then ditch the drivers they have employed for self-driving cars as they become reliable. In other words they are first disrupting the human taxi services so as to capitalize on driverless technology when it comes. Their current business is losing money as they feast on venture capital to get market share and if they can’t make the switch to driverless it is likely they go bankrupt.

This raises the question of whether we will see driverless technology good enough to oust the human drivers? I suspect that we will see it for certain geo-fenced zones where Uber and Lyft can pressure local governments to discipline the streets so as to be safe for driverless. In countries with chaotic and hard to accurately map streets (think medieval Italian towns) it may never work well enough.

All of this raises the deeper ethical issue of how driverless vehicles in particular and AI in general are being imagined and implemented. While there may be nothing unethical about driverless cars per se, there IS something unethical about a company deliberately bypassing government regulations, sucking up capital, driving out the small human taxi businesses, all in order to monopolize a market that they can then profit on by firing the drivers that got them there for driverless cars. Why is this the way AI is being commercialized rather than trying to create better public transit systems or better systems for helping people with disabilities? Who do we hold responsible for the decisions or lack of decisions that sees driverless AI technology implemented in a particularly brutal and illegal fashion. (See Benanav on the illegality of what Uber and Lyft are doing by forcing drivers to be self-employed contractors despite rulings to the contrary.)

It is this deeper set of issues around the imagination, implementation, and commercialization of AI that needs to be addressed. I imagine most developers won’t intentionally create unethical AIs, but many will create cool technologies that are commercialized by someone else in brutal and disruptive ways. Those commercializing and their financial backers (which are often all of us and our pension plans) will also feel no moral responsibility because we are just benefiting from (mostly) legal innovative businesses. Corporate social responsibility is a myth. At most corporate ethics is conceived of as a mix of public relations and legal constraints. Everything else is just fair game and the inevitable disruptions in the marketplace. Those who suffer are losers.

This then raises the issue of the ethics of anticipation. What is missing is imagination, anticipation and planning. If the corporate sector is rewarded for finding ways to use new technologies to game the system, then who is rewarded for planning for the disruption and, at a minimum, lessening the impact on the rest of us? Governments have planning units like city planning units, but in every city I’ve lived in these units are bypassed by real money from developers unless there is that rare thing – a citizen’s revolt. Look at our cities and their spread – despite all sorts of research and a history of spread, there is still very little discipline or planning to constrain the developers. In an age when government is seen as essentially untrustworthy planning departments start from a deficit of trust. Companies, entrepreneurs, innovation and yes, even disruption, are blessed with innocence as if, like children, they just do their thing and can’t be expected to anticipate the consequences or have to pick up after their play. We therefore wait for some disaster to remind everyone of the importance of planning and systems of resilience.

Now … how can teach this form of deeper ethics without sliding into political thought?

Automatic grading and how to game it

Edgenuity involves short answers graded by an algorithm, and students have already cracked it

The Verge has a story on how students are figuring out how to game automatic marking systems like Edgenuity. The story is titled, These students figured out their tests were graded by AI — and the easy way to cheat. The story describes a keyword salad approach where you just enter a list of words that the grader may be looking for. The grader doesn’t know whether what your wrote is legible or nonsense, it just looks for the right words. The students in turn get good as skimming the study materials for the keywords needed (or find lists shared by other students online.)

Perhaps we could build a tool called Edgenorance which you could feed the study materials to and it would generate the keyword list automatically. It could watch the lectures for you, do the speech recognition, then extract the relevant keywords based on the text of the question.

None of this should be surprising. Companies have been promoting algorithms that were probably word based for a while. The algorithm works if it is not understood and thus not gamed. Perhaps we will get AIs that can genuinely understand a short paragraph answer and assess it, but that will be close to an artificial general intelligence and such an AGI will change everything.