Theoreti.ca – Page 4 – Research notes taken on subjects around multimedia, electronic texts, and computer games.

Humanitext Antiqua

Here at DH 2025 in Lisbon, Portugal, I heard a paper about a neat Japanese project, Humanitext Antiqua – ヒューマニテクスト. It lets you identify an ancient philosophy subcorpus (e.g. Plato) and then ask questions of that. I was able to get all sorts of interesting results since they have trained their system on Greek and Roman philosophers like Aristotle and Cicero.

Here is a reference to the project:

Naoya Iwata, Ikko Tanaka, Jun Ogawa, ‘Improving Semantic Search Accuracy of Classical Texts through Context-Oriented Translation’, Proceedings of IPSJ SIG Computers and the Humanities Symposium. Download link: https://researchmap.jp/n.iwata/published_papers/48448512

Feminist Data Manifest-No

Hannah L. Jacobs presented a great paper on “Critical Refusal, Slowness, and Openness: Possibilities and Challenges in Community-Oriented Digital Archival Initiatives” at DH 2025. She talked about refusing to complete a project once they realized they didn’t really have community approval to share their data. She also pointed to this Feminist Data Manifest-No.

There was a great question about whether one can mention in a grant that one wants to go slow and that the community may refuse to be studied. Our grant system rewards and supports innovation, not slow research. I’m reminded of The Slow Professor. Perhaps it is tenure that makes slowness possible, not grants.

Reviews in Digital Humanities

Home page for Reviews in Digital Humanities, a journal edited by Dr. Jennifer Guiliano and Dr. Roopika Risam

Listening to a great talk by Paul Spence and Maria Jose Afanador-Llach (and others) on “Evaluation models, global diversity and DH”, I learned about Reviews in Digital Humanities. This looks like a really useful resource for students.

More generally their paper highlighted the importance of local evaluation models that are appropriate to the community.

It is clear that the MLA Guidelines that I worked on back in the day (and that have since been updated) are influential, but they need to be adapted.

Colloque « DH@LLM: Grands modèles de langage et humanités numériques » @ IEA & Sorbonne U

DH@LLM: Grands modèles de langage et humanités numériques. Symposium organized by Alexandre Gefen (CNRS-Sorbonne Nouvelle), Glenn Roe (Sorbonne Université), Ayla Rigouts Terryn (Université de Montréal), and Michael Sinatra (Université de Montréal), in collaboration with the Observatoire des textes, des idées et des corpus (ObTIC), the Centre de recherche interuniversitaire sur les humanités numériques (CRIHN), the Institut d’Études […]

Today I gave a keynote to open this symposium on Large Language Models and the digital humanities, Colloque « DH@LLM: Grands modèles de langage et humanités numériques » @ IEA & Sorbonne U. I didn’t talk much about LLMs; instead I talked about “Care and Repair for Responsibility Practices in Artificial Intelligence”. I argued that the digital humanities has a role to play in developing the responsibility practices that address the challenges of LLMs. I argued for an ethics of care approach that looks at the relationships between stakeholders (both individual and institutional) and asks how we can care for those more vulnerable and how we can repair emergent systems.

Brandolini’s law

In a 3QuarksDaily post about Bullshit and Cons: Alberto Brandolini and Mark Twain Issue a Warning About Trump I came across Brandolini’s law of Refutation which states:

The amount of energy needed to refute bullshit is an order of magnitude bigger than to produce it.

This law or principle goes a long way to explaining why bullshit, conspiracy theories, and disinformation are so hard to refute. The very act of refutation becomes suspect as if you are protesting too much. The refuter is made to look like the person with an agenda that we should be skeptical of.

The corollary is that it is less work to lie about someone before they have accused you of lying than to try to refute the accusation. Better to accuse the media of purveying fake news early than to wait until they publish news about you.

As for AI hallucinations, which I believe should be called AI bullshit, we can imagine Rockwell’s corollary:

The amount of energy needed to correct for AI hallucinations in a prompted essay is an order of magnitude bigger than the work of just writing it yourself.

CSDH/SCHN Congress 2025: Reframing Togetherness

These last few days I have been at the CSDH/SCHN conference that is part of the Congress 2025. With colleagues and graduate research assistants I was part of a number of papers and panels. See CSDH/SCHN Congress 2025: Reframing Togetherness. The programme is here. Some of the papers I was involved in included:

Exploring the Deceptive Patterns of Chinook: Visualization and Storytelling Approaches Critical Software Study – Roya Sharifi; Ralph Padilla; Zahra Farhangfar; Yasmeen Abu-Laban; Eleyan Sawafta; and Geoffrey Rockwell
Building a Consortium: An Approach to Sustainability – Geoffrey Martin Rockwell; Michael Sinatra; Susan Brown; John Bradley; Ayushi Khemka; and Andrew MacDonald
Integrating Large Language Models with Spyral Notebooks – Sean Lis and Geoffrey Rockwell
AI-Driven Textual Analysis to Decode Canadian Immigration Social Media Discourse – Augustine Farinola & Geoffrey Martin Rockwell
The List in Text Analysis – Geoffrey Martin Rockwell; Ryan Chartier; and Andrew MacDonald

I was also part of a panel on Generative AI, LLMs, and Knowledge Structures organized by Ray Siemens. My paper was on Forging Interpretations with Generative AI. Here is the abstract:

Using large language models we can now generate fairly sophisticated interpretations of documents using natural language prompts. We can ask for classifications, summaries, visualizations, or specific content to be extracted. In short we can automate content analysis of the sort we used to count as research. As we play with the forging of interpretations at scale we need to consider the ethics of using generative AI in our research. We need to ask how we can use these models with respect for sources, care for transparency, and attention to positionality.

Welcome to the Artificial Intelligence Incident Database

The starting point for information about the AI Incident Database

Maria introduced me to the Artificial Intelligence Incident Database. It contains summaries and links regarding different types of incidents related to AI. Good place to get a sense of the hazards.

The AI Incident Database is dedicated to indexing the collective history of harms or near harms realized in the real world by the deployment of artificial intelligence systems. Like similar databases in aviation and computer security, the AI Incident Database aims to learn from experience so we can prevent or mitigate bad outcomes.

Moral Responsibility

On Thursday (the 29th of May) I gave a talk on Moral Responsibility and Artificial Intelligence at the Canadian AI 2025 conference in Calgary, Alberta.

I discussed what moral responsibility might be in the context of AI and argued for an ethic of care (and repair) approach to building relationships of responsibility and responsibility practices.

There was a Responsible AI track in the conference that had some great talks by Gideon Christian (U of Calgary in Law) and Julita Vassileva (U Saskatchewan).

Rewiring the Humanities: Notes from the DH Winter School 2025

The Digital Humanities Winter School, a unique initiative by the Department of Humanities and Social Sciences, Indian Institute of Technology Delhi (IIT Delhi), was held in February 2025. Its primary goal was to bridge the gap for scholars and students from the humanities, social sciences, and other non-STEM disciplines in India, providing them with a hands-on introduction to computational tools and digital scholarship methods. This introduction aimed to foster algorithmic thinking, conceptualize data-centric research projects, encourage collaborative ventures, and instill critical approaches toward algorithms. The DH Winter School, with its promise of a low learning curve, was designed to boost the confidence of participants who came with little or no exposure to digital applications or programming. By addressing the limited opportunities for students of the humanities and social sciences in India to learn these methods, the DH Winter School aimed to impact the academic landscape significantly.

Michael Sinatra drew my attention to this post about the DH Winter School at IIT Delhi that I contributed to. See Rewiring the Humanities: Notes from the DH Winter School 2025.

News Media Publishers Run Coordinated Ad Campaign Urging Washington to Protect Content From Big Tech and AI

Today, hundreds of news publishers launched the “Support Responsible AI” ad campaign, which calls on Washington to make Big Tech pay for the content it takes to run its AI products.

I came across one of these ads about AI Theft from the News Media Alliance and followed it to this site: News Media Publishers Run Coordinated Ad Campaign Urging Washington to Protect Content From Big Tech and AI. They have three asks:

Require Big Tech and AI companies to fairly compensate content creators.

Mandate transparency, sourcing, and attribution in AI-generated content.

Prevent monopolies from engaging in coercive and anti-competitive practices.

Gary Marcus has a substack column on Sam Altman’s attitude problem that talks about Altman’s lack of a response when confronted with an example of what seems like IP theft. I think the positions are hardening as groups begin to use charged language like “theft” for what AI companies are doing.