How Canada Accidentally Helped Crack Computer Translation

A technological whodunit—featuring Parliament, computer scientists, and a tipsy plane flight

Arun sent me a link to a neat story, How Canada Accidentally Helped Crack Computer Translation, by Christine Mitchell in the Walrus (June 2023). It describes how IBM got hold of a reel of magnetic tape containing 14 years of the Hansard – the translated transcripts of the Canadian Parliament. IBM went on to use this data trove to make advances in automatic translation.
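IBM's statistical approach treated the Hansard as a parallel corpus and estimated word-translation probabilities from co-occurrences across sentence pairs. Here is a minimal sketch of that idea in the style of IBM Model 1, using a tiny invented corpus rather than the actual Hansard data:

```python
from collections import defaultdict

# Toy English/French sentence pairs standing in for the Hansard.
corpus = [
    ("the house", "la maison"),
    ("the blue house", "la maison bleue"),
    ("the flower", "la fleur"),
]
pairs = [(e.split(), f.split()) for e, f in corpus]
e_vocab = {w for e, _ in pairs for w in e}
f_vocab = {w for _, f in pairs for w in f}

# Initialize translation probabilities t(f|e) uniformly.
t = {(f, e): 1.0 / len(f_vocab) for e in e_vocab for f in f_vocab}

for _ in range(10):  # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for e_sent, f_sent in pairs:
        for f in f_sent:
            norm = sum(t[(f, e)] for e in e_sent)
            for e in e_sent:
                c = t[(f, e)] / norm  # expected alignment count
                count[(f, e)] += c
                total[e] += c
    for (f, e) in t:
        t[(f, e)] = count[(f, e)] / total[e] if total[e] else 0.0

best = max(f_vocab, key=lambda f: t[(f, "house")])
print(best)  # expect "maison" to win after a few iterations
```

Even with three sentence pairs, the expectation-maximization loop learns that "house" lines up with "maison" because "the"/"la" appear everywhere while "maison" appears only in the house sentences. With 14 years of parliamentary transcripts, the same logic yields usable translation tables.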

The story also mentions the politics of automated translation research in Canada. I have previously blogged about the Booths, who were recruited by the NRC to Saskatchewan to work on automated translation. They were apparently pursuing a statistical approach like the one IBM took later, but their funding was cut.

Speaking of automatic translation, Canada had a computerized system, METEO, for translating daily weather forecasts from Environment Canada. It ran from 1981 to 2001 and was an early successful implementation of automatic translation in the real world. It came out of work at the TAUM (Traduction Automatique à l’Université de Montréal) research group at the Université de Montréal, set up in the late 1960s.

The case for taking AI seriously as a threat to humanity

From the Open Philanthropy site I came across this older (2020) Vox article, The case for taking AI seriously as a threat to humanity, by Kelsey Piper. The article nicely summarizes some of the history of concerns around AGI (Artificial General Intelligence), the term people tend to use for an AI so advanced that it would be comparable to human intelligence. This history goes back to Turing’s colleague I.J. Good, who speculated in 1965 that,

An ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.

Such an explosion has been called the Singularity by Vernor Vinge and was popularized by Ray Kurzweil.

I came across this while following threads on the whole issue of whether AI would soon become an existential threat. The question of the dangers of AI (whether AGI or just narrow AI) has gotten a lot of attention, especially since Geoffrey Hinton ended his relationship with Google so he could speak about it. He and others signed a short statement published on the site of the Center for AI Safety,

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

The existential question only becomes relevant if one believes, as many do, that AI research and development is moving so fast that it may soon achieve some level of generality, at which point such an AGI could begin to act in unpredictable and dangerous ways. Alternatively, people could misuse such powerful AGIs to harm us. Open Philanthropy is one group focused on Potential Risks from Advanced AI. They could be classed as an organization with a longtermist view: the view that ethics (and philanthropy) should take long-term issues into account.

Advances in AI could lead to extremely positive developments, but could also potentially pose risks from intentional misuse or catastrophic accidents.

Others have called for a Manhattan Project for AI Safety. There are, of course, those (including me) who feel that this distracts from the immediate unintended effects of AI, and/or that there is little existential danger for the moment because AGI is decades off. The cynic in me also wonders how much the distraction is intentional, as it both hypes the technology (it’s dangerous, therefore it must be important) and justifies ignoring stubborn immediate problems like racist bias in the training data.

Kelsey Piper has in the meantime published A Field Guide to AI Safety.

The question still remains whether AI is dangerous enough to merit the sort of ethical attention that nuclear power, for example, has received.

Jeff Pooley, “Surveillance Publishing”

Arun sent me the link to a good paper by Jeff Pooley, Surveillance Publishing, in the Journal of Electronic Publishing. The article compares what Google does to rank pages based on links with citation analysis (which inspired Brin and Page). It looks at how both web search and citation analysis have been monetized by Google and by citation network services like Web of Science. Now publishing companies like Elsevier make money off tools that report on and predict publishing. We write papers with citations and publish them. Then we buy services built on our citational work, and administrators buy services telling them who publishes the most and where the hot areas are. As Pooley puts it,
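The link Pooley draws between web search and citation analysis is concrete: PageRank treats a hyperlink the way bibliometrics treats a citation, as a vote whose weight depends on the voter's own standing. A minimal power-iteration sketch on an invented toy citation graph (the damping factor of 0.85 is the conventional choice, not anything specific to Pooley's article):

```python
# PageRank by power iteration on a small directed graph of "papers citing papers".
def pagerank(links, damping=0.85, iterations=50):
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n, outs in links.items():
            if outs:
                share = damping * rank[n] / len(outs)
                for m in outs:  # each cited node gets an equal share
                    new[m] += share
            else:  # dangling node: spread its rank evenly over all nodes
                for m in nodes:
                    new[m] += damping * rank[n] / len(nodes)
        rank = new
    return rank

# A and B cite C; C cites D; D cites A.
citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": ["A"]}
ranks = pagerank(citations)
```

Here C ends up ranked highest because it collects citations from two sources, and its rank then flows on to D: the same recursive "prestige" logic that citation indexes like Web of Science monetize.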

Siphoning taxpayer, tuition, and endowment dollars to access our own behavior is a financial and moral indignity.

The article also points out that predictive services have been around since before Google. The insurance and credit rating businesses have used surveillance for some time.

Pooley ends by talking about how these publication surveillance tools then encourage quantification of academic work and facilitate local and international prioritization. The Anglophone academy measures things and discovers itself so it can then reward itself. What gets lost is the pursuit of knowledge.

In that sense, the “decision tools” peddled by surveillance publishers are laundering machines—context-erasing abstractions of our messy academic realities.

The full abstract is here:

This essay develops the idea of surveillance publishing, with special attention to the example of Elsevier. A scholarly publisher can be defined as a surveillance publisher if it derives a substantial proportion of its revenue from prediction products, fueled by data extracted from researcher behavior. The essay begins by tracing the Google search engine’s roots in bibliometrics, alongside a history of the citation analysis company that became, in 2016, Clarivate. The essay develops the idea of surveillance publishing by engaging with the work of Shoshana Zuboff, Jathan Sadowski, Mariano-Florentino Cuéllar, and Aziz Huq. The recent history of Elsevier is traced to describe the company’s research-lifecycle data-harvesting strategy, with the aim to develop and sell prediction products to university and other customers. The essay concludes by considering some of the potential costs of surveillance publishing, as other big commercial publishers increasingly enter the predictive-analytics business. It is likely, I argue, that windfall subscription-and-APC profits in Elsevier’s “legacy” publishing business have financed its decade-long acquisition binge in analytics. The products’ purpose, moreover, is to streamline the top-down assessment and evaluation practices that have taken hold in recent decades. A final concern is that scholars will internalize an analytics mindset, one already encouraged by citation counts and impact factors.

Source: Pooley | Surveillance Publishing | The Journal of Electronic Publishing

Bridging Divides – Research and Innovation

Thanks to my colleague Yasmeen, I was included in an important CFREF, Bridging Divides – Research and Innovation, led by Anna Triandafyllidou at Toronto Metropolitan University. Some of the topics I hope to work on include how information technology is being used to surveil and manage immigrants and, conversely, how immigrants use information technology.

Lisa: Steve Jobs’ sabotage and Apple’s secret burial

Who remembers the Lisa? The Verge has a nice short documentary on the Lisa: Steve Jobs’ sabotage and Apple’s secret burial. The Lisa, named after Jobs’ daughter and released in 1983, was the first Apple computer with a graphical user interface. Alas, it was too expensive (almost $10K USD at the time) and, despite being technically superior in some ways, was eventually superseded by the Macintosh, which came out in 1984.

The documentary is less about the Lisa than about the end of the Lisa, including an interview with Bob Cook, who, thanks to a deal with Apple, sold remaindered and used Lisas after they were discontinued – until Apple decided to bury them all in a landfill in Utah. (Which reminds me of the Atari video game cartridge burial of 1983.) The documentary is also, as every Apple story is, about Steve Jobs and his return to Apple in the late 1990s, which led to its turnaround into the successful company it is now. Was it Jobs who wanted to bury the Lisa?

Statement on AI Risk

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

The Center for AI Safety has issued a very short Statement on AI Risk (see the sentence above). It has been signed by the likes of Yoshua Bengio and Geoffrey Hinton. I’m not sure if it is an alternative to the much longer Open Letter, but it focuses on the warning without any prescription as to what we should do. The Open Letter was criticized by many in the AI community, so perhaps CAIS was trying to find wording that could bring together “AI Scientists” and “Other Notable Figures.”

I personally find this alarmist. I find myself less and less impressed with ChatGPT as it continues to fabricate answers of little use (because they are false). I tend to agree with Elizabeth Renieris, who is quoted in the BBC story “Artificial intelligence could lead to extinction, experts warn” to the effect that there are many more pressing immediate issues with AI to worry about. She says,

“Advancements in AI will magnify the scale of automated decision-making that is biased, discriminatory, exclusionary or otherwise unfair while also being inscrutable and incontestable,” she said. They would “drive an exponential increase in the volume and spread of misinformation, thereby fracturing reality and eroding the public trust, and drive further inequality, particularly for those who remain on the wrong side of the digital divide”.

All the concern about extinction has me wondering if this isn’t a way of hyping AI to make everyone involved and every AI business more important. If there is an existential risk then it must be a priority, and if it is a priority then we should be investing in it because, of course, the Chinese are. (Note that the Chinese have actually presented draft regulations that they will probably enforce.) In other words, the drama of extinction could serve big AI companies like OpenAI, Microsoft, Google, and Meta in various ways:

  • The drama could convince people that there is real disruptive potential in AI so they should invest now! Get in before it is too late.
  • The drama could lead to regulation which would actually help the big AI companies as they have the capacity to manage regulation in ways that small startups don’t. The big will get bigger with regulation.

I should stress that this is speculation. I probably shouldn’t be so cynical. Instead, let’s look to what we can do locally.

The Institution of Knowledge

Last week the Kule Institute for Advanced Study, the colab, and the Dunlop Art Gallery organized an exhibit/symposium on The Institution of Knowledge. The exhibit featured artists reflecting on knowledge and institutions, and the symposium included performance lectures, panels, and talks.

I gave a talk on “The Knowledge We Bear” that looked at four of the main structures that discipline the ways we bear knowledge in the university as institution. I also moderated a dialogue between Kevin Kee and Jacques Beauvais.

The three days were extraordinary thanks to the leadership of my co-organizer Natalie Loveless. I learned a lot about weaving research and creation together.

In many ways this was my last major initiative as Director of KIAS. On July 1st Michael O’Driscoll will take over. It was a way of reflecting on institutes and what they can do with others. I’m grateful to all those who participated.

Ricordando Dino Buzzetti, co-fondatore e presidente onorario dell’AIUCD

The AIUCD (Association for Humanistic Informatics and Digital Culture) has posted a nice blog entry with memories of Dino Buzzetti (in Italian). See Ricordando Dino Buzzetti, co-fondatore e presidente onorario dell’AIUCD – Informatica Umanistica e Cultura Digitale: il blog dell’AIUCD.

Dino was the co-founder and honorary president of the AIUCD. He was one of the few other philosophers in the digital humanities. I last saw him in Tuscany and wish I had taken more time to talk with him about his work. His paper “Towards an operational approach to computational text analysis” is in the recent collection I helped edit On Making in the Digital Humanities.

Institutions and Knowledge

University of Alberta is home to 18 faculties and dozens of research centres and institutes.

Institutions like the University of Alberta are typically divided into colleges, faculties, and then departments. The U of Alberta has recently reorganized around three major Colleges that correspond to the three major granting councils in Canada. See Colleges + Faculties | University of Alberta. We then have centres and institutes that attempt to bridge the gaps created between units. The Kule Institute for Advanced Study, for example, supports interdisciplinary and intersectoral research in an attempt to span the gaps between departments.

What are the institutional structures that guide and constrain knowledge creation and sharing at a University? Here is a rough list:

  • The annual faculty performance assessment process has a major impact on the knowledge created by faculty. University processes and standards for assessment influence what we do or don’t do. Typically research is what is valued, and that sets the tone. The tenure-track process does eventually free one to do research that isn’t understood, but one still gets regular feedback that can influence the directions one takes.
  • The particular division of a university into departments structures what knowledge one is expected to create and teach. The divisions are a topology of what are considered important fields of knowledge, even if there are centres and institutes that cross boundaries. These divisions into departments and faculties have histories; they are not fixed, but neither are they fluid. They come and go. A university is too large to manage without divisions, but divisions can lead to silos that don’t communicate much.
  • What one can teach, and is assigned to teach, has a dramatic effect on the knowledge one shares and thinks about. Even if one supposedly already knows what one teaches, teaching, especially at the graduate level, encourages sustained reflection on certain issues. Teaching is also one of the most important ways knowledge is replicated and shared.
  • Knowledge infrastructure like the library and available labs makes possible or constrains what one can do. If one doesn’t have access to publications in a field, that limits one’s ability to study it. This is why libraries are so important to research in some fields. Likewise, if you don’t have access to the right sort of lab and research equipment, you can’t do certain research. The ongoing competition for infrastructure resources, from space to books, is part of the shifting politics of knowledge.
  • Universities will also have different incentives and supports for research, from small grants to grant-writing staff. Research services have programs, staff, and so on that can support new knowledge creation – or not.

Then there are structures that are outside the university like the granting councils, but that is for another blog post.


Auto-GPT

An experimental open-source attempt to make GPT-4 fully autonomous. – Auto-GPT/README.md at master · Torantulino/Auto-GPT

From a video on 3 Quarks Daily on whether ChatGPT can prompt itself, I discovered Auto-GPT. Auto-GPT is powered by GPT-4. You describe a mission and it will try to launch tasks, assess the results, and complete the mission. Needless to say, it was inevitable that someone would find a way to use ChatGPT or one of its relatives to try to complete complicated jobs, including taking over the world, as Chaos-GPT claims to want to do (using Auto-GPT).
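The mission-to-tasks loop described above can be sketched roughly as follows. This is a hypothetical illustration, not Auto-GPT’s actual code: the `llm` function is a canned stand-in for a call to a model like GPT-4, and the task names are invented.

```python
from collections import deque

def llm(prompt):
    # Placeholder for a language-model call; a real agent would query GPT-4.
    canned = {"plan": "search for sources", "assess": "done"}
    return canned["assess" if "Assess" in prompt else "plan"]

def run_mission(mission, max_steps=10):
    """Plan a task, execute it, assess the result, repeat until done."""
    tasks = deque([llm(f"Plan the first task for: {mission}")])
    log = []
    for _ in range(max_steps):  # cap steps so the agent can't loop forever
        if not tasks:
            break
        task = tasks.popleft()
        result = f"executed: {task}"  # a real agent would run tools here
        log.append(result)
        verdict = llm(f"Assess {result} for mission: {mission}")
        if verdict == "done":
            break
        tasks.append(llm(f"Plan the next task for: {mission}"))
    return log

print(run_mission("summarize a topic"))
```

The interesting (and worrying) part is that the model sits on both sides of the loop: it proposes the tasks and it judges whether the mission is complete, with no human in between.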

How long will it be before someone figures out how to use these tools to do something truly nasty? I give it about 6 months before we get stories of generative AI being used to systematically harass people, or find information on how to harm people, or find ways to waste resources like the paperclip maximizer. Is it surprising that governments like Italy have banned ChatGPT?