How AI Image Generators Make Bias Worse – YouTube

A team at the LIS (London Interdisciplinary School) has created a great short video on the biases of AI image generators. The video covers the issues quickly and is documented with references you can follow for more. I had been looking at how image generators portray academics like philosophers, but this reports on research that went much further.

What is also interesting is how this grew out of a LIS undergraduate’s first-year project. It says something about LIS that they encourage and build on such projects. This got me wondering about LIS, which I had never heard of before. It seems to be a new teaching college in London, UK, built not around departments but around interdisciplinary programmes that deal with “real-world problems.” It sounds a bit like problem-based learning.

Anyway, it will be interesting to watch how it evolves.

CEO Reminds Everyone His Company Collects Customers’ Sleep Data to Make Zeitgeisty Point About OpenAI Drama

The Eight Sleep pod is a mattress topper with a terms of service and a privacy policy. The company “may share or sell” the sleep data it collects from its users.

From Slashdot comes a story about how a CEO Reminds Everyone His Company Collects Customers’ Sleep Data to Make Zeitgeisty Point About OpenAI Drama. The story is worrisome because of the data being gathered by a smart mattress company and the uses it is being put to. I’m less sure of the inferences the CEO (Matteo Franceschetti) draws from his data, or of his call to “fix this.” How would Eight Sleep fix this? Sell more product?

The Bletchley Declaration by Countries Attending the AI Safety Summit, 1-2 November 2023

Today and tomorrow representatives from a number of countries are gathered at Bletchley Park to discuss AI safety. Close to 30 countries, including Canada, were represented, and they issued The Bletchley Declaration by Countries Attending the AI Safety Summit, 1-2 November 2023. The declaration starts with,

Artificial Intelligence (AI) presents enormous global opportunities: it has the potential to transform and enhance human wellbeing, peace and prosperity. To realise this, we affirm that, for the good of all, AI should be designed, developed, deployed, and used, in a manner that is safe, in such a way as to be human-centric, trustworthy and responsible.

The declaration discusses opportunities and the need to support innovation, but also mentions that “AI also poses significant risks” and mentions the usual suspects, especially “capable, general-purpose models” that could be repurposed for misuse.

What stands out is the commitment to international collaboration among the major players, including China. This is a good sign.

Many risks arising from AI are inherently international in nature, and so are best addressed through international cooperation. We resolve to work together in an inclusive manner to ensure human-centric, trustworthy and responsible AI that is safe, and supports the good of all through existing international fora and other relevant initiatives, to promote cooperation to address the broad range of risks posed by AI.

Bletchley Park is becoming a UK symbol of computing. It was, of course, where the Allied code-breaking centre was set up. It is where Turing worked on breaking the Enigma cipher, and where the Colossus, an important early computer, was used to decode German ciphers and give the Allies a crucial advantage. It is appropriate that UK Prime Minister Sunak used this site to gather representatives. Unfortunately few leaders joined him there, sending representatives instead, though Trudeau may show up on the 2nd.

Alas, the Declaration is short on specifics, though individual countries like the United States and Canada are securing voluntary commitments from major AI players to abide by codes of conduct. China and the EU are also passing laws regulating artificial intelligence.

One thing not mentioned at all is the danger of military uses of AI. It is as if warbots are off the table in AI safety discussions.

The good news is that there will be follow up meetings at which we can hope that concrete agreements might be worked out.


Artificial General Intelligence Is Already Here

Today’s most advanced AI models have many flaws, but decades from now, they will be recognized as the first true examples of artificial general intelligence.

Blaise Agüera y Arcas and Peter Norvig have an essay making the argument that Artificial General Intelligence Is Already Here. Their point is that the latest machines like ChatGPT are far more general than previous narrow AIs. They may not be as general as a human, at least without embodiment, but they can do all sorts of textual tasks, including tasks not deliberately programmed into them. Some of the ways they are general include their ability to deal with all sorts of topics, to do different types of tasks, to handle different modalities (images, text …), their language ability, and their instructability.

The article also mentions reasons why people are still reluctant to admit that we have a form of AGI:

  • “A healthy skepticism about metrics for AGI

  • An ideological commitment to alternative AI theories or techniques

  • A devotion to human (or biological) exceptionalism

  • A concern about the economic implications of AGI”

To some extent the goalposts move as AIs solve different challenges. We used to think that playing chess well was a sign of intelligence; now that we know how a computer can do it, it no longer seems a test of intelligence.


AI Has Already Taken Over. It’s Called the Corporation

If corporations were in fact real persons, they would be sociopaths, completely lacking the ability for empathy that is a crucial element of normal human behavior. Unlike humans, however, corporations are theoretically immortal, cannot be put in prison, and the larger multinationals are not constrained by the laws of any individual country.

Jeremy Lent has an essay arguing that AI Has Already Taken Over. It’s Called the Corporation. He isn’t the only one making this point. Indrajit (Indi) Samarajiva has a Medium essay, Corporations Are Already AI, arguing that corporations are legally artificial persons with many of the rights of people. They can own property (including people), they have agency, they communicate, and they have intelligence. Just because they aren’t software running on a computer doesn’t mean they aren’t artificial intelligences.

As Samarajiva points out, it would be interesting to review the history of the corporation, looking at examples like the Dutch East India Company, to see if we can understand how AGIs might emerge and interact with us. He feels that corporate AIs hate us, or are at least indifferent to us.

Another essay that also touches on this is a diary entry by David Runciman on AI in the London Review of Books. His reflections on how our fears about AI mirror earlier fears about corporations are worth quoting in full,

Just as adult human beings are not the only model for natural intelligence – along with children, we heard about the intelligence of plants and animals – computers are not the only model for intelligence of the artificial kind. Corporations are another form of artificial thinking machine, in that they are designed to be capable of taking decisions for themselves. Information goes in and decisions come out that cannot be reduced to the input of individual human beings. The corporation speaks and acts for itself. Many of the fears that people now have about the coming age of intelligent robots are the same ones they have had about corporations for hundreds of years. If these artificial creatures are taking decisions for us, how can we hold them to account for what they do? In the words of the 18th-century jurist Edward Thurlow, ‘corporations have neither bodies to be punished nor souls to be condemned; they may therefore do as they like.’ We have always been fearful of mechanisms that ape the mechanical side of human intelligence without the natural side. We fear that they lack a conscience. They can think for themselves, but they don’t really understand what it is that they are doing.

The AP lays the groundwork for an AI-assisted newsroom

The Associated Press published standards today for generative AI use in its newsroom.

As we deal with the changes brought about by this recent generation of chatbots in the academy, we could learn from guidelines emerging from other fields like journalism. Engadget reports that The AP lays the groundwork for an AI-assisted newsroom, and you can see the Associated Press guidelines here.

Accuracy, fairness and speed are the guiding values for AP’s news report, and we believe the mindful use of artificial intelligence can serve these values and over time improve how we work.

AP also suggests they don’t see chatbots replacing journalists any time soon, as “the central role of the AP journalist – gathering, evaluating and ordering facts into news stories, video, photography and audio for our members and customers – will not change.”

It should be noted (as AP does) that they have an agreement with OpenAI.

The Illusion Of AI’s Existential Risk

In sum, AI acting on its own cannot induce human extinction in any of the ways that extinctions have happened in the past. Appeals to the competitive nature of evolution or previous instances of a more intelligent species causing the extinction of a less intelligent species reflect a common mischaracterization of evolution by natural selection.

Could artificial intelligence (AI) soon get to the point where it could enslave us? An Amii colleague sent me this sensible article, The Illusion Of AI’s Existential Risk, which argues that it is extremely unlikely that an AI could evolve to the point where it could manipulate us and prevent us from turning it off. One of the points the authors make is that the situation is completely different from past extinctions.

Our safety is the topic of Brian Christian’s excellent book The Alignment Problem, which discusses different approaches to developing AIs so that they are aligned with our values. An important point made by Stuart Russell and quoted in the book is that we don’t want AIs to have the same values as us; we want them to value our having values and to pay attention to our values.

This raises the question of how an AI might know what we value. One approach is Constitutional AI, where we train an ethical AI on a constitution that captures our values and then use it to help train others.

One of the problems with ethics, however, is that human ethics isn’t simple and may not be something one can capture in a constitution. For this reason another approach is Inverse Reinforcement Learning (IRL), where we ask an AI to infer our values from a mass of evidence of ethical discourse and behaviour.
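To make the IRL idea concrete, here is a toy sketch of my own (not any lab’s actual method): we observe an agent’s choices between pairs of options, each scored on two invented features, and infer which feature the agent values by maximizing the likelihood of the observed choices.

```python
# Toy sketch of the IRL idea (my own illustration, not any lab's method):
# observe an agent's choices between pairs of options, each scored on two
# invented features, and infer which feature the agent values.

import math

# Each option is (honesty_score, profit_score); in each demonstration the
# agent chose the first option over the second.
demonstrations = [
    ((0.9, 0.1), (0.2, 0.8)),  # chose the honest option over the profitable one
    ((0.8, 0.3), (0.4, 0.9)),
    ((0.7, 0.2), (0.1, 0.6)),
]

def choice_log_likelihood(w_honesty, w_profit):
    """Log-likelihood of the observed choices under a softmax choice model."""
    ll = 0.0
    for chosen, rejected in demonstrations:
        u_chosen = w_honesty * chosen[0] + w_profit * chosen[1]
        u_rejected = w_honesty * rejected[0] + w_profit * rejected[1]
        ll += u_chosen - math.log(math.exp(u_chosen) + math.exp(u_rejected))
    return ll

# Crude grid search over candidate integer weightings.
best = max(
    ((wh, wp) for wh in range(11) for wp in range(11)),
    key=lambda w: choice_log_likelihood(*w),
)
print("inferred weights (honesty, profit):", best)
```

Real IRL works at vastly larger scale and with learned representations, but the core move is the same: treat observed behaviour as evidence about an underlying reward function.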

My guess is that this is what OpenAI is trying in their Superalignment project. Imagine an ethical surveillance project that uses IRL to develop a (black) moral box which could be used to train AIs to be aligned. Imagine if it could be tuned to the ethics of different communities.

OpenAI adds Code Interpreter to ChatGPT Plus

Upload datasets, generate reports, and download them in seconds!

OpenAI has just released a plug-in called Code Interpreter which is truly impressive. You need to have ChatGPT Plus to be able to turn it on. It then allows you to upload data and to use plain English to analyze it. You write requests/prompts like:

What are the top 20 content words in this text?

It then interprets your request and describes what it will try to do in Python. Then it generates the Python and runs it. When it has finished, it shows the results. You can see examples in this Medium article: 

ChatGPT’s Code Interpreter Was Just Released. Here’s How It Will Change Data Science Forever
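For a prompt like the one above, the kind of Python that gets generated and run might look something like this (a hypothetical sketch of my own, not Code Interpreter’s actual output; the stopword list is made up):

```python
# A hypothetical sketch of the kind of Python such a prompt might be turned
# into (not Code Interpreter's actual output; the stopword list is made up).

import re
from collections import Counter

# A small stopword list stands in for the function words to exclude.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "that", "this", "for", "on", "was", "with", "as", "are", "be",
}

def top_content_words(text, n=20):
    """Count the most frequent words after dropping stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    content = [w for w in words if w not in STOPWORDS]
    return Counter(content).most_common(n)

sample = "The nature of the corporation is that the corporation acts for itself."
print(top_content_words(sample, 5))
```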

I’ve been trying to see how I can use it to analyze a text. Here are some of the limitations:

  • It can’t handle large texts. It can be used to study a book-length text, but not a collection of books.
  • It frequently tries to load NLTK or other libraries and then fails. What is interesting is that it then tries other ways of achieving the same goal. For example, I asked for adjectives near the word “nature” and when it couldn’t load the NLTK POS library it then accessed a list of top adjectives in English and searched for those.
  • It can generate graphs of different sorts, but not interactives.
  • It is difficult to get the full transcript of an experiment, where by “full” I mean the Python code, the prompts, the responses, and any graphs generated. You can ask for an iPython notebook with the code, which you can download. Perhaps I can also get a PDF with the images.
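The NLTK fallback described in the second point could be reconstructed roughly like this (my own hypothetical sketch; the adjective list and window size are invented):

```python
# A hypothetical reconstruction of the fallback: when POS tagging is not
# available, search a fixed list of common adjectives in a window of words
# around the target word (the adjective list and window size are invented).

import re

COMMON_ADJECTIVES = {"wild", "true", "good", "great", "human", "beautiful"}

def adjectives_near(text, target="nature", window=3):
    """Return known adjectives within `window` words of each target occurrence."""
    words = re.findall(r"[a-z]+", text.lower())
    hits = []
    for i, w in enumerate(words):
        if w == target:
            neighbourhood = words[max(0, i - window): i + window + 1]
            hits.extend(n for n in neighbourhood if n in COMMON_ADJECTIVES)
    return hits

text = "The wild nature of the place revealed the true nature of things."
print(adjectives_near(text))
```

The interesting thing is not the code itself but that the system improvised this kind of workaround on its own when the library it wanted failed to load.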

The Code Interpreter is in beta, so I expect they will be improving it. It is nonetheless very impressive how it can translate prompts into processes. Particularly impressive is how it tries different approaches when things fail.

Code Interpreter could make data analysis and manipulation much more accessible. Without learning to code, you can interrogate a data set and potentially run other processes. It is possible to imagine an unshackled Code Interpreter that could access the internet and do all sorts of things (like running a paper-clip business).

How Canada Accidentally Helped Crack Computer Translation

A technological whodunit—featuring Parliament, computer scientists, and a tipsy plane flight

Arun sent me a link to a neat story about How Canada Accidentally Helped Crack Computer Translation. The story is by Christine Mitchell and is in the Walrus (June 2023). It describes how IBM got ahold of a magnetic reel tape with 14 years of the Hansard – the translated transcripts of the Canadian Parliament. IBM went on to use this data trove to make advances in automatic translation.

The story mentions the politics of automated translation research in Canada. I have previously blogged about the Booths, who were recruited by the NRC to Saskatchewan to work on automated translation. They were apparently pursuing a statistical approach like the one IBM took later on, but their funding was cut.

Speaking of automatic translation, Canada had a computerized system, METEO for translating daily weather forecasts from Environment Canada. This ran from 1981 to 2001 and was an early successful implementation of automatic translation in the real world. It came out of work at the TAUM (Traduction Automatique à l’Université de Montréal) research group at the Université de Montréal that was set up in the late 1960s.

Jeff Pooley, “Surveillance Publishing”

Arun sent me the link to a good paper by Jeff Pooley on Surveillance Publishing in the Journal of Electronic Publishing. The article compares what Google does to rank pages based on links with citation analysis (which inspired Brin and Page). It looks at how both web search and citation analysis have been monetized by Google and by citation network services like Web of Science. Now publishing companies like Elsevier make money off tools that report on and predict publishing. We write papers with citations and publish them. Then we buy services built on our citational work, and administrators buy services telling them who publishes the most and where the hot areas are. As Pooley puts it,
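The link-ranking idea that Brin and Page adapted from citation analysis can be sketched in a few lines. This is a toy illustration of the PageRank principle, not Google’s production algorithm:

```python
# Toy sketch of the PageRank principle: a node is important if important
# nodes link to (or cite) it. Power iteration over a tiny hand-made graph.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each node to the list of nodes it links to."""
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, targets in links.items():
            for t in targets:
                new[t] += damping * rank[src] / len(targets)
        rank = new
    return rank

# A tiny citation graph: C is cited by both A and B, and C cites A.
graph = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # C comes out on top
```

The same logic ranks papers and web pages alike, which is exactly the continuity between bibliometrics and search that the article traces.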

Siphoning taxpayer, tuition, and endowment dollars to access our own behavior is a financial and moral indignity.

The article also points out that predictive services have been around since before Google. The insurance and credit rating businesses have used surveillance for some time.

Pooley ends by talking about how these publication surveillance tools then encourage quantification of academic work and facilitate local and international prioritization. The Anglophone academy measures things and discovers itself so it can then reward itself. What gets lost is the pursuit of knowledge.

In that sense, the “decision tools” peddled by surveillance publishers are laundering machines—context-erasing abstractions of our messy academic realities.

The full abstract is here:

This essay develops the idea of surveillance publishing, with special attention to the example of Elsevier. A scholarly publisher can be defined as a surveillance publisher if it derives a substantial proportion of its revenue from prediction products, fueled by data extracted from researcher behavior. The essay begins by tracing the Google search engine’s roots in bibliometrics, alongside a history of the citation analysis company that became, in 2016, Clarivate. The essay develops the idea of surveillance publishing by engaging with the work of Shoshana Zuboff, Jathan Sadowski, Mariano-Florentino Cuéllar, and Aziz Huq. The recent history of Elsevier is traced to describe the company’s research-lifecycle data-harvesting strategy, with the aim to develop and sell prediction products to university and other customers. The essay concludes by considering some of the potential costs of surveillance publishing, as other big commercial publishers increasingly enter the predictive-analytics business. It is likely, I argue, that windfall subscription-and-APC profits in Elsevier’s “legacy” publishing business have financed its decade-long acquisition binge in analytics. The products’ purpose, moreover, is to streamline the top-down assessment and evaluation practices that have taken hold in recent decades. A final concern is that scholars will internalize an analytics mindset, one already encouraged by citation counts and impact factors.

Source: Pooley | Surveillance Publishing | The Journal of Electronic Publishing