‘New York Times’ considers legal action against OpenAI as copyright tensions swirl : NPR

The news publisher and maker of ChatGPT have held tense negotiations over striking a licensing deal for the use of the paper’s articles to train the chatbot. Now, legal action is being considered.

Finally we are seeing a serious challenge to the way AI companies exploit written resources on the web, as the New York Times takes on OpenAI; see the NPR story, ‘New York Times’ considers legal action against OpenAI as copyright tensions swirl.

A top concern for the Times is that ChatGPT is, in a sense, becoming a direct competitor with the paper by creating text that answers questions based on the original reporting and writing of the paper’s staff.

It remains to be seen what the legalities are. Does using a text in order to train a model constitute the making of a copy in violation of copyright? Does the model contain something equivalent to a copy of the original? These issues are already being explored in the AI image-generation space, where Stability AI is being sued by Getty Images. I hope the New York Times doesn’t just settle quietly before there is a public airing of the issues around the exploitation/ownership of written work. I also note that the Authors Guild is starting to advocate on behalf of authors,

“It says it’s not fair to use our stuff in your AI without permission or payment,” said Mary Rasenberger, CEO of the Authors Guild. The non-profit writers’ advocacy organization created the letter, and sent it out to the AI companies on Monday. “So please start compensating us and talking to us.”

This could also have repercussions in academia, as many of us scrape the web and social media when studying contemporary issues. For that matter, what do we think about the use of our own work? One could say that our work, supported as it is by the public, should be fair game for gathering, training, and innovative reuse. Aren’t we supported for the public good? Perhaps we should assert that academic prose is available for training models?

What are our ethics?

Worldcoin ignored initial order to stop iris scans in Kenya, records show

The Office of the Data Protection Commissioner in Kenya first instructed Worldcoin to stop collecting personal data in May.

I don’t know what to think about Worldcoin. Is it one more crypto project doomed to disappear, or could it be a nasty exploitative project designed to corner identity by starting in Kenya? Imagine having to get orbed just to use local government services online! Fortunately Kenya is now ordering them to stop their exploitation; see the TechCrunch story, Worldcoin ignored initial order to stop iris scans in Kenya, records show.

The Illusion Of AI’s Existential Risk

In sum, AI acting on its own cannot induce human extinction in any of the ways that extinctions have happened in the past. Appeals to the competitive nature of evolution or previous instances of a more intelligent species causing the extinction of a less intelligent species reflect a common mischaracterization of evolution by natural selection.

Could artificial intelligence (AI) soon get to the point where it could enslave us? An Amii colleague sent me to this sensible article, The Illusion Of AI’s Existential Risk, which argues that it is extremely unlikely that an AI could evolve to the point where it could manipulate us and prevent us from turning it off. One of the points the authors make is that the situation is completely different from past extinctions.

Our safety is the topic of Brian Christian’s excellent book The Alignment Problem, which discusses different approaches to developing AIs so that they are aligned with our values. An important point made by Stuart Russell, and quoted in the book, is that we don’t want AIs to have the same values as us; we want them to value our having values and to pay attention to our values.

This raises the question of how an AI might know what we value. One approach is Constitutional AI, where an AI is trained with a constitution that captures our values and is then used to help train and evaluate other models.

One of the problems with ethics, however, is that human ethics isn’t simple and may not be something one can capture in a constitution. For this reason another approach is Inverse Reinforcement Learning (IRL), where we ask an AI to infer our values from a mass of evidence of ethical discourse and behaviour.

My guess is that this is what they are trying at OpenAI in their Superalignment project. Imagine an ethical surveillance project that uses IRL to develop a (black) moral box which can be used to train AIs to be aligned. Imagine if it could be tuned to different community ethics?
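To make the basic IRL intuition concrete, here is a minimal toy sketch: we watch a demonstrator choose between pairs of options, assume they tend to pick the option with higher hidden value, and recover the value weights by fitting that choice model. The feature names, data, and weights are all invented for illustration; real IRL systems work with far messier evidence of behaviour.

```python
# Toy sketch of inverse reinforcement learning (IRL): infer hidden value
# weights from observed choices, assuming the demonstrator is roughly
# rational (picks higher-value options more often). Everything here is
# invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, 3.0, 0.5])  # hidden values, e.g. [honesty, harm avoided, convenience]

def demonstrate(n=500):
    """Simulate n observed choices between two options described by features."""
    A = rng.normal(size=(n, 3))
    B = rng.normal(size=(n, 3))
    p_choose_A = 1 / (1 + np.exp(-(A - B) @ true_w))  # noisy, not perfectly rational
    chose_A = (rng.random(n) < p_choose_A).astype(float)
    return A, B, chose_A

def infer_values(A, B, chose_A, lr=0.5, steps=5000):
    """Recover value weights by gradient ascent on the choice log-likelihood."""
    X = A - B
    w = np.zeros(3)
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))
        w += lr * X.T @ (chose_A - p) / len(chose_A)
    return w

A, B, chose_A = demonstrate()
print("recovered value weights:", np.round(infer_values(A, B, chose_A), 2))
# Should land near the hidden true_w, illustrating how values can be read off behaviour.
```

Even in this toy setting the inference leans on a strong assumption of rational behaviour, which hints at why inferring real human values from messy discourse and behaviour is so much harder.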

Lessons from the Robodebt debacle

How to avoid algorithmic decision-making mistakes: lessons from the Robodebt debacle

The University of Queensland has a research alliance looking at Trust, Ethics and Governance, and one of its teams has recently published an interesting summary, How to avoid algorithmic decision-making mistakes: lessons from the Robodebt debacle. This is based on an open paper, Algorithmic decision-making and system destructiveness: A case of automatic debt recovery. The web summary is a good discussion of Australia’s 2016 robodebt scandal, in which an unsupervised algorithm issued nasty debt collection letters to a large number of welfare recipients without adequate testing, accountability, or oversight. It is a classic case of a simplistic and poorly tested algorithm being rushed into service and having dramatic consequences (470,000 incorrectly issued debt notices). There is, as the article points out, also a political angle.

UQ’s experts argue that the government decision-makers responsible for rolling out the program exhibited tunnel vision. They framed welfare non-compliance as a major societal problem and saw welfare recipients as suspects of intentional fraud. Balancing the budget by cracking down on the alleged fraud had been one of the ruling party’s central campaign promises.

As such, there was a strong focus on meeting financial targets with little concern over the main mission of the welfare agency and potentially detrimental effects on individual citizens. This tunnel vision resulted in politicians’ and Centrelink management’s inability or unwillingness to critically evaluate and foresee the program’s impact, despite warnings. And there were warnings.

What I find even more disturbing is a point they make about how the system shifted the responsibility for establishing the existence of the debt from the government agency to the individual. The system essentially made speculative determinations and then issued bills. It was up to the individual to figure out whether they had really been overpaid or whether there had been a miscalculation. Imagine if the police used predictive algorithms to fine people for possible speeding infractions, leaving them to prove they were innocent or pay the fine.

One can see the attractiveness of such a “fine first, then ask” approach. It reduces government costs by shifting the onerous task of establishing the facts to the citizen. There is a good chance that many who are incorrectly billed will pay anyway because they are intimidated and don’t have the resources to contest the fine.

It should be noted that this was not a case of an AI gone bad. It was, from what I have read, a fairly simple system.

A New Way to Inoculate People Against Misinformation

A new set of online games holds promise for helping identify and prevent harmful misinformation from going viral.

Instead of fighting misinformation after it’s already spread, some researchers have shifted their strategy: they’re trying to prevent it from going viral in the first place, an approach known as “prebunking.” Prebunking attempts to explain how people can resist persuasion by misinformation. Grounded in inoculation theory, the approach uses the analogy of biological immunization. Just as exposure to a weakened pathogen triggers antibody production, inoculation theory posits that pre-emptively exposing people to a weakened persuasive argument builds people’s resistance against future manipulation.

Prebunking is being touted as A New Way to Inoculate People Against Misinformation. The idea is that one can inoculate people against the manipulation of misinformation. This strikes me as similar to how we were taught to “read” advertising in order to inoculate us against corporate manipulation. Did it work?

The Cambridge Social Decision-Making Lab has developed some games like the Bad News Game to build psychological resistance to misinformation.

That viruses and inoculation can be metaphors for patterns of psychological influence is worrisome. It suggests a lack of agency or reflection among people. How are memes not like viruses?

The Lab has been collaborating with Google’s Jigsaw on Inoculation Science, which has developed the games and videos to explain misinformation.

Pentagon believes its precognitive AI can predict events ‘days in advance’

The US military is testing AI that helps predict events days in advance, helping it make proactive decisions.

Engadget has a story on how the Pentagon believes its precognitive AI can predict events ‘days in advance’. It is clear that for most people the value of AI and surveillance lies in prediction, and yet there are some fundamental contradictions. As Hume pointed out centuries ago, all prediction is based on extrapolation from past behaviour. We simply don’t know the future; the best we can do is select features of past behaviour that seemed to do a good job predicting (retrospectively) and hope they will work in the future. Alas, we get seduced by the effectiveness of retrospective work. As Smith and Cordes put it in The Phantom Pattern Problem:

How, in this modern era of big data and powerful computers, can experts be so foolish? Ironically, big data and powerful computers are part of the problem. We have all been bred to be fooled—to be attracted to shiny patterns and glittery correlations. (p. 11)
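Their point is easy to demonstrate. Here is a toy sketch (pure random noise, no real data, invented variable names) showing that if you search enough features for one that “explains” past outcomes, you will find an impressive retrospective pattern that predicts nothing going forward.

```python
# Toy demonstration of the "phantom pattern" problem: with enough candidate
# features, something will always "explain" the past, yet fail on the future.
# All numbers are random noise; nothing here is real data.
import numpy as np

rng = np.random.default_rng(42)
n_obs, n_features = 50, 2000

past_outcome = rng.normal(size=n_obs)      # what we are trying to "predict"
future_outcome = rng.normal(size=n_obs)    # what actually happens later

past_features = rng.normal(size=(n_features, n_obs))
future_features = rng.normal(size=(n_features, n_obs))

# Data-mining step: keep the feature with the strongest retrospective correlation.
past_corr = np.array([np.corrcoef(f, past_outcome)[0, 1] for f in past_features])
best = np.argmax(np.abs(past_corr))

future_corr = np.corrcoef(future_features[best], future_outcome)[0, 1]
print(f"best retrospective correlation: {past_corr[best]:+.2f}")
print(f"same feature on future data:    {future_corr:+.2f}")
# The first number typically looks like a discovery; the second hovers near zero.
```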

What if machine learning and big data were really best suited for studying the past and not predicting the future? Would there be the hype? The investment?

When the next AI winter comes we in the humanities could pick up the pieces and use these techniques to try to explain the past, but I’m getting ahead of myself and predicting another winter.

Can’t Get You Out of My Head

I finally finished watching the BBC documentary series Can’t Get You Out of My Head by Adam Curtis. It is hard to describe this series, which is cut entirely from archival footage with Curtis’ voice interpreting and linking the diverse clips. The subtitle is “An Emotional History of the Modern World”, which is true in that the clips are often strangely affecting, but it doesn’t convey the broad social-political connections Curtis makes in the narration. He is trying out a set of theses about recent history in China, the US, the UK, and Russia leading up to Brexit and Trump. I’m still digesting the six-part series, but here are some of the threads of his theses:

  • Conspiracies. He traces our fascination with and now belief in conspiracies back to a memo by Jim Garrison in 1967 about the JFK assassination. The memo, Time and Propinquity: Factors in Phase I, presents results of an investigative technique built on finding patterns of linkages between fragments of information. When you find strange coincidences you then weave a story (conspiracy) to join them rather than starting with a theory and checking the facts. This reminds me of what software like Palantir does – it makes (often coincidental) connections easy to find so you can tell stories. Curtis later follows the evolution of conspiracies as a political force, leading to liberal conspiracies about Trump (that he was a Russian agent) and alt-right conspiracies like QAnon. We are all willing to surrender our independence of thought for the joys of conspiracies.
  • Big Data, Surveillance and AI. Curtis connects this new mode of investigation to what big data platforms like Google now do with AI. They gather lots of fragments of information about us and then a) use them to train AIs, and b) sell inferences drawn from the data to advertisers, while keeping us anxious through the promotion of emotional content. Big data promises to deal with the complexity of a world we have given up trying to control by finding patterns in the fragments. This reminds me of discussions around the End of Theory and the shift from theories to correlations.
  • Psychology. Curtis also connects this to emerging psychological theories about how our minds may be fragmented, with different unconscious urges moving us. Psychology then offers ways to figure out what people really want and to nudge or prime them. This is what Cambridge Analytica promised – a power to manipulate us that we were quick to believe in, partly because of our taste for conspiracies. Curtis argues at the end that behavioural psychology can’t replicate many of the experiments undergirding nudging, and he suggests that all this big data manipulation doesn’t really work, though the platforms can heighten our anxiety and emotional stress. A particularly disturbing part of the final episode is the discussion of how the US developed “enhanced” torture techniques based on these ideas after 9/11 to create “learned helplessness” in prisoners. The idea was to fragment their consciousness so that they would release a flood of these fragments, some of which might be useful intelligence.
  • Individualism. A major theme is the rise of individualism since the war and how individuals are controlled. China’s social credit model of explicit control through surveillance is contrasted with the West’s consumer-driven platform surveillance. Either way, Curtis’ conclusion seems to be that we need to regain confidence in our own individual power to choose our future and strive for it. We need to stop letting others control us with fear or distract us with consumption. We need to choose our future.

In some ways the series is a plea for everyone to make up their own stories from their fragmentary experience. The series starts with this quote,

The ultimate hidden truth of the world is that it is something we make, and could just as easily make differently. (David Graeber)

Of course, Curtis’ series could just be a conspiracy story that he wove out of the fragments he found in the BBC archives.

Addressing the Alarming Systems of Surveillance Built By Library Vendors

The Scholarly Publishing and Academic Resources Coalition (SPARC) is drawing attention to how we need to be Addressing the Alarming Systems of Surveillance Built By Library Vendors. This was triggered by a story in The Intercept that LexisNexis (is) to provide (a) giant database of personal information to ICE.

The company’s databases offer an oceanic computerized view of a person’s existence; by consolidating records of where you’ve lived, where you’ve worked, what you’ve purchased, your debts, run-ins with the law, family members, driving history, and thousands of other types of breadcrumbs, even people particularly diligent about their privacy can be identified and tracked through this sort of digital mosaic. LexisNexis has gone even further than merely aggregating all this data: The company claims it holds 283 million distinct individual dossiers of 99.99% accuracy tied to “LexIDs,” unique identification codes that make pulling all the material collected about a person that much easier. For an undocumented immigrant in the United States, the hazard of such a database is clear. (The Intercept)

That LexisNexis has been building databases on people isn’t new. Sarah Brayne has a book about predictive policing titled Predict and Surveil in which, among other things, she describes how the LAPD uses Palantir and how the police databases integrated in Palantir are enhanced by commercial databases like those sold by LexisNexis. (There is an essay that is an excerpt of the book here, Enter the Dragnet.)

I suspect environments like Palantir make all sorts of smaller and more specialized databases commercially valuable, which is leading what were once library database providers to expand their business. Before, a database about repossessions might be of interest only to a specialized community. Now it can be linked to other information and becomes another dimension of the data. In particular, these databases provide information about all the people who aren’t in police databases. They provide the breadcrumbs needed to surveil those not documented elsewhere.
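As a toy illustration of why a shared identifier changes the value of these smaller databases, here is a sketch (with entirely invented records, sources, and field names) of how a handful of specialized datasets collapse into a single dossier once they carry the same person ID.

```python
# Toy illustration of record linkage: separate, specialized databases become
# a single dossier once their records share an identifier (as the LexID-style
# codes described above are meant to ensure). All records are invented.
from collections import defaultdict

repossessions = [{"person_id": "P-001", "item": "car", "year": 2019}]
utility_accounts = [{"person_id": "P-001", "address": "12 Example St", "status": "overdue"}]
employment_records = [{"person_id": "P-001", "employer": "Acme Cleaning", "last_seen": "2021-06"}]

dossiers = defaultdict(dict)
for source, records in [("repossession", repossessions),
                        ("utilities", utility_accounts),
                        ("employment", employment_records)]:
    for record in records:
        pid = record["person_id"]
        # Merge everything known from this source under the shared identifier.
        dossiers[pid][source] = {k: v for k, v in record.items() if k != "person_id"}

print(dossiers["P-001"])
# Each source alone says little; linked by one ID they start to sketch a life,
# including people who appear in no police database at all.
```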

The SPARC call points out that we (academics, university libraries) have been funding these database providers. 

Dollars from library subscriptions, directly or indirectly, now support these systems of surveillance. This should be deeply concerning to the library community and to the millions of faculty and students who use their products each day and further underscores the urgency of privacy protections as library services—and research and education more generally—are now delivered primarily online.

This raises the question of our complicity and whether we could do without some of these companies. At a deeper level it raises questions about the curiosity of the academy. We are dedicated to knowledge as an unalloyed good and are at the heart of a large system of surveillance – surveillance of the past, of literature, of nature, of the cosmos, and of ourselves.

What Sky Bet, The Gambling App, Knows About You

Sky Bet, the most popular gambling app in Britain, compiled extensive records about a user, tracking him in ways he never imagined.

The New York Times has a good story about What Sky Bet, The Gambling App, Knows About You. It talks about the profile that Sky Bet in the UK built on a customer who had a gambling addiction.

The company, or one of the data providers it had hired to collect information about users, had access to banking records, mortgage details, location coordinates, and an intimate portrait of his habits wagering on slots and soccer matches.

We tend to focus on what the big guys have and forget all the lesser-known information aggregators and middlemen who buy and sell data. This story also provides an example of how valuable data can be to a business like online gambling that wants to attract precisely those clients who are likely to become addicted to gambling.

Freedom Online Coalition joint statement on artificial intelligence

The Freedom Online Coalition (FOC) has issued a joint statement on artificial intelligence (AI) and human rights.  While the FOC acknowledges that AI systems offer unprecedented opportunities for human development and innovation, the Coalition expresses concern over the documented and ongoing use of AI systems towards repressive and authoritarian purposes, including through facial recognition technology […]

The Freedom Online Coalition is a coalition of countries including Canada that “work closely together to coordinate their diplomatic efforts and engage with civil society and the private sector to support Internet freedom – free expression, association, assembly, and privacy online – worldwide.” It was founded in 2011 at the initiative of the Dutch.

The FOC has just released a Joint Statement on Artificial Intelligence and Human Rights that calls for “transparency, traceability and accountability” in the design and deployment of AI systems. They also reaffirm that “states must abide by their obligations under international human rights law to ensure that human rights are fully respected and protected.” The statement ends with a series of recommendations or “Calls to action”.

What is important about this statement is the role it recommends for the state. This is not a set of vapid principles that developers should voluntarily adhere to; it calls for appropriate legislation.

States should consider how domestic legislation, regulation and policies can identify, prevent, and mitigate risks to human rights posed by the design, development and use of AI systems, and take action where appropriate. These may include national AI and data strategies, human rights codes, privacy laws, data protection measures, responsible business practices, and other measures that may protect the interests of persons or groups facing multiple and intersecting forms of discrimination.

I note that yesterday the Liberals introduced a Digital Charter Implementation Act that could significantly change the regulations around data privacy. More on that as I read about it.

Thanks to Florence for pointing this FOC statement out to me.