Jeff Pooley, “Surveillance Publishing”

Arun sent me the link to a good paper by Jeff Pooley on Surveillance Publishing in the Journal of Electronic Publishing. The article compares Google's link-based page ranking to citation analysis (which inspired Brin and Page), and looks at how both web search and citation analysis have been monetized by Google and by citation network services like Web of Science. Now publishing companies like Elsevier make money off tools that report on and predict publishing. We write papers with citations and publish them. Then we buy services built on our citational work, and administrators buy services telling them who publishes the most and where the hot areas are. As Pooley puts it,

Siphoning taxpayer, tuition, and endowment dollars to access our own behavior is a financial and moral indignity.

The article also points out that predictive services have been around since before Google. The insurance and credit rating businesses have used surveillance for some time.

Pooley ends by talking about how these publication surveillance tools then encourage quantification of academic work and facilitate local and international prioritization. The Anglophone academy measures things and discovers itself so it can then reward itself. What gets lost is the pursuit of knowledge.

In that sense, the “decision tools” peddled by surveillance publishers are laundering machines—context-erasing abstractions of our messy academic realities.
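The kinship Pooley traces between Google's ranking and citation analysis is easy to see in code. Here is a minimal, hypothetical sketch of the PageRank idea, power iteration over a toy link graph, where links (like citations) act as votes of importance:

```python
# Minimal PageRank by power iteration on a toy link/citation graph.
# edges[p] lists the pages that page p links to (or papers it cites).
edges = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
pages = list(edges)
d = 0.85  # damping factor, as in Brin and Page's original formulation
rank = {p: 1 / len(pages) for p in pages}

for _ in range(100):  # iterate until the ranks stabilize
    new = {}
    for p in pages:
        # A page's rank is the damped sum of shares from pages linking to it.
        incoming = sum(rank[q] / len(edges[q]) for q in pages if p in edges[q])
        new[p] = (1 - d) / len(pages) + d * incoming
    rank = new

print(sorted(rank, key=rank.get, reverse=True))  # most "cited" page first
```

Here page C, which everyone links to, ends up ranked highest, just as a heavily cited paper scores highest in citation analysis.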

The full abstract is here:

This essay develops the idea of surveillance publishing, with special attention to the example of Elsevier. A scholarly publisher can be defined as a surveillance publisher if it derives a substantial proportion of its revenue from prediction products, fueled by data extracted from researcher behavior. The essay begins by tracing the Google search engine’s roots in bibliometrics, alongside a history of the citation analysis company that became, in 2016, Clarivate. The essay develops the idea of surveillance publishing by engaging with the work of Shoshana Zuboff, Jathan Sadowski, Mariano-Florentino Cuéllar, and Aziz Huq. The recent history of Elsevier is traced to describe the company’s research-lifecycle data-harvesting strategy, with the aim to develop and sell prediction products to university and other customers. The essay concludes by considering some of the potential costs of surveillance publishing, as other big commercial publishers increasingly enter the predictive-analytics business. It is likely, I argue, that windfall subscription-and-APC profits in Elsevier’s “legacy” publishing business have financed its decade-long acquisition binge in analytics. The products’ purpose, moreover, is to streamline the top-down assessment and evaluation practices that have taken hold in recent decades. A final concern is that scholars will internalize an analytics mindset, one already encouraged by citation counts and impact factors.

Source: Pooley | Surveillance Publishing | The Journal of Electronic Publishing

Bridging Divides – Research and Innovation

Thanks to my colleague Yasmeen, I was included in an important CFREF, Bridging Divides – Research and Innovation, led by Anna Triandafyllidou at Toronto Metropolitan University. Some of the topics I hope to work on include how information technology is being used to surveil and manage immigrants and, conversely, how immigrants use information technology.

Statement on AI Risk

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

The Center for AI Safety has issued a very short Statement on AI Risk (see the sentence above). This has been signed by the likes of Yoshua Bengio and Geoffrey Hinton. I’m not sure if it is an alternative to the much longer Open Letter, but it focuses on the warning without any prescription as to what we should do. The Open Letter was criticized by many in the AI community, so perhaps CAIS was trying to find wording that could bring together “AI Scientists” and “Other Notable Figures.”

I personally find this alarmist. I find myself less and less impressed with ChatGPT as it continues to fabricate answers of little use (because they are false). I tend to agree with Elizabeth Renieris, who is quoted in the BBC story Artificial intelligence could lead to extinction, experts warn, to the effect that there are a lot more pressing immediate issues with AI to worry about. She says,

“Advancements in AI will magnify the scale of automated decision-making that is biased, discriminatory, exclusionary or otherwise unfair while also being inscrutable and incontestable,” she said. They would “drive an exponential increase in the volume and spread of misinformation, thereby fracturing reality and eroding the public trust, and drive further inequality, particularly for those who remain on the wrong side of the digital divide”.

All the concern about extinction has me wondering if this isn’t a way of hyping AI so as to make everyone involved, and every AI business, more important. If there is an existential risk then it must be a priority, and if it is a priority then we should be investing in it because, of course, the Chinese are. (Note that the Chinese have actually presented draft regulations that they will probably enforce.) In other words, the drama of extinction could serve big AI companies like OpenAI, Microsoft, Google, and Meta in various ways:

  • The drama could convince people that there is real disruptive potential in AI so they should invest now! Get in before it is too late.
  • The drama could lead to regulation which would actually help the big AI companies as they have the capacity to manage regulation in ways that small startups don’t. The big will get bigger with regulation.

I should stress that this is speculation. I probably shouldn’t be so cynical. Instead let’s look to what we can do locally.

2023 Annual Public Lecture in Philosophy

Last week I gave the 2023 Annual Public Lecture in Philosophy. You can Watch a Recording here. The talk was on The Eliza Effect: Data Ethics for Machine Learning.

I started the talk with the case of Kevin Roose’s interaction with Sydney (Microsoft’s name for Bing Chat) where it ended up telling Roose that it loved him. From there I discussed some of the reasons we should be concerned with the latest generation of chatbots. I then looked at the ethics of LAION-5B as an example of how we can audit the ethics of projects. I ended with some reflections on what an ethics of AI could be.

Pause Giant AI Experiments: An Open Letter

We call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.

The Future of Life Institute is calling on AI labs to pause with a letter signed by over 1000 people (including myself), Pause Giant AI Experiments: An Open Letter – Future of Life Institute. The letter asks for a pause so that safety protocols can be developed,

AI labs and independent experts should use this pause to jointly develop and implement a set of shared safety protocols for advanced AI design and development that are rigorously audited and overseen by independent outside experts. These protocols should ensure that systems adhering to them are safe beyond a reasonable doubt.

This letter to AI labs follows a number of essays and opinions suggesting that maybe we are going too fast and should show restraint. This comes in the face of the explosive interest in large language models after ChatGPT.

  • Gary Marcus wrote an essay in his substack on “AI risk ≠ AGI risk” arguing that just because we don’t have AGI doesn’t mean there isn’t risk associated with the Mediocre AI systems we do have.
  • Yuval Noah Harari has an opinion in the New York Times with the title, “You Can Have the Blue Pill or the Red Pill, and We’re Out of Blue Pills” where he talks about the dangers of AIs manipulating culture.

We have summoned an alien intelligence. We don’t know much about it, except that it is extremely powerful and offers us bedazzling gifts but could also hack the foundations of our civilization. We call upon world leaders to respond to this moment at the level of challenge it presents. The first step is to buy time to upgrade our 19th-century institutions for a post-A.I. world and to learn to master A.I. before it masters us.

It is worth wondering whether the letter will have an effect, and if it doesn’t, why we can’t collectively slow down and safely explore AI.

ChatGPT: Chatbots can help us rediscover the rich history of dialogue

The rise of AI chatbots provides an opportunity to expand the ways we do philosophy and research, and how we engage in intellectual discourse.

I published an article in The Conversation today on, ChatGPT: Chatbots can help us rediscover the rich history of dialogue. This touches on a topic that I’ve been thinking about a lot … how chatbots are dialogue machines and how we can draw on the long history of dialogue in philosophy to understand the limits/potential of chatbots like ChatGPT.


ChatGPT listed as author on research papers: many scientists disapprove

The editors-in-chief of Nature and Science told Nature’s news team that ChatGPT doesn’t meet the standard for authorship. “An attribution of authorship carries with it accountability for the work, which cannot be effectively applied to LLMs,” says Magdalena Skipper, editor-in-chief of Nature in London. Authors using LLMs in any way while developing a paper should document their use in the methods or acknowledgements sections, if appropriate, she says.

We are beginning to see interesting ethical issues crop up regarding the new LLMs (Large Language Models) like ChatGPT. For example, Nature has a news article, ChatGPT listed as author on research papers: many scientists disapprove.

It makes sense to document use, but why would we document use of ChatGPT and not, for example, use of a library or of a research tool like Google Scholar? What about the use of ChatGPT demands that it be acknowledged?

Fuck the Poetry Police: On the Index of Major Literary Prizes in the United States

The LARB has a nice essay by Dan Sinykin on how researchers have used data to track how poetry prizes are distributed unequally titled, Fuck the Poetry Police: On the Index of Major Literary Prizes in the United States. The essay talks about the creation of the Post45 Data Collective which provides peer review for post-1945 cultural datasets.

Sinykin describes this as an “act as groundbreaking as the research itself,” which seems a bit of an exaggeration. It is important that data is being reviewed and published, but this has been happening for a while in other fields. Nonetheless, this is a welcome initiative, especially if it gets attention like the LARB article. In 2013 the Tri-Council (of research agencies in Canada) called for a culture of research data stewardship. In 2015 I worked with Sonja Sapach and Catherine Middleton on a report on a Data Management Plan Recommendation for Social Science and Humanities Funding Agencies. That report focused on the front end, requiring plans from people submitting grant proposals that ask for funding for data-driven projects, so that the resulting data could be made available for future research.

Sinykin’s essay looks at the poetry publishing culture in the US and how white it is. He shows how data can be used to study inequalities. We also need to ask about the privilege of English poetry and that of culture from the Global North. Not to mention research and research infrastructure.

Why scientists are building AI avatars of the dead | WIRED Middle East

Advances in AI and humanoid robotics have brought us to the threshold of a new kind of capability: creating lifelike digital renditions of the deceased.

Wired Magazine has a nice article about Why scientists are building AI avatars of the dead. The article talks about digital twin technology designed to create an avatar of a particular person that could serve as a family companion. You could have your grandfather modelled so that you could talk to him and hear his stories after he has passed.

The article also talks about the importance of the body and ideas about modelling personas with bodies. Imagine wearing motion trackers and other sensors so that your bodily presence could be modelled. Then imagine your digital twin being instantiated in a robot.

Needless to say, we aren’t anywhere close yet. See this spoof video of the robot Sophia on a date with Will Smith. There are nonetheless issues about the legalities and ethics of creating bots based on people. What if one didn’t have permission from the original? Is it ethical to create a bot modelled on a historical person? A living person?

We routinely animate other people in novels, dialogue (of the dead), and in conversation. Is impersonating someone so wrong? Should people be able to control their name and likeness under all circumstances?

Then there are the possibilities for manipulating a digital twin, or for manipulating people through such a twin.

As for the issue of data breaches, digital resurrection opens up a whole new can of worms. “You may share all of your feelings, your intimate details,” Hickok says. “So there’s the prospect of malicious intent—if I had access to your bot and was able to talk to you through it, I could change your attitude about things or nudge you toward certain actions, say things your loved one never would have said.”


How AI image generators work, like DALL-E, Lensa and stable diffusion

Use our simulator to learn how AI generates images from “noise.”

The Washington Post has a nice explainer on how text-to-image generators work: How AI image generators work, like DALL-E, Lensa and stable diffusion. They let you play with the generator, though you have to stick with the predefined phrases. What I hadn’t realized was the role of static noise in the diffusion model. As I understand it, the model is trained to recognize and remove noise from images, and generation then works by starting from pure noise and progressively denoising it.
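To make the role of noise concrete, here is a minimal, hypothetical sketch of the forward (noising) side of a diffusion model. The trained network learns to predict the noise that was added; generation then runs the process in reverse, from pure static back toward an image:

```python
import numpy as np

rng = np.random.default_rng(0)

# Forward ("diffusion") process: progressively mix an image with Gaussian noise.
# The alphas control how much of the original signal survives at each step.
T = 10
alphas = np.linspace(0.99, 0.80, T)
alpha_bar = np.cumprod(alphas)  # cumulative signal fraction after t steps

def add_noise(x0, t):
    """Return the image after t noising steps, plus the noise that was added."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps
    return xt, eps

x0 = rng.standard_normal((8, 8))   # stand-in for a tiny 8x8 "image"
xt, eps = add_noise(x0, T - 1)     # heavily noised version of x0

# Training: a network is shown (xt, t) and learns to predict eps.
# Generation: start from pure noise and repeatedly subtract the
# predicted noise, stepping back toward a clean image.
```

The alpha schedule and image size here are toy values for illustration; real systems like Stable Diffusion use learned networks and many more steps, but the noise-then-denoise structure is the same.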