JSTOR Text Analyzer

JSTOR, and some other publishers of electronic research, have started building text analysis tools into their publishing platforms. I came across this at the end of a JSTOR article, where a link to “Get more results on Text Analyzer” leads to a beta of the JSTOR Labs Text Analyzer environment.

JSTOR Labs Text Analyzer

This analyzer environment provides simple analytical tools for surveying a journal issue or article. The emphasis is on extracting keywords and entities so that one can quickly figure out whether an article or journal is useful. One can also use it to find similar items.
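
JSTOR hasn’t published the details of its pipeline, but the basic moves of keyword and entity extraction are easy to sketch. Here is a minimal illustration using spaCy (my choice of library, not JSTOR’s); the model name and the noun-chunk heuristic for keywords are assumptions for the example:

```python
# A minimal sketch of keyword and entity extraction, in the spirit of
# tools like Text Analyzer. Uses spaCy (pip install spacy) and its small
# English model (python -m spacy download en_core_web_sm); both choices
# are illustrative, not what JSTOR actually uses.
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")

text = """Greene, Hoffmann, and Stark use frame analysis to examine
recent values statements endorsing ethical design for artificial
intelligence and machine learning."""

doc = nlp(text)

# Named entities: people, organizations, places, and so on.
for ent in doc.ents:
    print(ent.text, ent.label_)

# Crude keyword extraction: the most frequent noun chunks.
keywords = Counter(chunk.text.lower() for chunk in doc.noun_chunks)
print(keywords.most_common(5))
```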

Results of Text Analyzer

What intrigues me is this embedding of tools into reading environments which is different from the standard separate data and tools model. I wonder how we could instrument Voyant so that it could be more easily embedded in other environments.

Racism, misogyny, death threats: Why can’t the booming video-game industry curb toxicity? – Silicon Valley

Silicon Valley is reprinting a story from the Washington Post, Racism, misogyny, death threats: Why can’t the booming video-game industry curb toxicity? The story is one more account of how nasty online gaming can be: the usual companies try to reduce the toxicity of game culture and don’t really succeed. So are we left to just ignore it?

With no clear methods to effectively monitor, halt or eliminate toxic behavior, many in the gaming community have simply tried to ignore it and continue playing anyway. Many of the titles cited most for toxic players remain the industry’s most popular.

A Critical Assessment of the Movement for Ethical Artificial Intelligence and Machine Learning

Greene, Hoffmann, and Stark have written a much-needed conference paper on Better, Nicer, Clearer, Fairer: A Critical Assessment of the Movement for Ethical Artificial Intelligence and Machine Learning (PDF) for the Hawaii International Conference on System Sciences in Maui, HI. They look at a number of the important ethics statements/declarations out there and try to understand their “moral background.” Here is the abstract:

This paper uses frame analysis to examine recent high-profile values statements endorsing ethical design for artificial intelligence and machine learning (AI/ML). Guided by insights from values in design and the sociology of business ethics, we uncover the grounding assumptions and terms of debate that make some conversations about ethical design possible while forestalling alternative visions. Vision statements for ethical AI/ML co-opt the language of some critics, folding them into a limited, technologically deterministic, expert-driven view of what ethical AI/ML means and how it might work.

I get the feeling that various outfits (of experts) are trying to define what ethics in AI/ML is rather than engaging in a dialogue. There is a rush to be the expert on ethics. Perhaps we should imagine a different way of developing an ethical consensus.

For that matter, is there room for critical positions? What would it mean to call for a stop to all research into AI/ML as unethical until proven otherwise? Is that even thinkable? Can we imagine another way the discourse of ethics might play out?

This article is a great start.

The Secret History of Women in Coding

Computer programming once had much better gender balance than it does today. What went wrong?

The New York Times has a nice long article on The Secret History of Women in Coding. We know a lot of the story from books like Campbell-Kelly’s From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry (2003), Chang’s Brotopia (2018), and Rankin’s A People’s History of Computing in the United States (2018).

The history is not the heroic story of personal computing that I was raised on. It is a story of how women were driven out of computing (both the academy and businesses) starting in the 1960s.

A group of us at the U of Alberta are working on archiving the work of Sally Sedelow, one of the forgotten pioneers of humanities computing. Dr. Sedelow got her PhD in English in 1960 and did important early work on text analysis systems.

Applying an Ethics of Care to Internet Research: Gamergate and Digital Humanities

Thanks to Todd Suomela’s lead, we just published an article on Applying an Ethics of Care to Internet Research: Gamergate and Digital Humanities in Digital Studies. This article is a companion to an article I wrote with Bettina Berendt on Information Wants to Be Free, Or Does It? We and others are exploring the Ethics of Care as a different way of thinking about the ethics of digital humanities research.

Peter Robinson, “Textual Communities: A Platform for Collaborative Scholarship on Manuscript Heritages”

Peter Robinson gave a talk on “Textual Communities: A Platform for Collaborative Scholarship on Manuscript Heritages” as part of the Faculty of Arts’ Singhmar Guest Speaker Program.

He started by talking about whether textual traditions had any relationship to the material world. How do texts relate to each other?

Today, stemmata as visualizations are models that go beyond the manuscripts themselves to propose evolutionary hypotheses in visual form.

He then showed what he is doing with the Canterbury Tales Project and talked about the challenges of adapting the time-consuming transcription process to other manuscripts. There are lots of different transcription systems, but few that handle collation. There is also the problem of costs and of involving a distributed network of people.

He then defined text:

A text is an act of (human) communication that is inscribed in a document.

I wondered how he would deal with Allen Renear’s argument that there are Real Abstract Objects which, like Platonic Forms, are real but have no material instance. When we talk, for example, of “Hamlet” we aren’t talking about a particular instance, but an abstract object. Likewise with things like “justice,” “history,” and “love.” Peter responded that the work doesn’t exist except as its instances.

He also mentioned that this is why stand-off markup doesn’t work: texts aren’t a set of linear objects. It is better to represent a text as a tree of leaves.

So, he launched Textual Communities – https://textualcommunities.org/

This is a distributed editing system that also has collation.
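
Collation here means aligning the readings of multiple witnesses word by word so that variants stand out. The collation machinery in Textual Communities is more sophisticated (systems in this space typically use CollateX-style alignment algorithms), but a toy Python sketch with difflib conveys the idea; the two witness readings below are invented for the example:

```python
# A toy illustration of collation: aligning two manuscript witnesses
# word by word to surface variant readings. Real collation engines
# (e.g. CollateX) are far more sophisticated; the witness texts here
# are invented for the example.
from difflib import SequenceMatcher

witness_a = "Whan that Aprille with his shoures soote".split()
witness_b = "Whan that Aprill with hise shoures sote".split()

matcher = SequenceMatcher(None, witness_a, witness_b)
for tag, a1, a2, b1, b2 in matcher.get_opcodes():
    if tag == "equal":
        print("agree:  ", " ".join(witness_a[a1:a2]))
    else:
        print("variant:", " ".join(witness_a[a1:a2]),
              "|", " ".join(witness_b[b1:b2]))
```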

Finding Lena Forsen, the Patron Saint of JPEGs | WIRED

In 1972, a photo of a Swedish Playboy model was used to engineer the digital image format that would become the JPEG. The model herself was mostly a mystery—until now.

Wired has another story on Finding Lena Forsen, the Patron Saint of JPEGs. This is not, however, the first time her story has been told. I blogged about the use of the Lena image back in 2004. It seems like this story will be rediscovered every decade.

What has changed is that people are now calling out the casual sexism of tech culture. An example is Chang’s book Brotopia, which starts with the Lena story.

Word2Vec Vis of Pride and Prejudice

Paolo showed me a neat demonstration, Word2Vec Vis of Pride and Prejudice. Lynn Cherny trained a Word2Vec model using Jane Austen’s novels and then used it to find close matches for key words. She then shows the text of a novel with those words replaced by their closest matches in the language of Austen. It serves as a sort of demonstration of how Word2Vec works.
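
Cherny’s own pipeline isn’t described in detail, but the core moves are easy to sketch with gensim (my choice of library; the tokenized corpus and the word list below are assumptions for the example):

```python
# A minimal sketch of the Word2Vec substitution idea: train a model on
# Austen's prose, then swap selected words for their nearest neighbour
# in Austen's vocabulary. Gensim is my choice here; Cherny's actual
# tooling and parameters are not documented.
from gensim.models import Word2Vec

# Assume austen_sentences is a list of tokenized sentences, e.g. loaded
# from plain-text novels and split into lowercase word lists.
austen_sentences = [
    ["it", "is", "a", "truth", "universally", "acknowledged"],
    # ... many more sentences in practice
]

model = Word2Vec(austen_sentences, vector_size=100, window=5, min_count=2)

def austenify(tokens):
    """Replace each known word with its closest match in the model."""
    out = []
    for word in tokens:
        if word in model.wv:
            nearest, _score = model.wv.most_similar(word, topn=1)[0]
            out.append(nearest)
        else:
            out.append(word)  # keep words the model has never seen
    return out

print(" ".join(austenify(["a", "truth", "universally", "acknowledged"])))
```

Because most_similar excludes the query word itself, every known word genuinely gets swapped for a neighbour, which is what produces the uncanny Austen-flavoured paraphrase.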

Cybersecurity

The New York Times has a nice short video on cybersecurity, which is increasingly an issue. One of the things they mention is how it was the USA and Israel that may have opened the Pandora’s box of cyberweapons when they used Stuxnet to damage Iran’s nuclear programme. By using a sophisticated worm first, they both legitimized the use of cyberwar against countries one is not at war with and showed what could be done. This, at least, is the argument of a good book on Stuxnet, Countdown to Zero Day.

Now the problem is that the USA, while having good offensive capability, is also one of the most vulnerable countries because of the heavy use of information technology in all walks of life. How can we defend against the weapons we have let loose?

What is particularly worrisome is that cyberweapons are being designed to be hard to trace and subtly disruptive in ways that fall short of all-out war. We are seeing a new form of hot/cold war in which countries harass each other electronically without actually declaring war and getting civilian input. After 2016, all democratic countries need to protect against electoral disruption, which puts democracies at a disadvantage relative to closed societies.

Making AI accountable easier said than done, says U of A expert

Geoff McMaster of the Folio (the U of A’s news site) wrote a nice article, Making AI accountable easier said than done, says U of A expert. The article quotes me on accountability and artificial intelligence. What we didn’t really talk about are forms of accountability for automata, including:

  • Explainability – Can someone get an explanation as to how and why an AI made a decision that affects them? If people can get an explanation that they can understand then they can presumably take remedial action and hold someone or some organization accountable.
  • Transparency – Is an automated decision making process fully transparent so that it can be tested, studied and critiqued? Transparency is often seen as a higher bar for an AI to meet than explainability.
  • Responsibility – This is the old computer ethics question that focuses on who can be held responsible if a computer or AI harms someone. Who or what is held to account?

In all these cases there is a presumption of process, both to determine transparency/responsibility and then to punish or correct for problems. Otherwise people will have no real recourse.