Deepfakes and Epistemic Degeneration

Two deepfake images of the pileup of cars.

There are a number of deepfake images of the 100-car pileup on the highway between Calgary and Airdrie on the 17th. You can see some, with discussion, at CMcalgary. These deepfakes raise a number of issues:

  • How would you know it is a deepfake? Do we really have to examine images like this closely to make sure they aren’t fake?
  • Given the proliferation of deepfake images and videos, does anyone believe photos any more? We are in a moment of epistemic transition from generally believing photographs and videos to no longer trusting anything. We have to develop new ways of determining the truth of photographic evidence presented to us. We need to check whether the photograph makes sense; question the authority of whoever shared it; check against other sources; and check authoritative news sources.
  • Liar’s dividend – given the proliferation of deepfakes, public figures can claim anything is fake news in order to avoid accountability. In an environment where no one knows what is true, bullshit reigns and people don’t feel they have to believe anything. Instead of the pursuit of truth we all just follow what fits our preconceptions. An example of this is what happened in 2019 when the New Year’s message from President Ali Bongo of Gabon was not believed because it looked fake, leading to an attempted coup.
  • It’s all about attention. We love to look at disaster images, so the way to get attention is to generate and share them, even if they are fabricated. On some platforms you are even rewarded for attention.
  • Trauma is entertaining. We love to look at the trauma of others. Again, generating images of an event like the pileup we heard about is a way to get the attention of those looking for images of the trauma.
  • Even when people suspect the images are fake they can provide a “where’s Waldo” sort of entertainment where we comb them for evidence of the fakery.
Pileup with Container Ship
  • Deepfakes then generate more deepfakes, and eventually people start responding with ironic deepfakes, like one where a container ship is beached across the highway, causing the pileup.
  • Eventually there may be legal ramifications. People may try to use fake images for insurance claims, and insurance companies may then refuse photographs as evidence for a claim. People may treat a fake image as a form of identity theft if it portrays them or identifiable information like a license plate.

 

MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs

Vinay Prabhu, chief scientist at UnifyID, a privacy startup in Silicon Valley, and Abeba Birhane, a PhD candidate at University College Dublin in Ireland, pored over the MIT database and discovered thousands of images labelled with racist slurs for Black and Asian people, and derogatory terms used to describe women. They revealed their findings in a paper undergoing peer review for the 2021 Workshop on Applications of Computer Vision conference.

Another one of those “what were they thinking when they created the dataset” stories from The Register tells how MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs. The MIT Tiny Images dataset was created automatically by scripts that drew on the WordNet database of terms, which itself holds derogatory terms. Nobody thought to check either the terms taken from WordNet or the resulting images scoured from the net. As a result there are not only lots of images for which permission was not secured, but also racist, sexist, and otherwise derogatory labels on the images, which in turn means that if you train an AI on these it will generate racist/sexist results.
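To make the failure concrete, here is a minimal sketch of the kind of vetting step that was missing: screening the WordNet noun labels against a human-curated blocklist before any images are scraped for them. The blocklist file and its format are my own illustrative assumptions, not anything from the Tiny Images pipeline.

```python
# A minimal sketch of a vetting step: screen WordNet noun labels against a
# curated blocklist *before* scraping images for them. The blocklist file
# "offensive_terms.txt" is a hypothetical example for illustration.
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

def load_blocklist(path="offensive_terms.txt"):
    """Load a human-curated list of slurs and derogatory terms (hypothetical file)."""
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

def safe_labels(blocklist):
    """Yield WordNet noun synsets whose lemmas all pass the blocklist check."""
    for synset in wn.all_synsets(pos="n"):
        lemmas = [l.lower().replace("_", " ") for l in synset.lemma_names()]
        if not any(lemma in blocklist for lemma in lemmas):
            yield synset.name(), lemmas

# Only labels that survive this screen would be used as search terms for
# image scraping; flagged synsets would go to a human reviewer instead.
```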

The article also mentions a general problem with academic datasets. Companies like Facebook can afford to hire actors to pose for images and can thus secure permission to use the images for training. Academic datasets (and some commercial ones like the Clearview AI database) tend to be scraped and therefore will not have the explicit permission of the copyright holders or the people shown. In effect, academics are resorting to mass surveillance to generate training sets. One wonders if we could crowdsource a training set by and for people.

Facebook to Pay $550 Million to Settle Facial Recognition Suit

It was another black mark on the privacy record of the social network, which also reported its quarterly earnings.

The New York Times has a story, Facebook to Pay $550 Million to Settle Facial Recognition Suit (Natasha Singer and Mike Isaac, Jan. 29, 2020). The Illinois case has to do with Facebook’s face recognition technology, part of Tag Suggestions, which would suggest names for people in photos. Apparently in Illinois it is illegal to harvest biometric data without consent. The Biometric Information Privacy Act (BIPA), passed in 2008, “guards against the unlawful collection and storing of biometric information.” (Wikipedia entry)

BIPA suggests a possible answer to the question of what is unethical about face recognition. While I realize that a law is not ethics (and vice versa), BIPA hints at one of the ways we can try to unpack the ethics of face recognition. The position suggested by BIPA would go something like this:

  • Face recognition is dependent on biometric data which is extracted from an image or some other form of scan.
  • To collect and store biometric data one needs the consent of the person whose data is collected.
  • The data has to be securely stored.
  • The data has to be destroyed in a timely manner.
  • If there is consent, secure storage, and timely deletion of the data, then the system/service can be said to not be unethical.

There are a number of interesting points to be made about this position. First, it is not the gathering, storing and providing access to images of people that is at issue. Face recognition is an ethical issue because biometric data about a person is being extracted, stored and used. Thus Google Image Search is not an issue, as it stores data about whole images, while Facebook stores information about the faces of individual people (along with associated information).

This raises issues about the nature of biometric data. What is the line between a portrait (image) and biometric information? Would gathering biographical data about a person become biometric at some point if it contained descriptions of their person?

Second, my reading is that a service like Clearview AI could also be sued if they scrape images of people in Illinois and extract biometric data. This could provide an answer to the question of what is ethically wrong about the Clearview AI service. (See my previous blog entry on this.)

Third, I think there is a missing further condition that should be specified, namely that the company gathering the biometric data should identify the purpose for which it is gathering the data when seeking consent and limit its use of the data to the identified purposes. When it no longer needs the data for the identified use, it should destroy it. This is essentially part of the PIPA principle of Limiting Use, Disclosure and Retention. It is assumed that if one is to delete data in a timely fashion there will be some usage criteria that determine timeliness, but that isn’t always the case. Sometimes it is just the passage of time.

Of course, the value of data mining is often in the unanticipated uses of data like biometric data. Unanticipated uses are, by definition, not uses that were identified when seeking consent, unless the consent was so broad as to be meaningless.
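Putting the BIPA-style conditions listed above together with the purpose-limitation condition, the position can be read as a simple decision procedure. The sketch below is a toy model of that reading; the field names and checks are my own illustrative assumptions, not anything drawn from the statute.

```python
# A toy model of the BIPA-style position above, with the purpose-limitation
# condition folded in. Field names and checks are illustrative assumptions.
from dataclasses import dataclass
from datetime import date

@dataclass
class BiometricRecord:
    consent_given: bool      # explicit consent from the data subject
    stated_purpose: str      # purpose identified when consent was sought
    current_use: str         # what the data is actually being used for
    stored_encrypted: bool   # stand-in for "securely stored"
    delete_by: date          # scheduled destruction date

def collection_looks_unethical(rec: BiometricRecord, today: date) -> bool:
    """Return True if any condition in the position above is violated."""
    return (
        not rec.consent_given
        or not rec.stored_encrypted
        or today > rec.delete_by                  # not destroyed in a timely manner
        or rec.current_use != rec.stated_purpose  # used beyond the identified purpose
    )
```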

No doubt more issues will occur to me.

The Secretive Company That Might End Privacy as We Know It

“I’ve come to the conclusion that because information constantly increases, there’s never going to be privacy,” Mr. Scalzo said. “Laws have to determine what’s legal, but you can’t ban technology. Sure, that might lead to a dystopian future or something, but you can’t ban it.”

The New York Times has an important story about Clearview AI, The Secretive Company That Might End Privacy as We Know It. Clearview, which is partly funded by Peter Thiel, scraped a number of social media sites for pictures of people and has developed an AI application to which you can upload a picture; it tries to recognize the person and show you their social media trail. They are then selling the service to police forces.

Needless to say, this is a disturbing use of face recognition for surveillance using our own social media. They are using public images that any of us could look at, but at a scale no person could handle. They are doing something that would be almost impossible to stop, even with legislation. What’s to stop the intelligence services of another country from doing this (and more)? Perhaps privacy is no longer possible.


Watch Andy Warhol “Paint” On A Commodore Computer: Gothamist

Eric Hayot at the Novel Worlds conference showed a slide with an image of Debbie Harry of Blondie painted on the Amiga by Andy Warhol. There is a video of Warhol creating the image at the premiere of the Commodore Amiga.

This is discussed in a documentary, The Invisible Photograph: Part 2 (Trapped). The documentary also talks about recovering other images from Warhol’s original Amiga, which was preserved by The Andy Warhol Museum.

Technologizer has a nice retrospective on the Amiga, Amiga: 25 Years Later. I remember when it came out in 1985. I had a Mac by then, but was intrigued by the colour Amiga and the video work people were doing with it.

‘Photo Archives Are Sleeping Beauties.’ Pharos Is Their Prince

Pharos is an effort among 14 institutions to create a database that will eventually hold and make accessible 22 million images of artworks.

The New York Times has a story about a collaboration to develop the Pharos consortium photo archive, ‘Photo Archives Are Sleeping Beauties.’ Pharos Is Their Prince. The consortium has a number of interesting initiatives they are implementing in Pharos:

  • They are applying the CIDOC Conceptual Reference Model (a rough sketch of what a CRM-style record can look like follows this list).

The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.

  • They have a visual search (which doesn’t seem to find anything at the moment).
  • They are looking at Research Space (which uses CRM) for a research linked data environment.
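Since the consortium is building on CIDOC CRM and linked data tooling like Research Space, here is a rough sketch of what a CRM-style description of an archival photograph might look like in RDF. This is not Pharos data: the identifiers are made up, and the class and property names follow one common RDF encoding of the CRM, so treat the specifics as assumptions.

```python
# A rough sketch (not Pharos's actual data) of describing a photograph of an
# artwork with CIDOC CRM classes in RDF. Identifiers under example.org are
# hypothetical; class/property names follow one common CRM RDF encoding.
from rdflib import Graph, Literal, Namespace, RDF

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
EX = Namespace("http://example.org/pharos/")  # hypothetical identifiers

g = Graph()
g.bind("crm", CRM)

artwork = EX["artwork/birth-of-venus"]
photo = EX["photo/birth-of-venus-001"]

g.add((artwork, RDF.type, CRM["E22_Man-Made_Object"]))  # the painting itself
g.add((photo, RDF.type, CRM["E38_Image"]))              # the archival photograph
g.add((photo, CRM["P138_represents"], artwork))         # the photo depicts the artwork
g.add((artwork, CRM["P102_has_title"], Literal("The Birth of Venus")))

print(g.serialize(format="turtle"))
```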