Facebook to Pay $550 Million to Settle Facial Recognition Suit

It was another black mark on the privacy record of the social network, which also reported its quarterly earnings.

The New York Times has a story on how Facebook is to Pay $550 Million to Settle Facial Recognition Suit (Natasha Singer and Mike Isaac, Jan. 29, 2020). The Illinois case has to do with Facebook’s face recognition technology, part of the Tag Suggestions feature, which would suggest names for people in photos. Apparently in Illinois it is illegal to harvest biometric data without consent. The Biometric Information Privacy Act (BIPA), passed in 2008, “guards against the unlawful collection and storing of biometric information.” (Wikipedia entry)

BIPA suggests a possible answer to the question of what is unethical about face recognition. While I realize that a law is not ethics (and vice versa) BIPA hints at one of the ways we can try to unpack the ethics of face recognition. The position suggested by BIPA would go something like this:

  • Face recognition is dependent on biometric data which is extracted from an image or some other form of scan.
  • To collect and store biometric data one needs the consent of the person whose data is collected.
  • The data has to be securely stored.
  • The data has to be destroyed in a timely manner.
  • If there is consent, secure storage, and timely deletion of the data, then the system/service can be said to not be unethical.
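The conditions above read almost like a specification, so here is a minimal sketch of what a BIPA-compliant biometric store might look like in code. This is my own toy illustration, not anything Facebook or the Act actually prescribes; all the names (`BiometricStore`, `collect`, `purge_expired`) are hypothetical, and hashing stands in for real secure storage.

```python
import hashlib
import time

class BiometricStore:
    """Toy sketch of BIPA-style conditions: consent before collection,
    use limited to the consented purpose, and timely deletion.
    All names here are hypothetical illustrations, not a real API."""

    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self._records = {}  # person_id -> (stored_at, purpose, template digest)

    def collect(self, person_id: str, face_template: bytes,
                purpose: str, consent: bool) -> None:
        if not consent:
            raise PermissionError("biometric data requires consent")
        # Hashing stands in for "secure storage" in this sketch.
        digest = hashlib.sha256(face_template).hexdigest()
        self._records[person_id] = (time.time(), purpose, digest)

    def use(self, person_id: str, purpose: str) -> str:
        stored_at, stored_purpose, digest = self._records[person_id]
        if purpose != stored_purpose:
            raise PermissionError("use is limited to the consented purpose")
        return digest

    def purge_expired(self) -> None:
        # "Timely deletion": drop anything past its retention window.
        now = time.time()
        self._records = {pid: rec for pid, rec in self._records.items()
                         if now - rec[0] < self.retention_seconds}

store = BiometricStore(retention_seconds=60)
store.collect("alice", b"...face template bytes...",
              purpose="tag suggestions", consent=True)
print(store.use("alice", "tag suggestions")[:12])
```

Even this toy version shows where the hard questions live: who decides the retention window, and who checks that the stated purpose matches the actual use.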

There are a number of interesting points to be made about this position. First, it is not the gathering, storing and providing access to images of people that is at issue. Face recognition is an ethical issue because biometric data about a person is being extracted, stored and used. Thus Google Image Search is not an issue as they are storing data about whole images while FB stores information about the faces of individual people (along with associated information).

This raises issues about the nature of biometric data. What is the line between a portrait (image) and biometric information? Would gathering biographical data about a person become biometric at some point if it contained descriptions of their person?

Second, my reading is that a service like Clearview AI could also be sued if they scrape images of people in Illinois and extract biometric data. This could provide an answer to the question of what is ethically wrong about the Clearview AI service. (See my previous blog entry on this.)

Third, I think there is a missing further condition that should be specified, namely that the company gathering the biometric data should identify the purpose for which they are gathering it when seeking consent and limit their use of the data to the identified uses. When they no longer need the data for the identified use, they should destroy it. This is essentially part of the PIPA principle of Limiting Use, Disclosure and Retention. It is assumed that if one is to delete data in a timely fashion there will be some usage criteria that determine timeliness, but that isn’t always the case. Sometimes it is just the passage of time.

Of course, the value of data mining is often in the unanticipated uses of data like biometric data. Unanticipated uses are, by definition, not uses that were identified when seeking consent, unless the consent was so broad as to be meaningless.

No doubt more issues will occur to me.

The Secretive Company That Might End Privacy as We Know It

“I’ve come to the conclusion that because information constantly increases, there’s never going to be privacy,” Mr. Scalzo said. “Laws have to determine what’s legal, but you can’t ban technology. Sure, that might lead to a dystopian future or something, but you can’t ban it.”

The New York Times has an important story about Clearview AI, The Secretive Company That Might End Privacy as We Know It. Clearview, which is partly funded by Peter Thiel, scraped a number of social media sites for pictures of people and has developed an AI application that you can upload a picture to and it tries to recognize the person and show you their social media trail. They are then selling the service to police forces.

Needless to say, this is a disturbing use of face recognition for surveillance using our own social media. They are using public images that any one of us could look at, but at a scale no person could handle. They are doing something that would be almost impossible to stop, even with legislation. What’s to stop the intelligence services of another country doing this (and more)? Perhaps privacy is no longer possible.

Continue reading The Secretive Company That Might End Privacy as We Know It

ParityBOT: Twitter bot

ParityBOT is a chatbot developed here in Edmonton that tweets positive things about women in politics in response to hateful tweets. It sends empowering messages.

You can read about it in a CBC story, Engineered-in-Edmonton Twitter bot combats misogyny on the campaign trail.

The bot follows all women candidates in the election and uses some sort of AI or sentiment detection to identify nasty tweets aimed at them, then responds with a positive message drawn from a collection crowdsourced from the public. What isn’t clear is whether the positive message is sent to the offending tweeter or just posted generally.

ParityBOT was developed by ParityYEG which is a collaboration between the Alberta Machine Intelligence Institute and scientist Kory Mathewson.
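The mechanism described above can be sketched in a few lines: score a tweet for toxicity, and if it crosses a threshold, post a message from a crowdsourced pool. This is purely my own guess at the shape of the system; ParityBOT’s actual implementation isn’t described in the story, and the word-list scorer below stands in for whatever ML-based detection they really use.

```python
import random
from typing import Optional

# Toy negativity lexicon; a stand-in for the bot's real (unknown to me)
# machine-learning sentiment or toxicity model.
NEGATIVE_WORDS = {"hate", "stupid", "ugly", "liar"}

# Stand-ins for the collection of messages crowdsourced from the public.
POSITIVE_MESSAGES = [
    "Women in politics make our democracy stronger.",
    "Thank you to every woman who puts her name on a ballot.",
]

def toxicity_score(tweet: str) -> float:
    """Fraction of words that appear in the negativity lexicon."""
    words = tweet.lower().split()
    if not words:
        return 0.0
    return sum(w.strip(".,!?") in NEGATIVE_WORDS for w in words) / len(words)

def respond(tweet: str, threshold: float = 0.2) -> Optional[str]:
    """Return a positive message if the tweet looks hateful, else None."""
    if toxicity_score(tweet) >= threshold:
        return random.choice(POSITIVE_MESSAGES)
    return None

print(respond("You are a stupid liar and I hate you"))
```

Note that this sketch sidesteps the very question the story leaves open: whether the reply goes to the offender or is just posted to the bot’s own timeline.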

Incels, Pickup Artists, and the World of Men’s Seduction Training

On Monday, April 23rd, a 25-year old man named Alek Minassian drove a rented van down a sidewalk in Toronto, killing eight women and two men. The attack was reminiscent of recent Islamist terror attacks in New York, London, Stockholm, Nice, and Berlin. Just before his massacre, he posted a note on Facebook announcing: “Private (Recruit) Minassian Infantry 00010, wishing to speak to Sgt 4chan please. C23249161, the Incel Rebellion has already begun! We will overthrow all the Chads and Stacys! All hail the Supreme Gentleman Elliot Rodger!”

Anders Wallace has published an essay in 3 Quarks Daily on Incels, Pickup Artists, and the World of Men’s Seduction Training that starts with the recent attack in Toronto by the self-styled “incel” Minassian, who adapted a terror tactic, and then moves on to seduction training. Wallace has participated in seduction training and immersed himself in the “manosphere” which he defines thus:

The manosphere is a digital ecosystem of blogs, podcasts, online forums, and hidden groups on sites like Facebook and Tumblr. Here you’ll find a motley crew of men’s rights activists, white supremacists, conspiracy theorists, angry divorcees, disgruntled dads, male victims of abuse, self-improvement junkies, bodybuilders, bored gamers, alt-righters, pickup artists, and alienated teenagers. What they share is a vicious response to feminists (often dubbed “feminazis”) and so-called “social justice warriors.” They blame their anger on identity politics, affirmative action, and the neoliberal state, which they perceive are compromising equality and oppressing their own free speech.

The essay doesn’t provide easy answers, though one can find temptations (like the idea that these incels are men who were undermothered); instead it nicely surveys the loose network of ideas, resentments and desires that animate the manosphere. What stands out is the lack of alternative models of heterosexual masculinity. Too many of the mainstream role models we are presented with (from sports to media role models to superheroes) reinforce the characteristics incels want training in, from stoicism to aggression.

After the Facebook scandal it’s time to base the digital economy on public v private ownership of data

In a nutshell, instead of letting Facebook get away with charging us for its services or continuing to exploit our data for advertising, we must find a way to get companies like Facebook to pay for accessing our data – conceptualised, for the most part, as something we own in common, not as something we own as individuals.

Evgeny Morozov has a great essay in The Guardian on how After the Facebook scandal it’s time to base the digital economy on public v private ownership of data. He argues that better data protection is not enough. We need “to articulate a truly decentralised, emancipatory politics, whereby the institutions of the state (from the national to the municipal level) will be deployed to recognise, create, and foster the creation of social rights to data.” In Alberta that may start with a centralized clinical information system called Connect Care managed by the Province. The Province will presumably limit access to our data to those researchers and health-care practitioners who commit to using it appropriately. Can we imagine a model where Connect Care is expanded to include social data that we can then control and give others (businesses) access to?

More on Cambridge Analytica

More stories are coming out about Cambridge Analytica and the scraping of Facebook data. The Guardian has some important new articles:

Perhaps the most interesting article is in The Conversation and argues that Claims about Cambridge Analytica’s role in Africa should be taken with a pinch of salt. The article carefully sets out evidence that CA didn’t have the effect they were hired to have in either the Nigerian election (when they failed to get Goodluck Jonathan re-elected) or the Kenyan election, where they may have helped Uhuru Kenyatta stay in power. The authors (Gabrielle Lynch, Justin Willis, and Nic Cheeseman) talk about how,

Ahead of the elections, and as part of a comparative research project on elections in Africa, we set up multiple profiles on Facebook to track social media and political adverts, and found no evidence that different messages were directed at different voters. Instead, a consistent negative line was pushed on all profiles, no matter what their background.

They also point out that the majority of Kenyans are not on Facebook and that negative advertising has a long history. They conclude that exaggerating what they can do is what CA does.

Mother Jones has another story, one of the best summaries around, Cloak and Data, that questions the effectiveness of Cambridge Analytica when it comes to the Trump election. They point out how CA’s work before in Virginia and for Cruz at the beginning of the primaries doesn’t seem to have worked. They go on to suggest that CA had little to do with the Trump victory which instead was ascribed by Parscale, the head of digital operations, to investing heavily in Facebook advertising.

During an interview with 60 Minutes last fall, Parscale dismissed the company’s psychographic methods: “I just don’t think it works.” Trump’s secret strategy, he said, wasn’t secret at all: The campaign went all-in on Facebook, making full use of the platform’s advertising tools. “Donald Trump won,” Parscale said, “but I think Facebook was the method.”

The irony may be that Cambridge Analytica is brought down by its boasting, not what it actually did. Further irony is how it may bring down Facebook and finally draw attention to how our data is used to manipulate us, even though it didn’t work.

The story of Cambridge Analytica’s rise—and its rapid fall—in some ways parallels the ascendance of the candidate it claims it helped elevate to the presidency. It reached the apex of American politics through a mix of bluffing, luck, failing upward, and—yes—psychological manipulation. Sound familiar?

How Trump Consultants Exploited the Facebook Data of Millions

Cambridge Analytica harvested personal information from a huge swath of the electorate to develop techniques that were later used in the Trump campaign.

The New York Times has just published a story about How Trump Consultants Exploited the Facebook Data of Millions. The story is about how Cambridge Analytica, the US arm of SCL, a UK company, gathered a massive dataset from Facebook with which to do “psychometric modelling” in order to benefit Trump.

The Guardian has been reporting on Cambridge Analytica for some time – see their Cambridge Analytica Files. The service they are supposed to have provided with this massive dataset was to model types of people and their needs/desires/politics and then help political campaigns, like Trump’s, through microtargeting to influence voters. Using the models a campaign can create content tailored to these psychometrically modelled micro-groups to shift their opinions. (See articles by Paul-Olivier Dehaye about what Cambridge Analytica does and has.)

What is new is that there is a (Canadian) whistleblower from Cambridge Analytica, Christopher Wylie, who was willing to talk to the Guardian and others. He is “the data nerd who came in from the cold” and he has a trove of documents that contradict what others said.

The Intercept has an earlier and related story about how Facebook Failed to Protect 30 Million Users From Having Their Data Harvested By Trump Campaign Affiliate. This tells how people were convinced to download a Facebook app that then took their data and that of their friends.

It is difficult to tell how effective the psychometric profiling with data is and whether it can really be used to sway voters. What is clear, however, is that Facebook is not really protecting their users’ data. To some extent they are set up to monetize such psychometric data by convincing those who buy access to the data that it can be used to sway people. The problem is not that it can be done, but that Facebook didn’t get paid for this and is now getting bad press.

Social networks are creating a global crisis of democracy

[N]etworks themselves offer ways in which bad actors – and not only the Russian government – can undermine democracy by disseminating fake news and extreme views. “These social platforms are all invented by very liberal people on the west and east coasts,” said Brad Parscale, Mr. Trump’s digital-media director, in an interview last year. “And we figure out how to use it to push conservative values. I don’t think they thought that would ever happen.” Too right.

The Globe and Mail this weekend had an essay by Niall Ferguson on how Social networks are creating a global crisis of democracy. The article is based on Ferguson’s new book The Square and the Tower: Networks and Power from the Freemasons to Facebook. The article points out that manipulation is not just an American problem, but also points out that the real problem is our dependence on social networks in the first place.

Continue reading Social networks are creating a global crisis of democracy

Common Crawl

The Common Crawl is a project that has been crawling the web and making an open corpus of web data from the last 7 years available for research. Their crawl corpus is petabytes of data and available as WARCs (Web ARChive files). For example, their 2013 dataset is 102TB and has around 2 billion web pages. Their collection is not as complete as the Internet Archive, which goes back much further, but it is available in large datasets for research.
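For researchers who haven’t met the format, a WARC file is just a sequence of text-headed records, each with a `Content-Length` giving the size of the payload that follows. Here is a minimal sketch of a parser, using only the standard library and a toy in-memory record; real Common Crawl files are gzip-compressed and have more header fields, and in practice a library such as warcio handles those details, but the basic record structure is what this shows.

```python
import io

def parse_warc_records(stream):
    """Yield (headers, payload) for each record in a WARC stream.
    Minimal sketch: assumes an uncompressed WARC/1.0 stream where every
    record carries a Content-Length header."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.strip() == b"":
            continue  # skip the blank lines separating records
        assert line.startswith(b"WARC/"), "expected a WARC version line"
        headers = {}
        while True:
            line = stream.readline().rstrip(b"\r\n")
            if not line:
                break  # blank line ends the header block
            key, _, value = line.partition(b":")
            headers[key.decode()] = value.strip().decode()
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload

# Build one toy record in memory and parse it back.
record = (b"WARC/1.0\r\n"
          b"WARC-Type: response\r\n"
          b"WARC-Target-URI: http://example.com/\r\n"
          b"Content-Length: 13\r\n"
          b"\r\n"
          b"Hello, crawl!"
          b"\r\n\r\n")
for headers, payload in parse_warc_records(io.BytesIO(record)):
    print(headers["WARC-Target-URI"], payload.decode())
```

At Common Crawl scale one would stream records like this off the compressed files one at a time rather than loading anything whole, which is what makes petabyte corpora workable on modest hardware.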