How Trump Consultants Exploited the Facebook Data of Millions

Cambridge Analytica harvested personal information from a huge swath of the electorate to develop techniques that were later used in the Trump campaign.

The New York Times has just published a story about How Trump Consultants Exploited the Facebook Data of Millions. The story is about how Cambridge Analytica, the US arm of SCL, a UK company, gathered a massive dataset from Facebook with which to do “psychometric modelling” in order to benefit Trump.

The Guardian has been reporting on Cambridge Analytica for some time; see their Cambridge Analytica Files. The service they are supposed to have provided with this massive dataset was to model types of people and their needs/desires/politics and then help political campaigns, like Trump’s, influence voters through microtargeting. Using the models, a campaign can create content tailored to these psychometrically modelled micro-groups to shift their opinions. (See articles by Paul-Olivier Dehaye about what Cambridge Analytica does and has.)

What is new is that there is a (Canadian) whistleblower from Cambridge Analytica, Christopher Wylie, who was willing to talk to the Guardian and others. He is “the data nerd who came in from the cold” and he has a trove of documents that contradict what others said.

The Intercept has an earlier and related story about how Facebook Failed to Protect 30 Million Users From Having Their Data Harvested By Trump Campaign Affiliate. This tells how people were convinced to download a Facebook app that then took their data and that of their friends.

It is difficult to tell how effective psychometric profiling with such data is and whether it can really be used to sway voters. What is clear, however, is that Facebook is not really protecting their users’ data. To some extent they are set up to monetize such psychometric data by convincing those who buy access to the data that they can use it to sway people. The problem is not that it can be done, but that Facebook didn’t get paid for this and is now getting bad press.

Digital Cultures Big Data And Society

Last week I presented a keynote at the Digital Cultures, Big Data and Society conference. (You can see my conference notes at Digital Cultures Big Data And Society.) The talk I gave was titled “Thinking-Through Big Data in the Humanities” and in it I argued that the humanities have the history, skills and responsibility to engage with the topic of big data:

  • First, I outlined how the humanities have a history of dealing with big data. As we all know, ideas have histories, and we in the humanities know how to learn from the genesis of these ideas.
  • Second, I illustrated how we can contribute by learning to read the new genres of documents and tools that characterize big data discourse.
  • And lastly, I turned to the ethics of big data research, especially as it concerns us as we are tempted by the treasures at hand.

Are Algorithms Building the New Infrastructure of Racism?

Robert Moses

3quarksdaily, one of the better web sites for extracts of interesting essays, pointed me to this essay on Are Algorithms Building the New Infrastructure of Racism? in Nautilus by Aaron M. Bornstein (Dec. 21, 2017). The article reviews some of the terrain covered by Cathy O’Neil’s book Weapons of Math Destruction, but it also points out how AIs are becoming infrastructure, and infrastructure with bias baked in is very hard to change, like the low bridges that Robert Moses built to make it hard for public transit to reach certain areas of NYC. Algorithmic decisions that are biased and visible can be studied and corrected. Decisions that get built into infrastructure disappear from view and get much harder to fix.

a fundamental question in algorithmic fairness is the degree to which algorithms can be made to understand the social and historical context of the data they use …

Just as important is paying attention to the data that is used to train the AIs in the first place. Historic data carries the biases of the generations that produced it, and those biases need to be questioned as they get woven into our infrastructure.

Vault7 – Wikileaks releases CIA documents

Wikileaks has just released the first part of a series of what purports to be a large collection of CIA documents describing their hacking tools. See Vault7, as they call the whole leak. Numerous news organizations like the New York Times are reporting on this and saying they think the documents might be authentic “on first review”.

How The Globe collected and analyzed sexual assault statistics to report on unfounded figures across Canada

Fourteen years ago, Statistics Canada stopped publishing unfounded rates, over concerns about the quality of the data. In “Unfounded,” The Globe and Mail has tried to fill the gaps in the data.

The Globe and Mail has been publishing a fabulous data-driven exposé on how the police categorize one out of five sexual assault reports as unfounded. They have a web essay, Will police believe you?, that summarizes the investigation. There is another article on How The Globe collected and analyzed sexual assault statistics to report on unfounded figures across Canada. While this isn’t big data, it shows the power of data to reveal that there is a problem and to prod police departments to start reviewing their practices.

Brianna Wu appalled at FBI’s #GamerGate investigative report

From FBI #GamerGate Report

The FBI has released their report on #GamerGate after a Freedom of Information request and it doesn’t seem that they took the threats that seriously. According to a VentureBeat story, Brianna Wu (is) appalled at FBI’s #GamerGate investigative report.

Wu, who is running for Congress, said in an email that she is “fairly livid” because it appears the FBI didn’t check out many of her reports about death threats. Wu catalogued more than 180 death threats that she said she received because she spoke out against sexism in the game industry and #GamerGate misogyny that eventually morphed into the alt-right movement and carried into the U.S. presidential race.

It sounds like the FBI either couldn’t trace the threats or didn’t think they were serious enough, and eventually closed down the investigation. In the aftermath of the shooting at the Québec City mosque, we need to take the threats of trolls more seriously, as Anita Sarkeesian did when she was threatened with a “Montreal Massacre style attack” before speaking at the University of Utah. Yes, only a few act on their threats, but threats piggy-back on the terror to achieve their end. Those making the threats may justify them as just for the lulz, but they do so knowing that some people act on their threats.

On another point, having just given a paper on Palantir, I was intrigued to read that the FBI used it in their investigation. The report says that “A search of social media logins using Palantir’s search around feature revealed a common User ID number for two of the above listed Twitter accounts, profiles [Redacted] … A copy of the Palantir chart created from the Twitter results will be uploaded to the case file under a separate serial.” One wonders how useful connecting two Twitter accounts to one ID is.

Near the end of the report, which is really just a collection of redacted documents, there is a heavily redacted email from one of those harassed in which all but a couple of lines are blacked out. The lines left for us to read include,

We feel like we are sending endless emails into the void with you.

2016 Chicago Colloquium On Digital Humanities And Computer Science

I’ve just come back from the Chicago Colloquium on Digital Humanities and Computer Science at the University of Illinois, Chicago. The Colloquium is a great little conference where a lot of new projects get shown. I kept conference notes on the Colloquium here.

I was struck by the number of paper sessions on mapping projects. I don’t know if I have ever seen so many geospatial projects. Many of the papers talked about how mapping is a different way of analyzing the data, whether it is the location of eateries in Roman Pompeii or German construction projects before 1924.

I gave a paper on “Information Wants to Be Free, Or Does It? Ethics in the Digital Humanities.”