The Guardian has been reporting on Cambridge Analytica for some time – see their Cambridge Analytica Files. The service they are supposed to have provided with this massive dataset was to model types of people and their needs/desires/politics and then help political campaigns, like Trump’s, influence voters through microtargeting. Using the models, a campaign can create content tailored to these psychometrically modelled micro-groups to shift their opinions. (See articles by Paul-Olivier Dehaye about what Cambridge Analytica does and has.)
What is new is that there is a (Canadian) whistleblower from Cambridge Analytica, Christopher Wylie, who was willing to talk to the Guardian and others. He is “the data nerd who came in from the cold” and he has a trove of documents that contradict what others said.
It is difficult to tell how effective psychometric profiling with data is and whether it can really be used to sway voters. What is clear, however, is that Facebook is not really protecting their users’ data. To some extent they are set up to monetize such psychometric data by convincing those who buy access to it that they can use it to sway people. The problem, for Facebook, is not that it can be done, but that Facebook didn’t get paid for this and is now getting bad press.
3quarksdaily, one of the better web sites for extracts of interesting essays, pointed me to this essay, Are Algorithms Building the New Infrastructure of Racism?, in Nautilus by Aaron M. Bornstein (Dec. 21, 2017). The article reviews some of the terrain covered by Cathy O’Neil’s book Weapons of Math Destruction, but it also points out how AIs are becoming infrastructure, and infrastructure with bias baked in is very hard to change, like the low bridges that Robert Moses built to make it hard for public transit to make it into certain areas of NYC. Algorithmic decisions that are biased and visible can be studied and corrected. Decisions that get built into infrastructure disappear and get much harder to fix.
a fundamental question in algorithmic fairness is the degree to which algorithms can be made to understand the social and historical context of the data they use …
Just as important is paying attention to the data that is used to train the AIs in the first place. Historical data carries the biases of the generations that produced it, and those biases need to be questioned as they get woven into our infrastructure.
Wu, who is running for Congress, said in an email that she is “fairly livid” because it appears the FBI didn’t check out many of her reports about death threats. Wu catalogued more than 180 death threats that she said she received because she spoke out against sexism in the game industry and #GamerGate misogyny that eventually morphed into the alt-right movement and carried into the U.S. presidential race.
It sounds like the FBI either couldn’t trace the threats or didn’t think they were serious enough, and eventually closed down the investigation. In the aftermath of the shooting at the Québec City mosque we need to take the threats of trolls more seriously, as Anita Sarkeesian did when she was threatened with a “Montreal Massacre style attack” before speaking at the University of Utah. Yes, only a few act on their threats, but threats piggy-back on the terror to achieve their end. Those making the threats may justify them as just for the lulz, but they do so knowing that some people act on their threats.
On another point, having just given a paper on Palantir, I was intrigued to read that the FBI used it in their investigation. The report says that “A search of social media logins using Palantir’s search around feature revealed a common User ID number for two of the above listed Twitter accounts, profiles [Redacted] … A copy of the Palantir chart created from the Twitter results will be uploaded to the case file under a separate serial.” One wonders how useful connecting two Twitter accounts to one ID is.
Near the end of the report, which is really just a collection of redacted documents, there is a heavily redacted email from one of those harassed in which only a couple of lines are left for us to read, including,
We feel like we are sending endless emails into the void with you.
I was struck by the number of paper sessions on mapping projects. I don’t know if I have ever seen so many geospatial projects. Many of the papers talked about how mapping is a different way of analyzing the data, whether it is the location of eateries in Roman Pompeii or German construction projects before 1924.
I gave a paper on “Information Wants to Be Free, Or Does It? Ethics in the Digital Humanities.”
Yesterday I gave a talk at Access 2016. This conference brings together archivists and librarians interested in library technology. I was honoured to give the Dave Binkley Memorial Lecture at the end of the conference. My conference notes are here. My talk was about the ethics of digitization, or more generally datafication.