The Digital Scholars Group at the U of Alberta organized a nice half-day conference On the Benefits of Failure. The first speaker was Quinn Dombrowski who spoke on her experience with different types of failure.
Last week I presented a keynote at the Digital Cultures, Big Data and Society conference. (You can see my conference notes at Digital Cultures Big Data And Society.) The talk I gave was titled “Thinking-Through Big Data in the Humanities,” in which I argued that the humanities have the history, skills and responsibility to engage with the topic of big data:
- First, I outlined how the humanities have a history of dealing with big data. As we all know, ideas have histories, and we in the humanities know how to learn from the genesis of these ideas.
- Second, I illustrated how we can contribute by learning to read the new genres of documents and tools that characterize big data discourse.
- And lastly, I turned to the ethics of big data research, especially as it concerns us as we are tempted by the treasures at hand.
3quarksdaily, one of the better web sites for extracts of interesting essays, pointed me to this essay, Are Algorithms Building the New Infrastructure of Racism?, in Nautilus by Aaron M. Bornstein (Dec. 21, 2017). The article reviews some of the terrain covered by Cathy O’Neil’s book Weapons of Math Destruction, but it also points out how AIs are becoming infrastructure, and infrastructure with bias baked in is very hard to change, like the low bridges that Robert Moses built to keep public transit out of certain areas of NYC. Algorithmic decisions that are biased and visible can be studied and corrected. Decisions that get built into infrastructure disappear and become much harder to fix.
a fundamental question in algorithmic fairness is the degree to which algorithms can be made to understand the social and historical context of the data they use …
Just as important is paying attention to the data that is used to train the AIs in the first place. Historic data carries the biases of past generations, and those biases need to be questioned as they get woven into our infrastructure.
Wikileaks has just released the first part of a series of what purports to be a large collection of CIA documents describing their hacking tools. See Vault7, as they call the whole leak. Numerous news organizations like the New York Times are reporting on this and saying they think the documents might be authentic “on first review”.
The Globe and Mail has been publishing a fabulous data-driven exposé on how the police categorize one out of five sexual assault reports as unfounded. They have a web essay, Will police believe you?, that summarizes the investigation. There is another article on How The Globe collected and analyzed sexual assault statistics to report on unfounded figures across Canada. While this isn’t big data, it shows the power of data to reveal that there is a problem and to prod police departments to start reviewing their practices.
The FBI has released their report on #GamerGate after a Freedom of Information request, and it doesn’t seem that they took the threats that seriously. According to a Venturebeat story, Brianna Wu (is) appalled at FBI’s #GamerGate investigative report.
Wu, who is running for Congress, said in an email that she is “fairly livid” because it appears the FBI didn’t check out many of her reports about death threats. Wu catalogued more than 180 death threats that she said she received because she spoke out against sexism in the game industry and #GamerGate misogyny that eventually morphed into the alt-right movement and carried into the U.S. presidential race.
It sounds like the FBI either couldn’t trace the threats or didn’t think they were serious enough, and eventually closed down the investigation. In the aftermath of the shooting at the Québec City mosque we need to take the threats of trolls more seriously, as Anita Sarkeesian did when she was threatened with a “Montreal Massacre style attack” before speaking at the University of Utah. Yes, only a few act on their threats, but threats piggy-back on the terror to achieve their end. Those making the threats may justify it as just for the lulz, but they do so knowing that some people act on their threats.
On another point, having just given a paper on Palantir I was intrigued to read that the FBI used it in their investigation. The report says that “A search of social media logins using Palantir’s search around feature revealed a common User ID number for two of the above listed Twitter accounts, profiles [Redacted] … A copy of the Palantir chart created from the Twitter results will be uploaded to the case file under a separate serial.” One wonders how useful connecting two Twitter accounts to one ID is.
Near the end of the report, which is really just a collection of redacted documents, there is a heavily redacted email from one of those harassed where only a couple of lines are left for us to read, including,
We feel like we are sending endless emails into the void with you.
I’ve just come back from the Chicago Colloquium on Digital Humanities and Computer Science at the University of Illinois, Chicago. The Colloquium is a great little conference where a lot of new projects get shown. I kept conference notes on the Colloquium here.
I was struck by the number of sessions of papers on mapping projects. I don’t know if I have ever seen so many geospatial projects. Many of the papers talked about how mapping is a different way of analyzing the data whether it is the location of eateries in Roman Pompeii or German construction projects before 1924.
I gave a paper on “Information Wants to Be Free, Or Does It? Ethics in the Digital Humanities.”
Yesterday I gave a talk at Access 2016. This conference brings together archivists and librarians interested in library technology. I was honoured to give the Dave Binkley Memorial Lecture at the end of the conference. My conference notes are here. My talk was about the ethics of digitization, or more generally datafication.
Information is Beautiful has a great interactive on World’s Biggest Data Breaches & Hacks. The interactive shows how data breaches are getting worse, but it also lets you look at different types of breaches.
ProPublica has a great op-ed about Making Algorithms Accountable. The story starts from a decision by the Wisconsin Supreme Court on computer-generated risk (of recidivism) scores. The scores used in Wisconsin come from Northpointe, which provides them as a service based on a proprietary algorithm that seems biased against blacks and not that accurate. The story highlights the lack of any legislation regarding algorithms that can affect our lives.
Update: ProPublica has responded to a Northpointe critique of their findings.
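The dispute between ProPublica and Northpointe turns on which error rates matter: a score can look similarly “accurate” overall for two groups while wrongly flagging far more non-reoffenders in one group than the other. Here is a minimal sketch of that kind of check; the data is invented toy data and the field layout is mine, not ProPublica’s or Northpointe’s:

```python
# Toy audit of a risk score for disparate false positive rates.
# Each record is (group, predicted_high_risk, actually_reoffended).
# These ten invented cases stand in for the thousands a real audit uses.
records = [
    ("A", True,  False), ("A", True,  True),  ("A", True,  False),
    ("A", False, False), ("A", False, True),
    ("B", True,  True),  ("B", False, False), ("B", False, False),
    ("B", False, True),  ("B", True,  False),
]

def false_positive_rate(group):
    """Share of people who did NOT reoffend but were flagged high risk."""
    negatives = [r for r in records if r[0] == group and not r[2]]
    flagged = [r for r in negatives if r[1]]
    return len(flagged) / len(negatives)

for g in ("A", "B"):
    print(f"group {g}: false positive rate = {false_positive_rate(g):.2f}")
```

On this toy data group A’s false positive rate is twice group B’s even though each group has the same number of cases, which is the shape of disparity ProPublica reported and Northpointe contested.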