The Common Crawl is a project that has been crawling the web and making an open corpus of web data from the last 7 years available for research. Their crawl corpus runs to petabytes of data and is available as WARCs (Web Archives). For example, their 2013 dataset is 102TB and has around 2 billion web pages. Their collection is not as complete as that of the Internet Archive, which goes back much further, but it is available as large datasets for research.
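To get a feel for the scale, here is a back-of-envelope sketch of what the 2013 figures imply per page. It assumes "TB" means decimal terabytes and takes "around 2 billion" as exactly 2 billion. The source does not say whether the 102TB is compressed, so treat the result as stored size rather than raw page size. The function name is just for illustration.

```python
# Back-of-envelope estimate from the figures quoted above.
# Assumption: "TB" is decimal (10**12 bytes); the page count is rounded.
DATASET_BYTES = 102 * 10**12   # 102 TB, as quoted for the 2013 dataset
PAGE_COUNT = 2 * 10**9         # "around 2 billion" web pages


def average_page_kb(total_bytes: int, pages: int) -> float:
    """Return the mean stored size per page in kilobytes (10**3 bytes)."""
    return total_bytes / pages / 10**3


if __name__ == "__main__":
    print(f"{average_page_kb(DATASET_BYTES, PAGE_COUNT):.0f} KB per page")
```

On those assumptions the average works out to roughly 51 KB of archive per page, which is a reminder that the bulk comes from the number of pages rather than their individual size.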
From this motley chorus of suburban parents, journalists, tech leaders, and conservative intellectuals, Yiannopoulos’s function within Breitbart and his value to Bannon becomes clear. He was a powerful magnet, able to attract the cultural resentment of an enormously diverse coalition and process it into an urgent narrative about the way liberals imperiled America. It was no wonder Bannon wanted to groom Yiannopoulos for media infamy: The bigger the magnet got, the more ammunition it attracted.
Many of those who wrote to Milo seem to be disgruntled people who feel oppressed by the “political correctness” of their situation, whether in a tech company or the entertainment business. They email Milo to vent, pass on tips, or just get sympathy.
Alice and Bob is a web site and paper by Quinn DuPont and Alana Cattapan that nicely tells the history of the famous virtual couple used to explain cryptology.
While Alice, Bob, and their extended family were originally used to explain how public key cryptography works, they have since become widely used across other science and engineering domains. Their influence continues to grow outside of academia as well: Alice and Bob are now a part of geek lore, and subject to narratives and visual depictions that combine pedagogy with in-jokes, often reflecting of the sexist and heteronormative environments in which they were born and continue to be used. More than just the world’s most famous cryptographic couple, Alice and Bob have become an archetype of digital exchange, and a lens through which to view broader digital culture.
The web site provides a timeline going back to 1978. The history is then explained more fully in the full paper (PDF). They end by talking about the gendered history of cryptography. They mention other examples where images of women serve as standard test images like the image of Lena from Playboy.
The design of the site nicely shows how a paper can be remediated as an interactive web site. It isn’t that fancy, but you can navigate the timeline and follow links to get a sense of this “couple”.
I’ve just come across some important blog essays by David Gaertner. One is Why We Need to Talk About Indigenous Literature in the Digital Humanities where he argues that colleagues from Indigenous literature are rightly skeptical of the digital humanities because DH hasn’t really taken to heart the concerns of Indigenous communities around the expropriation of data.
From Slashdot comes a story about an online FBI game/interactive that aims at Countering Violent Extremism | What is Violent Extremism? The subtitle is “Don’t Be A Puppet” and the game is part of a collection of interactive materials that try to teach about extremism in general and encourage some critical distance from it. The game has you play a sheep avoiding pitfalls.
Bill Robinson has penned a nice essay, Marking 70 years of eavesdropping in Canada. The essay gives the background of Canada’s signals intelligence unit, the Communications Security Establishment (CSE), which just marked its 70th anniversary (on Sept. 1st).
Unable to read the Soviets’ most secret messages, the UKUSA allies resorted to plain-language (unencrypted) communications and traffic analysis, the study of the external features of messages such as sender, recipient, length, date and time of transmission—what today we call metadata. By compiling, sifting, and fusing a myriad of apparently unimportant facts from the huge volume of low-level Soviet civilian and military communications, it was possible to learn a great deal about the USSR’s armed forces, the Soviet economy, and other developments behind the Iron Curtain without breaking Soviet codes. Plain language and traffic analysis remained key sources of intelligence on the Soviet Bloc for much of the Cold War.
Robinson is particularly interesting on “The birth of metadata collection”, which came about as the Soviets frustrated the UKUSA allies by developing encryption that couldn’t be broken.
I could see in my daily work how difficult it was to inform people about their privacy issues. Nobody seemed to care. My hypothesis was that the whole subject was too complex. There were no examples, no images that could help the audience to understand the process behind the mass surveillance.
The answer is to mock up a design fiction of an NSA surveillance dashboard based on what we know and then a video describing a fictional use of it to track an architecture student from Berlin. It seems to me the video and mock designs nicely bring together a number of things we can infer about the tools they have.