Alain Resnais: Toute la mémoire du monde

I came across the wonderful short film by Alain Resnais, Toute la mémoire du monde (1956). The short is about memory and the Bibliothèque nationale (of France). It starts at the roof of this fortress of knowledge and travels down through the architecture. It follows a book from when it arrives from a publisher to when it is shelved. It shows another book called up by pneumatique to the reading room, where it crosses a boundary to be read. All of this comes with a philosophical narration on information and memory.

The short shows big analogue information infrastructure at its technological and rational best, before digital informatics disrupted the library.

HathiTrust Research Center Awards Three ACS Projects

An Advanced Collaborative Support (ACS) project that I was part of was funded, see HathiTrust Research Center Awards Three ACS Projects. Our project, called The Trace of Theory, sets out first to see if we can identify subsets of the HathiTrust volumes that are “theoretical” and then to try to track “theory” through these subsets.

The problem with calls for more online data laws

One of the outcomes of the Charlie Hebdo attack is that politicians are using the terrorist attacks to call for more intrusive surveillance legislation. For example the BBC reports that UK Prime Minister David Cameron says new online data laws needed. Gibbs and Hern for the Guardian interpret Cameron as calling for “anti-terror laws to give the security services the ability to read encrypted communications in extreme circumstances.” (David Cameron in ‘cloud cuckoo land’ over encrypted messaging apps ban, Jan. 13, 2015) This would mean that either back doors are built into communications technologies with encryption or the technologies are banned in the UK.

Needless to say, all sorts of people are responding to these calls for new legislation by pointing out the dangers of deliberately crippling encryption. If there are back doors, they can be found and used by criminals, which would mean that all sorts of companies that need or offer strong encryption will move out of the UK. For that matter, what would this mean for the use of global systems that might have encryption? (See James Ball’s article in the Guardian, Cameron wants to ban encryption – he can say goodbye to digital Britain, Jan. 13, 2015).

What few people are commenting on is the effectiveness of SIGINT (signals intelligence) in cases like the attacks in Paris. Articles in The Globe and Mail and the Guardian suggest that a combination of human intelligence and early interventions would be more likely to make a difference. The alleged culprits were known to all sorts of people (neighbours, people at their mosque, police). The problem was how difficult it is to know what to do with that information and when to intervene. This is a human problem not a signals intelligence problem. SIGINT could just add to the noise without guiding authorities as to how to deal with people.

To be honest I don’t know what would work. Perhaps predictive analytics, for all its problems, could help identify at-risk youth early so that they are not thrown together in prison (as the Paris attackers were) and so that interventions could be organized. Nonetheless, we clearly need more studies of the circumstances of those who are radicalized, and we need to seriously try to intervene in positive ways. The alternative is arresting people for their intentions, which are very hard to prove, and that approach has all sorts of problems.

We also need research and discussion about the balance of approaches, something that is impossible as long as surveillance is inaccessible to any oversight and accountability. Who would know if funding was better spent on human approaches? Who would dare cut the budget to nice clean modern digital intelligence in favour of a messy mix of human approaches? How to compare approaches that are hard to measure given the thankfully small numbers of incidents?

And … we need to be able to talk openly about the issues without fear – Je suis Charlie

UNIty in diVERSITY talk on “Big Data in the Humanities”

Last week I gave a talk for the UNIty in diVERSITY speaker series on “Big Data in the Humanities.” They have now put that up on Vimeo. The talk looked at the history of reading technologies and then at some of the research we are doing at the U of Alberta on what to do with all that big data.

Michael Jordan on the Delusions of Big Data

IEEE Spectrum has an interview with Michael Jordan that touches on the Delusions of Big Data and Other Huge Engineering Efforts. He is worried about the white noise or false positives. If a dataset is big enough you can always find something to correlate with what you want. That doesn’t mean it is causal or informatively correlated. He predicts a “big-data winter” after the bubble of excitement pops.

After a bubble, when people invested and a lot of companies overpromised without providing serious analysis, it will bust. And soon, in a two- to five-year span, people will say, “The whole big-data thing came and went. It died. It was wrong.” I am predicting that.
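
A small simulation illustrates the point about false positives. The sketch below is not from the interview; the function names, the 20 observations per variable, and the use of normally distributed noise are all assumptions chosen for illustration. It generates one random "target" series and then checks how strongly the best of many unrelated random series happens to correlate with it.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_variables, n_observations=20, seed=0):
    """Largest |r| between a random target and n_variables unrelated series.

    Every series is independent noise, so any correlation found is spurious.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    target = [rng.gauss(0, 1) for _ in range(n_observations)]
    best = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_observations)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    # The more unrelated variables we search, the better the "best" match.
    for n in (10, 1000, 10000):
        print(n, round(strongest_spurious_correlation(n), 2))
```

Because nothing in the data is actually related to the target, the strongest correlation grows only because more candidates are searched, which is the worry about mining very large datasets without a prior hypothesis.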

Adobe is Spying on Users, Collecting Data on Their eBook Libraries


Nate Hoffelder on The Digital Reader blog has broken a story about how Adobe is Spying on Users, Collecting Data on Their eBook Libraries. He and Ars Technica report that Adobe’s Digital Editions 4 sends data home about what you read and how far (what page) you get. The data is sent in plain text.

Hoffelder used a tool called Wireshark to look at what was being sent out from his computer.

Sensitive Words: Hong Kong Protests

On Thursday I heard a great talk by Ashley Esarey on “Understanding Chinese Information Control and State Preferences for Stability Maintenance.” He has been studying a dataset of over 4,000 censorship directives issued by the Chinese state to website administrators to do things like stop mentioning Obama’s inauguration in headlines or to delete all references to certain issues. I hadn’t realized how hierarchical and human the Chinese control of the internet was. Directives came from all levels and seem to also have been ignored.

In his talk Esarey mentioned how the China Digital Times has been tracking various internet censorship issues in China. At that site I found some fascinating stories and lists of censored words.

Exclusive: Hundreds Of Devices Hidden Inside New York City Phone Booths

From The Intercept I followed a link to a Buzzfeed Exclusive: Hundreds Of Devices Hidden Inside New York City Phone Booths. Buzzfeed found that the company that manages the advertising surrounding New York phone booths had installed beacons that could interact with apps on smartphones as they passed by. The beacons are made by Gimbal, which claims to have “the world’s largest deployment of industry-leading Bluetooth Smart beacons…” The Buzzfeed article describes what information can be gathered by these beacons:

Gimbal has advertised its “Profile” service. For consumers who opt in, the service “passively develops a profile of mobile usage and other behaviors” that allow the company to make educated guesses about their demographics “age, gender, income, ethnicity, education, presence of children”, interests “sports, cooking, politics, technology, news, investing, etc”, and the “top 20 locations where [the] user spends time home, work, gym, beach, etc..”

The image above is from Buzzfeed, who got it from Gimbal, and it illustrates how Gimbal is collecting data about “sightings” that can be aggregated and mined both by Gimbal and by third parties who pay for the service. Apple, however, is responsible for an important underlying technology, iBeacon. If you want the larger picture on beacons and the hype around them, see the BEEKn site (which is about “beacons, brands and culture on the Internet of Things”) or read about Apple’s iBeacon technology. I am not impressed with the use cases described. They are mostly about advertisers telling us (without our permission) about things on sale. Beacons can be used for location-specific (very specific) information, as in the Tulpenland (tulip garden) app, but outdoors you can do this with geolocation. A better use would be indoors, for museums where GPS doesn’t work, as Prophets Kitchen is doing for the Rubens House Antwerp Museum, though the implementation shown looks really lame (multiple choice questions about Rubens!). The killer app for beacons has yet to appear, though mobile payments may be it.

What is interesting is that the Intercept article indicates that users don’t appreciate being told they are being watched. It seems that we only mind being spied on when we are personally told that we are being spied on, but that may be an unwarranted inference. We may come to accept a level of tracking as the price we pay for cell phones that are always on.

In the meantime New York has apparently ordered the beacons removed, but they seem to be installed in other cities. Of course there are also Canadian installations.



The folks behind the Google Ngram Viewer have developed a new tool called Bookworm. It has a number of corpora (the example above is drawn from a corpus of bills). It lets you describe more complex queries and you can upload your own data.

Bookworm is hosted by the Cultural Observatory at Harvard, directed by Erez Lieberman Aiden and Jean-Baptiste Michel, who were behind the Ngram Viewer. They have recently published a book, Uncharted, where they talk about different cultural trends they studied using the Ngram Viewer. The book is accessible though a bit light.

Evgeny Morozov: How much for your data?

Evgeny Morozov has a nice essay in Le Monde Diplomatique (English Edition, August 2014), Whilst you whistle in the shower: How much for your data? He raises questions about the monetization of all of our data and how we are willing to give up more and more of it. He describes the limited options being debated on the issue of data and privacy:

the future offered to us by Lanier and Pentland fits into the German “ordoliberal” tradition, which sees the preservation of market competition as a moral project, and treats all monopolies as dangerous. The Google approach fits better with the American school of neoliberalism that developed at the University of Chicago. Its adherents are mostly focused on efficiency and consumer welfare, not morality; and monopolies are never assumed to be evil just because they are monopolies, some might be socially beneficial.

The essay covers some of the same ground that Mike Bulajewski covered in The Cult of Sharing about how the gift economy rhetoric is being hijacked by monetization interests.

Since established taxi and hotel industries are detested, the public
debate has been framed as a brave innovator taking on sluggish,
monopolistic incumbents. Such skewed presentation, while not inaccurate
in all cases, glosses over the fact that the start-ups of the “sharing
economy” operate on the pre-welfare model: social protections for
workers are minimal, they have to take on risks previously assumed by
their employers, and there are almost no possibilities for collective