America is about to kill the open internet – and towns like this will pay the price

Residents of Winlock, Washington can barely stream Spotify and Netflix. Changes to Obama’s net neutrality rules are going to make things even worse

There are lots of stories right now about net neutrality and how the FCC (of the USA) is repealing requirements on ISPs. I find it hard to explain why net neutrality is important, which is probably why there isn’t more of a public outcry. The Guardian has a story that makes this real, America is about to kill the open internet – and towns like this will pay the price. Global News has a nice story, Net neutrality: Why Canadians should care about the internet changes in the U.S. This story describes what happens in countries like Portugal, which don’t have net neutrality regulations, and it includes some John Oliver segments on how the FCC is going to fix the Internet (which isn’t broken).

 

Common Crawl

The Common Crawl is a project that has been crawling the web and making an open corpus of web data from the last 7 years available for research. Their crawl corpus is petabytes of data and is available as WARCs (Web ARChive files). For example, their 2013 dataset is 102TB and has around 2 billion web pages. Their collection is not as complete as the Internet Archive, which goes back much further, but it is available in large datasets for research.
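To give a sense of what a WARC looks like, here is a minimal sketch of reading records from one. It assumes an uncompressed stream and a tiny made-up sample record; the function name read_warc_records and the sample data are my own illustration, not anything Common Crawl provides. Each record is a version line, some named header fields, a blank line, and then a payload whose size is given by the Content-Length field.

```python
import io


def read_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream.

    Hypothetical helper for illustration; it handles only the basic layout
    (version line, header fields, blank line, Content-Length bytes of payload).
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank lines separate one record from the next
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"expected a WARC version line, got {version!r}")
        headers = {"version": version}
        # Read "Name: value" header fields until the blank line that ends them.
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # The payload is exactly Content-Length bytes long.
        payload = stream.read(int(headers["Content-Length"]))
        yield headers, payload


# A made-up record wrapping a small HTTP response (sample data, not real crawl output).
body = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hello</html>"
sample = b"".join([
    b"WARC/1.0\r\n",
    b"WARC-Type: response\r\n",
    b"WARC-Target-URI: http://example.com/\r\n",
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n",
    b"\r\n",
    body,
    b"\r\n\r\n",
])

for headers, payload in read_warc_records(io.BytesIO(sample)):
    print(headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

The published crawl files are compressed, so in practice you would decompress them first or, more sensibly, use an existing WARC library rather than a hand-rolled reader like this one.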

DataCamp

I’ve been playing with DataCamp’s Python lessons and they are quite good. Python is taught in the context of data analysis rather than the turtle drawing of How to Think Like a Computer Scientist. They have a nice mix of video tutorials and exercises where you get a tripartite screen (see above). You have an explanation and instructions on the left, a short script to fill in on the upper right, and an interactive Python shell below where you can try stuff.

The Real Threat of Artificial Intelligence – The New York Times

It’s not robot overlords. It’s economic inequality and a new global order.

Kai-Fu Lee has written a short and smart speculation on the effects of AI, The Real Threat of Artificial Intelligence. To summarize his argument:

  • AI is not going to take over the world the way the sci-fi stories have it.
  • The effect will instead be on work, as AI takes over tasks that people are paid to do, putting them out of work.
  • How then will we deal with the unemployed? (This is a question people asked in the 1960s when the first wave of computerization threatened massive unemployment.)
  • One solution is “Keynesian policies of increased government spending” paid for by taxing the companies made wealthy by AI. This spending would pay for “service jobs of love” where people act as the “human interface” to all sorts of services.
  • Those in the jobs that can’t be automated and that make lots of money might also scale back on their time at work so as to provide more jobs of this sort.

Busa Letter Outlining Textual Informatics

Page 1 of “Conditional Agreement” by Father Busa

Domenico Fiormonte has recently blogged about an interesting document he has by Father Busa that relates to a difficult moment in the history of the digital humanities in Italy in 2002. The two-page “Conditional Agreement”, which I translate below, was given to Domenico and explained the terms under which Busa would agree to sign a letter to the Minister (of Education and Research) Moratti in response to Moratti’s public statement about the uselessness of humanities informatics. A letter was being prepared, to be signed by a large number of Italian (and foreign) academics, explaining the value of what we now call the digital humanities. Busa had the connections to get the letter published and taken seriously, which is why Domenico visited him to ask for his help. That help ended up being conditional on certain things being made clear, as laid out in the document. Domenico kept the two pages Busa wrote. As he points out in his blog, these two pages are a mini-manifesto of Father Busa’s later views of the place and importance of what he called textual informatics. Domenico also points out how political the context of these notes, and of the letter eventually signed and published, was. Defining the digital humanities is often about positioning the field in the larger academic and public political spheres we operate in.

‘Photo Archives Are Sleeping Beauties.’ Pharos Is Their Prince

Pharos is an effort among 14 institutions to create a database that will eventually hold and make accessible 22 million images of artworks.

The New York Times has a story about a collaboration to develop the Pharos consortium photo archive, ‘Photo Archives Are Sleeping Beauties.’ Pharos Is Their Prince. The consortium has a number of interesting initiatives they are implementing in Pharos:

  • They are applying the CIDOC Conceptual Reference Model.

The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.

  • They have a visual search (which doesn’t seem to find anything at the moment).
  • They are looking at Research Space (which uses CRM) for a research linked data environment.

Why We Need to Talk About Indigenous Literature in the Digital Humanities

Screenshot from 1991 BBC Horizon documentary

I’ve just come across some important blog essays by David Gaertner. One is Why We Need to Talk About Indigenous Literature in the Digital Humanities where he argues that colleagues from Indigenous literature are rightly skeptical of the digital humanities because DH hasn’t really taken to heart the concerns of Indigenous communities around the expropriation of data.

Carpenter: The Gathering Cloud

From the Modification of Clouds

The Cloud is an airily deceptive name connoting a floating world far removed from the physical realities of data.

The Gathering Cloud by J. R. Carpenter is a great interactive work that uses Luke Howard’s Essay on the Modification of Clouds from 1803 to meditate on the digital cloud. The work “is a hybrid print- and web-based work by J. R. Carpenter commissioned by NEoN Digital Arts Festival 2016.”

How The Globe collected and analyzed sexual assault statistics to report on unfounded figures across Canada

Fourteen years ago, Statistics Canada stopped publishing unfounded rates, over concerns about the quality of the data. In “Unfounded,” The Globe and Mail has tried to fill the gaps in the data.

The Globe and Mail has been publishing a fabulous data-driven exposé on how the police categorize one out of five sexual assault reports as unfounded. They have a web essay, Will police believe you?, that summarizes the investigation. There is another article on How The Globe collected and analyzed sexual assault statistics to report on unfounded figures across Canada. While this isn’t big data, it shows the power of data to reveal that there is a problem and to prod police departments to start reviewing their practices.