CIFAR: Do you have a question?

Back in the Spring I blogged about how CIFAR was launching a new programme that might be open to humanists called "Do you have a question with the potential to change the world?" CIFAR doesn't have much of a track record supporting arts or humanities research, as their own reports note. An open call for questions would surely attract some questions that humanists would recognize. Alas, no.

Despite getting 280 Letters of Interest, not one of the seven selected comes from the arts, humanities or social sciences. The closest is the Brain, Mind, and Consciousness project, which is based in neuroscience and will apparently involve philosophers and ethicists. Here is the list of the seven selected for the next round:

  • Biology, Energy, and Technology
  • BrainLight: Cracking the Sensory Code
  • Brain, Mind, and Consciousness
  • Life in a Changing Ocean: New Perspectives on Marine Functions and Services
  • Making a Molecular Map of the Cell: Towards a Direct Determination of the Structure-Function Correlation of Biological Systems
  • Microbes and Humans
  • The Planetary Biodiversity Project

It is time to ask the question: why doesn't CIFAR support the arts and humanities? (In previous programmes they have supported the social sciences.) It is unbelievable that they did not get interesting questions from the humanities. Either no one bothered to submit an interesting question (which I happen to know is not true) or they aren't interested in the questions we ask. Here are some of the possible explanations I can think of for CIFAR ignoring the humanities:

  • None of the 280 LOIs were of the quality of the seven selected.
  • The panel was composed primarily of scientists and engineers. The one humanist was Pauline Yu.
  • The type of questions they were looking for was not the sort we ask in the humanities. They were looking for questions that could be answered with a bit of money, rather than the questions we deal with that may never be answered.
  • Their idea of “questions with the potential to change the world” does not include questions about government, race, democracy, culture, art, education or literature.
  • This programme wasn’t really intended as a way to bring in new areas of research, as I was told when I asked about the dearth of humanities support.

I think it is time CIFAR be honest with the larger community and admit that they are focusing support on research in Science, Technology, Engineering and Medicine, with some forays into the Social Sciences. No one would blame them for focusing their support. Deep in their reports they admit that “the growth of its programs in the social sciences and humanities has not kept pace with growth in the natural sciences” (Final Report: CIFAR Performance Audit and Evaluation), though frankly I don’t see any growth at all.

Around the World Symposium on Digital Culture

Tomorrow we are running an Around the World Symposium on Digital Culture. The symposium brings together scholars from different countries to talk about digital culture for roughly 17 to 20 hours, streaming talks and discussions as it moves from place to place. The Symposium is being organized by the Kule Institute for Advanced Study here at the University of Alberta. Visit the site to see the speakers and to tune in.

Please join in using the Twitter hashtag #UofAworld

CIFAR: Renewing their vision

Today I went to a meeting about the Canadian Institute for Advanced Research (CIFAR) in the hopes that they might have programs in the humanities. They do and they don’t.

One new initiative they have that is open to humanists is their global call for ideas. The call is open to anyone:

Do you have a question with the potential to change the world?

A number of their programs, like Successful Societies, Social Interactions, Identity & Well-Being, and Institutions, Organizations & Growth, seem to have humanists and social scientists involved, even if the issues they take up aren’t central to the humanities.

In recognition of the absence of humanities programs they started a Humanities Initiative in 2009. Alas, it hasn’t yet developed any programs we could participate in. Here is some history:

In their 2009-2010 Annual Performance Report they state:

CIFAR organized a discussion with senior humanities researchers drawn from institutions across North America in May 2009 about the role CIFAR could play in supporting advanced research in the humanities. The meeting participants recommended the creation of an ad hoc Steering Committee that would undertake the process of identifying in detail how CIFAR should approach and support advanced humanities research. This Steering Committee met in December 2009, and following a telephone conference in April 2010 recommended that the Institute proceed with several pilot projects in the next year. Work on refining these projects and identifying task force members was underway by June 2010.

In the 2010 Final Report: CIFAR Performance Audit and Evaluation, the evaluators note:

CIFAR’s Strategic Plan notes that the growth of its programs in the social sciences and humanities has not kept pace with growth in the natural sciences. CIFAR is, consequently, examining how its research model might be adapted to research in these disciplines with a specific focus in this five-year period on the humanities.

It is now 2013, and it seems the steering group recommended two pilot projects, neither of which seems to have done more than meet.

Pekka Sinervo, who presented here, suggested that it is hard to find examples of sustained conversations around a single question in the humanities of the sort that CIFAR supports. He challenged me to find examples they could use as models. Perhaps there isn’t a tradition of think tanks in the humanities? Perhaps senior humanists, of the sort CIFAR has recruited, are more solitary scholars who just can’t get excited about getting together to talk about ideas? Perhaps the humanities have lapsed into Cartesian solipsism – we think, we are, but alone.

I personally think CIFAR should restart and rethink their Humanities Initiative. If they are finding it hard to get humanists engaged in the ways other fields are, then try something different. I would encourage them to look at some examples from the digital humanities that have demonstrated the capacity to initiate and sustain conversations in innovative ways:

  • The Humanities and Technology Camp (THATCamp) is an extremely successful example of an open and inclusive form of conversation. Mellon funds this initiative, which supports inexpensive “unconferences” around the world.
  • Networked Infrastructure for Nineteenth-century Electronic Scholarship Online (NINES) is a reinvented scholarly association that was formed to support old and new media research. This is not an elite, exclusive community, but a reimagined association capable of recognizing enquiry through digital scholarship.
  • The Day of Digital Humanities is a sustained look at the question, “Just what do digital humanists really do?” Started at U of Alberta in 2009, the latest version was run by Michigan State University’s MATRIX: The Center for Digital Humanities & Social Sciences. Other organizations have used this “Day of …” paradigm to get discussion going around issues like digital archaeology.
  • 4Humanities is a loose group that looks at how to advocate for the humanities in the face of funding challenges. With minimal funding we support local chapters, international correspondents, and various activities.

In short, there are lots of examples of sustained conversations, especially if you don’t limit yourself to a particular model. Dialogue has been central to the humanities since Plato’s Academy; perhaps the humanities should be asked by CIFAR to imagine new forms of dialogue. Could CIFAR make a virtue of the problem they face around humanities conversations?

Can you start a dialogue with the potential to change the world?

Digital Classics Symposium in Buffalo

I am heading home the day after giving the closing remarks at a conference in Buffalo on Word, Space, Time: Digital Perspectives on the Classical World. This was the first conference of the new Digital Classics Association. It was a gem of a conference where I learned about a succession of neat projects. Here are some notes; my laptop ran out of juice at times, so I was not able to take notes on everything.

  • Greg Crane gave the opening keynote, announcing his new Humboldt appointment and what he is going to try to do there. He announced that he wanted to: 1) advance the role of Greco-Roman culture and Classical Greek and Latin in human intellectual life as broadly and as deeply as possible in a global world; and 2) blow the dust off the simple, cogent and ancient term philology and support an open philology that can, in turn, support a dialogue among civilizations. He talked about the history and importance of philology and then announced the Open Philology Project. This project has as its goals:
    • Open Greek and Latin texts (the TLG is not open)
    • Comprehensive open data about the classical world
    • Multitext digital editions
    • Annotations
    • Deep linguistic annotation
    • Full workflow through true digital edition

    This is a worthy and ambitious vision and I tried to remind people of it at the end. Classics is the right size and has the right interdisciplinarity to be able to model a comprehensive system.

  • Crane talked about Alpheios, a text editing and learning system that Perseus is connecting to. Monica Berti showed her work on fragmenta in Alpheios and I later learned that this is a philanthropically funded project. Berti’s demo of how she is handling fragmenta is at http://services.perseus.tufts.edu/berti_demo/
  • Marco Büchler gave a tantalizing paper on “Using Google PageRank to detect text reuse”. His was not the only text reuse project – it is a technique that is important to classicists who study how classical authors have been quoted, alluded to, and reused over time. Büchler’s software is TRACER, which will be available once he has some documentation. I think using PageRank to sort hits is a great idea and would love to play with his tools (I sketch the general idea after these notes). He encouraged interested parties to join a Google group on text reuse.
  • Walter Scheidel showed the Orbis system in a paper on “Redrawing the map of the Roman world.” Orbis is a brilliant tool for measuring time and cost for travel in the Roman world. It is a great example of spatial analysis.
  • Tom Elliott talked about the Pleiades project and how they have around 34,000 places registered and linked. He was initially skeptical about semantic web technologies and RDF, but is now using them in a way that shows what we can do in the humanities with this approach. I am struck by how Pleiades now provides a service to all sorts of other projects. What Classics needs now are similar projects for people, passages (texts), periods (events and time), and other primitives. Classics could set an example of coordinated semantic data.
  • Ryan Horne wrapped up a great session on geospatial work with a presentation on “Mapping antiquity a-la-carte: a GIS interface of the ancient world”. He showed Antiquity À la carte, which allows you to generate all sorts of maps of the Classical world. A great tool for teachers.
  • Kevin D. Fisher gave a fascinating presentation on “Digital approaches to ancient cities: The Kalavasos and Maroni built environments project, Cyprus.” In this project they are using all sorts of cool technology, like 3D laser scanners and ground-penetrating radar, to map their dig in Cyprus. I liked how he was using these techniques to model how the environments were lived in: what could you see from where, and which rooms in a building were accessible?
  • My favorite project of the conference was Christopher Johanson’s visual argument on RomeLab: Performance on the ephemeral stage. He presented an argument about temporary stages in the Roman forum that was made through a virtual Rome you can travel around in the browser. The argument is a sequence of points that can be opened and that move you around the virtual world to see what each point is about. His paper was an example of a visual argument made through RomeLab and, by extension, about RomeLab. Despite a technical glitch, it was an impressive performance that made its point on many levels.
  • I attended a neat little workshop on R led by Jeff Rydberg-Cox. His learning materials are at http://daedalus.umkc.edu/StatisticalMethods/index.html and he pointed us to a neat tutorial at http://tryr.codeschool.com/.
  • At the end there was a great panel on Literary Criticism and Digital Methods. Matt Jockers presented his work on macroanalysis of 19th-century literature, including a neat word cloud visualization of topic modeling results. Patrick J. Burns was very good on “Distant reading alliteration in Latin poetry,” walking us through his method and illustrating it with humour. Neil Bernstein talked about the Tesserae project, which is looking at text reuse and has neat tools online for seeing how author A gets reused in author B.
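
Büchler didn’t go into implementation details that I caught, so what follows is only my own minimal sketch of the general idea, not TRACER’s actual method: treat candidate passages as nodes, link passages that share word n-grams, and use PageRank scores to rank the reuse hits. The mini-corpus and the shared-trigram linking rule are invented for illustration.

```python
# My own toy illustration of ranking text-reuse hits with PageRank;
# this is NOT Büchler's TRACER, just the general idea as I understood it.
from itertools import combinations

def ngrams(text, n=3):
    """The set of word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def build_graph(passages, n=3):
    """Link any two passages that share at least one word n-gram."""
    grams = {pid: ngrams(text, n) for pid, text in passages.items()}
    graph = {pid: set() for pid in passages}
    for a, b in combinations(passages, 2):
        if grams[a] & grams[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over an undirected graph."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        rank = {v: (1 - damping) / n
                   + damping * sum(rank[u] / len(graph[u])
                                   for u in graph if v in graph[u])
                for v in graph}
    return rank

# Hypothetical mini-corpus: the quoting passage shares the trigram
# "arma virumque cano" with Vergil, so both rank above the stray line.
passages = {
    "Vergil, Aen. 1.1": "arma virumque cano troiae qui primus ab oris",
    "A later quotation": "arma virumque cano said the poet of troy",
    "Unrelated line": "et tamen ille tuae felix aeneidos auctor",
}
for pid, score in sorted(pagerank(build_graph(passages)).items(),
                         key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {pid}")
```

In a real system the linking would no doubt be fuzzier (lemmatized n-grams, edit distance) and the graph much larger, but the ranking idea is the same: passages embedded in a web of reuse float to the top.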

I gave the closing remarks, in which I tried to draw attention to the history of the vision of a perfect reading (or philology) machine. I took advantage of being last to offer suggestions as to how digital classics might move research forward:

  • The Digital Classics Association should take seriously Greg Crane’s invitation to influence his Open Philology Project. Classics is, for various reasons, in a unique position to imagine comprehensive research and learning environments.
  • They should think about primitives and how to support them. What Pleiades has done for places, others should think of doing for people, periods (events and time), buildings and things, and so on. The idea would be to have a network of projects managing semantic data about the things that matter to Classicists (a small sketch of what this could look like follows this list).
  • I encouraged people to think about how to include the larger public into research using crowdsourcing and gaming.
  • I encouraged them to think about how digital research is shared and assessed. They should look at the work from the MLA on assessment and the DCA could adapt stuff for Classics.
  • Finally I talked a bit about infrastructure and the dangers of developing infrastructure prematurely. I called for infrastructure experiments.
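
To make the primitives point concrete, here is a small hypothetical sketch (in Python, using the rdflib library) of what coordinated semantic data could look like: a made-up prosopography minting a URI for a person and linking it to a real Pleiades place URI. Only the Pleiades URI is real; the other namespaces and the relation are invented for illustration.

```python
# Hypothetical sketch of linked Classics "primitives": a made-up people
# gazetteer pointing at a real Pleiades place URI. Requires rdflib.
from rdflib import Graph, Literal, Namespace, URIRef, RDF, RDFS

PLEIADES = Namespace("https://pleiades.stoa.org/places/")
PEOPLE = Namespace("https://example.org/classics/people/")    # invented
REL = Namespace("https://example.org/classics/relations/")    # invented

g = Graph()
cicero = PEOPLE["cicero"]

g.add((cicero, RDF.type, URIRef("http://xmlns.com/foaf/0.1/Person")))
g.add((cicero, RDFS.label, Literal("M. Tullius Cicero")))
g.add((cicero, REL["associatedPlace"], PLEIADES["423025"]))   # Roma

print(g.serialize(format="turtle"))
```

The point is less the syntax than the architecture: if each project mints stable URIs for its primitives, any other project can link to them the way everyone now links to Pleiades places.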

I think the DCA will be putting up a video of my closing remarks.

The National Digital Public Library Is Launched! by Robert Darnton

Robert Darnton has written an essay about the launch of the Digital Public Library of America that everyone should read. A great writer and a historian, he provides both historical and contemporary context. He quotes from the original mission statement to show the ambition,

“an open, distributed network of comprehensive online resources that would draw on the nation’s living heritage from libraries, universities, archives, and museums in order to educate, inform, and empower everyone in the current and future generations.”

The essay, The National Digital Public Library Is Launched!, is in the New York Review of Books. A lot of it talks about what Harvard is contributing (Darnton is the University Librarian there), which is OK as it is good to see leadership.

He also mentions that Daniel Cohen is the new executive director. Bravo! Great choice!

Big Buzz about Big Data: Does it really have to be analyzed?

The Guardian has a story by John Burn-Murdoch titled Study: less than 1% of the world’s data is analysed, over 80% is unprotected.

The article draws on a Digital Universe Study which finds that the “global data supply reached 2.8 zettabytes (ZB) in 2012” and that “just 0.5% of this is used for analysis”. The industry study emphasizes that the promise of “Big Data” is in its analysis,

First, while the portion of the digital universe holding potential analytic value is growing, only a tiny fraction of territory has been explored. IDC estimates that by 2020, as much as 33% of the digital universe will contain information that might be valuable if analyzed, compared with 25% today. This untapped value could be found in patterns in social media usage, correlations in scientific data from discrete studies, medical information intersected with sociological data, faces in security footage, and so on. However, even with a generous estimate, the amount of information in the digital universe that is “tagged” accounts for only about 3% of the digital universe in 2012, and that which is analyzed is half a percent of the digital universe. Herein is the promise of “Big Data” technology — the extraction of value from the large untapped pools of data in the digital universe. (p. 3)
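
It is worth pausing on the scale those percentages hide. A back-of-the-envelope calculation in Python (the figures are the study’s; the unit conversions are the standard decimal ones):

```python
# Back-of-the-envelope arithmetic on the Digital Universe Study figures.
ZB = 10 ** 21   # bytes in a zettabyte (decimal convention)
EB = 10 ** 18   # bytes in an exabyte

universe_2012 = 2.8 * ZB    # "global data supply reached 2.8 zettabytes"
analyzed = 0.005            # "half a percent ... is analyzed"
tagged = 0.03               # "about 3% ... is 'tagged'"

print(f"Analyzed: {universe_2012 * analyzed / EB:.0f} EB")  # 14 EB
print(f"Tagged:   {universe_2012 * tagged / EB:.0f} EB")    # 84 EB
```

So even the “tiny fraction” that is analyzed comes to some 14 exabytes, several days’ worth of the 2.5 exabytes a day that Butterworth cites below; the untapped pool looks vast mostly because the base is.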

I can’t help wondering if industry studies aren’t trying to stampede us into thinking that there is lots of money to be made in analytics. These studies often seem to come from the entities that benefit from investment in analytics. What if the value of Big Data turns out to be in getting people to buy into analytical tools and services (or be left behind)? Has there been any critical analysis (as opposed to anecdotal evidence) of whether analytics really do warrant the effort? A good article I came across on the need for analytical criticism is Trevor Butterworth’s Goodbye Anecdotes! The Age of Big Data Demands Real Criticism. He starts with,

Every day, we produce 2.5 exabytes of information, the analysis of which will, supposedly, make us healthier, wiser, and above all, wealthier—although it’s all a bit fuzzy as to what, exactly, we’re supposed to do with 2.5 exabytes of data—or how we’re supposed to do whatever it is that we’re supposed to do with it, given that Big Data requires a lot more than a shiny MacBook Pro to run any kind of analysis.

Of course the Digital Universe Study is not only about the opportunities for analytics. It also points out:

  • That data security is going to become more and more of a problem
  • That more and more data is coming from emerging markets
  • That we could get a lot more useful analysis done if there were more metadata (tagging), especially at the source. The study calls for more intelligence in the gathering devices – surveillance cameras, for example, could add metadata at the point of capture, like time and place, and then things like whether there are faces (a hypothetical sketch of such a record follows this list).
  • That the promising types of data that could generate value start with surveillance and medical data.
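
To picture what metadata at the point of capture might look like, here is a minimal hypothetical sketch; every field name is my invention, not the study’s:

```python
# Hypothetical record of metadata a camera might attach at capture time.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CaptureMetadata:
    device_id: str
    captured_at: datetime        # time of capture
    latitude: float              # place of capture
    longitude: float
    faces_detected: int = 0      # e.g. from on-device face detection
    tags: list[str] = field(default_factory=list)

frame = CaptureMetadata(
    device_id="cam-042",
    captured_at=datetime.now(timezone.utc),
    latitude=53.5461, longitude=-113.4938,   # Edmonton, for example
    faces_detected=2,
    tags=["entrance", "daytime"],
)
print(frame)
```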

Reading about Big Data I also begin to wonder what it is. Fortunately IDC (who are behind the Digital Universe Study) have a definition,

Last year, Big Data became a big topic across nearly every area of IT. IDC defines Big Data technologies as a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis. There are three main characteristics of Big Data: the data itself, the analytics of the data, and the presentation of the results of the analytics. Then there are the products and services that can be wrapped around one or all of these Big Data elements. (p. 9)

Big Data is not really about data at all. It is about technologies and services. It is about the opportunity that comes with “a big topic across nearly every area of IT.” Big Data is more like Big Buzz. Now we know what follows Web 2.0 (and it was never going to be Web 3.0).

For a more academic and interesting perspective on Big Data I recommend (following Butterworth) Martin Hilbert’s “How much information is there in the ‘information society’?” (Significance, 9:4, 2012, pp. 8-12). One of the more interesting points he makes is about the growing importance of text,

Despite the general perception that the digital age is synonymous with the proliferation of media-rich audio and videos, we find that text and still images capture a larger share of the world’s technological memories than they did before. In the early 1990s, video represented more than 80% of the world’s information stock (mainly stored in analogue VHS cassettes) and audio almost 15% (on audio cassettes and vinyl records). By 2007, the share of video in the world’s storage devices had decreased to 60% and the share of audio to merely 5%, while text increased from less than 1% to a staggering 20% (boosted by the vast amounts of alphanumerical content on internet servers, hard disks and databases). The multimedia age actually turns out to be an alphanumeric text age, which is good news if you want to make life easy for search engines. (p. 9)

One of the points Hilbert makes that would support the importance of analytics is that our capacity to store data is catching up with the amount of data broadcast and communicated. In other words, we are getting closer to being able to store most of what is broadcast and communicated. Even more dramatic is the growth in computation. In short, available computation is growing faster than storage, and storage faster than transmission. With excess comes experimentation, and with excess computation and storage, why not experiment with what is communicated? We are, after all, all humanists who are interested primarily in ourselves. The opportunity to study ourselves in real time is too tempting to give up. There may be little commercial value in the Big Reflection, but that doesn’t mean it isn’t the Big Temptation. The Delphic oracle told us to Know Thyself, and now we can in a new way. Perhaps it would be more accurate to say that the value in Big Data is in our narcissism. The services that will do well are those that feed our Big Desire to know more and more about ourselves, in ever closer to real time, both individually and collectively. Privacy will be trumped by the desire for analytic celebrity, where you become your own spectacle.

This could be good news for the humanities. I’m tempted to announce that this will be the century of the BIG BIG HUMAN. With Big Reflection we will turn on ourselves and consume more and more about ourselves. The humanities could claim that we are the disciplines that reflect on the human, and that analytics are just another practice for doing so, though to make that claim we might have to look at what is written in us or start writing in DNA.

In 2007, the DNA in the 60 trillion cells of one single human body would have stored more information than all of our technological devices together. (Hilbert, p. 11)

Digital Humanities Pedagogy: Practices, Principles and Politics

Open Book Publishers has just published Digital Humanities Pedagogy: Practices, Principles and Politics online. Stéfan Sinclair and I have two chapters in the collection, one on “Acculturation and the Digital Humanities Community” and one on “Teaching Computer-Assisted Text Analysis.”

The Acculturation chapter sets out the ways in which we try to train students by involving them in project teams rather than only through courses. This is an approach I learned watching Jerome McGann and Johanna Drucker at the University of Virginia. My goal has always been to create the sort of project culture they did (and that the Scholars’ Lab now continues).

The editor Brett D. Hirsch deserves a lot of credit for gently seeing this through.

MLA 2013 Conference Notes

I’ve just posted my MLA 2013 convention notes on philosophi.ca (my wiki). I participated in a workshop on getting started with DH organized by DHCommons, gave a paper on “thinking through theoretical things”, and participated in a panel on “Open Sesame” (interoperability for literary study).

The sessions seemed full, even the theory one, which started at 7 pm! (MLA folk are serious about theorizing.)

At the convention the MLA announced and promoted its new digital MLA Commons. I’ve been poking around, trying to figure out what it will become. They say it is “a developing network linking members of the Modern Language Association.” I’m not sure I need one more venue for linking to people, but it could prove an important forum if promoted.