The X Factor – Brainstorm – The Chronicle of Higher Education

From Humanist, a link to an article on online education, The X Factor (in the Brainstorm blog of the Chronicle of Higher Education). The post talks about how Harvard University has joined with MIT to create edX, an online education consortium. Harvard is now joining the MOOC (Massive Open Online Course) bandwagon pioneered by some Stanford profs who opened their courses to thousands. The author, Kevin Carey, points out that edX won’t compete with MIT or Harvard, but with other online providers and with less prestigious institutions.

I worry we are going to see a lessening of educational diversity. I worry that the star quality of MIT, Harvard and Stanford will drive out less prestigious players, leaving us with a small number of online courses. Fewer instructors for more people will mean more standardization of education and less diversity.

The New York Times has a Room for Debate on this, Got a Computer? Get a Degree, with different reactions to the news. Most seem positive, but few feel that certificates for taking MOOCs are comparable to real course credit.

Kurt Vonnegut on the Shapes of Stories

I’ve been meaning to blog on the video that has been circulating of Kurt Vonnegut talking about the Shape of Stories. He describes the curves followed by popular stories like “boy meets girl” and suggests computers could even understand such simple curves. In Lapham’s Quarterly you can read the text of this lecture with illustrations. See Kurt Vonnegut at the Blackboard. In this version he asks about the value of such systems, a question which could apply equally to computer-generated visualization:

The question is, does this system I’ve devised help us in the evaluation of literature? Perhaps a real masterpiece cannot be crucified on a cross of this design. How about Hamlet?

He concludes that the system doesn’t work because the truth is ambiguous. We simply don’t know in complex works (like Hamlet) if news is good or bad. Good literature is open to interpretation.

But there’s a reason we recognize Hamlet as a masterpiece: it’s that Shakespeare told us the truth, and people so rarely tell us the truth in this rise and fall here [indicates blackboard]. The truth is, we know so little about life, we don’t really know what the good news is and what the bad news is.

Many have noticed this amusing play on visualization, including an infographic on Visual.ly, Kurt Vonnegut on the Shapes of Stories:

Plot visualization from Vonnegut

Prism: Collaborative Interpretation

Prism is the coolest idea I have come across in a long time. Coming from the University of Virginia Scholars’ Lab, Prism is a collaborative interpretation environment. Someone comes up with categories like “Rhetoric”, “Orientalism” and “Social Darwinism” for a text like Notes on the State of Virginia. Then people (with accounts, which are free) go through and mark passages. This creates overlapping interpretative markup of the sort you used to get with COCOA in TACT, but unlike TACT, many people can do the interpretation – it can be crowdsourced.

They are planning some visualizations of the results, including what look like the types of visualizations that TACT gave, where you can see words distributed over tagged areas.

Bethany Nowviskie explains the background to the project in this Scholars’ Lab post.
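To make the idea of overlapping, crowdsourced markup concrete, here is a minimal Python sketch of how markings from several readers might be tallied per word and then summarized as words distributed over tagged areas. This is not Prism's actual data model, which is not described here. The tuple layout, the reader names, the placeholder words and both function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tally_markings(num_words, markings):
    """Record which readers applied each category to each word position.

    markings: iterable of (reader, category, start, end) tuples, where start
    and end are word indexes (end exclusive). Spans may overlap freely.
    """
    tallies = defaultdict(lambda: defaultdict(set))
    for reader, category, start, end in markings:
        for i in range(max(start, 0), min(end, num_words)):
            # A set means a reader who marks a word twice is counted once.
            tallies[i][category].add(reader)
    return tallies


def words_by_category(words, tallies):
    """Count the words falling inside each tagged area."""
    distribution = defaultdict(Counter)
    for i, categories in tallies.items():
        for category in categories:
            distribution[category][words[i].lower()] += 1
    return distribution


# Invented sample data: placeholder words and two hypothetical readers.
words = ["alpha", "beta", "gamma", "delta", "epsilon"]
markings = [
    ("ann", "Rhetoric", 0, 3),
    ("bob", "Rhetoric", 1, 4),
    ("bob", "Orientalism", 2, 5),
]
tallies = tally_markings(len(words), markings)
for i, word in enumerate(words):
    counts = {cat: len(readers) for cat, readers in tallies[i].items()}
    print(word, counts)
```

Keeping a set of readers per word and category, rather than a bare count, is one way to see where interpretations agree: a word marked “Rhetoric” by many readers stands out from one marked by a single reader.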

Robo-Readers Used to Grade Test Essays

A nice story from the New York Times by Michael Winerip, Robo-Readers Used to Grade Test Essays (April 22, 2012), talks about automated essay scoring (AES) software. The story first reports a study from the University of Akron that showed that AES software is comparable to human graders (see A Win for the Robo-Readers by Steve Kolowich from Inside Higher Ed). The NYT story then goes on to report how Les Perelman, a director of writing at MIT, has shown how you can game AES tools. Among other things, they don’t check facts or truth, so you can write all sorts of outrageous things and still get a good score from AES. The story discusses some of the patterns that get good scores, like lexical variety and long sentences. The story ends with the possibility that AES could be matched by essay writing software:

Two former students who are computer science majors told him (Perelman) that they could design an Android app to generate essays that would receive 6’s from e-Rater. He says the nice thing about that is that smartphones would be able to submit essays directly to computer graders, and humans wouldn’t have to get involved.

Particularly interesting is an essay Perelman wrote to show how poor essays can game the system. I wish I could say that I never saw writing like this and that therefore there was no danger of AES systems rewarding the poor writing found in real essays:

In today’s society, college is ambiguous. We need it to live, but we also need it to love. Moreover, without college most of the world’s learning would be egregious. College, however, has myriad costs. One of the most important issues facing the world is how to reduce college costs. Some have argued that college costs are due to the luxuries students now expect. Others have argued that the costs are a result of athletics. In reality, high college costs are the result of excessive pay for teaching assistants.
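The surface patterns the story mentions, lexical variety and long sentences, are easy to measure, which is part of why they are easy to game. Here is a rough Python sketch that computes a type-token ratio and a mean sentence length for a text. It is only an illustration of such surface features, not a description of how e-Rater or any other AES product scores essays. The function name and the simple tokenization rules are assumptions.

```python
import re


def surface_features(text):
    """Return two crude surface measures of a text.

    type_token_ratio: distinct words divided by total words (lexical variety).
    mean_sentence_length: total words divided by number of sentences.
    Sentences are split naively on runs of '.', '!' or '?'.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Letters plus straight or curly apostrophes count as one word.
    words = re.findall(r"[A-Za-z'’]+", text.lower())
    if not words or not sentences:
        return {"type_token_ratio": 0.0, "mean_sentence_length": 0.0}
    return {
        "type_token_ratio": len(set(words)) / len(words),
        "mean_sentence_length": len(words) / len(sentences),
    }


sample = "In today’s society, college is ambiguous. We need it to live, but we also need it to love."
print(surface_features(sample))
```

Nothing in these measures looks at whether a claim is true or an argument holds together, so an essay that says outrageous things in varied words and long sentences would score well on both.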

Faculty Advisory Council Memorandum on Journal Pricing § THE HARVARD LIBRARY TRANSITION

From Slashdot, a story about how the Faculty Advisory Council to the Library (of Harvard) sent around a Memorandum on Journal Pricing arguing that periodical subscriptions are not sustainable and that faculty should therefore publish in open-access journals.

The Faculty Advisory Council to the Library, representing university faculty in all schools and in consultation with the Harvard Library leadership, reached this conclusion: major periodical subscriptions, especially to electronic journals published by historically key providers, cannot be sustained: continuing these subscriptions on their current footing is financially untenable. Doing so would seriously erode collection efforts in many other areas, already compromised.

Whistleblower: The NSA is Lying–U.S. Government Has Copies of Most of Your Emails

According to National Security Agency (of the USA) whistleblower William Binney, the NSA probably has most of our email. See the video Whistleblower: The NSA is Lying–U.S. Government Has Copies of Most of Your Emails. The question then is what they are doing with it. He says that you can take the email and “put it into forms of graphing, which is building relationships or social networks for everybody, and then you watch it over time, you can build up knowledge about everyone in the country” (see the transcript on the page). In other words, they could be (or are) building a large social graph that they can use in various ways.

In the transcript of the longer video Binney talks about various programs developed to filter out all the information:

Well, it was called Thin Thread. I mean, Thin Thread was our—a test program that we set up to do that. By the way, I viewed it as we never had enough data, OK? We never got enough. It was never enough for us to work at, because I looked at velocity, variety and volume as all positive things. Volume meant you got more about your target. Velocity meant you got it faster. Variety meant you got more aspects. These were all positive things. All we had to do was to devise a way to use and utilize all of those inputs and be able to make sense of them, which is what we did.

Binney goes on to talk about the code-named Stellar Wind program that Bush authorized and then was forced to change after a revolt of some sort in the Justice Department in 2004. Stories tell of senior Bush advisors trying to get Ashcroft to sign authorization papers for the program while he was in the hospital. As for Stellar Wind, it seems to be mostly about metadata – the date, to, and from of emails that you could use to build a diachronic social graph, which is what Binney was talking about. Strictly speaking this would be social network analysis rather than text analysis, but they might have supplemented the system with some keyword capabilities. Another story from Time points out the problem with such analysis: it generates too many vague false positives. Leads from the Stellar Wind program “were so vague and voluminous that field agents called them ‘Pizza Hut cases’”, that is, ostensibly suspicious calls that turned out to be takeout food orders.

Either way, these hints give us a tantalizing view into how text and network analysis are being experimented with. Are there any useful research applications?
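For research purposes, the basic idea of a diachronic social graph built from message metadata alone (date, from, to) can be sketched in a few lines of Python. This says nothing about how any actual program works. The record layout, the monthly time slices, the function names and the sample addresses are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date


def graph_by_month(messages):
    """Group (sent_on, sender, recipient) records into monthly edge counts.

    Returns a dict mapping (year, month) to a Counter of
    (sender, recipient) pairs, i.e. one weighted directed graph per month.
    """
    slices = defaultdict(Counter)
    for sent_on, sender, recipient in messages:
        slices[(sent_on.year, sent_on.month)][(sender, recipient)] += 1
    return slices


def new_ties(slices):
    """For each month, in order, list the edges not seen in any earlier month."""
    seen = set()
    result = {}
    for key in sorted(slices):
        edges = set(slices[key])
        result[key] = edges - seen
        seen |= edges
    return result


# Invented sample metadata; no message content is needed.
messages = [
    (date(2012, 1, 5), "a@example.org", "b@example.org"),
    (date(2012, 1, 9), "a@example.org", "b@example.org"),
    (date(2012, 2, 1), "a@example.org", "b@example.org"),
    (date(2012, 2, 3), "b@example.org", "c@example.org"),
]
slices = graph_by_month(messages)
for month, ties in new_ties(slices).items():
    print(month, sorted(ties))
```

Slicing by time is what makes the graph diachronic: comparing one month with the next shows relationships appearing, strengthening or fading, which is the kind of change a humanist might want to trace in a correspondence archive.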

Globalization Compendium Archive

I have been working for a while on archiving the Globalization Compendium which I worked on. Yesterday I got it archived in two Institutional Repositories:

In both cases there is a Zip of a BagIt bag with the XML files, code and other documentation from the site. My first major deposit.

A walk through The Waste Land

Daniel sent the link to this YouTube video, A walk through The Waste Land, that shows an iPad edition of The Waste Land developed by Touch Press. The version has the text, audio readings by various people, a video of a performance, the manuscripts, notes and photos. I was struck by how this extends to the iPad the experiments of the late 1980s and 1990s that exploded with the availability of HyperCard, Macromedia Director and CD-ROM. The most active publisher was Voyager, which remediated books and documentaries to create interactive works like Poetry in Motion (Vimeo demo of CD) or the Expanded Books series, but all sorts of educational materials were also being created that never got published. As a parent I was especially aware of the availability of titles as I was buying them for my kids (who, frankly, ignored them). Dr. Seuss ABC was one of the more effective remediations. Kids (and parents) could click on anything on the screen and entertaining animations would reinforce the alphabet.


Leximancer

Susan pointed me to Leximancer, which is a commercial text analysis tool that creates mind maps of your information. I’m struck by how compelling people find mind maps.

Leximancer enables you to navigate the complexity of text in a uniquely automated fashion. Our software identifies ‘Concepts’ within the text – not merely keywords but focused clusters of related, defining terms as conceptualised by the Author. Not according to a predefined dictionary or thesaurus.

The Concepts are presented in a compelling, interactive display so that you can clearly visualise and interrogate their inter-connectedness and co-occurrence – which is as important as the Concepts themselves – right down to the original text that spawned them.