H. P. Luhn, KWIC and the Concordance

We all know that the Google display comes indirectly from the Concordance, but I have found in Luhn’s 1966 “Keyword-in-Context Index for Technical Literature (Kwic Index)” the explicit recognition of the link and the reason for drawing on the concordance.

the significance of such single keywords could, in most instances, be determined only by referring to the statement from which the keyword had been chosen. This somewhat tedious procedure may be alleviated to a significant degree by listing selected keywords together with surrounding words that act as modifiers pointing up the more specific sense in which a keyword has been applied. This method of indexing words is well established in the process of compiling concordances of important works of literature of the past. The added degree of information conveyed by such keyword-in-context indexes, or “KWIC Indexes” for short, can readily be provided by automatic processing. (p. 161)

The problem for Luhn is that simply retrieving words doesn’t give you a sense of their use. His solution, first shown in the late 1950s, was to provide some context (hence “keyword-in-context”) so that readers can disambiguate for themselves and decide which index items to follow. It is from the KWIC that we ultimately get the concordance features of the Google display, though it should be noted that Luhn was proposing KWIC as a way of printing automatically generated literature indexes where the keywords were in the titles. In this quote Luhn explicitly acknowledges that the method was well established in concordances.
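The basic idea is simple enough to sketch in a few lines of Python. This is a hypothetical illustration of a keyword-in-context listing, not Luhn’s actual procedure, which worked on titles with punched-card equipment:

```python
def kwic(text, keyword, width=3):
    """List each occurrence of keyword with `width` words of context on either side."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Compare case-insensitively, ignoring trailing punctuation.
        if word.lower().strip('.,;:"()') == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            # Right-align the left context so keywords line up in a column.
            lines.append(f"{left:>25}  {word}  {right}")
    return lines
```

Run over a set of titles, each keyword appears centred in a column with its modifiers visible on either side, which is exactly the disambiguation Luhn was after.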

There is also a link between Luhn and Father Busa. According to Black, quoted in Marguerite Fischer, “The Kwic Index Concept: A Retrospective View”,

the Pontifical Faculty of Philosophy in Milan decided that they would make an analytical index and concordance to the Summa Theologica of St. Thomas Aquinas, and approached IBM about the possibility of having the operations performed on Data Processing. Experience gained in this project contributed towards the development of the KWIC Index. (This is a quote on page 123 from Black, J. D., 1962, “The Keyword: Its Use in Abstracting, Indexing, and Retrieving Information”.)

From the concordance to KWIC through to Google?

For some historical notes on Luhn see, H. P. Luhn and Automatic Indexing.

Project Bamboo

I attended Workshop 3 of Project Bamboo in Tucson, Arizona this week. I think I’m beginning to understand it, though understanding what Bamboo is was one of the favorite subjects of conversation at the meeting (so I’m conscious that my understanding is provisional). There is a deliberate ambiguity to the project since they are trying to listen to the community in order to become what we want rather than what they suspect we want. Some of my takeaway thoughts:

  • It is being structured as a consortium. Thus the long term sustainability model is that universities (and possibly associations and individuals) will contribute resources into the consortium and get back services for their faculty. This seems the right way to get to a level of broad support.
  • One thing Bamboo will do is develop shared services that participating universities can use to deliver research support.
  • One of the challenges is figuring out how to listen to the community. The stories are the mechanism being used for this. Scholars are contributing stories of what they do and what they want to do. In some cases the stories are being contributed by people who talk to faculty.
  • Recipes (like those we developed for TAPoR) will be a key way to connect stories to the shared services. A recipe is a way of abstracting from a lot of stories something that can be used to identify the tools and content needed by researchers to do useful work.
  • Bamboo probably won’t build tools, but they will build and run services with which others can build tools. Bamboo may be the project that runs SEASR as a service for the rest of us, for example. We can then build tools with SEASR for our research projects.
  • Bamboo is talking about running the shared services in a cloud. I’m not sure what that means yet.

Cornell Web Lab: Large scale web research

Diagram from Web Lab Paper

The Cornell Web Lab is an interesting example of a high performance computing project in the humanities and social sciences. As they say,

The Web Laboratory is a joint project of Cornell University and the Internet Archive to provide data and computing tools for research about the Web and the information on the Web.

In a paper on the project, A Research Library Based on the Historical Collections of the Internet Archive, William Arms and colleagues point out that the data challenge of the social sciences (and humanities) is that the data is poorly structured and there is a lot of it. The Internet Archive is a case in point; as of 2006 they had 5 to 6 petabytes of data of web pages. While it is amazing that we have such archives in computer (and human) readable form, it is hard to do anything with that much. The Web Lab approach is to provide HPC basic services for extracting subsets of the whole that can then be used by other tools.

Pliny: Welcome

Pliny, the annotation and note management tool by John Bradley at King’s College London, just got a Mellon Award for Technology Collaboration.

The Mellon Awards honour not-for-profit organisations for leadership in the collaborative development of open source software tools with application to scholarship in the arts and humanities, as well as cultural-heritage not-for-profit activities.

Pliny is free and you can try it out on the Mac or PC. John has thought a lot about how tools fit in the research process of humanists.

NiCHE: The Programming Historian

NiCHE (Network in Canadian History & Environment) has a useful wiki called The Programming Historian by William Turkel and Alan MacEachern. The wiki is a “tutorial-style introduction to programming for practicing historians,” but it could also be used by textual scholars who want to be able to program their own tools. It takes you through learning and using Python for text processing tasks like word frequencies and KWICs. It reminds me of Susan Hockey’s book Snobol Programming for the Humanities (Oxford: Oxford University Press, 1985), which I loved at the time, even if I couldn’t find a Snobol interpreter for the Mac.
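To give a flavour of the kind of exercise the wiki walks through, here is a minimal word-frequency count in Python. This is my own sketch of the technique, not code taken from The Programming Historian:

```python
from collections import Counter
import re

def word_frequencies(text, n=10):
    """Return the n most common words, lowercased and stripped of punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)
```

A dozen lines like these are often all a textual scholar needs to start asking quantitative questions of a text.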

We need more of such books/wikis.

Conference Report: Tools For Data-Driven Scholarship

I just got back from the Tools For Data-Driven Scholarship meeting organized by MITH and the Center for History and New Media. This meeting was funded by the NEH, NSF, and the IMLS and brought together tool developers, content providers (like museums and public libraries), and funders (NEH, JISC, Mellon, NSF, and IMLS). The goal was to imagine initiative(s) that could advance humanities tool development and connect tools better with audiences. I have written a Conference Report with my notes on the meeting. One of the interesting questions asked by a funder was “What do the developers really want?” It was unclear that developers really wanted some of the proposed solutions like a directory of tools or a code repository. Three things the breakout group I was in came up with were:

  • Recognition, credit and rewards for tool development – mechanisms to get academic credit for tool development. This could take the form of tool review, competitions, prizes or just citation when our tool is used. In other words we want attention.
  • Long-term Funding so that tool development can be maintained. A lot of tool development takes place in grants that run out before the tool can really be tested and promoted to the community. In other words we want funding to continue tool development without constantly writing grants.
  • Methods, Recipes, and Training that are documented that bring together tools in the context of humanities research practices. We want others with the outreach and writing skills to weave stories about their use to help introduce tools to others. In other words we want others to do the marketing of our tools.

A bunch of us sitting around after the meeting waiting for a plane had the usual debriefing about such meetings. What do they achieve even if they don’t lead to initiatives? From my perspective these meetings are useful in unexpected ways:

  • You meet unexpected people and hear about tools that you didn’t know about. The social dimension is important to meetings organized by others that bring people together from different walks of life. I, for example, finally met William Turkel of Digital History Hacks.
  • Reports are generated that can be used to argue for support without quoting yourself. There should be a report from this meeting.
  • Ideas for initiatives are generated that can get started in unexpected ways. Questions emerge that you hadn’t thought of. For example, the question of audience (both for tools and for initiatives) came up over and over.

University Libraries in Google Project to Offer Backup Digital Library – Chronicle.com

From Bethany I discovered this story in the Chronicle of Higher Education about HathiTrust, titled University Libraries in Google Project to Offer Backup Digital Library (Jeffrey R. Young, Oct. 13, 2008). “Hathi” is the Hindi word for elephant, suggesting memory and size. Here is a quote from the HathiTrust site:

As a digital repository for the nation’s great research libraries, HathiTrust (pronounced hah-TEE) brings together the immense collections of partner institutions.

HathiTrust was conceived as a collaboration of the thirteen universities of the Committee on Institutional Cooperation and the University of California system to establish a repository for these universities to archive and share their digitized collections. Partnership is open to all who share this grand vision.

The repository, among other things, will pool the volumes digitized by Google in collaboration with the universities so there is a backup should Google lose interest. Large-scale search is being studied now, and they expect to have a preview version available in November.

A Companion to Digital Literary Studies

A Companion to Digital Literary Studies, edited by Ray Siemens and Susan Schreibman, is available online in full text. This is a tremendous resource with too many excellent contributions to list individually. Chapters range from “Reading on the Screen” by Christian Vandendorpe to “Algorithmic Criticism” by Stephen Ramsay.

There is a good Annotated Overview of Selected Electronic Resources by Tanya Clement and Gretchen Gueguen with links to projects like TAPoR.

Newsknitter: Knitted Visualization

Image of knitted news
Newsknitter is a project that gathers news from RSS feeds and then generates a visualization that can be knitted into a sweater. Check out the images of the knitted sweaters. This project has been exhibited at Ars Electronica and is the work of two PhD candidates at Kunstuniversität Linz. At first the idea of machine-knitted sweaters of text visualizations sounds like a conceptual art work with no future, but as I think about it, the idea of just-in-time information being visualized and used to generate stable material objects like a sweater seems timely. All sorts of objects could have their designs generated on the spot and on demand from information off the net. Why should data be only visualized and not materialized?