Markup and Text Representation – Page 11

The Spectator’s View of Web Standards

One of my favourite software writers/bloggers is Joel Spolsky: he is thoughtful, funny, and knows how to tell a story. Yesterday he posted a longer-than-usual disquisition on the upcoming web-standards smackdown that will follow on the heels of the release of Internet Explorer 8.

My sympathies tend to fall with the standards purists (though the need to deliver a product forces me to appreciate compromise). I find the elegance of good abstraction irresistible, and standards-compliant design makes for more stable, comprehensible, editable and elegant sites (from the perspective of the developer, that is: I’m saying nothing about how anything looks to the actual eye…). And there’s a large and vocal community that shares this attitude. The nagging voice of reason, however (and I am only assuming it is the voice of reason; I haven’t mentioned this to a psychiatric professional), does frequently ask “Is this semantic markup?” The practical distinction between ‘presentation’ and ‘logic’ only looks clear from the periphery; the middle ground is big and grey and muddy.

So, Joel’s remarks on the casual meaning of ‘standards’ when applied to web development are, I think, appropriate, and his story illustrating the history of incremental standards compromise in the service of progress is undeniable (except, perhaps, to a fanatical idealist). His pragmatic arguments that 1) there is no practical web-standard benchmark against which to measure browser compliance, 2) the expression of standards specifications in W3C documentation is frequently impenetrable, and 3) Microsoft, like any other company, has to maintain the good will of its existing customers by supporting legacy products and document formats in new products, are all well argued and substantially acceptable. It is almost enough to make me feel some sympathy for Microsoft. Almost.

Of course, talking about IE is not quite like talking about Word, where the evolution of the document format is bound to the product alone. Any web developer will ask why a first test of a site architecture turns up so many fewer discrepancies between Firefox, Safari, and Opera than between any of these and IE6. (Indeed, a measure of the improvement in standards compliance of IE7 is that there may now be more discrepancy between IE6 and IE7 than between IE7 and the other major browsers. Maybe.) Surely at least some of the blame for the whole fracas with respect to IE and the rise of web standards fanaticism rests with Microsoft’s historical unwillingness to accept any general standards not of their own making. (Witness ODF vs OOXML as just one example.)

I’ll stop there and leave the flaming for other, more capable participants. In the end, one can’t really disagree with Joel’s point that the demand by compliance fanatics within Microsoft (I know, the very idea of their existence leaves me a little breathless) that IE8 be so rigid in its adherence to standards-based code that only 37% (or whatever number…) of existing web pages will accurately render is just silly. The plea one wants to make is for the middle path: too much unpredictability in a platform will hinder development, and so will too much inflexibility. The question is “how much is too much?” We complain about caprice in the rendering decisions of various browsers (some more than others), but it is almost certainly a good thing that we are required to reinvent from time to time; the human impulse is to improvise and the best measure of our ingenuity is our capacity to swede the world. (Well, I liked the “be kind, rewind” site so much I had to work it in somewhere.)

SET 26

Image of GROCK

SET 26 is a Swiss design company that sells furniture shaped like letters from the Roman alphabet. Each letter costs about 1,500 euros and has doors that open to reveal shelves. They have a Konfigurator so you can see any combination of 5 letters in the available colours (like the GROCK above).

I read about this in a strange online Facsimile Magazine while reading a reproduction of a 1970 Time Life Books’ Nature/Science Annual article on “Art’s New Ally – Science.” The article documents a number of technological arts projects, including the Experiments in Art and Technology (E.A.T.) cooperative founded by Billy Klüver and Robert Rauschenberg.

Project Gutenberg: The Killer App

Michael Hart of Project Gutenberg wrote a provocative answer to Willard’s question (Humanist Discussion Group, Vol. 21, No. 495) about the “killer-app” of digital humanities that I reproduce here verbatim:

True, you can’t convince the skeptics. . .you still can’t say that digitial music has wiped out analog music because a few places still make analog records which are really better, not that a true skeptic needs those last few words.

Even when there are more eBooks than paper books, no way.

Even when there are 100 times as many eBooks, not happening.

It’s not going to matter what they SAY about eBooks, reality is going in that direction and paper books will never reverse that trend, simply because you can /OWN/ MILLIONS OF eBOOKS IN A TERABYTE DRIVE [costing under $200].

Before Gutenberg the average person could own zero books.

Before Project Gutenberg an average person could own 0 libraries.

It’s literally as simple as that.

The cost/benefit ratio for eBooks is too much better than paper.

Thanks!!!

Is he right?

Spreading the load – volunteer computing

Martin Mueller and James Chartrand both pointed me to an article in the Economist on volunteer computing, Spreading the load. The article nicely covers a number of projects that enlist volunteers over the web, like those I noted in Tagging Games. It doesn’t really distinguish the projects like BOINC that enlist volunteer processing from the ones like BOSSA (and the Mechanical Turk) that enlist volunteer human contributions, and perhaps there isn’t much of a difference. It is always a human volunteering some combination of their time and computing to a larger project.

What Martin has suggested is that we think about how humanities computing projects might be enabled by distributed skill support. Could we enlist volunteer taggers for electronic texts with the right setup? Would we need to make it a game like ESP to check tagging choices against each other? The only example I can think of in the humanities is the Suda On Line (SOL), a project where volunteers are translating the “Byzantine encyclopedia known as the Suda, a 10th century CE compilation of material on ancient literature, history, and biography.” (From the SOL About page.) Can that infrastructure be generalized to a translating and enrichment engine for language, literature, history and philosophy?

Tagging Games

Peter O pointed me to a new phenomenon on the web that I’ve been meaning to blog about for a while: the leveraging of human players for tasks that can’t be easily automated. Perhaps the best example is the ESP Game. The online game is described in “How to Play”:

The ESP Game is a two-player game. Each time you play you are randomly paired with another player whose identity you don’t know. You can’t communicate with your partner, and the only thing you have in common with them is that you can both see the same image. The goal is to guess what your partner is typing on each image. Once you both type the same word(s), you get a new image.

The game (and its Google Image Labeler spin-off) leverages fun to get image tagging done. Remember when we thought computer image recognition would do that? Now we are using online games to make it fun for humans to do what we do best – instant complex judgements about the visual. If you get enough people playing we could make serious inroads into tagging the visual web.
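The agreement rule in “How to Play” is simple enough to sketch in a few lines of Python. This is only my guess at the idea, not the ESP Game’s actual code: the function names (`normalise`, `first_agreed_label`, `tally_labels`) and the case-insensitive matching are assumptions of mine. It shows how two players who cannot communicate settle on a label, and how agreements from many pairs could be tallied to check tagging choices against each other.

```python
from collections import Counter
from itertools import zip_longest


def normalise(guess):
    """Trim and lower-case a guess so 'Puppy ' and 'puppy' count as the same word.
    (Case-insensitive matching is an assumption, not something the rules state.)"""
    return guess.strip().lower()


def first_agreed_label(guesses_a, guesses_b):
    """Return the first word that both players have typed, or None if they never agree.

    The two guess lists are read turn by turn (A's first guess, B's first guess,
    A's second, ...). A match can be with any earlier guess by the other player.
    """
    seen_a, seen_b = set(), set()
    for a, b in zip_longest(guesses_a, guesses_b):
        if a is not None:
            word = normalise(a)
            if word and word in seen_b:
                return word
            seen_a.add(word)
        if b is not None:
            word = normalise(b)
            if word and word in seen_a:
                return word
            seen_b.add(word)
    return None


def tally_labels(games):
    """Count how many independent pairs agreed on each label for one image.

    `games` is a list of (guesses_a, guesses_b) tuples; pairs that never agree
    contribute nothing.
    """
    counts = Counter()
    for guesses_a, guesses_b in games:
        label = first_agreed_label(guesses_a, guesses_b)
        if label is not None:
            counts[label] += 1
    return counts


if __name__ == "__main__":
    games = [
        (["dog", "puppy", "grass"], ["animal", "Puppy "]),
        (["puppy"], ["cute", "puppy"]),
        (["cat"], ["dog"]),  # this pair never agrees
    ]
    print(tally_labels(games).most_common())
```

The point of the sketch is that no one has to check the answers: a label only counts when two strangers arrive at it independently, and the more pairs that agree, the more you can trust the tag.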

What is impressive about ESP is what a simple and powerful idea it is. This is Luis von Ahn’s second sweet contribution, the first being CAPTCHA and reCAPTCHA.

While it isn’t quite as clean, a generalized version of the idea of people power is Amazon’s Mechanical Turk. The idea is that people can,

Complete simple tasks that people do better than computers. And, get paid for it.

Choose from thousands of tasks, control when you work, and decide how much you earn.

Developers can register tasks, people can work on HITs (Human Intelligence Tasks) and get paid for the work, and Amazon can become the largest labour market for small tasks.

WorldCat Identities: Publication Timelines

Publication Timeline Image

WorldCat Identities is an experimental project by the OCLC that connects to their WorldCat catalogue of library holdings. Identities presents you with a cloud of authors (identities):

Word Cloud Image

If you click on an author you get publication information about the author, including a publication timeline like the one for Marx above. You can also connect to WorldCat and find a copy of the book near you by giving a postal code, for example.

Texto Digital: a-writings

Humanist posted an announcement for a new issue of the Brazilian journal Texto Digital that includes some interesting animated experiments (like the image above), including a series, a-writings, by Gerard Dalmon. The address “To the reader” starts with,

To weave, write and inscribe thoughts on the digital medium is the purpose of this journal that reaches its fifth number with a somewhat different content. It is the first time we publish an issue with more creative than theoretic interventions.

In Search of Stupidity, over 20 years of high-tech marketing disasters

In Search of Stupidity, over 20 years of high-tech marketing disasters is an amusing book about the marketing and development of commercial software by Merrill R. Chapman. Some of the chapters deal with poor decisions by word-processing companies like MicroPro, which ended up with two competing products (WordStar 3.3 and WordStar 2000) that were completely different programs. MicroPro International, according to Chapman, was in 1983 the largest microcomputer software company, with close to 70 million in sales. The problem was that the WordStar programming team was fired (or quit) and a new team that was bought up had a different word processor in development.

One thing this book documents well is the battles between the management/marketing folk, on the one hand, and the developers, on the other. The fault does not always lie with the marketing folk. Chapman describes situations where the developers decide to totally redevelop a product from the ground up when the market is expecting a timely upgrade. Philippe Kahn of Borland, for example, decided to redevelop Paradox completely in object-oriented code and ended up alienating his users just when Microsoft released Access.

The one company that stands out as consistently avoiding fatal stupid mistakes is Microsoft, which may explain why they are now so much bigger than any other software company. That Microsoft had an experienced programmer as lead probably meant there was never the sort of disconnect that doomed other software companies.

The book is partly a response to In Search of Excellence, which lauded a number of high-tech companies as having excellent corporate cultures. Unfortunately many of the “excellent” companies didn’t last … hence the search for stupidity.

Check out their Museum Exhibits of stupid marketing.

Amazon Kindle and Sony Reader

The e-book that seemed to be dead as an idea is back. Sony has their Reader, which uses e-ink to offer a more paper-like reading experience. Amazon has just announced the Kindle, which has a keyboard and comes with free EV-DO wireless access so you can order material from Amazon without connecting to your PC. The Amazon video mentions that you can automatically get newspapers and updates from blogs.

I’m guessing one of the real strengths of Kindle is Amazon – that they will have the best content and with EV-DO they will have easy access to content wherever you can get a connection. On the other hand the Kindle looks dorky (not that the Sony looks much better.) As they say, WWAD (what would Apple do?)

To be honest I thought the e-book reader as a device was dead after the last round of devices like the Rocket eBook. I figured tablet PCs and PDAs would make dedicated readers obsolete – we do after all read lots of pages off screens already. See Cory Doctorow on Ebooks: Neither E, Nor Books. But, I was wrong … it seems the big guys think there is a market for such appliances.

Otto Ege and Karpeles: Manuscript Mavericks

Screen Shot of Manuscript Interface

Manuscripts are on my mind. At the 2007 Congress I heard a lecture by Peter Stoicheff about the architecture of the page. Stoicheff talked about Otto Ege, a manuscript trader who cut up manuscripts and sold sets of pages – one (set) ended up at the University of Saskatchewan and was featured in an exhibit, Scattered Leaves, that ran during the conference. I was fascinated first by his reorientation from the book to the page and then by the project of Remaking the Book – virtually reconstructing the books that were scattered across the sets Ege assembled. Stoicheff pointed out that it is easy to criticize Ege for cutting up books to sell pages, but went on to ask about the history of the book as the privileged object. The immediate horror we feel when we hear or see the cutting up of a book hints at how fundamental and unexamined an object the book is to academics. See Stoicheff’s The Future of the Page (conference and book).

I was reminded of this story while reading about the Karpeles Manuscript Library Museums, two of which are in historic buildings nearby in Buffalo. David Karpeles has put together what is supposed to be one of the largest private manuscript collections and makes many of the manuscripts available both through his museums and online through the Karpeles Manuscript Library. I particularly like the neat interface for viewing the manuscript with a lens to see the plain text. The web site for the Museums, however, is idiosyncratic, with music (including O Canada) that plays automatically and poor navigation. Is Karpeles another manuscript maverick like Otto Ege?