Hobson-Jobson: A Glossary of Colloquial Anglo-Indian Words and Phrases, and of Kindred Terms, Etymological, Historical, Geographical and Discursive

I just came across a peculiar dictionary, Hobson-Jobson: A Glossary of Colloquial Anglo-Indian Words and Phrases, and of Kindred Terms, Etymological, Historical, Geographical and Discursive. It is a dictionary of words of Indian (and other) origin that would have been used by the English in India. It is a dictionary of the Raj and the trade routes, and it is full of surprises. It is a work of its time, published just at the end of the 19th century. The title, “Hobson-Jobson”, is an example of the colloquial terms covered:

HOBSON-JOBSON, s. A native festal excitement; a tamāsha (see TUMASHA); but especially the Moharram ceremonies. This phrase may be taken as a typical one of the most highly assimilated class of Anglo-Indian argot, and we have ventured to borrow from it a concise alternative title for this Glossary. It is peculiar to the British soldier and his surroundings, with whom it probably originated, and with whom it is by no means obsolete, as we once supposed. My friend Major John Trotter tells me that he has repeatedly heard it used by British soldiers in the Punjab; and has heard it also from a regimental Moonshee. It is in fact an Anglo-Saxon version of the wailings of the Mahommedans as they beat their breasts in the procession of the Moharram — “Yā Hasan! Yā Hosain!”

Alas, they don’t have the word “dylok” – supposed to be an Indian-East-African version of “dialogue” – used for variety shows and drama.
Update: I just discovered that this is “back-ended” by The ARTFL Project.

Google Book Search: Opinion

To scan or not to scan? (Guardian Unlimited, March 8, 2006) is a blog entry by Culture Vulture Victor Keegan in defense of Google’s scanning of millions of books, including books still under copyright. The comments are good too. The key issue seems to be whether this is covered by “fair use”.

Technically, as Charles Arthur points out, this is blatant infringement. Turn to the front of any book and you will find a paragraph that states that no part of it shall be copied or stored without the publisher’s permission. The University of Michigan is keeping material that is still within copyright “dark” until the copyright runs out, while Google argues that letting people read snippets of copyrighted books is covered by “fair use” provisions of the kind that mean we don’t go to jail for sneaking into Waterstone’s to look up a reference.

I found this from the Google Book Search – News & Views – Media Coverage page.

Web Slide Shows

I once wrote an XML language to replace PowerPoint, with a style sheet that turned my class outline into slides, but I have found a better way. HTML Slidy by Dave Raggett is a slide show about using XHTML to do slide shows. Dave has done a nice job of it using JavaScript and CSS. My one problem is that the code is verbose – it needs a wiki crib language for quickly getting ideas out.
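To give a sense of what such a crib language might look like, here is a minimal Python sketch. The syntax is my own invention for illustration (a line starting with "= " opens a slide, a line starting with "* " adds a bullet), and the function name outline_to_slides is hypothetical. It assumes each slide is marked up as a div with class "slide", which is how I understand Slidy to work; the output is only a fragment and would still need to be wrapped in a page that loads Slidy's script and style sheet.

```python
import html


def outline_to_slides(text: str) -> str:
    """Turn a plain-text outline into slide <div> markup.

    Hypothetical crib syntax (an assumption, not an existing wiki format):
      "= Title"  starts a new slide
      "* item"   adds a bullet to the current slide
    Any other line is ignored.
    """
    slides = []  # each entry is (title, list_of_bullets)
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("= "):
            slides.append((line[2:].strip(), []))
        elif line.startswith("* ") and slides:
            slides[-1][1].append(line[2:].strip())

    parts = []
    for title, bullets in slides:
        # Assumed convention: one <div class="slide"> per slide.
        parts.append('<div class="slide">')
        parts.append(f"  <h1>{html.escape(title)}</h1>")
        if bullets:
            parts.append("  <ul>")
            parts.extend(f"    <li>{html.escape(b)}</li>" for b in bullets)
            parts.append("  </ul>")
        parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(outline_to_slides("= Intro\n* First point\n* Second point\n"))
```

Two lines of outline per slide is about as terse as it gets, which is the point: the markup can be generated once the ideas are down.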

Another approach is to use Opera – see the Opera Show Tutorial. They have an online Opera Show Generator which creates the HTML for you. Neat.

Greg Crane: What Do You Do with a Million Books?

What Do You Do with a Million Books?, by Gregory Crane, talks about the implications of large-scale book scanning projects like the Google Print project that is scanning tens of millions of books. He introduces an interesting term, “recombinant documents”, to describe how software (like what they have at Perseus) can add intelligent connections to documents, but also the way documents can be reorganized and combined into “concordances” or hybrid documents. This is similar, I think, to what Mark Olsen was talking about in Toward meaningful computing. Crane’s answer, drawn from the DARPA Global Autonomous Language Exploitation (GALE) project, is three core functions:

  1. Analog to text (digitizing speech and print)
  2. Machine translation (from language to language)
  3. Information extraction (mining for linkable dates, names and so on)

Thanks to Mark Olsen for this link.

Ubiquity: Why People Don’t Read Online

Wendell Piez pointed me to an article in Ubiquity: An ACM IT Magazine and Forum with the title Why People Don’t Read Online and What to do About It. It is by Michelle Cameron (Vol 6, Issue 40, Nov. 2-8, 2005) and is short and easy to read. It nicely summarizes the reasons people don’t read off the screen and how to write for those who scan. The last guideline is:

Know when to stop writing. Like now.

Ask-an-Expert for Linguistics Documentation

Alex Sevigny pointed me to a post on the Linguist List (16.2917) about Ask-an-Expert System: Digitizing Lang Documentation. The announcement is for a system run by E-MELD, which is a project dedicated to the preservation of endangered languages data and documentation, for people to Ask an Expert. This is not a bot; the questions go to real human experts. An interesting way to support the community.