Tagging Full Text Searchable Articles: An Overview of Social Tagging Activity in Historic Australian Newspapers August 2008 – August 2009

D-Lib has an article by Rose Holley of the Australian Newspapers Digitisation Program (ANDP), Tagging Full Text Searchable Articles: An Overview of Social Tagging Activity in Historic Australian Newspapers August 2008 – August 2009 (January/February 2010, Volume 16, Number 1/2).

The Australian Newspapers project is a leader in crowdsourcing. It encourages users to correct the full text of articles and to tag them. This D-Lib article focuses on the tagging and mentions other projects that have researched the effectiveness of user tagging (and found it wanting compared to professional subject tagging). The conclusion endorses user tagging:

The observations show that there were both similarities and differences in tagging activity and behaviours across a full text collection as compared to the research done on tagging in image collections. Similarities included that registered users tag more than anonymous users, that distinct tags form 21-37% of the tag pool, that 40% or more of the tag pool is created by ‘super-taggers’ (top 10 tag creators), that abuse of tags occurs rarely if at all, and that spelling mistakes occur fairly frequently if spell-check or other mechanisms are not implemented at the tag creation point. Notable differences were the higher percentage of distinct tags used only once (74% at NLA) and the predominant use of personal names in these tags. This is perhaps related to the type of resource (historic newspaper) rather than its format (full-text). It is likely that this difference may be duplicated if tagging were enabled across archive and manuscript collections. There was an expectation from users that since this was a library service offering tagging, there would be some ‘strict library rules’ for creating tags, and users were surprised there were none. The users quickly developed their own unwritten guidelines. Clay Shirky suggests “Tagging gets better with scale” and libraries have lots of scale – both in content and users. We shouldn’t get too hung up on guidelines and quality. I agree with Shirky that “If there is no shelf, then even imagining that there is one right way to organise things is an error”.

The experience of the National Library of Australia shows that tagging is a good thing, users want it, and it adds more information to data. It costs little to nothing and is relatively easy to implement; therefore, more libraries and archives should just implement it across their entire collections. This is what the National Library of Australia will have done by the end of 2009.
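
The percentages quoted in the conclusion (distinct tags as a share of the tag pool, the share of distinct tags used only once, and the share contributed by the top 10 "super-taggers") can be reproduced from a simple log of tag events. The following is a minimal sketch in Python, not the method used in the article: it assumes the "tag pool" means the total number of tag applications, and the function name, the `(user, tag)` input format, and the sample data are hypothetical.

```python
from collections import Counter


def tag_pool_metrics(tag_events, top_n=10):
    """Summarise a list of (user, tag) pairs, one pair per tag application.

    Assumption: the "tag pool" is the total number of tag applications.
    The article may define it differently.
    """
    if not tag_events:
        raise ValueError("tag_events must not be empty")

    total = len(tag_events)
    tag_counts = Counter(tag for _, tag in tag_events)
    user_counts = Counter(user for user, _ in tag_events)

    distinct = len(tag_counts)
    used_once = sum(1 for count in tag_counts.values() if count == 1)
    top_total = sum(count for _, count in user_counts.most_common(top_n))

    return {
        # Distinct tags as a share of all tag applications.
        "distinct_share": distinct / total,
        # Share of distinct tags that were applied exactly once.
        "single_use_share": used_once / distinct,
        # Share of all tag applications made by the top_n most active users.
        "super_tagger_share": top_total / total,
    }


# Hypothetical sample data, for illustration only.
sample_events = [
    ("alice", "smith"),
    ("alice", "smith"),
    ("alice", "jones"),
    ("bob", "flood"),
    ("carol", "smith"),
]
metrics = tag_pool_metrics(sample_events, top_n=1)
```

With real data, `top_n=10` matches the article's definition of super-taggers as the top 10 tag creators.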