RDUES: WebCorp: The Web as Corpus

The Research and Development Unit for English Studies (RDUES) at UCE Birmingham has a tool, WebCorp: The Web as Corpus, which searches Google for a term, then goes to the top 199 documents Google identifies and searches them. It takes a while and works like our Googlizer, but produces more verbose results: a concordance organized by document, with links to a full word frequency list for each document. The advanced search form has some interesting features, including the ability to point it at other search engines.
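To make the output concrete, here is a minimal Python sketch of a concordance organized by document, each with its own word frequency list. This is not WebCorp's code and I have not seen how WebCorp works internally. The function names (`concordance`, `word_frequencies`, `concordance_by_document`) and the 30-character context width are my own assumptions, and the step of querying a search engine and fetching pages is left out, so the documents are passed in as plain strings.

```python
import re
from collections import Counter


def concordance(text, term, width=30):
    """Return (left context, match, right context) for each whole-word hit."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


def word_frequencies(text):
    """Count lowercased word tokens in one document."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def concordance_by_document(documents, term, width=30):
    """Group concordance lines by document, keeping only documents with hits.

    `documents` maps a name or URL to already-fetched plain text.
    """
    report = {}
    for name, text in documents.items():
        hits = concordance(text, term, width)
        if hits:
            report[name] = {
                "hits": hits,
                "frequencies": word_frequencies(text),
            }
    return report
```

Calling `concordance_by_document({"page-a": "The web is a corpus."}, "web")` returns one entry for `page-a` holding its concordance lines and a `Counter` of its words. A real tool would also need to strip HTML and handle fetch failures, which this sketch ignores.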

UCE Birmingham is a strange place when seen from the web. UCE stands for “University of Central England”, and you have to go as deep as the “At A Glance: History Of UCE Birmingham” page to find this out. (Apparently there’s no point explaining it to outsiders anywhere on the home page.) It seems to have been formed in 1992 out of all the little colleges, polytechnics and schools in the area.