Theorizing from Data is a 52-minute presentation by Peter Norvig of Google about what they can do (and are doing) with large amounts of textual data. It is long, but it is excellent. One possible conclusion is that linguistic models can’t compete with lots of data combined with statistical methods. Useful applications of this approach include Google Sets, which shows you terms that cluster with the items you provide, as well as Google Trends and Google Translate.
This is part of a set of videos from Google Developer Day.
Thanks to Alex for this.