The size of the World Wide Web

[Screen grab: WorldWideWebSize.com graph of the estimated size of the web]

Reading a paper by Lev Manovich, I came across a reference to the web site WorldWideWebSize.com, which graphs the size of the World Wide Web. The site searches Google and Bing daily for different words from a corpus and then uses the result counts the engines report to estimate the size of the web.

When you know, for example, that the word ‘the’ is present in 67.61% of all documents within the corpus, you can extrapolate the total size of the engine’s index from the document count it reports for ‘the’. If Google says that it found ‘the’ in 14,100,000,000 webpages, an estimated size of Google’s total index would be 23,633,010,000.
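The extrapolation described above can be sketched in a few lines. This is only a minimal illustration, assuming the estimate is simply the reported hit count divided by the word's document frequency in the corpus; the function name and parameters are hypothetical and not taken from WorldWideWebSize.com, which may apply further steps not described here. Note that this plain division gives roughly 20.85 billion for the example, not the roughly 23.6 billion quoted above (that figure equals 14,100,000,000 × 1.6761).

```python
def estimate_index_size(reported_hits: int, document_frequency: float) -> float:
    """Extrapolate an index size from one word's hit count.

    reported_hits: number of pages the engine reports for the word.
    document_frequency: fraction of corpus documents containing the word
        (e.g. 0.6761 for 67.61%).
    Both names are hypothetical, chosen for this sketch only.
    """
    if not 0 < document_frequency <= 1:
        raise ValueError("document_frequency must be in (0, 1]")
    # If the word appears in a known fraction of all documents,
    # total documents ~= hits / fraction.
    return reported_hits / document_frequency


# Figures from the example in the text: 'the' at 67.61%, 14.1 billion hits.
estimate = estimate_index_size(14_100_000_000, 0.6761)
print(f"Estimated index size: {estimate:,.0f}")
```

Since the site searches for several different words, one plausible refinement (again an assumption) would be to compute this estimate per word and average the results.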

In the screen grab above you can see that the estimated size can change dramatically over time. It is hard to tell why.