{"id":5783,"date":"2015-05-14T01:16:50","date_gmt":"2015-05-14T01:16:50","guid":{"rendered":"http:\/\/theoreti.ca\/?p=5783"},"modified":"2015-05-14T01:16:50","modified_gmt":"2015-05-14T01:16:50","slug":"the-size-of-the-world-wide-web","status":"publish","type":"post","link":"https:\/\/theoreti.ca\/?p=5783","title":{"rendered":"The size of the World Wide Web"},"content":{"rendered":"<p><a href=\"https:\/\/theoreti.ca\/wp-content\/uploads\/2015\/05\/sizeofweb.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-5789\" src=\"https:\/\/theoreti.ca\/wp-content\/uploads\/2015\/05\/sizeofweb-300x171.png\" alt=\"sizeofweb\" width=\"300\" height=\"171\" srcset=\"https:\/\/theoreti.ca\/wp-content\/uploads\/2015\/05\/sizeofweb-300x171.png 300w, https:\/\/theoreti.ca\/wp-content\/uploads\/2015\/05\/sizeofweb.png 678w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>Reading a paper by Lev Manovich, I came across a reference to the web site <a href=\"http:\/\/www.worldwidewebsize.com\/\">WorldWideWebSize.com<\/a>, which graphs the size of the World Wide Web. The web site searches Google and Bing daily for different words from a corpus and then uses the total results to estimate the size of the web.<\/p>\n<blockquote><p>When you know, for example, that the word &#8216;the&#8217; is present in 67,61% of all documents within the corpus, you can extrapolate the total size of the engine&#8217;s index by the document count it reports for &#8216;the&#8217;. If Google says that it found &#8216;the&#8217; in 14.100.000.000 webpages, an estimated size of the Google&#8217;s total index would be 23.633.010.000.<\/p><\/blockquote>\n<p>In the screen grab above you can see that the estimated size can change dramatically over time. 
It is hard to tell why.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reading a paper by Lev Manovich, I came across a reference to the web site WorldWideWebSize.com, which graphs the size of the World Wide Web. The web site searches Google and Bing daily for different words from a corpus and then uses the total results to estimate the size of the web. When you know, &hellip; <a href=\"https:\/\/theoreti.ca\/?p=5783\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">The size of the World Wide Web<\/span><\/a><\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[54,11],"tags":[],"class_list":["post-5783","post","type-post","status-publish","format-standard","hentry","category-big-data","category-internet-culture-and-technology"],"_links":{"self":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/5783","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5783"}],"version-history":[{"count":2,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/5783\/revisions"}],"predecessor-version":[{"id":5790,"href":"https:\/\/theoreti.ca\/index.php?rest_route=\/wp\/v2\/posts\/5783\/revisions\/5790"}],"wp:attachment":[{"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5783"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5783"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/theoreti.ca\/index.php?r
est_route=%2Fwp%2Fv2%2Ftags&post=5783"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}