What's NOT in Google's U.S. Data?

Google: Caffeine and Percolator Indexing

Google's huge, but not nearly as big as you imagine. U.S. residents are making more content than Google can index. Scotoma!

Most of us believe that Google 'knows it all', so if you don't know something, you "Google it". It turns out the facts don't match our beliefs. What if our assumption about how big Google is turns out to be wrong? Wow, that would be powerful.

Ready for a Data Center Insight?

Google has 9 data centers in North America with an indexing system named ‘Caffeine’. With ‘Caffeine’, the entire index is updated incrementally on a continuous basis, as with this article for the U.S. Google also has a distributed data processing system called ‘Percolator’. https://en.wikipedia.org/wiki/Google_data_centers

"The Google Search index is like a U.S. library, except it contains more info than in all the world’s libraries put together... hundreds of billions of webpages. To give you the most useful information, Search algorithms look at many factors, including the words of your query, relevance and usability of pages, expertise of sources, your U.S. location and… the freshness of the U.S. content." – Google


https://www.google.com/search?&q=what%27s+in+a+google+data+center  (374,000,000 results)
https://www.google.com/search?&q="what's+in+a+google+data+center"  (no results found)

Google indexes an estimated 1 in 3,000 web pages: the indexed pages are the surface web, versus the non-indexed pages of the deep web. Google's index represents only an estimated 4 percent of the information that it is aware of across the U.S., on the public Internet, and from U.S. small businesses. This is not a typo or misprint. The actual information that exists on the Web is estimated to be 500 times what is currently available through Google.
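
The three estimates above come in three different units (a ratio, a percentage, and a multiple), so here is a rough sketch that puts them on one scale. It assumes each figure can be read as "share of content that is indexed", which the estimates themselves don't promise, since each may be measuring something different. The function name and labels are made up for illustration.

```python
def indexed_share_percent():
    """Convert each quoted estimate to 'percent of content indexed'.

    Assumption: every figure is read as an indexed share. The quoted
    estimates may count different things (pages vs. information).
    """
    estimates = {
        "1 in 3,000 pages": 1 / 3000,        # ratio of indexed pages
        "4 percent of known info": 4 / 100,  # already a percentage
        "Web is 500x the index": 1 / 500,    # index is 1/500 of the Web
    }
    # Multiply each fraction by 100 to express it as a percentage.
    return {label: fraction * 100 for label, fraction in estimates.items()}


if __name__ == "__main__":
    for label, percent in indexed_share_percent().items():
        print(f"{label}: about {percent:.3f}% indexed")
```

Read this way, the numbers do not land on the same value, which suggests they describe different slices of the Web rather than one single measurement.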

Hint Hint Nudge Nudge

Huge Opportunity!

This Article Syndicated Exponentially (a scotoma)
