Google Search Engine

This is a demo of the Google Search Engine. Note that it is research in progress, so expect downtime and malfunctions. You can find the older Backrub demo web page here.

Google is being developed by Larry Page and Sergey Brin with very talented implementation help from Scott Hassan and Alan Steremberg.

Search Stanford



Search The Web


Current Status of Google:

Web Page Statistics

Number of Web Pages Fetched 24 million
Number of URLs Seen 76.5 million
Number of Email Addresses 1.7 million
Number of 404s 1.6 million

Storage Statistics

Total Size of Fetched Pages 147.8 GB
Compressed Repository 53.5 GB
Short Inverted Index 4.1 GB
Full Inverted Index 37.2 GB
Lexicon 293 MB
Temporary Anchor Data (not in total) 6.6 GB
Document Index Incl. Variable Width Data 9.7 GB
Links Database 3.9 GB
Total Without Repository 55.2 GB
Total With Repository 108.7 GB
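
The totals in the storage statistics above can be cross-checked with a little arithmetic. The following is only a rough sketch: the variable names are made up for illustration, the lexicon is converted assuming 1 GB = 1000 MB, and the compression ratio is simply the fetched page size divided by the compressed repository size.

```python
# Sizes in GB, copied from the storage statistics above.
fetched_pages_gb = 147.8
repository_gb = 53.5

# Components that count toward the total (names are illustrative only).
index_components_gb = {
    "short_inverted_index": 4.1,
    "full_inverted_index": 37.2,
    "lexicon": 0.293,  # 293 MB, assuming 1 GB = 1000 MB
    "document_index": 9.7,
    "links_database": 3.9,
}
# Temporary anchor data (6.6 GB) is marked "not in total", so it is left out.

total_without_repository = sum(index_components_gb.values())
total_with_repository = total_without_repository + repository_gb
compression_ratio = fetched_pages_gb / repository_gb

print(f"Total without repository: {total_without_repository:.1f} GB")  # 55.2 GB
print(f"Total with repository: {total_with_repository:.1f} GB")        # 108.7 GB
print(f"Repository compression ratio: {compression_ratio:.2f}x")       # 2.76x
```

Both computed totals agree with the figures listed above once rounded to one decimal place.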

Known Problems:

  1. We have only crawled US-looking domains so as not to congest international links. This makes the search engine somewhat incomplete.
  2. There has been some corruption in docids for anchor hits. This results in some random-looking matches (about 1 in 10).
    SB: I have tried to patch the code to account for this but there are still many problems.
  3. Also, some docinfo pointers are corrupted. SB: I have patched the code to account for most of these but I don't have tight bounds on the extent of corruption.
  4. The performance is somewhat poor right now. This is partly due to data going over NFS and antiquated hardware. However, we are anticipating equipment donations from IBM and Intel to help with performance and increase our disk capacity so we can scale to 100 million pages.

Before emailing, please read the FAQ. Thanks.

Please send any comments to backrub@google.stanford.edu.

Copyright © 1997 Larry Page, Sergey Brin, Scott Hassan, Alan Steremberg
Backrub

Last modified: Thu Dec 4 10:09:44 PST