Effective searches with Google
How Google Works
i. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book – it tells which pages contain the words that match any particular query term.
ii. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
iii. The search results are returned to the user in a fraction of a second.
Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:
- Googlebot, a web crawler that finds and fetches web pages.
- The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
- The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google’s indexer.
Googlebot finds pages in two ways:
- through an add URL form, www.google.com/addurl.html, and
- through finding links by crawling the web.
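As a rough illustration of the second route, the sketch below pulls the links out of a downloaded page, which is the step that lets a crawler discover new pages to fetch. It is not Googlebot's code: the class name LinkCollector, the helper extract_links, and the sample page are all made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in `html`, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content and address, used only to exercise the sketch.
sample_page = '<p><a href="/about.html">About</a> <a href="http://example.org/">Other</a></p>'
found = extract_links(sample_page, "http://example.com/index.html")
```

A crawler built this way would add each newly found link to a list of pages still to be requested, then repeat the process on those pages.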
Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.
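To make the idea of such an index concrete, here is a minimal sketch of the same kind of data structure, usually called an inverted index. It is a toy under stated assumptions, not Google's implementation: the function name build_index and the two sample pages are invented, and words are split on whitespace only.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to {document id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    return index


# Hypothetical pages, identified by made-up ids.
documents = {
    "page1": "google indexes the web",
    "page2": "the web is large",
}
index = build_index(documents)

# Look up a query term: which documents contain it, and where.
web_postings = index["web"]
# Entries can be listed alphabetically, like the index at the back of a book.
sorted_terms = sorted(index)
```

Because the lookup goes straight from a term to the documents that contain it, answering a query does not require rereading every stored page.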
Google’s Query Processor
The query processor has several parts, including the user interface (search box), the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.
PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.
Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page.
To view a page’s PageRank, the Google Toolbar must be installed. Search for the Google Toolbar and install it.
The first result directly
Click on the I’m Feeling Lucky button on Google’s home page to go directly to the first result for your query. Instead of showing you a list of pages, Google sends you immediately to the result that may be most relevant to your query. For example, if you enter the query MCSE and click the I’m Feeling Lucky button, Google may send you to the home page of Microsoft Learning.
Note: I’m Feeling Lucky doesn’t consider the various sponsored links on the first results page, which are paid advertisements, when deciding where to take you. In other words, the I’m Feeling Lucky button will send you to what Google considers the most relevant result that is not a paid advertisement.
Search for MCSE again and note the sponsored links listed above the result that I’m Feeling Lucky displayed.
The search terms you enter and the order in which you enter them affect both which pages appear in your search results and the order in which they are listed. Try similar ways of specifying the same search and note how the results differ.
When Google detects very common words, known as stop words, such as where, do, I, for, and a, it ignores them so that it can return relevant results.
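Stop-word handling can be pictured as a simple filter over the query terms. The sketch below assumes a stop-word list containing only the five examples named above; the real list is not given here, and the function name remove_stop_words is invented for illustration.

```python
# Only the examples named in the text; the full list is an unknown.
STOP_WORDS = {"where", "do", "i", "for", "a"}


def remove_stop_words(query):
    """Return the query terms that are not stop words, in their original order."""
    return [word for word in query.split() if word.lower() not in STOP_WORDS]


kept = remove_stop_words("where do I look for a bicycle")
```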
Not sure how to spell something? Don’t worry, try gessing or speling any way you can. In just the first few months on the job, Google engineer Noam Shazeer developed a spelling correction (suggestion) system based on what other users have entered. The system automatically checks whether you are using the most common spelling of each word in your query.
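The idea of suggesting the most common spelling can be sketched as follows. This is only an illustration of the principle, not the system described above: the query counts are invented, and closeness is measured with Python's standard difflib module rather than whatever Google actually uses.

```python
from difflib import get_close_matches

# Hypothetical counts of how often users typed each spelling.
QUERY_COUNTS = {"guessing": 900, "spelling": 1200, "gessing": 4, "speling": 7}


def suggest(word, counts=QUERY_COUNTS):
    """Suggest the most frequently typed spelling that is close to `word`."""
    candidates = get_close_matches(word, list(counts), n=5, cutoff=0.7)
    if not candidates:
        return word  # nothing similar is known, so leave the word alone
    return max(candidates, key=lambda spelling: counts[spelling])
```

With these made-up counts, suggest("gessing") picks "guessing" because it is both similar to what was typed and far more common.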
Here’s how to find results similar to another Google search result. Let’s say you’re interested in finding sites similar to that of Consumer Reports. First, search for their site.
The Similar pages link shown with that result may be useful for finding more consumer resources, or information on Consumer Reports’ competitors.
When Google finds products relevant to your query, above your search results, you may find up to three links to items that merchants list in Google’s Product Search service.
You can customize the way your search results appear by configuring your Google global preferences, options that apply across most Google search services. To change these options, click on the Preferences link, which is to the right of Google’s search box.
The interface language is the language in which messages and labels, text on buttons, and tips are displayed. Your choice of interface languages is much larger than the “translate” set of languages (those that can be translated into your interface language). It includes relatively obscure languages, such as Catalan, Maltese, Occitan, and Welsh; designed languages like Interlingua and Esperanto; and frivolous languages such as Klingon, Hacker, and Pig Latin.
By default, Google Web search includes all pages on the Web. You can choose to restrict your searches to those pages written in the languages of your choice by setting the search language.
Google’s SafeSearch filters out sites with pornography and explicit sexual content. Moderate filtering, the default, is set to exclude most explicit images from Google Image search results but not Google Web search or other Google search services.
Number of Results
The most important setting, located near the bottom of the page, is “Number of Results.” By default, Google returns just 10 results for a search. Since Google’s search algorithms are so accurate, this default saves Google both computer resources and downloading time. But I always increase the default to 100. Although such searches take a little longer to download (especially over a dial-up connection), getting back 100 results saves me time when I’m searching for anything out-of-the-ordinary; it’s much faster to scroll through a Web page than to manually click through 10 pages of intermediate results.
New Results Window
After you set the Results Window option on the Preferences page, when you click on the main link (typically the page title) for a result, Google will open the corresponding page in a new window.
By using special characters and operators, such as +, –, ~, .., *, OR, and quotation marks, you can fine-tune your search query and increase the accuracy of its results. To search for a phrase, a proper name, or a set of words in a specific order, put them in double quotes.
A query with terms in quotes finds pages containing the exact quoted phrase. For example, [ “children’s favorite book” ] finds pages containing the phrase “children’s favorite book” exactly. So this query would find pages mentioning children’s favorite book, but not pages containing shopping results or educators’ lists.
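What "exactly" means for a quoted phrase can be shown with a small matcher: the words must appear next to each other and in the same order. This is a simplified sketch, with the invented name contains_phrase, that splits on whitespace and ignores punctuation handling.

```python
def contains_phrase(text, phrase):
    """True if the words of `phrase` appear consecutively, in order, in `text`."""
    words = text.lower().split()
    target = phrase.lower().split()
    if not target:
        return False
    last_start = len(words) - len(target)
    return any(words[i:i + len(target)] == target for i in range(last_start + 1))
```

A page that contains the same words scattered or reordered, such as "a favorite book for children", does not count as a match.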
Force Google to include a term by preceding the term with a “+” sign.
To force Google to search for a particular term, put a + sign operator in front of the word in the query. Note that you should not put a space between the + and the word. So, to search for the satirical newspaper The Onion, use [ +The Onion ], not [ + The Onion ].
The + operator is typically used in front of stop words that Google would otherwise ignore or when you want Google to return only those pages that match your search terms exactly. However, the + operator can be used on any term.
Precede each term you do not want to appear in any result with a “–” sign.
To find pages without a particular term, put a – sign operator in front of the word in the query. The – sign indicates that you want to subtract or exclude pages that contain a specific term. Do not put a space between the – and the word.
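The behaviour of the + and – operators can be sketched as a small parser plus a matcher. Everything here is an illustrative assumption: the names parse_query and matches_query are invented, the minus sign is taken to be the plain keyboard hyphen, and pages are matched word by word on whitespace.

```python
def parse_query(query):
    """Split a query into ordinary, required (+term), and excluded (-term) terms."""
    parsed = {"plain": [], "required": [], "excluded": []}
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            parsed["required"].append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            parsed["excluded"].append(token[1:])
        else:
            # A lone "+" or "-" (operator followed by a space) lands here,
            # which is why the operator must touch its word.
            parsed["plain"].append(token)
    return parsed


def matches_query(text, parsed):
    """True if `text` has every plain and required term and no excluded term."""
    words = set(text.lower().split())
    wanted = parsed["plain"] + parsed["required"]
    return (all(term.lower() in words for term in wanted)
            and not any(term.lower() in words for term in parsed["excluded"]))
```

For instance, parse_query("+The Onion -vegetable") marks "The" as required and "vegetable" as excluded, so a page about the vegetable would be rejected.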
The tilde (~) operator takes the word immediately following it and searches both for that specific word and for the word’s synonyms. It also searches for the term with alternative endings. The tilde operator works best when applied to general terms and terms with many synonyms. As with the + and – operators, put the ~ (tilde) next to the word, with no spaces between the ~ and its associated word.
Why did Google use the tilde? In math, the “~” symbol means “is similar to.” The tilde tells Google to search for pages that contain synonyms of, or words similar to, the term that follows.
[ ~inexpensive ] matches “inexpensive,” “cheap,” “affordable,” and “low cost”
[ ~run ] matches “run,” “runner’s,” “running,” as well as “marathon”
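One way to picture the tilde operator is as a query expansion step: each ~term is replaced by the term plus its synonyms. The sketch below assumes a tiny hand-written synonym table built from the first example above; Google's real synonym data is not public, and the name expand_tilde is invented.

```python
# Hypothetical synonym table, taken from the example in the text.
SYNONYMS = {
    "inexpensive": ["cheap", "affordable", "low cost"],
}


def expand_tilde(query, synonyms=SYNONYMS):
    """Replace each ~term with a list holding the term and its known synonyms."""
    expanded = []
    for token in query.split():
        if token.startswith("~") and len(token) > 1:
            term = token[1:]
            expanded.append([term] + synonyms.get(term.lower(), []))
        else:
            expanded.append([token])
    return expanded
```

Each inner list is a set of alternatives, so a page matches if it contains any one entry from every list.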
Specify synonyms or alternative forms with an uppercase OR or | (vertical bar).
The OR operator, for which you may also use | (vertical bar), applies to the search terms immediately adjacent to it. The first and second examples will find pages that include either “Tahiti” or “Hawaii” or both terms, but not pages that contain neither “Tahiti” nor “Hawaii.” The third and fourth examples will find pages that contain any one, two, or all three of the terms “blouse,” “shirt,” and “chemise.”
[ Tahiti OR Hawaii ]
[ Tahiti | Hawaii ]
[ blouse OR shirt OR chemise ]
[ blouse | shirt | chemise ]
Note: If you write OR with a lowercase “o” or a lowercase “r” Google interprets the word as a search term instead of an operator.
Note: Unlike OR, a | (vertical bar) need not be surrounded by spaces.
[ bicycle|cycle ]
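The rules above (uppercase OR only, | with or without spaces, the operator binding to its immediate neighbours) can be sketched as a small parser. The function names parse_or_groups and matches_groups are invented for this example, and matching is done on whitespace-separated words only.

```python
def parse_or_groups(query):
    """Turn a query into groups of alternatives, e.g. [["Tahiti", "Hawaii"]]."""
    # Pad every | with spaces so "bicycle|cycle" and "bicycle | cycle" parse alike.
    tokens = query.replace("|", " | ").split()
    groups = []
    join_next = False
    for token in tokens:
        if token in ("OR", "|"):  # lowercase "or" is an ordinary search term
            join_next = bool(groups)
        elif join_next:
            groups[-1].append(token)
            join_next = False
        else:
            groups.append([token])
    return groups


def matches_groups(text, groups):
    """True if, for every group, at least one alternative occurs in `text`."""
    words = set(text.lower().split())
    return all(any(term.lower() in words for term in group) for group in groups)
```

Note how parse_or_groups("Tahiti or Hawaii") yields three separate groups, which is the lowercase pitfall described in the first note.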
Specify that results contain numbers in a range by specifying two numbers, separated by two periods, with no spaces.
For example, specify that you are searching in the price range £250 to £1000 using the number range specification £250..£1000.
[ recumbent bicycle £250..£1000 ]
Find the year the Russian Revolution took place.
[ Russian Revolution 1800..2000 ]
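The number range syntax is easy to parse mechanically. The sketch below recognises a low..high term and picks out the numbers in a piece of text that fall inside the range; the names parse_range and numbers_in_range are invented, only whole numbers are handled, and the £ sign is simply stripped.

```python
import re

RANGE_PATTERN = re.compile(r"(\d+)\.\.(\d+)")


def parse_range(term):
    """Return (low, high) for a term such as "250..1000" or "£250..£1000", else None."""
    match = RANGE_PATTERN.fullmatch(term.replace("£", ""))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def numbers_in_range(text, low, high):
    """List the whole numbers in `text` that fall between low and high inclusive."""
    return [int(n) for n in re.findall(r"\d+", text) if low <= int(n) <= high]
```

Applied to a sentence that mentions both a year and an unrelated smaller number, only the year survives the 1800..2000 filter, which is how the range narrows results to plausible dates.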
Use *, an asterisk character, known as a wildcard, to match one or more words in a phrase (enclosed in quotes).
Each * represents one or more words; Google treats it as a placeholder. For example, [ “Google * my life” ] tells Google to find pages containing a phrase that starts with “Google” followed by one or more words, followed by “my life.” Phrases that fit the bill include: “Google changed my life,” “Google runs my life,” and “Google is my life.”
[ “Google * my life” ]
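If you are used to regular expressions, the wildcard can be thought of as a pattern that stands for one or more whole words. The sketch below translates a wildcard phrase into a Python regular expression under that assumption; the function name wildcard_to_regex is invented, and this is an analogy rather than a description of how Google evaluates the query.

```python
import re


def wildcard_to_regex(phrase):
    """Compile a phrase in which each * stands for one or more whole words."""
    parts = [r"\w+(?:\s+\w+)*" if word == "*" else re.escape(word)
             for word in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


pattern = wildcard_to_regex("Google * my life")
```

The pattern accepts "Google changed my life" and "Google is taking over my life" but rejects "Google my life", because the wildcard must stand for at least one word.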
Search a known site for a specific item using the site: operator.
For example, use the site: operator to search www.ebuyer.co.uk for hard drives.
[ hard drives site:www.ebuyer.co.uk ]
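If you want to build such a search from a script, the query is just the search terms followed by site: and the host name, passed in the q parameter of a Google search URL. The helper name site_search_url below is invented for this sketch.

```python
from urllib.parse import urlencode


def site_search_url(terms, site):
    """Build a Google search URL that limits `terms` to pages on `site`."""
    query = f"{terms} site:{site}"
    # urlencode escapes spaces as "+" and the colon as "%3A".
    return "https://www.google.com/search?" + urlencode({"q": query})


url = site_search_url("hard drives", "www.ebuyer.co.uk")
```

Pasting the resulting address into a browser runs the same search as typing the query into the search box.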