How Google Indexes Websites

Google doesn’t index your site as a whole; it indexes the individual pages within it. Each page is indexed independently of the others, although the linking structure between your pages affects the way your site is returned in the search engine results.

Submitting your website and the Googlebot

There is no need to submit your website to Google. Google uses a ‘spider’ called the Googlebot to crawl the web looking for links to new pages, so the chances are that Googlebot will find you by itself. One of the most important factors Google looks at when deciding where to position your pages is the number, and quality, of links pointing to them. If you do submit your site and Googlebot is unable to find any links pointing to your pages, you will get indexed, but your site won’t come up under any search terms.
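To see why inbound links matter for discovery, here is a minimal sketch of a spider following links. It is not how Googlebot is actually implemented; the page names, the `link_graph` dictionary and the `discover_pages` function are all made up for illustration, and the "web" is just an in-memory dictionary.

```python
from collections import deque

def discover_pages(link_graph, seed_pages):
    """Follow links breadth-first and return pages in the order they are found."""
    discovered = list(seed_pages)
    seen = set(seed_pages)
    queue = deque(seed_pages)
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                discovered.append(target)
                queue.append(target)
    return discovered

# Hypothetical site: "orphan.html" exists, but no other page links to it.
link_graph = {
    "home.html": ["about.html", "products.html"],
    "about.html": ["home.html"],
    "products.html": ["blue-cars.html"],
    "blue-cars.html": [],
    "orphan.html": ["home.html"],
}

found = discover_pages(link_graph, ["home.html"])
print(found)  # ['home.html', 'about.html', 'products.html', 'blue-cars.html']
```

The orphan page links out to the home page, but because nothing links in to it, a spider starting elsewhere never reaches it.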

Page Rank

The number of links pointing to your pages, and the quality of those links, will determine the Page Rank (PR) of your pages. This is the score Google gives to your pages to show their value. If you have a high Page Rank, your pages are considered more valuable than those with low Page Rank. When Google comes to ranking your pages in the Search Engine Results Pages (SERPs), your Page Rank plays a key role. If your page has a PR of 2, you are going to have difficulty coming above pages with a PR of 8. Your Page Rank is calculated automatically using a complex algorithm. In summary, the Page Rank of your page is the sum of the contributions from each link pointing to it. Each contribution depends on the Page Rank of the linking page and on the number of other links on that page.
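The summary above can be sketched in a few lines of Python. This follows the simplified formula from the originally published Page Rank description, not whatever Google runs today. The damping factor of 0.85, the iteration count, the way pages with no outbound links are handled, and the four-page example graph are all assumptions made for illustration.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Simplified iterative Page Rank over a dict of page -> list of linked pages."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly across its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical graph: C is linked to by A, B and D; nothing links to D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = page_rank(links)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 3))
```

In this toy graph, C ends up with the highest score because three pages link to it, while D, which has no inbound links, keeps only the baseline score.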

 

In Figure 1 above, Page A has lots of links going to other pages. If your page is Page B, it will receive less Page Rank from Page A than it would if Page A’s only link pointed to your page, as in Figure 2 above.
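As a rough worked example of this dilution, assume a page shares its Page Rank equally between its outbound links. The function name, the Page Rank value of 1.0 and the count of ten links are hypothetical, and the sketch ignores any damping Google may apply.

```python
def rank_passed_per_link(linking_page_rank, outbound_link_count):
    """Share of a page's Page Rank passed through each of its outbound links."""
    if outbound_link_count < 1:
        raise ValueError("the linking page needs at least one outbound link")
    return linking_page_rank / outbound_link_count

page_a_rank = 1.0  # hypothetical value for Page A

many_links = rank_passed_per_link(page_a_rank, 10)  # Page A links to ten pages
single_link = rank_passed_per_link(page_a_rank, 1)  # Page A links only to Page B

print(many_links)   # 0.1
print(single_link)  # 1.0
```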

 

In Figure 3 above, Page A has lots of links pointing to it from other pages. If your page is Page B, it will receive more Page Rank than if only one page pointed to Page A, as in Figure 4 above.

Toolbar Page Rank and Real Page Rank

The actual Page Rank is used by Google’s internal systems, and the actual value for any one page is never published. The nearest approximation to the actual Page Rank is the toolbar Page Rank. This can be seen in the Google toolbar, and is a whole number between zero and ten, ten being the most valuable. Unfortunately, the toolbar Page Rank is updated infrequently, approximately every three months. It is also limited, in that it does not differentiate between pages of, for example, high Page Rank 3 and low Page Rank 3.

Relevancy

Page Rank alone does not determine your final position in the SERPs, which also depends upon how relevant your page is to a given key phrase. For example, a page about ‘cars’ will not be listed in the SERPs for a search on ‘holidays’, no matter how high its PR. There are hundreds of factors which affect a page’s relevancy to a particular key phrase, and the level of importance of each factor is constantly changing.
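A toy illustration of the idea: relevancy decides whether a page is listed at all, and Page Rank only helps to order the pages that are relevant. Real ranking weighs hundreds of factors, so treat the simple substring test, the `rank_results` function, and the example URLs and PR values as assumptions made purely for illustration.

```python
def rank_results(pages, key_phrase):
    """Keep only pages that mention the key phrase, then order them by Page Rank."""
    phrase = key_phrase.lower()
    relevant = [page for page in pages if phrase in page["text"].lower()]
    return sorted(relevant, key=lambda page: page["page_rank"], reverse=True)

# Hypothetical pages with made-up PR values.
pages = [
    {"url": "cars-a.example", "page_rank": 8, "text": "New and used cars for sale"},
    {"url": "holidays-a.example", "page_rank": 2, "text": "Cheap holidays in Spain"},
    {"url": "holidays-b.example", "page_rank": 5, "text": "Family holidays and city breaks"},
]

results = [page["url"] for page in rank_results(pages, "holidays")]
print(results)  # ['holidays-b.example', 'holidays-a.example']
```

The cars page has the highest PR of the three, but it never appears for ‘holidays’ because it is not relevant to that phrase.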

On page factors

Some parts of a page carry more weight than others when determining relevancy. For example, if your title tag, h1 heading and body text all contain the key phrase ‘blue cars’, your page will be deemed more relevant than a page which only has ‘blue cars’ in the body text. Likewise, a page with ‘blue cars’ as 10% of the body text will be deemed more relevant than a page with ‘blue cars’ as 5% of the body text. The most important on-page factor is the title tag, followed by the various heading tags (h1, h2…), and finally the body text. Alt tags on images used
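The two ideas in this section, key phrase density and heavier weighting for the title and headings, can be sketched as follows. Google’s actual weights are not published, so the numbers in `WEIGHTS` are invented and only preserve the ordering of title, then h1, then body text. The function names and sample pages are hypothetical too.

```python
import re

# Hypothetical weights: only the ordering (title > h1 > body) comes from the text.
WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}

def phrase_density(text, phrase):
    """Fraction of the words in text that belong to occurrences of phrase."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    if not words or not phrase_words:
        return 0.0
    size = len(phrase_words)
    matches = sum(
        1 for i in range(len(words) - size + 1) if words[i:i + size] == phrase_words
    )
    return matches * size / len(words)

def on_page_score(page, phrase):
    """Add up the weight of every page part that contains the phrase."""
    phrase = phrase.lower()
    return sum(
        weight for part, weight in WEIGHTS.items() if phrase in page.get(part, "").lower()
    )

# A 20-word body in which 'blue cars' accounts for 2 of the words.
body = "blue cars " + " ".join(["filler"] * 18)
print(phrase_density(body, "blue cars"))  # 0.1

full_page = {"title": "Blue cars for sale", "h1": "Blue cars", "body": "We stock blue cars."}
body_only = {"title": "Welcome", "h1": "Our stock", "body": "We stock blue cars."}
print(on_page_score(full_page, "blue cars"))  # 6.0
print(on_page_score(body_only, "blue cars"))  # 1.0
```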

Off page factors

On-page factors are easily manipulated by website owners, but off-page factors less so. The text used in links pointing to your site is currently the most important factor used in determining your position in the SERPs. You can increase the relevancy of pages in your site by ensuring relevant text is used in all links to your pages, both in links from external sites and in your own site navigation. Linked images don’t pass on any relevancy, so avoid them.
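If you want to check the link text used on a page, a small script can list it for you. This is a minimal sketch using Python’s standard `html.parser` module; the `AnchorTextCollector` class and the sample HTML are made up for illustration, and it only looks at visible link text.

```python
from html.parser import HTMLParser

class AnchorTextCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text_parts).split())
            self.links.append((self._href, text))
            self._href = None

# Hypothetical navigation: a text link, an image link and a vague text link.
html = (
    '<a href="/blue-cars.html">Blue cars</a> '
    '<a href="/red-cars.html"><img src="red.png"></a> '
    '<a href="/about.html">click here</a>'
)

collector = AnchorTextCollector()
collector.feed(html)
collector.close()

for href, text in collector.links:
    print(href, repr(text))
```

The image link comes back with empty link text, and ‘click here’ says nothing about the target page, so both are candidates for rewording.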

Filters and penalties

It would be easy to get links from hundreds of sites, all using your targeted key phrase, then add the phrase to your page title and headings and pack the body text with it. However, Google implements various filters and penalties in an attempt to prevent this kind of ‘spamming’. The penalties applied vary greatly, and may include a complete ban of the entire site, a reduction or complete removal of PR for the entire site, a ban on individual pages, or a reduction or removal of PR for individual pages. Google prefers to use automatic penalties, but if your site is reported, it may implement manual penalties. Penalties can be due to any of the following

As a guideline, do not employ any techniques which do not add value for the users of your site. A penalty is applied very quickly, but it may take months for your site to recover if you’re lucky. If you’re unlucky, your site may never recover.
