Crawling and Indexing

This is the very basis of the search process. Google and other search engines use a ‘spider’ to venture forth and collect information on pages.

Once a page is found, all the links on that page are analysed to find further pages on your site.
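To make the link-following idea concrete, here is a minimal sketch of a spider in Python. It is an illustration only, not how any real search engine is built: the `crawl` function, the `LinkCollector` class and the in-memory `SITE` dictionary are all hypothetical names invented for this example, and the pages are held in memory so nothing is fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl that stays on the host of start_url.

    `fetch` is any callable returning the HTML for a URL, or None.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return sorted(seen)


# Hypothetical four-page site (assumption for illustration only).
SITE = {
    "https://example.com/": '<a href="/bolts">Bolts</a> <a href="/nuts">Nuts</a>',
    "https://example.com/bolts": '<a href="/bolts/blue">Blue bolts</a>',
    "https://example.com/nuts": '<a href="/">Home</a>',
    "https://example.com/bolts/blue": '<a href="/">Home</a> '
                                      '<a href="https://other.example/">Elsewhere</a>',
}

pages = crawl("https://example.com/", SITE.get)
print(pages)
```

Starting from the home page alone, the sketch discovers all four pages on the site and ignores the link to the other domain, which is the same "follow the links" behaviour described above.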

After some time this leaves a search engine with a list of URLs related to your site, and that list should be finite (at that particular time). A spider can only find pages that are linked from somewhere, so to make sure the search engines have a complete list of your pages you should also submit a sitemap to them.
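As a rough illustration of what a sitemap contains, the sketch below builds a minimal XML sitemap with Python's standard library. The `build_sitemap` function and the example URLs are assumptions made for this example. The `urlset`, `url` and `loc` elements and the namespace come from the sitemaps.org protocol, which also defines optional elements that are left out here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap document listing each URL once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical URLs (assumption for illustration only).
sitemap = build_sitemap([
    "https://example.com/",
    "https://example.com/bolts/blue",
])
print(sitemap)
```

The resulting file is normally saved as something like sitemap.xml on your site and then submitted to the search engines.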

The search engines will then begin to apply formulas to your pages in order to grade and rank them. Formulas will also be applied to determine how often they visit your site (to get new pages or updated information) and how many pages they look at on each visit.

All that information is available (after a while) in your Google Webmaster Tools account (since renamed Google Search Console).


A search engine then analyses each page to see what words and phrases appear on it and, interestingly, in what position. It also looks at key content tags such as titles, H1s, alt text, etc. (that’s why they are so important). If you want to see how your pages appear to a spider, you can use a Lynx Viewer. This will give you a good idea of how a spider will see your page.
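The following Python sketch shows the kind of information that can be pulled out of a page in this way: the title, the H1 text, image alt text, and the visible words in the order they appear. It is a simplified illustration under stated assumptions, not a description of any search engine's real parser. The `PageAnalyser` class and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class PageAnalyser(HTMLParser):
    """Records the title, H1 text, alt text and word order of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = []
        self.alts = []
        self.words = []       # body words; list index = position on the page
        self._current = None  # "title" or "h1" while inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alts.append(alt)

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._current == "title":
            self.title += text
            return
        if self._current == "h1":
            self.h1.append(text)
        self.words.extend(text.lower().split())


# Hypothetical page (assumption for illustration only).
HTML = """<html><head><title>Blue bolts | Nuts and Bolts Ltd</title></head>
<body><h1>Blue bolts</h1>
<img src="blue-bolt.jpg" alt="A blue bolt">
<p>Our blue bolts are sold in packs of ten.</p></body></html>"""

page = PageAnalyser()
page.feed(HTML)
print(page.title, page.h1, page.alts)
print(page.words.index("bolts"))  # position of the first "bolts" in the body
```

Notice that the image contributes nothing to the word list except through its alt text, which is one reason a text-only view of a page can look so different from what a human visitor sees.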

See the blog article for a real-life example of the difference between the human’s view and the spider’s view.

So at this point a search engine will have a lot of data about your pages. It might know, for example, that your site is about nuts and bolts and that you have a page about ‘blue bolts’. So, when someone searches for ‘blue bolts’, a search engine knows that your page should be included in the results, but in what position will your page be?
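One common way to picture how a search engine knows which pages to include is an inverted index: a lookup from each word to the pages that contain it. The Python sketch below is a toy version under stated assumptions. The `build_index` and `search` functions and the `PAGES` data are hypothetical, and it only decides which pages match, not the order they are shown in.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lower-cased word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical page text (assumption for illustration only).
PAGES = {
    "/bolts/blue": "blue bolts in packs of ten",
    "/bolts/red": "red bolts in packs of ten",
    "/nuts": "hex nuts and wing nuts",
}

index = build_index(PAGES)
matches = search(index, "Blue bolts")
print(matches)
```

Here only the blue bolts page matches the query, whereas a search for just "bolts" would match both bolt pages and leave the question of which comes first.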

That’s determined by the search engines. Google’s algorithm is said to use over 200 different factors to determine the pecking order of your page in a search result. The SEO process helps to work out what those factors are and how important each one is.

One factor Google has never minded admitting to is PageRank. That is the importance of a page based on the incoming links from other pages. That’s where link building comes into its own. Link building should not be done until your ‘house is in order’.
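To show how incoming links can be turned into a score, here is a minimal Python sketch of the simplified, textbook form of PageRank, computed by repeated iteration. It is not Google's production algorithm. The `pagerank` function, the damping value of 0.85, the iteration count and the `LINKS` graph are all assumptions chosen for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict of page -> list of pages it links to.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly between the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph (assumption for illustration only).
LINKS = {
    "home": ["bolts", "nuts"],
    "bolts": ["home", "blue-bolts"],
    "nuts": ["home"],
    "blue-bolts": ["home"],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this small graph the home page ends up with the highest score because every other page links to it, which is the intuition behind link building: links from other pages pass some of their importance on to yours.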