Friday, November 21, 2008

This post briefly explains the core of search algorithms: how search engines work, what their main functions are, and how crawlers behave.

To return reliable information, search engines run a sequence of algorithms: crawling, indexing, processing and ranking. The crawling is carried out by automated programs called bots or spiders.

Web crawling policies are among the most critical of these algorithms. The bots (crawlers) follow the hyperlink structure of the web, moving from one page to the pages it links to in order to crawl them.
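To make the link-following idea concrete, here is a minimal sketch. It assumes a tiny in-memory "web" (the `LINKS` dictionary and its page names are made up for illustration) rather than real HTTP requests, and it uses a simple breadth-first order; real crawlers apply far more elaborate policies.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post-1", "post-2"],
    "post-1": ["blog"],
    "post-2": ["post-1", "home"],
}

def crawl(start, links):
    """Visit pages breadth-first by following links, each page only once."""
    seen = {start}          # pages already discovered
    queue = deque([start])  # pages waiting to be crawled
    order = []              # pages in the order they were crawled
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

print(crawl("home", LINKS))
# ['home', 'about', 'blog', 'post-1', 'post-2']
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other.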

The next step after crawling a website is to build an index of the crawled documents, a step called indexing. These indexes are stored on large servers, and they are what let a search engine identify the documents matching a submitted search in just a fraction of a second.
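One common way to build such an index is an inverted index, which maps each word to the documents that contain it. The sketch below is only an illustration under simple assumptions: the `DOCS` sample data is invented, and words are split on whitespace with no stemming or stop-word handling.

```python
from collections import defaultdict

# Hypothetical crawled documents: id -> text.
DOCS = {
    1: "seo algorithm basics",
    2: "search engine ranking algorithm",
    3: "seo tips for search",
}

def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

index = build_index(DOCS)
print(sorted(index["algorithm"]))  # [1, 2]
print(sorted(index["seo"]))        # [1, 3]
```

Looking up a word is then a single dictionary access instead of a scan over every document, which is why lookups can be so fast.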

The query algorithm is another function: the search engine checks how the user has phrased the search in the search field in order to decide what information to return. For example, a search engine displays about 800,000 (8 lakh) results for the keywords seo algorithm, but the same keywords in quotes, "seo algorithm", display only about 2,500 results, because the quotes ask for the exact phrase.
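The difference between the two kinds of query can be sketched as follows. This is an illustration only: the sample `DOCS` and both function names are made up, and it assumes an unquoted query matches any document containing all of the words, which is a simplification of what real search engines do.

```python
# Hypothetical documents: id -> text.
DOCS = {
    1: "a simple seo algorithm explained",
    2: "seo tips and a ranking algorithm",
    3: "the algorithm behind seo tools",
    4: "cooking recipes",
}

def match_all_words(query, docs):
    """Unquoted query: every word must appear somewhere in the document."""
    words = query.lower().split()
    hits = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        if all(word in tokens for word in words):
            hits.append(doc_id)
    return hits

def match_phrase(query, docs):
    """Quoted query: the words must appear next to each other, in order."""
    words = query.lower().split()
    n = len(words)
    hits = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        if any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            hits.append(doc_id)
    return hits

print(match_all_words("seo algorithm", DOCS))  # [1, 2, 3]
print(match_phrase("seo algorithm", DOCS))     # [1]
```

The phrase match returns a subset of the keyword match, which is the same effect as the drop in result counts described above.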

And finally, the ranking algorithm. This is not PageRank or some other ranking function. This algorithm decides which results are most relevant to a query and arranges them in that order on the results pages, which are called SERPs (search engine results pages).
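As a rough illustration of ordering results by relevancy, the sketch below scores each document by how often the query words occur in it. The scoring rule, the `rank` function and the sample `DOCS` are all assumptions made for this example; real ranking algorithms weigh many more signals.

```python
# Hypothetical documents: name -> text.
DOCS = {
    "page-a": "seo algorithm guide seo basics",
    "page-b": "algorithm design notes",
    "page-c": "gardening tips",
}

def rank(query, docs):
    """Order matching documents by a naive score: total query-word occurrences."""
    terms = query.lower().split()
    scored = []
    for name, text in docs.items():
        tokens = text.lower().split()
        score = sum(tokens.count(term) for term in terms)
        if score > 0:  # leave out documents that match nothing
            scored.append((name, score))
    # Highest score first, as on a results page.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(rank("seo algorithm", DOCS))
# [('page-a', 3), ('page-b', 1)]
```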

While learning the various search engine functions, there are some more important points to keep in mind when optimizing a website.

Relevancy and Popularity Measurement:

Relevancy refers to how well a document returned on the results page matches the query that was submitted.

Popularity is measured by the references (links) that one document gives to another. The more references a document receives, the more popular it is.
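The counting idea can be sketched like this. It is a deliberately naive illustration: the `LINKS` data is invented, every reference counts equally, and self-links and duplicate links are ignored, which are assumptions of this example rather than how any particular search engine works.

```python
from collections import Counter

# Hypothetical link graph: each document maps to the documents it references.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "a"],
}

def inbound_counts(links):
    """Count how many distinct documents reference each document."""
    counts = Counter({page: 0 for page in links})  # start every page at zero
    for source, targets in links.items():
        for target in set(targets):      # ignore duplicate links
            if target != source:         # ignore self-references
                counts[target] += 1
    return counts

popularity = inbound_counts(LINKS)
print(popularity.most_common())
# [('c', 3), ('a', 2), ('b', 1), ('d', 0)]
```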

The latest algorithms devised for these two measurements are known as document analysis and link analysis. These two algorithms may sound simple, but what happens inside them is beyond what an ordinary programmer would imagine, because they deal with far more than just the English words on a page.