New active, community-directed crawler.


Good results matter most in search, and the most valuable raw material for them is a well-filled index.
Of course, in the long run the whole internet should be indexed. In the meantime, we can also improve search results with a smaller index, provided the crawler indexes exactly the pages users are looking for. In this way we improve the efficiency of the index at the same index size.

This is exactly what our new active, community-directed crawler is intended for. In addition to crawling visited pages, FAROO is now able to crawl autonomously. Crawler start points are derived from the pages FAROO users visit and the searches they run. If a search returns few or no results, pages are crawled in real time and included in the results of that search. As users search, the community-directed crawler grows the index exactly where it is needed. If results are missing, the gaps are closed instantly.
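
The search-time gap filling described above can be illustrated with a minimal sketch. This is not FAROO's implementation: the names `TinyIndex`, `search_with_gap_filling`, `seeds_for`, `fetch`, and the `MIN_RESULTS` threshold are all hypothetical, the index is a toy in-memory structure, and how start points are really derived from a query is not specified in this post. The fetch function is passed in so the sketch runs without network access.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Set

MIN_RESULTS = 2  # hypothetical threshold for "only few results"


class TinyIndex:
    """Toy inverted index mapping each term to the URLs containing it."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, url: str, text: str) -> None:
        """Index a page by splitting its text on whitespace."""
        for term in text.lower().split():
            self.postings[term].add(url)

    def search(self, query: str) -> List[str]:
        """Return URLs containing every query term, sorted for stability."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.postings.get(term, set())
        return sorted(hits)


def search_with_gap_filling(
    index: TinyIndex,
    query: str,
    seeds_for: Callable[[str], Iterable[str]],    # hypothetical: query -> start URLs
    fetch: Callable[[str], Optional[str]],        # hypothetical: URL -> page text or None
) -> List[str]:
    """Search the index; if too few results, crawl start points and search again."""
    results = index.search(query)
    if len(results) >= MIN_RESULTS:
        return results  # index already covers this query, no crawl needed
    for url in seeds_for(query):
        text = fetch(url)
        if text is not None:  # skip pages that could not be fetched
            index.add(url, text)
    return index.search(query)  # freshly crawled pages are now included
```

A call such as `search_with_gap_filling(index, "search", seeds_for, fetch)` crawls only when the index comes up short, so a repeated query is answered from the index alone. A real crawler would also follow links, respect robots.txt, and bound the time spent crawling per query; those parts are left out here.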

Active crawling grows the index at a faster pace and overcomes the chicken-and-egg problem that arises when only visited pages are crawled and there are relatively few users. With active crawling, passive peers can contribute as well. Index growth becomes independent of browsing activity, so pages that nobody in the current FAROO community has visited before get indexed too.

The improved efficiency and speed of crawling and indexing will provide you with richer results every day.

