FAROO at the CHORUS P2P Workshop

FAROO joined the CHORUS P2P Workshop 1P2P4mm, which was colocated with the InfoScale 2008 conference.

This first workshop on peer-to-peer architectures for multimedia retrieval (1p2p4mm) took place in Vico Equense, Naples, Italy, on June 6, 2008. The workshop was arranged by the CHORUS Coordination Action to discuss which challenges must be met and which bottlenecks must be addressed by research and engineering efforts in the near future.

We had a great and intense discussion on the true benefits of P2P for search. We also discussed building a joint P2P platform and a better connection between the academic and Web 2.0 communities as possible measures to reach critical mass (in terms of number of users) and gain traction as a serious alternative approach.

For more information and the position papers of the participants, please visit the workshop homepage.

Unconference & BoF at Web 2.0 Expo

“The Social Side of Search”, a micro-unconference initiated by FAROO, took place on April 25 at the Oracle booth at the Web 2.0 Expo in San Francisco.

The day before, we presented FAROO at the Birds of a Feather (BoF) session “People Powered Search”:

People powered search:

Social networks are very successful; there are social networks for nearly everything. Only in search are you still on your own. Yet searching together is natural: asking friends, family, experts …

  • Why don’t we have social search yet (on a large scale)? It is a chicken-and-egg problem: many users are required for it to be useful. First your plain search must be competitive, then you can add features -> costs!
  • What could be the benefits? Personalization, benefiting from the search experience of your community …
  • What could be the risks? Spammers, edit wars, privacy, locked-in community -> alternative opinions get filtered out

Examples of how searching together could benefit – a lot of different flavours:

  • Providing Infrastructure (using P2P technology)
  • Directing the crawler (websites which often appear in results are crawled more frequently/deeper)
  • User generated Ranking (using attention data)
  • Annotating results
  • Editing results
  • Creating results
  • Bringing users with similar search interests together (FAROO Social Search)
  • Collaborative Searching: Partitioning search among users
  • Personalization using your Social Graph
  • Many more …

Collaboration not only between users, but also between social search projects:

Today’s social networks have one problem: walled gardens (possible workarounds: OpenSocial, FriendFeed API). Would it be possible to define a standard/protocol so that all social search initiatives work together from the beginning?

  • Can users take their profile with them?
  • Can users take their attention data/query stream with them?
  • Are the privacy settings standardized?
  • Can the different search projects exchange index and usage data and use them together, to join forces? There was an intense discussion on this topic at Alternative Search Engines Day, a conference hosted by Charles Knight.

 

Of course, beyond the unconference, the Web 2.0 Expo was also a great place to meet interesting people and see what others are heading for.


San Francisco day …

… and night

Echo in the blogosphere

For a P2P model it is essential to share a common vision with your users. Therefore it’s always interesting to see how your ideas are discussed and perceived.

A very encouraging and profound example is the ReadWriteWeb blog post “Could P2P Search Change the Game?” by Bernard Lunn.

 
Additionally, here is a short roundup of selected previous blog posts:

FAROO: The Social Side of Search

FAROO gets exciting new social search network functions.

Combining the two megatrends, search and social networks, we try to harness the wisdom of crowds and network effects for search.

Why social search? Because searching today means being alone with your question. There is no conversation at all.
But in the real world you are successful if you do not just silently plow through heaps of documents, but ask your colleagues and get hints from your friends.

Many social networks are entertaining. But when it comes to search, to recommending and connecting you to people who are working on the same topic at the same time, you are on your own again. Social networks help you stay in touch with people you already know, but they barely help you find the right (yet unknown to you) people at the right time.

Why is context-sensitive search advertising so successful? Because you are presented with the right ads at just the moment when you are interested in a specific topic. Now how would it be if you were presented not with ads, but with like-minded people? You get the idea. That’s what FAROO’s social search is all about.

And we reconcile social search AND privacy. Unlock the collective intelligence of a search community of peers without sacrificing your privacy. No registration is required to use the social features. You simply use an arbitrary alias or nickname.

We just bring people with the same interests together. You can immediately communicate with like-minded people, profit from their search experience, follow their discoveries, and exchange ideas. But you may decide later whether you want to become friends and when to reveal your identity.

FAROO’s Social Search is purely opt-in. It can bring you into conversation. But only if you like.

The social features are still in alpha stage, but here is a sneak peek.

Screenshot

What do you think about it?

FAROO – Major Update

We have been busy further improving the search experience of our P2P web search engine:

Screenshot

Network Visualization
New geographical P2P network visualization at the search page.

Auto Suggest
New Ajax based query auto suggest.

External Sources
External source integration for additional results, if a query can’t be answered from the distributed index.

Active Crawler
New active, community directed crawler. Crawler start points are derived from searches of the FAROO users.

And much more.

Tell us what you think and what you are still missing …

New active, community directed crawler.

Good results are the most important thing for search, and the most valuable raw material is a well-filled index.
Of course, in the long run the whole internet should be indexed. In the meantime we can improve search results even with a smaller index, if the crawler indexes exactly the pages the users are looking for. In this way we improve the efficiency of the index without increasing its size.

This is exactly what our new active, community directed crawler is intended for. In addition to crawling visited pages, FAROO is now able to crawl autonomously. Crawler start points are derived from visited pages and searches of the FAROO users. If a search returns only a few results or none, pages are crawled in real time and included in the results of that search. While users are searching, the community directed crawler grows the index exactly where it is needed. If results are missing, the gaps are closed instantly.
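The search-triggered crawl described above can be illustrated with a minimal sketch. It is not FAROO's implementation: the threshold `MIN_RESULTS`, the function `search_with_active_crawl`, the `fetch` callback and the in-memory dictionary used as an index are all hypothetical names chosen for illustration, and a real peer would work against a distributed index rather than a local dict.

```python
from collections import deque

MIN_RESULTS = 3  # hypothetical threshold for "only a few results"


def search_with_active_crawl(query, index, seed_urls, fetch, max_pages=10):
    """Return URLs matching `query`; crawl on demand if the index has too few.

    index     -- dict mapping URL -> page text (stand-in for the real index)
    seed_urls -- crawler start points, e.g. derived from visited pages/searches
    fetch     -- callable(url) -> (page_text, list_of_outgoing_links)
    """
    terms = query.lower().split()

    def matches():
        # A page matches if it contains every query term (naive substring test).
        return [url for url, text in index.items()
                if all(term in text.lower() for term in terms)]

    results = matches()
    if len(results) >= MIN_RESULTS:
        return results  # index already answers the query, no crawl needed

    # Too few results: crawl breadth-first from the seeds and index new pages.
    queue = deque(seed_urls)
    seen = set(index)
    crawled = 0
    while queue and crawled < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        text, links = fetch(url)
        index[url] = text
        crawled += 1
        queue.extend(links)

    # Re-run the search so freshly crawled pages appear in this very result.
    return matches()
```

The `max_pages` cap stands in for whatever time or bandwidth budget a real-time crawl would need so that the user is not kept waiting.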

Active crawling grows the index at a faster pace and overcomes the chicken-and-egg problem that arises when only visited pages are crawled and there are relatively few users. With active crawling, passive peers can contribute as well. Growing the index becomes independent of browsing activity, so pages that nobody in the current FAROO community has visited before also get indexed.

The improved efficiency and speed of crawling and indexing will provide you with richer results every day.

FAROO now supports OpenSearch!

The OpenSearch technology allows an easy integration of FAROO into the search bar of your browser or a third party search client.

Browser Search Bar

The OpenSearch description XML file identifies and describes a search engine. OpenSearch “auto-discovery” signals the presence of a search plugin link to the user. This auto-discovery feature is embedded into the FAROO home page and search page.

Now you can search with FAROO right away from the search bar of Firefox or Internet Explorer 7.
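For readers who have not seen these two pieces, here is a minimal sketch of what an OpenSearch 1.1 description document and the matching auto-discovery link look like, generated with Python's standard library. The short name, description and `search.example` URLs are placeholders, not FAROO's actual values; `build_description` and `autodiscovery_link` are helper names made up for this example.

```python
import html
import xml.etree.ElementTree as ET

# Namespace defined by the OpenSearch 1.1 specification.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_description(short_name, description, search_url_template):
    """Build a minimal OpenSearch description document as an XML string."""
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # {searchTerms} in the template is replaced by the client with the query.
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Url",
                  {"type": "text/html", "template": search_url_template})
    return ET.tostring(root, encoding="unicode")


def autodiscovery_link(title, description_url):
    """Return the <link> element a page puts in its <head> for auto-discovery."""
    return ('<link rel="search" type="application/opensearchdescription+xml" '
            f'title="{html.escape(title, quote=True)}" '
            f'href="{html.escape(description_url, quote=True)}">')


# Placeholder values for illustration only.
print(build_description("Example Search", "Search the example index",
                        "https://search.example/?q={searchTerms}"))
print(autodiscovery_link("Example Search", "https://search.example/opensearch.xml"))
```

A browser that finds the `<link rel="search">` element on a page can offer to add the described engine to its search bar and then fills the query into the `{searchTerms}` placeholder.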

Attention economy, the implicit web and myware

The term attention economy describes a current trend: the increasing “wealth of information creates a poverty of attention”. Separating important information from unimportant noise becomes more and more crucial.

We are leveraging the experience of other users, who have already spent their attention and voted implicitly on the content they visited. This saves us time; we can focus on consuming preselected, relevant information instead of searching for the needle in the haystack again. FAROO uses this wisdom of crowds for its user generated, user centric, attention based ranking.

To find the most relevant information possible, we have to rate the whole web. To ensure an objective ranking, each document has to be rated by many people. But the extra time required for manual voting would keep the majority of visitors from voting on every document they visit, or from voting at all. Only an automatic, implicit rating ensures that each visitor votes on each document he visits.
This is what the implicit web is about. By analyzing our behavior and using the traces left during our journey through the web, we vote automatically on the fly, implicitly and without manual action. An interesting blog post by Alex Iskold of Read/WriteWeb illustrates this further.
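To make the idea of implicit voting concrete, here is a minimal sketch. It assumes that the only signal is how long a user stayed on a page and that one user counts as at most one vote per document; the dwell-time threshold, the function names `implicit_scores` and `rank`, and the scoring rule itself are illustrative assumptions, not FAROO's actual ranking.

```python
from collections import defaultdict


def implicit_scores(visits, min_dwell_seconds=10.0):
    """Turn raw visit traces into implicit votes.

    visits -- iterable of (user, url, dwell_seconds) tuples
    A visit counts as a vote only if the user stayed long enough, and each
    user contributes at most one vote per URL (assumed rule, for illustration).
    """
    voters = defaultdict(set)
    for user, url, dwell in visits:
        if dwell >= min_dwell_seconds:
            voters[url].add(user)
    return {url: len(users) for url, users in voters.items()}


def rank(urls, scores):
    """Order candidate result URLs by implicit votes, most voted first."""
    return sorted(urls, key=lambda url: scores.get(url, 0), reverse=True)


visits = [
    ("alice", "http://a.example/", 30.0),
    ("bob", "http://a.example/", 45.0),
    ("alice", "http://b.example/", 2.0),   # bounced quickly: no vote
    ("carol", "http://b.example/", 60.0),
]
scores = implicit_scores(visits)
print(rank(["http://b.example/", "http://a.example/"], scores))
```

Counting distinct users rather than raw visits reflects the requirement above that each document be rated by many people, so a single enthusiastic visitor cannot dominate the score.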

While this seems a useful thing, it raises privacy concerns. We feel and fear that our privacy is once more fading away. But then, myware reconciles personalization and privacy. Myware tracks our behavior, but does not reveal it to any third party; it uses it solely to benefit the user.

This perfectly describes FAROO’s approach: using all the implicit information to cope with the information overload and to improve the search experience for the user, without sacrificing privacy. As the information does not leave the computer, there is no risk that this data could be sold, handed over or leaked from a central repository.

FAROO utilizes the implicit web to direct the crawler to places the users are interested in, to select, rank and personalize results according to the attention users paid to the content visited, and to implement behavioral targeting for advertising based on present and past behavior.

Quest for the Perfect Search

In his talk “Privacy and Quaero’s Quest for the Perfect Search Engine: Threats and Opportunities”, Michael Zimmer called on the designers of the Quaero project to engage in value-conscious design in order to protect the value of privacy.

He made eight privacy-protecting demands:

1. Quaero must be designed in such a way as to prevent any substantive response to a civil or criminal subpoena of user activity

2. Quaero must be designed so IP addresses and cookies cannot be associated with particular users or accounts

3. Query traffic must be encrypted to prevent ‘man in the middle’ monitoring

4. Quaero must provide transparency in the data it collects about users, how it is used, who uses it, and how long it is retained

5. Quaero must not engage in personalized or behaviorally-targeted advertising

6. Quaero must take steps to remove or obscure personally-identifiable images (faces, license plates, etc) from its searchable index

7. Quaero must provide individuals the ability to remove or obscure personally-identifiable data from its searchable index

8. Quaero must provide users the ability to view, edit, and delete any search history data associated with their account

While we don’t know whether Quaero will listen to him, FAROO already meets six out of eight of his demands today. And we believe that we even conform to the intention behind demand No. 5.

1. FAROO neither knows its users nor what they are searching.

2. IP addresses are not logged and cookies are not used.

3. Search queries and index are encrypted.

4. There is no central instance collecting user data at all. No personal data leaves the computer at any time.

5. Well, FAROO does personalized and behaviorally-targeted advertising. But we are doing this solely on the client side. Therefore we can provide both: personalization and privacy.

6. There is no image search.

7. Difficult, as there is a verification/authorization issue: How do we know that the person requesting the removal of information is in fact the person the information belongs to?

8. There is no search account and there is no search history beyond the user’s own computer.