Abstract
In this paper, we propose a framework to characterize and compare the results of two search engines. Typical user queries are ambiguous and, consequently, each search engine computes its ranking in a different manner, attempting to answer them in the best possible way. Thus, each search engine has its own bias. Given the importance of first-page results in Web search engines, the framework assesses the information presented on the first page by measuring the information entropy of, and the correlation between, two ranks. Employing the recently proposed Rank-Biased Overlap measure [2], we measure the extent to which Bing and Google rankings in fact differ. We also extend this measure and propose a measure for comparing the information entropy present in two ranks. The proposed measure is based on the correlation of two ranks and the application of the Jensen-Shannon divergence between two document sets. Our methodology starts with 40,000 user queries and crawls the search results for these queries on both search engines. The results allow us to determine the search engines' correlation, crawling coverage, information overlap, and information entropy.
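To make the two building blocks named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a truncated prefix-sum form of Rank-Biased Overlap (without the extrapolation step of the full measure) and a Jensen-Shannon divergence computed over term frequencies with base-2 logarithms. The function names `rbo_truncated` and `js_divergence`, the persistence parameter `p`, and the token-list representation of a document set are illustrative assumptions; how the paper combines the two quantities is not specified here.

```python
import math
from collections import Counter


def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated Rank-Biased Overlap of two rankings (assumed duplicate-free).

    Sums the weighted agreement of the top-d prefixes up to the shorter
    list's length. For identical lists of depth k this yields 1 - p**k,
    not 1, because the tail beyond depth k is not extrapolated.
    """
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # overlap of the two top-d prefixes
        score += p ** (d - 1) * agreement
    return (1 - p) * score


def js_divergence(tokens_a, tokens_b):
    """Jensen-Shannon divergence (base 2) between two non-empty token lists.

    Each list is turned into a term-frequency distribution; the result
    lies in [0, 1], with 0 for identical distributions.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    divergence = 0.0
    for term in set(counts_a) | set(counts_b):
        prob_a = counts_a[term] / total_a
        prob_b = counts_b[term] / total_b
        mixture = 0.5 * (prob_a + prob_b)
        if prob_a > 0:
            divergence += 0.5 * prob_a * math.log2(prob_a / mixture)
        if prob_b > 0:
            divergence += 0.5 * prob_b * math.log2(prob_b / mixture)
    return divergence
```

For example, `rbo_truncated(google_urls, bing_urls, p=0.9)` would compare two hypothetical lists of first-page URLs, while `js_divergence` would compare the terms drawn from the pages those URLs point to. A smaller `p` concentrates the weight on the topmost results.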
Original language | English |
---|---|
Title of host publication | CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management |
Publisher | ACM - Association for Computing Machinery |
Pages | 1933-1936 |
Number of pages | 4 |
ISBN (Print) | 9781450307178 |
Publication status | Published - 2011 |
Event | 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom. Duration: 24 Oct 2011 → 28 Oct 2011 |
Conference
Conference | 20th ACM Conference on Information and Knowledge Management, CIKM'11 |
---|---|
Country/Territory | United Kingdom |
City | Glasgow |
Period | 24/10/11 → 28/10/11 |
Keywords
- content aware rank similarity
- jensen-shannon divergence
- rank-biased overlap
- search engines comparison