1. Parsisanj: an automatic component-based approach toward search engine evaluation.
- Author
- Alashti, Amin Heydari; Rezaei, Ahmad Asgharian; Elahi, Alireza; Sayyaran, Sobhan; Ghodsi, Mohammad
- Subjects
- *WEB search engines, *SEARCH engines, *INFORMATION needs, *TASK analysis
- Abstract
Web search engines play a significant role in answering users' information needs from the vast amount of data available on the internet. Although evaluating the performance of these systems is essential for improving them, there is no comprehensive, unbiased, low-cost, and reusable method for this purpose. Previous work used small, limited query sets, which restricts the assessment domain. Moreover, these methods rely mainly on human evaluators for manual assessment, which makes the results subject to the evaluators' opinions and prone to error. Repeating the evaluation is also a problem, because it requires the same level of human effort as the first evaluation. Another drawback of existing evaluations is that they score a search result based on its position in the retrieved list of relevant pages. This implies that these methods evaluate only the ranker component of a web search engine, leaving all other components unevaluated. In this research, we propose an automatic approach to web search engine evaluation that can run with a query set several times larger than those used in manual evaluations. Because the method is automatic, repeating the evaluation costs little human effort. Moreover, we designed the approach to be component-based, meaning that different evaluation tasks assess different components of a web search engine. For each component, queries are designed differently and are meant to assess the functionality of that component only. Similarly, the retrieved results are scored differently for each component. For example, to assess the spell correction component, the input query contains a typo, and only instances of that word in its correct form in the results are scored positively.
Experimental results from running thousands of queries on two Persian and two language-independent web search engines show that none of the selected search engines dominates the other three across all components; instead, each search engine has its own strengths and weaknesses, which this evaluation highlights. [ABSTRACT FROM AUTHOR]
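Illustrative note (not part of the author's abstract): the spell correction task described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the Parsisanj implementation. The names `SpellCase`, `score_results`, `evaluate_spell_correction`, and `fake_search` are hypothetical, the scoring rule (fraction of returned snippets that contain the correctly spelled word) is only one plausible reading of "scored positively", and the canned English results stand in for a real search engine.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SpellCase:
    """One hypothetical test case for the spell correction component."""
    query_with_typo: str  # query sent to the engine, containing a typo
    correct_form: str     # correctly spelled word expected in the results


def score_results(correct_form: str, snippets: List[str]) -> float:
    """Fraction of snippets containing the correct form as a whole word.

    Assumed scoring rule: only the correct form counts positively;
    snippets with the typo, or with neither form, contribute nothing.
    """
    if not snippets:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(correct_form) + r"\b", re.IGNORECASE)
    hits = sum(1 for snippet in snippets if pattern.search(snippet))
    return hits / len(snippets)


def evaluate_spell_correction(
    search: Callable[[str], List[str]], cases: List[SpellCase]
) -> float:
    """Average the per-query score over all test cases."""
    if not cases:
        return 0.0
    total = sum(
        score_results(case.correct_form, search(case.query_with_typo))
        for case in cases
    )
    return total / len(cases)


def fake_search(query: str) -> List[str]:
    """Stand-in for a search engine: returns canned result snippets."""
    canned = {
        "serch engine": [
            "A search engine indexes pages.",
            "Serch results for you",
            "Search tips",
        ],
    }
    return canned.get(query, [])


cases = [SpellCase(query_with_typo="serch engine", correct_form="search")]
print(evaluate_spell_correction(fake_search, cases))
```

Other components would, under the same assumptions, get their own case type and scoring function while reusing the same `search` callable, which keeps each task focused on one component.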
- Published
- 2022