Effects of central tendency measures on term weighting in textual information retrieval

Authors :
Andrea Visconti
Hooman Tahayori
Farzad Ghahramani
Source :
Soft Computing. 25:7341-7378
Publication Year :
2021
Publisher :
Springer Science and Business Media LLC, 2021.

Abstract

Term weighting is known to have a significant effect on the retrieval of relevant documents, and various weighting methods have been proposed. The main question that arises, however, is which weighting method is best. This paper shows that properly aggregating the weights generated by carefully selected basic weighting methods improves retrieval of documents relevant to the user’s needs. In particular, even simple central tendency measures such as the average, median, or mid-range, applied over an appropriate subset of basic weighting methods, yield term weights that not only outperform each basic weighting method used alone but are also more effective than recently proposed, more complicated weighting methods. By applying the proposed method to various datasets, we study the effects of normalizing the basic weights, normalizing the vector lengths, using different components in the term frequency factor, and related choices. The results reveal criteria for selecting an appropriate subset of basic weighting methods to feed to the aggregator in order to achieve higher retrieval precision.
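The aggregation idea in the abstract can be illustrated with a minimal sketch. The record does not say which basic weighting methods the paper selects or how it normalizes them, so the three schemes here (raw term frequency, logarithmic term frequency, and tf-idf), the max-normalization step, and all function names are assumptions chosen for illustration only.

```python
import math
from statistics import mean, median


def midrange(values):
    """Mid-range: the midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


def basic_weights(term_counts, doc_freq, n_docs):
    """Weight each term of one document under three illustrative schemes.

    The choice of schemes is an assumption, not the paper's selected subset.
    Returns {scheme_name: {term: weight}}.
    """
    schemes = {"raw_tf": {}, "log_tf": {}, "tf_idf": {}}
    for term, tf in term_counts.items():
        idf = math.log(n_docs / doc_freq[term])
        schemes["raw_tf"][term] = float(tf)
        schemes["log_tf"][term] = 1.0 + math.log(tf)
        schemes["tf_idf"][term] = tf * idf
    return schemes


def max_normalize(weights):
    """Scale weights into [0, 1] by dividing by the largest weight (assumed step)."""
    top = max(weights.values())
    if top == 0:
        return {term: 0.0 for term in weights}
    return {term: w / top for term, w in weights.items()}


def aggregate(schemes, aggregator):
    """Combine the normalized weights of every scheme, term by term."""
    normalized = [max_normalize(w) for w in schemes.values()]
    terms = normalized[0].keys()
    return {t: aggregator([w[t] for w in normalized]) for t in terms}


# Hypothetical toy document and collection statistics.
term_counts = {"fuzzy": 4, "retrieval": 2, "the": 4}
doc_freq = {"fuzzy": 2, "retrieval": 5, "the": 10}
schemes = basic_weights(term_counts, doc_freq, n_docs=10)

for name, fn in [("average", mean), ("median", median), ("mid-range", midrange)]:
    print(name, aggregate(schemes, fn))
```

Normalizing before aggregating keeps a scheme with a large numeric range from dominating the combined weight; whether this particular normalization matches the one studied in the paper is not stated in this record.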

Details

ISSN :
1433-7479 and 1432-7643
Volume :
25
Database :
OpenAIRE
Journal :
Soft Computing
Accession number :
edsair.doi...........7b5b0d0710b679a5b5550798cd7ca6ed
Full Text :
https://doi.org/10.1007/s00500-021-05694-5