Effects of central tendency measures on term weighting in textual information retrieval
- Source :
- Soft Computing. 25:7341-7378
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
Abstract
- Term weighting has a significant effect on the retrieval of relevant documents, and various weighting methods have been proposed for it. The main question that arises, however, is which weighting method is best. In this paper, it is shown that proper aggregation of the weights generated by carefully selected basic weighting methods improves retrieval of the documents relevant to the user’s needs. Toward this aim, it is shown that even simple central tendency measures such as the average, median, or mid-range, applied over an appropriate subset of basic weighting methods, provide term weights that not only outperform each basic weighting method on its own but are also more effective than recently proposed, more complicated weighting methods. By applying the proposed method to various datasets, we have studied the effects of normalizing the basic weights, normalizing the vector lengths, using different components in the term frequency factor, and so on. The results reveal the criteria for selecting an appropriate subset of basic weighting methods to feed to the aggregator in order to achieve higher retrieval precision.
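- The aggregation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three basic schemes, their sample weights, the min-max normalization step, and the helper names `min_max`, `midrange`, and `aggregate` are all assumptions made for illustration. The sketch normalizes each scheme's weights to a common range and then combines them per term with a central tendency measure.

```python
from statistics import mean, median


def midrange(values):
    """Mid-range: the midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


def min_max(weights):
    """Rescale one scheme's weights to [0, 1] (an assumed normalization)."""
    lo, hi = min(weights.values()), max(weights.values())
    if hi == lo:
        return {term: 0.0 for term in weights}
    return {term: (w - lo) / (hi - lo) for term, w in weights.items()}


def aggregate(schemes, combine=mean):
    """Combine per-term weights from several basic schemes.

    `schemes` is a list of dicts mapping term -> weight, all over the
    same terms. `combine` is any function that reduces a list of numbers
    to one number (mean, median, midrange, ...).
    """
    normalized = [min_max(s) for s in schemes]
    return {
        term: combine([s[term] for s in normalized])
        for term in schemes[0]
    }


# Hypothetical weights for three terms under three basic schemes.
schemes = [
    {"retrieval": 2.0, "term": 1.0, "the": 0.0},
    {"retrieval": 1.5, "term": 1.2, "the": 0.0},
    {"retrieval": 3.0, "term": 0.6, "the": 0.0},
]

for name, fn in [("average", mean), ("median", median), ("mid-range", midrange)]:
    print(name, aggregate(schemes, fn))
```

  Passing the measure as a function keeps the aggregator independent of which central tendency is used, so the choice of measure and the choice of scheme subset can be varied separately.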
- Subjects :
- Normalization (statistics)
0209 industrial biotechnology
Computer science
Computational intelligence
02 engineering and technology
computer.software_genre
Theoretical Computer Science
Term (time)
Textual information
News aggregator
Weighting
020901 industrial engineering & automation
Simple (abstract algebra)
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Geometry and Topology
Data mining
Document retrieval
computer
Software
Details
- ISSN :
- 1433-7479 and 1432-7643
- Volume :
- 25
- Database :
- OpenAIRE
- Journal :
- Soft Computing
- Accession number :
- edsair.doi...........7b5b0d0710b679a5b5550798cd7ca6ed
- Full Text :
- https://doi.org/10.1007/s00500-021-05694-5