
A New Measure of Similarity in Textual Analysis: Vector Similarity Metric versus Cosine Similarity Metric.

Authors :
Srivastava, Rajendra P.
Source :
Journal of Emerging Technologies in Accounting; Spring 2023, Vol. 20 Issue 1, p77-90, 14p, 8 Charts, 3 Graphs
Publication Year :
2023

Abstract

This paper proposes a new similarity metric, the Vector Similarity Metric (VSM), which is as simple as the popular Cosine Similarity Metric (CSM). The CSM has a major deficiency: it yields the same value regardless of how much the two vectors differ in size, so long as the angle between them is the same. This deficiency persists even when Natural Language Processing is used to associate semantic meanings with the words/phrases and when the term frequency is modified using Inverse Document Frequency. It becomes a serious concern when one is comparing the risk profile of one company with that of another or investigating changes in a company's risk profile from one year to the next. The VSM is based on the difference of the two vectors. The paper demonstrates the superiority of VSM over CSM analytically and through real-world examples. [ABSTRACT FROM AUTHOR]
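
The deficiency described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the VSM formula, so the code below does not attempt to reproduce it. It only shows that cosine similarity is unchanged when one vector is scaled, while the Euclidean length of the difference vector (used here purely as an illustrative stand-in for a difference-based measure) is not. The vectors `doc_a` and `doc_b` and the helper `difference_norm` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term-frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def difference_norm(a, b):
    """Euclidean length of (a - b).

    Assumption: an illustrative difference-based quantity only,
    NOT the VSM formula from the paper.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical term counts: doc_b mentions every term ten times as often as doc_a.
doc_a = [3.0, 1.0, 2.0]
doc_b = [30.0, 10.0, 20.0]

cos_ab = cosine_similarity(doc_a, doc_b)   # same direction, so approximately 1.0
diff_ab = difference_norm(doc_a, doc_b)    # clearly nonzero: the sizes differ

print(f"cosine similarity: {cos_ab:.4f}")
print(f"length of difference vector: {diff_ab:.4f}")
```

In this toy case the cosine similarity treats the two documents as identical even though one contains ten times as many occurrences of each term, which is the kind of size difference the abstract argues matters when comparing risk profiles.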

Details

Language :
English
ISSN :
1554-1908
Volume :
20
Issue :
1
Database :
Complementary Index
Journal :
Journal of Emerging Technologies in Accounting
Publication Type :
Academic Journal
Accession number :
163842952
Full Text :
https://doi.org/10.2308/JETA-2021-043