Do We Agree? Measuring Agreement on the Human Judgments in Emotion Annotation of News Sentences
- Author
- Bhowmick, Plaban Kumar; Mitra, Pabitra; Basu, Anupam
- Subjects
- *
EMOTIONAL conditioning , *NATURAL language processing , *HUMAN-computer interaction , *RELIABILITY (Personality trait) , *INFORMATION filtering , *ACQUIESCENCE (Psychology) , *CONFIDENCE intervals , *DATA analysis - Abstract
An emotional text may be judged to belong to multiple emotion categories because it may evoke different emotions with varying degrees of intensity. Supervised emotion analysis of text requires a text corpus annotated with emotion categories. Because emotion is highly subjective, reliable annotation is a prime requirement for developing a robust emotion analysis model; it is therefore wise to have the data set annotated by multiple human judges and to generate an aggregated data set, provided that the emotional responses of the different annotators exhibit substantial agreement. In practice, multiple emotional responses to an emotional text are common, so the data set is a multilabel one in which a single data item may belong to more than one category simultaneously. This article presents a new agreement measure for computing interannotator reliability in multilabel annotation. The new reliability coefficient is applied to measure the quality of an emotion text corpus. The procedure for generating the aggregated data and some corpus-cleaning techniques are also discussed. [ABSTRACT FROM AUTHOR]
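In the multilabel setting the abstract describes, each annotator assigns a *set* of emotion labels per sentence, so simple percent agreement on a single category does not apply. As a minimal illustrative sketch, one could score per-item overlap between two annotators' label sets with the Jaccard index and average over the corpus; note this is only an assumed stand-in, not the chance-corrected reliability coefficient the article proposes, and the emotion labels below are hypothetical:

```python
# Illustrative sketch only: mean Jaccard overlap between two annotators'
# label sets per item. NOT the article's coefficient, which additionally
# corrects for chance agreement.

def jaccard(a: set, b: set) -> float:
    """Overlap between two label sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def mean_pairwise_agreement(annotator1, annotator2):
    """Average per-item Jaccard agreement over a multilabel data set."""
    scores = [jaccard(set(x), set(y)) for x, y in zip(annotator1, annotator2)]
    return sum(scores) / len(scores)

# Hypothetical multilabel annotations of three news sentences.
a1 = [{"fear", "sadness"}, {"joy"}, {"anger"}]
a2 = [{"fear"}, {"joy"}, {"anger", "disgust"}]

print(mean_pairwise_agreement(a1, a2))  # (0.5 + 1.0 + 0.5) / 3 ≈ 0.667
```

A chance-corrected measure (in the spirit of kappa-family coefficients) would further compare this observed agreement against the agreement expected from the annotators' label distributions.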
- Published
- 2010