
Unsupervised Numerical Information Extraction via Exploiting Syntactic Structures.

Authors :
Wang, Zixiang
Li, Tongliang
Li, Zhoujun
Source :
Electronics (2079-9292); May2023, Vol. 12 Issue 9, p1977, 18p
Publication Year :
2023

Abstract

Numerical information plays an important role in scientific, financial, social, statistical, and news domains. Most prior studies adopt unsupervised methods that rely on complex handcrafted pattern-matching rules to extract numerical information, which can be difficult to scale to the open domain. Supervised methods, in turn, require extra time, cost, and knowledge to design, understand, and annotate the training data. To address these limitations, we propose QuantityIE, a novel approach to extracting numerical information as structured representations by exploiting syntactic features of both constituency parsing (CP) and dependency parsing (DP). The extraction results may also serve as distant supervision for zero-shot model training. Our approach outperforms existing methods from two perspectives: (1) the rules are simple yet effective, and (2) the results are more self-contained. We further propose a numerical information retrieval approach based on QuantityIE to answer analytical queries. Experimental results on information extraction and retrieval demonstrate the effectiveness of QuantityIE in extracting numerical information with high fidelity. [ABSTRACT FROM AUTHOR]
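
As a rough illustration of the kind of syntactic rule the abstract alludes to, the following minimal sketch pulls a (value, unit, predicate, subject) tuple out of a dependency parse. It is not the QuantityIE implementation: the `Token` and `QuantityFact` types, the `extract_quantities` function, the single `nummod` rule, and the hand-annotated parse of the example sentence are all assumptions made for this sketch, and the constituency-parsing side of the approach is not covered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    text: str
    head: int  # index of the syntactic head; the root points to itself
    dep: str   # dependency label (Universal Dependencies style, assumed)


@dataclass
class QuantityFact:
    value: str
    unit: str
    predicate: Optional[str]
    subject: Optional[str]


def extract_quantities(tokens: List[Token]) -> List[QuantityFact]:
    """Hypothetical rule: a numeral attached as `nummod` to a noun yields
    (value, unit); the noun's head is taken as the predicate, and that
    predicate's `nsubj` child, if any, as the subject."""
    facts = []
    for tok in tokens:
        if tok.dep != "nummod":
            continue
        unit = tokens[tok.head]
        pred_idx = unit.head
        # First token that is a nominal subject of the same predicate.
        subject = next(
            (t.text for t in tokens if t.dep == "nsubj" and t.head == pred_idx),
            None,
        )
        facts.append(
            QuantityFact(tok.text, unit.text, tokens[pred_idx].text, subject)
        )
    return facts


# Hand-annotated toy parse of "Revenue grew 12 percent" (labels assumed,
# not produced by a real parser).
SENTENCE = [
    Token("Revenue", head=1, dep="nsubj"),
    Token("grew", head=1, dep="root"),
    Token("12", head=3, dep="nummod"),
    Token("percent", head=1, dep="obj"),
]

for fact in extract_quantities(SENTENCE):
    print(fact)
```

In practice the token list would come from a dependency parser rather than being written by hand, and a single rule like this would miss most real constructions (ranges, comparatives, coordinated quantities).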

Subjects

Subjects :
DATA mining
INFORMATION retrieval

Details

Language :
English
ISSN :
2079-9292
Volume :
12
Issue :
9
Database :
Complementary Index
Journal :
Electronics (2079-9292)
Publication Type :
Academic Journal
Accession number :
163684180
Full Text :
https://doi.org/10.3390/electronics12091977