Feature Forest Models for Probabilistic HPSG Parsing.

Authors :
Miyao, Yusuke
Tsujii, Jun'ichi
Source :
Computational Linguistics. Mar 2008, Vol. 34 Issue 1, p35-80. 46p. 22 Diagrams, 17 Charts, 2 Graphs.
Publication Year :
2008

Abstract

Probabilistic modeling of lexicalized grammars is difficult because these grammars exploit complicated data structures, such as typed feature structures. This prevents us from applying common methods of probabilistic modeling in which a complete structure is divided into substructures under the assumption of statistical independence among substructures. For example, part-of-speech tagging of a sentence is decomposed into tagging of each word, and CFG parsing is split into applications of CFG rules. These methods have relied on the structure of the target problem, namely lattices or trees, and cannot be applied to graph structures including typed feature structures. This article proposes the feature forest model as a solution to the problem of probabilistic modeling of complex data structures including typed feature structures. The feature forest model provides a method for probabilistic modeling without the independence assumption when probabilistic events are represented with feature forests. Feature forests are generic data structures that represent ambiguous trees in a packed forest structure. Feature forest models are maximum entropy models defined over feature forests. A dynamic programming algorithm is proposed for maximum entropy estimation without unpacking feature forests. Thus probabilistic modeling of any data structures is possible when they are represented by feature forests. This article also describes methods for representing HPSG syntactic structures and predicate-argument structures with feature forests. Hence, we describe a complete strategy for developing probabilistic models for HPSG parsing. The effectiveness of the proposed methods is empirically evaluated through parsing experiments on the Penn Treebank, and the promise of applicability to parsing of real-world sentences is discussed. [ABSTRACT FROM AUTHOR]
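
The dynamic programming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy packed forest in which each conjunctive node carries feature counts and points to disjunctive daughters, and each disjunctive node lists alternative conjunctive nodes. All node names, feature names, and weights (`CONJ`, `DISJ`, `LAMBDA`) are hypothetical. The sketch computes only the normalization constant of a maximum entropy model by memoized recursion over the forest, and checks it against brute-force unpacking. The expected feature counts needed for estimation would require an additional outside-style pass, which is omitted here.

```python
import math
from functools import lru_cache
from itertools import product

# Hypothetical toy forest; names and weights are illustrative only.
# Conjunctive node -> (feature counts, disjunctive daughters)
CONJ = {
    "c1": ({"f_root": 1}, ("d1", "d2")),
    "c2": ({"f_a": 1}, ()),
    "c3": ({"f_b": 1}, ()),
    "c4": ({"f_a": 1}, ()),
    "c5": ({"f_c": 1}, ()),
}
# Disjunctive node -> alternative conjunctive nodes (exactly one is chosen)
DISJ = {"root": ("c1",), "d1": ("c2", "c3"), "d2": ("c4", "c5")}
# Assumed model parameters, one weight per feature
LAMBDA = {"f_root": 0.5, "f_a": 1.0, "f_b": -0.5, "f_c": 0.2}


def node_weight(c):
    """Unnormalized weight exp(sum_i lambda_i * f_i) of one conjunctive node."""
    feats, _ = CONJ[c]
    return math.exp(sum(LAMBDA[f] * v for f, v in feats.items()))


@lru_cache(maxsize=None)
def inside_conj(c):
    """Total weight of all subtrees rooted at conjunctive node c."""
    _, daughters = CONJ[c]
    return node_weight(c) * math.prod(inside_disj(d) for d in daughters)


@lru_cache(maxsize=None)
def inside_disj(d):
    """Sum over the alternatives packed under disjunctive node d."""
    return sum(inside_conj(c) for c in DISJ[d])


def partition_function(root="root"):
    """Normalization constant, computed without unpacking the forest."""
    return inside_disj(root)


def enumerate_trees(d):
    """Brute-force unpacking: yield each tree below d as a list of conjunctive nodes."""
    for c in DISJ[d]:
        _, daughters = CONJ[c]
        for parts in product(*(list(enumerate_trees(x)) for x in daughters)):
            yield [c] + [n for part in parts for n in part]


def brute_force_partition(root="root"):
    """Same constant, computed by summing over every unpacked tree."""
    return sum(math.prod(node_weight(c) for c in tree)
               for tree in enumerate_trees(root))
```

In this toy forest the two routes agree, but the memoized recursion visits each node once, whereas the number of unpacked trees grows multiplicatively with the alternatives under each disjunctive node.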

Details

Language :
English
ISSN :
0891-2017
Volume :
34
Issue :
1
Database :
Academic Search Index
Journal :
Computational Linguistics
Publication Type :
Academic Journal
Accession number :
31210438
Full Text :
https://doi.org/10.1162/coli.2008.34.1.35