1. Robust Dependency Parsing of Spontaneous Japanese Spoken Language
- Author
- Yasuyoshi Inagaki, Nobuo Kawaguchi, Tomohiro Ohno, and Shigeki Matsubara
- Subjects
Parsing, Computer science, Speech recognition, syntactically annotated corpus, dependency parsing, linguistic phenomena, Japanese speech, Artificial Intelligence, Hardware and Architecture, Dependency grammar, S-attributed grammar, Written language, stochastic parsing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Software, Natural language processing, Utterance, Bottom-up parsing, Spoken language
- Abstract
Spontaneously spoken Japanese contains many grammatically ill-formed phenomena, such as fillers, hesitations, and inversions, that do not appear in written language. This paper proposes a novel method of robust dependency parsing using a large-scale spoken language corpus, and evaluates the effectiveness and robustness of the method on spontaneously spoken dialogue sentences. By utilizing stochastic information about the appearance of ill-formed phenomena, the method can robustly parse spoken Japanese that includes fillers, inversions, or dependencies over utterance units. Experimental results show that the parsing accuracy reached 87.0%, and confirm that the location of a bunsetsu and the distance between bunsetsus are effective as stochastic information.
- Published
- 2005