A computational grammar for Persian based on GPSG

Authors :
Mohammad Bahrani
Mehdi Manshadi
Hossein Sameti
Source :
Language Resources and Evaluation. 45:387-408
Publication Year :
2011
Publisher :
Springer Science and Business Media LLC, 2011.

Abstract

In this paper, we present our attempts to design and implement a large-coverage computational grammar for the Persian language based on the Generalized Phrase Structure Grammar (GPSG) model. This grammatical model was developed for continuous speech recognition (CSR) applications, but it is also suitable for other applications that need syntactic analysis of Persian. In this work, we investigate various syntactic structures of modern Persian and then describe these structures according to a phrase structure model. Noun (N), Verb (V), Adjective (ADJ), Adverb (ADV), and Preposition (P) are taken as the basic syntactic categories, and X-bar theory is used to define Noun phrases, Verb phrases, Adjective phrases, Adverbial phrases, and Prepositional phrases. However, we have to extend the Noun phrase projection in X-bar theory to four levels because of certain complexities in the structure of Persian Noun phrases. A set of 120 grammatical rules describing the different phrase structures of Persian is extracted, and a few instances of the rules are presented in this paper. These rules cover the major syntactic structures of modern Persian. For evaluation, the resulting grammatical model is used in a bottom-up chart parser to parse 100 Persian sentences, of which the grammar successfully analyzes 89. Incorporating this grammar in a Persian CSR system leads to a 31% reduction in word error rate.
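To make the evaluation setup in the abstract more concrete, the following is a minimal sketch of a bottom-up chart parser over phrase structure rules, together with a coverage count of the kind reported (sentences parsed out of sentences tried). Everything in it is an assumption made for illustration: the toy rules, the function names `close_unary`, `parse`, and `coverage`, and the use of part-of-speech tag sequences as input are hypothetical and are not taken from the paper's 120-rule grammar or its parser, and the sketch uses plain atomic categories rather than GPSG feature structures.

```python
from collections import defaultdict

# Hypothetical toy rules for illustration only; NOT the paper's grammar.
# Binary rules map a (left, right) pair of categories to possible parents.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("N", "ADJ"): {"NP"},   # head noun followed by a modifier
    ("P", "NP"): {"PP"},
    ("NP", "V"): {"VP"},    # verb-final order
    ("PP", "VP"): {"VP"},
}
# Unary rules map a category to the parents it can project to.
UNARY_RULES = {
    "N": {"NP"},
    "V": {"VP"},
}


def close_unary(cats):
    """Return cats plus every category reachable through unary rules."""
    closed = set(cats)
    agenda = list(cats)
    while agenda:
        cat = agenda.pop()
        for parent in UNARY_RULES.get(cat, ()):
            if parent not in closed:
                closed.add(parent)
                agenda.append(parent)
    return closed


def parse(tags, goal="S"):
    """Bottom-up chart recognition: True if the tag sequence derives goal."""
    n = len(tags)
    chart = defaultdict(set)  # (start, end) -> categories spanning it
    # Seed the chart with single-word spans.
    for i, tag in enumerate(tags):
        chart[i, i + 1] = close_unary({tag})
    # Combine adjacent smaller spans into progressively wider ones.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            found = set()
            for mid in range(start + 1, end):
                for left in chart[start, mid]:
                    for right in chart[mid, end]:
                        found |= BINARY_RULES.get((left, right), set())
            chart[start, end] = close_unary(found)
    return goal in chart[0, n]


def coverage(tagged_sentences):
    """Fraction of sentences for which the grammar finds a full parse."""
    parsed = sum(1 for tags in tagged_sentences if parse(tags))
    return parsed / len(tagged_sentences)
```

Under these assumptions, `parse(["N", "ADJ", "P", "N", "V"])` succeeds because the toy rules build NP, PP, VP, and finally S over the whole sequence, whereas `parse(["V", "N"])` fails. In the same terms, the coverage figure in the abstract corresponds to 89 parsed sentences out of 100.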

Details

ISSN :
1574-0218 and 1574-020X
Volume :
45
Database :
OpenAIRE
Journal :
Language Resources and Evaluation
Accession number :
edsair.doi...........73b63e7a72596fa3b6e758629a552f17
Full Text :
https://doi.org/10.1007/s10579-011-9144-1