60 results on '"Deterministic parsing"'
Search Results
2. Dependency Parsing for Arabic Quran using Easy-First Parsing Algorithm
- Author
Mochammad Arif Bijaksana, Alfiya El Hafsa, and Arief Fatchul Huda
- Subjects
Parsing ,Interpretation (logic) ,Relation (database) ,Computer science ,Head (linguistics) ,Dependency grammar ,Deterministic parsing ,computer.software_genre ,Algorithm ,computer ,Sentence ,Word (computer architecture)
- Abstract
Arabic is the language of the Quran. Many people today study the language of the Quran, known as Quranic Arabic. Beginners need to understand the syntactic relationships within a sentence of the Quran; without that understanding, their interpretation may be wrong, which is dangerous because the Quran is a source of guidance for Muslims' lives. Dependency parsing is important for linguistic research, especially for rich languages such as Arabic. This study builds a dependency parser to make the syntactic relationship information in sentences easier to understand. It uses deterministic parsing, specifically shift-reduce parsing with the Easy-First parsing algorithm. The evaluation used the labeled attachment score. The system results were compared with the gold standard, and the resulting score was 69.7. In 62 sentences, the correct head and relation were found for every word. No sentence contained more than 3 incorrectly parsed words. The evaluation score is not high because of the complicated tagset used and the small number of test sentences.
- Published
- 2020
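The labeled attachment score used for evaluation in the abstract above can be sketched as follows. This is a minimal illustration of the standard metric, not the authors' evaluation script: the function name `labeled_attachment_score`, the `(head, relation)` token encoding, and the example labels are all assumptions made for this sketch.

```python
from typing import List, Tuple

# Each token is encoded as (head_index, relation_label); head 0 denotes the root.
# This encoding is an assumption of the sketch, not taken from the paper.
Arc = Tuple[int, str]


def labeled_attachment_score(gold: List[Arc], predicted: List[Arc]) -> float:
    """Return the percentage of tokens whose head AND relation both match gold."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        return 0.0
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return 100.0 * correct / len(gold)


# Hypothetical 4-token sentence; the relation labels are invented for illustration.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (3, "adj")]
pred = [(2, "subj"), (0, "root"), (2, "adj"), (3, "adj")]  # third label is wrong
las = labeled_attachment_score(gold, pred)
```

A token counts as correct only when both its head and its relation label agree with the gold standard, so the third token above is counted as an error even though its head is right.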
3. A deterministic parsing algorithm for ambiguous regular expressions
- Author
Angelo Borsotti, Luca Breveglieri, Angelo Morzenti, and Stefano Crespi Reghizzi
- Subjects
Empty string ,regular expression deterministic parsing ,Computer Networks and Communications ,Computer science ,0102 computer and information sciences ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,regular expression parsing tool ,0202 electrical engineering, electronic engineering, information engineering ,ambiguous regular expression ,Regular expression ,Time complexity ,Parsing ,Syntax (programming languages) ,String (computer science) ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,020207 software engineering ,Tree (graph theory) ,Berry-Sethi recognizer ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,010201 computation theory & mathematics ,Deterministic parsing ,computer ,Algorithm ,Software ,Information Systems
- Abstract
We introduce a new parser generator, called Berry–Sethi Parser (BSP), for ambiguous regular expressions (RE). The generator constructs a deterministic finite-state transducer that recognizes an input string, as the classical Berry–Sethi algorithm does, and additionally outputs a linear representation of all the syntax trees of the string; for infinitely ambiguous strings, a policy for selecting representative sets of trees is chosen. To construct the transducer, the RE symbols, including letters, parentheses and other metasymbols, are distinctly numbered, so that the corresponding language becomes locally testable. In this way a deterministic position automaton can be constructed, which recognizes and translates the input into a compact DAG representation of the syntax trees. The correctness of the construction is proved. The transducer operates in a linear time on the input. Its descriptive complexity is analyzed as a function of established RE parameters: the alphabetic width, the number of null string symbols and the height of the RE tree. A condition for checking RE ambiguity on the transducer graph is stated. Experimental results of running the parser generator and the parser on a large RE collection are presented. The POSIX RE disambiguation criterion has also been applied to the parser.
- Published
- 2021
4. Deeply Integrating C11 Code Support into Isabelle/PIDE
- Author
Frédéric Tuong and Burkhart Wolff
- Subjects
Computer Science - Symbolic Computation ,Computer Science - Logic in Computer Science ,Computer Science - Programming Languages ,Programming language ,Computer science ,Interface (Java) ,Framing (World Wide Web) ,lcsh:Mathematics ,HOL ,Context (language use) ,Semantics ,computer.software_genre ,lcsh:QA1-939 ,lcsh:QA75.5-76.95 ,Computer Science - Software Engineering ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Rule-based machine translation ,Memory model ,lcsh:Electronic computers. Computer science ,Deterministic parsing ,computer
- Abstract
We present a framework for C code in C11 syntax deeply integrated into the Isabelle/PIDE development environment. Our framework provides an abstract interface for verification back-ends to be plugged in independently. Thus, various techniques such as deductive program verification or white-box testing can be applied to the same source, which is part of an integrated PIDE document model. Semantic back-ends are free to choose the supported C fragment and its semantics. In particular, they can differ on the chosen memory model or the specification mechanism for framing conditions. Our framework supports semantic annotations of C sources in the form of comments. Annotations serve to locally control back-end settings, and can express the term focus to which an annotation refers. Both the logical and the syntactic context are available when semantic annotations are evaluated. As a consequence, a formula in an annotation can refer to both HOL and C variables. Our approach demonstrates the degree of maturity and expressive power the Isabelle/PIDE subsystem has achieved in recent years. Our integration technique employs Lex and Yacc style grammars to ensure efficient deterministic parsing. We present two case studies for the integration of (known) semantic back-ends in order to validate the design decisions for our back-end interface. Comment: In Proceedings F-IDE 2019, arXiv:1912.09611
- Published
- 2019
5. Uniquely Parsable Unification Grammars and Their Parser Implemented in Prolog.
- Author
Jia Lee, Kenichi Morita, Hiroki Asou, and Katsunobu Imai
- Abstract
A uniquely parsable grammar (UPG) introduced by Morita and coworkers is a formal grammar with a restricted type of rewriting rules, where parsing can be performed without backtracking. By extending a UPG, we introduce a uniquely parsable unification grammar (UPUG), and we investigate its applicability to parsing. A unification grammar (UG) is a system such that a sequence of terms is rewritten by a set of rules, and the rewriting process accompanies unification of terms as in Prolog. We first define a general framework of a UG and then give a UPUG-condition so that it has the property of unique parsability. Since the class of UPGs is a subclass of UPUGs and is known to be universal in language generating ability, the class of UPUGs is also universal. We then show a simple parsing method for UPUGs. Based on it, we give a Prolog implementation of a parser which will be useful for natural language analysis and other applications. [ABSTRACT FROM AUTHOR]
- Published
- 2000
6. A new view on parser combinators
- Author
Pieter Koopman and Rinus Plasmeijer
- Subjects
Functional programming ,Domain-specific language ,Parsing ,Grammar ,Computer science ,Programming language ,media_common.quotation_subject ,computer.software_genre ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Deterministic parsing ,Combinatory logic ,computer ,Type constructor ,media_common
- Abstract
Parser combinators offer a concise and fast way to produce reasonably efficient parsers. The combinator libraries themselves can be small and provide an elegant application of functional programming techniques. They are one of the success stories in functional programming that are also ported to many other languages. In this paper, we illustrate that we can make the parser combinators more general by modeling them as a tagless domain specific language. The idea is to replace the ordinary combinators by a set of type constructor classes. By making different implementations of this class we can assign various interpretations of one and the same grammar specification. The set of type classes makes the DSL type-safe and extendable without needing to change existing parts and implementations. This enables us to make multiple interpretations, views, of the specified grammar. In this paper we show views for deterministic parsing, nondeterministic parsing, generating possible parse trees produced by the grammar without needing the corresponding input, generating inputs accepted by the grammar, adapting the grammar rules such that the parser combinators can handle left-recursion and so on. This makes our multi-view parser combinators more powerful than the existing approaches.
- Published
- 2019
7. DEALING WITH AMBIGUITIES IN ENGLISH CONJUNCTIONS AND COMPARATIVES BY A DETERMINISTIC PARSER.
- Author
Rey-Long Liu and Von-Wun Soo
- Abstract
The major problems in parsing English conjunctions and comparatives are ambiguities of scoping and ellipsis. Scoping ambiguities occur when a parser cannot deterministically detect boundaries of constituents, while ellipsis ambiguities occur when a parser cannot deterministically detect missing components. Since simple lookahead mechanisms cannot collect adequate information to resolve these ambiguities, a parsing strategy that only employs such mechanisms will need to backtrack each time it makes incorrect assumptions. In this paper, we extend the Wait-And-See strategy to parse conjunctions and comparatives deterministically and simultaneously. Several mechanisms, such as bottom-up preparsing, suspension, and pattern matching, are implemented. The bottom-up preparsing accesses the dictionary and recognizes isolated sentence fragments which can be determined without ambiguities. The suspension, which is different from Marcus's attention shifting, allows the parser to suspend temporally at ambiguous points and continue to parse the rest of the sentence until it obtains the necessary information to resolve the ambiguities. Pattern matching uses the concept of symmetry to detect missing components (the ellipses) in the two conjoined or compared sentence fragments. [ABSTRACT FROM AUTHOR]
- Published
- 1990
8. Generalizing input-driven languages: Theoretical and practical benefits
- Author
Dino Mandrioli and Matteo Pradella
- Subjects
FOS: Computer and information sciences ,Theoretical computer science ,Parsing ,General Computer Science ,Hierarchy (mathematics) ,Formal Languages and Automata Theory (cs.FL) ,Computer science ,020207 software engineering ,Computer Science - Formal Languages and Automata Theory ,0102 computer and information sciences ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,Theoretical Computer Science ,Decidability ,Order of operations ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Regular language ,Closure (mathematics) ,Compiler construction ,010201 computation theory & mathematics ,0202 electrical engineering, electronic engineering, information engineering ,Deterministic parsing ,computer
- Abstract
Regular languages (RL) are the simplest family in Chomsky’s hierarchy. Thanks to their simplicity they enjoy various nice algebraic and logic properties that have been successfully exploited in many application fields. Practically all of their related problems are decidable, so that they support automatic verification algorithms. Also, they can be recognized in real-time. Context-free languages (CFL) are another major family well-suited to formalize programming, natural, and many other classes of languages; their increased generative power w.r.t. RL, however, causes the loss of several closure properties and of the decidability of important problems; furthermore they need complex parsing algorithms. Thus, various subclasses thereof have been defined with different goals, spanning from efficient, deterministic parsing to closure properties, logic characterization and automatic verification techniques. Among CFL subclasses, so-called structured ones, i.e., those where the typical tree-structure is visible in the sentences, exhibit many of the algebraic and logic properties of RL, whereas deterministic CFL have been thoroughly exploited in compiler construction and other application fields. After surveying and comparing the main properties of those various language families, we go back to operator precedence languages (OPL), an old family through which R. Floyd pioneered deterministic parsing, and we show that they offer unexpected properties in two fields so far investigated in totally independent ways: they enable parsing parallelization in a more effective way than traditional sequential parsers, and exhibit the same algebraic and logic properties so far obtained only for less expressive language families.
- Published
- 2018
9. Deterministic Parsing with P Colony Automata
- Author
Erzsébet Csuhaj-Varjú, Kristóf Kántor, and György Vaszil
- Subjects
Discrete mathematics ,Multiset ,Parsing ,Computer science ,Backtracking ,Property (programming) ,0102 computer and information sciences ,02 engineering and technology ,computer.software_genre ,01 natural sciences ,Automaton ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Rule-based machine translation ,010201 computation theory & mathematics ,Symbol (programming) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Deterministic parsing ,computer
- Abstract
We investigate the possibility of the deterministic parsing (that is, parsing without backtracking) of languages described by (generalized) P colony automata. We define a subclass of these computing devices satisfying a property which resembles the LL(k) property of context-free grammars, and study the possibility of parsing the characterized languages using a k symbol lookahead, as in the LL(k) parsing method for context-free languages.
- Published
- 2018
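The abstract above compares its approach to LL(k) parsing of context-free languages, where a fixed lookahead selects the next step without backtracking. As a point of reference only, here is a minimal sketch of the classical table-driven LL(1) method (k = 1). It does not model P colony automata; the toy grammar, the `TABLE` contents, and the function name `ll1_accepts` are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical LL(1) parsing table for the toy grammar  S -> ( S ) S | ε
# Key: (nonterminal on top of the stack, lookahead symbol) -> right-hand side.
TABLE: Dict[Tuple[str, str], List[str]] = {
    ("S", "("): ["(", "S", ")", "S"],
    ("S", ")"): [],  # S -> ε
    ("S", "$"): [],  # S -> ε
}
NONTERMINALS = {"S"}


def ll1_accepts(text: str) -> bool:
    """Recognize balanced parentheses with one symbol of lookahead, no backtracking."""
    tokens = list(text) + ["$"]  # "$" marks the end of the input
    stack = ["$", "S"]           # "$" is the bottom marker, "S" the start symbol
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:
                return False     # no table entry: reject immediately
            stack.extend(reversed(rule))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1             # terminal matches the lookahead: consume it
        else:
            return False
    return pos == len(tokens)
```

Each step is decided by a single table lookup on the stack top and the next input symbol, which is what makes the parse deterministic.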
10. Determination of the Compound Biological Effectiveness (CBE) Factors Based on the ISHIYAMA-IMAHORI Deterministic Parsing Model with the Dynamic PET Technique
- Author
Hanna Koivunoro, Shintaro Ishiyama, Jun Itami, and Yoshio Imahori
- Subjects
Neutron capture ,Boron concentration ,Order (group theory) ,Applied mathematics ,Sigmoid function ,Patient data ,Logistic function ,Deterministic parsing ,Eigenvalues and eigenvectors ,Mathematics
- Abstract
Purpose: In defining the biological effects of the 10B(n, α)7Li neutron capture reaction, we have proposed a deterministic parsing model (ISHIYAMA-IMAHORI model) to determine the Compound Biological Effectiveness (CBE) factor in Borono-Phenyl-Alanine (BPA)-mediated Boron Neutron Capture Therapy (BNCT). In the present paper, we demonstrate a specific method for applying this model to actual patient data for tissues and tumor. Method: To determine the CBE factor, we derived the following new calculation formula founded on the deterministic parsing model with three constants, CBE0, F, n and the eigenvalue Nth/Nmax: Equation (1), where Nth and Nmax are the threshold value of the boron concentration N and the saturation boron density, and CBE0, F and n are given as 0.5, 8 and 3, respectively. To determine Nth and Nmax in the formula, a sigmoid logistic function was employed for the 10B concentration data Db(t) obtained by the dynamic PET technique: Equation (2), where A, a and t0 are constants. Results and Conclusion: From the application of the sigmoid function to dynamic PET data, it is concluded that Nth and Nmax for tissue and tumor are identified with the parameter constants of the sigmoid function in Equation (2) as: Equation (3). And the calculated CBE factor values obtained from Equation (1), with Nth/Nmax.
- Published
- 2015
11. Deterministic Parsing Model of the Compound Biological Effectiveness (CBE) Factor for Intracellular 10Boron Distribution in Boron Neutron Capture Therapy
- Author
Shintaro Ishiyama
- Subjects
Physics ,Neutron capture ,chemistry ,Distribution (number theory) ,Factor (programming language) ,chemistry.chemical_element ,Statistical physics ,Boron ,Deterministic parsing ,computer ,Eigenvalues and eigenvectors ,computer.programming_language
- Abstract
Purpose: In defining the biological effects of the 10B(n, α)7Li neutron capture reaction, we have previously developed a deterministic parsing model to determine the Compound Biological Effectiveness (CBE) factor in Borono-Phenyl-Alanine (BPA)-mediated Boron Neutron Capture Therapy (BNCT). In the present paper, we demonstrate that the CBE factor is directly and unambiguously derivable by the new formula for any case of intracellular 10Boron (10B) distribution, which is founded on this model for tissues and tumor. Method: To determine the CBE factor, we derive the following new calculation formula founded on the deterministic parsing model with three constants, CBE0, F, n and the eigenvalue Nth/Nmax.
- Published
- 2014
12. Unsupervised dependency parsing without training
- Author
Anders Søgaard
- Subjects
Linguistics and Language ,Parsing ,Dependency (UML) ,business.industry ,Computer science ,Multivalued dependency ,computer.software_genre ,Language and Linguistics ,Grammar induction ,Ranking ,Artificial Intelligence ,Dependency grammar ,Artificial intelligence ,Deterministic parsing ,business ,Centrality ,computer ,Software ,Natural language processing
- Abstract
Usually unsupervised dependency parsers try to optimize the probability of a corpus by revising the dependency model that is assumed to have generated the corpus. In this paper we explore a different view in which a dependency structure is, among other things, a partial order on the nodes in terms of centrality or saliency. Under this assumption we directly model centrality and derive dependency trees from the ordering of words. The result is an approach to unsupervised dependency parsing that is very different from standard ones in that it requires no training data. The input words are ordered by centrality, and a parse is derived from the ranking using a simple deterministic parsing algorithm, relying on the universal dependency rules defined by Naseem et al. (Naseem, T., Chen, H., Barzilay, R., Johnson, M. 2010. Using universal linguistic knowledge to guide grammar induction. In Proceedings of Empirical Methods in Natural Language Processing, Boston, MA, USA, pp. 1234–44.). Our approach is evaluated on data from twelve different languages and is remarkably competitive.
- Published
- 2012
13. Deterministic shift-reduce parsing for unification-based grammars
- Author
Hiroshi Nakagawa, Takuya Matsuzaki, Nobuyuki Shimizu, and Takashi Ninomiya
- Subjects
Linguistics and Language ,Parsing ,Programming language ,Computer science ,Parsing expression grammar ,computer.software_genre ,Top-down parsing ,Language and Linguistics ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Artificial Intelligence ,S-attributed grammar ,L-attributed grammar ,Deterministic parsing ,computer ,Software ,Bottom-up parsing
- Abstract
Many parsing techniques assume the use of a packed parse forest to enable efficient and accurate parsing. However, they suffer from an inherent problem that derives from the restriction of locality in the packed parse forest. Deterministic parsing is one solution that can achieve simple and fast parsing without the mechanisms of the packed parse forest by accurately choosing search paths. We propose new deterministic shift-reduce parsing and its variants for unification-based grammars. Deterministic parsing cannot simply be applied to unification-based grammar parsing, which often fails because of its hard constraints. Therefore, this is developed by using default unification, which almost always succeeds in unification by overwriting inconsistent constraints in grammars.
- Published
- 2010
14. Reliability analysis of tunnel surrounding rock stability by Monte-Carlo method
- Author
Geng-she Yang and Jia-mi Xi
- Subjects
Engineering ,Theory of relativity ,Basis (linear algebra) ,business.industry ,Monte Carlo method ,Energy Engineering and Power Technology ,Structural engineering ,Geotechnical Engineering and Engineering Geology ,Deterministic parsing ,business ,Stability (probability) ,Reliability (statistics) ,Physics::Geophysics
- Abstract
This paper discusses the advantages of an improved Monte-Carlo method and the feasibility of applying the proposed approach to reliability analysis of tunnel surrounding rock stability. On the basis of deterministic parsing for the tunnel surrounding rock, a reliability computing method for surrounding rock stability is derived from the improved Monte-Carlo method. The computing method accounts for the randomness of the related parameters and therefore respects the relationships among them. The proposed method can reasonably determine the reliability of surrounding rock stability. Calculation results show that it is a sound method for assessing and checking surrounding rock stability.
- Published
- 2008
15. From Ambiguous Regular Expressions to Deterministic Parsing Automata
- Author
Luca Breveglieri, Angelo Morzenti, Stefano Crespi Reghizzi, and Angelo Borsotti
- Subjects
Tree (data structure) ,Generator (computer programming) ,Theoretical computer science ,Parsing ,Syntax (programming languages) ,Computer science ,Programming language ,Regular expression ,computer.software_genre ,Deterministic parsing ,Abstract syntax tree ,computer ,Testability
- Abstract
This new parser generator for ambiguous regular expressions (RE) formally extends the Berry-Sethi (BS) algorithm into a finite-state device that specifies the syntax tree(s). We extend the local testability property of the marked RE’s from terminal strings to linearized syntax trees. The generator supports disambiguation, i.e., selecting a preferred tree in case of ambiguity. The selection is parametric with respect to the Greedy or POSIX criterion. The parser is proved correct and has linear-time complexity. The generator is available as an interactive SW tool (on GitHub - see http://github.com/breveglieri/ebs/README).
- Published
- 2015
16. Parallel LL parsing
- Author
Ladislav Vagner and Bořivoj Melichar
- Subjects
TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES ,Theoretical computer science ,Computer science ,Computer Networks and Communications ,Parsing expression grammar ,Top-down parsing ,Canonical LR parser ,LL grammar ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,LL parser ,Deterministic parsing ,Algorithm ,Software ,Bottom-up parsing ,Information Systems
- Abstract
A deterministic parallel LL parsing algorithm is presented. The algorithm is based on a transformation from a parsing problem to parallel reduction. First, a nondeterministic version of a parallel LL parser is introduced. Then, it is transformed into the deterministic version—the LLP parser. The deterministic LLP(q,k) parser uses two kinds of information to select the next operation — a lookahead string of length up to k symbols and a lookback string of length up to q symbols. Deterministic parsing is available for LLP grammars, a subclass of LL grammars. Since the presented deterministic and nondeterministic parallel parsers are both based on parallel reduction, they are suitable for most parallel architectures.
- Published
- 2006
17. Extracting Partial Parsing Rules from Tree-Annotated Corpus: Toward Deterministic Global Parsing
- Author
Kong Joo Lee, Myung-Seok Choi, Key-Sun Choi, and Gil Chang Kim
- Subjects
Computer science ,media_common.quotation_subject ,Top-down parsing ,computer.software_genre ,Lexicon ,Parser combinator ,Artificial Intelligence ,Electrical and Electronic Engineering ,Phrase structure grammar ,media_common ,Parsing ,Grammar ,business.industry ,Parsing expression grammar ,Syntax ,Substring ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Hardware and Architecture ,Top-down parsing language ,S-attributed grammar ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Software ,Natural language ,Natural language processing ,Sentence ,Bottom-up parsing
- Abstract
It is not always possible to find a global parse for an input sentence owing to problems such as errors of a sentence, incompleteness of lexicon and grammar. Partial parsing is an alternative approach to respond to these problems. Partial parsing techniques try to recover syntactic information efficiently and reliably by sacrificing completeness and depth of analysis. One of the difficulties in partial parsing is how the grammar might be automatically extracted. In this paper we present a method of automatically extracting partial parsing rules from a tree-annotated corpus using the decision tree method. Our goal is deterministic global parsing using partial parsing rules, in other words, to extract partial parsing rules with higher accuracy and broader expansion. First, we define a rule template that enables to learn a subtree for a given substring, so that the resultant rules can be more specific and stricter to apply. Second, rule candidates extracted from a training corpus are enriched with contextual and lexical information using the decision tree method and verified through cross-validation. Last, we underspecify non-deterministic rules by merging substructures with ambiguity in those rules. The learned grammar is similar to phrase structure grammar with contextual and lexical information, but allows building structures of depth one or more. Thanks to automatic learning, the partial parsing rules can be consistent and domain-independent. Partial parsing with this grammar processes an input sentence deterministically using longest-match heuristics, and recursively applies rules to an input sentence. The experiments showed that the partial parser using automatically extracted rules is not only accurate and efficient but also achieves reasonable coverage for Korean.
- Published
- 2005
18. Efficient Semi-Deterministic Parsing for Korean Using Lexical Co-Occurrence Data from a Corpus
- Author
Juntae Yoon
- Subjects
Dependency (UML) ,Parsing ,Computer science ,business.industry ,media_common.quotation_subject ,Speech recognition ,Association (object-oriented programming) ,Ambiguity ,computer.software_genre ,Noun phrase ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Association value ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Natural language processing ,Sentence ,media_common
- Abstract
This paper presents an efficient parsing method for Korean using statistical information extracted from a corpus. Structural ambiguity commonly occurs while deciding dependency relations between words in Korean sentences. To resolve the ambiguity, lexical association between words plays an important role in figuring out the correct dependency. Our parser uses statistical co-occurrence data to compute the lexical association. In addition, we define the global association table (GAT) which enables a global management of the associations. Using the GAT, the parser has an overall configuration of dependency relations between words, thus can analyze a sentence almost deterministically. That is, our system can be viewed as a semi-deterministic parser, which is controlled not by the condition-action rule but by the association value between phrases. Furthermore, the unknown grammatical case of a noun phrase caused by the auxiliary postposition in Korean can be effectively resolved using lexical co-occurrences in our system.
- Published
- 2002
19. A Practical GLR Parser Generator for Software Reverse Engineering
- Author
Teng Geng, Changqing Lai, Zhibo Chen, Wei Meng, Fu Xu, and Han Mei
- Subjects
Computer Networks and Communications ,Computer science ,Programming language ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Recursive descent parser ,Top-down parsing ,computer.software_genre ,Canonical LR parser ,Simple LR parser ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,GLR parser ,Software_PROGRAMMINGLANGUAGES ,Deterministic parsing ,LALR parser ,computer
- Abstract
Traditional parser generators use deterministic parsing methods. These methods cannot effectively meet the parsing requirements of software reverse engineering. A new parser generator is presented which can generate a GLR parser with automatic error recovery. The generated GLR parser has a parsing speed comparable to that of the traditional LALR(1) parser and can be used for parsing in software reverse engineering.
- Published
- 2014
20. The PAPAGENO Parallel-Parser Generator
- Author
Dino Mandrioli, Alessandro Barenghi, Stefano Crespi Reghizzi, Matteo Pradella, and Federica Panella
- Subjects
Multi-core processor ,Parsing ,Generator (computer programming) ,Computer science ,Parallel computing ,Parser generation ,computer.software_genre ,Field (computer science) ,Order of operations ,Rule-based machine translation ,Parallel Parsing ,Operator Precedence Grammars ,Compiler ,Deterministic parsing ,computer
- Abstract
The increasing use of multicore processors has deeply transformed computing paradigms and applications. The wide availability of multicore systems had an impact also in the field of compiler technology, although the research on deterministic parsing did not prove to be effective in exploiting the architectural advantages, the main impediment being the inherent sequential nature of traditional LL and LR algorithms. We present PAPAGENO, an automated parser generator relying on operator precedence grammars. We complemented the PAPAGENO-generated parallel parsers with parallel lexing techniques, obtaining near-linear speedups on multicore machines, and the same speed as Bison parsers on sequential execution.
- Published
- 2014
21. Deterministic Statistical Mapping of Sentences to Underspecified Semantics
- Author
Pi-Chuan Chang, Hiyan Alshawi, and Michael Ringgaard
- Subjects
Dependency (UML) ,business.industry ,Computer science ,Treebank ,Statistical model ,Semantics ,computer.software_genre ,Semantic mapping ,Dependency grammar ,Artificial intelligence ,business ,Deterministic parsing ,computer ,Natural language processing ,Natural language
- Abstract
We present a method for training a statistical model for mapping natural language sentences to semantic expressions. The semantics are expressions of an underspecified logical form that has properties making it particularly suitable for statistical mapping from text. An encoding of the semantic expressions into dependency trees with automatically generated labels allows application of existing methods for statistical dependency parsing to the mapping task (without the need for separate traditional dependency labels or parts of speech). The encoding also results in a natural per-word semantic-mapping accuracy measure. We report on the results of training and testing statistical models for mapping sentences of the Penn Treebank into the semantic expressions, for which per-word semantic mapping accuracy ranges between 79% and 86% depending on the experimental conditions. The particular choice of algorithms used also means that our trained mapping is deterministic (in the sense of deterministic parsing), paving the way for large-scale text-to-semantic mapping.
- Published
- 2014
22. Uniquely parsable array grammars for generating and parsing connected patterns
- Author
Katsunobu Imai and Kenichi Morita
- Subjects
Parsing ,Theoretical computer science ,Grammar ,Computer science ,Backtracking ,media_common.quotation_subject ,computer.software_genre ,Syntactic pattern recognition ,Cellular automaton ,Rule-based machine translation ,Artificial Intelligence ,Deterministic automaton ,Signal Processing ,Formal language ,Computer Vision and Pattern Recognition ,Deterministic parsing ,computer ,Algorithm ,Software ,media_common
- Abstract
A uniquely parsable array grammar (UPAG) introduced by Yamamoto and Morita is a special kind of isometric array grammar (IAG) in which parsing can be performed without backtracking. Hence, we can use a UPAG as an efficient syntactic pattern recognition mechanism, if the pattern set is properly described by a UPAG. In this paper, we investigate the problem of describing and recognizing the set of all connected patterns using a UPAG formalism. As for the recognition of connected patterns, Beyer showed an efficient algorithm that operates on cellular automata. We show that his algorithm can be expressed very simply in the UPAG framework, and give two kinds of simple UPAGs that generate the set of all connected patterns.
- Published
- 1999
23. Grammar partitioning and modular deterministic parsing
- Author
-
Giuseppe Psaila and Stefano Crespi Reghizzi
- Subjects
Theoretical computer science ,Parsing ,General Computer Science ,Computer science ,Programming language ,LR parser ,Deterministic context-free grammar ,Context-free grammar ,computer.software_genre ,Canonical LR parser ,LL grammar ,LALR parser ,Deterministic parsing ,computer - Abstract
Complex languages are often modularized into sublanguages and the compiler is accordingly organized as a set of separate modules. Modularization (called federalization) is beneficial for beating complexity, for maintenance, and for reuse. Focusing on syntax analysis, we consider the decomposition of a grammar into deterministic subgrammars. We study three conditions for determinism in grammar partitioning: first using homogeneous modules of the LR(1) or LL(1) kind; then using heterogeneous modules (LR(1) or LL(1)). Federalization slightly decreases the generality of LR(1) parsers, but not of LL(1) ones, and it allows to handle some grammars which are not LALR(1). Experimental results show that LR(1) federal automata have fewer (up to 60%) states than monolithic LR(1) automata. Criteria for modularization, practical experiences and hints to semantic decomposition issues conclude the paper.
- Published
- 1998
24. Parsing Partially Ordered Multisets
- Author
-
Twan Basten, Mathematics and Computer Science, and Formal Methods
- Subjects
Multiset ,Theoretical computer science ,Parsing ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,computer.software_genre ,Top-down parsing ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Computer Science (miscellaneous) ,Top-down parsing language ,S-attributed grammar ,Deterministic parsing ,Algorithm ,computer ,Mathematics ,Bottom-up parsing - Abstract
A partially ordered multiset or pomset is a generalization of a string in which the total order has been relaxed to a partial order. Strings are often used as a model for sequential computation; pomsets are a natural model for parallel and distributed computation. By viewing pomsets as a generalization of strings, the question is raised whether concepts from language theory can be generalized to pomsets. An important area in the theory of languages is parsing theory. This paper develops the fundamentals of a parsing theory for pomsets, called PLR parsing. It is based on the LR-parsing technique, which is the most powerful deterministic parsing technique in language theory. The basic algorithm in the class of PLR parsing algorithms, the PLR(0) algorithm is explained in detail.
- Published
- 1997
25. Operator Precedence ω-Languages
- Author
-
Matteo Pradella, Dino Mandrioli, Violetta Lonati, and Federica Panella
- Subjects
Structure (mathematical logic) ,Theoretical computer science ,Syntax (programming languages) ,Infinite-state model checking ,Programming language ,Computer science ,Pushdown automaton ,Closure (topology) ,ω-languages ,computer.software_genre ,Notation ,Order of operations ,Closure properties ,Regular language ,Operator precedence languages ,Deterministic parsing ,computer - Abstract
Recent literature extended the analysis of ω-languages from the regular ones to various classes of languages with “visible syntax structure”, such as visibly pushdown languages (VPLs). Operator precedence languages (OPLs), instead, were originally defined to support deterministic parsing and exhibit interesting relations with these classes of languages: OPLs strictly include VPLs, enjoy all relevant closure properties and have been characterized by a suitable automata family and a logic notation. We introduce here operator precedence ω-languages (ωOPLs), investigating various acceptance criteria and their closure properties. Whereas some properties are natural extensions of those holding for regular languages, others require novel investigation techniques. Application-oriented examples show the gain in expressiveness and verifiability offered by ωOPLs w.r.t. smaller classes.
- Published
- 2013
26. E-parser: An implementation of a deterministic GB-related parsing system
- Author
-
Torbjørn Nordgård
- Subjects
Theoretical computer science ,Parsing ,Workstation ,Syntax (programming languages) ,Programming language ,Computer science ,General Social Sciences ,Top-down parsing ,computer.software_genre ,law.invention ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,law ,S-attributed grammar ,Deterministic parsing ,computer ,Bottom-up parsing - Abstract
This paper describes an implementation of a deterministic parsing system, described in Nordgård (1993). The syntax of “heuristic rules” and how the rules interact with the basic operations of the parser constitute the bulk of the article. The implementation is written in Medley Interlisp, and the system can be run on Sun or Xerox workstations.
- Published
- 1994
27. Verifiable Parse Table Composition for Deterministic Parsing
- Author
-
August Schwerdfeger and Eric Van Wyk
- Subjects
Source code ,Theoretical computer science ,Parsing ,Programming language ,Computer science ,media_common.quotation_subject ,computer.software_genre ,Top-down parsing ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Table (database) ,Deterministic parsing ,LALR parser ,computer ,media_common ,Bottom-up parsing - Abstract
One obstacle to the implementation of modular extensions to programming languages lies in the problem of parsing extended languages. Specifically, the parse tables at the heart of traditional LALR(1) parsers are so monolithic and tightly constructed that, in the general case, it is impossible to extend them without regenerating them from the source grammar. Current extensible frameworks employ a variety of solutions, ranging from a full regeneration to using pluggable binary modules for each different extension. But recompilation is time-consuming, while the pluggable modules in many cases cannot support the addition of more than one extension, or use backtracking or non-deterministic parsing techniques. We present here a middle-ground approach that allows an extension, if it meets certain restrictions, to be compiled into a parse table fragment. The host language parse table and fragments from multiple extensions can then always be efficiently composed to produce a conflict-free parse table for the extended language. This allows for the distribution of deterministic parsers for extensible languages in a pre-compiled format, eliminating the need for the “source code” grammar to be distributed. In practice, we have found these restrictions to be reasonable and admit many useful language extensions.
- Published
- 2010
28. An ungreedy Chinese deterministic dependency parser considering long-distance dependency
- Author
-
Wenlin Yao, Lingling Gao, and Lei Wang
- Subjects
Parsing ,Dependency (UML) ,business.industry ,Computer science ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,computer.software_genre ,Top-down parsing ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Dependency grammar ,S-attributed grammar ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Natural language processing ,Bottom-up parsing - Abstract
This paper presents a two-step dependency parser to parse Chinese deterministically. By dividing a sentence into two parts and parsing them separately, the error accumulation can be avoided effectively. Previous works on shift-reduce dependency parser may guarantee the greedy characteristic of deterministic parsing less. This paper improves on a kind of deterministic dependency parsing method to weaken the greedy characteristic of it. During parsing, both forward and backward parsing directions are chosen to decrease the unparsed rate. Support vector machines are utilized to determine the word dependency relations and in order to solve the problem of long distance dependency, a group of combined global features are presented in this paper. The proposed parser achieved significant improvement on dependency accuracy and root accuracy.
- Published
- 2008
29. Apply a Rough Set-Based Classifier to Dependency Parsing
- Author
-
Yangsheng Ji, Ruoce Ma, Xinyu Dai, and Lin Shang
- Subjects
Parsing ,Computer science ,business.industry ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,Top-down parsing ,computer.software_genre ,Machine learning ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Dependency grammar ,S-attributed grammar ,Rough set ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Natural language processing ,Bottom-up parsing - Abstract
A rough set-based semi-naive Bayesian classification method is applied to dependency parsing, which is an important task in syntactic structure analysis of natural language processing. Many parsing algorithms have emerged combined with statistical machine learning techniques. The rough set-based classifier is embedded with Nivre's deterministic parsing algorithm to conduct dependency parsing task on a Chinese corpus. Experimental results show that the method has a good performance on dependency parsing task. Moreover, the experiments have justified the effectiveness of the classification influence.
- Published
- 2008
30. Japanese dependency parsing using a tournament model
- Author
-
Masayuki Asahara, Yuji Matsumoto, and Masakazu Iwatate
- Subjects
Text corpus ,Parsing ,business.industry ,Computer science ,Probabilistic logic ,computer.software_genre ,Top-down parsing ,Dependency grammar ,Tournament ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Preference (economics) ,Natural language processing - Abstract
In Japanese dependency parsing, Kudo's relative preference-based method (Kudo and Matsumoto, 2005) outperforms both deterministic and probabilistic CKY-based parsing methods. In Kudo's method, for each dependent word (or chunk) a log-linear model estimates relative preference of all other candidate words (or chunks) for being as its head. This cannot be considered in the deterministic parsing methods. We propose an algorithm based on a tournament model, in which the relative preferences are directly modeled by one-on-one games in a step-ladder tournament. In an evaluation experiment with Kyoto Text Corpus Version 4.0, the proposed method outperforms previous approaches, including the relative preference-based method.
- Published
- 2008
31. The Data-Oriented Parsing Approach: Theory and Application
- Author
-
Bod, R., Fulcher, J., Jain, L.C., and Language and Computation (ILLC, FNWI/FGw)
- Subjects
Parsing ,Computer science ,business.industry ,computer.software_genre ,Top-down parsing ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Top-down parsing language ,S-attributed grammar ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Natural language processing ,Bottom-up parsing ,Data-oriented parsing - Abstract
Parsing models have many applications in AI, ranging from natural language processing (NLP) and computational music analysis to logic programming and computational learning. Broadly conceived, a parsing model seeks to uncover the underlying structure of an input, that is, the various ways in which elements of the input combine to form phrases or constituents and how those phrases recursively combine to form a tree structure for the whole input. During the last fifteen years, a major shift has taken place from rule-based, deterministic parsing to corpus-based, probabilistic parsing. A quick glance over the NLP literature from the last ten years, for example, indicates that virtually all natural language parsing systems are currently probabilistic. The same development can be observed in (stochastic) logic programming and (statistical) relational learning. This trend towards probabilistic parsing is not surprising: the increasing availability of very large collections of text, music, images and the like allow for inducing statistically motivated parsing systems from actual data.
- Published
- 2008
32. DEALING WITH AMBIGUITIES IN ENGLISH CONJUNCTIONS AND COMPARATIVES BY A DETERMINISTIC PARSER
- Author
-
Von-Wun Soo and Rey-Long Liu
- Subjects
Parsing ,Computer science ,business.industry ,media_common.quotation_subject ,Ellipsis (linguistics) ,Ambiguity ,computer.software_genre ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Artificial Intelligence ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Pattern matching ,Computational linguistics ,business ,Deterministic parsing ,computer ,Software ,Natural language ,Sentence ,Natural language processing ,media_common - Abstract
The major problems in parsing English conjunctions and comparatives are ambiguities of scoping and ellipsis. Scoping ambiguities occur when a parser cannot deterministically detect boundaries of constituents, while ellipsis ambiguities occur when a parser cannot deterministically detect missing components. Since simple lookahead mechanisms cannot collect adequate information to resolve these ambiguities, a parsing strategy that only employs such mechanisms will need to backtrack each time it makes incorrect assumptions. In this paper, we extend the Wait-And-See strategy to parse conjunctions and comparatives deterministically and simultaneously. Several mechanisms, such as bottom-up preparsing, suspension, and pattern matching, are implemented. The bottom-up preparsing accesses the dictionary and recognizes isolated sentence fragments which can be determined without ambiguities. The suspension, which is different from Marcus’s attention shifting, allows the parser to suspend temporally at ambiguous points and continue to parse the rest of the sentence until it obtains the necessary information to resolve the ambiguities. Pattern matching uses the concept of symmetry to detect missing components (the ellipses) in the two conjoined or compared sentence fragments.
- Published
- 1990
33. Top-Down Deterministic Parsing of Languages Generated by CD Grammar Systems
- Author
-
György Vaszil and Henning Bordihn
- Subjects
Parsing ,Programming language ,business.industry ,Computer science ,Parsing expression grammar ,computer.software_genre ,Top-down parsing ,LL grammar ,S-attributed grammar ,Top-down parsing language ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Natural language processing ,Bottom-up parsing - Abstract
The paper extends the notion of context-free LL(k) grammars to CD grammar systems using two different derivation modes, examines some of the properties of the resulting language families, and studies the possibility of parsing these languages deterministically, without backtracking.
- Published
- 2007
34. A three-step deterministic parser for Chinese dependency parsing
- Author
-
Sadao Kurohashi, Kun Yu, and Hao Liu
- Subjects
Parsing ,Computer science ,business.industry ,Speech recognition ,Parsing expression grammar ,Recursive descent parser ,computer.software_genre ,Top-down parsing ,Canonical LR parser ,Simple LR parser ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Dependency grammar ,GLR parser ,LL parser ,Top-down parsing language ,S-attributed grammar ,Artificial intelligence ,Deterministic parsing ,LALR parser ,business ,computer ,Natural language processing ,Sentence ,Bottom-up parsing - Abstract
This paper presents a three-step dependency parser to parse Chinese deterministically. By dividing a sentence into several parts and parsing them separately, it aims to reduce the error propagation coming from the greedy characteristic of deterministic parsing. Experimental results showed that compared with the deterministic parser which parsed a sentence in sequence, the proposed parser achieved extremely significant improvement on dependency accuracy.
- Published
- 2007
35. Dependency parsing based on dynamic local optimization
- Author
-
Ting Liu, Jinshan Ma, Sheng Li, and Huijia Zhu
- Subjects
TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parsing ,Theoretical computer science ,Parser combinator ,Computer science ,Memoization ,S-attributed grammar ,Parsing expression grammar ,Deterministic parsing ,computer.software_genre ,Top-down parsing ,computer ,Bottom-up parsing - Abstract
This paper presents a deterministic parsing algorithm for projective dependency grammar. In a bottom-up way the algorithm finds the local optimum dynamically. A constraint procedure is made to use more structure information. The algorithm parses sentences in linear time and labeling is integrated with the parsing. This parser achieves 63.29% labeled attachment score on the average in CoNLL-X Shared Task.
- Published
- 2006
36. A best-first probabilistic shift-reduce parser
- Author
-
Alon Lavie and Kenji Sagae
- Subjects
Parsing ,Computer science ,LR parser ,business.industry ,Shift-reduce parser ,Treebank ,Parsing expression grammar ,Top-down parsing ,computer.software_genre ,Canonical LR parser ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Top-down parsing language ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Natural language processing ,Generative grammar - Abstract
Recently proposed deterministic classifier-based parsers (Nivre and Scholz, 2004; Sagae and Lavie, 2005; Yamada and Matsumoto, 2003) offer attractive alternatives to generative statistical parsers. Deterministic parsers are fast, efficient, and simple to implement, but generally less accurate than optimal (or nearly optimal) statistical parsers. We present a statistical shift-reduce parser that bridges the gap between deterministic and probabilistic parsers. The parsing model is essentially the same as one previously used for deterministic parsing, but the parser performs a best-first search instead of a greedy search. Using the standard sections of the WSJ corpus of the Penn Treebank for training and testing, our parser has 88.1% precision and 87.8% recall (using automatically assigned part-of-speech tags). Perhaps more interestingly, the parsing model is significantly different from the generative models used by other well-known accurate parsers, allowing for a simple combination that produces precision and recall of 90.9% and 90.7%, respectively.
- Published
- 2006
37. Discriminative classifiers for deterministic dependency parsing
- Author
-
Johan Hall, Joakim Nivre, and Jens Nilsson
- Subjects
Parsing ,business.industry ,Memoization ,Computer science ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,computer.software_genre ,Machine learning ,Top-down parsing ,Parser combinator ,Discriminative model ,Dependency grammar ,S-attributed grammar ,Artificial intelligence ,business ,Deterministic parsing ,computer ,Natural language processing ,Bottom-up parsing - Abstract
Deterministic parsing guided by treebank-induced classifiers has emerged as a simple and efficient alternative to more complex models for data-driven parsing. We present a systematic comparison of memory-based learning (MBL) and support vector machines (SVM) for inducing classifiers for deterministic dependency parsing, using data from Chinese, English and Swedish, together with a variety of different feature models. The comparison shows that SVM gives higher accuracy for richly articulated feature models across all languages, albeit with considerably longer training times. The results also confirm that classifier-based deterministic parsing can achieve parsing accuracy very close to the best results reported for more complex parsing models.
- Published
- 2006
38. Directly-Executable Earley Parsing
- Author
-
R. Nigel Horspool and John Aycock
- Subjects
TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Programming language ,Computer science ,LR parser ,S-attributed grammar ,Parsing expression grammar ,Deterministic parsing ,computer.software_genre ,Top-down parsing ,computer ,Bottom-up parsing ,Earley parser - Abstract
Deterministic parsing techniques are typically used in favor of general parsing algorithms for efficiency reasons. However, general algorithms such as Earley's method are more powerful and also easier for developers to use, because no seemingly arbitrary restrictions are placed on the grammar. We describe how to narrow the performance gap between general and deterministic parsers, constructing a directly-executable Earley parser that can reach speeds comparable to deterministic methods even on grammars for commonly-used programming languages.
- Published
- 2001
39. A Deterministic Shift-Reduce Parser Generator for a Logic Programming Language
- Author
-
Chuck Liang
- Subjects
Memoization ,Computer science ,Optimizing compiler ,Recursive descent parser ,computer.software_genre ,Top-down parsing ,Canonical LR parser ,Rule-based machine translation ,Parser combinator ,LL parser ,Logic programming ,computer.programming_language ,Compiler-compiler ,Parsing ,Syntax (programming languages) ,Programming language ,Deterministic context-free grammar ,Shift-reduce parser ,Parsing expression grammar ,Context-free grammar ,Formal grammar ,Simple LR parser ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Terminal and nonterminal symbols ,GLR parser ,λProlog ,Top-down parsing language ,S-attributed grammar ,Compiler ,L-attributed grammar ,Deterministic parsing ,LALR parser ,computer ,Bottom-up parsing - Abstract
This paper addresses efficient parsing in the context of logical inference for the purpose of using logic programming languages in compiler writing. A bottom-up, deterministic parsing mechanism is formulated for "bounded right context" grammars, a subclass of LR(k) grammars with characteristics amenable to declarative parser specification. A working parser generator for λProlog is described, although the basic parsing mechanism is applicable to logic programming in general.
- Published
- 2000
40. Deterministic parsing for augmented context-free grammars
- Author
-
Stefano Crespi-Reghizzi, Luca Breveglieri, and Alessandra Cherubini
- Subjects
Theoretical computer science ,Computer science ,Context-sensitive grammar ,Recursive descent parser ,computer.software_genre ,Top-down parsing ,Parser combinator ,Rule-based machine translation ,Formal language ,LL parser ,Indexed grammar ,Phrase structure grammar ,Parsing ,Augmented transition network ,Programming language ,Deterministic context-free grammar ,Parsing expression grammar ,Context-free grammar ,Tree-adjoining grammar ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Extended Affix Grammar ,Ambiguous grammar ,Stochastic context-free grammar ,S-attributed grammar ,Top-down parsing language ,L-attributed grammar ,Deterministic parsing ,computer ,Generative grammar ,Bottom-up parsing - Abstract
In contrast to the usual depth-first derivations of context-free (CF) grammars, breadth-first derivations (also in combination with depth-first ones) yield a class of augmented context-free grammars (ACF) (also termed multi-breadth-depth grammars) endowed with greater generative capacity, yet manageable. The inadequacy of CF grammars to treat distant dependencies is overcome by the new model. ACF grammars can be classified with respect to their disposition, a concept related to the data structure needed to parse their strings. For such augmented CF grammars we consider the LL(k) condition, that ensures top-down deterministic parsing. We restate the condition as an adjacency problem and we prove that it is decidable for any disposition. The deterministic linear-time parser differs from a recursive descent parser by using instead of a LIFO stack a more general data structure, involving FIFO queues and LIFO stacks in accordance with the disposition. ACF grammars can be also viewed as a formalized version of ATN (Augmented Transition Networks).
- Published
- 1995
41. Dependency Parsing of Turkish
- Author
-
Joakim Nivre, Kemal Oflazer, and Gülşen Eryiğit
- Subjects
FOS: Computer and information sciences ,Linguistics and Language ,Turkish ,Computer science ,Memoization ,P Philology. Linguistics ,Top-down parsing ,computer.software_genre ,Language and Linguistics ,200402 Computational Linguistics ,Parser combinator ,Artificial Intelligence ,Dependency grammar ,QA Mathematics ,QA075 Electronic computers. Computer science ,Parsing ,business.industry ,language.human_language ,Computer Science Applications ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Applied Computer Science ,QA076 Computer software ,language ,FOS: Languages and literature ,Top-down parsing language ,S-attributed grammar ,Artificial intelligence ,Deterministic parsing ,business ,80107 Natural Language Processing ,computer ,Syntactic parsing ,Natural language processing ,Bottom-up parsing - Abstract
The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical units called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We test our claim on two different parsing methods, one based on a probabilistic model with beam search and the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of the parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.
- Published
- 2008
42. Deterministic parsing and linguistic explanation
- Author
-
Amy Weinberg and Robert C. Berwick
- Subjects
Linguistics and Language ,Grammar ,Computer science ,business.industry ,media_common.quotation_subject ,Experimental and Cognitive Psychology ,computer.software_genre ,Top-down parsing ,Language and Linguistics ,Education ,Artificial intelligence ,business ,Deterministic parsing ,computer ,Natural language processing ,media_common ,Bottom-up parsing - Published
- 1985
43. Analyses of deterministic parsing algorithms
- Author
-
Jacques Cohen and Martin Roth
- Subjects
Parsing ,General Computer Science ,Grammar ,LR parser ,Computer science ,media_common.quotation_subject ,Parsing expression grammar ,Recursive descent parser ,computer.software_genre ,Top-down parsing ,Canonical LR parser ,Simple LR parser ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Parser combinator ,Top-down parsing language ,Deterministic parsing ,computer ,Algorithm ,media_common - Abstract
This paper describes an approach for determining the minimum, maximum, and average times to parse sentences acceptable by a deterministic parser. These quantities are presented in the form of symbolic formulas, called time-formulas. The variables in these formulas represent not only the length of the input string but also the time to perform elementary operations such as pushing, popping, subscripting, iterating, etc. By binding to the variables actual numerical values corresponding to a given compiler-machine configuration, one can determine the execution time for that configuration. Time-formulas are derived by examining the grammar rules and the program representing the algorithm one wishes to analyze. The approach is described by using a specific grammar that defines simple arithmetic expressions. Two deterministic parsers are analyzed: a top-down recursive descent LL(1) parser, and a bottom-up SLR(1) parser. The paper provides estimates for the relative efficiencies of the two parsers. The estimates applicable to a specific machine, the PDP-10, are presented and substantiated by benchmarks. Finally, the paper illustrates the proposed approach by applying it to the analyses of parsers for a simple programming language.
- Published
- 1978
44. Deterministic parsing and subjacency
- Author
-
Janet Dean Fodor
- Subjects
Linguistics and Language ,Parsing ,Computer science ,Subjacency ,Experimental and Cognitive Psychology ,computer.software_genre ,Determinism ,Language and Linguistics ,Linguistics ,Education ,Constraint (information theory) ,Mechanism (philosophy) ,Deterministic parsing ,computer ,Natural language ,Sentence - Abstract
It has previously been claimed that a deterministic model of the human sentence parsing mechanism provides an explanation for the existence of, and some of the properties of, the subjacency constraint on natural languages. The present paper argues that the empirical arguments offered in support of these claims are flawed, and that in any case the explanatory relationship between determinism and subjacency is weak.
- Published
- 1985
45. Global Context Recovery: A New Strategy for Syntactic Error Recovery by Table-Driven Parsers
- Author
-
Richard B. Kieburtz and Ajit B. Pai
- Subjects
Scheme (programming language) ,Parsing ,Computer science ,Context (language use) ,Pascal (programming language) ,computer.software_genre ,Set (abstract data type) ,Table (database) ,Deterministic parsing ,Fiducial marker ,Algorithm ,computer ,Software ,computer.programming_language - Abstract
Described is a method for syntactic error recovery that is compatible with deterministic parsing methods and that is able to recover from many errors more quickly than do other schemes because it performs global context recovery. The method relies on fiducial symbols, which are typically reserved key words of a language, to provide mileposts for error recovery. The method has been applied to LL(1) parsers, for which a detailed algorithm is given, and informally proved correct. The algorithm will always recover and return control to the parser if the text being analyzed satisfies only minimal requirements: that it contains one or more occurrences of fiducial symbols following the point at which an error is detected. Tables needed for error recovery have been automatically generated, along with parsing tables, by a parser constructor for the LL(1) grammars. A theoretical characterization of fiducial symbols is given, and the utility of this characterization in practice is discussed. It has been applied to a grammar for the programming language Pascal to aid in selection of a set of fiducial symbols. The error recovery scheme has been tested on a set of student-written Pascal program texts and is compared with other error recovery strategies.
- Published
- 1980
46. Sentence Disambiguation by a Shift-Reduce Parsing Technique
- Author
-
SRI INTERNATIONAL MENLO PARK CA ARTIFICIAL INTELLIGENCE CENTER and Shieber, Stuart M.
- Abstract
Native speakers of English show definite and consistent preferences for certain readings of syntactically ambiguous sentences. A user of a natural-language processing system would naturally expect it to reflect the same preferences. Thus, such systems must model in some way the linguistic performance as well as the linguistic competence of the native speaker. The authors have developed a parsing algorithm -- a variant of the LALR(1) shift-reduce algorithm -- that models the preference behavior of native speakers for a range of syntactic preference phenomena reported in the psycholinguistic literature, including the recent data on lexical preferences. The algorithm yields the preferred parse deterministically, without building multiple parse trees and choosing among them. As a side effect, it displays appropriate behavior in processing the much discussed garden-path sentences. The parsing algorithm has been implemented and has confirmed the feasibility of this approach to the modeling of these phenomena. Technical Note 281. Sponsored in part by the Defense Advanced Research Projects Agency (DARPA). Published in the Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, 1983. Presented at the Annual Meeting of the Association for Computational Linguistics (21st) held in Boston, MA in June 1983.
- Published
- 1983
47. Simplifying deterministic parsing
- Author
-
Michael J. Freiling and Alan W. Carter
- Subjects
Head-driven phrase structure grammar ,Interface (Java) ,Computer science ,Attribute grammar ,media_common.quotation_subject ,Syntactic predicate ,Emergent grammar ,Operator-precedence grammar ,Mildly context-sensitive grammar formalism ,computer.software_genre ,Top-down parsing ,Task (project management) ,Parser combinator ,Grammar-based code ,Semantic memory ,media_common ,Sequence ,Parsing ,Grammar ,business.industry ,Phrase structure rules ,Syntax ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Interfacing ,Artificial intelligence ,Deterministic parsing ,business ,computer ,Generative grammar ,Natural language processing - Abstract
This paper presents a model for deterministic parsing which was designed to simplify the task of writing and understanding a deterministic grammar. While retaining structures and operations similar to those of Marcus' PARSIFAL parser [Marcus 80], the grammar language incorporates the following changes. (1) The use of productions operating in parallel has essentially been eliminated and instead the productions are organized into sequences. Not only does this improve the understandability of the grammar, it is felt that this organization corresponds more closely to the task of performing the sequence of buffer transformations and attachments required to parse the most common constituent types. (2) A general method for interfacing between the parser and a semantic representation system is introduced. This interface is independent of the particular semantic representation used and hides all details of the semantic processing from the grammar writer. (3) The interface also provides a general method for dealing with syntactic ambiguities which arise from the attachment of optional modifiers such as prepositional phrases. This frees the grammar writer from determining each point at which such ambiguities can occur.
- Published
- 1984
48. Deterministic parsing of syntactic non-fluencies
- Author
-
Donald Hindle
- Subjects
Basis (linear algebra) ,Grammar ,business.industry ,Computer science ,media_common.quotation_subject ,computer.software_genre ,Syntax ,Linguistics ,Artificial intelligence ,business ,Set (psychology) ,Deterministic parsing ,computer ,Natural language ,Natural language processing ,media_common - Abstract
It is often remarked that natural language, used naturally, is unnaturally ungrammatical. Spontaneous speech contains all manner of false starts, hesitations, and self-corrections that disrupt the well-formedness of strings. It is a mystery then, that despite this apparent wide deviation from grammatical norms, people have little difficulty understanding the non-fluent speech that is the essential medium of everyday life. And it is a still greater mystery that children can succeed in acquiring the grammar of a language on the basis of evidence provided by a mixed set of apparently grammatical and ungrammatical strings.
- Published
- 1983
49. LR Grammars and Analysers
- Author
-
James J. Horning
- Subjects
Discrete mathematics ,Class (set theory) ,Parsing ,Rule-based machine translation ,Terminal and nonterminal symbols ,Computer science ,computer.software_genre ,Deterministic parsing ,LALR parser ,computer - Abstract
This chapter is concerned with a family of deterministic parsing techniques based on a method first described by Knuth [1965]. These parsers, and the grammars acceptable to them, share most of the desirable properties of the LL(k) family [Chapter 2.B.]. In addition, the class of LR(k)-parsable grammars is probably the largest class accepted by any currently practical parsing technique. The techniques with which we are mostly concerned are, in order of increasing power, LR(0), SLR(1), LALR(1) and LR(1). Collectively, we call these four techniques the LR family [McKeeman 1970] [Aho 1974].
- Published
- 1974
50. Locally Nondeterministic and Hybrid Syntax Analyzers from Partitioned Two-Level Grammars
- Author
-
Heinz Schmidt and Bernd J. Krämer
- Subjects
Syntax (programming languages) ,Computer science ,Programming language ,Deterministic context-free grammar ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,Context-free grammar ,computer.software_genre ,Nondeterministic algorithm ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Extended Affix Grammar ,Computer Science::Programming Languages ,S-attributed grammar ,L-attributed grammar ,Deterministic parsing ,computer - Abstract
Considerable effort has been devoted to finding restricted classes of syntax directed translators usually being based on deterministic parsing techniques.
- Published
- 1979
Catalog
Discovery Service for Jio Institute Digital Library