Descriptor: "Deterministic parsing" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Deterministic parsing" returned 60 results.

1. Deterministic Parsing with P Colony Automata

Author: Csuhaj-Varjú, Erzsébet; Kántor, Kristóf; and Vaszil, György. Series Editors: Hutchison, David; Kanade, Takeo; Kittler, Josef; Kleinberg, Jon M.; Mattern, Friedemann; Mitchell, John C.; Naor, Moni; Pandu Rangan, C.; Steffen, Bernhard; Terzopoulos, Demetri; Tygar, Doug; and Weikum, Gerhard. Editors: Graciani, Carmen; Riscos-Núñez, Agustín; Păun, Gheorghe; Rozenberg, Grzegorz; and Salomaa, Arto
Published: 2018

2. Dependency Parsing for Arabic Quran using Easy-First Parsing Algorithm

Author: Mochammad Arif Bijaksana, Alfiya El Hafsa, and Arief Fatchul Huda
Subjects: Parsing, Interpretation (logic), Relation (database), Computer science, Head (linguistics), Dependency grammar, Deterministic parsing, computer.software_genre, Algorithm, computer, Sentence, Word (computer architecture)
Abstract: Arabic is the main language of the Al-Quran. Nowadays, many people study the language of the Al-Quran, called Quran Arabic. Beginners need to understand the syntactic relationships in sentences found in the Qur'an; without sufficient understanding, their interpretation may be different and wrong. This is dangerous because the Al-Quran is a source of guidance for Muslims' lives. Dependency parsing is very important for linguistic research, especially for rich languages such as Arabic. This study aims to build a dependency parser that makes it easier to understand the syntactic relationships in sentences. It uses deterministic parsing, specifically shift-reduce parsing with the Easy-First parsing algorithm. The evaluation used the labeled attachment score, computed by comparing the system results against the gold standard; the resulting score was 69.7. In 62 sentences, the correct head and relation were found for every word. No sentence contained more than 3 incorrectly parsed words. The evaluation scores are not high because of the complicated tagset used and the lack of test sentences.
Published: 2020
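
The labeled attachment score mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each token is represented as a (head index, relation label) pair and that the gold and predicted sequences are token-aligned; the function name, the data layout, and the sample labels are hypothetical and are not taken from the paper.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages for token-aligned (head, label) pairs.

    UAS counts tokens whose head is correct; LAS counts tokens whose head
    and relation label are both correct.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    n = len(gold)
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    both_hits = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * head_hits / n, 100.0 * both_hits / n


# Hypothetical four-token sentence: (head index, relation label) per token.
gold = [(2, "subj"), (0, "root"), (2, "obj"), (3, "mod")]
pred = [(2, "subj"), (0, "root"), (2, "mod"), (2, "mod")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.1f} LAS={las:.1f}")
```

In this toy sentence, three of four heads and two of four (head, label) pairs match, so the unlabeled score is higher than the labeled one.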

3. A deterministic parsing algorithm for ambiguous regular expressions

Author: Angelo Borsotti, Luca Breveglieri, Angelo Morzenti, and Stefano Crespi Reghizzi
Subjects: Empty string, regular expression deterministic parsing, Computer Networks and Communications, Computer science, 0102 computer and information sciences, 02 engineering and technology, computer.software_genre, 01 natural sciences, regular expression parsing tool, 0202 electrical engineering, electronic engineering, information engineering, ambiguous regular expression, Regular expression, Time complexity, Parsing, Syntax (programming languages), String (computer science), Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), 020207 software engineering, Tree (graph theory), Berry-Sethi recognizer, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, 010201 computation theory & mathematics, Deterministic parsing, computer, Algorithm, Software, Information Systems
Abstract: We introduce a new parser generator, called Berry–Sethi Parser (BSP), for ambiguous regular expressions (RE). The generator constructs a deterministic finite-state transducer that recognizes an input string, as the classical Berry–Sethi algorithm does, and additionally outputs a linear representation of all the syntax trees of the string; for infinitely ambiguous strings, a policy for selecting representative sets of trees is chosen. To construct the transducer, the RE symbols, including letters, parentheses and other metasymbols, are distinctly numbered, so that the corresponding language becomes locally testable. In this way a deterministic position automaton can be constructed, which recognizes and translates the input into a compact DAG representation of the syntax trees. The correctness of the construction is proved. The transducer operates in linear time on the input. Its descriptive complexity is analyzed as a function of established RE parameters: the alphabetic width, the number of null string symbols and the height of the RE tree. A condition for checking RE ambiguity on the transducer graph is stated. Experimental results of running the parser generator and the parser on a large RE collection are presented. The POSIX RE disambiguation criterion has also been applied to the parser.
Published: 2021

4. Deeply Integrating C11 Code Support into Isabelle/PIDE

Author: Frédéric Tuong and Burkhart Wolff
Subjects: Computer Science - Symbolic Computation, Computer Science - Logic in Computer Science, Computer Science - Programming Languages, Programming language, Computer science, Interface (Java), Framing (World Wide Web), lcsh:Mathematics, HOL, Context (language use), Semantics, computer.software_genre, lcsh:QA1-939, lcsh:QA75.5-76.95, Computer Science - Software Engineering, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Rule-based machine translation, Memory model, lcsh:Electronic computers. Computer science, Deterministic parsing, computer
Abstract: We present a framework for C code in C11 syntax deeply integrated into the Isabelle/PIDE development environment. Our framework provides an abstract interface for verification back-ends to be plugged-in independently. Thus, various techniques such as deductive program verification or white-box testing can be applied to the same source, which is part of an integrated PIDE document model. Semantic back-ends are free to choose the supported C fragment and its semantics. In particular, they can differ on the chosen memory model or the specification mechanism for framing conditions. Our framework supports semantic annotations of C sources in the form of comments. Annotations serve to locally control back-end settings, and can express the term focus to which an annotation refers. Both the logical and the syntactic context are available when semantic annotations are evaluated. As a consequence, a formula in an annotation can refer both to HOL or C variables. Our approach demonstrates the degree of maturity and expressive power the Isabelle/PIDE subsystem has achieved in recent years. Our integration technique employs Lex and Yacc style grammars to ensure efficient deterministic parsing. We present two case studies for the integration of (known) semantic back-ends in order to validate the design decisions for our back-end interface. Comment: In Proceedings F-IDE 2019, arXiv:1912.09611
Published: 2019

5. Uniquely Parsable Unification Grammars and Their Parser Implemented in Prolog.

Author: Lee, Jia, Morita, Kenichi, Asou, Hiroki, and Imai, Katsunobu
Abstract: A uniquely parsable grammar (UPG) introduced by Morita and coworkers is a formal grammar with a restricted type of rewriting rules, where parsing can be performed without backtracking. By extending a UPG, we introduce a uniquely parsable unification grammar (UPUG), and we investigate its applicability to parsing. A unification grammar (UG) is a system such that a sequence of terms is rewritten by a set of rules, and the rewriting process accompanies unification of terms as in Prolog. We first define a general framework of a UG and then give a UPUG-condition so that it has the property of unique parsability. Since the class of UPGs is a subclass of UPUGs and is known to be universal in language generating ability, the class of UPUGs is also universal. We then show a simple parsing method for UPUGs. Based on it, we give a Prolog implementation of a parser which will be useful for natural language analysis and other applications.
Published: 2000

6. A new view on parser combinators

Author: Pieter Koopman and Rinus Plasmeijer
Subjects: Functional programming, Domain-specific language, Parsing, Grammar, Computer science, Programming language, media_common.quotation_subject, computer.software_genre, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Deterministic parsing, Combinatory logic, computer, Type constructor, media_common
Abstract: Parser combinators offer a concise and fast way to produce reasonably efficient parsers. The combinator libraries themselves can be small and provide an elegant application of functional programming techniques. They are one of the success stories in functional programming that are also ported to many other languages. In this paper, we illustrate that we can make the parser combinators more general by modeling them as a tagless domain specific language. The idea is to replace the ordinary combinators by a set of type constructor classes. By making different implementations of this class we can assign various interpretations of one and the same grammar specification. The set of type classes makes the DSL type-safe and extendable without needing to change existing parts and implementations. This enables us to make multiple interpretations, views, of the specified grammar. In this paper we show views for deterministic parsing, nondeterministic parsing, generating possible parse trees produced by the grammar without needing the corresponding input, generating inputs accepted by the grammar, adapting the grammar rules such that the parser combinators can handle left-recursion and so on. This makes our multi-view parser combinators more powerful than the existing approaches.
Published: 2019

7. DEALING WITH AMBIGUITIES IN ENGLISH CONJUNCTIONS AND COMPARATIVES BY A DETERMINISTIC PARSER.

Author: LIU, REY-LONG and SOO, VON-WUN
Abstract: The major problems in parsing English conjunctions and comparatives are ambiguities of scoping and ellipsis. Scoping ambiguities occur when a parser cannot deterministically detect boundaries of constituents, while ellipsis ambiguities occur when a parser cannot deterministically detect missing components. Since simple lookahead mechanisms cannot collect adequate information to resolve these ambiguities, a parsing strategy that only employs such mechanisms will need to backtrack each time it makes incorrect assumptions. In this paper, we extend the Wait-And-See strategy to parse conjunctions and comparatives deterministically and simultaneously. Several mechanisms, such as bottom-up preparsing, suspension, and pattern matching, are implemented. The bottom-up preparsing accesses the dictionary and recognizes isolated sentence fragments which can be determined without ambiguities. The suspension, which is different from Marcus's attention shifting, allows the parser to suspend temporarily at ambiguous points and continue to parse the rest of the sentence until it obtains the necessary information to resolve the ambiguities. Pattern matching uses the concept of symmetry to detect missing components (the ellipses) in the two conjoined or compared sentence fragments.
Published: 1990

8. Generalizing input-driven languages: Theoretical and practical benefits

Author: Dino Mandrioli and Matteo Pradella
Subjects: FOS: Computer and information sciences, Theoretical computer science, Parsing, General Computer Science, Hierarchy (mathematics), Formal Languages and Automata Theory (cs.FL), Computer science, 020207 software engineering, Computer Science - Formal Languages and Automata Theory, 0102 computer and information sciences, 02 engineering and technology, computer.software_genre, 01 natural sciences, Theoretical Computer Science, Decidability, Order of operations, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Regular language, Closure (mathematics), Compiler construction, 010201 computation theory & mathematics, 0202 electrical engineering, electronic engineering, information engineering, Deterministic parsing, computer
Abstract: Regular languages (RL) are the simplest family in Chomsky’s hierarchy. Thanks to their simplicity they enjoy various nice algebraic and logic properties that have been successfully exploited in many application fields. Practically all of their related problems are decidable, so that they support automatic verification algorithms. Also, they can be recognized in real-time. Context-free languages (CFL) are another major family well-suited to formalize programming, natural, and many other classes of languages; their increased generative power w.r.t. RL, however, causes the loss of several closure properties and of the decidability of important problems; furthermore they need complex parsing algorithms. Thus, various subclasses thereof have been defined with different goals, spanning from efficient, deterministic parsing to closure properties, logic characterization and automatic verification techniques. Among CFL subclasses, so-called structured ones, i.e., those where the typical tree-structure is visible in the sentences, exhibit many of the algebraic and logic properties of RL, whereas deterministic CFL have been thoroughly exploited in compiler construction and other application fields. After surveying and comparing the main properties of those various language families, we go back to operator precedence languages (OPL), an old family through which R. Floyd pioneered deterministic parsing, and we show that they offer unexpected properties in two fields so far investigated in totally independent ways: they enable parsing parallelization in a more effective way than traditional sequential parsers, and exhibit the same algebraic and logic properties so far obtained only for less expressive language families.
Published: 2018
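
The operator-precedence idea referred to in the abstract above can be illustrated with a small sketch. It is a simplification under stated assumptions: it uses numeric precedence levels for two left-associative operators instead of a full table of precedence relations, and the names `PREC` and `evaluate` are hypothetical. It is not the construction studied in the paper.

```python
import re

# Hypothetical precedence levels; "$" marks both ends of the input.
PREC = {"$": 0, "+": 1, "*": 2}


def evaluate(expr):
    """Evaluate an expression over non-negative integers, '+' and '*'.

    Each shift/reduce decision compares only the operator on top of the
    stack with the incoming operator, so no backtracking is needed.
    Characters other than digits, '+' and '*' are ignored by the tokenizer.
    """
    tokens = re.findall(r"\d+|[+*]", expr) + ["$"]
    operands, operators = [], ["$"]
    for tok in tokens:
        if tok.isdigit():
            operands.append(int(tok))
            continue
        # Reduce while the stacked operator has higher or equal precedence
        # (equal precedence reduces first, giving left associativity).
        while operators[-1] != "$" and PREC[operators[-1]] >= PREC[tok]:
            if len(operands) < 2:
                raise SyntaxError("operator is missing an operand")
            op = operators.pop()
            right, left = operands.pop(), operands.pop()
            operands.append(left + right if op == "+" else left * right)
        operators.append(tok)  # shift the incoming operator
    if len(operands) != 1:
        raise SyntaxError("malformed expression")
    return operands[0]


print(evaluate("2+3*4"))
print(evaluate("2*3+4"))
```

Because each reduction depends only on the two neighbouring operators, separate fragments of a long input can in principle be reduced independently, which is the locality that the abstract connects to parallel parsing.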

9. Deterministic Parsing with P Colony Automata

Author: Erzsébet Csuhaj-Varjú, Kristóf Kántor, and György Vaszil
Subjects: Discrete mathematics, Multiset, Parsing, Computer science, Backtracking, Property (programming), 0102 computer and information sciences, 02 engineering and technology, computer.software_genre, 01 natural sciences, Automaton, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Rule-based machine translation, 010201 computation theory & mathematics, Symbol (programming), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Deterministic parsing, computer
Abstract: We investigate the possibility of the deterministic parsing (that is, parsing without backtracking) of languages described by (generalized) P colony automata. We define a subclass of these computing devices satisfying a property which resembles the LL(k) property of context-free grammars, and study the possibility of parsing the characterized languages using a k symbol lookahead, as in the LL(k) parsing method for context-free languages.
Published: 2018
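
The LL(k) method for context-free languages that the abstract above uses as its point of comparison can be sketched for k = 1. This is a minimal table-driven parser for a hypothetical toy grammar (S -> a S b | c); it illustrates lookahead-driven parsing without backtracking and does not model P colony automata.

```python
# Hypothetical toy grammar: S -> 'a' S 'b' | 'c'
# The table maps (nonterminal, lookahead symbol) to the single applicable rule.
TABLE = {("S", "a"): ["a", "S", "b"], ("S", "c"): ["c"]}
NONTERMINALS = {"S"}
END = "$"  # end-of-input marker, assumed not to occur in the input itself


def ll1_parse(text):
    """Return the list of productions applied, or raise SyntaxError."""
    tokens = list(text) + [END]
    stack = [END, "S"]
    pos = 0
    derivation = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            # One symbol of lookahead selects the production uniquely.
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {look!r} at position {pos}")
            derivation.append((top, rhs))
            stack.extend(reversed(rhs))
        elif top == look:
            pos += 1  # terminal (or end marker) matched; consume it
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r} at position {pos}")
    return derivation


for lhs, rhs in ll1_parse("aacbb"):
    print(lhs, "->", " ".join(rhs))
```

Every step either consumes one input symbol or expands one nonterminal chosen by the lookahead, so the parser never revisits a decision.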

10. Determination of the Compound Biological Effectiveness (CBE) Factors Based on the ISHIYAMA-IMAHORI Deterministic Parsing Model with the Dynamic PET Technique

Author: Hanna Koivunoro, Shintaro Ishiyama, Jun Itami, and Yoshio Imahori
Subjects: Neutron capture, Boron concentration, Order (group theory), Applied mathematics, Sigmoid function, Patient data, Logistic function, Deterministic parsing, Eigenvalues and eigenvectors, Mathematics
Abstract: Purpose: In defining the biological effects of the 10B(n, α)7Li neutron capture reaction, we have proposed a deterministic parsing model (ISHIYAMA-IMAHORI model) to determine the Compound Biological Effectiveness (CBE) factor in Borono-Phenyl-Alanine (BPA)-mediated Boron Neutron Capture Therapy (BNCT). In the present paper, we demonstrate a specific method for applying this model to actual patient data for tissues and tumor. Method: To determine the CBE factor, we derived the following new calculation formula founded on the deterministic parsing model with three constants, CBE0, F, n and the eigenvalue Nth/Nmax: (1), where Nth and Nmax are the threshold value of the boron concentration N and the saturation boron density, and CBE0, F and n are given as 0.5, 8 and 3, respectively. In order to determine Nth and Nmax in the formula, a sigmoid logistic function was employed for the 10B concentration data, Db(t), obtained by the dynamic PET technique: (2), where A, a and t0 are constants. Results and Conclusion: From the application of the sigmoid function to dynamic PET data, it is concluded that Nth and Nmax for tissue and tumor are identified with the parameter constants in the sigmoid function in Equation (2) as: (3). And the calculated CBE factor values obtained from Equation (1), with Nth/Nmax.
Published: 2015

11. Deterministic Parsing Model of the Compound Biological Effectiveness (CBE) Factor for Intracellular 10Boron Distribution in Boron Neutron Capture Therapy

Author: Shintaro Ishiyama
Subjects: Physics, Neutron capture, chemistry, Distribution (number theory), Factor (programming language), chemistry.chemical_element, Statistical physics, Boron, Deterministic parsing, computer, Eigenvalues and eigenvectors, computer.programming_language
Abstract: Purpose: In defining the biological effects of the 10B(n, α)7Li neutron capture reaction, we have previously developed a deterministic parsing model to determine the Compound Biological Effectiveness (CBE) factor in Borono-Phenyl-Alanine (BPA)-mediated Boron Neutron Capture Therapy (BNCT). In the present paper, we demonstrate that the CBE factor is directly and unambiguously derivable by the new formula for any case of intracellular 10Boron (10B) distribution, which is founded on this model for tissues and tumor. Method: To determine the CBE factor, we derive the following new calculation formula founded on the deterministic parsing model with three constants, CBE0, F, n and the eigenvalue Nth/Nmax.
Published: 2014

12. Unsupervised dependency parsing without training

Author: Anders Søgaard
Subjects: Linguistics and Language, Parsing, Dependency (UML), business.industry, Computer science, Multivalued dependency, computer.software_genre, Language and Linguistics, Grammar induction, Ranking, Artificial Intelligence, Dependency grammar, Artificial intelligence, Deterministic parsing, business, Centrality, computer, Software, Natural language processing
Abstract: Usually unsupervised dependency parsers try to optimize the probability of a corpus by revising the dependency model that is assumed to have generated the corpus. In this paper we explore a different view in which a dependency structure is, among other things, a partial order on the nodes in terms of centrality or saliency. Under this assumption we directly model centrality and derive dependency trees from the ordering of words. The result is an approach to unsupervised dependency parsing that is very different from standard ones in that it requires no training data. The input words are ordered by centrality, and a parse is derived from the ranking using a simple deterministic parsing algorithm, relying on the universal dependency rules defined by Naseem et al. (Naseem, T., Chen, H., Barzilay, R., Johnson, M. 2010. Using universal linguistic knowledge to guide grammar induction. In Proceedings of Empirical Methods in Natural Language Processing, Boston, MA, USA, pp. 1234–44.). Our approach is evaluated on data from twelve different languages and is remarkably competitive.
Published: 2012

13. Deterministic shift-reduce parsing for unification-based grammars

Author: Hiroshi Nakagawa, Takuya Matsuzaki, Nobuyuki Shimizu, and Takashi Ninomiya
Subjects: Linguistics and Language, Parsing, Programming language, Computer science, Parsing expression grammar, computer.software_genre, Top-down parsing, Language and Linguistics, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Artificial Intelligence, S-attributed grammar, L-attributed grammar, Deterministic parsing, computer, Software, Bottom-up parsing
Abstract: Many parsing techniques assume the use of a packed parse forest to enable efficient and accurate parsing. However, they suffer from an inherent problem that derives from the restriction of locality in the packed parse forest. Deterministic parsing is one solution that can achieve simple and fast parsing without the mechanisms of the packed parse forest by accurately choosing search paths. We propose new deterministic shift-reduce parsing and its variants for unification-based grammars. Deterministic parsing cannot simply be applied to unification-based grammar parsing, which often fails because of its hard constraints. Therefore, this is developed by using default unification, which almost always succeeds in unification by overwriting inconsistent constraints in grammars.
Published: 2010

14. Reliability analysis of tunnel surrounding rock stability by Monte-Carlo method

Author: Geng-she Yang and Jia-mi Xi
Subjects: Engineering, Theory of relativity, Basis (linear algebra), business.industry, Monte Carlo method, Energy Engineering and Power Technology, Structural engineering, Geotechnical Engineering and Engineering Geology, Deterministic parsing, business, Stability (probability), Reliability (statistics), Physics::Geophysics
Abstract: The advantages of an improved Monte-Carlo method and the feasibility of applying the proposed approach to reliability analysis of tunnel surrounding rock stability are discussed. On the basis of deterministic parsing for tunnel surrounding rock, a reliability computing method for surrounding rock stability was derived from the improved Monte-Carlo method. The computing method considers the randomness of the related parameters and therefore satisfies the correlation among parameters. The proposed method can reasonably determine the reliability of surrounding rock stability. Calculation results show that this is a scientific method for discriminating and checking surrounding rock stability.
Published: 2008

15. From Ambiguous Regular Expressions to Deterministic Parsing Automata

Author: Luca Breveglieri, Angelo Morzenti, Stefano Crespi Reghizzi, and Angelo Borsotti
Subjects: Tree (data structure), Generator (computer programming), Theoretical computer science, Parsing, Syntax (programming languages), Computer science, Programming language, Regular expression, computer.software_genre, Deterministic parsing, Abstract syntax tree, computer, Testability
Abstract: This new parser generator for ambiguous regular expressions (RE) formally extends the Berry-Sethi (BS) algorithm into a finite-state device that specifies the syntax tree(s). We extend the local testability property of the marked RE’s from terminal strings to linearized syntax trees. The generator supports disambiguation, i.e., selecting a preferred tree in case of ambiguity. The selection is parametric with respect to the Greedy or POSIX criterion. The parser is proved correct and has linear-time complexity. The generator is available as an interactive SW tool (on GitHub - see http://github.com/breveglieri/ebs/README).
Published: 2015

16. Parallel LL parsing

Author: Ladislav Vagner and Bořivoj Melichar
Subjects: TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES, Theoretical computer science, Computer science, Computer Networks and Communications, Parsing expression grammar, Top-down parsing, Canonical LR parser, LL grammar, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, LL parser, Deterministic parsing, Algorithm, Software, Bottom-up parsing, Information Systems
Abstract: A deterministic parallel LL parsing algorithm is presented. The algorithm is based on a transformation from a parsing problem to parallel reduction. First, a nondeterministic version of a parallel LL parser is introduced. Then, it is transformed into the deterministic version—the LLP parser. The deterministic LLP(q,k) parser uses two kinds of information to select the next operation — a lookahead string of length up to k symbols and a lookback string of length up to q symbols. Deterministic parsing is available for LLP grammars, a subclass of LL grammars. Since the presented deterministic and nondeterministic parallel parsers are both based on parallel reduction, they are suitable for most parallel architectures.
Published: 2006

17. Extracting Partial Parsing Rules from Tree-Annotated Corpus: Toward Deterministic Global Parsing

Author: Kong Joo Lee, Myung-Seok Choi, Key-Sun Choi, and Gil Chang Kim
Subjects: Computer science, media_common.quotation_subject, Top-down parsing, computer.software_genre, Lexicon, Parser combinator, Artificial Intelligence, Electrical and Electronic Engineering, Phrase structure grammar, media_common, Parsing, Grammar, business.industry, Parsing expression grammar, Syntax, Substring, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Hardware and Architecture, Top-down parsing language, S-attributed grammar, Computer Vision and Pattern Recognition, Artificial intelligence, Deterministic parsing, business, computer, Software, Natural language, Natural language processing, Sentence, Bottom-up parsing
Abstract: It is not always possible to find a global parse for an input sentence owing to problems such as errors in the sentence or incompleteness of the lexicon and grammar. Partial parsing is an alternative approach to respond to these problems. Partial parsing techniques try to recover syntactic information efficiently and reliably by sacrificing completeness and depth of analysis. One of the difficulties in partial parsing is how the grammar might be automatically extracted. In this paper we present a method of automatically extracting partial parsing rules from a tree-annotated corpus using the decision tree method. Our goal is deterministic global parsing using partial parsing rules, in other words, to extract partial parsing rules with higher accuracy and broader expansion. First, we define a rule template that makes it possible to learn a subtree for a given substring, so that the resultant rules can be more specific and stricter to apply. Second, rule candidates extracted from a training corpus are enriched with contextual and lexical information using the decision tree method and verified through cross-validation. Last, we underspecify non-deterministic rules by merging substructures with ambiguity in those rules. The learned grammar is similar to phrase structure grammar with contextual and lexical information, but allows building structures of depth one or more. Thanks to automatic learning, the partial parsing rules can be consistent and domain-independent. Partial parsing with this grammar processes an input sentence deterministically using longest-match heuristics, and recursively applies rules to an input sentence. The experiments showed that the partial parser using automatically extracted rules is not only accurate and efficient but also achieves reasonable coverage for Korean.
Published: 2005

18. Efficient Semi-Deterministic Parsing for Korean Using Lexical Co-Occurrence Data from a Corpus

Author: Juntae Yoon
Subjects: Dependency (UML), Parsing, Computer science, business.industry, media_common.quotation_subject, Speech recognition, Association (object-oriented programming), Ambiguity, computer.software_genre, Noun phrase, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Association value, Artificial intelligence, Deterministic parsing, business, computer, Natural language processing, Sentence, media_common
Abstract: This paper presents an efficient parsing method for Korean using statistical information extracted from a corpus. Structural ambiguity commonly occurs while deciding dependency relations between words in Korean sentences. To resolve the ambiguity, lexical association between words plays an important role in figuring out the correct dependency. Our parser uses statistical co-occurrence data to compute the lexical association. In addition, we define the global association table (GAT) which enables a global management of the associations. Using the GAT, the parser has an overall configuration of dependency relations between words, thus can analyze a sentence almost deterministically. That is, our system can be viewed as a semi-deterministic parser, which is controlled not by the condition-action rule but by the association value between phrases. Furthermore, the unknown grammatical case of a noun phrase caused by the auxiliary postposition in Korean can be effectively resolved using lexical co-occurrences in our system.
Published: 2002

19. A Practical GLR Parser Generator for Software Reverse Engineering

Author: Teng Geng, Changqing Lai, Zhibo Chen, Wei Meng, Fu Xu, and Han Mei
Subjects: Computer Networks and Communications, Computer science, Programming language, ComputerApplications_COMPUTERSINOTHERSYSTEMS, Recursive descent parser, Top-down parsing, computer.software_genre, Canonical LR parser, Simple LR parser, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, GLR parser, Software_PROGRAMMINGLANGUAGES, Deterministic parsing, LALR parser, computer
Abstract: Traditional parser generators use deterministic parsing methods. These methods cannot effectively meet the parsing requirements of software reverse engineering. A new parser generator is presented which can generate a GLR parser with automatic error recovery. The generated GLR parser has a parsing speed comparable to that of the traditional LALR(1) parser and can be used for parsing in software reverse engineering.
Published: 2014

20. The PAPAGENO Parallel-Parser Generator

Author: Dino Mandrioli, Alessandro Barenghi, Stefano Crespi Reghizzi, Matteo Pradella, and Federica Panella
Subjects: Multi-core processor, Parsing, Generator (computer programming), Computer science, Parallel computing, Parser generation, computer.software_genre, Field (computer science), Order of operations, Rule-based machine translation, Parallel Parsing, Operator Precedence Grammars, Compiler, Deterministic parsing, computer
Abstract: The increasing use of multicore processors has deeply transformed computing paradigms and applications. The wide availability of multicore systems had an impact also in the field of compiler technology, although the research on deterministic parsing did not prove to be effective in exploiting the architectural advantages, the main impediment being the inherent sequential nature of traditional LL and LR algorithms. We present PAPAGENO, an automated parser generator relying on operator precedence grammars. We complemented the PAPAGENO-generated parallel parsers with parallel lexing techniques, obtaining near-linear speedups on multicore machines, and the same speed as Bison parsers on sequential execution.
Published: 2014

21. Deterministic Statistical Mapping of Sentences to Underspecified Semantics

Author: Pi-Chuan Chang, Hiyan Alshawi, and Michael Ringgaard
Subjects: Dependency (UML), business.industry, Computer science, Treebank, Statistical model, Semantics, computer.software_genre, Semantic mapping, Dependency grammar, Artificial intelligence, business, Deterministic parsing, computer, Natural language processing, Natural language
Abstract: We present a method for training a statistical model for mapping natural language sentences to semantic expressions. The semantics are expressions of an underspecified logical form that has properties making it particularly suitable for statistical mapping from text. An encoding of the semantic expressions into dependency trees with automatically generated labels allows application of existing methods for statistical dependency parsing to the mapping task (without the need for separate traditional dependency labels or parts of speech). The encoding also results in a natural per-word semantic-mapping accuracy measure. We report on the results of training and testing statistical models for mapping sentences of the Penn Treebank into the semantic expressions, for which per-word semantic mapping accuracy ranges between 79% and 86% depending on the experimental conditions. The particular choice of algorithms used also means that our trained mapping is deterministic (in the sense of deterministic parsing), paving the way for large-scale text-to-semantic mapping.
Published: 2014

22. Uniquely parsable array grammars for generating and parsing connected patterns

Author: Katsunobu Imai and Kenichi Morita
Subjects: Parsing, Theoretical computer science, Grammar, Computer science, Backtracking, media_common.quotation_subject, computer.software_genre, Syntactic pattern recognition, Cellular automaton, Rule-based machine translation, Artificial Intelligence, Deterministic automaton, Signal Processing, Formal language, Computer Vision and Pattern Recognition, Deterministic parsing, computer, Algorithm, Software, media_common
Abstract: A uniquely parsable array grammar (UPAG) introduced by Yamamoto and Morita is a special kind of isometric array grammar (IAG) in which parsing can be performed without backtracking. Hence, we can use a UPAG as an efficient syntactic pattern recognition mechanism, if the pattern set is properly described by a UPAG. In this paper, we investigate the problem of describing and recognizing the set of all connected patterns using a UPAG formalism. As for the recognition of connected patterns, Beyer showed an efficient algorithm that operates on cellular automata. We show that his algorithm can be expressed very simply in the UPAG framework, and give two kinds of simple UPAGs that generate the set of all connected patterns.
Published: 1999

23. Grammar partitioning and modular deterministic parsing

Author: Giuseppe Psaila and Stefano Crespi Reghizzi
Subjects: Theoretical computer science, Parsing, General Computer Science, Computer science, Programming language, LR parser, Deterministic context-free grammar, Context-free grammar, computer.software_genre, Canonical LR parser, LL grammar, LALR parser, Deterministic parsing, computer
Abstract: Complex languages are often modularized into sublanguages and the compiler is accordingly organized as a set of separate modules. Modularization (called federalization) is beneficial for beating complexity, for maintenance, and for reuse. Focusing on syntax analysis, we consider the decomposition of a grammar into deterministic subgrammars. We study three conditions for determinism in grammar partitioning: first using homogeneous modules of the LR(1) or LL(1) kind; then using heterogeneous modules (LR(1) or LL(1)). Federalization slightly decreases the generality of LR(1) parsers, but not of LL(1) ones, and it allows to handle some grammars which are not LALR(1). Experimental results show that LR(1) federal automata have fewer (up to 60%) states than monolithic LR(1) automata. Criteria for modularization, practical experiences and hints to semantic decomposition issues conclude the paper.
Published: 1998

24. Parsing Partially Ordered Multisets

Author: Twan Basten
Subjects: Multiset, Theoretical computer science, Parsing, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), computer.software_genre, Top-down parsing, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Computer Science (miscellaneous), Top-down parsing language, S-attributed grammar, Deterministic parsing, Algorithm, computer, Mathematics, Bottom-up parsing
Abstract: A partially ordered multiset or pomset is a generalization of a string in which the total order has been relaxed to a partial order. Strings are often used as a model for sequential computation; pomsets are a natural model for parallel and distributed computation. By viewing pomsets as a generalization of strings, the question is raised whether concepts from language theory can be generalized to pomsets. An important area in the theory of languages is parsing theory. This paper develops the fundamentals of a parsing theory for pomsets, called PLR parsing. It is based on the LR-parsing technique, which is the most powerful deterministic parsing technique in language theory. The basic algorithm in the class of PLR parsing algorithms, the PLR(0) algorithm is explained in detail.
Published: 1997

25. Operator Precedence ω-Languages

Author: Matteo Pradella, Dino Mandrioli, Violetta Lonati, and Federica Panella
Subjects: Structure (mathematical logic), Theoretical computer science, Syntax (programming languages), Infinite-state model checking, Programming language, Computer science, Pushdown automaton, Closure (topology), ω-languages, computer.software_genre, Notation, Order of operations, Closure properties, Regular language, Operator precedence languages, Deterministic parsing, computer
Abstract: Recent literature extended the analysis of ω-languages from the regular ones to various classes of languages with “visible syntax structure”, such as visibly pushdown languages (VPLs). Operator precedence languages (OPLs), instead, were originally defined to support deterministic parsing and exhibit interesting relations with these classes of languages: OPLs strictly include VPLs, enjoy all relevant closure properties and have been characterized by a suitable automata family and a logic notation. We introduce here operator precedence ω-languages (ωOPLs), investigating various acceptance criteria and their closure properties. Whereas some properties are natural extensions of those holding for regular languages, others require novel investigation techniques. Application-oriented examples show the gain in expressiveness and verifiability offered by ωOPLs w.r.t. smaller classes.
Published: 2013

26. E-parser: An implementation of a deterministic GB-related parsing system

Author: Torbjørn Nordgård
Subjects: Theoretical computer science, Parsing, Workstation, Syntax (programming languages), Programming language, Computer science, General Social Sciences, Top-down parsing, computer.software_genre, law.invention, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, law, S-attributed grammar, Deterministic parsing, computer, Bottom-up parsing
Abstract: This paper describes an implementation of a deterministic parsing system, described in Nordgard (1993). The syntax of “heuristic rules” and how the rules interact with the basic operations of the parser constitute the bulk of the article. The implementation is written in Medley Interlisp, and the system can be run on Sun or Xerox workstations.
Published: 1994

27. Verifiable Parse Table Composition for Deterministic Parsing

Author: August Schwerdfeger and Eric Van Wyk
Subjects: Source code, Theoretical computer science, Parsing, Programming language, Computer science, media_common.quotation_subject, computer.software_genre, Top-down parsing, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Table (database), Deterministic parsing, LALR parser, computer, media_common, Bottom-up parsing
Abstract: One obstacle to the implementation of modular extensions to programming languages lies in the problem of parsing extended languages. Specifically, the parse tables at the heart of traditional LALR(1) parsers are so monolithic and tightly constructed that, in the general case, it is impossible to extend them without regenerating them from the source grammar. Current extensible frameworks employ a variety of solutions, ranging from a full regeneration to using pluggable binary modules for each different extension. But recompilation is time-consuming, while the pluggable modules in many cases cannot support the addition of more than one extension, or use backtracking or non-deterministic parsing techniques. We present here a middle-ground approach that allows an extension, if it meets certain restrictions, to be compiled into a parse table fragment. The host language parse table and fragments from multiple extensions can then always be efficiently composed to produce a conflict-free parse table for the extended language. This allows for the distribution of deterministic parsers for extensible languages in a pre-compiled format, eliminating the need for the “source code” grammar to be distributed. In practice, we have found these restrictions to be reasonable and admit many useful language extensions.
Published: 2010

28. An ungreedy Chinese deterministic dependency parser considering long-distance dependency

Author: Wenlin Yao, Lingling Gao, and Lei Wang
Subjects: Parsing, Dependency (UML), business.industry, Computer science, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), computer.software_genre, Top-down parsing, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Dependency grammar, S-attributed grammar, Artificial intelligence, Deterministic parsing, business, computer, Natural language processing, Bottom-up parsing
Abstract: This paper presents a two-step dependency parser to parse Chinese deterministically. By dividing a sentence into two parts and parsing them separately, the error accumulation can be avoided effectively. Previous works on shift-reduce dependency parser may guarantee the greedy characteristic of deterministic parsing less. This paper improves on a kind of deterministic dependency parsing method to weaken the greedy characteristic of it. During parsing, both forward and backward parsing directions are chosen to decrease the unparsed rate. Support vector machines are utilized to determine the word dependency relations and in order to solve the problem of long distance dependency, a group of combined global features are presented in this paper. The proposed parser achieved significant improvement on dependency accuracy and root accuracy.
Published: 2008

29. Apply a Rough Set-Based Classifier to Dependency Parsing

Author: Yangsheng Ji, Ruoce Ma, Xinyu Dai, and Lin Shang
Subjects: Parsing, Computer science, business.industry, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Top-down parsing, computer.software_genre, Machine learning, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Dependency grammar, S-attributed grammar, Rough set, Artificial intelligence, Deterministic parsing, business, computer, Natural language processing, Bottom-up parsing
Abstract: A rough set-based semi-naive Bayesian classification method is applied to dependency parsing, which is an important task in syntactic structure analysis of natural language processing. Many parsing algorithms have emerged combined with statistical machine learning techniques. The rough set-based classifier is embedded with Nivre's deterministic parsing algorithm to conduct dependency parsing task on a Chinese corpus. Experimental results show that the method has a good performance on dependency parsing task. Moreover, the experiments have justified the effectiveness of the classification influence.
Published: 2008

30. Japanese dependency parsing using a tournament model

Author: Masayuki Asahara, Yuji Matsumoto, and Masakazu Iwatate
Subjects: Text corpus, Parsing, business.industry, Computer science, Probabilistic logic, computer.software_genre, Top-down parsing, Dependency grammar, Tournament, Artificial intelligence, Deterministic parsing, business, computer, Preference (economics), Natural language processing
Abstract: In Japanese dependency parsing, Kudo's relative preference-based method (Kudo and Matsumoto, 2005) outperforms both deterministic and probabilistic CKY-based parsing methods. In Kudo's method, for each dependent word (or chunk) a log-linear model estimates relative preference of all other candidate words (or chunks) for being as its head. This cannot be considered in the deterministic parsing methods. We propose an algorithm based on a tournament model, in which the relative preferences are directly modeled by one-on-one games in a step-ladder tournament. In an evaluation experiment with Kyoto Text Corpus Version 4.0, the proposed method outperforms previous approaches, including the relative preference-based method.
Published: 2008

31. The Data-Oriented Parsing Approach: Theory and Application

Author: Bod, R., Fulcher, J., and Jain, L.C.
Subjects: Parsing, Computer science, business.industry, computer.software_genre, Top-down parsing, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Top-down parsing language, S-attributed grammar, Artificial intelligence, Deterministic parsing, business, computer, Natural language processing, Bottom-up parsing, Data-oriented parsing
Abstract: Parsing models have many applications in AI, ranging from natural language processing (NLP) and computational music analysis to logic programming and computational learning. Broadly conceived, a parsing model seeks to uncover the underlying structure of an input, that is, the various ways in which elements of the input combine to form phrases or constituents and how those phrases recursively combine to form a tree structure for the whole input. During the last fifteen years, a major shift has taken place from rule-based, deterministic parsing to corpus-based, probabilistic parsing. A quick glance over the NLP literature from the last ten years, for example, indicates that virtually all natural language parsing systems are currently probabilistic. The same development can be observed in (stochastic) logic programming and (statistical) relational learning. This trend towards probabilistic parsing is not surprising: the increasing availability of very large collections of text, music, images and the like allow for inducing statistically motivated parsing systems from actual data.
Published: 2008

32. DEALING WITH AMBIGUITIES IN ENGLISH CONJUNCTIONS AND COMPARATIVES BY A DETERMINISTIC PARSER

Author: Von-Wun Soo and Rey-Long Liu
Subjects: Parsing, Computer science, business.industry, media_common.quotation_subject, Ellipsis (linguistics), Ambiguity, computer.software_genre, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Artificial Intelligence, Computer Vision and Pattern Recognition, Artificial intelligence, Pattern matching, Computational linguistics, business, Deterministic parsing, computer, Software, Natural language, Sentence, Natural language processing, media_common
Abstract: The major problems in parsing English conjunctions and comparatives are ambiguities of scoping and ellipsis. Scoping ambiguities occur when a parser cannot deterministically detect boundaries of constituents, while ellipsis ambiguities occur when a parser cannot deterministically detect missing components. Since simple lookahead mechanisms cannot collect adequate information to resolve these ambiguities, a parsing strategy that only employs such mechanisms will need to backtrack each time it makes incorrect assumptions. In this paper, we extend the Wait-And-See strategy to parse conjunctions and comparatives deterministically and simultaneously. Several mechanisms, such as bottom-up preparsing, suspension, and pattern matching, are implemented. The bottom-up preparsing accesses the dictionary and recognizes isolated sentence fragments which can be determined without ambiguities. The suspension, which is different from Marcus’s attention shifting, allows the parser to suspend temporally at ambiguous points and continue to parse the rest of the sentence until it obtains the necessary information to resolve the ambiguities. Pattern matching uses the concept of symmetry to detect missing components (the ellipses) in the two conjoined or compared sentence fragments.
Published: 1990

33. Top-Down Deterministic Parsing of Languages Generated by CD Grammar Systems

Author: György Vaszil and Henning Bordihn
Subjects: Parsing, Programming language, business.industry, Computer science, Parsing expression grammar, computer.software_genre, Top-down parsing, LL grammar, S-attributed grammar, Top-down parsing language, Artificial intelligence, Deterministic parsing, business, computer, Natural language processing, Bottom-up parsing
Abstract: The paper extends the notion of context-free LL(k) grammars to CD grammar systems using two different derivation modes, examines some of the properties of the resulting language families, and studies the possibility of parsing these languages deterministically, without backtracking.
Published: 2007

34. A three-step deterministic parser for Chinese dependency parsing

Author: Sadao Kurohashi, Kun Yu, and Hao Liu
Subjects: Parsing, Computer science, business.industry, Speech recognition, Parsing expression grammar, Recursive descent parser, computer.software_genre, Top-down parsing, Canonical LR parser, Simple LR parser, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Dependency grammar, GLR parser, LL parser, Top-down parsing language, S-attributed grammar, Artificial intelligence, Deterministic parsing, LALR parser, business, computer, Natural language processing, Sentence, Bottom-up parsing
Abstract: This paper presents a three-step dependency parser to parse Chinese deterministically. By dividing a sentence into several parts and parsing them separately, it aims to reduce the error propagation coming from the greedy characteristic of deterministic parsing. Experimental results showed that compared with the deterministic parser which parsed a sentence in sequence, the proposed parser achieved extremely significant improvement on dependency accuracy.
Published: 2007

35. Dependency parsing based on dynamic local optimization

Author: Ting Liu, Jinshan Ma, Sheng Li, and Huijia Zhu
Subjects: TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parsing, Theoretical computer science, Parser combinator, Computer science, Memoization, S-attributed grammar, Parsing expression grammar, Deterministic parsing, computer.software_genre, Top-down parsing, computer, Bottom-up parsing
Abstract: This paper presents a deterministic parsing algorithm for projective dependency grammar. In a bottom-up way the algorithm finds the local optimum dynamically. A constraint procedure is made to use more structure information. The algorithm parses sentences in linear time and labeling is integrated with the parsing. This parser achieves 63.29% labeled attachment score on the average in CoNLL-X Shared Task.
Published: 2006
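
Note: the labeled attachment score (LAS) reported in this record is the fraction of tokens whose predicted head and dependency label both match the gold standard. The sketch below is only an illustration of that arithmetic. The function name `attachment_scores` and the `(head, label)` tuple encoding are assumptions made here, not part of the paper or of the CoNLL-X evaluation script.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) for one sentence or one whole corpus.

    gold, predicted: equal-length lists of (head_index, label) pairs,
    one pair per token (hypothetical encoding chosen for this sketch).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty token list")
    total = len(gold)
    # Unlabeled score: only the head has to match.
    uas = sum(g[0] == p[0] for g, p in zip(gold, predicted)) / total
    # Labeled score: head and label both have to match.
    las = sum(g == p for g, p in zip(gold, predicted)) / total
    return uas, las


gold = [(2, "nmod"), (3, "nsubj"), (0, "root"), (3, "obj")]
pred = [(2, "nmod"), (3, "obj"), (0, "root"), (2, "obj")]
uas, las = attachment_scores(gold, pred)
```

In this toy sentence three of four heads are correct and two of four (head, label) pairs are correct, so the unlabeled score is 0.75 and the labeled score is 0.5.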

36. A best-first probabilistic shift-reduce parser

Author: Alon Lavie and Kenji Sagae
Subjects: Parsing, Computer science, LR parser, business.industry, Shift-reduce parser, Treebank, Parsing expression grammar, Top-down parsing, computer.software_genre, Canonical LR parser, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Top-down parsing language, Artificial intelligence, Deterministic parsing, business, computer, Natural language processing, Generative grammar
Abstract: Recently proposed deterministic classifier-based parsers (Nivre and Scholz, 2004; Sagae and Lavie, 2005; Yamada and Matsumoto, 2003) offer attractive alternatives to generative statistical parsers. Deterministic parsers are fast, efficient, and simple to implement, but generally less accurate than optimal (or nearly optimal) statistical parsers. We present a statistical shift-reduce parser that bridges the gap between deterministic and probabilistic parsers. The parsing model is essentially the same as one previously used for deterministic parsing, but the parser performs a best-first search instead of a greedy search. Using the standard sections of the WSJ corpus of the Penn Treebank for training and testing, our parser has 88.1% precision and 87.8% recall (using automatically assigned part-of-speech tags). Perhaps more interestingly, the parsing model is significantly different from the generative models used by other well-known accurate parsers, allowing for a simple combination that produces precision and recall of 90.9% and 90.7%, respectively.
Published: 2006

37. Discriminative classifiers for deterministic dependency parsing

Author: Johan Hall, Joakim Nivre, and Jens Nilsson
Subjects: Parsing, business.industry, Memoization, Computer science, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), computer.software_genre, Machine learning, Top-down parsing, Parser combinator, Discriminative model, Dependency grammar, S-attributed grammar, Artificial intelligence, business, Deterministic parsing, computer, Natural language processing, Bottom-up parsing
Abstract: Deterministic parsing guided by treebank-induced classifiers has emerged as a simple and efficient alternative to more complex models for data-driven parsing. We present a systematic comparison of memory-based learning (MBL) and support vector machines (SVM) for inducing classifiers for deterministic dependency parsing, using data from Chinese, English and Swedish, together with a variety of different feature models. The comparison shows that SVM gives higher accuracy for richly articulated feature models across all languages, albeit with considerably longer training times. The results also confirm that classifier-based deterministic parsing can achieve parsing accuracy very close to the best results reported for more complex parsing models.
Published: 2006
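
Note: classifier-guided deterministic dependency parsing, as described in this record, commits to exactly one transition per parser state and never backtracks. The sketch below illustrates the idea with an arc-standard transition system, which is one common choice and not necessarily the system used in the paper. In place of a trained MBL or SVM classifier it uses a hypothetical `gold_oracle` that reads the correct action off a known projective tree. All names here are assumptions made for illustration.

```python
def parse(n_words, choose_action):
    """Greedy arc-standard parsing. Words are numbered 1..n_words and
    0 is an artificial root. Returns a dict mapping each word to its head."""
    stack = [0]
    buffer = list(range(1, n_words + 1))
    heads = {}
    while buffer or len(stack) > 1:
        # One decision per state, never revised: this is the deterministic part.
        action = choose_action(stack, buffer, heads)
        if action == "SHIFT" and buffer:
            stack.append(buffer.pop(0))
        elif action == "LEFT" and len(stack) > 2:
            # Second-from-top becomes a dependent of the top item.
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        elif action == "RIGHT" and len(stack) > 1:
            # Top item becomes a dependent of the item below it.
            dep = stack.pop()
            heads[dep] = stack[-1]
        else:
            raise ValueError(f"illegal action {action!r} in this state")
    return heads


def gold_oracle(gold):
    """Stand-in for a trained classifier (hypothetical): picks the action
    that rebuilds the projective gold tree `gold` (word -> head)."""
    def choose(stack, buffer, heads):
        if len(stack) > 1:
            top, below = stack[-1], stack[-2]
            if below != 0 and gold[below] == top:
                return "LEFT"
            # Attach `top` only once all of its own dependents are attached.
            if gold[top] == below and all(gold[b] != top for b in buffer):
                return "RIGHT"
        return "SHIFT"
    return choose


# "Economic news had effect": Economic <- news <- had -> effect, had <- root
gold = {1: 2, 2: 3, 3: 0, 4: 3}
heads = parse(4, gold_oracle(gold))
```

Each word is shifted once and attached once, so the loop runs 2n times for an n-word sentence. In a real system `choose_action` would be a classifier over features of the stack and buffer, and its mistakes would propagate because earlier decisions are never undone.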

38. Directly-Executable Earley Parsing

Author: R. Nigel Horspool and John Aycock
Subjects: TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Programming language, Computer science, LR parser, S-attributed grammar, Parsing expression grammar, Deterministic parsing, computer.software_genre, Top-down parsing, computer, Bottom-up parsing, Earley parser
Abstract: Deterministic parsing techniques are typically used in favor of general parsing algorithms for efficiency reasons. However, general algorithms such as Earley's method are more powerful and also easier for developers to use, because no seemingly arbitrary restrictions are placed on the grammar. We describe how to narrow the performance gap between general and deterministic parsers, constructing a directly-executable Earley parser that can reach speeds comparable to deterministic methods even on grammars for commonly-used programming languages.
Published: 2001

39. A Deterministic Shift-Reduce Parser Generator for a Logic Programming Language

Author: Chuck Liang
Subjects: Memoization, Computer science, Optimizing compiler, Recursive descent parser, computer.software_genre, Top-down parsing, Canonical LR parser, Rule-based machine translation, Parser combinator, LL parser, Logic programming, computer.programming_language, Compiler-compiler, Parsing, Syntax (programming languages), Programming language, Deterministic context-free grammar, Shift-reduce parser, Parsing expression grammar, Context-free grammar, Formal grammar, Simple LR parser, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Terminal and nonterminal symbols, GLR parser, λProlog, Top-down parsing language, S-attributed grammar, Compiler, L-attributed grammar, Deterministic parsing, LALR parser, computer, Bottom-up parsing
Abstract: This paper addresses efficient parsing in the context of logical inference for the purpose of using logic programming languages in compiler writing. A bottom-up, deterministic parsing mechanism is formulated for "bounded right context" grammars, a subclass of LR(k) grammars with characteristics amenable to declarative parser specification. A working parser generator for λProlog is described, although the basic parsing mechanism is applicable to logic programming in general.
Published: 2000

40. Deterministic parsing for augmented context-free grammars

Author: Stefano Crespi-Reghizzi, Luca Breveglieri, and Alessandra Cherubini
Subjects: Theoretical computer science, Computer science, Context-sensitive grammar, Recursive descent parser, computer.software_genre, Top-down parsing, Parser combinator, Rule-based machine translation, Formal language, LL parser, Indexed grammar, Phrase structure grammar, Parsing, Augmented transition network, Programming language, Deterministic context-free grammar, Parsing expression grammar, Context-free grammar, Tree-adjoining grammar, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Extended Affix Grammar, Ambiguous grammar, Stochastic context-free grammar, S-attributed grammar, Top-down parsing language, L-attributed grammar, Deterministic parsing, computer, Generative grammar, Bottom-up parsing
Abstract: In contrast to the usual depth-first derivations of context-free (CF) grammars, breadth-first derivations (also in combination with depth-first ones) yield a class of augmented context-free grammars (ACF) (also termed multi-breadth-depth grammars) endowed with greater generative capacity, yet manageable. The inadequacy of CF grammars to treat distant dependencies is overcome by the new model. ACF grammars can be classified with respect to their disposition, a concept related to the data structure needed to parse their strings. For such augmented CF grammars we consider the LL(k) condition, that ensures top-down deterministic parsing. We restate the condition as an adjacency problem and we prove that it is decidable for any disposition. The deterministic linear-time parser differs from a recursive descent parser by using instead of a LIFO stack a more general data structure, involving FIFO queues and LIFO stacks in accordance with the disposition. ACF grammars can be also viewed as a formalized version of ATN (Augmented Transition Networks).
Published: 1995

41. Dependency Parsing of Turkish

Author: Joakim Nivre, Kemal Oflazer, and Gülşen Eryiğit
Subjects: FOS: Computer and information sciences, Linguistics and Language, Turkish, Computer science, Memoization, P Philology. Linguistics, Top-down parsing, computer.software_genre, Language and Linguistics, 200402 Computational Linguistics, Parser combinator, Artificial Intelligence, Dependency grammar, QA Mathematics, QA075 Electronic computers. Computer science, Parsing, business.industry, language.human_language, Computer Science Applications, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Applied Computer Science, QA076 Computer software, language, FOS: Languages and literature, Top-down parsing language, S-attributed grammar, Artificial intelligence, Deterministic parsing, business, 80107 Natural Language Processing, computer, Syntactic parsing, Natural language processing, Bottom-up parsing
Abstract: The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical units called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We test our claim on two different parsing methods, one based on a probabilistic model with beam search and the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of the parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.
Published: 2008

42. Deterministic parsing and linguistic explanation

Author: Amy Weinberg and Robert C. Berwick
Subjects: Linguistics and Language, Grammar, Computer science, business.industry, media_common.quotation_subject, Experimental and Cognitive Psychology, computer.software_genre, Top-down parsing, Language and Linguistics, Education, Artificial intelligence, business, Deterministic parsing, computer, Natural language processing, media_common, Bottom-up parsing
Published: 1985

43. Analyses of deterministic parsing algorithms

Author: Jacques Cohen and Martin Roth
Subjects: Parsing, General Computer Science, Grammar, LR parser, Computer science, media_common.quotation_subject, Parsing expression grammar, Recursive descent parser, computer.software_genre, Top-down parsing, Canonical LR parser, Simple LR parser, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Parser combinator, Top-down parsing language, Deterministic parsing, computer, Algorithm, media_common
Abstract: This paper describes an approach for determining the minimum, maximum, and average times to parse sentences acceptable by a deterministic parser. These quantities are presented in the form of symbolic formulas, called time-formulas. The variables in these formulas represent not only the length of the input string but also the time to perform elementary operations such as pushing, popping, subscripting, iterating, etc. By binding to the variables actual numerical values corresponding to a given compiler-machine configuration, one can determine the execution time for that configuration. Time-formulas are derived by examining the grammar rules and the program representing the algorithm one wishes to analyze. The approach is described by using a specific grammar that defines simple arithmetic expressions. Two deterministic parsers are analyzed: a top-down recursive descent LL(1) parser, and a bottom-up SLR(1) parser. The paper provides estimates for the relative efficiencies of the two parsers. The estimates applicable to a specific machine, the PDP-10, are presented and substantiated by benchmarks. Finally, the paper illustrates the proposed approach by applying it to the analyses of parsers for a simple programming language.
Published: 1978
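
Note: one of the two parsers analyzed in this record is a top-down recursive descent LL(1) parser for simple arithmetic expressions. The sketch below shows what such a parser can look like. The grammar, the token set and the names `tokenize`, `Parser` and `parse` are assumptions chosen for illustration, since the record does not reproduce the paper's grammar. Every choice is made from a single token of lookahead, so the parser never backtracks.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")


def tokenize(text):
    """Split text into number and operator tokens, ending with '$'."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    tokens.append("$")  # end-of-input marker
    return tokens


class Parser:
    # Assumed grammar:
    #   expr   -> term (('+' | '-') term)*
    #   term   -> factor ('*' factor)*
    #   factor -> NUMBER | '(' expr ')'
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def eat(self, expected=None):
        tok = self.tokens[self.i]
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        self.i += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):  # one-token lookahead decides
            op = self.eat()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.eat("*")
            value *= self.factor()
        return value

    def factor(self):
        tok = self.peek()
        if tok == "(":
            self.eat("(")
            value = self.expr()
            self.eat(")")
            return value
        if tok.isdigit():
            return int(self.eat())
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(text)
    value = parser.expr()
    parser.eat("$")  # reject trailing input
    return value
```

For example, `parse("2+3*4")` evaluates the product first because `term` sits below `expr` in the grammar, and `parse("2+")` raises `SyntaxError` as soon as the lookahead token cannot start a factor.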

44. Deterministic parsing and subjacency

Author: Janet Dean Fodor
Subjects: Linguistics and Language, Parsing, Computer science, Subjacency, Experimental and Cognitive Psychology, computer.software_genre, Determinism, Language and Linguistics, Linguistics, Education, Constraint (information theory), Mechanism (philosophy), Deterministic parsing, computer, Natural language, Sentence
Abstract: It has previously been claimed that a deterministic model of the human sentence parsing mechanism provides an explanation for the existence of, and some of the properties of, the subjacency constraint on natural languages. The present paper argues that the empirical arguments offered in support of these claims are flawed, and that in any case the explanatory relationship between determinism and subjacency is weak.
Published: 1985

45. Global Context Recovery: A New Strategy for Syntactic Error Recovery by Table-Driven Parsers

Author: Richard B. Kieburtz and Ajit B. Pai
Subjects: Scheme (programming language), Parsing, Computer science, Context (language use), Pascal (programming language), computer.software_genre, Set (abstract data type), Table (database), Deterministic parsing, Fiducial marker, Algorithm, computer, Software, computer.programming_language
Abstract: Described is a method for syntactic error recovery that is compatible with deterministic parsing methods and that is able to recover from many errors more quickly than do other schemes because it performs global context recovery. The method relies on fiducial symbols, which are typically reserved key words of a language, to provide mileposts for error recovery. The method has been applied to LL(1) parsers, for which a detailed algorithm is given, and informally proved correct. The algorithm will always recover and return control to the parser if the text being analyzed satisfies only minimal requirements: that it contains one or more occurrences of fiducial symbols following the point at which an error is detected. Tables needed for error recovery have been automatically generated, along with parsing tables, by a parser constructor for the LL(1) grammars. A theoretical characterization of fiducial symbols is given, and the utility of this characterization in practice is discussed. It has been applied to a grammar for the programming language Pascal to aid in selection of a set of fiducial symbols. The error recovery scheme has been tested on a set of student-written Pascal program texts and is compared with other error recovery strategies.
Published: 1980

46. Sentence Disambiguation by a Shift-Reduce Parsing Technique

Author: Stuart M. Shieber (SRI International Artificial Intelligence Center, Menlo Park, CA)
Abstract: Native speakers of English show definite and consistent preferences for certain readings of syntactically ambiguous sentences. A user of a natural-language processing system would naturally expect it to reflect the same preferences. Thus, such systems must model in some way the linguistic performance as well as the linguistic competence of the native speaker. The authors have developed a parsing algorithm -- a variant of the LALR(1) shift-reduce algorithm -- that models the preference behavior of native speakers for a range of syntactic preference phenomena reported in the psycholinguistic literature, including the recent data on lexical preferences. The algorithm yields the preferred parse deterministically, without building multiple parse trees and choosing among them. As a side effect, it displays appropriate behavior in processing the much discussed garden-path sentences. The parsing algorithm has been implemented and has confirmed the feasibility of this approach to the modeling of these phenomena. Technical Note 281. Sponsored in part by the Defense Advanced Research Projects Agency (DARPA). Published in the Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, held in Boston, MA, in June 1983.
Published: 1983

47. Simplifying deterministic parsing

Author: Michael J. Freiling and Alan W. Carter
Subjects: Head-driven phrase structure grammar, Interface (Java), Computer science, Attribute grammar, media_common.quotation_subject, Syntactic predicate, Emergent grammar, Operator-precedence grammar, Mildly context-sensitive grammar formalism, computer.software_genre, Top-down parsing, Task (project management), Parser combinator, Grammar-based code, Semantic memory, media_common, Sequence, Parsing, Grammar, business.industry, Phrase structure rules, Syntax, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Interfacing, Artificial intelligence, Deterministic parsing, business, computer, Generative grammar, Natural language processing
Abstract: This paper presents a model for deterministic parsing which was designed to simplify the task of writing and understanding a deterministic grammar. While retaining structures and operations similar to those of Marcus' PARSIFAL parser [Marcus 80] the grammar language incorporates the following changes. (1) The use of productions operating in parallel has essentially been eliminated and instead the productions are organized into sequences. Not only does this improve the understandability of the grammar, it is felt that this organization corresponds more closely to the task of performing the sequence of buffer transformations and attachments required to parse the most common constituent types. (2) A general method for interfacing between the parser and a semantic representation system is introduced. This interface is independent of the particular semantic representation used and hides all details of the semantic processing from the grammar writer. (3) The interface also provides a general method for dealing with syntactic ambiguities which arise from the attachment of optional modifiers such as prepositional phrases. This frees the grammar writer from determining each point at which such ambiguities can occur.
Published: 1984

48. Deterministic parsing of syntactic non-fluencies

Author: Donald Hindle
Subjects: Basis (linear algebra), Grammar, business.industry, Computer science, media_common.quotation_subject, computer.software_genre, Syntax, Linguistics, Artificial intelligence, business, Set (psychology), Deterministic parsing, computer, Natural language, Natural language processing, media_common
Abstract: It is often remarked that natural language, used naturally, is unnaturally ungrammatical. Spontaneous speech contains all manner of false starts, hesitations, and self-corrections that disrupt the well-formedness of strings. It is a mystery then, that despite this apparent wide deviation from grammatical norms, people have little difficulty understanding the non-fluent speech that is the essential medium of everyday life. And it is a still greater mystery that children can succeed in acquiring the grammar of a language on the basis of evidence provided by a mixed set of apparently grammatical and ungrammatical strings.
Published: 1983

49. LR Grammars and Analysers

Author: James J. Horning
Subjects: Discrete mathematics, Class (set theory), Parsing, Rule-based machine translation, Terminal and nonterminal symbols, Computer science, computer.software_genre, Deterministic parsing, LALR parser, computer
Abstract: This chapter is concerned with a family of deterministic parsing techniques based on a method first described by Knuth [1965]. These parsers, and the grammars acceptable to them, share most of the desirable properties of the LL(k) family [Chapter 2.B.]. In addition, the class of LR(k)-parsable grammars is probably the largest class accepted by any currently practical parsing technique. The techniques with which we are mostly concerned are, in order of increasing power, LR(0), SLR(1), LALR(1) and LR(1). Collectively, we call these four techniques the LR family [McKeeman 1970] [Aho 1974].
Published: 1974

50. Locally Nondeterministic and Hybrid Syntax Analyzers from Partitioned Two-Level Grammars

Author: Heinz Schmidt and Bernd J. Krämer
Subjects: Syntax (programming languages), Computer science, Programming language, Deterministic context-free grammar, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Context-free grammar, computer.software_genre, Nondeterministic algorithm, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Extended Affix Grammar, Computer Science::Programming Languages, S-attributed grammar, L-attributed grammar, Deterministic parsing, computer
Abstract: Considerable effort has been devoted to finding restricted classes of syntax directed translators usually being based on deterministic parsing techniques.
Published: 1979
