1. Computing the Shortest String and the Edit-Distance for Parsing Expression Languages
- Author
-
Yo-Sub Han and Hyunjoon Cheon
- Subjects
Discrete mathematics ,050101 languages & linguistics ,Parsing ,05 social sciences ,String (computer science) ,Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing) ,02 engineering and technology ,Parsing expression grammar ,computer.software_genre ,Undecidable problem ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Ambiguous grammar ,Regular language ,Formal language ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,0501 psychology and cognitive sciences ,Edit distance ,computer ,Computer Science::Formal Languages and Automata Theory ,Mathematics - Abstract
A distance between two languages is a useful tool to measure the language similarity, and is closely related to the intersection problem as well as the shortest string problem. A parsing expression grammar (PEG) is an unambiguous grammar such that the choice operator selects the first matching in PEG while it can be ambiguous in a context-free grammar. PEGs are also closely related to top-down parsing languages. We consider two problems on parsing expression languages (PELs). One is the r-shortest string problem that decides whether or not a given PEL contains a string of length shorter than r. The other problem is the edit-distance problem of PELs with respect to other language families such as finite languages or regular languages. We show that the r-shortest string problem and the edit-distance problem with respect to finite languages are NEXPTIME-complete, and the edit-distance problem with respect to regular languages is undecidable. In addition, we prove that it is impossible to compute a length bound \(\mathcal {B}(G)\) of a PEG G such that L(G) has a string w of length at most \(\mathcal {B}(G)\).
- Published
- 2020
- Full Text
- View/download PDF