Efficient dictionary-based text rewriting using subsequential transducers
- Source :
- Natural Language Engineering; Dec 2007, Vol. 13, Issue 4, pp. 353-381 (29 pages)
- Publication Year :
- 2007
Abstract
- Problems in the area of text and document processing can often be described as text rewriting tasks: given an input text, produce a new text by applying some fixed set of rewriting rules. In its simplest form, a rewriting rule is given by a pair of strings, representing a source string (the "original") and its substitute. By a rewriting dictionary, we mean a finite list of such pairs; dictionary-based text rewriting means to replace in an input text occurrences of originals by their substitutes. We present an efficient method for constructing, given a rewriting dictionary D, a subsequential transducer that accepts any text t as input and outputs the intended rewriting result t' under the so-called "leftmost-longest match" replacement with skips. The time needed to compute the transducer is linear in the size of the input dictionary. Given the transducer, any text t of length |t| is rewritten in a deterministic manner in time O(|t|+|t'|), where t' denotes the resulting output text. Hence the resulting rewriting mechanism is very efficient. As a second advantage, using standard tools, the transducer can be directly composed with other transducers to efficiently solve more complex rewriting tasks in a single processing step. [ABSTRACT FROM AUTHOR]
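
To make the "leftmost-longest match" replacement with skips concrete, the following is a minimal Python sketch of the rewriting semantics only. It is not the paper's subsequential transducer construction: it uses a plain trie and rescans from each position, so it does not achieve the O(|t|+|t'|) bound stated in the abstract. The names `build_trie` and `rewrite`, and the representation of the dictionary D as a Python dict, are assumptions made for illustration.

```python
def build_trie(dictionary):
    """Build a character trie over the originals.

    A node is a dict mapping characters to child nodes; the key None
    (hypothetical convention) stores the substitute at the end of an original.
    """
    root = {}
    for original, substitute in dictionary.items():
        if not original:
            continue  # assumption: empty originals are ignored
        node = root
        for ch in original:
            node = node.setdefault(ch, {})
        node[None] = substitute
    return root


def rewrite(text, dictionary):
    """Rewrite text under leftmost-longest match with skips (naive sketch)."""
    trie = build_trie(dictionary)
    out = []
    i = 0
    while i < len(text):
        node = trie
        j = i
        best_end, best_sub = -1, None
        # Walk the trie from position i, remembering the longest original seen.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                best_end, best_sub = j, node[None]
        if best_sub is not None:
            # Longest original starting at the leftmost position: replace it.
            out.append(best_sub)
            i = best_end
        else:
            # No original starts here: copy one character and move on (a skip).
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the dictionary `{"a": "1", "ab": "2", "bc": "3"}`, the input `"abcxa"` is rewritten to `"2cx1"`: at position 0 the longer original `"ab"` wins over `"a"`, the remaining `"c"` and `"x"` start no original and are copied unchanged, and the final `"a"` is replaced by `"1"`.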
Details
- Language :
- English
- ISSN :
- 13513249
- Volume :
- 13
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Natural Language Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 27676463
- Full Text :
- https://doi.org/10.1017/S1351324905004092