Back to Search Start Over

Automatic Generation of Efficient Lexical Processors Using FiniteState Techniques.

Authors :
Johnson, Walter L.
Porter, James H.
Ackley, Stephanie I.
Ross, Douglas T.
Bobrow, D. G.
Source :
Communications of the ACM. Dec1968, Vol. 11 Issue 12, p805-813. 9p. 5 Diagrams, 2 Charts.
Publication Year :
1968

Abstract

The practical application of the theory of finite-state automata to automatically generate lexical processors is dealt with in this tutorial article by the use of the AED RWORD system, developed at M.I.T. as part of the AED-1 system. This system accepts as input descriptions of the multicharacter items or of words allowable in a language given in terms of a subset of regular expressions. The output of the system is a lexical processor which reads a string of characters and combines them into the items as defined by the regular expressions. Each output item is identified by a code number together with a pointer to a block of storage containing the characters and character count in the item. The processors produced by the system are based on finite- state machines. Each state of a "machine" corresponds to a unique condition in the lexical processing of a character string. At each state a character is read, and the machine changes to a new state. At each transition appropriate actions are taken based on the particular character read. The system has been in operation since 1966, and processors generated have compared favorably in speed to carefully hand-coded programs to accomplish the same task. Lexical processors for AED-0 and MAD are among the many which have been produced. The techniques employed are independent of the nature of the items being evaluated. If the word "events" is substituted for character string, these processors may be described as generalized decision-making mechanisms based upon an ordered sequence of events. This allows the system to be used in a range of applications outside the area of lexical processing. However convenient these advantages may be, speed is the most important consideration. In designing a system for automatic generation of a lexical processor, the goal was a processor which completely eliminated backup or rereading, which was nearly as fast as hand-coded processors, which would analyze the language and detect errors, and which would be convenient and easy to use. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
00010782
Volume :
11
Issue :
12
Database :
Academic Search Index
Journal :
Communications of the ACM
Publication Type :
Periodical
Accession number :
5247906
Full Text :
https://doi.org/10.1145/364175.364185