The identification of text variants by computer

Authors :
Georgette Silva
Harold Love
Source :
Information Storage and Retrieval. 5:89-108
Publication Year :
1969
Publisher :
Elsevier BV, 1969.

Abstract

This paper is an exhaustive description of a working program for collating texts. The procedure identifies text variants ranging from minute differences in punctuation and spelling to additions and deletions of larger portions of text. The aim of the procedure is to bring the lineation of all texts to be compared into conformity with that of the base-text and to print out all variants in the texts beneath the base-text. The program accepts input in running text form from magnetic tape and does not require any editing or reorganizing of data prior to processing. Comparisons are carried out on a line-for-line basis, and searches for new correspondences when the texts are out of phase extend over an interval of twenty lines on each side of the current line. The report also contains flowcharts and complete program listings, and fills a definite need in the area of variant identification, since no other program written in a higher-level language has so far been made public. Originally written in FORTRAN 32 for the CDC 3000 series computers, the program is designed so as to make conversion to another machine a realistic enterprise. Both the operation of the system and its possible adaptation to other machines are discussed from the viewpoint of the user, and some suggestions for further development are offered.
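
The line-for-line comparison with a bounded search for new correspondences can be illustrated with a minimal sketch. It is not the authors' FORTRAN program, which the abstract does not reproduce. The function name `collate`, the variant labels, and the choice to search only forward in each text are assumptions made for illustration; only the twenty-line interval comes from the abstract.

```python
from typing import List, Tuple

WINDOW = 20  # search interval mentioned in the abstract

# (kind, base line index, text); the labels are hypothetical, not from the paper
Variant = Tuple[str, int, str]


def collate(base: List[str], other: List[str], window: int = WINDOW) -> List[Variant]:
    """Compare `other` against `base` line by line and list the variants.

    Assumption: when the texts fall out of phase, look ahead up to `window`
    lines in either text for the nearest line at which they agree again.
    """
    variants: List[Variant] = []
    i = j = 0
    while i < len(base) and j < len(other):
        if base[i] == other[j]:
            i += 1
            j += 1
            continue
        resync = None
        for d in range(1, window + 1):
            # Extra lines in `other`: base[i] reappears d lines further on.
            if j + d < len(other) and other[j + d] == base[i]:
                resync = ("addition", d)
                break
            # Missing lines in `other`: other[j] matches base d lines further on.
            if i + d < len(base) and base[i + d] == other[j]:
                resync = ("deletion", d)
                break
        if resync is None:
            # No correspondence in the window: record a variant reading.
            variants.append(("variant", i, other[j]))
            i += 1
            j += 1
        elif resync[0] == "addition":
            for k in range(resync[1]):
                variants.append(("addition", i, other[j + k]))
            j += resync[1]
        else:
            for k in range(resync[1]):
                variants.append(("deletion", i + k, base[i + k]))
            i += resync[1]
    # Whatever remains after one text ends is a trailing deletion or addition.
    for k in range(i, len(base)):
        variants.append(("deletion", k, base[k]))
    for k in range(j, len(other)):
        variants.append(("addition", len(base), other[k]))
    return variants


if __name__ == "__main__":
    base_text = ["a", "b", "c", "d"]
    witness = ["a", "x", "b", "C", "d"]
    for variant in collate(base_text, witness):
        print(variant)
```

Reporting each variant against a base line index mirrors the stated aim of keeping every text aligned with the lineation of the base-text; exact string equality stands in here for whatever comparison the original program performs within a line.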

Details

ISSN :
0020-0271
Volume :
5
Database :
OpenAIRE
Journal :
Information Storage and Retrieval
Accession number :
edsair.doi...........9c11264462e14c7a122cab977aaef325