The identification of text variants by computer
- Source :
- Information Storage and Retrieval. 5:89-108
- Publication Year :
- 1969
- Publisher :
- Elsevier BV, 1969.
Abstract
- This paper is an exhaustive description of a working program for collating texts. The procedure identifies text variants ranging from minute differences in punctuation and spelling to additions and deletions of larger portions of text. The aim of the procedure is to bring lineation of all texts to be compared into conformity with that of the base-text and to print out all variants in the texts beneath the base-text. The program accepts input in running text form from magnetic tape and does not require any editing or reorganizing of data prior to processing. Comparisons are carried out on a line-for-line basis, and searches for new correspondences when the texts are out of phase extend over an interval of twenty lines on each side of the current line. The report also contains flowcharts and complete program listings, and fills a definite need in the area of variant identification since no other program written in a higher-level language has so far been made public. Originally written in FORTRAN 32 for the CDC 3000 series computers, the program is so designed as to make the conversion to another machine a realistic enterprise. Both the problem of operation of the system and possible adaptation to other machines are discussed from the viewpoint of the user, and some suggestions for further development are offered.
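The line-for-line comparison with a bounded resynchronization search described in the abstract can be illustrated with a minimal sketch. This is not the paper's FORTRAN program: the function name `collate`, the tuple labels, and the reading of "twenty lines each side" as a forward search of up to twenty lines in either text are all assumptions made for illustration.

```python
WINDOW = 20  # search interval from the abstract; how it is applied here is an assumption


def collate(base, other, window=WINDOW):
    """Compare `other` against `base` line by line (hypothetical helper).

    Returns a list of (kind, base_line, other_line) tuples, where kind is
    "same", "variant", "addition" (line only in `other`) or
    "deletion" (line only in `base`).
    """
    result = []
    i = j = 0
    while i < len(base) and j < len(other):
        if base[i] == other[j]:
            result.append(("same", base[i], other[j]))
            i += 1
            j += 1
            continue

        # Texts are out of phase: look for the nearest new correspondence
        # within `window` lines ahead in either text.
        resync = None
        for d in range(1, window + 1):
            if j + d < len(other) and other[j + d] == base[i]:
                resync = ("addition", d)  # `other` has d extra lines
                break
            if i + d < len(base) and base[i + d] == other[j]:
                resync = ("deletion", d)  # `other` lacks d base lines
                break

        if resync is None:
            # No correspondence nearby: treat the pair as a variant reading.
            result.append(("variant", base[i], other[j]))
            i += 1
            j += 1
        elif resync[0] == "addition":
            for k in range(resync[1]):
                result.append(("addition", None, other[j + k]))
            j += resync[1]
        else:
            for k in range(resync[1]):
                result.append(("deletion", base[i + k], None))
            i += resync[1]

    # Whatever remains after one text is exhausted is unmatched.
    result.extend(("deletion", line, None) for line in base[i:])
    result.extend(("addition", None, line) for line in other[j:])
    return result


if __name__ == "__main__":
    base_text = ["a", "b", "c", "d"]
    other_text = ["a", "x", "b", "C", "d"]
    for row in collate(base_text, other_text):
        print(row)
```

Exact string equality stands in for whatever matching criterion the original program used; a real collation would also need to report differences within a line, such as the punctuation and spelling variants the abstract mentions.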
- Subjects :
- Flowchart
Information retrieval
Computer science
business.industry
Fortran
media_common.quotation_subject
General Engineering
computer.software_genre
Punctuation
Spelling
law.invention
Identification (information)
law
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
Artificial intelligence
Line (text file)
business
Adaptation (computer science)
computer
Natural language processing
CDC 3000
media_common
computer.programming_language
Details
- ISSN :
- 00200271
- Volume :
- 5
- Database :
- OpenAIRE
- Journal :
- Information Storage and Retrieval
- Accession number :
- edsair.doi...........9c11264462e14c7a122cab977aaef325