Searching Vocalized/Unvocalized Arabic Texts Using an Improved Coding Schema.

Authors :
IBRAHIM, FARID
ATOUM, JALAL OMER
Source :
International Journal of Computer Processing of Languages. Dec 2009, Vol. 22, Issue 4, pp. 269-283. 15 p. 3 Diagrams, 2 Charts, 1 Graph.
Publication Year :
2009

Abstract

Searching for a pattern in an Arabic text raises several problems because vocalization characters are attached to the alphabetical letters of Arabic words. Existing search algorithms handle this poorly: they either fail to find all partial matches of a pattern or suffer performance degradation when they are simply modified to ignore the vocalization characters. This paper presents a new coding schema for Arabic vocalization characters that facilitates searching for vocalized and unvocalized patterns in any Arabic text (vocalized or unvocalized) and improves search performance. The schema is based on repositioning the vocalization characters at the end of each word. We present the coding and decoding algorithms needed to support the new schema. In addition, we explain the modifications to the Boyer-Moore algorithm that take advantage of the improved coding schema, together with the complexity analysis. [ABSTRACT FROM AUTHOR]
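
The abstract does not give the coding details, so the following is only a minimal sketch of the general idea of moving vocalization characters to the end of each word. It is not the authors' schema. The separator and slot code points, the function names, and the per-letter slot layout are all assumptions made for illustration, and Python's built-in substring test stands in for the modified Boyer-Moore search.

```python
# Illustrative sketch only: NOT the coding schema from the paper.
# Arabic vocalization marks fathatan..sukun, U+064B..U+0652.
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}

# Hypothetical private-use code points (assumed absent from the input text).
SEP = "\uE000"   # separates base letters from the repositioned marks
SLOT = "\uE001"  # separates the mark groups, one group per base letter


def encode_word(word: str) -> str:
    """Move every mark to the end of the word, one slot per base letter."""
    base, groups = [], []
    for ch in word:
        if ch in DIACRITICS and base:
            groups[-1] += ch          # mark belongs to the preceding letter
        else:
            base.append(ch)           # base letter opens a new (empty) slot
            groups.append("")
    return "".join(base) + SEP + SLOT.join(groups)


def decode_word(coded: str) -> str:
    """Restore the original word by re-attaching each slot to its letter."""
    base, _, tail = coded.partition(SEP)
    groups = tail.split(SLOT)
    return "".join(letter + marks for letter, marks in zip(base, groups))


def encode_text(text: str) -> str:
    """Encode each whitespace-separated word independently."""
    return " ".join(encode_word(w) for w in text.split())


def strip_marks(pattern: str) -> str:
    """Drop vocalization marks from a search pattern."""
    return "".join(ch for ch in pattern if ch not in DIACRITICS)


def find_unvocalized(coded_text: str, pattern: str) -> list[int]:
    """Return indices of words whose base letters contain the pattern.

    The base letters of each coded word are contiguous, so an ordinary
    substring search works without skipping marks inside the word.
    """
    target = strip_marks(pattern)
    hits = []
    for i, coded in enumerate(coded_text.split(" ")):
        base = coded.partition(SEP)[0]
        if target in base:
            hits.append(i)
    return hits
```

In this sketch, because the base letters of a coded word are stored contiguously, an unvocalized pattern can be matched against them directly, while the trailing slots keep enough information to decode the word back to its vocalized form.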

Details

Language :
English
ISSN :
17938406
Volume :
22
Issue :
4
Database :
Academic Search Index
Journal :
International Journal of Computer Processing of Languages
Publication Type :
Academic Journal
Accession number :
83369454
Full Text :
https://doi.org/10.1142/S1793840609002135