1. Efficiently Subtree Matching between XML and Probabilistic XML Documents
- Author
-
Chang Yong Yu, Miao Fang, Hai Tao Ma, and Chang Ming Xu
- Subjects
Document Structure Description ,XML tree ,Uncertain data ,Computer science ,computer.internet_protocol ,Computer Science::Information Retrieval ,XML validation ,General Medicine ,Similarity measure ,computer.software_genre ,Tree (data structure) ,XML database ,Simple API for XML ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Edit distance ,Data mining ,XML schema ,computer ,Computer Science::Databases ,XML ,computer.programming_language - Abstract
We explored the subtree matching problem of probabilistic XML documents: finding the matches of an XML query tree over a probabilistic XML document, using the canonical tree edit distance as a similarity measure between subtrees. Probabilistic XML is a probability distribution model capturing uncertainty of both value and structure. Query over probabilistic XML documents is difficult: an naivie algorithm has exponential complexity by directly compute the tree edit distance between the query tree and each certain XML tree represented by the probabilistic XML document. Based on the method of tree edit distance computation over certain XML subtrees, we defined a minimum-solution to the edit distance computation, which means the minimum cost to translate the query tree to the probabilistic XML tree. Furthermore, we developed an algorithm---ASM (Algorithm of Subtree Matching) to compute the minimum solution. Finally, we proved the complexity of ASM is linear in the size of the probabilistic XML document. more...
- Published
- 2014
- Full Text
- View/download PDF