Reciprocal knowledge use in the mining of semistructured data and HMM-based information extraction.

Authors :
Maruyama, Kohei
Uehara, Kuniaki
Source :
Electronics & Communications in Japan, Part 3: Fundamental Electronic Science; Jul2006, Vol. 89 Issue 7, p51-60, 10p, 2 Black and White Photographs, 8 Diagrams, 1 Chart, 2 Graphs
Publication Year :
2006

Abstract

In this paper we propose a method for the reciprocal use of knowledge between association rule extraction (a technique from data mining) and HMM-based information extraction. Association rule extraction is a method that is generally applied to structured data and is either inefficient or intractable when applied to data that lacks an accurate information schema. In this paper we use HMM-based information extraction to automatically recognize the attributes of each word in a set of textual data and assign tags in order to structure this data. In addition, we use the degree of association between words seen in the association rules extracted from the structured data to weight the HMM parameters in order to improve the recognition accuracy of the information extraction procedure and allow knowledge to be reciprocally shared between the data mining and information extraction procedures. Besides this we propose a data mining method that uses representative schema patterns found in different types of semistructured data such as tagged or BibTeX data using schema discovery methods. © 2006 Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 89(7): 51–60, 2006; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjc.20256 [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1042-0967
Volume :
89
Issue :
7
Database :
Complementary Index
Journal :
Electronics & Communications in Japan, Part 3: Fundamental Electronic Science
Publication Type :
Academic Journal
Accession number :
20080515
Full Text :
https://doi.org/10.1002/ecjc.20256