Hierarchical subtopic segmentation of web document

Authors :
Gong Ling
Zhang Yun-tao
Wang Yongcheng
Source :
Wuhan University Journal of Natural Sciences. 11:47-50
Publication Year :
2006
Publisher :
Springer Science and Business Media LLC, 2006.

Abstract

The paper proposes a novel method for subtopic segmentation of Web documents. Subtopic segmentation can improve retrieval results. The proposed method segments subtopics hierarchically and identifies the boundary of each subtopic. Based on the term frequency matrix, the method measures the similarity between adjacent blocks, such as paragraphs or passages. In an experiment on real-world samples, the macro-averaged precision and recall reach 73.4% and 82.5%, and the micro-averaged precision and recall reach 72.9% and 83.1%. Moreover, the method is equally efficient for other Asian languages, such as Japanese and Korean, as well as Western languages.
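
The abstract does not say which similarity measure or boundary rule the authors use. As a rough illustration of the general idea only, the sketch below builds term-frequency vectors for each block, compares adjacent blocks with cosine similarity, and marks a boundary wherever the similarity drops below a cutoff. The cosine measure, the whitespace tokenizer, the `threshold` value, and all function names are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def term_frequencies(block):
    """Count lowercase, whitespace-separated tokens in one text block."""
    return Counter(block.lower().split())


def cosine_similarity(tf_a, tf_b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty block shares no terms with anything
    return dot / (norm_a * norm_b)


def adjacent_similarities(blocks):
    """Similarity of each block to the block that follows it."""
    vectors = [term_frequencies(b) for b in blocks]
    return [
        cosine_similarity(vectors[i], vectors[i + 1])
        for i in range(len(vectors) - 1)
    ]


def find_boundaries(blocks, threshold=0.2):
    """Return each index i where a boundary falls between blocks[i] and blocks[i + 1].

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    sims = adjacent_similarities(blocks)
    return [i for i, s in enumerate(sims) if s < threshold]


blocks = [
    "cats purr cats sleep",
    "cats sleep all day",
    "stock prices fell sharply",
    "stock markets fell today",
]
print(find_boundaries(blocks))  # [1]: the topic changes after the second block
```

One way to get a hierarchy out of such a scheme would be to rerun it with a stricter cutoff inside each segment, although whether the paper does anything similar cannot be told from the abstract.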

Details

ISSN :
1993-4998 and 1007-1202
Volume :
11
Database :
OpenAIRE
Journal :
Wuhan University Journal of Natural Sciences
Accession number :
edsair.doi...........899b8f01db80840481847afa97ebb927