1. Analyzing Relatedness by Toponym Co-Occurrences on Web Pages
- Author
- Fahui Wang, Yong Gao, Yu Liu, Yongmei Lu, and Chaogui Kang
- Subjects
Distance decay, Spatial structure, Linkage (mechanical), Municipal level, Web information, Geography, Similarity (network science), Web page, General Earth and Planetary Sciences, Data mining, Spatial organization
- Abstract
This research proposes a method for capturing “relatedness between geographical entities” based on the co-occurrences of their names on web pages. The basic assumption is that a higher count of co-occurrences of two geographical places implies a stronger relatedness between them. The spatial structure of China at the provincial level is explored using the co-occurrences of two provincial units in one document, as retrieved by a web information retrieval engine. Analysis of the co-occurrences and topological distances between all pairs of provinces indicates that: (1) spatially close provinces generally have similar co-occurrence patterns; (2) the frequency of co-occurrences exhibits a power-law distance decay effect with an exponent of 0.2; and (3) the co-occurrence matrix can be used to capture the similarity/linkage between neighboring provinces and can be fed into a regionalization method to examine the spatial organization of China. The proposed method provides a promising approach to extracting valuable geographical information from massive collections of web pages.
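The power-law distance decay mentioned in finding (2) can be illustrated with a minimal sketch. It assumes the functional form count ≈ k · d^(−β), which the abstract does not spell out, and estimates β by ordinary least squares in log-log space. The function name `fit_power_law_decay` and the data are hypothetical and synthetic; they are not taken from the paper, and the fitting procedure is not necessarily the one the authors used.

```python
import math


def fit_power_law_decay(distances, counts):
    """Fit counts ~ k * distance**(-beta) by least squares on log-log values.

    Hypothetical helper for illustration; returns (beta, k).
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of log(count) vs log(distance)
    intercept = mean_y - slope * mean_x    # log(k)
    return -slope, math.exp(intercept)


# Synthetic data (assumption): topological distances of 1..6 steps and
# co-occurrence counts generated from an exact power law with beta = 0.2.
distances = [1, 2, 3, 4, 5, 6]
counts = [1000.0 * d ** -0.2 for d in distances]

beta, k = fit_power_law_decay(distances, counts)
print(f"estimated exponent beta = {beta:.3f}, scale k = {k:.1f}")
```

With real co-occurrence counts the points would scatter around the fitted line, so the estimate would only approximate the underlying exponent.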
- Published
- 2013