1. DETECTING SPATIAL PATTERNS OF NATURAL HAZARDS FROM THE WIKIPEDIA KNOWLEDGE BASE
- Author
- Junchuan Fan and Kathleen Stewart
- Subjects
Natural hazard, Volunteered geographic information, Topic model, Latent Dirichlet allocation, Knowledge base, Database dump, Information retrieval, Data mining, Map algebra, Computer science
- Abstract
The Wikipedia database is a data source of immense richness and variety. Included in this database are thousands of geotagged articles, including, for example, near real-time updates on current and historic natural hazards. This includes user-contributed information about the location of natural hazards, the extent of the disasters, and many details relating to response, impact, and recovery. In this research, a computational framework is proposed to detect spatial patterns of natural hazards from the Wikipedia database by combining topic modeling methods with spatial analysis techniques. The computation is performed on the Neon Cluster, a high-performance computing cluster at the University of Iowa. This work uses wildfires as the exemplar hazard, but the framework is easily generalizable to other types of hazards, such as hurricanes or flooding. A Latent Dirichlet Allocation (LDA) model is first trained on the entire English Wikipedia dump, transforming the database dump into a 500-dimension topic model. Over 230,000 geo-tagged articles are then extracted from the Wikipedia database, spatially covering the contiguous United States. The geo-tagged articles are converted into the LDA topic space based on the topic model, with each article represented as a weighted multidimensional topic vector. By treating each article's topic vector as an observed point in geographic space, a probability surface is calculated for each of the topics. In this work, Wikipedia articles about wildfires are extracted from the Wikipedia database, forming a wildfire corpus and creating a basis for the topic vector analysis. The spatial distribution of wildfire outbreaks in the US is estimated by calculating the weighted sum of the topic probability surfaces using a map algebra approach, and mapped using GIS. To provide an evaluation of the approach, the estimation is compared to wildfire hazard potential maps created by the USDA Forest Service.
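The core spatial step described above (a probability surface per topic, combined as a weighted sum via map algebra) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the density-estimation method, so a simple Gaussian kernel is assumed here, and all function names (`topic_probability_surface`, `hazard_estimate`), the toy coordinates, and the topic weights are hypothetical stand-ins for the paper's 500-dimension LDA output over Wikipedia articles.

```python
import numpy as np

def topic_probability_surface(points, weights, grid_x, grid_y, bandwidth=1.0):
    """One topic's probability surface over a raster grid.

    Each article contributes a Gaussian kernel at its location, scaled by
    that article's proportion of the topic (an assumed, simplified scheme).
    The surface is normalized so its cells sum to 1.
    """
    xx, yy = np.meshgrid(grid_x, grid_y)          # raster of cell centers
    surface = np.zeros_like(xx, dtype=float)
    for (px, py), w in zip(points, weights):
        surface += w * np.exp(-((xx - px) ** 2 + (yy - py) ** 2)
                              / (2.0 * bandwidth ** 2))
    total = surface.sum()
    return surface / total if total > 0 else surface

def hazard_estimate(points, topic_matrix, topic_weights, grid_x, grid_y):
    """Map-algebra combination: weighted sum of per-topic surfaces.

    topic_matrix[i, k] is article i's proportion of topic k; topic_weights
    would come from the hazard (e.g. wildfire) corpus in the paper's setup.
    """
    estimate = np.zeros((len(grid_y), len(grid_x)))
    for k, w in enumerate(topic_weights):
        estimate += w * topic_probability_surface(
            points, topic_matrix[:, k], grid_x, grid_y)
    return estimate

# Toy demo with synthetic "article" locations and topic vectors
# (illustrative stand-ins, not the Wikipedia data used in the paper).
rng = np.random.default_rng(0)
points = rng.uniform(0, 10, size=(20, 2))      # 20 geo-tagged articles
topics = rng.dirichlet(np.ones(3), size=20)    # 3-topic vector per article
gx, gy = np.linspace(0, 10, 50), np.linspace(0, 10, 40)
estimate = hazard_estimate(points, topics, [0.5, 0.3, 0.2], gx, gy)
```

Because each per-topic surface is normalized, the combined raster sums to the total of the topic weights, which makes the weighted map-algebra sum directly interpretable as a probability surface when the weights sum to one.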
- Published
- 2015