1. GetItFull - A Tool for Downloading and Pre-processing Full-Text Journal Articles.
- Author
Hakenberg, Jörg, Han, Eui-Hong (Sam), Berrar, Daniel, Natarajan, Jeyakumar, Haines, Cliff, Berglund, Brian, DeSesa, Catherine, Hack, Catherine J., Dubitzky, Werner, and Bremer, Eric G.
- Abstract
Automated analysis of full-text life science research articles and technical documents is becoming increasingly important. Compared with abstracts, full text is considerably more complex to access and process. GetItFull is a tool for downloading and pre-processing full-text journal articles. GetItFull automatically connects to a journal's Web site, downloads the journal content and performs various commonly used pre-processing steps. The output comprises a structured XML document for each article, with tags identifying the various sections and journal information. The output may then be used as the basis for text mining applications or exported to a database for further processing. [ABSTRACT FROM AUTHOR]
- Published
- 2006
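
The abstract describes the output as one structured XML document per article, with tags for the sections and the journal information. The record does not give the actual schema, so the following is only a minimal sketch of what such a document could look like. The function name `build_article_xml` and every element and attribute name (`article`, `journal-info`, `body`, `section`, `type`) are assumptions made for illustration, not GetItFull's real format, and the whitespace clean-up stands in for the unspecified pre-processing steps.

```python
import xml.etree.ElementTree as ET


def build_article_xml(journal, title, year, sections):
    """Return an XML string with journal metadata and one element per section.

    Hypothetical layout: all tag and attribute names are illustrative
    assumptions, not the schema used by GetItFull.
    """
    article = ET.Element("article")

    # Journal information goes into its own block of tags.
    meta = ET.SubElement(article, "journal-info")
    ET.SubElement(meta, "journal").text = journal
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "year").text = str(year)

    # Each article section becomes a tagged element inside <body>.
    body = ET.SubElement(article, "body")
    for name, text in sections:
        section = ET.SubElement(body, "section", {"type": name})
        # Collapse runs of whitespace, a stand-in for real pre-processing.
        section.text = " ".join(text.split())

    return ET.tostring(article, encoding="unicode")


if __name__ == "__main__":
    print(build_article_xml(
        "Example Journal",
        "Example Title",
        2006,
        [("introduction", "  Some   text\n here. "), ("methods", "More text.")],
    ))
```

A document in this shape can be parsed back with `ET.fromstring` and queried by section type, which is the kind of downstream use the abstract mentions for text mining or database export.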