Minersoft: Software Retrieval in Grid and Cloud Computing Infrastructures.
- Source :
- ACM Transactions on Internet Technology; 2012, Vol. 12 Issue 1, p1-34, 34p
- Publication Year :
- 2012
Abstract
- One of the main goals of Cloud and Grid infrastructures is to make their services easily accessible and attractive to end-users. In this article we investigate the problem of supporting keyword-based searching for the discovery of software files that are installed on the nodes of large-scale, federated Grid and Cloud computing infrastructures. We address a number of challenges that arise from the unstructured nature of software and the unavailability of software-related metadata on large-scale networked environments. We present Minersoft, a harvester that visits Grid/Cloud infrastructures, crawls their file systems, identifies and classifies software files, and discovers implicit associations between them. The results of Minersoft harvesting are encoded in a weighted, typed graph, called the Software Graph. A number of information retrieval (IR) algorithms are used to enrich this graph with structural and content associations, to annotate software files with keywords and build inverted indexes to support keyword-based searching for software. Using a real testbed, we present an evaluation study of our approach, using data extracted from production-quality Grid and Cloud computing infrastructures. Experimental results show that Minersoft is a powerful tool for software search and discovery. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 1533-5399
- Volume :
- 12
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- ACM Transactions on Internet Technology
- Publication Type :
- Academic Journal
- Accession number :
- 78148751
- Full Text :
- https://doi.org/10.1145/2220352.2220354