DIRAQ: scalable in situ data- and resource-aware indexing for optimized query performance.

Authors :: Lakshminarasimhan, Sriram
Zou, Xiaocheng
Boyuka, David
Pendse, Saurabh
Jenkins, John
Vishwanath, Venkatram
Papka, Michael
Klasky, Scott
Samatova, Nagiza
Source :: Cluster Computing. Dec 2014, Vol. 17, Issue 4, pp. 1101-1119. 19 pp.
Publication Year :: 2014
Abstract: Scientific data analytics in high-performance computing environments has been evolving along with the advancement of computing capabilities. With the onset of exascale computing, the increasing gap between compute performance and I/O bandwidth has rendered the traditional post-simulation processing a tedious process. Despite the challenges due to increased data production, there exists an opportunity to benefit from 'cheap' computing power to perform query-driven exploration and visualization during simulation time. To accelerate such analyses, applications traditionally augment, post-simulation, raw data with large indexes, which are then repeatedly utilized for data exploration. However, the generation of current state-of-the-art indexes involves compute- and memory-intensive processing, thus rendering them inapplicable in an in situ context. In this paper we propose DIRAQ, a parallel in situ, in-network data encoding and reorganization technique that enables the transformation of simulation output into a query-efficient form, with negligible runtime overhead to the simulation run. DIRAQ's effective core-local, precision-based encoding approach incorporates an embedded compressed index that is 3-6× smaller than current state-of-the-art indexing schemes. Its data-aware index adjustment improves performance of group-level index layout creation by up to 35% and reduces the size of the generated index by up to 27%. Moreover, DIRAQ's in-network index merging strategy enables the creation of aggregated indexes that speed up spatial-context query responses by up to 10× versus alternative techniques. DIRAQ's topology-, data-, and memory-aware aggregation strategy results in efficient I/O and yields overall end-to-end encoding and I/O time that is less than that required to write the raw data with MPI collective I/O. [ABSTRACT FROM AUTHOR]

Subjects :: *VIRTUAL reality
*DATA analytics
*BANDWIDTHS
*PARALLEL computers
*MULTICORE processors
*QUERY languages (Computer science)
*POWER aware computing

Details

Language :: English
ISSN :: 13867857
Volume :: 17
Issue :: 4
Database :: Academic Search Index
Journal :: Cluster Computing
Publication Type :: Academic Journal
Accession number :: 99453128
Full Text :: https://doi.org/10.1007/s10586-014-0358-z
