On-line detection of large-scale parallel application's structure

Authors :
Harald Servat
Juan Antonio Rodríguez González
Jesús Labarta
Juan Gonzalez-Garcia
Germán Llort
Judit Gimenez
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Source :
Recercat. Dipòsit de la Recerca de Catalunya, Universitat Jaume I, UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC), IPDPS
Publisher :
Institute of Electrical and Electronics Engineers (IEEE)

Abstract

With ever larger systems being deployed, trace-based performance analysis of parallel applications has become a daunting task. Even if the amount of performance data gathered per process is small, traces rapidly become unmanageable when the information collected from all processes is merged. In general, an efficient analysis of such a large volume of data requires a prior filtering step that directs the analyst's attention towards what is meaningful for understanding the observed application behavior. Furthermore, the iterative nature of most scientific applications usually ends up producing repetitive information. Discarding irrelevant data aims at reducing both the size of traces and the time required to perform the analysis and deliver results. In this paper, we present an on-line analysis framework that relies on clustering techniques to intelligently select the most relevant information for understanding how the application behaves, while keeping the trace volume at a reasonable size.

Details

Database :
OpenAIRE
Journal :
Recercat. Dipòsit de la Recerca de Catalunya, Universitat Jaume I, UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC), IPDPS
Accession number :
edsair.doi.dedup.....07c9b8f20f170852f2dfd5aaa474b937