Pipit: Enabling programmatic analysis of parallel execution traces

Authors :
Bhatele, Abhinav
Dhakal, Rakrish
Movsesyan, Alexander
Ranjan, Aditya
Marry, Jordan
Cankur, Onur
Publication Year :
2023

Abstract

Performance analysis is an important part of the oft-repeated, iterative process of performance tuning during the development of parallel programs. Per-process, per-thread traces (detailed logs of events with timestamps) enable in-depth analysis of parallel program execution to identify various kinds of performance issues. Trace collection tools often provide a graphical tool to analyze the trace output. However, these GUI-based tools support only specific file formats, are difficult to scale when the data is large, limit data exploration to the implemented graphical views, and do not support automated comparisons of two or more datasets. In this paper, we present a programmatic approach to analyzing parallel execution traces by leveraging pandas, a powerful Python-based data analysis library. We have developed a Python library, Pipit, on top of pandas that can read traces in different file formats (OTF2, HPCToolkit, Projections, Nsight, etc.) and provide a uniform data structure in the form of a pandas DataFrame. Pipit provides operations to aggregate, filter, and transform the events in a trace to present the data in different ways. We also provide several functions to quickly identify performance issues in parallel executions.
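
The idea of holding trace events in a pandas DataFrame and then transforming, aggregating, and filtering them can be illustrated with plain pandas. The following is a minimal sketch, not Pipit's API: the column names (`timestamp_ns`, `event_type`, `name`, `process`), the toy data, and the helper `inclusive_times` are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical toy trace with one row per event. The column names are
# illustrative assumptions, not Pipit's documented schema.
events = pd.DataFrame(
    {
        "timestamp_ns": [0, 10, 40, 50, 0, 20, 70, 90],
        "event_type": ["Enter", "Enter", "Leave", "Leave"] * 2,
        "name": [
            "main", "MPI_Send", "MPI_Send", "main",
            "main", "MPI_Recv", "MPI_Recv", "main",
        ],
        "process": [0, 0, 0, 0, 1, 1, 1, 1],
    }
)


def inclusive_times(trace: pd.DataFrame) -> pd.DataFrame:
    """Pair each Enter with its matching Leave; return one row per call."""
    calls = []
    ordered = trace.sort_values("timestamp_ns", kind="stable")
    for process, group in ordered.groupby("process"):
        stack = []  # Enter events still waiting for their Leave
        for row in group.itertuples(index=False):
            if row.event_type == "Enter":
                stack.append(row)
            elif row.event_type == "Leave" and stack:
                enter = stack.pop()
                calls.append(
                    {
                        "process": process,
                        "name": enter.name,
                        "inclusive_ns": row.timestamp_ns - enter.timestamp_ns,
                    }
                )
    return pd.DataFrame(calls)


# Transform: raw events -> one row per function call.
calls = inclusive_times(events)

# Aggregate: total inclusive time per function across all processes.
totals = calls.groupby("name")["inclusive_ns"].sum().sort_values(ascending=False)

# Filter: keep only the communication calls.
mpi_calls = calls[calls["name"].str.startswith("MPI_")]

print(totals)
```

Because the result is an ordinary DataFrame, comparing two runs reduces to computing `totals` for each trace and subtracting, which is the kind of automated comparison the GUI tools described above do not offer.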

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....5eac6a2806e5b2f7eaa886c6ff06b042