Machine Learning Based Online Full-Chip Heatmap Estimation

Authors :
Yue Zhao
Sheldon X.-D. Tan
Jinwei Zhang
Jorg Henkel
Hussam Amrouch
Sheriff Sadiqbatcha
Source :
ASP-DAC
Publication Year :
2020
Publisher :
IEEE, 2020.

Abstract

Runtime power and thermal control is crucial in any modern processor. However, these control schemes require accurate real-time temperature information, ideally of the entire die area, in order to be effective. On-chip temperature sensors alone cannot provide the full-chip temperature information, since the number of sensors that are typically available is very limited due to their high area and power overheads. Furthermore, as we will demonstrate, the peak locations within hot-spots are not stationary and are very workload dependent, making it difficult to rely on fixed temperature sensors alone. Therefore, we propose a novel approach to real-time estimation of full-chip transient heatmaps for commercial processors based on machine learning. The model derived in this work supplements the temperature data sensed from the existing on-chip sensors, allowing for the development of more robust runtime power and thermal control schemes that can take advantage of the additional thermal information that is otherwise not available. The new approach involves offline acquisition of accurate spatial and temporal heatmaps using an infrared thermal imaging setup while nominal working conditions are maintained on the chip. To build the dynamic thermal model, we apply Long Short-Term Memory (LSTM) neural networks with system-level variables such as chip frequency, instruction counts, and other performance metrics as inputs. To reduce the dimensionality of the model, a 2D spatial discrete cosine transform (DCT) is first performed on the heatmaps so that they can be expressed with just their dominant DCT frequencies. Our study shows that only 6×6 DCT coefficients are required to maintain sufficient accuracy across a variety of workloads. Experimental results show that the proposed approach can estimate the full-chip heatmaps with less than 1.4 °C root-mean-square error and takes only about 19 ms per inference, which makes it well suited to real-time use.
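
The DCT-based dimensionality reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (dct_basis, compress, reconstruct, rmse), the 64×64 grid, and the synthetic Gaussian hot-spot are all assumptions made for illustration, and no measured infrared data is used. The sketch keeps only the lowest 6×6 block of orthonormal DCT-II coefficients and rebuilds the heatmap from them.

```python
import numpy as np


def dct_basis(n, k):
    """First k rows of the orthonormal DCT-II matrix for length-n signals."""
    idx = np.arange(n)
    freq = np.arange(k)[:, None]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (idx + 0.5) * freq / n)
    basis[0] /= np.sqrt(2.0)  # DC row needs the extra 1/sqrt(2) scaling
    return basis


def compress(heatmap, k=6):
    """Project a 2D heatmap onto its k x k lowest-frequency DCT coefficients."""
    rows, cols = heatmap.shape
    return dct_basis(rows, k) @ heatmap @ dct_basis(cols, k).T


def reconstruct(coeffs, shape):
    """Rebuild a heatmap of the given shape from truncated DCT coefficients."""
    rows, cols = shape
    kr, kc = coeffs.shape
    return dct_basis(rows, kr).T @ coeffs @ dct_basis(cols, kc)


def rmse(a, b):
    """Root-mean-square error between two arrays of the same shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Hypothetical stand-in for a measured heatmap: one smooth hot-spot (in deg C)
# on a 64 x 64 grid. Real heatmaps would come from the infrared camera setup.
y, x = np.mgrid[0:64, 0:64]
heatmap = 45.0 + 20.0 * np.exp(-((x - 40) ** 2 + (y - 20) ** 2) / (2 * 12.0 ** 2))

coeffs = compress(heatmap, k=6)               # 36 numbers instead of 4096
approx = reconstruct(coeffs, heatmap.shape)   # full-resolution estimate
print(coeffs.shape, rmse(heatmap, approx))
```

In a pipeline of the kind the abstract outlines, a sequence model would predict the 36 coefficients from system-level inputs, and a reconstruction step like the one above would turn each prediction back into a full-chip heatmap. Because the basis is orthonormal, the reconstruction error equals the energy of the discarded coefficients, so keeping more coefficients never increases it.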

Details

Database :
OpenAIRE
Journal :
2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)
Accession number :
edsair.doi...........1f893ad849a6710a76e7f86e80877805
Full Text :
https://doi.org/10.1109/asp-dac47756.2020.9045204