Machine Learning Based Online Full-Chip Heatmap Estimation
- Source :
- ASP-DAC
- Publication Year :
- 2020
- Publisher :
- IEEE, 2020.
Abstract
- Runtime power and thermal control is crucial in any modern processor. However, these control schemes require accurate real-time temperature information, ideally of the entire die area, in order to be effective. On-chip temperature sensors alone cannot provide full-chip temperature information, since the number of sensors typically available is very limited due to their high area and power overheads. Furthermore, as we will demonstrate, the peak locations within hot-spots are not stationary and are highly workload dependent, making it difficult to rely on fixed temperature sensors alone. Therefore, we propose a novel approach to real-time estimation of full-chip transient heatmaps for commercial processors based on machine learning. The model derived in this work supplements the temperature data sensed by the existing on-chip sensors, allowing for the development of more robust runtime power and thermal control schemes that can take advantage of additional thermal information that is otherwise not available. The new approach involves offline acquisition of accurate spatial and temporal heatmaps using an infrared thermal imaging setup while nominal working conditions are maintained on the chip. To build the dynamic thermal model, we apply Long Short-Term Memory (LSTM) neural networks with system-level variables such as chip frequency, instruction counts, and other performance metrics as inputs. To reduce the dimensionality of the model, a 2D spatial discrete cosine transform (DCT) is first performed on the heatmaps so that they can be expressed with just their dominant DCT frequencies. Our study shows that only 6×6 DCT coefficients are required to maintain sufficient accuracy across a variety of workloads. Experimental results show that the proposed approach can estimate the full-chip heatmaps with a root-mean-square error of less than $1.4^{\circ}$C and takes only $\sim$19 ms per inference, which makes it well suited to real-time use.
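The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the 32×32 grid, the synthetic Gaussian hot-spot, and every function name below are assumptions made for illustration, and "dominant DCT frequencies" is assumed here to mean the lowest-frequency 6×6 block of coefficients. The sketch uses only the Python standard library and an orthonormal DCT-II basis.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def compress(heatmap, keep=6):
    """2D DCT of the heatmap, keeping only the top-left keep x keep block."""
    cr = dct_matrix(len(heatmap))       # transform along rows (height)
    cc = dct_matrix(len(heatmap[0]))    # transform along columns (width)
    coeffs = matmul(matmul(cr, heatmap), transpose(cc))
    return [row[:keep] for row in coeffs[:keep]]

def reconstruct(block, height, width):
    """Zero-pad the kept coefficients and apply the inverse 2D DCT."""
    padded = [[0.0] * width for _ in range(height)]
    for u, row in enumerate(block):
        for v, value in enumerate(row):
            padded[u][v] = value
    cr = dct_matrix(height)
    cc = dct_matrix(width)
    # The basis is orthonormal, so the inverse is the transpose.
    return matmul(matmul(transpose(cr), padded), cc)

def rmse(a, b):
    """Root-mean-square error between two equally sized 2D grids."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(sq) / len(sq))

def make_hotspot(height, width, cy, cx, sigma, base=45.0, peak=30.0):
    """Hypothetical heatmap in degrees C: a Gaussian hot-spot on a flat base."""
    return [[base + peak * math.exp(-((r - cy) ** 2 + (c - cx) ** 2)
                                    / (2.0 * sigma ** 2))
             for c in range(width)] for r in range(height)]

# Synthetic data only; real heatmaps would come from infrared imaging.
heatmap = make_hotspot(32, 32, cy=10, cx=20, sigma=6.0)
block = compress(heatmap, keep=6)
approx = reconstruct(block, 32, 32)
print(f"kept {len(block) * len(block[0])} of {32 * 32} coefficients, "
      f"RMSE = {rmse(heatmap, approx):.3f} degC")
```

In a pipeline of the kind the abstract outlines, a sequence model would presumably predict the 36 retained coefficients from the system-level inputs, and a step like `reconstruct` would turn each prediction back into a full heatmap. The error printed here reflects only the synthetic hot-spot and says nothing about the accuracy reported in the paper.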
- Subjects :
- 020203 distributed computing
Neural network
Computer science
business.industry
Inference
02 engineering and technology
Machine learning
computer.software_genre
Chip
Die (integrated circuit)
020202 computer hardware & architecture
Power (physics)
0202 electrical engineering, electronic engineering, information engineering
Discrete cosine transform
Transient (computer programming)
Artificial intelligence
business
computer
Curse of dimensionality
Details
- Database :
- OpenAIRE
- Journal :
- 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC)
- Accession number :
- edsair.doi...........1f893ad849a6710a76e7f86e80877805
- Full Text :
- https://doi.org/10.1109/asp-dac47756.2020.9045204