Author: "Vanderbauwhede, Wim" / Publication Type: Academic Journals / Topic: field programmable gate arrays - Searchworks@Jio Institute Digital Library Search Results

Your search for "Vanderbauwhede, Wim" returned 3 results


1. Type-Driven Automated Program Transformations and Cost Modelling for Optimising Streaming Programs on FPGAs.

Author: Vanderbauwhede, Wim, Nabi, Syed Waqar, and Urlea, Cristian
Subjects: *FIELD programmable gate arrays, *SCIENTIFIC computing, *HIGH performance computing, *LOGIC circuits, *DATA flow computing
Abstract: In this paper we present a novel approach to program optimisation based on compiler-based type-driven program transformations and a fast and accurate cost/performance model for the target architecture. We target streaming programs for the problem domain of scientific computing, such as numerical weather prediction. We present our theoretical framework for type-driven program transformation, our target high-level language and intermediate representation languages, and the cost model, and demonstrate the effectiveness of our approach by comparison with a commercial toolchain. [ABSTRACT FROM AUTHOR]
Published: 2019

2. An analysis of the feasibility and benefits of GPU/multicore acceleration of the Weather Research and Forecasting model.

Author: Vanderbauwhede, Wim and Takemi, Tetsuya
Subjects: FEASIBILITY studies, GRAPHICS processing units, MULTICORE processors, WEATHER forecasting, FIELD programmable gate arrays
Abstract: There is a growing need for ever more accurate climate and weather simulations to be delivered in shorter timescales, in particular, to guard against severe weather events such as hurricanes and heavy rainfall. Due to climate change, the severity and frequency of such events - and thus the economic impact - are set to rise dramatically. Hardware acceleration using graphics processing units (GPUs) or Field-Programmable Gate Arrays (FPGAs) could potentially result in much reduced run times or higher accuracy simulations. In this paper, we present the results of a study of the Weather Research and Forecasting (WRF) model undertaken in order to assess if GPU and multicore acceleration of this type of numerical weather prediction (NWP) code is both feasible and worthwhile. The focus of this paper is on acceleration of code running on a single compute node through offloading of parts of the code to an accelerator such as a GPU. The governing equations set of the WRF model is based on the compressible, non-hydrostatic atmospheric motion with multi-physics processes. We put this work into context by discussing its more general applicability to multi-physics fluid dynamics codes: in many fluid dynamics codes, the numerical schemes of the advection terms are based on finite differences between neighboring cells, similar to the WRF code. For fluid systems including multi-physics processes, there are many calls to these advection routines. This class of numerical codes will benefit from hardware acceleration. We studied the performance of the original code of the WRF model and proposed a simple model for comparing multicore CPU and GPU performance. Based on the results of extensive profiling of representative WRF runs, we focused on the acceleration of the scalar advection module. We discuss the implementation of this module as a data-parallel kernel in both OpenCL and OpenMP. 
We show that our data-parallel kernel version of the scalar advection module runs up to seven times faster on the GPU compared with the original code on the CPU. However, as the data transfer cost between GPU and CPU is very high (as shown by our analysis), there is only a small speed-up (two times) for the fully integrated code. We show that it would be possible to offset the data transfer cost through GPU acceleration of a larger portion of the dynamics code. In order to carry out this research, we also developed an extensible software system for integrating OpenCL code into large Fortran code bases such as WRF. This is one of the main contributions of our work. We discuss the system to show how it allows the replacement of the sections of the original codebase with their OpenCL counterparts with minimal changes - literally only a few lines - to the original code. Our final assessment is that, even with the current system architectures, accelerating WRF - and hence also other, similar types of multi-physics fluid dynamics codes - with a factor of up to five times is definitely an achievable goal. Accelerating multi-physics fluid dynamics codes including NWP codes is vital for its application to weather forecasting, environmental pollution warning, and emergency response to the dispersion of hazardous materials. Implementing hardware acceleration capability for fluid dynamics and NWP codes is a prerequisite for up-to-date and future computer architectures. Copyright © 2015 John Wiley & Sons, Ltd. [ABSTRACT FROM AUTHOR]
Published: 2016

3. Design and Evaluation of High-Performance Processing Elements for Reconfigurable Systems.

Author: Purohit, Sohan S., Chalamalasetti, Sai Rahul, Margala, Martin, and Vanderbauwhede, Wim A.
Subjects: ADAPTIVE computing systems, SYSTEMS design, PERFORMANCE evaluation, DYNAMICAL systems, ENERGY consumption, FIELD programmable gate arrays
Abstract: In this paper, we present the design and evaluation of two new processing elements for reconfigurable computing. We also present a circuit-level implementation of the data paths in static and dynamic design styles to explore the various performance-power tradeoffs involved. When implemented in IBM 90-nm CMOS process, the 8-b data paths achieve operating frequencies ranging over 1 GHz both for static and dynamic implementations, with each data path supporting single-cycle computational capability. A novel single-precision floating point processing element (FPPE) using a 24-b variant of the proposed data paths is also presented. The full dynamic implementation of the FPPE shows that it operates at a frequency of 1 GHz with 6.5-mW average power consumption. Comparison with competing architectures shows that the FPPE provides two orders of magnitude higher throughput. Furthermore, to evaluate its feasibility as a soft-processing solution, we also map the floating point unit onto the Virtex 4 and 5 devices, and observe that the unit requires less than 1% of the total logic slices, while utilizing only around 4% of the DSP blocks available. When compared against popular field-programmable-gate-array-based floating point units, our design on Virtex 5 showed significantly lower resource utilization, while achieving comparable peak operating frequency. [ABSTRACT FROM PUBLISHER]
Published: 2013
