Abstract
Vector instructions of modern CPUs are crucially important for the performance of compute-intensive algorithms. Auto-vectorisation often fails because of an unfortunate choice of data layout by the programmer. This paper proposes a data layout inference for auto-vectorisation that identifies layout transformations converting data-structure layouts that are unfavourable for single instruction, multiple data (SIMD) execution into favourable ones. We present a type system for layout transformations, and we sketch an inference algorithm for it. Finally, we present some initial performance figures for the impact of the inferred layout transformations. They show that non-intuitive layouts inferred by our system can have a large performance impact on compute-intensive programs.
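To make the notion of a layout transformation concrete, here is a minimal sketch in C of one well-known instance: converting an array of structures (AoS) into a structure of arrays (SoA). It is not taken from the paper and does not represent its type system or inference algorithm. All names (`point_aos`, `points_soa`, `aos_to_soa`, `scale_x_aos`, `scale_x_soa`) and the fixed size `N` are assumptions made for illustration only.

```c
#include <stddef.h>

#define N 8  /* illustrative problem size (assumption) */

/* Array of structures: the x, y and z of one point are adjacent in memory,
   so consecutive x values lie three floats apart (strided access). */
struct point_aos { float x, y, z; };

/* Structure of arrays: all x values are contiguous (unit-stride access). */
struct points_soa { float x[N], y[N], z[N]; };

/* The layout transformation itself: copy AoS data into the SoA layout. */
void aos_to_soa(const struct point_aos *in, struct points_soa *out)
{
    for (size_t i = 0; i < N; i++) {
        out->x[i] = in[i].x;
        out->y[i] = in[i].y;
        out->z[i] = in[i].z;
    }
}

/* Scale every x component in the AoS layout: stride-3 loads and stores. */
void scale_x_aos(struct point_aos *p, float k)
{
    for (size_t i = 0; i < N; i++)
        p[i].x *= k;
}

/* The same computation in the SoA layout: unit-stride loads and stores. */
void scale_x_soa(struct points_soa *p, float k)
{
    for (size_t i = 0; i < N; i++)
        p->x[i] *= k;
}
```

Both scaling functions compute the same result; only the memory access pattern differs. Whether a particular compiler vectorises either loop depends on the compiler, target and flags, so any performance difference should be measured rather than assumed.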
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 2092–2119 |
Number of pages | 28 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 28 |
Early online date | 18 May 2015 |
Publication status | Published - May 2016 |
Keywords
- Data parallelism
- HPC
- Vectorisation
ASJC Scopus subject areas
- Computer Networks and Communications
- Computer Science Applications
- Software
- Computational Theory and Mathematics
- Theoretical Computer Science