FPGA Design of Transposed Convolutions for Deep Learning Using High-Level Synthesis

Cristian Sestito, Stefania Perri, Robert James Stewart

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)
113 Downloads (Pure)

Abstract

Deep Learning (DL) is pervasive across a wide variety of domains. Convolutional Neural Networks (CNNs) are often used for image processing DL applications. Modern CNN models are growing to meet the needs of more sophisticated tasks, e.g. using Transposed Convolutions (TCONVs) for image decompression and image generation. Such state-of-the-art DL models often target GPU-based high-performance architectures, due to the high computational and hardware resource needs of TCONV layers. To avoid prohibitive GPU energy costs, CNNs are increasingly deployed to decentralized embedded autonomous devices, such as Field Programmable Gate Arrays (FPGAs). However, this poses challenges for designing efficient hardware implementations of TCONV layers. This paper presents a parameterized design and implementation of a new TCONV module, which is synthesizable onto FPGAs. It is implemented using High-Level Synthesis (HLS), with a C++ template that parameterizes its functional and non-functional properties. These parameters allow kernel sizes, image sizes, quantization and parallelism to be varied by users. Through a systematic exploration of this design space, we find an optimal instance of this TCONV module that achieves 6.25 Giga Outputs per Second (Gout/s) using just 1.53 W of power. We then use our TCONV layer in two neural networks for image decompression and image generation. Image decompression achieves a throughput of more than 30K frames per second (fps) using on average only 16% of resources, while image generation achieves an energy efficiency of 324 fps/W and outperforms comparable state-of-the-art models by at least 7.3×.
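
The transposed convolution underlying the paper's module can be illustrated with a minimal, naive sketch. The template below is an assumption for illustration only (names, sizes and the scatter-style loop nest are not taken from the paper's actual HLS design); it shows how compile-time parameters such as image size, kernel size and stride could parameterize a single-channel TCONV in C++:

```cpp
#include <array>
#include <cstddef>

// Illustrative sketch: naive single-channel transposed convolution,
// parameterized at compile time in the spirit of an HLS C++ template.
// T = data type, IN = input width/height, K = kernel size, S = stride.
// Output size follows the standard TCONV relation: (IN - 1) * S + K.
template <typename T, std::size_t IN, std::size_t K, std::size_t S>
std::array<std::array<T, (IN - 1) * S + K>, (IN - 1) * S + K>
tconv2d(const std::array<std::array<T, IN>, IN>& in,
        const std::array<std::array<T, K>, K>& kernel) {
    constexpr std::size_t OUT = (IN - 1) * S + K;
    std::array<std::array<T, OUT>, OUT> out{};  // zero-initialized
    // Scatter each input pixel through the kernel at stride S:
    // overlapping contributions accumulate in the output.
    for (std::size_t i = 0; i < IN; ++i)
        for (std::size_t j = 0; j < IN; ++j)
            for (std::size_t p = 0; p < K; ++p)
                for (std::size_t q = 0; q < K; ++q)
                    out[i * S + p][j * S + q] += in[i][j] * kernel[p][q];
    return out;
}
```

With IN = 2, K = 2, S = 2 the output is 4×4 and each input pixel maps to a disjoint 2×2 block, which makes the scatter behavior easy to verify by hand. An HLS tool could unroll the inner kernel loops to trade resources for the kind of parallelism the abstract describes.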
Original language: English
Pages (from-to): 1245-1263
Number of pages: 19
Journal: Journal of Signal Processing Systems
Volume: 95
Issue number: 10
Early online date: 4 Aug 2023
DOIs
Publication status: Published - Oct 2023

Keywords

  • Deep Learning
  • FPGA
  • High-Level Synthesis
  • Parallelism
  • Quantization
  • Transposed Convolution

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Signal Processing
  • Control and Systems Engineering
  • Hardware and Architecture
  • Modelling and Simulation
