Accuracy, Training Time and Hardware Efficiency Trade-Offs for Quantized Neural Networks on FPGAs

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Neural networks have proven a successful AI approach in many application areas. Some neural network deployments, e.g. autonomous vehicles and smart drones, require low inference latency and low power consumption to be useful. Whilst FPGAs meet these requirements, the hardware resources that neural networks need to execute often exceed what an FPGA provides.

Emerging industry-led frameworks aim to solve this problem by compressing the topology and precision of neural networks, eliminating computations that require memory for execution. Compressing neural networks inevitably comes at the cost of reduced inference accuracy.

This paper uses Xilinx's FINN framework to systematically evaluate the trade-off between precision, inference accuracy, training time and hardware resources of 64 quantized neural networks that perform MNIST character recognition.

We identify sweet spots around 3-bit precision in the quantization design space after training for 40 epochs, minimising both hardware resources and accuracy loss. With enough training, using 2-bit weights achieves almost the same inference accuracy as 3- to 8-bit weights.
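
As a rough illustration of what lowering weight precision means, the sketch below rounds values to a small number of evenly spaced levels and reports the rounding error at each bit width. It is not the FINN framework's quantization or training procedure, which the abstract does not describe. The function names, the [-1, 1] value range, the random test weights and the 1 to 8 bit sweep are all assumptions made for this example.

```python
import random


def quantize_uniform(values, bits):
    """Round each value to the nearest of 2**bits evenly spaced levels in [-1, 1].

    Hypothetical helper for illustration only; not part of FINN.
    """
    if bits < 1:
        raise ValueError("bits must be >= 1")
    levels = 2 ** bits
    step = 2.0 / (levels - 1)  # spacing between adjacent levels
    quantized = []
    for v in values:
        clipped = max(-1.0, min(1.0, v))        # saturate out-of-range values
        index = round((clipped + 1.0) / step)   # nearest level index, 0..levels-1
        quantized.append(-1.0 + index * step)
    return quantized


def mean_abs_error(a, b):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Assumed stand-in for trained weights: uniform random values in [-1, 1].
rng = random.Random(0)
weights = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]

# Sweep the bit width and record how far quantized weights drift from the originals.
errors = {
    bits: mean_abs_error(weights, quantize_uniform(weights, bits))
    for bits in range(1, 9)
}

for bits, err in errors.items():
    print(f"{bits}-bit weights: mean absolute rounding error {err:.4f}")
```

The rounding error shrinks as the bit width grows, while each extra bit costs storage and arithmetic resources. That is the kind of trade-off the paper evaluates, although the paper measures inference accuracy and FPGA resources and not this simple rounding error.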

Original language: English
Title of host publication: Applied Reconfigurable Computing. Architectures, Tools, and Applications
Subtitle of host publication: ARC 2020
Publisher: Springer
Pages: 121-135
Number of pages: 15
ISBN (Electronic): 9783030445348
ISBN (Print): 9783030445331
Publication status: Published - 2020
Event: 16th International Symposium on Applied Reconfigurable Computing 2020 - University of Castilla-La Mancha, Toledo, Spain
Duration: 1 Apr 2020 to 3 Apr 2020
https://arcoresearch.com/arc2020/

Publication series

Name: Lecture Notes in Computer Science
Volume: 12083
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Symposium on Applied Reconfigurable Computing 2020
Abbreviated title: ARC2020
Country/Territory: Spain
City: Toledo
Period: 1/04/20 to 3/04/20

Keywords

  • Deep learning
  • FPGA
  • Neural networks
  • Quantization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
