TY - JOUR
T1 - Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm
AU - Stewart, Robert James
AU - Nowlan, Andrew
AU - Bacchus, Pascal
AU - Ducasse, Quentin
AU - Komendantskaya, Ekaterina
N1 - Funding Information:
Funding: This research was funded by EPSRC project “Border Patrol: Improving Smart Device Security through Type-Aware Systems Design (EP/N028201/1)”; EPSRC project “Serious Coding: A Game Approach To Security For The New Code-Citizens (EP/T017511/1)”; National Cyber Security Centre, UK, Grant “SecConn-NN: Neural Networks with Security Contracts—towards lightweight, modular security for neural networks”; UK Research Institute in Verified Trustworthy Software Systems research project “CONVENER: Continuous Verification of Neural Networks” (from the “Digital Security Through Verification” call).
Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/2/5
Y1 - 2021/2/5
AB - This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel's Movidius Myriad X VPU processor, and quantisation on Xilinx's programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames-per-second (FPS). Quantisation identifies a sweet spot of 3 bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2 and 3 bit quantised neural networks increases throughput from 6k FPS to 373k FPS, a 62x speedup.
KW - Evolutionary algorithm
KW - FPGA
KW - Movidius VPU
KW - Neural network
KW - Quantisation
UR - http://www.scopus.com/inward/record.url?scp=85100456075&partnerID=8YFLogxK
U2 - 10.3390/electronics10040396
DO - 10.3390/electronics10040396
M3 - Article
SN - 2079-9292
VL - 10
JO - Electronics
JF - Electronics
IS - 4
M1 - 396
ER -