3D imaging is used in a wide range of applications such as robotics, computer interfaces, autonomous driving, or even capturing the flight of birds. Current systems are often based on stereoscopy or structured-light approaches, which impose limitations on standoff distance (range) and require texture in the scene or accurate projection patterns. Furthermore, the generation of 3D maps may carry significant computational requirements. This work considers a system based on the alternative approach of time-of-flight. A state-of-the-art single-photon avalanche diode (SPAD) image sensor is used in combination with pulsed, flood-type illumination. The sensor generates photon timing histograms in-pixel, achieving a photon throughput of hundreds of gigaphotons per second. This in turn enables the capture of 3D maps at frame rates >1 kFPS, even under high ambient light and with minimal latency. We present initial results on processing data frames from the sensor (in the form of 64×32, 16-bin timing histograms, and 256×128 photon counts) using convolutional neural networks, with a view to localizing and classifying objects in the field of view with low latency. In tests involving three different hand signs, with data frames acquired with millisecond exposures, a classification accuracy of >90% is obtained, with histogram-based classification consistently outperforming intensity-based processing, despite the former's relatively low lateral resolution. The total, GPU-assisted processing time for detecting and classifying a sign is under 25 ms. We believe these results are relevant to robotics and self-driving cars, where fast perception, exceeding human reaction times, is often desired.
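To illustrate the time-of-flight principle underlying the 3D maps described above, the sketch below converts a frame of per-pixel timing histograms (here sized 64×32 with 16 bins, matching the abstract) into a depth map by locating the peak bin in each histogram. The bin width and the conversion function are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

SPEED_OF_LIGHT = 3.0e8   # m/s
BIN_WIDTH_S = 1.0e-9     # hypothetical histogram bin width (1 ns)

def histograms_to_depth(hist):
    """Estimate per-pixel depth from photon timing histograms.

    hist: array of shape (rows, cols, bins) holding photon counts.
    Returns a (rows, cols) depth map in metres, using the peak bin
    as the round-trip time-of-flight estimate (d = c * t / 2).
    """
    peak_bin = np.argmax(hist, axis=-1)          # index of strongest return
    tof = peak_bin * BIN_WIDTH_S                 # round-trip time per pixel
    return SPEED_OF_LIGHT * tof / 2.0            # one-way distance

# Synthetic frame: 32×64 pixels, 16 bins, with a return peak in bin 4
frame = np.zeros((32, 64, 16))
frame[..., 4] = 10.0
depth = histograms_to_depth(frame)
# Bin 4 at 1 ns/bin corresponds to 0.6 m one-way distance
```

In a real system the peak-finding step would typically be refined (e.g. centre-of-mass around the peak, or background subtraction under high ambient light), but the argmax form shows why in-pixel histogramming maps directly onto fast depth estimation.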
Title of host publication: Photonic Instrumentation Engineering VIII
Editors: Yakov Soskind, Lynda E. Busse
Publication status: Published - 5 Mar 2021
Name: Proceedings of SPIE