Profile-guided optimisation is a common technique used by compilers and runtime systems to shorten execution times and to optimise locality-aware scheduling and memory access on heterogeneous hardware platforms. Some profiling tools trace the execution of low-level code, whilst others target abstract models of computation and provide rich domain-specific context in their profiling reports. We have implemented mean shift, a computer vision tracking algorithm, in the RVC-CAL dataflow language, and we use both dynamic runtime and static dataflow profiling mechanisms to identify and eliminate bottlenecks in our naive initial version. We use these profiling reports to tune the CPU scheduler, which reduces runtime by 88%, and to optimise our dataflow implementation, which reduces runtime by a further 43%, giving an overall runtime reduction of 93%. We also assess the portability of our mean shift optimisations by trading off CPU runtime against resource utilisation on FPGAs. Applying all dataflow optimisations significantly reduces the FPGA design space, requiring fewer slice LUTs and less block memory.
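The overall figure follows from compounding the two reported reductions. The sketch below assumes that the "further 43%" is measured relative to the runtime after scheduler tuning rather than relative to the original runtime; the helper name `combined_reduction` is hypothetical and is used only for illustration.

```python
def combined_reduction(*reductions):
    """Compose successive fractional runtime reductions.

    Assumption: each reduction applies to the runtime left over
    after the previous one, not to the original runtime.
    """
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r  # fraction of runtime still remaining
    return 1.0 - remaining


# Scheduler tuning (88%) followed by dataflow optimisation (43%).
overall = combined_reduction(0.88, 0.43)
print(f"{overall:.1%}")  # 1 - 0.12 * 0.57 = 0.9316
```

Under that assumption the remaining runtime is 0.12 × 0.57 ≈ 0.068 of the original, which is consistent with the 93% overall reduction stated in the abstract.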
|Title of host publication||2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP)|
|Publication status||Published - 9 Feb 2015|
|Event||2014 IEEE Global Conference on Signal and Information Processing - Atlanta, GA, United States (Duration: 3 Dec 2014 → 5 Dec 2014)|
|Conference||2014 IEEE Global Conference on Signal and Information Processing|
|Abbreviated title||GlobalSIP 2014|
|Period||3/12/14 → 5/12/14|