In this paper we develop a new method for highlighting visually salient regions of an image based upon a known visual search task. The proposed method combines a robust model of instantaneous ("bottom-up") visual attention with a pixel probability map derived from the automatic detection of a previously seen object (task-dependent, i.e. "top-down"). The objects to be recognised are parameterised quickly in advance by a viewpoint-invariant spatial distribution of SURF interest points. The bottom-up and top-down object probability images are fused to produce a task-dependent saliency map. We validate our method using observer eye-tracker data collected under an object search-and-count task: our method shows 10% higher overlap with true attention areas under task than bottom-up saliency alone. The combined saliency map is further used to develop a new intelligent compression technique that extends DCT encoding. We demonstrate our technique on surveillance-style footage throughout.
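The core fusion step, combining a bottom-up saliency map with a top-down object-probability map, can be sketched as a normalised weighted sum. This is a minimal illustration only: the paper does not specify the fusion rule, so the function name, the `alpha` weight, and the linear combination are assumptions, not the authors' exact method.

```python
import numpy as np

def normalize(m):
    # Scale a map to the range [0, 1]; a constant map becomes all zeros.
    m = np.asarray(m, dtype=float)
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def fuse_saliency(bottom_up, top_down, alpha=0.5):
    """Fuse a bottom-up saliency map with a top-down object-probability
    map (both H x W arrays) into a task-dependent saliency map.
    `alpha` (hypothetical parameter) weights the bottom-up term."""
    bu = normalize(bottom_up)
    td = normalize(top_down)
    return normalize(alpha * bu + (1.0 - alpha) * td)
```

A multiplicative fusion (`bu * td`) would be an equally plausible variant; the weighted sum is used here only because it degrades gracefully when the object detector fires on nothing.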
Title of host publication: ICSIPA09 - 2009 IEEE International Conference on Signal and Image Processing Applications, Conference Proceedings
Number of pages: 6
Publication status: Published - 2009
Event: 2009 IEEE International Conference on Signal and Image Processing Applications, Kuala Lumpur, Malaysia
Duration: 18 Nov 2009 → 19 Nov 2009