Positioning with vision sensors is gaining popularity, since it is more accurate and requires much less bootstrapping and training effort than alternative approaches. However, a major limitation of existing solutions is the expensive visual processing pipeline: on resource-constrained mobile devices, processing a single frame can take tens of seconds. To address this, we propose a novel learning algorithm that adaptively discovers place-dependent parameters for visual processing, such as which parts of the scene are more informative and what kinds of visual elements to expect, as the system is used more and more in a particular setting. With such meta-information, our positioning system dynamically adjusts its behaviour to localise users with minimal effort. Preliminary results show that the proposed algorithm can significantly reduce the cost of visual processing and achieve sub-metre positioning accuracy.