Estimation of texture similarity is fundamental to many material recognition tasks. This study uses fine-grained human perceptual similarity ground-truth to provide a comprehensive evaluation of 51 texture feature sets. We conduct two types of evaluation and both show that these features do not estimate similarity well when compared against human agreement rates, but that performances are improved when the features are combined using a Random Forest. Using a simple two-stage statistical model we show that few of the features capture long-range aperiodic relationships. We perform two psychophysical experiments which indicate that long-range interactions do provide humans with important cues for estimating texture similarity. This motivates an extension of the study to include Convolutional Neural Networks (CNNs) as they enable arbitrary features of large spatial extent to be learnt. Our conclusions derived from the use of two pre-trained CNNs are: that the large spatial extent exploited by the networks' top convolutional and first fully-connected layers, together with the use of large numbers of filters, confers significant advantage for estimation of perceptual texture similarity.
|Journal||IEEE Transactions on Pattern Analysis and Machine Intelligence|
|Early online date||7 Jan 2020|
|Publication status||E-pub ahead of print - 7 Jan 2020|