TY - GEN
T1 - A comparison of unsupervised abnormality detection methods for interstitial lung disease
AU - Daykin, Matt
AU - Sellathurai, Mathini
AU - Poole, Ian
PY - 2018/8/21
Y1 - 2018/8/21
N2 - Abnormality detection, also known as outlier detection or novelty detection, seeks to identify data that do not match an expected distribution. In medical imaging, this could be used to find data samples with possible pathology or, more generally, to exclude samples that are normal. This may be done by learning a model of normality, against which new samples are evaluated. In this paper, four methods, each representing a different family of techniques, are compared: one-class support vector machine, isolation forest, local outlier factor, and fast minimum covariance determinant estimator. Each method is evaluated on patches from CT scans of interstitial lung disease, where the patches are encoded with one of four embedding methods: principal component analysis, kernel principal component analysis, a flat autoencoder, and a convolutional autoencoder. The data consist of 5500 healthy patches from one patient cohort defining normality, and 2970 patches from a second patient cohort with emphysema, fibrosis, ground glass opacity, and micronodule pathology representing abnormality. From this second cohort, 1030 healthy patches are used as an evaluation dataset. Evaluation considers both accuracy (area under the ROC curve) and runtime efficiency. The fast minimum covariance determinant estimator is shown to scale only fairly in runtime with dataset dimensionality, while the isolation forest and one-class support vector machine scale well with dimensionality. The one-class support vector machine is the most accurate, closely followed by the isolation forest and fast minimum covariance determinant estimator. The embeddings from kernel principal component analysis are the most generally useful.
AB - Abnormality detection, also known as outlier detection or novelty detection, seeks to identify data that do not match an expected distribution. In medical imaging, this could be used to find data samples with possible pathology or, more generally, to exclude samples that are normal. This may be done by learning a model of normality, against which new samples are evaluated. In this paper, four methods, each representing a different family of techniques, are compared: one-class support vector machine, isolation forest, local outlier factor, and fast minimum covariance determinant estimator. Each method is evaluated on patches from CT scans of interstitial lung disease, where the patches are encoded with one of four embedding methods: principal component analysis, kernel principal component analysis, a flat autoencoder, and a convolutional autoencoder. The data consist of 5500 healthy patches from one patient cohort defining normality, and 2970 patches from a second patient cohort with emphysema, fibrosis, ground glass opacity, and micronodule pathology representing abnormality. From this second cohort, 1030 healthy patches are used as an evaluation dataset. Evaluation considers both accuracy (area under the ROC curve) and runtime efficiency. The fast minimum covariance determinant estimator is shown to scale only fairly in runtime with dataset dimensionality, while the isolation forest and one-class support vector machine scale well with dimensionality. The one-class support vector machine is the most accurate, closely followed by the isolation forest and fast minimum covariance determinant estimator. The embeddings from kernel principal component analysis are the most generally useful.
UR - http://www.scopus.com/inward/record.url?scp=85052867121&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-95921-4_27
DO - 10.1007/978-3-319-95921-4_27
M3 - Conference contribution
AN - SCOPUS:85052867121
SN - 9783319959207
T3 - Communications in Computer and Information Science
SP - 287
EP - 298
BT - Medical Image Understanding and Analysis
A2 - Nixon, Mark
A2 - Mahmoodi, Sasan
A2 - Zwiggelaar, Reyer
PB - Springer
T2 - 22nd Conference on Medical Image Understanding and Analysis 2018
Y2 - 9 July 2018 through 11 July 2018
ER -