Abstract
Missing data research is hindered by a lack of imputation evaluation techniques. Imputation has the potential to increase the impact and validity of studies across the research, public and private sectors. Robust evaluation software would make more researchers willing to use, and able to justify using, imputation methods. This paper aims to encourage further research into robust imputation evaluation by defining a framework that could be used to optimise how datasets are imputed prior to analysis. We propose a framework which uses a prototypical approach to create testing data and machine learning methods to derive a new evaluation metric. We introduce our implementation of such a framework and present some preliminary results. For our dataset, the results show that records with less than 40% missingness could be used for analysis, which increases the amount of data available to future studies using that dataset.
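As a minimal illustration of the abstract's headline result (not the paper's framework or implementation), the sketch below shows how records could be screened by a per-record missingness threshold such as the 40% figure reported here; the function name `filter_by_missingness` and the toy data are hypothetical.

```python
# Hypothetical sketch: keep only records whose fraction of missing values
# falls below a threshold (e.g. 40%), so they remain candidates for imputation
# and subsequent analysis. This is an illustration, not the authors' code.
import numpy as np
import pandas as pd


def filter_by_missingness(df: pd.DataFrame, max_missing: float = 0.4) -> pd.DataFrame:
    """Return the rows of `df` with a missing-value fraction below `max_missing`."""
    row_missingness = df.isna().mean(axis=1)  # fraction of NaN cells per record
    return df[row_missingness < max_missing]


if __name__ == "__main__":
    # Toy data: three records with 0%, 25% and 75% missing values.
    data = pd.DataFrame({
        "a": [1.0, np.nan, np.nan],
        "b": [2.0, 5.0, np.nan],
        "c": [3.0, 6.0, np.nan],
        "d": [4.0, 7.0, 8.0],
    })
    usable = filter_by_missingness(data, max_missing=0.4)
    print(usable)  # keeps the first two records (0% and 25% missing)
```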
Original language | English |
---|---|
Title of host publication | Proceedings of The Seventh International Conference on Intelligent Systems and Applications |
Publisher | IARIA |
Pages | 7-13 |
Number of pages | 7 |
ISBN (Print) | 9781612086460 |
Publication status | Published - 24 Jun 2018 |
Keywords
- missing data
- evaluating imputation
- imputation
- clustering
- prototypical testing