Abstract
Missing data is challenging enough without the added complexities posed by a lack of research in evaluating imputation. Not only could we potentially increase the impact and validity of studies from many different sectors (research, public and private), we also believe that by creating evaluation software, more researchers may be willing to use and justify using imputation methods. This paper aims to encourage further research for efficient imputation evaluation by defining a framework which could be used to optimise the way we impute datasets prior to data analysis. We propose a framework which uses a prototypical approach to create testing data and machine learning methods to create a new metric for evaluation. Preliminary results are presented which show how, for our dataset, records with less than 40% missingness could be used for analysis, increasing the amount of available data.
Original language | English |
---|---|
Title of host publication | 2018 IEEE 12th International Symposium on Applied Computational Intelligence and Informatics (SACI) |
Publisher | IEEE |
Number of pages | 6 |
ISBN (Electronic) | 978-1-5386-4640-3 |
Publication status | Published - 23 Aug 2018 |
Event | IEEE 12th International Symposium on Applied Computational Intelligence and Informatics - Timișoara, Romania. Duration: 17 May 2018 → 19 May 2018 |
Conference
Conference | IEEE 12th International Symposium on Applied Computational Intelligence and Informatics |
---|---|
Abbreviated title | SACI 2018 |
Country/Territory | Romania |
City | Timișoara |
Period | 17/05/18 → 19/05/18 |
Keywords
- Clustering
- Evaluating Imputation
- Imputation
- Missing Data
- Prototypical Testing