University of Hertfordshire

Iterative Robust Semi-Supervised Missing Data Imputation

Research output: Contribution to journal › Article › peer-review

Documents

  • 09091515

    Final published version, 1.38 MB, PDF document

  • Nikos Fazakis
  • Georgios Kostopoulos
  • Sotiris Kotsiantis
  • Iosif Mporas
Original language: English
Pages (from-to): 90555 - 90569
Number of pages: 15
Journal: IEEE Access
Volume: 8
Publication status: Published - 12 May 2020

Abstract

In many real-world applications, scientists are confronted with incomplete datasets, which arise for a variety of reasons. Directly analyzing datasets with missing attribute values inevitably yields inaccurate learning models and erroneous results. Handling missing values effectively is therefore an essential step of the data mining process. Imputation is often employed during the pre-processing stage of data analysis to overcome the shortcomings caused by missing data. Accordingly, a plethora of statistical and machine learning methods have been proposed to replace the missing values in incomplete data with plausible estimates of their actual values. In this context, the main objective of this paper is to put forward IRSSI, an iterative stepwise imputation method based on the semi-supervised learning approach. Semi-supervised methods have proved to be particularly effective for exploiting incomplete or partially labeled data with regard to the values of the target attribute. The proposed algorithm was experimentally evaluated on real-world benchmark datasets and artificially generated datasets at several high ratios of missing data. The experimental results demonstrate the efficiency of the IRSSI algorithm compared to typical imputation methods.
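
The abstract does not spell out the steps of IRSSI, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea of iterative, self-training style imputation: for each attribute, rows where the value is observed act as "labeled" examples, rows where it is missing act as "unlabeled" ones, and the most confidently estimated rows are filled first and then reused as labeled examples. The estimator (a k-nearest-neighbour average), the confidence measure (mean distance to the neighbours), and the names `iterative_knn_impute`, `k`, and `batch` are all assumptions made for illustration.

```python
import math


def iterative_knn_impute(rows, k=2, batch=1):
    """Fill None entries column by column, most confident rows first.

    Illustrative sketch only (not IRSSI). `rows` is a list of equal-length
    lists of floats, with None marking a missing value. The input is not
    modified; a filled copy is returned.
    """
    data = [list(r) for r in rows]
    n_cols = len(data[0])

    def distance(a, b, skip):
        # Root mean squared difference over attributes observed in both
        # rows, ignoring the attribute currently being imputed.
        shared = [(x - y) ** 2
                  for j, (x, y) in enumerate(zip(a, b))
                  if j != skip and x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    for col in range(n_cols):
        while True:
            known = [r for r in data if r[col] is not None]
            unknown = [r for r in data if r[col] is None]
            if not unknown:
                break
            if not known:
                raise ValueError(f"column {col} has no observed values")

            # Estimate every missing entry and score how close its
            # neighbours are (a smaller spread means higher confidence).
            candidates = []
            for r in unknown:
                nearest = sorted(known, key=lambda s: distance(r, s, col))[:k]
                spread = sum(distance(r, s, col) for s in nearest) / len(nearest)
                estimate = sum(s[col] for s in nearest) / len(nearest)
                candidates.append((spread, r, estimate))

            # Self-training step: accept only the `batch` most confident
            # estimates; they count as observed in the next pass.
            candidates.sort(key=lambda c: c[0])
            for _, r, estimate in candidates[:batch]:
                r[col] = estimate
    return data


if __name__ == "__main__":
    sample = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [None, 40.0]]
    for row in iterative_knn_impute(sample, k=1):
        print(row)
```

Filling a few entries per pass, rather than all at once, lets later estimates draw on earlier ones, which is the property the word "iterative" refers to in this sketch. A real evaluation would also need scaled attributes and a stronger estimator than a plain neighbour average.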

Notes

© 2020 The Author(s). This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

ID: 21409593