Addressing the Curse of Missing Data in Clinical Contexts: A Novel Approach to Correlation-based Imputation

Research output: Contribution to journal › Article › peer-review

Abstract

Clinical data are essential in the medical domain. However, their heterogeneous nature leads to many data quality problems, notably missing values, which undermine the performance of Machine Learning-based clinical systems. Hence, there has been a growing interest in strategies that address this challenge in order to build trustworthy systems to improve the quality of care and benefit clinical decision-making. In particular, missing value imputation is a common approach. This paper proposes three novel imputation techniques that leverage correlation in an innovative manner by exploring the relationship between values and missingness patterns. Experiments were carried out on three publicly available datasets, under three missingness mechanisms with different missing rates, and on two real-world medical datasets. The imputation precision and the classification performance of the proposed techniques were evaluated in a comprehensive comparative study, which included diverse existing methods. The developed techniques outperformed state-of-the-art methods on several assessments while overcoming current flaws shared by correlation-based imputation strategies in real-world medical problems.
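
For orientation only, the following is a minimal sketch of a generic correlation-based imputation baseline. It is not one of the three techniques proposed in the paper, whose details are not given in this abstract. For each column with missing values, it picks the most strongly correlated column that is observed on the same row and predicts the missing entry with a simple linear fit, falling back to the column mean when no usable predictor exists. The function name `correlation_impute` and the minimum of three complete pairs are illustrative assumptions.

```python
import numpy as np


def correlation_impute(X):
    """Fill NaNs column by column using the most correlated observed column.

    Hypothetical baseline for illustration; not the paper's method.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    n_cols = X.shape[1]

    for j in range(n_cols):
        missing_j = np.isnan(X[:, j])
        if not missing_j.any():
            continue

        # Fallback value: mean of the observed entries in column j.
        col_mean = np.nanmean(X[:, j])

        # Rank the other columns by |Pearson r| on pairwise-complete rows.
        candidates = []
        for k in range(n_cols):
            if k == j:
                continue
            both = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            if both.sum() < 3:
                continue  # too few complete pairs (assumed threshold)
            if np.std(X[both, j]) == 0 or np.std(X[both, k]) == 0:
                continue  # correlation is undefined for a constant column
            r = np.corrcoef(X[both, j], X[both, k])[0, 1]
            candidates.append((abs(r), k, both))
        candidates.sort(key=lambda c: c[0], reverse=True)

        for i in np.where(missing_j)[0]:
            value = col_mean
            for _, k, both in candidates:
                if not np.isnan(X[i, k]):
                    # Degree-1 least-squares fit of column j on column k.
                    slope, intercept = np.polyfit(X[both, k], X[both, j], 1)
                    value = slope * X[i, k] + intercept
                    break
            filled[i, j] = value

    return filled
```

A baseline of this kind uses only correlations between observed values; the abstract states that the proposed techniques additionally explore the relationship between values and missingness patterns.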

Original language: English
Article number: 101562
Number of pages: 12
Journal: Journal of King Saud University - Computer and Information Sciences
Volume: 35
Issue number: 6
Publication status: Published - Jun 2023

Keywords

  • Clinical data
  • Correlation
  • Machine learning
  • Missing data
  • Missing data imputation
