Handling Missing Values in Healthcare Settings

Missing values are common in statistical modeling, particularly in healthcare, where vital signs may be measured only sporadically. These gaps make it difficult to evaluate a patient’s health accurately and to make informed decisions about their treatment. One possible solution is to use imputation methods to fill in the missing values, which in turn enables counterfactual predictions in healthcare settings. To that end, we conducted a study of existing approaches to missing value imputation in healthcare settings, with a specific focus on time series data. We provide an overview of both classical (e.g. MICE, MissForest, Gaussian Processes) and deep learning-based (e.g. Recurrent Neural Networks, Generative Adversarial Networks, Autoencoders) imputation methods, assess their strengths and limitations for time series data, and highlight potential areas for further investigation.
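To make concrete what imputation means for a sporadically measured vital sign, here is a minimal sketch of one of the simplest baselines: last observation carried forward, paired with a binary mask that records which values were actually observed. It is an illustration only and not a method taken from the study; the function name `forward_fill_with_mask`, the `default` argument, and the heart-rate readings are all hypothetical.

```python
from typing import List, Optional, Tuple


def forward_fill_with_mask(
    values: List[Optional[float]], default: float = 0.0
) -> Tuple[List[float], List[int]]:
    """Carry the last observed value forward and record what was observed.

    Returns (filled, mask), where mask[t] is 1 if values[t] was observed
    and 0 if it was missing. Leading gaps have no earlier observation,
    so they are filled with `default` (an assumption of this sketch).
    """
    filled: List[float] = []
    mask: List[int] = []
    last = default
    for v in values:
        if v is None:
            mask.append(0)      # missing: reuse the previous value
        else:
            last = v
            mask.append(1)      # observed: update the carried value
        filled.append(last)
    return filled, mask


# Hypothetical hourly heart-rate readings; None marks a missed measurement.
heart_rate = [None, 72.0, None, None, 80.0, None]
filled, mask = forward_fill_with_mask(heart_rate, default=75.0)
print(filled)  # [75.0, 72.0, 72.0, 72.0, 80.0, 80.0]
print(mask)    # [0, 1, 0, 0, 1, 0]
```

Keeping the mask alongside the filled series lets a downstream model tell measured values apart from imputed ones.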

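| Model | Year | GP | RNN | CNN | TF | GAN | ODE | AE | Indicator | Datasets |
|---|---|---|---|---|---|---|---|---|---|---|
| MTGP | 2015 | y | | | | | | | | TBI, MIMIC-II |
| GASF, GADF, MTF | 2015 | | | y | | | | | | Gun Point, CBF, Swedish Leaf, ECG, 7 Misc |
| LSTM | 2016 | | y | | | | | | y | PICU at Children’s Hospital LA |
| MTGP | 2017 | y | | | | | | | | Duke University Hospital |
| M-RNN | 2017 | | y | | | | | | | MIMIC-III, Deterioration, UNOS-Heart, UNOS-Lung, UK Biobank |
| BRITS | 2018 | | y | | | | | | y | PhysioNet, Beijing Air Quality, Human Activity |
| GRU-D | 2018 | | y | | | | | | y | PhysioNet, MIMIC-III, Gesture |
| GAIN | 2018 | | | | | y | | | | Breast Cancer, Spam, Letter, Credit, News |
| GP-VAE | 2019 | y | | y | | | | y | | PhysioNet, Healing MNIST, SPRITES |
| T-CGAN | 2019 | | | | | y | | | | Starlight Curves, Power Demand, ECG200 |
| Imp-GAIN | 2019 | | | | | y | | | | Insomnia |
| Latent ODE | 2019 | | y | | | | y | | | PhysioNet, MuJoCo, Human Activity |
| VaDER | 2019 | | y | | | | | y | y | ADNI, PPMI |
| TKAE | 2019 | | y | | | | | y | | PhysioNet, ECG, EHR |
| ODE-GRU-D | 2020 | | y | | | | y | | y | PhysioNet |
| RBM | 2020 | | | | | | | y | | Acute Abdomen Taiwan |
| Multitask LSTM | 2020 | | y | | | | | | | PhysioNet |
| HeartImp | 2020 | | | | | | | y | | Garmin, Fitbit |
| GRU-DF | 2020 | | y | | | | | | | CLIMB (Multiple Sclerosis) |
| TAME | 2020 | | y | | | | | y | | MIMIC-III, DACMI |
| P-BiGAN | 2020 | | | y | | y | | | | MIMIC-III |
| Deep AE | 2021 | | | | | | | y | | Ischemic Heart Disease Taiwan |
| Deep Recurrent AD | 2021 | | y | | | | | | y | TADPOLE (ADNI) |
| MTSIT | 2022 | | | | y | | | | | PhysioNet, Beijing Air Quality |
| AJ-RNN | 2022 | | y | | | | | | | PhysioNet, UCR Time Series |
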
@article{septiandri2023missing,
  title   = {Handling Missing Values in Healthcare Settings},
  author  = {Septiandri, Ali Akbar and Jendoubi, Takoua and De la O, Alejandro Díaz},
  year    = 2023,
  month   = {September},
  url     = {https://aliakbars.id/posts/2023/09/missing-values}
}