ICU Deep Learning

Project Overview

This project implements and optimizes Long Short-Term Memory (LSTM) time series models of intensive care unit (ICU) patient lab and vital sign data to predict patient outcomes. Additionally, an adversarial attack algorithm is used to discover adversarial examples for trained models.

Raw data were obtained from the Medical Information Mart for Intensive Care (MIMIC-III) database [1, 2]. The predictive model input consists of data from 13 lab measurements and 6 vital signs collected during the first 48 hours after patient admission to the ICU. The prediction target is a binary variable representing in-hospital mortality. This type of model can supplement standard heuristics used by care providers in identifying high-risk patients.

The results of adversarial attacks on trained LSTM models provide a gauge of model stability. Additionally, these results offer an opportunity to compare adversarial vulnerabilities in LSTMs with the well-documented [3, 4] adversarial behaviors observed in Convolutional Neural Networks (CNNs) used for computer vision. Unlike the adversarial examples found in CNNs — where perturbations imperceptible to the human eye can drastically alter model predictions — the adversarial examples discovered for our LSTM models exhibit a higher degree of plausibility, aligning more closely with human intuition.

Getting Started

📌 Viewing Documentation

The full documentation, including detailed descriptions of the methodology and implementation, is contained in the project Jupyter notebook. If you are using the notebook purely for informational purposes (without running the code), you can view it directly without any environment setup.

🚀 Running the Project

To run the project code, follow the instructions in SETUP.md for setting up and running in a containerized environment.

Acknowledgement of Prior Studies

This project builds on previous studies [5, 6] that were the first to apply LSTM-based predictive modeling and adversarial attacks to ICU patient data from the MIMIC-III database. While the initial goal was to reproduce and validate portions of the earlier studies, the project has since evolved into significant extensions and new contributions. However, none of this progress would have been possible without the invaluable foundation provided by the original research.

Highlights

  • Extensive hyperparameter tuning for predictive and attack models.
  • Flexible attack objectives allow targeting different types of adversarial perturbations.
  • Fully containerized for easy, reproducible environment setup.
  • Single config.toml file centralizes all parameters for streamlined modification and experimentation.
  • Auto-generated data provenance ensures reproducibility and prevents losing track of "what worked" during experiments.
  • Modular data pipeline eliminates the need for redundant upstream runs when testing multiple downstream settings.
  • Flexible execution — each pipeline component can run from the command line or inside the project's Jupyter notebook.
  • Efficient adversarial attacks — developed a custom PyTorch AdversarialAttacker module capable of attacking batches of samples.
  • Avoids security risks of .pkl files — uses .json and .feather file formats for serialization.
  • 60% higher predictive performance (F1 score) and 10× faster data preprocessing compared to prior studies.
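To illustrate the batched-attack idea from the highlights above, here is a hedged sketch of a gradient-based perturbation applied to an entire batch at once. The model, loss, step size, and loop structure are all assumptions for illustration; this is not the project's actual AdversarialAttacker API.

```python
# Illustrative sketch only: a gradient-based attack on a whole batch of
# [time x features] samples. All names and hyperparameters are hypothetical.
import torch
import torch.nn as nn


class TinyLSTMClassifier(nn.Module):
    """Minimal stand-in for an LSTM mortality-prediction model."""

    def __init__(self, num_features: int = 19, hidden: int = 8):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, time, features]; use the final hidden state for logits
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)


def attack_batch(model, x, targets, step_size=0.01, n_steps=10):
    """Perturb every sample in the batch toward the target labels."""
    perturbation = torch.zeros_like(x, requires_grad=True)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(n_steps):
        loss = loss_fn(model(x + perturbation), targets)
        loss.backward()
        with torch.no_grad():
            # Descend toward the flipped targets (signed-gradient step)
            perturbation -= step_size * perturbation.grad.sign()
        perturbation.grad.zero_()
    return perturbation.detach()


torch.manual_seed(0)
model = TinyLSTMClassifier()
x = torch.randn(4, 48, 19)  # batch of 4 samples: 48 hours x 19 measurements
# Attack targets: the opposite of each current prediction
flipped = 1.0 - (torch.sigmoid(model(x)) > 0.5).float()
delta = attack_batch(model, x, flipped)
print(delta.shape)  # torch.Size([4, 48, 19])
```

Attacking a full batch in one backward pass is what makes this style of attack efficient on GPU compared with perturbing samples one at a time.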

Example Data: Input Elements' Perturbation Probability

Each model input consists of a [19 × 48] array representing hourly values of 13 lab measurements and 6 vital signs collected over a 48-hour period. To analyze the adversarial behavior of a trained model under a specific attack objective function, we perform multiple attack iterations on each sample in a population. The attack objective function can favor small perturbation magnitude, sparse perturbations, or both.
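As an illustration of how an attack objective could trade off perturbation magnitude against sparsity, the sketch below combines an L2 magnitude term with a count of perturbed elements. The function name and weights are hypothetical, not the project's actual objective.

```python
# Hypothetical attack objective: lower values favor small and/or sparse
# perturbations. Weights are illustrative, not the project's settings.
import numpy as np


def attack_objective(perturbation, magnitude_weight=1.0, sparsity_weight=1.0):
    """Lower is better: penalize large and dense perturbations."""
    magnitude = np.sqrt((perturbation ** 2).sum())  # L2 norm of perturbation
    num_perturbed = (perturbation != 0).sum()       # non-zero element count
    return magnitude_weight * magnitude + sparsity_weight * num_perturbed


# A single small perturbation late in the 48-hour window
delta = np.zeros((19, 48))
delta[3, 40] = 0.5
print(attack_objective(delta))  # 1.5  (magnitude 0.5 + one perturbed element)
```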

For each sample, an adversarial example may be found after some number of attack iterations, or in some cases not at all. Once the first adversarial example for a sample is discovered, we continue attacking the sample (up to a predefined iteration limit) to identify perturbations that further reduce the attack objective function, and we record the example with the lowest objective value as that sample's best adversarial example. The perturbation associated with a best example has lower magnitude and/or greater sparsity than the corresponding first example.
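The first-versus-best bookkeeping described above can be sketched as follows. Representing each successful example as an `(iteration, objective_value)` pair is an illustrative simplification of whatever the project actually stores.

```python
# Sketch of tracking a sample's "first" and "best" adversarial examples.
# Each candidate is a hypothetical (iteration, objective_value) pair.

def track_best_example(candidates):
    """Return the earliest-discovered example and the one with the
    lowest attack-objective value."""
    first = None
    best = None
    for example in candidates:
        if first is None:
            first = example                 # first success, in discovery order
        if best is None or example[1] < best[1]:
            best = example                  # lowest objective value so far
    return first, best


# Objective values of successful examples found across attack iterations
found = [(120, 0.83), (250, 0.41), (400, 0.55), (512, 0.37)]
first, best = track_best_example(found)
print(first)  # (120, 0.83) -- earliest discovery
print(best)   # (512, 0.37) -- lowest objective value
```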

A "zero-to-one" attack refers to an adversarial perturbation that flips a model's predicted output from 0 (non-mortality) to 1 (mortality)**. Similarly, a "one-to-zero" attack flips the predicted output from 1 to 0.

The "Perturbation Probability" figure below shows the fraction of adversarial examples in which each input array element is non-zero for the first and best discovered examples, with separate plots for zero-to-one and one-to-zero attacks.

Perturbation Probability

Some key takeaways from the above data:

  • For the first examples, most measurements' perturbation probability is fairly evenly distributed across the 48-hour observation period, though it does increase at later times.
  • SpO2 perturbations are less common in first examples compared to other measurements. However, this difference disappears in best examples.
  • The best examples strongly favor perturbations at the end of the observation period. These adversarial examples are not what we would call "anomalous." A human interpreter would also be more likely to misinterpret a late-time perturbation.
  • Fine details in the probability distributions are well-matched between zero-to-one and one-to-zero best examples, suggesting that the model’s vulnerabilities are generally independent of the attack direction.

For additional exploration and analysis, refer to the Getting Started section.

Tools Used

Python, PyTorch, Docker, PostgreSQL, Pandas, NumPy, Apache Arrow, Scikit-Learn, TensorBoard, Matplotlib, Optuna, msgspec, Jupyter

References

1. Johnson, A., Pollard, T., and Mark, R. (2016) 'MIMIC-III Clinical Database' (version 1.4), PhysioNet.

2. Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035.

3. Goodfellow, I.J., Shlens, J. and Szegedy, C., 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.

4. Akhtar, N., Mian, A., Kardan, N. and Shah, M., 2021. Advances in adversarial attacks and defenses in computer vision: A survey. IEEE Access, 9, pp.155161-155196.

5. Sun, M., Tang, F., Yi, J., Wang, F. and Zhou, J., 2018, July. Identify susceptible locations in medical records via adversarial attacks on deep predictive models. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 793-801).

6. Tang, F., Xiao, C., Wang, F. and Zhou, J., 2018. Predictive modeling in urgent care: a comparative study of machine learning approaches. JAMIA Open, 1(1), pp.87-98.
