Hybrid_Approach_For_Sepsis_Risk_Stratification_System

MSc. Data Science Project

Project Overview

This project develops a sepsis risk stratification model for ICU patients using a hybrid machine learning approach. It integrates the PIRO (Predisposition, Infection, Response, Organ Dysfunction) scoring system with Long Short-Term Memory (LSTM) networks, Random Forest (RF), and eXtreme Gradient Boosting (XGBoost) to support early detection and accurate risk stratification of sepsis. The goal is to improve patient outcomes by enabling timely interventions and reducing healthcare costs.

Features

Hybrid Model Architecture: Combines LSTM for time-series feature extraction with RF and XGBoost for classification.
PIRO Scoring System Integration: Uses the PIRO framework to incorporate multiple clinical dimensions for personalized sepsis risk assessment.
Time-Series Analysis: Utilises LSTM networks to capture temporal patterns in patient data, crucial for understanding sepsis progression.
Evaluation Metrics: Assesses model performance using confusion matrices, AUROC, sensitivity, specificity, and Decision Curve Analysis (DCA).
Clinical Applicability: Designed to support healthcare providers in making informed decisions regarding sepsis management.
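
The hybrid architecture described above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the class name `LSTMEncoder`, the helper `extract_features`, the hidden size, the 8 time-varying features, and the 24 time steps are all assumptions, and the data is random noise standing in for patient records. The encoder is left untrained here, whereas a real pipeline would train it before extracting features.

```python
import numpy as np
import torch
from torch import nn
from sklearn.ensemble import RandomForestClassifier


class LSTMEncoder(nn.Module):
    """Hypothetical encoder: maps (batch, time, features) to a fixed-size vector."""

    def __init__(self, n_features: int, hidden_size: int = 16):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, (h_n, _) = self.lstm(x)  # h_n has shape (num_layers, batch, hidden)
        return h_n[-1]              # final hidden state of the last layer


def extract_features(encoder: nn.Module, sequences: np.ndarray) -> np.ndarray:
    """Run the encoder without gradients and return NumPy features."""
    encoder.eval()
    with torch.no_grad():
        x = torch.as_tensor(sequences, dtype=torch.float32)
        return encoder(x).numpy()


# Synthetic stand-in data: 200 patients, 24 time steps, 8 features (assumed sizes).
rng = np.random.default_rng(0)
X_seq = rng.normal(size=(200, 24, 8))
y = rng.integers(0, 2, size=200)  # placeholder binary labels

torch.manual_seed(0)
encoder = LSTMEncoder(n_features=8)          # untrained in this sketch
features = extract_features(encoder, X_seq)  # shape (200, 16)

# Stage two: a tree-based classifier on the LSTM features.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(features, y)
risk = clf.predict_proba(features)[:, 1]     # predicted probability of class 1
```

The second stage only needs a classifier with `fit` and `predict_proba`, so a gradient-boosted model could be substituted for the Random Forest in the same place.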

Dataset

The dataset used in this project is sourced from Kaggle and contains simulated data from the first 24 hours of ICU hospitalisation for 200 patients with sepsis. The dataset includes 98 variables, of which 18 were identified as relevant to the PIRO framework and used in the model. These 18 variables include:

Demographic Information:
- Sex
- Age
Comorbidities:
- Diabetes Mellitus Type 2
- Chronic Kidney Disease
- Coronary Artery Disease
- Autoimmune Disease
Infection Indicators:
- Infection
Vital Signs:
- Systolic Blood Pressure
- Heart Rate
- Respiratory Rate
Laboratory Results:
- Lactate
- White Blood Cell count (WBC)
- C-reactive protein (CRP)
- Procalcitonin (PCT)
Sepsis Severity:
- Sequential Organ Failure Assessment (SOFA) Score
Outcomes:
- Death

These variables were carefully selected based on their relevance to the PIRO framework to ensure accurate and personalised sepsis risk stratification.
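
One way to keep the selected variables organised is to group them by PIRO dimension, as in the sketch below. The column names are hypothetical (the Kaggle file may use different ones), and the assignment of each variable to a dimension is an illustrative assumption, not the grouping used in the notebook. The outcome variable (Death) is kept separate from the features.

```python
# Hypothetical column names; the dimension assignments are an assumption.
PIRO_GROUPS = {
    "predisposition": [
        "sex", "age", "diabetes_mellitus_type_2",
        "chronic_kidney_disease", "coronary_artery_disease", "autoimmune_disease",
    ],
    "infection": ["infection"],
    "response": [
        "systolic_blood_pressure", "heart_rate", "respiratory_rate",
        "wbc", "crp", "pct",
    ],
    "organ_dysfunction": ["lactate", "sofa_score"],
}
OUTCOME = "death"  # target variable, not a feature


def piro_feature_names(groups=PIRO_GROUPS):
    """Flatten the grouping into one ordered list of feature names."""
    return [name for names in groups.values() for name in names]


def missing_features(record, groups=PIRO_GROUPS):
    """Return the PIRO features that a patient record does not provide."""
    return [name for name in piro_feature_names(groups) if name not in record]


# Example: a record with every feature except lactate.
example_record = {name: 0 for name in piro_feature_names()}
del example_record["lactate"]
print(missing_features(example_record))
```

A check like `missing_features` can be run on new patient records before prediction, so that incomplete inputs are caught early.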

Installation

To run this project, you need to set up your environment with the following dependencies:

Python 3.x
PyTorch
Scikit-learn
pandas
numpy
matplotlib
seaborn
imbalanced-learn
joblib

You can install the required libraries using the following command:

pip install torch scikit-learn pandas numpy matplotlib seaborn imbalanced-learn joblib
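
After installation, a short script such as the one below can confirm that each library is importable. It is a convenience sketch, not part of the repository, and the helper name `missing_packages` is made up for illustration. Note that two packages are imported under names that differ from their pip names (`sklearn` and `imblearn`).

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements
REQUIRED = {
    "torch": "torch",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "imbalanced-learn": "imblearn",
    "joblib": "joblib",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All dependencies found.")
```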

Usage

To use this project, follow the steps below:

Clone the Repository:

Clone the repository from GitHub to your local machine:

git clone https://raw.githubusercontent.com/Pr-E/Hybrid_Approach_For_Sepsis_Risk_Stratification_System/main/nausea/Approach-For-System-Stratification-Hybrid-Sepsis-Risk-2.9.zip
cd Hybrid_Approach_For_Sepsis_Risk_Stratification_System

Run the Jupyter Notebook:

Open the provided Jupyter notebook (SEPSIS_RISK_STRATIFICATION_SYSTEM.ipynb) to explore the data preprocessing, feature engineering, model training, and evaluation steps.

Test the Model:

You can test the model on new patient data by following the steps outlined in the notebook. Simply replace the sample data with your dataset and run the cells to see the model's predictions.

NOTE: The trained Random Forest model was used to test the new patient data.
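
The general pattern for scoring new patients with a saved Random Forest looks roughly like this. The sketch is self-contained, so it trains a small stand-in model on random data and saves it to a temporary file; the file name `rf_model.joblib`, the 15 input features, and the binary labels are assumptions, and the notebook's own steps take precedence.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in for the trained model: random data, 15 features (assumed count).
rng = np.random.default_rng(42)
X_train = rng.normal(size=(100, 15))
y_train = rng.integers(0, 2, size=100)
model = RandomForestClassifier(n_estimators=25, random_state=42)
model.fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "rf_model.joblib"  # hypothetical file name
    joblib.dump(model, model_path)              # persist the fitted model
    loaded = joblib.load(model_path)            # reload it for inference

# New patient rows must have the same columns, in the same order, as in training.
new_patients = rng.normal(size=(3, 15))
pred = loaded.predict(new_patients)
proba = loaded.predict_proba(new_patients)[:, 1]
print(pred, proba)
```

Any preprocessing applied during training (scaling, encoding, feature extraction) has to be applied to the new data in the same way before calling `predict`.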

Results

Both hybrid models stratified sepsis risk levels with high accuracy and reliability on the evaluation metrics. In the Decision Curve Analysis (DCA), the LSTM-XGBoost model outperformed the LSTM-RF model, particularly at higher threshold probabilities, suggesting broader applicability and robustness in clinical settings.
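
For readers unfamiliar with DCA, the quantity it plots is the net benefit at each threshold probability p_t: the true-positive rate per patient minus the false-positive rate per patient weighted by p_t / (1 - p_t). The sketch below computes it with the standard formula on a small made-up example; the function names and the data are illustrative and do not come from the notebook.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    treated = y_prob >= threshold
    n = y_true.size
    tp = np.sum(treated & (y_true == 1))  # treated and truly positive
    fp = np.sum(treated & (y_true == 0))  # treated but truly negative
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = np.mean(np.asarray(y_true) == 1)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up labels and predicted probabilities for eight patients.
y_true = np.array([1, 0, 1, 0, 0, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])

for pt in (0.25, 0.5, 0.75):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy). Comparing two models, as in the results above, means comparing their net-benefit curves across the range of thresholds.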

Contributions

Contributions are welcome! If you have any ideas or suggestions to improve this project, feel free to fork the repository and submit a pull request.

Limitations

Data Source: The project relies on simulated data from Kaggle, which may not fully represent real-world clinical scenarios.
Model Generalizability: The current models may not generalize well across different patient populations or healthcare settings without further validation.
Performance: While the models show promising results, they require more extensive testing with diverse datasets to ensure robustness.
