# MRI Preprocessing Pipeline

A generalized implementation of MRI preprocessing for various ML/AI tasks within the Parra Lab. This project automates the ingestion, analysis, and processing of raw DICOM MRI data into model-ready inputs.
## Table of Contents

- Overview
- Key Features
- Project Structure
- Installation
- Usage
- Preprocessing Workflow
- Testing
- Contributing
- Acknowledgements
## Overview

The MRI Preprocessing Pipeline is a modular system built to handle large datasets of MRI scans. It runs inside a Docker container to ensure a consistent environment and supports both an interactive web-based control system and a scriptable command-line interface.

The core functionality resides in `code/preprocessing/`, where a series of Python scripts handles everything from DICOM extraction to NIfTI conversion and spatial alignment.
## Key Features

- **Automated Scanning**: Recursively scans directories for MRI DICOM files.
- **Metadata Extraction**: Extracts and standardizes DICOM header information into CSV tables.
- **Intelligent Parsing**: Identifies scan types (T1, T2, etc.) and orders sequences by acquisition time.
- **Modular Design**: Each pipeline step is a standalone script, allowing flexible execution and debugging.
- **Containerized Environment**: Fully Dockerized setup for easy deployment on Linux and WSL systems.
- **Web Interface** (in development): A Flask-based dashboard to monitor and control processing status.
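The automated scanning feature can be illustrated with a short, stdlib-only sketch. The actual `01_scanDicom.py` also reads header metadata; this example shows only the recursive file discovery, and it assumes files carry a `.dcm` extension (real exports sometimes omit one):

```python
import os
from collections import defaultdict

def scan_dicom_dirs(root: str) -> dict:
    """Recursively collect files ending in .dcm under `root`, grouped by
    containing directory (a common one-directory-per-series export layout)."""
    series = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(".dcm"):
                series[dirpath].append(os.path.join(dirpath, name))
    return dict(series)
```

Each key in the returned dict is a directory holding one candidate series, which is a convenient unit for later per-series metadata extraction.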
## Project Structure

```
MRI_preprocessing/
├── code/
│   └── preprocessing/       # Core Python scripts for data processing
│       ├── 01_scanDicom.py  # Scans and extracts DICOM metadata
│       ├── 02_parseDicom.py # Filters and orders scans
│       ├── ...              # Subsequent processing steps
│       ├── DICOM.py         # DICOM handling utilities
│       └── toolbox.py       # General helper functions
├── control_system/          # Docker and web app configuration
│   ├── app/                 # Flask web application
│   └── docker*              # Docker Compose files
├── data/                    # Data storage (mounted volumes)
├── test/                    # Unit and integration tests
├── start_control.sh         # Main entry point script
└── install.py               # Dependency installation script
```
## Installation

### Prerequisites

- Linux or Windows Subsystem for Linux (WSL2)
- Python 3.x
- Docker & Docker Compose (installed automatically via `install.py` if not present)

### Setup

1. Clone the repository:

   ```bash
   git clone https://github.com/TheParraLab/MRI_preprocessing
   cd MRI_preprocessing
   ```

2. Install dependencies and set up Docker:

   ```bash
   python3 install.py
   ```

   Note: This script attempts to install Docker and configure GPU access. If you prefer, you can install Docker manually.
## Usage

The primary way to interact with the pipeline is the `start_control.sh` script:

```bash
bash start_control.sh
```

You will be prompted to:

- Enable the webserver component (y/n).
- Provide the path to your raw DICOM data on the host machine.

The system maps your local data directory to `/FL_system/data/raw/` inside the Docker container.

### Web Interface

If enabled, the web interface is accessible at http://localhost:5000. It provides a dashboard showing the status of the preprocessing steps. (Note: the web interface is currently under active development.)
### Manual Access

For batch processing or direct control, you can access the container's shell.

Option 1: Convenience script

```bash
bash access_preprocessing.sh
```

Option 2: Direct Docker exec

```bash
docker exec -it control bash
cd /FL_system/code/preprocessing/
```

## Preprocessing Workflow

The pipeline consists of numbered scripts in `code/preprocessing/` that should generally be run in order:
- `01_scanDicom.py`: Scans raw data and builds a `Data_table.csv` of all found DICOM files.
  - Documentation: See `code/preprocessing/01_scanDicom.py` for detailed usage and arguments.
- `02_parseDicom.py`: Filters relevant scans (e.g., T1) and orders them by acquisition time.
- `03_saveNifti.py`: Converts selected DICOM series to NIfTI format.
- `04_saveRAS.py`: Reorients NIfTI files to RAS orientation.
- `05_alignScans.py`: Aligns scans to a reference volume.
- `06_genInputs.py`: Generates final model inputs.
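Because the step scripts carry numeric filename prefixes, their execution order can be derived directly from the names. The following is a minimal sketch (not part of the repository) of a driver that discovers and runs the steps in order; it assumes only the two-digit prefix convention shown above:

```python
import re
import subprocess
from pathlib import Path

def ordered_steps(script_dir: str) -> list:
    """Return the numbered step scripts sorted by their numeric prefix."""
    scripts = Path(script_dir).glob("[0-9][0-9]_*.py")
    return sorted(scripts, key=lambda p: int(re.match(r"(\d+)_", p.name).group(1)))

def run_pipeline(script_dir: str) -> None:
    """Execute each step in order; check=True aborts on the first failure."""
    for script in ordered_steps(script_dir):
        subprocess.run(["python", script.name], cwd=script_dir, check=True)
```

Driving the steps through one entry point like this makes it easy to resume from a given step or to skip steps during debugging, which is the point of the modular numbered layout.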
To run a specific step manually inside the container:

```bash
python 01_scanDicom.py --scan_dir /FL_system/data/raw --save_dir /FL_system/data
```

## Testing

Unit and integration tests are located in the `test/` directory.
To run the tests (ensure `pytest` is installed):

```bash
pytest test/
```

## Contributing

- Fork the repository.
- Create a feature branch (`git checkout -b feature/NewFeature`).
- Commit your changes.
- Push to the branch.
- Open a Pull Request.
Please ensure all new code is well-documented and passes existing tests.
## Acknowledgements

- Parra Lab
- Contributors: [Add names here]

For questions or support, please contact nleotta000@citymail.cuny.edu.