deMirandaLab/CellCytoLego

Cell Segmentation in Multiplexed Proteomics: A Comparative Analysis

This repository contains the code, data, and results for the study:
"Cell segmentation in multiplexed proteomics: a comparative analysis"

Cell segmentation is a critical step in the analysis of multiplexed imaging data, directly influencing downstream analyses and biological insights. In this study, we present a comprehensive comparison of three widely used segmentation algorithms—CellProfiler, Mesmer, and Cellpose—on Imaging Mass Cytometry (IMC) data from colorectal cancer (CRC) tissues.

We evaluate both traditional one-pass segmentation (OneGo) and a sequential segmentation strategy (Sequential), where different cell types are segmented with optimized parameters.

To support further research, we introduce a manually annotated dataset of over 1,900 regions of interest (ROIs) including lymphoid, myeloid, vimentin, and epithelial cells.


📑 Table of Contents

  1. Overview
  2. Annotation Dataset
  3. Comparison
  4. Source Code
  5. Usage and Environment Setup
  6. License
  7. Citation

Overview

We compare three segmentation approaches:

  • CellProfiler: Classical image processing
  • Mesmer: Deep learning (from DeepCell)
  • Cellpose: Generalist DL-based model with fine-tuning capabilities

We assess these methods in:

  • OneGo segmentation: single-pass segmentation of all cells
  • Sequential segmentation: type-specific, prioritized segmentation with custom parameters

Evaluation focuses on:

  • Cell count
  • Morphology
  • Phenotyping
  • Marker distribution
  • Immune infiltration classification

Additionally, we provide a manually annotated dataset for training and evaluation.


✏️ Annotation Dataset

We annotated 25 Fields-of-View (FOVs) of 256×256 pixels from CRC tissues using nuclear and membrane markers:

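| Mask Type | Training FOVs | Training ROIs | Test FOVs | Test ROIs | Total FOVs | Total ROIs |
|-----------|---------------|---------------|-----------|-----------|------------|------------|
| Lymphoid  | 5             | 315           | 2         | 146       | 7          | 461        |
| Myeloid   | 4             | 211           | 2         | 110       | 6          | 321        |
| Vimentin  | 5             | 297           | 2         | 235       | 7          | 532        |
| Tumour    | 3             | 350           | 2         | 278       | 5          | 628        |
| Total     | 17            | 1173          | 8         | 769       | 25         | 1942       |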

A cell is considered valid only if both its membrane and its nucleus are annotated.
The membrane signal for each cell type is the maximum value across the following type-specific markers:

  • Lymphoid: CD7, CD20, CD38
  • Myeloid: CD68, CD14, CD11c
  • Vimentin: Vimentin
  • Tumour: Keratin, Bcatenin
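
As an illustration of the "maximum value of" rule, the sketch below builds one membrane image per cell type by taking the pixel-wise maximum over that type's marker channels. It is a minimal example, not the repository's code: the function name `membrane_channel`, the `(channels, height, width)` array layout, and the absence of any prior normalisation are all assumptions.

```python
import numpy as np

# Marker groups copied from the list above; the dict itself is illustrative.
MEMBRANE_MARKERS = {
    "Lymphoid": ["CD7", "CD20", "CD38"],
    "Myeloid": ["CD68", "CD14", "CD11c"],
    "Vimentin": ["Vimentin"],
    "Tumour": ["Keratin", "Bcatenin"],
}


def membrane_channel(image, channel_names, cell_type):
    """Return the pixel-wise maximum over the markers of one cell type.

    image: array of shape (channels, height, width) (assumed layout)
    channel_names: channel names in the same order as axis 0 of `image`
    """
    idx = [channel_names.index(m) for m in MEMBRANE_MARKERS[cell_type]]
    return image[idx].max(axis=0)


# Toy 2x2 image with the three lymphoid channels
names = ["CD7", "CD20", "CD38"]
img = np.array([
    [[1, 0], [0, 0]],  # CD7
    [[0, 5], [0, 0]],  # CD20
    [[0, 0], [3, 2]],  # CD38
])
lymphoid = membrane_channel(img, names, "Lymphoid")
```

If the channels have very different dynamic ranges, rescaling each one before taking the maximum may be needed; whether the pipeline does so is not stated here.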

Annotations are provided in the annotation_dataset/ folder.


📘📊 Comparison

1. Cellpose Retraining

Notebook: 1_Retrain_cellpose.ipynb

  • Evaluate pretrained Cellpose models: cyto3, Tissuenet, Tissuenet TN1
  • Retrain models using the annotated dataset
  • Evaluate retrained models on the test set
  • Manually tune parameters (flow threshold, probability threshold)

2. Evaluation of Segmentation Methods

Notebook: 2_Evaluation_test_set.ipynb

  • Evaluate: CellProfiler, Mesmer, Cellpose, Cellpose (retrained)
  • Filter nucleus-only segmentations
  • Pixel-level and ROI-level evaluations
  • Metrics: IoU, TP/TN/FP/FN classification
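
To make the ROI-level metrics concrete, here is a small sketch of IoU-based matching between an annotated label image and a predicted one. The greedy matching strategy, the 0.5 IoU threshold, and the function names `iou` and `match_rois` are assumptions chosen for illustration; the notebook may match cells differently. True negatives are not counted here because they are only meaningful at pixel level.

```python
import numpy as np


def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def match_rois(truth, pred, threshold=0.5):
    """Count TP/FP/FN between two label images (0 = background).

    Each annotated ROI is greedily paired with the unused predicted cell
    of highest IoU; the pair counts as a TP if IoU >= threshold (assumed).
    """
    truth_ids = [t for t in np.unique(truth) if t != 0]
    pred_ids = [p for p in np.unique(pred) if p != 0]
    used = set()
    tp = 0
    for t in truth_ids:
        best_id, best = None, 0.0
        for p in pred_ids:
            if p in used:
                continue
            score = iou(truth == t, pred == p)
            if score > best:
                best_id, best = p, score
        if best_id is not None and best >= threshold:
            used.add(best_id)
            tp += 1
    return {"TP": tp, "FP": len(pred_ids) - tp, "FN": len(truth_ids) - tp}


# Toy example: two annotated cells, two predicted cells, one good match
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 2],
                  [0, 0, 0, 2]])
pred = np.array([[1, 1, 0, 0],
                 [1, 0, 0, 0],
                 [3, 3, 0, 0]])
counts = match_rois(truth, pred)
```

In the toy example, predicted cell 1 overlaps annotated cell 1 with IoU 0.75 and is matched, while predicted cell 3 and annotated cell 2 remain unmatched.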

3. OneGo Segmentation

Notebook: 3_Run_segmentation_onego.ipynb

  • Segmentation using nucleus (DNA1, DNA2) and membrane (Vimentin, B-Catenin)
  • Tune Mesmer and Cellpose parameters
  • Extract morphology and intensity features
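
The feature extraction step can be pictured with the following sketch, which computes the area and mean intensity of every cell in a label image for a single channel. It relies only on `numpy.bincount`; the function name `cell_features` is hypothetical, and the repository's own `CellMorphology` and `CellIntensities` classes may compute more features or use a different approach.

```python
import numpy as np


def cell_features(labels, channel):
    """Return {cell_id: (area_in_pixels, mean_intensity)} for one channel.

    labels: 2D array of non-negative integer cell ids (0 = background)
    channel: 2D array of intensities with the same shape as `labels`
    """
    flat = labels.ravel()
    area = np.bincount(flat)                             # pixels per id
    total = np.bincount(flat, weights=channel.ravel())   # summed intensity per id
    return {
        int(i): (int(area[i]), float(total[i] / area[i]))
        for i in np.nonzero(area)[0]
        if i != 0  # skip background
    }


# Toy example: cell 1 covers two pixels, cell 2 covers one
labels = np.array([[1, 1, 0],
                   [2, 0, 0]])
channel = np.array([[2.0, 4.0, 9.0],
                    [5.0, 1.0, 1.0]])
features = cell_features(labels, channel)
```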

4. Sequential Segmentation

Notebook: 4_Run_segmentation_sequential.ipynb

  • Type-specific membrane markers:

    • Lymphoid: CD7, CD20, CD38
    • Myeloid: CD68, CD14, CD11c
    • Vimentin: Vimentin
    • Tumour: Keratin, Bcatenin
    • Other: nucleus-only cells
  • Overlay the type-specific masks by priority so that no cells overlap

  • Tune segmentation parameters
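
One way to picture the priority overlay is the sketch below, which merges several label images so that pixels already claimed by a higher-priority mask are never overwritten. This is an assumption-laden illustration: the function name `merge_by_priority` is hypothetical, the priority order is whatever order the caller passes in, and lower-priority cells here keep their unclaimed pixels, whereas the actual pipeline might discard overlapping cells entirely.

```python
import numpy as np


def merge_by_priority(masks):
    """Merge label images into one, earlier masks taking precedence.

    masks: list of 2D integer label images (0 = background), ordered
    from highest to lowest priority. Returns one label image with
    fresh, unique cell ids and no overlapping cells.
    """
    merged = np.zeros_like(masks[0], dtype=np.int64)
    next_id = 1
    for mask in masks:
        for label in np.unique(mask):
            if label == 0:
                continue
            # Only claim pixels that no higher-priority cell owns yet
            free = (mask == label) & (merged == 0)
            if free.any():
                merged[free] = next_id
                next_id += 1
    return merged


# Toy example: the first mask wins the contested pixel at (0, 1)
lymphoid = np.array([[1, 1, 0],
                     [0, 0, 0]])
tumour = np.array([[0, 1, 1],
                   [2, 2, 0]])
merged = merge_by_priority([lymphoid, tumour])
```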

5. Preprocessing & Phenotyping

Notebook: 5_Preprocess_phenotyping.ipynb

  • Compute cell area, expression sum, UMAPs
  • Median marker expression per segmentation type
  • Rule-based phenotyping (decision tree)
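
Rule-based phenotyping with a decision tree can be sketched as a chain of marker checks. The rules, their order, and the 0.5 threshold below are purely illustrative assumptions built from the marker names used elsewhere in this README; they are not the decision tree implemented in `cellphenotyping.py`.

```python
def classify_cell(expr, threshold=0.5):
    """Assign a coarse phenotype from normalised marker expression.

    expr: dict mapping marker name to an expression value in [0, 1].
    The rule order and threshold are illustrative assumptions.
    """
    def positive(marker):
        return expr.get(marker, 0.0) >= threshold

    if positive("Keratin"):
        return "Tumour"
    if positive("CD7") or positive("CD20") or positive("CD38"):
        return "Lymphoid"
    if positive("CD68") or positive("CD14") or positive("CD11c"):
        return "Myeloid"
    if positive("Vimentin"):
        return "Vimentin"
    return "Other"


# Toy cells: earlier rules take precedence over later ones
tumour_cell = classify_cell({"Keratin": 0.9, "Vimentin": 0.2})
myeloid_cell = classify_cell({"CD68": 0.7, "Vimentin": 0.8})
unknown_cell = classify_cell({})
```

Because the checks run top to bottom, a cell positive for both CD68 and Vimentin is labelled "Myeloid" in this sketch; a different rule order would give a different answer.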

6. Phenotype Evaluation & Intraepithelial Analysis

Notebook: 6_Analysis_phenotyping_intraepithelial.ipynb

  • Evaluate phenotype classification using heatmaps/matrix plots
  • Perform intraepithelial immune cell analysis

Notebook 7 generates phenotyping panels.


🔧 Source Code

The source code for the sequential segmentation pipeline is available in the src/ folder.

cellinfo.py - CellMorphology and CellIntensities classes to extract cell morphology and intensity features from segmented cells.

cellphenotyping.py - CellClassifier classes to classify cells based on the decision tree approach.

filter_cells_mask.py - contains filtering functions to remove segmented cells that do not have membrane markers (several strategies).
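
As a rough idea of what one such filtering strategy could look like, the sketch below removes every segmented cell whose mean membrane intensity falls below a cutoff. The function name `drop_cells_without_membrane`, the use of the mean, and the `min_mean` value are assumptions for illustration only; the strategies actually implemented in `filter_cells_mask.py` may differ.

```python
import numpy as np


def drop_cells_without_membrane(labels, membrane, min_mean=0.1):
    """Set to background every cell whose mean membrane signal is too low.

    labels: 2D array of non-negative integer cell ids (0 = background)
    membrane: 2D membrane intensity image with the same shape
    min_mean: hypothetical cutoff on the per-cell mean intensity
    """
    flat = labels.ravel()
    area = np.bincount(flat)
    total = np.bincount(flat, weights=membrane.ravel())
    mean = np.divide(total, area, out=np.zeros_like(total), where=area > 0)
    keep = mean >= min_mean
    keep[0] = False  # background is never a cell
    return np.where(keep[labels], labels, 0)


# Toy example: cell 1 has membrane signal, cell 2 is nucleus-only
labels = np.array([[1, 1, 2],
                   [0, 2, 2]])
membrane = np.array([[0.8, 0.6, 0.0],
                     [0.0, 0.0, 0.05]])
filtered = drop_cells_without_membrane(labels, membrane)
```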

ImageParser.py - functions to parse TIFF files.

cellseg.py - functions to run image segmentation with Cellpose and Mesmer and to extract marker masks.

⚙️ Usage and Environment Setup

To run this project, set up an environment with dependencies similar to those of the PENGUIN environment, then install deepcell (for Mesmer) and/or Cellpose.


License

Developed at the Leiden University Medical Centre, The Netherlands, and the Centre of Biological Engineering, University of Minho, Portugal.

Released under the GNU General Public License (version 3.0).

Citation

If you use this repository, please cite the associated manuscript (preprint/published link to be added).
