This repository contains the code, data, and results for the study:
"Cell segmentation in multiplexed proteomics: a comparative analysis"
Cell segmentation is a critical step in the analysis of multiplexed imaging data, directly influencing downstream analyses and biological insights. In this study, we present a comprehensive comparison of three widely used segmentation algorithms—CellProfiler, Mesmer, and Cellpose—on Imaging Mass Cytometry (IMC) data from colorectal cancer (CRC) tissues.
We evaluate both traditional one-pass segmentation (OneGo) and a sequential segmentation strategy (Sequential), where different cell types are segmented with optimized parameters.
To support further research, we introduce a manually annotated dataset of over 1,900 regions of interest (ROIs) including lymphoid, myeloid, vimentin, and epithelial cells.
- Overview
- Annotation Dataset
- Segmentation Approaches
- Phenotyping and Analysis
- Usage and Environment Setup
- Source Code
- License
- Citation
We compare three segmentation approaches:
- CellProfiler: Classical image processing
- Mesmer: Deep learning (from DeepCell)
- Cellpose: Generalist DL-based model with fine-tuning capabilities
We assess these methods in:
- OneGo segmentation: single-pass segmentation of all cells
- Sequential segmentation: type-specific, prioritized segmentation with custom parameters
Evaluation focuses on:
- Cell count
- Morphology
- Phenotyping
- Marker distribution
- Immune infiltration classification
Additionally, we provide a manually annotated dataset for training and evaluation.
We annotated 25 Fields-of-View (FOVs) of 256×256 pixels from CRC tissues using nuclear and membrane markers:
| Mask Type | Training FOVs | Training ROIs | Test FOVs | Test ROIs | Total FOVs | Total ROIs |
|---|---|---|---|---|---|---|
| Lymphoid | 5 | 315 | 2 | 146 | 7 | 461 |
| Myeloid | 4 | 211 | 2 | 110 | 6 | 321 |
| Vimentin | 5 | 297 | 2 | 235 | 7 | 532 |
| Tumour | 3 | 350 | 2 | 278 | 5 | 628 |
| Total | 17 | 1173 | 8 | 769 | 25 | 1942 |
Cells are only considered valid if both membrane and nuclei are annotated.
Type-specific membrane markers (maximum value of):
- Lymphoid: CD7, CD20, CD38
- Myeloid: CD68, CD14, CD11c
- Vimentin: Vimentin
- Tumour: Keratin, Bcatenin
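The "maximum value of" rule above can be illustrated with a pixel-wise maximum over the channels of one cell type. This is a minimal sketch, not the repository's implementation: the `MEMBRANE_MARKERS` dictionary and the `membrane_channel` function are hypothetical names, and the sketch assumes the channels are already on a comparable intensity scale.

```python
import numpy as np

# Marker panels as listed above; the dict name and layout are illustrative.
MEMBRANE_MARKERS = {
    "Lymphoid": ["CD7", "CD20", "CD38"],
    "Myeloid": ["CD68", "CD14", "CD11c"],
    "Vimentin": ["Vimentin"],
    "Tumour": ["Keratin", "Bcatenin"],
}


def membrane_channel(channels, cell_type):
    """Pixel-wise maximum over the membrane markers of one cell type.

    `channels` maps a marker name to a 2D array; all arrays share one shape.
    Assumption: intensities are already comparable across markers.
    """
    stack = np.stack([channels[m] for m in MEMBRANE_MARKERS[cell_type]], axis=0)
    return stack.max(axis=0)


# Toy 2x2 images standing in for real IMC channels.
channels = {
    "CD7": np.array([[1, 0], [0, 2]]),
    "CD20": np.array([[0, 3], [0, 0]]),
    "CD38": np.array([[0, 0], [4, 1]]),
}
lymphoid = membrane_channel(channels, "Lymphoid")
print(lymphoid)
```

Each pixel of the combined image takes the value of whichever marker is brightest there, so a cell expressing any one marker of the panel still shows a membrane signal.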
Annotations are provided in the annotation_dataset/ folder.
Notebook: 1_Retrain_cellpose.ipynb
- Evaluate pretrained Cellpose models: cyto3, Tissuenet, Tissuenet TN1
- Retrain models using the annotated dataset
- Evaluate retrained models on the test set
- Manually tune parameters (flow threshold, probability threshold)
Notebook: 2_Evaluation_test_set.ipynb
- Evaluate: CellProfiler, Mesmer, Cellpose, Cellpose (retrained)
- Filter nucleus-only segmentations
- Pixel-level and ROI-level evaluations
- Metrics: IoU, TP/TN/FP/FN classification
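As an illustration of the ROI-level metrics, the sketch below computes IoU between boolean masks and counts TP/FP/FN by matching predicted cells to annotated cells. The greedy matching strategy, the 0.5 IoU threshold, and the function names `iou` and `match_rois` are assumptions made for this example; the notebook may use a different matching rule. True negatives are left out because they are not defined for object-level matching.

```python
import numpy as np


def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def match_rois(truth, pred, threshold=0.5):
    """Count TP/FP/FN for labelled masks (0 = background).

    Assumption: greedy matching, where each predicted cell is paired with
    the unmatched annotated cell of highest IoU and counts as a true
    positive only if that IoU reaches `threshold`.
    """
    truth_ids = [i for i in np.unique(truth) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    matched = set()
    tp = 0
    for p in pred_ids:
        best_id, best_iou = None, 0.0
        for t in truth_ids:
            if t in matched:
                continue
            score = iou(truth == t, pred == p)
            if score > best_iou:
                best_id, best_iou = t, score
        if best_id is not None and best_iou >= threshold:
            matched.add(best_id)
            tp += 1
    return {"TP": tp, "FP": len(pred_ids) - tp, "FN": len(truth_ids) - tp}


# Toy 4x4 example: two annotated cells, two predicted cells.
truth = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 2, 2], [0, 0, 2, 2]])
pred = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 3]])
counts = match_rois(truth, pred)
print(counts)
```

In the toy example the first predicted cell overlaps its annotation with IoU 0.75 and is accepted, while the second covers only a quarter of its annotation and is counted as a false positive, leaving that annotated cell as a false negative.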
Notebook: 3_Run_segmentation_onego.ipynb
- Segmentation using nucleus (DNA1, DNA2) and membrane (Vimentin, B-Catenin) markers
- Tune Mesmer and Cellpose parameters
- Extract morphology and intensity features
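To make the feature-extraction step concrete, here is a minimal sketch that derives one morphology feature (area in pixels) and one intensity feature (mean marker intensity) per segmented cell. The function name `cell_features` and the output layout are hypothetical and do not reflect the API of the classes in src/cellinfo.py.

```python
import numpy as np


def cell_features(mask, intensity):
    """Area and mean intensity for every labelled cell (0 = background).

    `mask` is a 2D integer label image; `intensity` is a 2D image of one
    marker with the same shape.
    """
    features = {}
    for cell_id in np.unique(mask):
        if cell_id == 0:
            continue
        pixels = mask == cell_id
        features[int(cell_id)] = {
            "area": int(pixels.sum()),
            "mean_intensity": float(intensity[pixels].mean()),
        }
    return features


# Toy 2x2 example with two cells.
mask = np.array([[1, 1], [0, 2]])
intensity = np.array([[2.0, 4.0], [9.0, 5.0]])
features = cell_features(mask, intensity)
print(features)
```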
Notebook: 4_Run_segmentation_sequential.ipynb
- Type-specific membrane markers:
  - Lymphoid: CD7, CD20, CD38
  - Myeloid: CD68, CD14, CD11c
  - Vimentin: Vimentin
  - Tumour: Keratin, Bcatenin
  - Other: nucleus-only cells
- Overlaid masks with priority (no overlaps)
- Tune segmentation parameters
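One way to overlay type-specific masks with priority and no overlaps is sketched below. It assumes that masks are passed in priority order and that a lower-priority cell keeps only the pixels not already claimed; whether the notebook trims overlapping cells like this or discards them entirely is not stated here, and the priority order used in the example is hypothetical. `merge_by_priority` is an illustrative name, not a function from src/.

```python
import numpy as np


def merge_by_priority(masks):
    """Combine labelled masks (0 = background) given in priority order.

    Earlier masks win: a later cell only claims pixels that are still free.
    Labels are renumbered so they stay unique in the merged mask.
    """
    merged = np.zeros_like(masks[0], dtype=np.int32)
    next_label = 1
    for mask in masks:
        for cell_id in np.unique(mask):
            if cell_id == 0:
                continue
            free = (mask == cell_id) & (merged == 0)
            if free.any():
                merged[free] = next_label
                next_label += 1
    return merged


# Toy example: the first mask has priority over the second.
lymphoid = np.array([[1, 1, 0], [0, 0, 0]])
tumour = np.array([[0, 1, 1], [2, 2, 0]])
merged = merge_by_priority([lymphoid, tumour])
print(merged)
```

In the toy example the tumour cell that overlaps the lymphoid cell keeps only its non-overlapping pixel, so every pixel in the merged mask belongs to at most one cell.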
Notebook: 5_Preprocess_phenotyping.ipynb
- Compute cell area, expression sum, UMAPs
- Median marker expression per segmentation type
- Rule-based phenotyping (decision tree)
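Rule-based phenotyping with a decision tree can be pictured as an ordered chain of marker checks. The sketch below is purely illustrative: the rule order, the single 0.5 threshold, and the function name `classify_cell` are assumptions and do not reproduce the rules implemented in src/cellphenotyping.py. Only the marker names are taken from the panels listed in this README.

```python
def classify_cell(expr, threshold=0.5):
    """Assign a phenotype from marker expression using ordered rules.

    `expr` maps marker names to expression values for one cell.
    Assumption: one global threshold and a fixed rule order; a real
    decision tree would likely use marker-specific cut-offs.
    """

    def positive(*markers):
        # A panel is positive if any of its markers exceeds the threshold.
        return any(expr.get(m, 0.0) > threshold for m in markers)

    if positive("Keratin", "Bcatenin"):
        return "Tumour"
    if positive("CD7", "CD20", "CD38"):
        return "Lymphoid"
    if positive("CD68", "CD14", "CD11c"):
        return "Myeloid"
    if positive("Vimentin"):
        return "Vimentin"
    return "Other"


print(classify_cell({"CD20": 0.9}))
print(classify_cell({"Keratin": 0.8, "CD68": 0.9}))
print(classify_cell({}))
```

Because the rules are evaluated in order, a cell positive for markers from two panels receives the label of the first matching rule, which is why the rule order matters in any such scheme.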
Notebook: 6_Analysis_phenotyping_intraepithelial.ipynb
- Evaluate phenotype classification using heatmaps/matrix plots
- Perform intraepithelial immune cell analysis
Notebook 7 generates phenotyping panels.
The source code for the sequential segmentation pipeline is available in the src/ folder.
- cellinfo.py - CellMorphology and CellIntensities classes that extract morphology and intensity features from segmented cells.
- cellphenotyping.py - CellClassifier classes that classify cells using the decision tree approach.
- filter_cells_mask.py - filtering functions (several strategies) that remove segmented cells lacking membrane markers.
- ImageParser.py - functions to parse TIFF files.
- cellseg.py - functions that run image segmentation with Cellpose and Mesmer and extract marker masks.
To run this project, install dependencies similar to the PENGUIN environment, then install deepcell and/or Cellpose.
Developed at the Leiden University Medical Centre, The Netherlands, and the Centre of Biological Engineering, University of Minho, Portugal.
Released under the GNU General Public License (version 3.0).
If you use this repository, please cite the associated manuscript (preprint/published link to be added).
