
ColonyQuant is a command-line toolkit for fully automated quantification and morphometric analysis of AP-stained embryonic stem cell colonies, with integrated visualization and clustering workflows.


ColonyQuant

A command-line toolkit for fully automated, high-throughput quantification of alkaline phosphatase (AP) staining in embryonic stem cell (ESC) colonies. The pipeline uses adaptive thresholding and contour detection to segment individual ESC colonies, then extracts per-colony intensity metrics and a suite of morphometric descriptors (area, perimeter, circularity, solidity, eccentricity, etc.). These measurements feed integrated downstream analyses, including PCA, t-SNE, UMAP, LDA, PLS-DA and random-forest classification, with publication-ready plots and summary tables produced in both R and Python.

Installation

  1. Clone the repository:
    git clone https://github.com/KidderLab/ColonyQuant.git
    cd ColonyQuant
  2. Create and activate the Conda environment:
    conda env create -f environment.yml
    conda activate colonyquant
  3. Install the package in editable mode:
    python -m pip install -e .

Quick start: run the full pipeline

The complete ColonyQuant workflow can be executed end-to-end using a single command.

Full pipeline (recommended)

./run_colonyquant_pipeline.sh

Pixel size calibration (optional)

Several ColonyQuant commands accept an additional optional calibration parameter:

--pixel-size-um <FLOAT>

When provided, ColonyQuant:

  • Converts pixel-based area and perimeter measurements into physical units
  • Adds calibrated columns such as Area_um2, Perimeter_um, EquivalentDiameter_um
  • Generates additional plots with axes labeled in µm² or µm

If --pixel-size-um is omitted, all commands report pixel-based measurements only.

Pixel-calibrated and pixel-based columns may coexist in the same output tables.
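
The unit conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not ColonyQuant's own code: it assumes square pixels, that --pixel-size-um is the edge length of one pixel in µm, and that the equivalent diameter is the diameter of the circle with the same area. The function name calibrate is hypothetical.

```python
import math

def calibrate(area_px, perimeter_px, pixel_size_um):
    """Convert pixel-based measurements to physical units (illustrative only).

    Assumes square pixels whose edge length is pixel_size_um micrometres.
    """
    if pixel_size_um <= 0:
        raise ValueError("pixel_size_um must be positive")
    area_um2 = area_px * pixel_size_um ** 2      # areas scale with the square
    perimeter_um = perimeter_px * pixel_size_um  # lengths scale linearly
    # Assumed definition: diameter of the circle with the same area.
    equivalent_diameter_um = math.sqrt(4.0 * area_um2 / math.pi)
    return {
        "Area_um2": area_um2,
        "Perimeter_um": perimeter_um,
        "EquivalentDiameter_um": equivalent_diameter_um,
    }

# A 1000-pixel colony imaged at 1.76 µm per pixel.
row = calibrate(area_px=1000.0, perimeter_px=120.0, pixel_size_um=1.76)
print(row)
```

Because area scales with the square of the pixel size, a small error in the calibration value has roughly twice the relative effect on Area_um2 as on Perimeter_um.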

Full pipeline with pixel size calibration

./run_colonyquant_pipeline_pixel_size_um.sh

These scripts run the entire ColonyQuant pipeline sequentially, including:

  • AP colony detection and quantitation
  • Morphometric feature extraction
  • Histogram and scatter visualizations
  • v4 and v11 visualization pipelines, including contour heatmaps, mosaics, and clustering analyses

See the sections below for detailed CLI usage and customization options.

CLI Usage

1. Quantification

colonyquant quantitate \
  --ap-dir examples/images \
  --block-size 51 --offset 5 --min-area 300 \
  --output examples/output

Processes all AP images, generating per-image:

  • <basename>_summary.csv — per-colony measurements (Area, Mean Intensity, Total Intensity)
  • <basename>_mask.png — binary colony mask
  • <basename>_boundary.png — original image with detected contours overlaid
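
To make the roles of --block-size, --offset and --min-area more concrete, here is a small pure-Python sketch of mean-based adaptive thresholding followed by connected-component filtering. It is an illustration under assumptions, not ColonyQuant's implementation: it assumes colonies are darker than their surroundings, uses an unweighted local mean, uses 4-connectivity, and measures intensity on the raw grayscale values. All function names are hypothetical.

```python
from collections import deque

def adaptive_mask(img, block_size, offset):
    """Foreground = pixel darker than (local mean - offset). Assumed polarity."""
    h, w = len(img), len(img[0])
    r = block_size // 2
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            window = [img[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            mask[y][x] = img[y][x] < local_mean - offset
    return mask

def label_colonies(mask, min_area):
    """Return pixel lists for 4-connected components with >= min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    colonies = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:  # discard specks below the area cutoff
                colonies.append(pixels)
    return colonies

def summarize(img, colonies):
    """One row per colony with the three columns named in the summary CSV."""
    rows = []
    for pixels in colonies:
        total = sum(img[y][x] for y, x in pixels)
        rows.append({"Area": len(pixels),
                     "Mean Intensity": total / len(pixels),
                     "Total Intensity": total})
    return rows

# Toy image: bright background, one 4x4 dark colony and one single-pixel speck.
image = [[200] * 12 for _ in range(12)]
for y in range(2, 6):
    for x in range(2, 6):
        image[y][x] = 50
image[9][9] = 50

mask = adaptive_mask(image, block_size=7, offset=5)
rows = summarize(image, label_colonies(mask, min_area=4))
print(rows)
```

In this toy image the single dark pixel passes the threshold but is dropped by the area filter, which is the purpose a minimum-area cutoff typically serves.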

Figure: Quantification mask & boundary

Pixel-calibrated quantification (optional)

colonyquant quantitate \
  --ap-dir examples/images \
  --block-size 51 --offset 5 --min-area 300 \
  --pixel-size-um 1.76 \
  --output examples/output

When --pixel-size-um is provided, pixel-based measurements are converted to physical units (for example µm²), and calibrated columns are added to the output tables. If omitted, quantification proceeds in pixel units only.


2. Morphometrics

colonyquant morphometrics \
  --ap-dir examples/images \
  --output-dir examples/output

Files generated by colonyquant morphometrics

Computes shape features for each image and exports the following files, all written into examples/output/:

  • <basename>_summary.csv
    Per-image CSV with colony‐level metrics: Area, Perimeter, Aspect Ratio, Extent, Solidity, Equivalent Diameter, Circularity, Eccentricity

  • Histograms (one file per feature; for each image named <basename>_hist_<feature>.png):

    • *_hist_area.png
    • *_hist_perimeter.png
    • *_hist_aspect_ratio.png
    • *_hist_extent.png
    • *_hist_solidity.png
    • *_hist_equivalent_diameter.png
    • *_hist_circularity.png
    • *_hist_eccentricity.png
  • Intensity vs. Solidity scatter

    • <basename>_intensity_vs_solidity.png
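
For readers unfamiliar with the shape descriptors listed above, the following sketch computes most of them from a polygonal colony outline using their common textbook definitions. These definitions are assumptions; the README does not state the exact formulas ColonyQuant uses. Eccentricity is omitted because it requires an ellipse fit. The function names are hypothetical.

```python
import math

def polygon_area(pts):
    """Shoelace formula for a simple polygon given as ordered (x, y) vertices."""
    n = len(pts)
    s = 0.0
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(pts):
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def shape_features(pts):
    """Assumed definitions, based on the axis-aligned bounding box and convex hull."""
    area = polygon_area(pts)
    perimeter = polygon_perimeter(pts)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    hull_area = polygon_area(convex_hull(pts))
    return {
        "Area": area,
        "Perimeter": perimeter,
        "Aspect Ratio": width / height,
        "Extent": area / (width * height),            # fill of the bounding box
        "Solidity": area / hull_area,                 # fill of the convex hull
        "Equivalent Diameter": math.sqrt(4.0 * area / math.pi),
        "Circularity": 4.0 * math.pi * area / perimeter ** 2,  # 1.0 for a circle
    }

# An L-shaped outline: a 4x4 square with a 2x2 corner removed.
outline = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
features = shape_features(outline)
print(features)
```

The L-shaped example has a solidity below 1 because its convex hull covers the notch, which is why solidity is often read as a measure of how irregular or lobed a colony boundary is.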

Figure: Morphometrics histograms & scatter


2.1 Pixel-calibrated morphometrics

colonyquant morphometrics \
  --ap-dir examples/images \
  --output-dir examples/output \
  --pixel-size-um 1.76

When calibration is provided, additional physical-unit features (for example Area_um2) are included in all outputs.


2.2 Hierarchical colony table

colonyquant morphometrics \
  --ap-dir examples/images \
  --output-dir examples/output \
  --export-hierarchy

Generates:

  • hierarchical_colony_table.csv

Each row corresponds to one colony with the hierarchy:

SampleID
 └── ImageID
      └── ColonyID

This output is intended for mixed-effects and hierarchical statistical modeling.
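
As a rough illustration of how such a table might be consumed, the sketch below nests colonies by sample and image using only the Python standard library. It assumes the three hierarchy levels appear as columns named SampleID, ImageID and ColonyID; the example rows and the Area column are invented for illustration and do not come from ColonyQuant output.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: the three ID columns follow the hierarchy above; the
# values and the "Area" column are made up for illustration.
EXAMPLE_CSV = """SampleID,ImageID,ColonyID,Area
1-control,1-control-ESC-01,1,412
1-control,1-control-ESC-01,2,388
1-control,1-control-ESC-02,1,455
2-treatment,2-treatment-ESC-01,1,190
"""

def build_hierarchy(csv_text):
    """Nest colony IDs as {SampleID: {ImageID: [ColonyID, ...]}}."""
    tree = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(csv_text)):
        tree[row["SampleID"]][row["ImageID"]].append(row["ColonyID"])
    # Convert to plain dicts so missing keys raise instead of being created.
    return {sample: dict(images) for sample, images in tree.items()}

hierarchy = build_hierarchy(EXAMPLE_CSV)
for sample, images in hierarchy.items():
    for image, colonies in images.items():
        print(sample, image, len(colonies))
```

Keeping one row per colony with explicit grouping columns is what lets a mixed-effects model treat sample and image as nested random effects.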


2.3 Image-level aggregation

colonyquant morphometrics \
  --ap-dir examples/images \
  --output-dir examples/output \
  --aggregate-level image

Generates:

  • image_summary.csv

Each row summarizes all colonies within a single image and reports:

  • N_colonies
  • <feature>_mean
  • <feature>_median
  • <feature>_sd
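
The per-image summary columns can be reproduced from colony-level rows with a short sketch like the one below. It assumes each row carries an ImageID column and that the _sd column is the sample standard deviation (n - 1 denominator); neither is confirmed by this README. The function name aggregate_by_image is hypothetical.

```python
import statistics
from collections import defaultdict

def aggregate_by_image(rows, features):
    """Collapse colony-level rows into one summary row per image."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["ImageID"]].append(row)
    summary = []
    for image_id, members in sorted(groups.items()):
        out = {"ImageID": image_id, "N_colonies": len(members)}
        for feature in features:
            values = [float(m[feature]) for m in members]
            out[f"{feature}_mean"] = statistics.mean(values)
            out[f"{feature}_median"] = statistics.median(values)
            # Assumption: sample standard deviation; undefined for one colony.
            out[f"{feature}_sd"] = (
                statistics.stdev(values) if len(values) > 1 else float("nan")
            )
        summary.append(out)
    return summary

# Invented colony rows for two images.
colony_rows = [
    {"ImageID": "img1", "Area": 100},
    {"ImageID": "img1", "Area": 200},
    {"ImageID": "img1", "Area": 300},
    {"ImageID": "img2", "Area": 50},
]
image_summary = aggregate_by_image(colony_rows, ["Area"])
for line in image_summary:
    print(line)
```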

2.4 Well-level aggregation using a plate map

colonyquant morphometrics \
  --ap-dir examples/images \
  --output-dir examples/output \
  --aggregate-level well \
  --plate-map plate_map.csv

Generates:

  • well_summary.csv

Plate map requirements

The plate map must be a CSV file with the following columns (case-insensitive):

  • ImageID: image filename stem (without extension)
  • WellID: true plate well identifier (for example A01, B05)

Example plate_map.csv:

ImageID,WellID
1-control-ESC-01,A01
1-control-ESC-02,A01
2-treatment-ESC-01,B05
2-treatment-ESC-02,B05

The plate map is only used when --aggregate-level well is requested. For image-level aggregation and hierarchical export without well aggregation, the original filename-based inference strategy is used.

If any processed image is missing from the plate map, ColonyQuant raises an error to prevent silent mislabeling.
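
The plate map rules above (case-insensitive column names, and an error when an image has no well) can be sketched as follows. This is an illustrative reimplementation, not ColonyQuant's own loader; the function names and the exception types are assumptions.

```python
import csv
import io

def load_plate_map(csv_text):
    """Parse a plate map into {ImageID: WellID}, matching headers case-insensitively."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Map lower-cased header names back to the spelling used in the file.
    lookup = {name.strip().lower(): name for name in (reader.fieldnames or [])}
    missing = {"imageid", "wellid"} - lookup.keys()
    if missing:
        raise ValueError(f"plate map is missing columns: {sorted(missing)}")
    return {
        row[lookup["imageid"]].strip(): row[lookup["wellid"]].strip()
        for row in reader
    }

def assign_wells(image_ids, plate_map):
    """Fail loudly if any processed image has no well assignment."""
    unmapped = sorted(i for i in image_ids if i not in plate_map)
    if unmapped:
        raise KeyError(f"images missing from plate map: {unmapped}")
    return {i: plate_map[i] for i in image_ids}

PLATE_MAP = """imageid,WELLID
1-control-ESC-01,A01
1-control-ESC-02,A01
2-treatment-ESC-01,B05
2-treatment-ESC-02,B05
"""

plate_map = load_plate_map(PLATE_MAP)
wells = assign_wells(["1-control-ESC-01", "2-treatment-ESC-02"], plate_map)
print(wells)
```

Raising on the first unmapped image, rather than falling back to filename inference, keeps a partially filled plate map from producing a well_summary.csv that mixes two labeling schemes.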


2.5 Input directory behavior

The morphometrics command automatically ignores non-image files in --ap-dir.

Supported image extensions:

  • .png, .jpg, .jpeg, .tif, .tiff, .bmp, .webp

This allows files such as plate_map.csv to coexist in the image directory without being processed as images.
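
A filter of this kind takes only a few lines. The sketch below uses the extension list above; treating extensions case-insensitively (so that .PNG also matches) is an assumption, and the function names are hypothetical.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".webp"}

def is_supported_image(name):
    # Assumption: extension matching ignores case.
    return Path(name).suffix.lower() in IMAGE_EXTENSIONS

def select_images(names):
    """Keep only file names with a supported image extension, sorted by name."""
    return sorted(n for n in names if is_supported_image(n))

files = ["1-control-ESC-01.tif", "plate_map.csv", "2-treatment-ESC-01.PNG", "notes.txt"]
selected = select_images(files)
print(selected)
```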


3. Visualization v4

colonyquant visualize-v4 \
  --data-dir examples/output \
  --out-dir examples/output/v4_figs \
  --features3 Log2Area "Mean Intensity" Circularity \
  --features2 Log2Area "Mean Intensity" \
  --grid_size 300 \
  --shapes

Generates contour-density heatmaps and 3D plots.

Files generated by colonyquant visualize-v4

All files are written into examples/output/v4_figs/

  • 2D density heatmap

    • colony_2d.png — 2D kernel‐density heatmap of the two features passed to --features2 (e.g. Log2Area vs. Mean Intensity).
  • 3D scatter‐density plot

    • colony_3d.png — 3D scatter‐density of the three features passed to --features3 (e.g. Log2Area, Mean Intensity, Circularity), colored by local point density.
  • Contour‐density heatmaps (one per group <GID>)

    • group_<GID>_contour_heatmap.png — aggregated contour‐density (“hot” colormap) for each treatment group, with outlines overlaid if --shapes is set.

Figures: 2D density heatmap; 3D scatter-density plot; Sample 1 mask density; Sample 2 mask density

4. Visualization v11

colonyquant visualize-v11 \
  --data-dir examples/output \
  --out-dir examples/output/v11_figs \
  --features3 Log2Area "Mean Intensity" Circularity \
  --features2 Log2Area "Mean Intensity" \
  --grid_size 100 \
  --contours --mosaic --heatmosaic \
  --heatmap_size 200 \
  --global_clusters 100 \
  --cluster_analysis

Runs the full v11 visualization pipeline.

Files generated by colonyquant visualize-v11

All files are written into examples/output/v11_figs/

  • Per-group data tables

    • group_<GID>_data.csv
      Merged per-colony summary (Area, Intensity, Morphometrics) for group <GID>.
  • Density plots

    • colony_2d.png
      2D density heatmap of features2 (e.g. Log2Area vs. Mean Intensity).
    • colony_3d.png
      3D scatter-density of features3 (e.g. Log2Area, Mean Intensity, Circularity).
  • Contour heatmaps (if --contours passed)

    • group_<GID>_contour_heatmap_gray.png
      Grayscale contour-density for group <GID>.
    • group_<GID>_contour_heatmap_color.png
      “Hot” colormap version.
  • Summary mosaics (if --mosaic passed)

    • group_<GID>_summary_mosaic.png
      10×10 grid of the top 100 colony outlines for group <GID>.
    • global_summary_mosaic.png
      10×10 grid of the top 100 outlines pooled across all groups.
    • group_<GID>_shape_entropies_mosaic_10x10.png
      10×10 grid colored by Shannon entropy of each colony’s 2D KDE.
  • Heat-mosaics (if --heatmosaic passed)

    • group_<GID>_heatmosaic.png
      Mosaic of per-shape 2D density plots (hot-colormap) for the top 100 outlines in group <GID>.
  • Cluster-analysis outputs (if --cluster_analysis passed)

    • group_cluster_fractions.csv
      Fraction of each cluster within each group.
    • group_cluster_fraction_heatmap.png
      Heatmap of cluster fractions by group.
    • group_js_distances.csv
      Pairwise Jensen–Shannon distances between group distributions.
    • group_js_distances_heatmap.png
      Heatmap of the JS-distance matrix.
    • cluster_centers.json
      JSON with coordinates of the cluster centers (in Hu-moment space).
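
To illustrate the cluster-fraction and Jensen-Shannon outputs, the sketch below computes per-group cluster fractions from cluster labels and the JS distance between two such fraction vectors. It assumes the "group distributions" being compared are these cluster-fraction vectors and uses base-2 logarithms (so the distance lies between 0 and 1); ColonyQuant may use a different distribution or log base. The function names are hypothetical.

```python
import math
from collections import Counter

def cluster_fractions(labels, n_clusters):
    """Fraction of colonies assigned to each cluster index 0..n_clusters-1."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(n_clusters)]

def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base-2 logs)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding

# Invented cluster labels for two groups of colonies.
group_a = cluster_fractions([0, 0, 1, 2], n_clusters=3)
group_b = cluster_fractions([2, 2, 2, 1], n_clusters=3)
print(group_a, group_b, js_distance(group_a, group_b))
```

Unlike the Kullback-Leibler divergence, the JS distance is symmetric and stays finite when one group has no colonies in a cluster, which makes it convenient for a pairwise group-by-group matrix.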

Figures: Color contour heatmap; Grayscale contour heatmap; Heat mosaic; Summary mosaic; Shape entropies (10×10 mosaic); Entropies heatmap; JS distances heatmap; Cluster fraction heatmap


5. Feature Distributions (R)

colonyquant feature-distributions \
  --data-dir examples/output \
  --out-dir examples/output/feature_distributions

Calls R/AP_stain_feature_distributions_v3.R to generate:

  • Violin + boxplots
  • Raincloud plots
  • Ridgeline densities
  • Z-score heatmap
  • Radar charts
  • Panel mosaics

Ensure the required R packages are installed:

install.packages(c(
  "ggplot2",    # plotting
  "ggdist",     # visualizing distributions
  "ggridges",   # ridgeline plots
  "pheatmap",   # heatmaps
  "fmsb",       # radar charts
  "cowplot",    # combining ggplots
  "dplyr",      # data manipulation
  "tidyr",      # data reshaping
  "viridis",    # color scales
  "ggplotify",  # convert plots to grobs
  "scales",     # axis scaling
  "mixOmics"    # multivariate data integration
))

Files generated by colonyquant feature-distributions

All files are written into examples/output/feature_distributions/

  • Violin + boxplots

    • violin_boxplot_Area.png
    • violin_boxplot_MeanIntensity.png
    • violin_boxplot_TotalIntensity.png
  • Raincloud plots

    • raincloud_Area.png
    • raincloud_MeanIntensity.png
  • Ridgeline density plots

    • ridgeline_Area.png
    • ridgeline_MeanIntensity.png
  • Z-score heatmap

    • zscore_heatmap.png
  • Radar charts

    • radar_chart_Group1.png
    • radar_chart_Group2.png
  • Panel mosaics

    • panel_mosaic_A.png
    • panel_mosaic_B.png
    • panel_mosaic_C.png
    • panel_mosaic_D.png
    • panel_mosaic_E.png

Figures: Mosaic A–D (labeled); Panel E


6. Comprehensive Analysis (R)

colonyquant comprehensive-analysis \
  --data-dir examples/output \
  --out-dir examples/output/comprehensive_analysis

Calls R/AP_stain_comprehensive_analysis_v3.R to produce advanced multivariate plots.
Install any missing R packages when prompted.

Files generated by colonyquant comprehensive-analysis

All files are written into examples/output/comprehensive_analysis/

  • Merged summary table

    • merged_summary.csv
      Combined per-colony summary statistics from all input groups.
  • Random Forest feature importance

    • feature_importance_RF.png
    • feature_importance_RF.pdf
  • PCA on sample medians

    • PCA_sample_medians.png
    • PCA_sample_medians.pdf
  • PLS-DA embedding

    • PLSDA_colonies.png
    • PLSDA_colonies.pdf
  • LDA classification

    • LDA_colonies.png
    • LDA_colonies.pdf
  • t-SNE embedding

    • tsne_embedding.png
    • tsne_embedding.pdf
  • Supervised UMAP (top 10 PCs)

    • umap_supervised_top10PCs.png
    • umap_supervised_top10PCs.pdf
  • Composite panels

    • panel_A.png / panel_A.pdf (Random Forest)
    • panel_B.png / panel_B.pdf (PCA)
    • panel_C.png / panel_C.pdf (PLS-DA)
    • panel_D.png / panel_D.pdf (t-SNE)
    • panel_E.png / panel_E.pdf (UMAP)
  • Labeled mosaic of panels A–D

    • mosaic_A-D_labeled.png
    • mosaic_A-D_labeled.pdf
  • Overview figure

    • comprehensive_analysis.png
    • comprehensive_analysis.pdf

Figures: Random Forest feature importance; PCA on sample medians; LDA colony classification; PLS-DA colony embedding; Supervised UMAP (top 10 PCs)

Examples

Check README.md for sample workflows and expected outputs.


Enjoy automated, reproducible AP stain quantification and visualization!

Visit the Kidder Lab website: benjaminkidderlab.com

Cancer Biology Program profile: Kidder Lab


License

ColonyQuant is available free of charge for academic research, teaching, and internal core-facility use. Commercial use requires prior written permission.
