Q-score Validation

Benchmarking pure Python Q-score computation against MapQ/Chimera (EMDB reference)

Overview

This repository contains a systematic comparison of the pure Python Q-score reimplementation (jamaliki/qscore) against the reference MapQ/Chimera implementation (gregdp/mapq), as used by the EMDB for structural validation reports.

This work supports the effort to remove the UCSF Chimera dependency from the IHMValidation pipeline (issue #119).

What is Q-score?

The Q-score metric quantifies the resolvability of individual atoms in cryo-EM maps as the correlation between the map values around each atom and a reference Gaussian profile. A Q-score of 1.0 indicates perfect resolvability, while values near 0.0 (or below, since the score is a correlation) indicate poor resolvability. It was introduced by Pintilie et al. (2020) and is now part of the standard wwPDB/EMDB validation pipeline.
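
As a rough illustration of the definition above, the sketch below scores map values sampled at known distances from an atom against a reference Gaussian. The `max_d` line uses the formula quoted under "Factors ruled out"; the `min_d` baseline (mean − 1σ, floored at the map minimum) is an assumption based on our reading of MapQ. The function names and the synthetic profiles are hypothetical, and this is not the code of either implementation.

```python
import numpy as np

def reference_gaussian(radii, map_mean, map_std, map_min, map_max, sigma=0.4):
    """Reference profile A * exp(-r^2 / (2 sigma^2)) + B (assumed parameterisation)."""
    max_d = min(map_mean + 10.0 * map_std, map_max)  # formula quoted in this README
    min_d = max(map_mean - 1.0 * map_std, map_min)   # assumption, not confirmed by this README
    amplitude, baseline = max_d - min_d, min_d
    return amplitude * np.exp(-0.5 * (np.asarray(radii) / sigma) ** 2) + baseline

def q_score(map_values, reference_values):
    """Correlation about the mean between observed and reference values."""
    u = np.asarray(map_values, dtype=float)
    v = np.asarray(reference_values, dtype=float)
    u = u - u.mean()
    v = v - v.mean()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Synthetic radial profiles sampled every 0.1 Å out to 2.0 Å.
radii = np.linspace(0.0, 2.0, 21)
reference = reference_gaussian(radii, map_mean=0.0, map_std=0.02, map_min=-0.1, map_max=0.3)
sharp = np.exp(-0.5 * (radii / 0.4) ** 2)    # profile as sharp as the reference
blurred = np.exp(-0.5 * (radii / 1.2) ** 2)  # broader, less resolved profile

q_sharp = q_score(sharp, reference)
q_blurred = q_score(blurred, reference)
print(f"sharp: {q_sharp:.3f}, blurred: {q_blurred:.3f}")
```

In this sketch the amplitude and baseline of the reference do not change the result, because a correlation about the mean is invariant to positive scaling and to constant shifts of either profile.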

Why replace MapQ?

The current implementation relies on MapQ, a UCSF Chimera plugin, which introduces several issues:

Issue	Impact
Requires UCSF Chimera (Python 2.7)	Heavy, legacy dependency
Needs Tk and X11 display libraries	Fails in headless environments
Slow for large structures	Minutes per structure
Not pip-installable	Manual installation required

Method

1. Selected 28 EMDB entries with deposited Q-scores (resolution: 1.8–4.2 Å).
2. Downloaded the PDB models and EMDB maps for each entry.
3. Computed Q-scores using jamaliki/qscore with σ = 0.4 (EMDB standard).
4. Compared them against the EMDB reference Q-scores (computed by MapQ v2.9.7).
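
The download step could be scripted roughly as follows. The URL patterns are assumptions based on the public EMDB and RCSB file servers and may change; the helper names are hypothetical, and `scripts/run_qscore_comparison.py` may fetch the files differently.

```python
from pathlib import Path
from urllib.request import urlretrieve

def emdb_map_url(emdb_id: str) -> str:
    """URL of the primary map for an ID such as 'EMD-2984' (assumed layout)."""
    number = emdb_id.upper().removeprefix("EMD-")
    return (
        "https://ftp.ebi.ac.uk/pub/databases/emdb/structures/"
        f"EMD-{number}/map/emd_{number}.map.gz"
    )

def rcsb_model_url(pdb_id: str) -> str:
    """URL of the mmCIF model for a PDB ID such as '5A1A' (assumed layout)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.cif"

def download_entry(emdb_id: str, pdb_id: str, out_dir: str = "test_data") -> tuple[Path, Path]:
    """Fetch the map and the model unless they are already cached in out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for url in (emdb_map_url(emdb_id), rcsb_model_url(pdb_id)):
        path = target / url.rsplit("/", 1)[-1]
        if not path.exists():
            urlretrieve(url, path)  # network access happens only here
        paths.append(path)
    return paths[0], paths[1]

# Building the URLs needs no network access.
map_url = emdb_map_url("EMD-2984")
model_url = rcsb_model_url("5a1a")
print(map_url)
print(model_url)
```

Caching by file name keeps repeated runs from downloading the roughly 2 GB of maps again.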

Results

Correlation Plot

Figure 1. (A) Scatter plot of MapQ vs qscore values with linear fit. (B) Difference as a function of resolution. (C) Bland-Altman plot showing limits of agreement. The red × marks the single outlier (EMD-72359, extremely low map contrast).

Summary Statistics

Metric	All entries (n=28)	Excluding outlier (n=27)
Pearson r	0.892	0.997
Spearman ρ	0.987	0.995
Mean offset	+0.031 ± 0.063	+0.043 ± 0.009
Max |diff|	0.291	0.055
Linear fit	—	qscore = 0.994 × MapQ + 0.046
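
The summary statistics can be approximately recomputed from the rounded values in the Full Comparison Table, as in the sketch below. It assumes NumPy and SciPy are installed; the `summarize` helper, the sample standard deviation (`ddof=1`), and the 0.2 outlier cut-off are our assumptions, not necessarily what `scripts/analyze_results.py` does, so the last digit may differ from the table above.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# (MapQ, qscore) values copied from the Full Comparison Table, in table order.
mapq = np.array([0.666, 0.615, 0.610, 0.676, 0.452, 0.532, 0.566, 0.556, 0.541, 0.559,
                 0.495, 0.413, 0.458, 0.481, 0.522, 0.444, 0.484, 0.181, 0.361, 0.378,
                 0.457, 0.439, 0.297, 0.450, 0.434, 0.376, 0.324, 0.294])
qscore = np.array([0.710, 0.644, 0.655, 0.707, 0.503, 0.574, 0.603, 0.592, 0.591, 0.601,
                   0.547, 0.466, 0.506, 0.529, 0.568, 0.470, 0.529, 0.203, 0.413, 0.408,
                   0.509, 0.487, 0.341, 0.500, 0.478, 0.085, 0.375, 0.349])

def summarize(x, y):
    """Agreement statistics between reference scores x and candidate scores y."""
    diff = y - x
    slope, intercept = np.polyfit(x, y, 1)  # least-squares fit y = slope * x + intercept
    return {
        "n": len(x),
        "pearson_r": float(pearsonr(x, y)[0]),
        "spearman_rho": float(spearmanr(x, y)[0]),
        "mean_offset": float(diff.mean()),
        "sd_offset": float(diff.std(ddof=1)),  # sample SD is an assumption
        "max_abs_diff": float(np.abs(diff).max()),
        "slope": float(slope),
        "intercept": float(intercept),
    }

all_entries = summarize(mapq, qscore)
keep = np.abs(qscore - mapq) < 0.2  # in this data set, drops only EMD-72359
no_outlier = summarize(mapq[keep], qscore[keep])

for label, stats in (("all", all_entries), ("no outlier", no_outlier)):
    print(label, {key: round(value, 3) for key, value in stats.items()})
```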

Per-residue Validation

To confirm agreement at the individual residue level (not just global averages), we performed a detailed per-residue comparison on 5A1A / EMD-2984 (2.2 Å, 20 chains, 4,085 residues):

Figure 2. (A) Per-residue scatter plot with linear fit. (B) Distribution of per-residue differences. (C) Offset along Chain A sequence with 20-residue moving average.

Metric	Value
Residues matched	4,085
Pearson r	0.987
Spearman ρ	0.973
Mean offset	+0.028 ± 0.017
RMSD	0.033
Max |diff|	0.109
Linear fit	y = 0.992x + 0.034
R²	0.975

Key finding: Per-residue agreement is strong across the entire Q-score range (0.1–0.8), confirming that the pure Python implementation preserves residue-level ranking and can reliably identify poorly resolved regions.
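
A per-residue comparison of this kind can be sketched as below: residues are matched by (chain, residue number), and the offset, RMSD, and a moving average along the sequence are computed from the differences. The function names, the dictionary layout, and the synthetic scores are hypothetical; `scripts/per_residue_comparison.py` may be organised differently.

```python
import numpy as np

def compare_per_residue(reference, candidate):
    """Match residues present in both score sets and summarise the differences."""
    keys = sorted(set(reference) & set(candidate))
    ref = np.array([reference[key] for key in keys])
    new = np.array([candidate[key] for key in keys])
    diff = new - ref
    stats = {
        "matched": len(keys),
        "mean_offset": float(diff.mean()),
        "sd_offset": float(diff.std()),  # population SD
        "rmsd": float(np.sqrt(np.mean(diff ** 2))),
        "max_abs_diff": float(np.abs(diff).max()),
    }
    return keys, diff, stats

def moving_average(values, window=20):
    """Unweighted moving average; the output is shorter by window - 1."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Synthetic example: one chain, constant +0.03 offset, one residue missing.
residues = [("A", number) for number in range(1, 51)]
reference = {key: 0.5 + 0.2 * np.sin(key[1] / 5.0) for key in residues}
candidate = {key: q + 0.03 for key, q in reference.items() if key != ("A", 50)}

keys, diff, stats = compare_per_residue(reference, candidate)
smoothed = moving_average(diff, window=20)
print(stats)
```

With a population standard deviation, RMSD² equals offset² + SD². The reported values are consistent with this: √(0.028² + 0.017²) ≈ 0.033.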

PDB-IHM Integrative Structure

To confirm compatibility with IHMValidation's target data, we tested on PDBDEV_00000141 / EMD-14774 — an integrative model of PTX3 Pentraxin (2.5 Å) combining cryo-EM and AlphaFold-derived coordinates.

Figure 3. (A) Q-score distribution across 2,912 residues. (B) Per-chain box plots showing consistent Q-scores across all 8 symmetric chains. (C) Q-score along Chain A sequence with 10-residue moving average.

Property	Value
Entry	PDBDEV_00000141 / EMD-14774 / 9A24
Type	Integrative (cryo-EM + AlphaFold)
Resolution	2.5 Å
Chains / Residues	8 / 2,912
Overall Q-score	0.337
Computation time	~6 seconds

Key finding: The pure Python Q-score runs successfully on PDB-IHM integrative structures, producing physically meaningful per-chain and per-residue scores without any Chimera dependency.

Full Comparison Table

EMDB ID	PDB ID	Res (Å)	MapQ	qscore	Diff	% Diff
EMD-52518	9HYU	1.8	0.666	0.710	+0.044	+6.6%
EMD-2984	5A1A	2.2	0.615	0.644	+0.029	+4.7%
EMD-53512	9R1Q	2.2	0.610	0.655	+0.045	+7.4%
EMD-64933	9VBT	2.3	0.676	0.707	+0.031	+4.6%
EMD-64047	9UCL	2.4	0.452	0.503	+0.051	+11.3%
EMD-48779	9N09	2.6	0.532	0.574	+0.042	+7.9%
EMD-55737	9T9U	2.6	0.566	0.603	+0.037	+6.6%
EMD-73900	9Z8M	2.7	0.556	0.592	+0.036	+6.5%
EMD-63009	9LDX	2.8	0.541	0.591	+0.050	+9.2%
EMD-55355	9SYV	2.9	0.559	0.601	+0.042	+7.6%
EMD-54930	9SIQ	3.0	0.495	0.547	+0.052	+10.4%
EMD-66260	9WUF	3.0	0.413	0.466	+0.053	+12.9%
EMD-53804	9R85	3.0	0.458	0.506	+0.048	+10.5%
EMD-53483	9R0I	3.1	0.481	0.529	+0.048	+9.9%
EMD-60928	9IVJ	3.1	0.522	0.568	+0.046	+8.8%
EMD-56518	9U2S	3.2	0.444	0.470	+0.026	+5.8%
EMD-47792	9E9D	3.2	0.484	0.529	+0.045	+9.3%
EMD-55831	9TEO	3.3	0.181	0.203	+0.022	+12.0%
EMD-63013	9LE2	3.3	0.361	0.413	+0.052	+14.3%
EMD-66788	9XED	3.4	0.378	0.408	+0.030	+7.9%
EMD-3061	5A63	3.4	0.457	0.509	+0.052	+11.4%
EMD-56096	9TNZ	3.5	0.439	0.487	+0.048	+10.9%
EMD-48340	9MKW	3.6	0.297	0.341	+0.044	+14.9%
EMD-49797	9NU5	3.7	0.450	0.500	+0.050	+11.0%
EMD-54674	9S98	3.9	0.434	0.478	+0.044	+10.1%
EMD-72359	9XZK*	3.9	0.376	0.085	−0.291	−77.4%
EMD-71406	9P9C	4.0	0.324	0.375	+0.051	+15.7%
EMD-53590	9R5K	4.2	0.294	0.349	+0.055	+18.6%

* Outlier — extremely low map contrast (σ ≈ 100× below normal).

Root Cause Analysis

Source of the systematic offset (+0.043)

The offset arises from a fundamental algorithmic difference:

Aspect	MapQ (reference)	qscore (pure Python)
Approach	Volume-based cross-correlation	Point-based radial sampling
Implementation	3D Gaussian grid via `_gaussian.sum_of_gaussians` → CCm via `FitMap.overlap_and_correlation`	Random points on spherical shells (0–2.0 Å) → Pearson correlation of radial profiles
Dependency	UCSF Chimera	NumPy, SciPy
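
The point-based approach in the right-hand column can be illustrated with the simplified sketch below. It is not the jamaliki/qscore code: the map is replaced by an analytic density function instead of grid interpolation, the names and defaults are hypothetical, and the step of discarding sample points that lie closer to a neighbouring atom (described by Pintilie et al., 2020) is omitted.

```python
import numpy as np

def points_on_shell(center, radius, n_points, rng):
    """Uniformly random points on a sphere of the given radius around center."""
    directions = rng.normal(size=(n_points, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return center + radius * directions

def radial_q_score(density, center, sigma=0.4, max_radius=2.0, step=0.1, n_points=8, seed=0):
    """Pearson correlation between sampled density and a reference Gaussian."""
    rng = np.random.default_rng(seed)
    radii = [0.0]
    values = [float(density(center[None, :])[0])]  # the atom position itself
    for radius in np.arange(step, max_radius + step / 2, step):
        points = points_on_shell(center, radius, n_points, rng)
        radii.extend([radius] * n_points)
        values.extend(density(points))
    radii = np.array(radii)
    reference = np.exp(-0.5 * (radii / sigma) ** 2)
    return float(np.corrcoef(np.array(values), reference)[0, 1])

def gaussian_blob(blob_center, width):
    """Analytic stand-in for a map: one isotropic Gaussian peak."""
    def density(points):
        r2 = np.sum((points - blob_center) ** 2, axis=1)
        return np.exp(-0.5 * r2 / width ** 2)
    return density

atom = np.array([10.0, 10.0, 10.0])
q_sharp = radial_q_score(gaussian_blob(atom, 0.4), atom)   # density matches the reference
q_broad = radial_q_score(gaussian_blob(atom, 1.0), atom)   # blurred density
q_offset = radial_q_score(gaussian_blob(atom + [1.0, 0.0, 0.0], 0.4), atom)  # misplaced atom
print(f"sharp: {q_sharp:.3f}, broad: {q_broad:.3f}, offset: {q_offset:.3f}")
```

Because each atom is scored from a small set of sampled points, no 3D Gaussian grid has to be built for the model, which is one plausible reason for the speed difference reported below.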

Factors ruled out

Factor	Effect on offset	Method
Reference Gaussian parameters	None	Both use identical formula: `maxD = min(mean + 10σ, max)`
Number of sampling points (8 → 64)	< 0.001	Tested on EMD-2984/5A1A
Random seed variation	< 0.001	Three independent runs

Outlier (EMD-72359/9XZK)

Property	Normal maps	EMD-72359
Map σ	0.01–0.05	0.0002
height/max ratio	0.2–1.0	0.12

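In EMD-72359 the map σ (0.0002) is roughly two orders of magnitude below the usual 0.01–0.05, and the height/max ratio (0.12) falls below the usual 0.2–1.0 range. This extremely low contrast leaves the reference Gaussian poorly conditioned, which is where the pure Python score (0.085) departs from the MapQ score (0.376).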

Speed comparison

Implementation	Time (5A1A, 32k atoms)	Dependencies
MapQ/Chimera	Minutes	UCSF Chimera, Tk, X11
jamaliki/qscore	~11 seconds	NumPy, SciPy, BioPython, mrcfile

Integration

The integration code for IHMValidation is available on a separate branch:

Branch: ShravyaRS/IHMValidation:qscore-pure-python

Changes

File	Description
`ihm_validation/qscore_utils.py`	New module: pure Python Q-score computation matching VA's output format
`ihm_validation/em.py`	Removed `qscore` from VA CLI call; computes Q-scores directly; backward compatible
`pyproject.toml`	Added `qscore` as optional dependency (`[em]`, `[all]`)

Required fix for jamaliki/qscore

# qscore/utils.py — interpolate_grid_at_points()
# Add bounds_error=False to handle atoms near map edges
return interpn((x, y, z), map.grid, p, bounds_error=False, fill_value=0.0)
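
The effect of the two extra arguments can be checked on a toy grid, as in the sketch below. It assumes SciPy is installed; the 4×4×4 map and the sample points are made up for illustration.

```python
import numpy as np
from scipy.interpolate import interpn

axis = np.arange(4, dtype=float)      # grid coordinates 0..3 along each axis
grid = np.ones((4, 4, 4))             # toy map with density 1.0 everywhere
points = np.array([[1.5, 1.5, 1.5],   # inside the map
                   [3.5, 1.0, 1.0]])  # just beyond the map edge

# Default behaviour (bounds_error=True): any out-of-bounds point raises ValueError.
try:
    interpn((axis, axis, axis), grid, points)
    raised = False
except ValueError:
    raised = True

# Patched behaviour: out-of-bounds points are assigned fill_value instead.
values = interpn((axis, axis, axis), grid, points, bounds_error=False, fill_value=0.0)
print(raised, values)
```

With this fix, sample points that fall outside the map contribute zero density rather than aborting the run, so atoms at the very edge of a map may receive somewhat lower scores.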

Repository Structure

qscore-validation/
├── LICENSE
├── README.md
├── scripts/
│   ├── run_qscore_comparison.py   # Download, compute, and compare Q-scores
│   ├── analyze_results.py         # Statistical analysis and plotting
│   ├── per_residue_comparison.py  # Per-residue validation (Fig. 2)
│   └── ihm_entry_test.py          # PDB-IHM integrative model test (Fig. 3)
├── results/
│   ├── qscore_comparison_full.csv # All 28 entries with Q-scores
│   ├── qscore_correlation.png
│   ├── per_residue_correlation.png
│   ├── per_residue_5a1a.csv
│   ├── per_residue_summary.json
│   ├── ihm_pdbdev_00000141.json
│   └── ihm_pdbdev_00000141.png     # Three-panel figure
└── test_data/                     # Downloaded on demand (not committed)

Reproducing

# Clone this repo
git clone https://github.com/ShravyaRS/qscore-validation.git
cd qscore-validation

# Install dependencies
pip install mrcfile biopython scipy tqdm matplotlib

# Install qscore and apply fix
pip install git+https://github.com/jamaliki/qscore.git
QPATH=$(python -c "import qscore; print(qscore.__path__[0])")
# GNU sed syntax; BSD/macOS sed needs -i '' instead of -i
sed -i 's/return interpn((x, y, z), map.grid, p)/return interpn((x, y, z), map.grid, p, bounds_error=False, fill_value=0.0)/' "$QPATH/utils.py"

# Run full comparison (~2 GB of map downloads)
python scripts/run_qscore_comparison.py

# Analyze results and generate plots
python scripts/analyze_results.py

References

Pintilie, G., Zhang, K., Su, Z. et al. (2020). Measurement of atom resolvability in cryo-EM maps with Q-scores. Nature Methods 17, 328–334. doi:10.1038/s41592-020-0731-1
Pintilie, G. et al. (2025). Q-score as a reliability measure for protein, nucleic acid, and small molecule atomic coordinate models derived from 3DEM density maps. Acta Cryst. D81. doi:10.1107/S2059798325005923
Lawson, C.L. et al. (2021). Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nature Methods 18, 156–164. doi:10.1038/s41592-020-01051-w
MapQ — Chimera plugin for Q-scores: github.com/gregdp/mapq
jamaliki/qscore — Pure Python reimplementation: github.com/jamaliki/qscore

Part of the IHMValidation project at Sali Lab
