Benchmarking pure Python Q-score computation against MapQ/Chimera (EMDB reference)
This repository contains a systematic comparison of the pure Python Q-score reimplementation (jamaliki/qscore) against the reference MapQ/Chimera implementation (gregdp/mapq), as used by the EMDB for structural validation reports.
This work supports the effort to remove the UCSF Chimera dependency from the IHMValidation pipeline (issue #119).
The Q-score metric quantifies the resolvability of individual atoms in cryo-EM maps by measuring the similarity between the map values around each atom and a reference Gaussian profile. A Q-score of 1.0 indicates perfect resolvability, while values closer to 0.0 indicate poor resolvability. It was introduced by Pintilie et al. (2020) and is now part of the standard wwPDB/EMDB validation pipeline.
The current IHMValidation pipeline relies on MapQ, a UCSF Chimera plugin, which introduces several issues:
| Issue | Impact |
|---|---|
| Requires UCSF Chimera (Python 2.7) | Heavy, legacy dependency |
| Needs Tk and X11 display libraries | Fails in headless environments |
| Slow for large structures | Minutes per structure |
| Not pip-installable | Manual installation required |
- Selected 28 EMDB entries with deposited Q-scores (resolution: 1.8–4.2 Å)
- Downloaded PDB models and EMDB maps for each entry
- Computed Q-scores using jamaliki/qscore with σ = 0.4 (EMDB standard)
- Compared against EMDB reference Q-scores (computed by MapQ v2.9.7)
Figure 1. (A) Scatter plot of MapQ vs qscore values with linear fit. (B) Difference as a function of resolution. (C) Bland-Altman plot showing limits of agreement. The red × marks the single outlier (EMD-72359, extremely low map contrast).
| Metric | All entries (n=28) | Excluding outlier (n=27) |
|---|---|---|
| Pearson r | 0.892 | 0.997 |
| Spearman ρ | 0.987 | 0.995 |
| Mean offset | +0.031 ± 0.063 | +0.043 ± 0.009 |
| Max \|diff\| | 0.291 | 0.055 |
| Linear fit | — | qscore = 0.994 × MapQ + 0.046 |
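The agreement statistics in this table can be recomputed from paired Q-scores. The following is a minimal sketch rather than the code in `scripts/analyze_results.py`: it uses only the Python standard library, takes the first five rows of the full results table as input, and assumes the ± value is a sample standard deviation. The function names and the 1.96 × SD Bland-Altman limits are illustrative choices.

```python
import math
import statistics

# (MapQ, qscore) pairs copied from the first five rows of the full results table.
PAIRS = [
    (0.666, 0.710),  # EMD-52518
    (0.615, 0.644),  # EMD-2984
    (0.610, 0.655),  # EMD-53512
    (0.676, 0.707),  # EMD-64933
    (0.452, 0.503),  # EMD-64047
]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def agreement_metrics(pairs):
    """Summary statistics for qscore (y) against MapQ (x)."""
    mapq = [m for m, _ in pairs]
    qscore = [q for _, q in pairs]
    diffs = [q - m for m, q in pairs]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1); an assumption
    # Ordinary least-squares line: qscore = slope * MapQ + intercept
    mx, my = statistics.fmean(mapq), statistics.fmean(qscore)
    slope = (sum((a - mx) * (b - my) for a, b in zip(mapq, qscore))
             / sum((a - mx) ** 2 for a in mapq))
    return {
        "pearson_r": pearson(mapq, qscore),
        "spearman_rho": pearson(average_ranks(mapq), average_ranks(qscore)),
        "mean_offset": mean_diff,
        "sd_offset": sd_diff,
        "max_abs_diff": max(abs(d) for d in diffs),
        "slope": slope,
        "intercept": my - slope * mx,
        # Bland-Altman limits of agreement: mean difference +/- 1.96 SD
        "limits_of_agreement": (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff),
    }


metrics = agreement_metrics(PAIRS)
for name, value in metrics.items():
    print(name, value)
```

Because only five entries are used here, the printed values will differ from the 28-entry figures in the table.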
To confirm agreement at the individual residue level (not just global averages), we performed a detailed per-residue comparison on 5A1A / EMD-2984 (2.2 Å, 20 chains, 4085 residues):
Figure 2. (A) Per-residue scatter plot with linear fit. (B) Distribution of per-residue differences. (C) Offset along Chain A sequence with 20-residue moving average.
| Metric | Value |
|---|---|
| Residues matched | 4,085 |
| Pearson r | 0.987 |
| Spearman ρ | 0.973 |
| Mean offset | +0.028 ± 0.017 |
| RMSD | 0.033 |
| Max \|diff\| | 0.109 |
| Linear fit | y = 0.992x + 0.034 |
| R² | 0.975 |
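As a consistency check on the table above, and assuming the ± value is the standard deviation $s_d$ of the per-residue differences $d_i$ (qscore minus MapQ), the RMSD should follow from the mean offset $\bar{d}$ and that standard deviation:

```latex
\mathrm{RMSD}^2 = \frac{1}{N}\sum_{i=1}^{N} d_i^2 = \bar{d}^{\,2} + s_d^2
\quad\Longrightarrow\quad
\mathrm{RMSD} \approx \sqrt{0.028^2 + 0.017^2} = \sqrt{0.001073} \approx 0.033
```

This agrees with the reported RMSD of 0.033. The identity is exact for the population standard deviation; with 4,085 residues the difference from the sample standard deviation is negligible.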
Key finding: Per-residue agreement is strong across the entire Q-score range (0.1–0.8), confirming that the pure Python implementation preserves residue-level ranking and can reliably identify poorly resolved regions.
To confirm compatibility with IHMValidation's target data, we tested on PDBDEV_00000141 / EMD-14774 — an integrative model of PTX3 Pentraxin (2.5 Å) combining cryo-EM and AlphaFold-derived coordinates.
Figure 3. (A) Q-score distribution across 2,912 residues. (B) Per-chain box plots showing consistent Q-scores across all 8 symmetric chains. (C) Q-score along Chain A sequence with 10-residue moving average.
| Property | Value |
|---|---|
| Entry | PDBDEV_00000141 / EMD-14774 / 9A24 |
| Type | Integrative (cryo-EM + AlphaFold) |
| Resolution | 2.5 Å |
| Chains / Residues | 8 / 2,912 |
| Overall Q-score | 0.337 |
| Computation time | ~6 seconds |
Key finding: The pure Python Q-score runs successfully on PDB-IHM integrative structures, producing physically meaningful per-chain and per-residue scores without any Chimera dependency.
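The per-chain summaries and moving averages shown in the figures can be derived from per-residue scores along the following lines. This is a hedged sketch using only the standard library, not the code in `scripts/ihm_entry_test.py`: the `(chain, residue number, Q-score)` record layout, the function names, and the example values are all made up for illustration.

```python
from collections import defaultdict
from statistics import fmean, median


def moving_average(values, window):
    """Moving average over `window` consecutive residues.

    The window covers positions i - window // 2 up to (but excluding)
    i - window // 2 + window and is clipped at the sequence ends.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i - half + window)
        smoothed.append(fmean(values[lo:hi]))
    return smoothed


def per_chain_median(records):
    """Median Q-score per chain from (chain_id, residue_number, q) records."""
    by_chain = defaultdict(list)
    for chain_id, _residue_number, q in records:
        by_chain[chain_id].append(q)
    return {chain_id: median(qs) for chain_id, qs in by_chain.items()}


# Made-up records, for illustration only (not values from EMD-14774).
example_records = [
    ("A", 1, 0.30), ("A", 2, 0.40), ("A", 3, 0.35),
    ("B", 1, 0.50), ("B", 2, 0.30),
]
chain_medians = per_chain_median(example_records)
chain_a_profile = [q for chain, _, q in example_records if chain == "A"]
chain_a_smoothed = moving_average(chain_a_profile, window=2)
print(chain_medians)
print(chain_a_smoothed)
```

For the 10-residue moving average in Figure 3C, `window=10` would be the corresponding setting under this sketch's windowing convention.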
Full per-entry results (28 entries):
| EMDB ID | PDB ID | Res (Å) | MapQ | qscore | Diff | % Diff |
|---|---|---|---|---|---|---|
| EMD-52518 | 9HYU | 1.8 | 0.666 | 0.710 | +0.044 | +6.6% |
| EMD-2984 | 5A1A | 2.2 | 0.615 | 0.644 | +0.029 | +4.7% |
| EMD-53512 | 9R1Q | 2.2 | 0.610 | 0.655 | +0.045 | +7.4% |
| EMD-64933 | 9VBT | 2.3 | 0.676 | 0.707 | +0.031 | +4.6% |
| EMD-64047 | 9UCL | 2.4 | 0.452 | 0.503 | +0.051 | +11.3% |
| EMD-48779 | 9N09 | 2.6 | 0.532 | 0.574 | +0.042 | +7.9% |
| EMD-55737 | 9T9U | 2.6 | 0.566 | 0.603 | +0.037 | +6.6% |
| EMD-73900 | 9Z8M | 2.7 | 0.556 | 0.592 | +0.036 | +6.5% |
| EMD-63009 | 9LDX | 2.8 | 0.541 | 0.591 | +0.050 | +9.2% |
| EMD-55355 | 9SYV | 2.9 | 0.559 | 0.601 | +0.042 | +7.6% |
| EMD-54930 | 9SIQ | 3.0 | 0.495 | 0.547 | +0.052 | +10.4% |
| EMD-66260 | 9WUF | 3.0 | 0.413 | 0.466 | +0.053 | +12.9% |
| EMD-53804 | 9R85 | 3.0 | 0.458 | 0.506 | +0.048 | +10.5% |
| EMD-53483 | 9R0I | 3.1 | 0.481 | 0.529 | +0.048 | +9.9% |
| EMD-60928 | 9IVJ | 3.1 | 0.522 | 0.568 | +0.046 | +8.8% |
| EMD-56518 | 9U2S | 3.2 | 0.444 | 0.470 | +0.026 | +5.8% |
| EMD-47792 | 9E9D | 3.2 | 0.484 | 0.529 | +0.045 | +9.3% |
| EMD-55831 | 9TEO | 3.3 | 0.181 | 0.203 | +0.022 | +12.0% |
| EMD-63013 | 9LE2 | 3.3 | 0.361 | 0.413 | +0.052 | +14.3% |
| EMD-66788 | 9XED | 3.4 | 0.378 | 0.408 | +0.030 | +7.9% |
| EMD-3061 | 5A63 | 3.4 | 0.457 | 0.509 | +0.052 | +11.4% |
| EMD-56096 | 9TNZ | 3.5 | 0.439 | 0.487 | +0.048 | +10.9% |
| EMD-48340 | 9MKW | 3.6 | 0.297 | 0.341 | +0.044 | +14.9% |
| EMD-49797 | 9NU5 | 3.7 | 0.450 | 0.500 | +0.050 | +11.0% |
| EMD-54674 | 9S98 | 3.9 | 0.434 | 0.478 | +0.044 | +10.1% |
| EMD-72359 | 9XZK* | 3.9 | 0.376 | 0.085 | −0.291 | −77.4% |
| EMD-71406 | 9P9C | 4.0 | 0.324 | 0.375 | +0.051 | +15.7% |
| EMD-53590 | 9R5K | 4.2 | 0.294 | 0.349 | +0.055 | +18.6% |
* Outlier — extremely low map contrast (σ ≈ 100× below normal).
The offset arises from a fundamental algorithmic difference:
| | MapQ (reference) | qscore (pure Python) |
|---|---|---|
| Approach | Volume-based cross-correlation | Point-based radial sampling |
| Implementation | 3D Gaussian grid via _gaussian.sum_of_gaussians → CCm via FitMap.overlap_and_correlation | Random points on spherical shells (0–2.0 Å) → Pearson correlation of radial profiles |
| Dependency | UCSF Chimera | NumPy, SciPy |
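To make the point-based approach concrete, here is a simplified sketch of radial sampling followed by a Pearson correlation against a reference Gaussian. It is not the jamaliki/qscore or MapQ code. Several things are assumptions made for illustration: the map is replaced by a `density` callable, every sampled point (rather than a per-shell average) enters the correlation, the 0.1 Å shell spacing and 8 points per shell are arbitrary defaults, a constant profile is scored as 0.0, and the exclusion of points lying closer to neighbouring atoms is omitted entirely.

```python
import math
import random


def reference_gaussian(r, sigma=0.4, amplitude=1.0, offset=0.0):
    """Reference profile A * exp(-r^2 / (2 sigma^2)) + B at distance r (in Å)."""
    return amplitude * math.exp(-0.5 * (r / sigma) ** 2) + offset


def random_unit_vector(rng):
    """Uniform random direction: normalize a 3D standard-normal sample."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def q_score_sketch(density, center, sigma=0.4, max_radius=2.0,
                   step=0.1, points_per_shell=8, seed=0):
    """Correlate sampled map values with the reference Gaussian.

    `density` is a hypothetical callable mapping an (x, y, z) tuple to a
    map value; in practice this would be grid interpolation.
    """
    rng = random.Random(seed)
    n_shells = int(round(max_radius / step))
    observed, expected = [], []
    for i in range(n_shells + 1):
        r = i * step
        # A single sample suffices at r = 0, where every direction coincides.
        n_points = 1 if i == 0 else points_per_shell
        for _ in range(n_points):
            u = random_unit_vector(rng)
            point = tuple(c + r * d for c, d in zip(center, u))
            observed.append(density(point))
            expected.append(reference_gaussian(r, sigma))
    return pearson(observed, expected)


def gaussian_blob(width):
    """Synthetic density: an isotropic Gaussian centred on the origin."""
    return lambda p: math.exp(-0.5 * sum(c * c for c in p) / width ** 2)


origin = (0.0, 0.0, 0.0)
q_sharp = q_score_sketch(gaussian_blob(0.4), origin)  # matches the reference width
q_wide = q_score_sketch(gaussian_blob(1.0), origin)   # blurrier than the reference
q_flat = q_score_sketch(lambda p: 1.0, origin)        # featureless density
print(q_sharp, q_wide, q_flat)
```

Because the Pearson correlation is unchanged by a positive scale and shift of the reference, the `amplitude` and `offset` values do not affect the result in this sketch.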
| Factor | Effect on offset | Method |
|---|---|---|
| Reference Gaussian parameters | None | Both use identical formula: maxD = min(mean + 10σ, max) |
| Number of sampling points (8 → 64) | < 0.001 | Tested on EMD-2984/5A1A |
| Random seed variation | < 0.001 | Three independent runs |
| Property | Normal maps | EMD-72359 |
|---|---|---|
| Map σ | 0.01–0.05 | 0.0002 |
| height/max ratio | 0.2–1.0 | 0.12 |
The extremely low contrast causes the reference Gaussian to become poorly conditioned.
| Implementation | Time (5A1A, 32k atoms) | Dependencies |
|---|---|---|
| MapQ/Chimera | Minutes | UCSF Chimera, Tk, X11 |
| jamaliki/qscore | ~11 seconds | NumPy, SciPy, BioPython, mrcfile |
The integration code for IHMValidation is available on a separate branch:
Branch: ShravyaRS/IHMValidation:qscore-pure-python
| File | Description |
|---|---|
| ihm_validation/qscore_utils.py | New module: pure Python Q-score computation matching VA's output format |
| ihm_validation/em.py | Removed qscore from VA CLI call; computes Q-scores directly; backward compatible |
| pyproject.toml | Added qscore as optional dependency ([em], [all]) |
# qscore/utils.py — interpolate_grid_at_points()
# Add bounds_error=False to handle atoms near map edges
return interpn((x, y, z), map.grid, p, bounds_error=False, fill_value=0.0)

qscore_validation/
├── LICENSE
├── README.md
├── scripts/
│ ├── run_qscore_comparison.py # Download, compute, and compare Q-scores
│ ├── analyze_results.py # Statistical analysis and plotting
│ ├── per_residue_comparison.py # Per-residue validation (Fig. 2)
│ └── ihm_entry_test.py # PDB-IHM integrative model test (Fig. 3)
├── results/
│ ├── qscore_comparison_full.csv # All 28 entries with Q-scores
│ ├── qscore_correlation.png
│ ├── per_residue_correlation.png
│ ├── per_residue_5a1a.csv
│ ├── per_residue_summary.json
│ ├── ihm_pdbdev_00000141.json
│ └── ihm_pdbdev_00000141.png # Three-panel figure
└── test_data/ # Downloaded on demand (not committed)
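The effect of the `bounds_error=False` change to `interpolate_grid_at_points()` noted above can be seen on a toy grid. This sketch calls `scipy.interpolate.interpn` directly; the 4×4×4 array of ones and the sample coordinates are invented stand-ins for a real map and for atoms near its edge.

```python
import numpy as np
from scipy.interpolate import interpn

# Toy 4x4x4 grid of ones standing in for a density map (illustrative only).
axis = np.arange(4.0)
grid = np.ones((4, 4, 4))

inside = np.array([[1.5, 1.5, 1.5]])
outside = np.array([[1.5, 1.5, 3.5]])  # beyond the last grid plane along z

# Default behaviour: a point outside the grid raises ValueError.
try:
    interpn((axis, axis, axis), grid, outside)
    raised = False
except ValueError:
    raised = True

# Patched behaviour: out-of-bounds points are filled with 0.0 instead.
points = np.vstack([inside, outside])
patched = interpn((axis, axis, axis), grid, points,
                  bounds_error=False, fill_value=0.0)
print(raised, patched)
```

With the fill value of 0.0, sample points that fall outside the map contribute zero density rather than aborting the calculation.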
# Clone this repo
git clone https://github.com/ShravyaRS/qscore-validation.git
cd qscore-validation
# Install dependencies
pip install mrcfile biopython scipy tqdm matplotlib
# Install qscore and apply fix
pip install git+https://github.com/jamaliki/qscore.git
QPATH=$(python -c "import qscore; print(qscore.__path__[0])")
sed -i 's/return interpn((x, y, z), map.grid, p)/return interpn((x, y, z), map.grid, p, bounds_error=False, fill_value=0.0)/' "$QPATH/utils.py"
# Run full comparison (~2 GB of map downloads)
python scripts/run_qscore_comparison.py
# Analyze results and generate plots
python scripts/analyze_results.py

- Pintilie, G., Zhang, K., Su, Z. et al. (2020). Measurement of atom resolvability in cryo-EM maps with Q-scores. Nature Methods 17, 328–334. doi:10.1038/s41592-020-0731-1
- Pintilie, G. et al. (2025). Q-score as a reliability measure for protein, nucleic acid, and small molecule atomic coordinate models derived from 3DEM density maps. Acta Cryst. D81. doi:10.1107/S2059798325005923
- Lawson, C.L. et al. (2021). Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nature Methods 18, 156–164. doi:10.1038/s41592-020-01051-w
- MapQ (Chimera plugin for Q-scores): github.com/gregdp/mapq
- jamaliki/qscore (pure Python reimplementation): github.com/jamaliki/qscore
Part of the IHMValidation project at Sali Lab


