MKrawitzky/Copperfield
    ╔══════════════════════════════════════════════╗
    ║                                              ║
    ║    ✦ ✦ ✦   C O P P E R F I E L D   ✦ ✦ ✦   ║
    ║                                              ║
    ║          ┌─────────────────────┐             ║
    ║          │  22,000 candidates  │             ║
    ║          └──────────┬──────────┘             ║
    ║                     │                        ║
    ║           CCS gate  │  ← rejects ~60%        ║
    ║              ╲      │      ╱                 ║
    ║               ╲═════▼═════╱                  ║
    ║          ┌─────────────────────┐             ║
    ║          │    ~9,000 remain    │             ║
    ║          └──────────┬──────────┘             ║
    ║                     │                        ║
    ║          Kalman RT  │  ← penalises drift     ║
    ║               ╲═════▼═════╱                  ║
    ║          ┌─────────────────────┐             ║
    ║          │    Entropy score    │             ║
    ║          │  Percolator rerank  │             ║
    ║          └──────────┬──────────┘             ║
    ║                     │                        ║
    ║          ┌──── ✦ ───▼─── ✦ ────┐            ║
    ║          │   ~12,000 PSMs      │            ║
    ║          │  est. FDR ~0.3–0.5% │            ║
    ║          └─────────────────────┘            ║
    ║                                              ║
    ║     "The closer you look, the less you see"  ║
    ╚══════════════════════════════════════════════╝

Named for David Copperfield, master of elimination
The illusion is not what appears. It is what disappears.



Who

Michael Krawitzky - built within ZIGGY, extending GauDIA with four layers of orthogonal evidence.


What

A high-precision DIA search engine that adds four elimination layers on top of GauDIA's native .d reader. Returns fewer identifications than GauDIA, but every one has passed a gauntlet of independent physical and statistical filters.

The four layers:

| Layer | What it does | What it removes |
|---|---|---|
| CCS gate | Rejects candidates where predicted 1/K₀ differs from measured by > 0.12 V·s/cm² | ~60% of false candidates, before any cosine scoring |
| Kalman RT | Fits a linear RT calibration on high-confidence hits; penalises PSMs by absolute RT deviation | Peptides that don't belong to this run's chromatography |
| Entropy score | Penalises chimeric spectra with high Shannon entropy of matched intensities | Co-isolation artifacts |
| Percolator | Trains a 6-feature logistic discriminant per run on the target-decoy distribution | Borderline PSMs below the true posterior FDR |
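
As a rough illustration of the RT layer, the sketch below fits an ordinary least-squares line to a handful of high-confidence anchors and scores other PSMs by their absolute deviation from that line. It is not Copperfield's actual code: the function names, the `weight` parameter, and the anchor values are hypothetical, and a plain least-squares fit stands in for whatever calibration the engine really uses.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rt_penalty(library_rt: float, observed_rt: float,
               slope: float, intercept: float, weight: float = 1.0) -> float:
    """Penalty proportional to the absolute deviation from the calibrated RT."""
    expected_rt = slope * library_rt + intercept
    return weight * abs(observed_rt - expected_rt)


# Hypothetical high-confidence anchors: library RT vs. observed RT (minutes).
anchor_library = [10.0, 20.0, 30.0, 40.0]
anchor_observed = [6.0, 11.0, 16.0, 21.0]
slope, intercept = fit_line(anchor_library, anchor_observed)

on_model = rt_penalty(25.0, 13.5, slope, intercept)  # elutes where the line predicts
drifted = rt_penalty(25.0, 18.0, slope, intercept)   # elutes 4.5 min late
```

A PSM that elutes where the calibration predicts receives no penalty, while one that drifts is penalised in proportion to how far it sits from the line.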

Result: ~12,000 PSMs at stated 1% FDR, estimated true FDR ~0.3–0.5%.
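
For readers unfamiliar with how a stated FDR is derived from a target-decoy search, here is a minimal sketch of the standard q-value calculation. It shows the general technique only, under the simple assumption that FDR is estimated as decoys divided by targets at each score threshold; the function name and the toy scores are hypothetical and are not taken from Copperfield.

```python
from typing import List, Tuple


def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Return (score, is_decoy, q_value) sorted by descending score.

    FDR at each rank is estimated as decoys / targets seen so far; the
    q-value is the smallest FDR at that rank or any lower-scoring rank.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Walk from the worst score upwards so q-values never increase with score.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()
    return [(score, is_decoy, qv) for (score, is_decoy), qv in zip(ranked, q)]


# Toy input: (discriminant score, is_decoy)
toy_psms = [(0.99, False), (0.95, False), (0.90, False),
            (0.80, True), (0.70, False), (0.60, True)]
scored = q_values(toy_psms)
accepted = [s for s, is_decoy, qv in scored if not is_decoy and qv <= 0.01]
```

In this toy input only the three targets that outscore the first decoy survive a 1% threshold.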


When

Developed 2024–2026 alongside GauDIA. Built to answer the question every quantitative proteomics experiment eventually asks: when I report 99% confidence, am I actually 99% confident?


Where

Runs locally on any machine with Python 3.9+, Bruker TimsData SDK, and a .d directory. Requires GauDIA as its underlying reader. Within ZIGGY it runs in the comparison hub alongside GauDIA, PHANTOM, and external engines.


Why

The name is precise. David Copperfield's most famous illusions, making a Learjet disappear (1981) and making the Statue of Liberty vanish (1983), worked not through misdirection but through elimination. Before the reveal, everything that was not the target was systematically removed from view. What remained was real.

That is Copperfield the engine. Before cosine scoring, before any expensive computation, it eliminates false candidates through cascading orthogonal filters. Ion mobility filters by molecular shape. Retention time filters by chromatographic behaviour. Entropy filters by spectral cleanliness. Percolator filters by a multi-feature discriminant trained on the run's own data.
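
To make "spectral cleanliness" concrete, the sketch below computes the Shannon entropy of a spectrum's matched fragment intensities. The function names and the normalisation to the range [0, 1] are assumptions made for illustration; the source does not say how Copperfield scales the entropy or turns it into a penalty.

```python
import math
from typing import Sequence


def shannon_entropy(intensities: Sequence[float]) -> float:
    """Shannon entropy (in nats) of intensities normalised to sum to 1."""
    total = sum(intensities)
    if total <= 0:
        return 0.0
    probs = [i / total for i in intensities if i > 0]
    return -sum(p * math.log(p) for p in probs)


def normalised_entropy(intensities: Sequence[float]) -> float:
    """Entropy divided by its maximum, log(n), so the result lies in [0, 1]."""
    n = sum(1 for i in intensities if i > 0)
    if n < 2:
        return 0.0
    return shannon_entropy(intensities) / math.log(n)


# Hypothetical matched-fragment intensities.
flat = normalised_entropy([1.0, 1.0, 1.0, 1.0])       # signal spread evenly
peaked = normalised_entropy([100.0, 1.0, 1.0, 1.0])   # one dominant fragment
```

Evenly spread intensities give the maximum value of 1, while a spectrum dominated by one fragment scores much lower; under the README's description, the high-entropy case is the one that gets penalised.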

By the time Copperfield reports a PSM, everything that should not be there has been made to disappear. What remains is very likely real.

In quantitative proteomics, a false positive is more damaging than a missed peptide. Copperfield trades recall for precision, deliberately. It is the right engine when the cost of a wrong identification is high.

"The closer you look, the less you see."


The CCS Gate in Detail

CCS_theoretical = 0.3 + z × 0.12 + m/z × (0.00015 + z × 0.00008)

A false peptide can have the right mass but the wrong shape. Ion mobility is orthogonal to m/z, so a co-eluting contaminant can score high on cosine yet still fail the CCS gate. This filter runs before any fragment matching and eliminates ~25 million false candidate evaluations per typical run.
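
A minimal sketch of the gate, using the formula above and the 0.12 V·s/cm² tolerance quoted earlier. It assumes the formula's output is the predicted 1/K₀ that gets compared with the measured value. The function names, candidate tuples, and peptide labels are hypothetical; the engine's real interface may differ.

```python
def predicted_inv_k0(mz: float, charge: int) -> float:
    """Predicted 1/K0 (V·s/cm²) from the README's linear formula."""
    return 0.3 + charge * 0.12 + mz * (0.00015 + charge * 0.00008)


def passes_ccs_gate(mz: float, charge: int, measured_inv_k0: float,
                    tolerance: float = 0.12) -> bool:
    """Keep a candidate only if prediction and measurement agree within tolerance."""
    return abs(predicted_inv_k0(mz, charge) - measured_inv_k0) <= tolerance


# Hypothetical candidates: (label, precursor m/z, charge, measured 1/K0)
candidates = [
    ("candidate_a", 600.0, 2, 0.75),  # close to the predicted 0.726
    ("candidate_b", 600.0, 2, 1.10),  # same m/z and charge, wrong mobility
]

kept = [label for label, mz, z, measured in candidates
        if passes_ccs_gate(mz, z, measured)]
```

Both candidates share the same m/z and charge, so only the mobility measurement separates them. Because the check is a single arithmetic comparison per candidate, it is cheap enough to run ahead of fragment matching.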


Part of ZIGGY

→ MKrawitzky/Ziggy · → GauDIA · → PHANTOM · → Goya · → Zyna · → BOWIE · → VEGA · → Silent Heroes


Academic License · © 2024–2026 Michael Krawitzky & Brett S. Phinney
