GitHub - MKrawitzky/Copperfield: 4D-validated DIA search engine: CCS pre-filter + run-learned RT model + multi-feature Percolator rescoring

    ╔══════════════════════════════════════════════╗
    ║                                              ║
    ║    ✦ ✦ ✦   C O P P E R F I E L D   ✦ ✦ ✦   ║
    ║                                              ║
    ║          ┌─────────────────────┐             ║
    ║          │  22,000 candidates  │             ║
    ║          └──────────┬──────────┘             ║
    ║                     │                        ║
    ║           CCS gate  │  ← rejects ~60%        ║
    ║              ╲      │      ╱                 ║
    ║               ╲═════▼═════╱                  ║
    ║          ┌─────────────────────┐             ║
    ║          │    ~9,000 remain    │             ║
    ║          └──────────┬──────────┘             ║
    ║                     │                        ║
    ║          Kalman RT   │  ← penalises drift    ║
    ║               ╲═════▼═════╱                  ║
    ║          ┌─────────────────────┐             ║
    ║          │    Entropy score    │             ║
    ║          │  Percolator rerank  │             ║
    ║          └──────────┬──────────┘             ║
    ║                     │                        ║
    ║          ┌──── ✦ ───▼─── ✦ ────┐            ║
    ║          │   ~12,000 PSMs      │            ║
    ║          │  est. FDR ~0.3–0.5% │            ║
    ║          └─────────────────────┘            ║
    ║                                              ║
    ║     "The closer you look, the less you see"  ║
    ╚══════════════════════════════════════════════╝

Named for David Copperfield, master of elimination.
The illusion is not what appears. It is what disappears.

Who

Michael Krawitzky - built within ZIGGY, extending GauDIA with four layers of orthogonal evidence.

What

A high-precision DIA search engine that adds four elimination layers on top of GauDIA's native .d reader. Returns fewer identifications than GauDIA, but every one has passed a gauntlet of independent physical and statistical filters.

The four layers:

| Layer | What it does | What it removes |
|---|---|---|
| CCS gate | Rejects candidates where predicted 1/K₀ differs from measured by > 0.12 Vs/cm² | ~60% of false candidates, before any cosine scoring |
| Kalman RT | Fits a linear RT calibration on high-confidence hits; penalises PSMs by absolute RT deviation | Peptides that don't belong to this run's chromatography |
| Entropy score | Penalises chimeric spectra with high Shannon entropy of matched intensities | Co-isolation artifacts |
| Percolator | Trains a 6-feature logistic discriminant per run on the target-decoy distribution | Borderline PSMs below the true posterior FDR |
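
The RT layer can be illustrated with a minimal sketch, assuming an ordinary least-squares line fitted to high-confidence anchor peptides. This is not taken from copperfield.py: the function names `fit_linear_rt` and `rt_deviation` and the example anchor values are hypothetical, and how the deviation is turned into a score penalty is left open because the README does not specify it.

```python
from typing import List, Tuple


def fit_linear_rt(library_rt: List[float], observed_rt: List[float]) -> Tuple[float, float]:
    """Least-squares fit of observed_rt ~ slope * library_rt + intercept."""
    n = len(library_rt)
    if n < 2 or n != len(observed_rt):
        raise ValueError("need at least two paired RT values")
    mean_x = sum(library_rt) / n
    mean_y = sum(observed_rt) / n
    sxx = sum((x - mean_x) ** 2 for x in library_rt)
    if sxx == 0:
        raise ValueError("library RTs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(library_rt, observed_rt))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rt_deviation(library_rt: float, observed_rt: float, slope: float, intercept: float) -> float:
    """Absolute deviation of an observed RT from the calibrated prediction."""
    return abs(observed_rt - (slope * library_rt + intercept))


# Hypothetical high-confidence anchors (library RT, observed RT), in minutes.
anchors_lib = [10.0, 20.0, 30.0, 40.0]
anchors_obs = [12.0, 22.0, 32.0, 42.0]
slope, intercept = fit_linear_rt(anchors_lib, anchors_obs)

on_line = rt_deviation(25.0, 27.0, slope, intercept)   # sits on the calibration line
off_line = rt_deviation(25.0, 35.0, slope, intercept)  # eight minutes late
print(on_line, off_line)
```

A PSM with a large deviation is unlikely to belong to this run's chromatography, which is the property the table attributes to this layer.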

Result: ~12,000 PSMs at stated 1% FDR, estimated true FDR ~0.3–0.5%.
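
A stated FDR like the one above is conventionally estimated by target-decoy counting. The sketch below shows that counting in its simplest form, assuming each PSM carries a score and a decoy flag; the function name `count_at_fdr` and the tuple layout are hypothetical, and Copperfield's own implementation may differ.

```python
from typing import List, Tuple


def count_at_fdr(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> int:
    """Return the number of target PSMs accepted at the given FDR threshold.

    psms: (score, is_decoy) pairs, where a higher score is better.
    The FDR at each cutoff is estimated as decoys / targets above the cutoff.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            accepted = targets  # deepest cutoff so far that still meets the threshold
    return accepted


# Hypothetical ranked list: 50 targets, 1 decoy, 10 targets, 2 decoys, 1 target.
example = (
    [(100.0 - i, False) for i in range(50)]
    + [(50.0, True)]
    + [(49.0 - i, False) for i in range(10)]
    + [(39.0, True), (38.0, True)]
    + [(37.0, False)]
)
print(count_at_fdr(example, max_fdr=0.02))
print(count_at_fdr(example, max_fdr=0.01))
```

Tightening the threshold from 2% to 1% in this toy list drops the accepted count, the same recall-for-precision trade the README describes.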

When

Developed 2024–2026 alongside GauDIA. Built to answer the question every quantitative proteomics experiment eventually asks: when I report 99% confidence, am I actually 99% confident?

Where

Runs locally on any machine with Python 3.9+, the Bruker TimsData SDK, and a .d directory. Requires GauDIA as its underlying reader. Within ZIGGY it runs in the comparison hub alongside GauDIA, PHANTOM, and external engines.

Why

The name is precise. David Copperfield's most famous illusions (making a Learjet disappear in 1981 and making the Statue of Liberty vanish in 1983) worked not through misdirection but through elimination. Before the reveal, everything that was not the target was systematically removed from view. What remained was real.

That is Copperfield the engine. Before cosine scoring, before any expensive computation, it eliminates false candidates through cascading orthogonal filters. Ion mobility filters by molecular shape. Retention time filters by chromatographic behaviour. Entropy filters by spectral cleanliness. Percolator filters by a multi-feature discriminant trained on the run's own data.
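
As an illustration of the entropy filter mentioned above, here is a minimal sketch of the Shannon entropy of matched fragment intensities. The function name `shannon_entropy`, the use of natural logarithms, and the example intensity lists are assumptions; the README does not say how the entropy is normalised or converted into a penalty.

```python
import math
from typing import Sequence


def shannon_entropy(intensities: Sequence[float]) -> float:
    """Shannon entropy (in nats) of intensities normalised to sum to 1."""
    positive = [i for i in intensities if i > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    probs = [i / total for i in positive]
    return sum(-p * math.log(p) for p in probs)


# Hypothetical matched-intensity patterns.
dominated = shannon_entropy([1000.0, 10.0, 5.0, 5.0])   # signal concentrated in one fragment
flat = shannon_entropy([250.0, 250.0, 250.0, 250.0])    # signal spread evenly
print(dominated, flat)
```

The evenly spread pattern reaches the maximum entropy for four peaks, `log(4)`, while the dominated pattern scores much lower.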

By the time Copperfield reports a PSM, everything that should not be there has been made to disappear. What remains is very likely real.

In quantitative proteomics, a false positive is more damaging than a missed peptide. Copperfield trades recall for precision, deliberately. It is the right engine when the cost of a wrong identification is high.

"The closer you look, the less you see."

The CCS Gate in Detail

CCS_theoretical = 0.3 + z × 0.12 + m/z × (0.00015 + z × 0.00008)

A false peptide can have the right mass but the wrong shape. Ion mobility is orthogonal to m/z: a co-eluting contaminant can score high on cosine but still fail the CCS gate. This filter runs before any fragment matching and eliminates ~25 million false candidate evaluations per typical run.
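
The gate can be sketched in a few lines of Python, using the formula above and the 0.12 Vs/cm² tolerance from the layer table. The function names `predicted_inv_k0` and `passes_ccs_gate` are hypothetical. The sketch also assumes the formula's output is compared directly against the measured 1/K₀.

```python
def predicted_inv_k0(mz: float, charge: int) -> float:
    """Predicted mobility value from the README formula (assumed to be in Vs/cm²)."""
    return 0.3 + charge * 0.12 + mz * (0.00015 + charge * 0.00008)


def passes_ccs_gate(mz: float, charge: int, measured_inv_k0: float,
                    tolerance: float = 0.12) -> bool:
    """Keep a candidate only if prediction and measurement agree within the tolerance."""
    return abs(predicted_inv_k0(mz, charge) - measured_inv_k0) <= tolerance


# Hypothetical candidate: m/z 600, charge 2+, predicted value about 0.726.
print(predicted_inv_k0(600.0, 2))
print(passes_ccs_gate(600.0, 2, measured_inv_k0=0.80))  # within tolerance
print(passes_ccs_gate(600.0, 2, measured_inv_k0=1.00))  # outside tolerance
```

Because the check needs only precursor m/z, charge, and the measured mobility, it can run before any fragment matching, as described above.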

Part of ZIGGY

→ MKrawitzky/Ziggy · → GauDIA · → PHANTOM · → Goya · → Zyna · → BOWIE · → VEGA · → Silent Heroes

