VENOM: Very Efficient Neural Optimal-filter for MKIDs

A selective state space model (Mamba architecture) for photon energy estimation from raw MKID IQ timestream data. Replaces the traditional coordinate transform + optimal filter pipeline with a learned model that operates directly on I(t), Q(t). Designed to fit inside an FPGA-class resource budget on the MKIDGen3 RFSoC readout platform.

Supports both MLX (Apple Silicon) and PyTorch (NVIDIA GPU / CPU) via a unified backend abstraction. The active backend is auto-detected at import time or overridden with VENOM_BACKEND=mlx|torch.
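The selection rule described above can be sketched roughly as follows. This is an illustration only: `pick_backend` and its `available` argument are hypothetical names, not the API of backend.py, and the MLX-first preference order is an assumption.

```python
import importlib.util
import os


def pick_backend(env=None, available=None):
    """Return "mlx" or "torch" (hypothetical helper, not the backend.py API)."""
    env = os.environ if env is None else env
    if available is None:
        # Probe which candidate packages are installed, without importing them.
        available = {name for name in ("mlx", "torch")
                     if importlib.util.find_spec(name) is not None}
    forced = env.get("VENOM_BACKEND", "").strip().lower()
    if forced:
        if forced not in ("mlx", "torch"):
            raise ValueError(f"VENOM_BACKEND must be 'mlx' or 'torch', got {forced!r}")
        return forced
    # Assumed preference order: MLX first, then PyTorch.
    for name in ("mlx", "torch"):
        if name in available:
            return name
    raise ImportError("no supported backend found; install mlx or torch")
```

An explicit override always wins in this sketch, so a CI job can pin `VENOM_BACKEND=torch` regardless of what is installed.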

Headline results

A single 3,314-parameter model, trained per-detector from the same recipe, evaluated on published calibration data:

Detector	N wavelengths	Range	VENOM mean KDE R	OF mean KDE R	Δ
InHf bilayer (Zobrist 2022)	10	254–1310 nm	26.6	24.8	+7%
PtSi (Zobrist 2019)	5	808–1310 nm	19.8	12.2	+62%

InHf OF baseline uses Zobrist's per-wavelength-template optimal filter (10 hand-tuned templates from mkidcalculator.filter_pulses(template_mask=True)); the resulting per-wavelength resolving powers ship in InHfData/metadata.json. PtSi OF baseline uses the published shared 920 nm template. VENOM uses a single model with no per-wavelength tuning.

See paper/venom.pdf for the full writeup.

Architecture

VENOM uses a two-tier design motivated by the eventual FPGA deployment target:

Tier 1 — SSM Backbone (SSMBackbone): Processes raw IQ samples through a single Mamba block (LayerNorm, causal depthwise Conv1D of width 4, selective SSM with input-dependent B, C, Δ, SiLU gate). Diagonal S4D-real state matrix stored as log_neg_A for stability. Runs sample-by-sample in recurrent mode at ADC rate.

Tier 2 — Energy Head (EnergyHead): Takes the full sequence of backbone features, applies learned attention pooling over time, and regresses photon energy with a 3-layer GELU MLP. Outputs (μ̂, log σ̂²) for Gaussian NLL training. Runs at photon rate (orders of magnitude slower than ADC rate).
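To make the Tier 2 description concrete, here is a minimal NumPy sketch of attention pooling over time and of the Gaussian NLL that a (μ̂, log σ̂²) output is trained against. The single linear scoring vector `w` and both function names are assumptions for illustration; the actual EnergyHead may score time steps differently and also includes the MLP, which is omitted here.

```python
import numpy as np


def attention_pool(features, w):
    """Softmax-weighted average over time. features: (T, D); w: (D,) scoring vector."""
    scores = features @ w                      # (T,) one score per time step
    scores = scores - scores.max()             # stabilise the softmax
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ features                    # (D,) pooled feature vector


def gaussian_nll(mu, log_var, target):
    """Per-sample Gaussian NLL, dropping the constant 0.5 * log(2 * pi)."""
    return 0.5 * (log_var + (target - mu) ** 2 / np.exp(log_var))


rng = np.random.default_rng(0)
feats = rng.normal(size=(126, 16))             # T=126 samples, d_model=16 channels
pooled = attention_pool(feats, rng.normal(size=16))
```

Predicting log σ̂² rather than σ̂² keeps the variance positive without a constraint, and a large predicted variance down-weights the squared error of an outlier pulse.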

Dual execution modes

Parallel (__call__): Segmented Hillis-Steele associative scan over the full sequence, used during training. Accumulated in float64 to limit rounding error at long sequence lengths.
Recurrent (step): O(1) per-sample with carried (h, conv_state), matches the eventual FPGA datapath.
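The equivalence of the two modes follows from the fact that a diagonal linear recurrence h_t = a_t · h_{t-1} + b_t is an associative scan. The sketch below is a plain (unsegmented) Hillis-Steele scan in NumPy compared against the sample-by-sample loop. It is not the repository's implementation: the function names are hypothetical, the discretisation is the simplified Mamba form (a = exp(Δ·A), b = Δ·B·x), and reading `log_neg_A` as A = -exp(log_neg_A) is an assumption based on the parameter name.

```python
import numpy as np


def discretize(x, delta, A, B):
    """x, delta: (T,); A: (N,) negative reals; B: (T, N) input-dependent.
    Returns per-step decay a and drive b, both (T, N)."""
    a = np.exp(delta[:, None] * A[None, :])
    b = delta[:, None] * B * x[:, None]
    return a, b


def recurrent(a, b):
    """O(1) work per sample with a carried state h."""
    h = np.zeros(a.shape[1])
    out = np.empty_like(b)
    for t in range(a.shape[0]):
        h = a[t] * h + b[t]
        out[t] = h
    return out


def hillis_steele(a, b):
    """Inclusive scan in ceil(log2(T)) vectorised passes, accumulated in float64."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    d = 1
    while d < a.shape[0]:
        new_a, new_b = a.copy(), b.copy()
        # Earlier element (a1, b1) followed by later (a2, b2) -> (a1*a2, a2*b1 + b2)
        new_b[d:] = a[d:] * b[:-d] + b[d:]
        new_a[d:] = a[d:] * a[:-d]
        a, b = new_a, new_b
        d *= 2
    return b


rng = np.random.default_rng(0)
T, N = 126, 8                                   # PtSi sequence length, d_state
A = -np.exp(rng.normal(size=N))                 # negative poles: every mode decays
delta = np.log1p(np.exp(rng.normal(size=T)))    # softplus keeps the step positive
a, b = discretize(rng.normal(size=T), delta, A, rng.normal(size=(T, N)))
max_diff = np.abs(recurrent(a, b) - hillis_steele(a, b)).max()
```

In exact arithmetic the two results are identical; `max_diff` only measures floating-point reordering error, which is the effect the float64 accumulation is meant to limit.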

Numerical agreement between the two paths is characterized across the full validation set by scripts/parallel_recurrent_distribution.py. Worst-case |e_par − e_rec| is 1.4 meV on PtSi (sequence length 126) and 1.7 meV on InHf (sequence length 426) — under 5% of per-pulse σ on both datasets. Per-wavelength R values shift by ≤ 0.12 R units between the two modes.

Supporting modules

backend.py — unified MLX / PyTorch abstraction. Exports mx, nn, optim. Backend selected by VENOM_BACKEND or auto-detected.
synthetic_pca.py — PCA-based synthetic pulse generator. Streaming Gram eigendecomposition → K=50 principal components, per-wavelength mean via CubicSpline and covariance via shape-preserving PCHIP interpolation, Cholesky sampling plus orthogonal CSD reconstruction residuals. Produces 120k synthetic pulses/epoch.
synthetic.py — legacy template-interpolation synthetic generator (kept for reference; superseded by synthetic_pca.py).
ssvkernel.py — vectorized locally-adaptive KDE from Shimazaki & Shinomoto (2010); used by compute_resolving_power for the KDE FWHM.
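As a rough illustration of the PCA-plus-Cholesky idea behind synthetic_pca.py, the sketch below fits principal components from a Gram matrix and draws new coefficient vectors with the empirical covariance. It is a simplified stand-in under stated assumptions: the function names are hypothetical, the toy pulses are invented, and the per-wavelength spline/PCHIP interpolation and CSD residual reconstruction are left out entirely.

```python
import numpy as np


def fit_pca(pulses, k):
    """pulses: (n, T). Returns mean (T,), components (k, T), coefficients (n, k)."""
    mean = pulses.mean(axis=0)
    centered = pulses - mean
    gram = centered.T @ centered               # (T, T); could be accumulated chunk by chunk
    evals, evecs = np.linalg.eigh(gram)        # eigenvalues in ascending order
    comps = evecs[:, ::-1][:, :k].T            # top-k eigenvectors as rows
    return mean, comps, centered @ comps.T


def sample_pulses(mean, comps, coeffs, n, rng):
    """Draw n pulses whose PCA coefficients match the empirical mean and covariance."""
    mu = coeffs.mean(axis=0)
    cov = np.cov(coeffs, rowvar=False) + 1e-12 * np.eye(coeffs.shape[1])
    L = np.linalg.cholesky(cov)                # cov = L @ L.T
    z = rng.normal(size=(n, coeffs.shape[1]))  # independent standard normals
    return mean + (mu + z @ L.T) @ comps


# Toy data: three smooth pulse shapes mixed with random amplitudes plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)
basis = np.stack([np.exp(-t / 0.2), np.exp(-t / 0.05), t * np.exp(-t / 0.1)])
real = rng.normal(size=(500, 3)) @ basis + 0.01 * rng.normal(size=(500, 64))
mean, comps, coeffs = fit_pca(real, k=3)
synthetic = sample_pulses(mean, comps, coeffs, n=200, rng=rng)
```

Because the Gram matrix is a sum over pulses, it can be built in a streaming pass without holding every trace in memory at once.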

Winner configuration (2026-04-15)

Parameter	Value
`d_model`	16
`d_state`	8
`n_layers`	1
`d_conv`	4
`expand`	1
`d_head_hidden`	32
Total params	3,314

Installation

Python 3.11+.

# Core dependencies
pip install numpy scipy tqdm safetensors

# Pick one compute backend:
pip install mlx          # Apple Silicon
pip install torch        # NVIDIA GPU / CPU

# Reports and figures
pip install matplotlib reportlab

The backend auto-selects MLX only when MLX can actually run model code; headless sessions fall back to PyTorch. Force a backend with VENOM_BACKEND=torch or VENOM_BACKEND=mlx.

Pre-trained winner weights

The winner checkpoints and training metadata used for all numbers in the paper live in opt/:

File	Contents
`opt/venom_inhf_20260415_143354.safetensors`	InHf winner weights (3,314 params)
`opt/venom_inhf_20260415_143354_meta.json`	Training history + validation indices
`opt/venom_ptsi_20260415_071143.safetensors`	PtSi winner weights
`opt/venom_ptsi_20260415_071143_meta.json`	Training history + validation indices
`opt/parallel_recurrent_distribution.json`	Per-pulse `|e_par − e_rec|` (parallel-vs-recurrent agreement)
`opt/RESULTS_2026-04-15.md`	Session log documenting how the winners were reached
`opt/RESULTS_{inhf,ptsi}_winner_20260415.pdf`	Final training-report PDFs for each dataset

Loading a winner checkpoint:

from mkid_ssm import MKIDEnergySSM
import mlx.core as mx   # or: from backend import mx

model = MKIDEnergySSM(d_model=16, d_state=8, n_layers=1,
                      d_conv=4, expand=1, d_head_hidden=32)
mx.eval(model.parameters())
model.load_weights("opt/venom_inhf_20260415_143354.safetensors")

New .safetensors checkpoints are written in the true safetensors format. Legacy VENOM checkpoints that used a .safetensors suffix for PyTorch zip archives still load through model.load_weights().

Reproducing the winner runs

# InHf (10 wavelengths, 254-1310 nm)
python mkid_ssm.py --downsample 4 --n-components 50 \
    --loss gaussian_nll --epochs 25 --n-synthetic 120000 \
    --n-layers 1 --d-model 16 --d-state 8 --expand 1 \
    --peak-sigma 2.0 --report-every 25

# PtSi (5 wavelengths, 808-1310 nm)
python mkid_ssm.py --ptsi --downsample 4 --n-components 50 \
    --loss gaussian_nll --epochs 25 --n-synthetic 120000 \
    --n-layers 1 --d-model 16 --d-state 8 --expand 1 \
    --peak-sigma 2.0 --report-every 25

The two commands differ only in --ptsi. Both use the same model, training recipe, and PCA synthetic generator. Pre-trained weights for both datasets live in opt/venom_{inhf,ptsi}_20260415_*.safetensors.

Remote training via Ray

scripts/run_remote_train.py is a Ray-Client launcher that runs the standard PyTorch training path on a remote CUDA host without modifying any training code. The driver only needs ray==2.55.1 matched to the cluster's Python minor version (3.12 for our reference setup running an NGC PyTorch container with PyTorch 2.11, CUDA 13, RTX 5090); all heavy imports (torch, mkid_ssm) happen on the worker.

# One-time: driver env matched to the cluster's Python minor version
conda create -n pyray python=3.12 -y
/opt/anaconda3/envs/pyray/bin/pip install "ray[client]==2.55.1"

# Site-specific defaults: copy .env.example to .env and edit, or
# export the variables in your shell, or pass --ray-address /
# --data-root explicitly on the CLI.
cp .env.example .env   # then edit .env with your cluster URL and data path

# Submit a run; flags after `--` are forwarded verbatim to mkid_ssm.main()
/opt/anaconda3/envs/pyray/bin/python scripts/run_remote_train.py \
    --run-name ptsi_winner \
    -- --ptsi --downsample 4 --n-components 50 --loss gaussian_nll \
       --epochs 25 --n-synthetic 120000 --n-layers 1 --d-model 16 \
       --d-state 8 --expand 1 --peak-sigma 2.0 --report-every 25

The launcher uploads the repo source as runtime_env.working_dir (data, weights, and zips are excluded), submits a single @ray.remote(num_gpus=1) task that symlinks <data-root>/{InHf,PtSi}Data into the working dir, sets VENOM_BACKEND=torch, and pulls weights/, training_reports_*/, and VENOM_*.pdf back into runs/<run-name>/ after the run completes.

Common CLI flags

Flag	Meaning
`--ptsi`	Use PtSi dataset instead of InHf
`--downsample N`	Anti-aliased FIR decimation factor (default 4)
`--n-tau T`	Pulse window = T × τ_qp past trigger (default 5)
`--peak-sigma S`	MAD peak-height clip, σ (default 2.0)
`--loss {mse,gaussian_nll}`	Default `gaussian_nll`
`--n-synthetic N`	Synthetic pulses per epoch (0 = disable; winner runs use 120000)
`--n-components K`	PCA components for synthetic generator (default 50)
`--epochs N`	Training epochs
`--report-every N`	Emit multi-page PDF training report every N epochs
`--full`	Use full trace, no pulse windowing
`--jitter`	Per-wavelength label jitter
`--plot-features`	Plot all backbone feature channels after training
`--synthetic`	Use fully synthetic (no real) calibration for quick tests
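For readers unfamiliar with MAD clipping, the sketch below shows one common way a `--peak-sigma`-style cut can work: convert the median absolute deviation to a Gaussian-equivalent σ and keep pulses within S·σ of the median. Whether mkid_ssm.py applies the clip in exactly this way is an assumption, and `mad_clip_mask` is a hypothetical name.

```python
import numpy as np


def mad_clip_mask(peak_heights, n_sigma=2.0):
    """Boolean mask keeping pulses within n_sigma robust sigmas of the median."""
    peak_heights = np.asarray(peak_heights, dtype=float)
    med = np.median(peak_heights)
    mad = np.median(np.abs(peak_heights - med))
    sigma = 1.4826 * mad                       # MAD -> Gaussian-equivalent sigma
    return np.abs(peak_heights - med) <= n_sigma * sigma


heights = np.array([1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 5.00])
mask = mad_clip_mask(heights, n_sigma=2.0)     # only the 5.00 outlier is rejected
```

The median and MAD are barely moved by the outlier itself, which is why this cut is preferred over a mean and standard deviation on outlier-heavy calibration data.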

Data loading

load_mkid_data() loads per-wavelength .npz files from a data directory. Each file must contain i_trace, q_trace, mask, peak_heights. Energy labels are computed from $E = hc/\lambda$.
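As a worked example of the label formula, with hc ≈ 1239.84 eV·nm the calibration wavelengths map to energies as sketched below. The helper name is hypothetical, and the use of eV as the label unit is an assumption.

```python
H_C_EV_NM = 1239.841984  # h*c in eV*nm


def photon_energy_ev(wavelength_nm):
    """Photon energy E = hc / lambda, in eV for a wavelength in nm."""
    return H_C_EV_NM / wavelength_nm


e_uv = photon_energy_ev(254.0)     # about 4.88 eV (short end of the InHf range)
e_ir = photon_energy_ev(1310.0)    # about 0.946 eV (long end of both ranges)
```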

When a metadata.json with an opt_filt_R dictionary is present, those per-wavelength OF resolving powers are used in preference to recomputing from peak_heights. This is how the published per-wavelength-template OF values from Zobrist 2022 are restored for InHf (shipped with InHfData/metadata.json). PtSi has no metadata.json and uses a post-clip KDE of the stored shared-template peak heights, matching Zobrist 2019.

from mkid_ssm import load_mkid_data

iq, energies, wl_ids, wl_labels, opt_R, peak_heights = load_mkid_data(
    data_dir='InHfData',
    downsample=4,
    n_tau=5.0,
    peak_height_sigma=2.0,
)

Supported datasets

Dataset	Dir	Wavelengths	Raw rate	Source
InHf bilayer	`InHfData/`	254–1310 nm (10)	0.8 MHz	Zobrist 2022
PtSi	`PtSiData/`	808–1310 nm (5)	2.0 MHz	Zobrist 2019

Obtaining the data

The raw IQ calibration data (InHfData.zip ≈ 1.1 GB, PtSiData.zip ≈ 580 MB) is too large to ship through GitHub. It is available on request — contact the Mazin Lab and we will provide a download link. Extract each archive in the repo root so InHfData/ and PtSiData/ sit next to mkid_ssm.py.

Energy resolution metrics

compute_resolving_power(values, energy) — KDE-based $R = E / \mathrm{FWHM}$ using the locally adaptive ssvkernel (Shimazaki & Shinomoto 2010). Headline metric; matches mkidcalculator's convention.
summarize_resolving_power(pred, true, wl_ids) — Gaussian-approximation $R = E / (2.355 \cdot \mathrm{RMSE})$. Stricter (penalizes per-wavelength bias). Used as a training monitor. Edge wavelengths excluded from the aggregate.
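The Gaussian-approximation formula can be written out directly. The sketch below is for a single wavelength and uses a hypothetical function name; it is not the body of summarize_resolving_power, which also aggregates over wavelengths. It shows why this metric is stricter: a constant bias inflates the RMSE even when the scatter, and hence a KDE FWHM, is unchanged.

```python
import numpy as np


def gaussian_resolving_power(pred, true_energy):
    """R = E / (2.355 * RMSE) for predictions at one true photon energy."""
    pred = np.asarray(pred, dtype=float)
    rmse = np.sqrt(np.mean((pred - true_energy) ** 2))
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * rmse   # factor is about 2.355
    return true_energy / fwhm


rng = np.random.default_rng(0)
scatter = 0.05 * rng.normal(size=10_000)             # 50 meV scatter at 1 eV
r_unbiased = gaussian_resolving_power(1.0 + scatter, 1.0)
r_biased = gaussian_resolving_power(1.0 + scatter + 0.05, 1.0)   # same scatter, 50 meV bias
```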

Training details

Optimizer: AdamW, weight decay 1e-4
LR schedule: cosine annealing with 5-epoch linear warmup
Loss: Gaussian NLL (default) or MSE. With Gaussian NLL the model outputs (μ, log σ²); the predicted σ absorbs heavy tails, so training stays stable on outlier-heavy calibration data.
Validation: 15% stratified-by-wavelength hold-out (fixed seed)
Gradient accumulation: configurable micro-batch size to cap peak memory
Gradient checkpointing: backbone layers
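The learning-rate schedule listed above can be sketched as a pure function of the epoch index. The peak learning rate is not stated in this README, so `peak_lr=1e-3` is a placeholder, and stepping once per epoch rather than per optimizer step is also an assumption.

```python
import math


def lr_at_epoch(epoch, total_epochs=25, warmup_epochs=5, peak_lr=1e-3, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine annealing down to min_lr."""
    if epoch < warmup_epochs:
        return peak_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_epoch(e) for e in range(25)]   # one value per training epoch
```

With these placeholder values the rate climbs over the first five epochs, reaches its peak at the end of warmup, and decays monotonically afterwards.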

Scripts

Auxiliary scripts live in scripts/:

Script	Purpose
`train_single.py`	Train one config with periodic PDF reports
`run_remote_train.py`	Ray-Client launcher: run the PyTorch training path on a remote CUDA host
`generate_paper_figures.py`	Regenerate all paper figures from winner weights
`parallel_recurrent_distribution.py`	Characterize parallel-vs-recurrent mode agreement on the full validation set
`convert_inhf_data.py`	Convert `mkidcalculator` pickles to the common npz format
`make_report.py`	PDF data summary (raw-trace overview)
`explore_data.py`	Quick data-exploration plots
`sweep_hyperparams_v2.py`	Grid hyperparameter sweep with per-config reports

Tests

pytest tests/
# In headless/CI sessions:
VENOM_BACKEND=torch pytest tests/

Covers: model shapes, recurrent/parallel equivalence, KDE accuracy against Gaussian theory, edge cases, stratified splitting, micro-batch weights, IQ normalization.

References

Gu & Dao, "Mamba: Linear-Time Sequence Modeling with Selective State Spaces," COLM 2024
Shimazaki & Shinomoto, "Kernel Bandwidth Optimization in Spike Rate Estimation," J. Comput. Neurosci. 29, 171 (2010)
Zobrist et al., "Wide-band parametric amplifier readout and resolution of optical microwave kinetic inductance detectors," Appl. Phys. Lett. 115, 042601 (2019)
Zobrist et al., "Membraneless Phonon Trapping and Resolution Enhancement in Optical Microwave Kinetic Inductance Detectors," Phys. Rev. Lett. 129, 017701 (2022)
Smith et al., "MKIDGen3: Energy-resolving single-photon-counting MKID readout on an RFSoC," Rev. Sci. Instrum. 95, 114705 (2024)
Fritsch & Carlson, "Monotone Piecewise Cubic Interpolation," SIAM J. Numer. Anal. 17, 238 (1980)

Citation

If you use VENOM in published work, please cite this repository and the associated paper (forthcoming). A CITATION.cff file is included; GitHub will render a "Cite this repository" shortcut in the sidebar.

License

BSD 3-Clause. See LICENSE for the full text. Briefly: you may use and redistribute this software, with or without modification, provided the copyright notice is retained and the Mazin Lab / UCSB name is not used to endorse derivative products without written permission.
