[MOVED] This repository has moved to https://github.com/ToppyMicroServices/YOLOZU Please use the company repository for all new issues, PRs, and development.
Japanese: Readme_jp.md
Contract-first evaluation harness for detection / segmentation / pose.
YOLOZU supports different models and datasets through unified contracts and adapters.
Run inference in any backend, export a common `predictions.json`, and evaluate apples-to-apples with the same validators and metrics.
```
bash scripts/smoke.sh
```
This is the one-line copy-paste path. A detailed command breakdown is in the Quickstart details section below.
- A: Evaluate from precomputed predictions (no inference deps) — `predictions.json` → validate → eval. Start: docs/README.md
- B: Train → Export → Eval (RT-DETR scaffold) — reproducible run artifacts → ONNX → parity/eval. Start: docs/README.md
- C: Contracts (predictions / adapter / TTT protocol) — stable schema + adapter boundary + safe adaptation protocol. Start: docs/README.md
- D: Bench/Parity (TensorRT pipeline / latency benchmark) — backend parity checks + fixed-protocol latency benchmarking. Start: docs/README.md
Key points:
- Bring-your-own inference → stable `predictions.json`.
- Validators catch schema drift early.
- Metrics stay comparable across backends/environments.
- Tooling stays CPU-friendly by default (GPU optional).
- RT-DETR pose scaffold is available for train→export→eval.
- Safe TTT presets exist (Tent/MIM/CoTTA/EATA/SAR).
With this repo checkout, run:
```
bash scripts/smoke.sh
```
This runs doctor → validate dataset → validate predictions → `eval-coco --dry-run` using bundled smoke assets in `data/smoke`.
Manual equivalent (same fixed inputs):
```
yolozu doctor --output -
yolozu validate dataset data/smoke
yolozu validate predictions data/smoke/predictions/predictions_dummy.json --strict
yolozu eval-coco \
  --dataset data/smoke \
  --split val \
  --predictions data/smoke/predictions/predictions_dummy.json \
  --dry-run \
  --output reports/smoke_coco_eval_dry_run.json
```
Detailed option patterns are in docs/README.md.
Optional extras:
```
python3 -m pip install 'yolozu[demo]'   # torch demos (CPU OK)
python3 -m pip install 'yolozu[onnxrt]' # ONNXRuntime CPU exporter
python3 -m pip install 'yolozu[coco]'   # pycocotools COCOeval
python3 -m pip install 'yolozu[full]'
```
Docs index (start here): docs/README.md
One-page proof (shortest path + report shape): docs/proof_onepager.md
Prepare keypoints data into YOLOZU-ready layout:
```
python3 tools/yolozu.py prepare-keypoints-dataset \
  --source data/keypoints_src \
  --format auto \
  --out data/keypoints_dataset
```
Supported direct keypoints inputs: `auto`, `yolo_pose`, `coco`, `cvat_xml`.
Not direct (convert first): `detectron2_dataset_dict`, `labelme_keypoints`.
Format matrix/help:
```
python3 tools/yolozu.py prepare-keypoints-dataset \
  --list-formats \
  --source . \
  --out .
```
Minimal CVAT XML smoke test:
```
python3 -m pytest -q tests/test_prepare_keypoints_dataset_cvat_xml.py
```
In one glance:
- BYO inference + contract-first eval: export the same `predictions.json` and compare apples-to-apples.
- Safe TTT: guard rails + reset policies for online adaptation.
- Apache-2.0-only ops: license policy + checks to keep the toolchain clean.
- Parity/bench: diff stats + fixed-protocol benchmarks across backends.
In more detail:
- Bring-your-own inference + contract-first evaluation: run inference in PyTorch / ONNXRuntime / TensorRT / C++ / Rust, export the same `predictions.json`, and compare apples-to-apples.
- Safe TTT (test-time training): presets + guard rails + reset policies (see docs/ttt_protocol.md).
- Apache-2.0-only ops: license policy + checks to keep the toolchain clean (see docs/license_policy.md).
- Unified CLI: `yolozu` (pip) + `python3 tools/yolozu.py` (repo) wrap backends with consistent args and caching (`--cache`), and always write run metadata (git SHA / env / GPU / config hash).
- Parity + benchmarks: backend diff stats (torch vs onnxrt vs trt) and fixed-protocol latency/FPS reports.
- AI-friendly repo surface: stable schemas + `tools/manifest.json` for tool discovery / automation.
- Dataset I/O: YOLO-format images/labels + optional per-image JSON metadata.
- Stable evaluation contract: versioned predictions-JSON schema + adapter contract.
- Unified CLI:
  - pip: `yolozu` (install-safe commands + CPU demos)
  - repo: `python3 tools/yolozu.py` (power-user research/eval workflows)
- Inference/export: `python3 tools/yolozu.py export --backend {torch,onnxrt,trt}` (wrapper) or the low-level scripts (`tools/export_predictions*.py`).
- Test-time adaptation options:
  - TTA: lightweight prediction-space post-transform (`--tta`).
  - TTT: pre-prediction test-time training (Tent or MIM) via `--ttt` on the torch backend (see docs/ttt_protocol.md).
- Hessian refinement (post-processing): per-detection iterative refinement on exported predictions JSON; disabled by default, opt-in only.
- TensorRT note: TRT conversion targets the inference graph only; Hessian refinement runs outside the engine as a separate post-processing step.
- Evaluation: COCO mAP conversion/eval and scenario suite reporting.
- Keypoints: YOLO pose-style keypoints in labels/predictions + PCK evaluation (a PCK sketch follows this list); optional COCO OKS mAP (`tools/eval_keypoints.py --oks`) and parity/benchmark helpers. COCO/Detectron2 keypoint schema (`categories[].keypoints`/`skeleton`) is auto-ingested into wrapper metadata so training can auto-set `num_keypoints` and left/right flip pairs.
- Semantic seg: dataset prep helpers + `tools/eval_segmentation.py` (mIoU / per-class IoU / `ignore_index` + optional HTML overlays).
- Instance seg: `tools/eval_instance_segmentation.py` (mask mAP from per-instance binary PNG masks + optional HTML overlays).
- Training pipeline: RT-DETR pose trainer with run contract, metrics output, ONNX export, and optional SDFT-style self-distillation.
- Depth-aware training path (optional): `--depth-mode {none,sidecar,fuse_mid}` with sidecar depth validity gating and safe default `none`.
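As referenced in the keypoints item above: PCK counts a predicted keypoint as correct when it lands within `alpha` times a per-instance normalization length (e.g. the bbox diagonal) of the ground truth. A minimal NumPy sketch of that definition, illustrative only and not the implementation behind `tools/eval_keypoints.py`:

```python
# Minimal PCK (Percentage of Correct Keypoints) sketch -- illustrative only,
# not the implementation behind tools/eval_keypoints.py.
import numpy as np

def pck(pred, gt, visible, norm_len, alpha=0.1):
    """pred, gt: (N, K, 2) keypoint xy; visible: (N, K) bool;
    norm_len: (N,) per-instance normalization (e.g. bbox diagonal)."""
    dist = np.linalg.norm(pred - gt, axis=-1)   # (N, K) pixel distances
    thresh = alpha * norm_len[:, None]          # (N, 1) per-instance threshold
    correct = (dist <= thresh) & visible        # only visible joints count
    return correct.sum() / max(visible.sum(), 1)

pred = np.array([[[10.0, 10.0], [52.0, 48.0]]])
gt = np.array([[[10.0, 12.0], [50.0, 50.0]]])
vis = np.array([[True, True]])
print(pck(pred, gt, vis, norm_len=np.array([100.0])))  # both within 10 px -> 1.0
```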
YOLOZU evaluates instance segmentation using per-instance binary PNG masks (no RLE/polygons required).
Predictions JSON (minimal):
```json
[
  {
    "image": "000001.png",
    "instances": [
      { "class_id": 0, "score": 0.9, "mask": "masks/000001_inst0.png" }
    ]
  }
]
```
Validate an artifact:
```
python3 tools/validate_instance_segmentation_predictions.py reports/instance_seg_predictions.json
```
Eval outputs:
- mask mAP (`map50`, `map50_95`)
- per-class AP table
- per-image diagnostics (TP/FP/FN, mean IoU) and overlay selection (`--overlay-sort {worst,best,first}`; default: `worst`)
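The per-image mean IoU diagnostics above reduce to plain binary-mask IoU between per-instance PNG masks. A minimal sketch of that quantity (the file paths are hypothetical; this is not the repo's evaluator):

```python
# Binary-mask IoU between two per-instance PNG masks -- a minimal sketch of
# the quantity behind the per-image "mean IoU" diagnostics above.
import numpy as np
from PIL import Image

def load_binary_mask(path):
    # Any nonzero pixel counts as foreground.
    return np.asarray(Image.open(path).convert("L")) > 0

def mask_iou(a, b):
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

pred = load_binary_mask("masks/000001_inst0.png")      # hypothetical paths
gt = load_binary_mask("gt_masks/000001_inst0.png")
print(f"IoU = {mask_iou(pred, gt):.3f}")
```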
Run the synthetic demo and render overlays/HTML:
```
python3 tools/eval_instance_segmentation.py \
  --dataset examples/instance_seg_demo/dataset \
  --split val2017 \
  --predictions examples/instance_seg_demo/predictions/instance_seg_predictions.json \
  --pred-root examples/instance_seg_demo/predictions \
  --classes examples/instance_seg_demo/classes.txt \
  --html reports/instance_seg_demo_eval.html \
  --overlays-dir reports/instance_seg_demo_overlays \
  --max-overlays 10
```
Same via the unified CLI:
```
python3 tools/yolozu.py eval-instance-seg \
  --dataset examples/instance_seg_demo/dataset \
  --split val2017 \
  --predictions examples/instance_seg_demo/predictions/instance_seg_predictions.json \
  --pred-root examples/instance_seg_demo/predictions \
  --classes examples/instance_seg_demo/classes.txt \
  --html reports/instance_seg_demo_eval.html \
  --overlays-dir reports/instance_seg_demo_overlays \
  --max-overlays 10
```
Optional: prepare a COCO instance-seg dataset with per-instance PNG masks (requires pycocotools):
```
python3 tools/prepare_coco_instance_seg.py \
  --coco-root data/coco128 \
  --split train2017 \
  --out data/smoke_instance_seg
```
Optional: convert COCO instance-seg predictions (RLE/polygons) into YOLOZU PNG masks (requires pycocotools):
```
python3 tools/convert_coco_instance_seg_predictions.py \
  --predictions reports/smoke_coco_instance_seg_preds.json \
  --instances-json data/coco/annotations/instances_val2017.json \
  --output reports/instance_seg_predictions.json \
  --masks-dir reports/instance_seg_masks
```
Start here: docs/README.md
- Repo feature summary: docs/yolozu_spec.md
- Model/spec note: docs/specs/rt_detr_6dof_geom_mim_spec_en_v0_4.md
- Training / inference / export quick steps: docs/training_inference_export.md
- Hessian solver for regression refinement: docs/hessian_solver.md
- Predictions schema (stable): docs/predictions_schema.md
- Adapter contract (stable): docs/adapter_contract.md
- Migration helpers: docs/migrate.md
- License policy: docs/license_policy.md
- Tools index (AI-friendly): docs/tools_index.md / tools/manifest.json
- AI-first usage guide: docs/ai_first.md
- PyInstaller/PyArmor packaging notes: deploy/pyinstaller/README.md
- P0: Unified CLI (`torch`/`onnxruntime`/`tensorrt`) with consistent args; same output schema; always write meta (git SHA / env / GPU / seed / config hash); keep `tools/manifest.json` updated.
- P1: `doctor` (deps/GPU/driver/onnxrt/TRT diagnostics), `predict-images` (folder input → predictions JSON + overlays), HTML report.
- P2: cache/re-run (fingerprinted runs) + sweeps (wrapper exists; expand sweeps for TTT/threshold/gate weights); production inference cores (C++/Rust) as needed.
- Long-form notes: docs/roadmap.md

Status:
- P0: implemented in the unified wrapper CLI (`python3 tools/yolozu.py export --backend {dummy,torch,onnxrt,trt}`) with wrapped predictions JSON and `meta.run` (git/env/gpu/seed/config_hash).
- P1: implemented (`doctor`, `predict-images`, HTML overlays/report path).
- P2: baseline implemented (`--cache`, sweep wrapper), with ongoing expansion toward broader production cores/tuning presets.
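For context on the `meta.run` fields above, here is a generic sketch of how run metadata such as a git SHA and a config hash can be assembled; the field names are illustrative, not the repo's exact schema:

```python
# Assembling run metadata (git SHA / env / config hash) -- a generic sketch;
# field names are illustrative, not the repo's meta.run schema.
import hashlib, json, platform, subprocess

def git_sha():
    try:
        return subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None  # not a git checkout

def config_hash(config: dict) -> str:
    # Canonical JSON (sorted keys) so the hash is stable across runs.
    blob = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()[:16]

meta = {
    "git_sha": git_sha(),
    "python": platform.python_version(),
    "platform": platform.platform(),
    "config_hash": config_hash({"backend": "onnxrt", "max_images": 50}),
}
print(json.dumps(meta, indent=2))
```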
Recent compatibility additions:
- import/doctor auto-detection: `yolozu import ... --from auto`, `yolozu doctor import --config-from auto|--dataset-from auto`.
- train shorthand preview: `yolozu train --import auto --cfg configs/examples/train_setting.yaml` writes the resolved canonical `TrainConfig`.
`rtdetr_pose/tools/train_minimal.py` supports optional depth integration without breaking the backbone swap boundary (`[P3,P4,P5]`):
- `--depth-mode none` (default): no depth path, baseline behavior.
- `--depth-mode sidecar`: read per-image sidecar depth (`depth_path`/`depth`) and propagate `depth_valid`.
- `--depth-mode fuse_mid`: sidecar + lightweight mid-fusion after the projector (outside the backbone boundary), with `--depth-dropout` for modality dropout.

Safety defaults:
- `--depth-unit` controls absolute-depth safety (`unspecified|relative|metric`, default `unspecified`).
- Absolute-depth matcher costs are only active in metric mode; non-metric modes disable `cost_z`/`cost_t` as a safety measure.
- `--depth-scale` applies unit scaling to sidecar depth values before use.
- Apache-2.0-only utilities and evaluation harnesses (no vendored GPL/AGPL inference code).
- CPU-first development workflow: dataset tooling, validators, scenario suite, and unit tests run without a GPU.
- Adapter interface decouples inference backend from evaluation (PyTorch/ONNXRuntime/TensorRT/custom), so you can run inference elsewhere and still score/compare locally.
- Reproducible artifacts: stable JSON reports + optional JSONL history for regressions.
- Symmetry + commonsense constraints are treated as first-class, test-covered utilities (not ad-hoc postprocess).
- Training in `rtdetr_pose/` is run-contract based (data/loss/export wiring, resume, parity gate). Continual-learning behavior is testable from pip with `yolozu demo continual --compare --markdown`, and source training stays available via `yolozu train configs/examples/train_setting.yaml` (docs/training_inference_export.md, requires `yolozu[train]`).
- A one-command folder inference path is available from pip: `yolozu predict-images --backend onnxrt --input-dir data/smoke/images/val --onnx runs/smoke/model.onnx`, which writes predictions JSON + overlays + HTML in one run.
- TensorRT remains NVIDIA/Linux-centric, while macOS can run CPU validation and ONNXRuntime export (`yolozu onnxrt export ...`); GPU/TRT build/eval is pinned to Runpod/container workflows (docs/tensorrt_pipeline.md).
- Backend parity drift is handled by a dedicated checker, `yolozu parity --reference reports/pred_torch.json --candidate reports/pred_onnxrt.json`, plus protocol-pinned eval settings (docs/yolo26_eval_protocol.md).
- Lightweight metrics stay available for fast loops, and full COCOeval is directly exposed from pip; install the extra and run:
  ```
  python3 -m pip install 'yolozu[coco]'
  yolozu eval-coco --dataset data/smoke --predictions data/smoke/predictions/predictions_dummy.json
  ```
- A long-tail focused post-hoc path is available without retraining (a score-adjustment sketch follows this list):
  ```
  yolozu calibrate --method fracal --task bbox --dataset data/smoke \
    --predictions data/smoke/predictions/predictions_dummy.json \
    --output runs/smoke/predictions_calibrated.json \
    --stats-out runs/smoke/fracal_stats_bbox.json
  yolozu eval-long-tail --dataset data/smoke --predictions runs/smoke/predictions_calibrated.json
  ```
  Reuse training-time stats with `--stats-in reports/fracal_stats_bbox.json` (also supported for `--task seg`). Alternative methods are available for comparison: `--method la --tau <value>` and `--method norcal --gamma <value>`.
- Model weights/datasets stay outside git by design; reproducibility is maintained through stable JSON artifacts and pinned path conventions documented in docs/external_inference.md and docs/yolo26_inference_adapters.md.
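As background for `--method la --tau` above: logit adjustment compensates class imbalance post hoc by subtracting `tau * log(prior)` from class logits, so rare classes are no longer suppressed. A generic sketch of that adjustment (not the repo's `calibrate` implementation):

```python
# Post-hoc logit adjustment (the idea behind --method la --tau) -- a generic
# sketch, not the repo's calibrate implementation.
import numpy as np

def logit_adjust(logits, class_counts, tau=1.0):
    prior = class_counts / class_counts.sum()
    return logits - tau * np.log(prior)   # rare classes get a boost

logits = np.array([2.0, 1.5, 1.4])        # head, mid, tail class
counts = np.array([10000, 1000, 10])      # long-tailed training frequencies
adj = logit_adjust(logits, counts, tau=1.0)
print(adj.argmax())  # the tail class can win after adjustment
```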
```
python3 -m pip install yolozu
yolozu --help
yolozu doctor --output -
```
Support / legal:
- Contact: develop@toppymicros.com
- © 2026 ToppyMicroServices OÜ
- Legal address: Karamelli tn 2, 11317 Tallinn, Harju County, Estonia
- Registry code: 16551297
Optional extras (recommended as needed):
```
python3 -m pip install 'yolozu[demo]'   # torch demos (CPU OK)
python3 -m pip install 'yolozu[onnxrt]' # ONNXRuntime CPU exporter
python3 -m pip install 'yolozu[coco]'   # pycocotools COCOeval
python3 -m pip install 'yolozu[full]'
```
CPU demos:
```
yolozu demo instance-seg
yolozu demo continual --method ewc_replay    # requires yolozu[demo]
yolozu demo continual --compare --markdown   # suite: naive/ewc/replay/ewc_replay
```
A source checkout unlocks the full repo tooling (tools/, rtdetr_pose/, scenarios, etc.):
```
python3 -m pip install -r requirements-test.txt
python3 -m pip install -e .
# Tiny smoke dataset (optional but useful for scenario runs)
bash tools/fetch_coco128.sh
python3 -m unittest -q
```
Common commands (pip CLI / repo wrapper):
- Environment report: `yolozu doctor --output -` / `python3 tools/yolozu.py doctor --output reports/doctor.json`
- Export smoke (no inference): `yolozu export --backend labels --dataset data/smoke --output runs/smoke/predictions_labels.json` / same in the repo wrapper.
- Folder inference + overlays/HTML: `yolozu predict-images --backend onnxrt --input-dir data/smoke/images/val --model runs/smoke/model.onnx` / `python3 tools/yolozu.py predict-images ...`
- Backend parity check: `yolozu parity --reference reports/pred_ref.json --candidate reports/pred_cand.json` / `python3 tools/check_predictions_parity.py ...` (a diff-stats sketch follows this list)
- Validate dataset layout: `yolozu validate dataset data/smoke --strict` / `python3 tools/validate_dataset.py ... --strict`
- Validate predictions JSON: `yolozu validate predictions reports/predictions.json --strict` / `python3 tools/validate_predictions.py ... --strict`
- COCOeval mAP: `yolozu eval-coco --dataset data/smoke --predictions data/smoke/predictions/predictions_dummy.json` (yolozu[coco]) / `python3 tools/eval_coco.py ...`
- Long-tail post-hoc + report: `yolozu calibrate --method fracal ... && yolozu eval-long-tail ...` / same via `python3 tools/yolozu.py ...`
- Long-tail train recipe: `yolozu long-tail-recipe --dataset data/smoke ...` / same via `python3 tools/yolozu.py ...`
- Instance-seg eval (PNG masks): `yolozu eval-instance-seg --dataset /path --predictions preds.json ...` / `python3 tools/eval_instance_segmentation.py ...`
- ONNXRuntime CPU export: `yolozu onnxrt export ...` (yolozu[onnxrt]) / `python3 tools/export_predictions_onnxrt.py ...`
- Training pipeline: `yolozu train configs/examples/train_contract.yaml --run-id exp01` (yolozu[train]) / `python3 rtdetr_pose/tools/train_minimal.py ...`
- Scenario suite: `yolozu test configs/examples/test_setting.yaml` / `python3 tools/run_scenarios.py ...`
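As referenced in the parity item above, a parity check boils down to diff stats between two predictions files. A minimal sketch of that comparison (not `tools/check_predictions_parity.py`; it assumes both files use the wrapped shape with matching detection counts and order):

```python
# Backend parity diff stats -- a minimal sketch of the comparison idea, not
# tools/check_predictions_parity.py itself.
import json

def load(path):
    with open(path) as f:
        data = json.load(f)
    return {e["image"]: e["detections"] for e in data.get("predictions", [])}

def parity_stats(ref_path, cand_path):
    ref, cand = load(ref_path), load(cand_path)
    score_diffs, bbox_diffs = [], []
    for image, ref_dets in ref.items():
        for r, c in zip(ref_dets, cand.get(image, [])):
            score_diffs.append(abs(r["score"] - c["score"]))
            bbox_diffs.extend(abs(r["bbox"][k] - c["bbox"][k])
                              for k in ("cx", "cy", "w", "h"))
    return {"max_score_diff": max(score_diffs, default=0.0),
            "max_bbox_diff": max(bbox_diffs, default=0.0)}

print(parity_stats("reports/pred_torch.json", "reports/pred_onnxrt.json"))
```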
The “power-user” unified CLI lives in-repo: `python3 tools/yolozu.py --help`.
Path behavior in tool CLIs:
- Relative input paths are resolved from the current working directory (with a repo-root fallback for compatibility).
- Relative output paths are written under the current working directory.
- For config-driven tools such as `tools/tune_gate_weights.py`, relative paths in the config are resolved from the config file's directory.
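A sketch of the input-path lookup order described above (the repo-root detection and fallback rule are illustrative; the actual logic may differ):

```python
# Resolving a relative input path from CWD with repo-root fallback -- a sketch
# of the lookup order described above.
from pathlib import Path

def resolve_input(path_str: str, repo_root: Path) -> Path:
    p = Path(path_str)
    if p.is_absolute() or p.exists():   # CWD-relative wins when it exists
        return p.resolve()
    fallback = repo_root / p            # compatibility fallback
    return fallback.resolve() if fallback.exists() else p.resolve()

# Hypothetical repo location, for illustration only:
print(resolve_input("data/smoke", repo_root=Path.home() / "yolozu"))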
YOLOZU can publish Docker images to GitHub Container Registry (GHCR) on tags `vX.Y.Z`.
- Minimal (no torch): `ghcr.io/<owner>/yolozu:<tag>`
- Demo (includes torch): `ghcr.io/<owner>/yolozu-demo:<tag>`

Examples:
```
docker run --rm ghcr.io/<owner>/yolozu:0.1.0 doctor --output -
docker run --rm ghcr.io/<owner>/yolozu-demo:0.1.0 demo continual --method ewc_replay
```
Publish trigger:
- Push a tag `vX.Y.Z` to run `.github/workflows/container.yml`.
- If the tag existed before the workflow was added, run it manually via GitHub Actions (workflow_dispatch) or cut a new tag.
Details: deploy/docker/README.md
- GPU is supported (training/inference): install CUDA-enabled PyTorch in your environment and use `--device cuda:0`.
- CI/dev does not require a GPU; many checks are CPU-friendly.
The trainer implementation lives in rtdetr_pose/rtdetr_pose/train_minimal.py
(source-checkout wrapper: rtdetr_pose/tools/train_minimal.py).
Quickest path (source checkout):
```
python3 -m pip install -r requirements-test.txt
bash tools/fetch_coco128.sh
python3 rtdetr_pose/tools/train_minimal.py \
  --dataset-root data/coco128 \
  --config rtdetr_pose/configs/base.json \
  --max-steps 50 \
  --run-dir runs/train_minimal_smoke
```
Config-driven path (recommended for repeatability):
```
yolozu train configs/examples/train_setting.yaml
```
- Prepare a YOLO-format dataset root:
  - `images/train/*.{jpg,png}` (and `images/val/*.{jpg,png}`)
  - `labels/train/*.txt` (and `labels/val/*.txt`, YOLO txt: `class cx cy w h`)
- Validate the dataset before training:
  ```
  yolozu validate dataset data/smoke --split val --strict
  yolozu validate dataset data/smoke --split val --strict --no-check-images
  ```
- Copy and edit the contract config (set dataset root/splits and training knobs):
  ```
  cp configs/examples/train_contract.yaml configs/runtime/train_contract_custom.yaml
  ```
- Launch run-contract training:
  ```
  yolozu train configs/runtime/train_contract_custom.yaml --run-id custom_exp01
  ```
- Resume / dry-run when needed:
  ```
  yolozu train configs/runtime/train_contract_custom.yaml --run-id custom_exp01 --resume
  yolozu train configs/runtime/train_contract_custom.yaml --run-id custom_exp01 --dry-run
  ```
- Export predictions from your trained checkpoint and evaluate:
```
python3 tools/export_predictions.py \
  --adapter rtdetr_pose \
  --config rtdetr_pose/configs/base.json \
  --checkpoint runs/custom_exp01/checkpoints/best.pt \
  --dataset data/smoke \
  --split val \
  --wrap \
  --output reports/custom_exp01_predictions_val.json
yolozu eval-coco \
  --dataset data/smoke \
  --split val \
  --predictions reports/custom_exp01_predictions_val.json \
  --bbox-format cxcywh_norm
```
For quick experiments, set `--run-dir`, which writes a standard artifact set:
- `metrics.jsonl` (+ final `metrics.json` / `metrics.csv`)
- `checkpoint.pt` (+ optional `checkpoint_bundle.pt`)
- `model.onnx` (+ `model.onnx.meta.json`)
- `run_record.json` (git SHA / platform / args)
For production-style repeatability (fixed paths, resume, best/last, parity gate), use the run contract:
```
yolozu train configs/examples/train_contract.yaml --run-id exp01
# Resume (full state: model/optim/sched/amp/ema/step + RNG)
yolozu train configs/examples/train_contract.yaml --run-id exp01 --resume
# Smoke wiring check (1 step then exit 0)
yolozu train configs/examples/train_contract.yaml --run-id exp01 --dry-run
```
Contracted artifacts (fixed paths):
- `runs/exp01/checkpoints/{last,best}.pt`
- `runs/exp01/reports/train_metrics.jsonl` (1 line per optimizer step)
- `runs/exp01/reports/val_metrics.jsonl`
- `runs/exp01/reports/config_resolved.yaml`
- `runs/exp01/reports/run_meta.json`
- `runs/exp01/reports/onnx_parity.json` (Torch vs ONNXRuntime; fails the run on drift by default)
- `runs/exp01/exports/model.onnx` (+ `model.onnx.meta.json`)

Best is defined as the maximum `map50_95` on validation.
Run contract spec: docs/run_contract.md
Trainer core features (implemented):
- Full resume (model/optim/sched/AMP scaler/EMA/progress + RNG) via `--resume-from` or contracted `--resume`.
- NaN/Inf guard with skip + LR decay knobs (`--stop-on-non-finite-loss`, `--non-finite-max-skips`, `--non-finite-lr-decay`).
- Grad clipping (`--clip-grad-norm`, recommended >0 for pose/TTT/MIM stability).
- AMP (`--amp {none,fp16,bf16}`), EMA (`--use-ema` + `--ema-eval`; a sketch of the EMA update follows this list), DDP (`--ddp` via `torchrun`).
- (Optional) `torch.compile`: `--torch-compile` (+ `--torch-compile-*`; experimental; falls back by default if compile fails).
- (Optional) torchao quantization / QLoRA: `--torchao-quant {int8wo,int4wo}` / `--qlora` (experimental; requires torchao).
- Lightweight aug: multiscale (`--multiscale`), hflip (`--hflip-prob`), photometric HSV/grayscale/noise/blur (`--hsv-*`, `--gray-prob`, `--gaussian-noise-*`, `--blur-*`; effective when `--real-images` is used).
- Validation cadence: epoch-based (`--val-every`) and step-based (`--val-every-steps`).
- Early stop: `--early-stop-patience` (+ `--early-stop-min-delta`).
- Activation swapping (config): set `model.backbone_activation` / `model.head_activation` to `silu|gelu|swish|hardswish|hard-swish|leakyrelu`, or use `model.activation_preset` (recommended default: `default` = SiLU/SiLU).
- Backbone swap (config): prefer `model.backbone.name|norm|args` plus `model.projector.d_model` (legacy `model.backbone_name` is still accepted).
- Backbone contract: model backbones return `[P3,P4,P5]` at strides `[8,16,32]`; channels are aligned by 1x1 projection before encoder input.
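As referenced in the AMP/EMA item above, EMA keeps a decayed copy of the weights for evaluation. A generic PyTorch sketch of the standard update (not the trainer's internal implementation; buffers such as BN running stats are omitted for brevity):

```python
# EMA of model weights (cf. --use-ema / --ema-eval) -- a generic PyTorch
# sketch of the standard exponential-moving-average update.
import copy
import torch

class Ema:
    def __init__(self, model: torch.nn.Module, decay: float = 0.999):
        self.decay = decay
        self.shadow = copy.deepcopy(model).eval()   # frozen evaluation copy
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        # shadow = decay * shadow + (1 - decay) * current weights
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.mul_(self.decay).add_(p, alpha=1.0 - self.decay)

model = torch.nn.Linear(4, 2)
ema = Ema(model, decay=0.99)
# ... after each optimizer.step() ...
ema.update(model)   # evaluate with ema.shadow instead of model
```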
Common next checks:
```
python3 tools/plot_metrics.py --jsonl runs/train_minimal_smoke/metrics.jsonl --out reports/train_loss.png
python3 tools/export_predictions.py \
  --adapter rtdetr_pose \
  --config rtdetr_pose/configs/base.json \
  --checkpoint runs/train_minimal_smoke/checkpoint.pt \
  --max-images 20 \
  --wrap \
  --output reports/predictions.json
python3 tools/eval_coco.py \
  --dataset data/smoke \
  --predictions reports/predictions.json \
  --bbox-format cxcywh_norm \
  --max-images 20 \
  --dry-run
```
If you explicitly want the downloaded subset path, use:
```
python3 tools/eval_coco.py \
  --dataset data/coco128 \
  --predictions reports/predictions.json \
  --bbox-format cxcywh_norm \
  --max-images 20 \
  --dry-run
```
Backbone details and extension guide: docs/backbones.md
Plot a loss curve (requires matplotlib):
```
python3 tools/plot_metrics.py \
  --jsonl runs/train_minimal_smoke/metrics.jsonl \
  --out reports/train_loss.png
```
ONNX export runs when `--run-dir` is set (defaulting to `runs/smoke/model.onnx`) or when `--onnx-out` is provided.
Useful flags:
- `--run-dir runs/smoke`
- `--onnx-out runs/smoke/model.onnx`
- `--onnx-meta-out runs/smoke/model.onnx.meta.json`
- `--onnx-opset <int>`
- `--onnx-dynamic-hw` (dynamic H/W axes)
Base dataset format:
- Images: `images/train/*.{jpg,png}` (and `images/val/*.{jpg,png}`)
- Labels: `labels/train/*.txt` (and `labels/val/*.txt`, YOLO: `class cx cy w h`, normalized)
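A minimal sketch of parsing one label file in this format (illustrative; the repo has its own loaders and validators):

```python
# Parsing one YOLO-format label file (class cx cy w h, normalized to [0,1]).
from pathlib import Path

def read_yolo_labels(txt_path: str):
    boxes = []
    for line in Path(txt_path).read_text().splitlines():
        if not line.strip():
            continue
        cls, cx, cy, w, h = line.split()[:5]
        boxes.append({"class_id": int(cls),
                      "bbox": {"cx": float(cx), "cy": float(cy),
                               "w": float(w), "h": float(h)}})
    return boxes

print(read_yolo_labels("labels/val/000001.txt"))
```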
- Ultralytics YOLOv8 / YOLO11: if your dataset root contains `images/train` + `labels/train` (and `images/val` + `labels/val`), it is already compatible. Use `yolozu validate dataset data/smoke --strict`.
  - You can also pass an Ultralytics `data.yaml` as `--dataset` (expects `path:` + `train:`/`val:` pointing to `images/train` and `images/val`).
- YOLOX: common setups use COCO JSON (`instances_*.json`). Convert once with `tools/prepare_coco_yolo.py` to generate YOLO-format labels (and an optional `dataset.json` descriptor) under a YOLOZU-compatible dataset root.
  - If you want a read-only wrapper (no label txt generation), use import adapters: docs/import_adapters.md.
Optional per-image metadata (JSON): `labels/train/000001.json` (same pattern for `labels/val/*.json`)
- Masks/seg: `mask_path` / `mask` / `M`
- Depth: `depth_path` / `depth` / `D_obj`
- Pose: `R_gt` / `t_gt` (or `pose`)
- Intrinsics: `K_gt` / `intrinsics` (also supports OpenCV FileStorage-style `camera_matrix: {rows, cols, data: [...]}`)
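A small sketch of consuming these keys, including the OpenCV FileStorage-style `camera_matrix` fallback for intrinsics (illustrative; not the repo's metadata loader):

```python
# Reading the optional intrinsics metadata -- a sketch of the keys listed
# above, including the {rows, cols, data: [...]} form.
import json
import numpy as np
from pathlib import Path

def read_intrinsics(meta: dict):
    K = meta.get("K_gt") or meta.get("intrinsics") or meta.get("camera_matrix")
    if isinstance(K, dict) and "data" in K:   # FileStorage-style form
        return np.asarray(K["data"], dtype=float).reshape(K["rows"], K["cols"])
    return np.asarray(K, dtype=float) if K is not None else None

meta = json.loads(Path("labels/train/000001.json").read_text())
print(read_intrinsics(meta))
```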
Take care with units (pixels vs mm/m) and intrinsics coordinate frames when filling these fields.
If YOLO txt labels are missing and a mask is provided, bbox+class can be derived from masks, as sketched below. Details (including color/instance modes and multi-PNG-per-class options) are documented in the repo docs (start: docs/README.md).
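A sketch of that mask-to-bbox fallback for the simplest case, a single-instance binary mask (color/instance modes need extra id/color bookkeeping):

```python
# Deriving a normalized YOLO bbox from a binary mask -- a sketch of the
# mask-to-bbox fallback described above (single-instance binary mode only).
import numpy as np
from PIL import Image

def bbox_from_mask(mask_path: str):
    mask = np.asarray(Image.open(mask_path).convert("L")) > 0
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                     # empty mask -> no bbox
    h_img, w_img = mask.shape
    x0, x1 = xs.min(), xs.max() + 1     # tight pixel bounds
    y0, y1 = ys.min(), ys.max() + 1
    return {"cx": (x0 + x1) / 2 / w_img, "cy": (y0 + y1) / 2 / h_img,
            "w": (x1 - x0) / w_img, "h": (y1 - y0) / h_img}

print(bbox_from_mask("masks/000001.png"))  # hypothetical path
```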
- Symmetry specs live in `configs/runtime/symmetry.json` (validated loader: `yolozu.config.load_symmetry_map`).
- Core ops: `yolozu/symmetry.py` (types: `none`, `Cn`/`C2`/`C4`, `Cinf`).
- Symmetry-aware template verification: `yolozu/template_verification.py`.
- Tests: `python3 -m pytest -q tests/test_symmetry.py tests/test_template_verification.py`.
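For Cn symmetry types, the natural pose error is the minimum geodesic rotation distance over the symmetry group. A generic sketch of that idea (not the API of `yolozu/symmetry.py`; the symmetry axis is assumed to be z):

```python
# Symmetry-aware rotation error for Cn symmetries -- minimum geodesic
# distance over the symmetry group; illustrative, not yolozu/symmetry.py.
import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def geodesic_deg(Ra, Rb):
    cos = (np.trace(Ra.T @ Rb) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def cn_rotation_error_deg(R_pred, R_gt, n: int):
    # Evaluate against every gt pose equivalent under Cn about the z axis.
    return min(geodesic_deg(R_pred, R_gt @ rot_z(2 * np.pi * k / n))
               for k in range(n))

R_gt = rot_z(0.0)
R_pred = rot_z(np.pi)                            # 180 degrees off
print(cn_rotation_error_deg(R_pred, R_gt, n=2))  # ~0 under C2 symmetry
```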
This repo evaluates models through a stable predictions JSON format:
- Schema doc: docs/predictions_schema.md
- Machine-readable schema: schemas/predictions.schema.json
Adapters power `tools/export_predictions.py --adapter <name>` and follow the adapter contract (docs/adapter_contract.md).
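To make the boundary concrete, here is a hypothetical adapter shape; the class and method names are illustrative, and the real interface is the one specified in docs/adapter_contract.md:

```python
# A hypothetical adapter shape -- illustrative only; the real interface is
# specified in docs/adapter_contract.md. The contract-first design implies
# only "images in, schema-conformant detections out".
class MyBackendAdapter:
    def __init__(self, model_path: str):
        self.model_path = model_path          # load your backend here

    def predict(self, image_path: str) -> list[dict]:
        # Run inference in your backend of choice and map raw outputs to
        # the stable detection schema (class_id / score / bbox {cx,cy,w,h}).
        return [{"class_id": 0, "score": 0.9,
                 "bbox": {"cx": 0.5, "cy": 0.5, "w": 0.2, "h": 0.3}}]

adapter = MyBackendAdapter("runs/smoke/model.onnx")
entry = {"image": "images/val/000001.jpg",
         "detections": adapter.predict("images/val/000001.jpg")}
```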
There are two common workflows:
If you run real inference elsewhere (PyTorch/ONNXRuntime/TensorRT/etc.), you can evaluate this repo without installing heavy deps locally.
- Validate the JSON:
  ```
  python3 tools/validate_predictions.py reports/predictions.json
  ```
- Consume predictions locally:
  ```
  yolozu test configs/examples/test_setting.yaml \
    --adapter precomputed \
    --predictions reports/predictions.json \
    --max-images 50
  python3 tools/run_scenarios.py --adapter precomputed --predictions reports/predictions.json --max-images 50
  ```
TTT updates weights in-memory at inference time and is OFF by default (opt-in via --ttt).
- Torch backend (`rtdetr_pose`, supports TTA + TTT):
  - Baseline:
    ```
    python3 tools/yolozu.py export \
      --backend torch \
      --checkpoint runs/smoke/checkpoints/best.pt \
      --device cuda \
      --max-images 50 \
      --output reports/predictions.json
    ```
  - TTT (Tent, safe preset; an entropy-minimization sketch follows this list):
    ```
    python3 tools/yolozu.py export \
      --backend torch \
      --checkpoint runs/smoke/checkpoints/best.pt \
      --device cuda \
      --max-images 50 \
      --ttt \
      --ttt-preset safe \
      --ttt-reset sample \
      --ttt-log-out reports/ttt_log_safe.json \
      --output reports/predictions_ttt_safe.json
    ```
  - TTT batch/chunk knobs: add `--ttt-batch-size <N>` and `--ttt-max-batches <K>` to cap adaptation cost (example: `--ttt-batch-size 4 --ttt-max-batches 8`).
  - TTT reset behavior: use `--ttt-reset stream` for one adaptation phase then fast prediction, or `--ttt-reset sample` for per-image/per-batch reset-ablation runs.
  - Note: `tools/yolozu.py export` always writes the wrapped `{ "predictions": [...] }` form (so `--wrap` is not needed).
  - Note: TTT is supported in repo tooling (`python3 tools/yolozu.py ...`) on the torch backend; the pip CLI `yolozu export` is intentionally smoke-only (dummy/labels).
  - Recommended protocol + rationale (domain shift, presets, guards): docs/ttt_protocol.md
- ONNXRuntime/TensorRT backends: use `python3 tools/yolozu.py export --backend onnxrt|trt ...` (TTT is torch-only; use TTA or export precomputed predictions for other backends).
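As referenced above, Tent adapts at test time by minimizing prediction entropy, conventionally updating only normalization affine parameters. A generic classification-style sketch of that update (not the repo's `--ttt` implementation or its presets/guards):

```python
# Tent-style test-time adaptation -- a generic sketch of entropy minimization
# over normalization affine parameters; not the repo's --ttt implementation.
import torch

def collect_norm_params(model):
    params = []
    for m in model.modules():
        if isinstance(m, (torch.nn.BatchNorm2d, torch.nn.LayerNorm)):
            params += [p for p in (m.weight, m.bias) if p is not None]
    return params

def tent_step(model, x, optimizer):
    logits = model(x)
    probs = logits.softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()
    optimizer.zero_grad()
    entropy.backward()        # only norm affine params receive updates
    optimizer.step()
    return entropy.item()

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3), torch.nn.BatchNorm2d(8),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(8, 10))
for p in model.parameters():
    p.requires_grad_(False)           # freeze everything ...
norm_params = collect_norm_params(model)
for p in norm_params:
    p.requires_grad_(True)            # ... except norm scale/shift
opt = torch.optim.SGD(norm_params, lr=1e-3)
print(tent_step(model, torch.randn(4, 3, 32, 32), opt))
```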
Supported predictions JSON shapes:
[{"image": "...", "detections": [...]}, ...]{ "predictions": [ ... ] }{ "000000000009.jpg": [...], "/abs/path.jpg": [...] }(image -> detections)
Schema details:
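A sketch of folding the three accepted shapes into one canonical list (illustrative; the repo's loader is authoritative):

```python
# Normalizing the three accepted predictions-JSON shapes into one list of
# {"image": ..., "detections": [...]} entries -- a sketch of the contract.
import json

def normalize_predictions(obj):
    if isinstance(obj, list):                       # shape 1: list of entries
        return obj
    if isinstance(obj, dict) and "predictions" in obj:
        return obj["predictions"]                   # shape 2: wrapped
    if isinstance(obj, dict):                       # shape 3: image -> dets
        return [{"image": k, "detections": v} for k, v in obj.items()]
    raise ValueError("unsupported predictions shape")

with open("reports/predictions.json") as f:
    entries = normalize_predictions(json.load(f))
print(len(entries))
```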
To compete on e2e mAP (NMS-free), evaluate detections as-is (no NMS postprocess applied).
This repo includes a COCO-style evaluator that:
- Builds COCO ground truth from YOLO-format labels
- Converts YOLOZU predictions JSON into COCO detections
- Runs COCO mAP via `pycocotools` (optional dependency)
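The conversion step amounts to mapping a normalized `{cx,cy,w,h}` bbox to a COCO `[x, y, w, h]` pixel box per detection. A minimal sketch (not `tools/eval_coco.py` itself):

```python
# Converting one normalized cxcywh bbox into a COCO detection record
# ([x, y, w, h] in pixels) -- a sketch of the conversion step above.
def to_coco_det(det, image_id, img_w, img_h):
    b = det["bbox"]                       # {cx, cy, w, h}, normalized [0,1]
    w, h = b["w"] * img_w, b["h"] * img_h
    x = b["cx"] * img_w - w / 2.0         # top-left corner
    y = b["cy"] * img_h - h / 2.0
    return {"image_id": image_id, "category_id": det["class_id"],
            "bbox": [x, y, w, h], "score": det["score"]}

det = {"class_id": 0, "score": 0.9,
       "bbox": {"cx": 0.5, "cy": 0.5, "w": 0.2, "h": 0.3}}
print(to_coco_det(det, image_id=1, img_w=640, img_h=480))
```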
Example (smoke-first, works with bundled assets):
```
python3 tools/export_predictions.py \
  --adapter dummy \
  --dataset data/smoke \
  --split val \
  --max-images 10 \
  --wrap \
  --output reports/predictions_smoke.json
python3 tools/eval_coco.py \
  --dataset data/smoke \
  --split val \
  --predictions reports/predictions_smoke.json \
  --bbox-format cxcywh_norm \
  --max-images 10 \
  --dry-run
```
Alternative (coco128 quick run):
```
python3 tools/export_predictions.py --adapter dummy --max-images 50 --wrap --output reports/predictions.json
python3 tools/eval_coco.py \
  --dataset data/coco128 \
  --predictions reports/predictions.json \
  --bbox-format cxcywh_norm \
  --max-images 50
```
Note:
- `--bbox-format cxcywh_norm` expects the bbox dict `{cx,cy,w,h}` normalized to `[0,1]` (matching the RT-DETR pose adapter bbox head).
- For run-contract variable naming and artifact conventions, see docs/run_contract.md; README examples intentionally use fixed concrete paths such as `runs/smoke`.
Reference recipe for external training runs (augment, multiscale, schedule, EMA): docs/training_recipe_v1.md, docs/training_inference_export.md

Run a configurable sweep and emit CSV/MD tables: docs/hpo_sweep.md

Report latency/FPS per YOLO26 bucket and archive runs over time: docs/benchmark_latency.md

Fuse detection/template/uncertainty signals into a single score and tune weights offline (CPU-only): docs/gate_weight_tuning.md

Reproducible engine build + parity validation steps: docs/tensorrt_pipeline.md
This repo does not require (or vendor) any GPL/AGPL inference code.
To compare against external baselines (including YOLO26) while keeping this repo Apache-2.0-only:
- Run baseline inference in your own environment/implementation (ONNX Runtime / TensorRT / custom code).
- Export detections to YOLOZU predictions JSON (see schema below).
- (Optional) Normalize class ids using the COCO `classes.json` mapping.
- Validate + evaluate mAP in this repo:
  ```
  python3 tools/validate_predictions.py reports/predictions.json
  python3 tools/eval_coco.py \
    --dataset data/smoke \
    --split val \
    --predictions reports/predictions.json \
    --bbox-format cxcywh_norm
  ```
Minimal predictions entry schema:
{"image": "/abs/or/rel/path.jpg", "detections": [...]}where each detection includesclass_id,score, andbbox {cx,cy,w,h}.
Optional class-id normalization (when your exporter produces COCO category_id):
```
python3 tools/normalize_predictions.py \
  --input reports/predictions.json \
  --output reports/predictions_norm.json \
  --classes data/coco-yolo/labels/val2017/classes.json \
  --wrap
```
If you have the official COCO layout (images + annotations/instances_*.json), you can generate YOLO-format labels:
```
python3 tools/prepare_coco_yolo.py --coco-root data/coco --split val2017 --out data/coco-yolo
```
This creates:
- `data/coco-yolo/labels/val2017/*.txt` (YOLO normalized `class cx cy w h`)
- `data/coco-yolo/labels/val2017/classes.json` (category_id ↔ class_id mapping)
For local development, keep datasets under data/:
- Debug/smoke: `data/coco128` (already included)
- Full COCO (official): `data/coco` (your download)
- YOLO-format labels generated from official JSON: `data/coco-yolo` (your output from `tools/prepare_coco_yolo.py`)
If you export yolo26n/s/m/l/x predictions as separate JSON files (e.g. `reports/pred_yolo26n.json`, ...), you can score them together:
- Protocol details: docs/yolo26_eval_protocol.md
- Evaluate the suite:
  ```
  python3 tools/eval_suite.py \
    --protocol yolo26 \
    --dataset data/coco-yolo \
    --predictions-glob 'reports/pred_yolo26*.json' \
    --output reports/eval_suite.json
  ```
- Fill in targets: `baselines/yolo26_targets.json`
- Validate targets:
  ```
  python3 tools/validate_map_targets.py --targets baselines/yolo26_targets.json
  ```
- Check pass/fail:
  ```
  python3 tools/check_map_targets.py \
    --suite reports/eval_suite.json \
    --targets baselines/yolo26_targets.json \
    --key map50_95
  ```
- Print a table:
  ```
  python3 tools/print_leaderboard.py \
    --suite reports/eval_suite.json \
    --targets baselines/yolo26_targets.json \
    --key map50_95
  ```
- Archive the run (commands + hardware + suite output):
  ```
  python3 tools/import_yolo26_baseline.py --dataset data/coco-yolo --predictions-glob 'reports/pred_yolo26*.json'
  ```
If you don't have pycocotools installed yet, you can still validate/convert predictions on data/coco128:
```
python3 tools/export_predictions.py --adapter dummy --max-images 10 --wrap --output reports/predictions_dummy.json
python3 tools/eval_coco.py --predictions reports/predictions_dummy.json --dry-run
```
- Keep symmetry/commonsense logic in lightweight postprocess utilities, outside any inference graph export.
Code in this repository is licensed under the Apache License, Version 2.0. See LICENSE.