Japanese: Readme_jp.md
- Framework-agnostic evaluation toolkit for vision models: designed for reproducible continual learning and test-time adaptation under domain shift.
- Training-capable workflows for mitigating catastrophic forgetting: supports training and evaluation workflows based on self-distillation, replay, and parameter-efficient updates (PEFT). These approaches reduce forgetting and make it measurable and comparable across runs, though complete elimination is not guaranteed.
- Support for inference-time adaptation (TTT): allows model parameters to be adjusted during inference, enabling continual adaptation to domain shift in deployment.
- Predictions as the stable interface contract: treats predictions, not models, as the primary contract, making training, continual learning, and inference-time adaptation comparable, restartable, and CI-friendly across frameworks and runtimes.
- Multi-task evaluation support: covers object detection, segmentation, keypoint estimation, monocular depth estimation, and 6DoF pose estimation. Training implementations remain configurable and decoupled, rather than fixed to a specific framework.
- Production-ready deployment path: supports ONNX/ExecuTorch export and execution across PyTorch, ONNX Runtime, TensorRT, and ExecuTorch, with reference inference templates in C++ and Rust.
- Interface-contract-first, AI-first workflow: every experiment emits versioned artifacts that can be automatically compared and regression-tested in CI.
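The predictions-first contract can be pictured as a small JSON payload. The field names below are illustrative assumptions for a sketch, not the repo's actual schema (the validators and docs/yolozu_spec.md define the real one):

```python
import json

# Hypothetical sketch of a predictions-first payload. Field names here are
# illustrative assumptions; only the protocol value nms_applied appears in
# the docs. The real schema is enforced by the repo's validators.
record = {
    "image": "val/000001.jpg",
    "detections": [
        {"bbox": [48.0, 32.0, 128.0, 96.0], "category_id": 1, "score": 0.87},
    ],
}
export_settings = {"protocol": "nms_applied", "image_size": 640}

payload = {"export_settings": export_settings, "predictions": [record]}
print(json.dumps(payload, indent=2))
```

Because the payload is plain JSON, any framework that can emit this shape plugs into the same validate/eval pipeline.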
One-screen view of how YOLOZU lets you compare different model stacks using the same dataset and a stable predictions interface contract (predictions.json + export_settings).
| Model / stack | Fine-tune entrypoint (smoke) | predictions.json export path | Eval path | Notes |
|---|---|---|---|---|
| Ultralytics YOLO (YOLOv8/YOLO11) | `tools/run_external_finetune_smoke.py` (framework=yolov) | `tools/export_predictions_ultralytics.py` | `tools/eval_coco.py` | Typical exports are post-NMS; use `protocol=nms_applied`. |
| RT-DETR (in-repo `rtdetr_pose`) | `tools/run_external_finetune_smoke.py` (framework=rtdetr) | `tools/run_reference_adapter_regression.py` (predict → canonicalize) | `tools/run_reference_adapter_regression.py` (gates) | Reference adapter regression is the “real model baseline” path. |
| Hugging Face DETR / RT-DETR | `tools/support_ultralytics_detr.py th` (dry/non-dry) | `tools/support_ultralytics_detr.py pn` (normalize) | `tools/eval_coco.py` | Keeps framework specifics outside the stable interface contract. |
| Detectron2 | `tools/run_external_finetune_smoke.py` (framework=detectron2) | `tools/export_predictions_detectron2.py` | `tools/eval_coco.py` | Non-dry execution requires `--detectron2-train-script`. |
| MMDetection | `tools/run_external_finetune_smoke.py` (framework=mmdetection) | `tools/export_predictions_mmdet.py` | `tools/eval_coco.py` | Non-dry execution requires `--mmdet-train-script`. |
| YOLOX | (interop smoke) | `tools/yolozu.py export --backend yolox` | `tools/eval_coco.py` | Intended for “external inference → interface contract → eval” workflows. |
Minimal proof (same dataset, same report shape; safe default is dry-run):

```bash
python3 tools/run_external_finetune_smoke.py --dataset-root data/smoke --split train --output reports/external_finetune_smoke.json
```

Visual evidence (example overlays produced by the instance-seg demo):

```bash
python3 -m pip install -e .
bash scripts/smoke.sh
```

If your system Python is externally managed (PEP 668), use a venv:

```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e .
bash scripts/smoke.sh
```

Output artifacts:
- `reports/smoke_coco_eval_dry_run.json`
- `reports/smoke_synthgen_summary.json`
- `reports/smoke_synthgen_eval.json`
- `reports/smoke_synthgen_overlay.png`
- `reports/smoke_demo_instance_seg/overlays/*.png` (visual demo evidence)
If you only want contract checks (skip demo PNG generation), run:

```bash
bash scripts/smoke.sh --skip-demo
```

If you want a deeper first-time walkthrough evidence report (capability claims + deploy-path dry-runs), run:

```bash
bash scripts/smoke.sh --profile deep
```

Deep profile additionally writes:

- `reports/smoke_walkthrough_report.json`
- `reports/smoke_demo_overview.json`
- `reports/smoke_external_finetune_report.json`
- `reports/smoke_export_{onnxrt,trt,executorch}.json`
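A CI step that consumes these artifacts can fail fast when any are missing. A minimal sketch (the checker itself is illustrative; the expected names mirror the deep-profile list above):

```python
from pathlib import Path

# Illustrative sketch: verify a deep-profile artifact set exists before a CI
# step consumes it. Expected names mirror the documented output list.
EXPECTED = [
    "reports/smoke_walkthrough_report.json",
    "reports/smoke_demo_overview.json",
    "reports/smoke_external_finetune_report.json",
]

def missing_artifacts(expected, exists=lambda p: Path(p).is_file()):
    # Return the subset of expected paths that were not written.
    return [p for p in expected if not exists(p)]

print(missing_artifacts(EXPECTED))
```

An empty list means the run produced everything the follow-on job expects.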
Docs index (start here): docs/README.md.
AI-friendly tool registry (source of truth): tools/manifest.json.
Tool list + args examples: docs/tools_index.md.
Learning features (training / continual learning / TTT / distillation / long-tail recipe / PyTorch plugin choices): docs/learning_features.md.
- A: Evaluate from precomputed predictions (no inference deps): `predictions.json` → validate → eval.
- B: Train → Export → Eval (RT-DETR scaffold + run interface contract / Run Contract): run artifacts → ONNX → parity/eval.
- C: Interface contracts (predictions / adapter / TTT protocol): schemas + adapter interface contract boundary + safe adaptation protocol.
- D: Bench/Parity (TensorRT / latency benchmark): parity checks + pinned-protocol benchmarks.
All four entry points are documented (with copy-paste commands) in docs/README.md.
CLI note:

- `yolozu ...` is the pip/package CLI.
- `python3 tools/yolozu.py ...` is the repo wrapper CLI.
- For equivalent commands, swap only the executable (`yolozu` ↔ `python3 tools/yolozu.py`).
Module path note:

- Canonical Python modules live under categorized packages (`yolozu/core`, `yolozu/datasets`, `yolozu/eval`, `yolozu/inference`, `yolozu/predictions`, `yolozu/training`, `yolozu/geometry`).
- Legacy imports such as `from yolozu.dataset import build_manifest` remain available via package-level aliasing in `yolozu/__init__.py`.
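Such package-level aliasing can be implemented by registering one module object under both import paths. A sketch with made-up names (`pkg`, `pkg.dataset`, `pkg.datasets`); the real aliasing lives in `yolozu/__init__.py`:

```python
import sys
import types

# Sketch of package-level aliasing: one module object is registered under both
# the canonical and the legacy import path. All names here are illustrative.
pkg = types.ModuleType("pkg")
datasets = types.ModuleType("pkg.datasets")

def build_manifest(root):
    # Stand-in for the real helper; only the aliasing mechanism matters here.
    return {"root": root}

datasets.build_manifest = build_manifest
pkg.datasets = datasets
pkg.dataset = datasets  # legacy attribute access

sys.modules["pkg"] = pkg
sys.modules["pkg.datasets"] = datasets
sys.modules["pkg.dataset"] = datasets  # legacy import path alias

# Legacy-style import resolves to the same function object.
from pkg.dataset import build_manifest as legacy_build_manifest
assert legacy_build_manifest is build_manifest
```

Because both names point at the same module object, old import sites keep working without duplicating code.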
- Bring-your-own inference → stable `predictions.json` interface contract.
- Validators catch schema drift early.
- Protocol-pinned `export_settings` makes comparisons reproducible.
- Parity/bench quantify backend drift and performance.
- Tooling stays CPU-friendly by default (GPU optional).
- Apache-2.0-only ops policy is enforced in repo tooling.
YOLOZU standardizes evaluation around a predictions-first interface contract: run inference anywhere, export predictions.json (+ export_settings), then validate and evaluate with fixed protocols for reproducible comparisons.
Details: docs/yolozu_spec.md.
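One consequence of pinning `export_settings` is that runs with different pinned protocols should not be compared. A minimal sketch of such a comparability check, with illustrative key names (not the repo's actual schema):

```python
# Sketch: refuse to compare runs whose pinned export settings differ.
# Key names (protocol, image_size) are illustrative assumptions.
def comparable(settings_a, settings_b, pinned=("protocol", "image_size")):
    """Two exports are comparable only if every pinned key matches."""
    return all(settings_a.get(k) == settings_b.get(k) for k in pinned)

run_a = {"protocol": "nms_applied", "image_size": 640}
run_b = {"protocol": "nms_applied", "image_size": 640}
run_c = {"protocol": "raw", "image_size": 640}

assert comparable(run_a, run_b)      # same pinned protocol: comparable
assert not comparable(run_a, run_c)  # protocol mismatch: do not compare
```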
```bash
python3 -m pip install yolozu
yolozu --help
yolozu doctor --output -
```

Optional (CPU) demos:

```bash
python3 -m pip install -U 'yolozu[demo]'
yolozu demo overview  # writes demo_output/overview/<utc>/demo_overview_report.json
yolozu demo
yolozu demo instance-seg
yolozu demo keypoints
yolozu demo pose                        # chessboard default; use --backend aruco for marker-based pose
yolozu demo pose --backend aruco        # cached sample in demo_output/pose/_samples (delete to regenerate)
yolozu demo pose --backend densefusion  # heavy: CUDA + large downloads
yolozu demo depth                       # default: Depth Anything (Transformers); use --compare to run MiDaS/DPT too
yolozu demo train                       # downloads ResNet18 weights on first run
yolozu demo continual --compare --markdown
```

First-time visual confirmation (PNG output check):

```bash
yolozu demo instance-seg --background synthetic --inference none --num-images 2 --image-size 64 --max-instances 2 --run-dir reports/demo_firsttime_instance_seg
ls reports/demo_firsttime_instance_seg/overlays/*.png
```

Optional extras and CPU demos: docs/install.md.
CLI completion (bash/zsh):

```bash
# bash
eval "$(yolozu completion --shell bash)"
# zsh
eval "$(yolozu completion --shell zsh)"
```

Real-image multitask finetune smoke (bbox/segmentation/keypoints/depth/pose6d):
```bash
# review dataset license/terms before download
python3 scripts/download_coco_instances_tiny.py --out-root data/coco --split val2017 --num-images 8 --seed 0 --force
python3 tools/prepare_real_multitask_fewshot.py --out data/real_multitask_fewshot --train-images 6 --val-images 2 --strict-provenance --force
python3 tools/run_real_multitask_finetune_demo.py --dataset-root data/real_multitask_fewshot --out reports/real_multitask_finetune_demo --device cpu --epochs 1 --max-steps 1 --batch-size 2 --image-size 96 --strict-provenance --force
```

One-command workflow (prepare + optional tiny COCO auto-download + staged smoke):
```bash
python3 tools/run_real_multitask_finetune_demo.py --dataset-root data/real_multitask_fewshot --prepare --download-if-missing --allow-auto-download --accept-dataset-license --download-num-images 8 --out reports/real_multitask_finetune_demo --device cpu --epochs 1 --max-steps 1 --batch-size 2 --image-size 96 --strict-provenance --force
```

External finetune smoke matrix (YOLOv/MMDetection/Detectron2/RT-DETR, interface contract report):
```bash
python3 tools/run_external_finetune_smoke.py --dataset-root data/smoke --split train --output reports/external_finetune_smoke.json
```

Execute real training for selected frameworks:
```bash
python3 tools/run_external_finetune_smoke.py --dataset-root data/smoke --split train --non-dry-framework yolov --non-dry-framework rtdetr --epochs 1 --max-steps 1 --batch-size 2 --image-size 96 --device cpu --require-training-execution --output reports/external_finetune_smoke.exec.json
```

The RT-DETR non-dry path now emits explicit dependency failure metadata when torch is unavailable (`failure_code=E_DEP_TORCH_MISSING`). MMDetection/Detectron2 non-dry runs with `--mmdet-train-script` / `--detectron2-train-script` continue the train-path audit even when projection deps are missing, and record `projection_error` in the report.
Details and external launcher wiring: docs/external_finetune_smoke.md.
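A consumer of the smoke report can branch on this failure metadata. A hedged sketch with an assumed report layout (only the `E_DEP_TORCH_MISSING` code itself comes from the documented behavior above):

```python
import json

# Sketch: surface dependency failures from a finetune-smoke report.
# The "frameworks" layout is an assumed shape for illustration; only the
# failure code E_DEP_TORCH_MISSING comes from the documented behavior.
def summarize(report):
    lines = []
    for fw, result in report.get("frameworks", {}).items():
        code = result.get("failure_code")
        if code == "E_DEP_TORCH_MISSING":
            lines.append(f"{fw}: torch unavailable, install it for non-dry runs")
        elif code:
            lines.append(f"{fw}: failed with {code}")
        else:
            lines.append(f"{fw}: ok")
    return lines

report = {"frameworks": {"rtdetr": {"failure_code": "E_DEP_TORCH_MISSING"}, "yolov": {}}}
print("\n".join(summarize(report)))
```

Explicit failure codes like this let CI distinguish "dependency missing" from "training regressed" without parsing log text.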
Deterministic domain-shift target recipe for TTT:

```bash
python3 scripts/prepare_ttt_domain_shift_target.py --dataset-root data/smoke --split val --out reports/domain_shift/smoke_gaussian_blur_s2 --corruption gaussian_blur --severity 2 --seed 2026 --force
python3 tools/export_predictions.py --adapter dummy --dataset reports/domain_shift/smoke_gaussian_blur_s2 --split val --wrap --domain-shift-recipe reports/domain_shift/smoke_gaussian_blur_s2/domain_shift_recipe.json --output reports/pred_shift_target.json
```

Details: docs/ttt_protocol.md.
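The recipe is deterministic because the corruption is driven by the pinned seed and severity. A stdlib-only stand-in illustrating that property (the repo's actual `gaussian_blur` corruption is more involved):

```python
import random

# Stand-in for a seeded corruption: the same (seed, severity) pair always
# yields the same perturbation, which is what makes the shifted target
# reproducible across machines. Illustrative only.
def corrupt(pixels, seed, severity):
    rng = random.Random(seed)  # deterministic per-recipe RNG
    scale = severity * 0.1
    return [p + rng.uniform(-scale, scale) for p in pixels]

image = [0.2, 0.5, 0.9]
a = corrupt(image, seed=2026, severity=2)
b = corrupt(image, seed=2026, severity=2)
assert a == b  # identical recipe → identical target data
```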
Reference adapter regression (RT-DETR, real-image baseline):

```bash
python3 tools/run_reference_adapter_regression.py --dataset data/smoke --split val --max-images 2 --profile micro --repro-policy relaxed --runtime-lock requirements-ci.lock --baseline baselines/reference_adapter/rtdetr_pose_smoke_val.json --diff-summary-out reports/reference_adapter_regression.diff_summary.json --topk-examples-dir reports/reference_adapter_regression_topk --topk-examples 3 --output reports/reference_adapter_regression.json
```

Interface-contract-only hard gate:

```bash
python3 tools/run_reference_adapter_regression.py --dataset data/smoke --split val --max-images 2 --profile micro --score-gate-mode off --perf-gate-mode off --runtime-lock requirements-ci.lock --enforce-runtime-lock --enforce-weights-hash --baseline baselines/reference_adapter/rtdetr_pose_smoke_val.json --output reports/reference_adapter_regression_contract.json
```

Behavior-only warn gate:

```bash
python3 tools/run_reference_adapter_regression.py --dataset data/smoke --split val --max-images 2 --profile micro --schema-gate-mode off --consistency-gate-mode off --score-gate-mode warn --perf-gate-mode warn --runtime-lock requirements-ci.lock --enforce-runtime-lock --baseline baselines/reference_adapter/rtdetr_pose_smoke_val.json --output reports/reference_adapter_regression_behavior.json
```

```bash
python3 -m pip install -r requirements-test.txt
# optional: mirror CI recommended tier (pinned runtime)
python3 -m pip install -r requirements-ci.lock
python3 -m pip install -e .
python3 tools/yolozu.py --help
python3 -m unittest -q
```

If you want the optional demo dependencies in a source checkout:
```bash
python3 -m pip install -e '.[demo]'
```

Single-command release automation (no required options):

```bash
bash release.sh
```

release.sh auto-selects Python in this order:

- `$YOLOZU_PYTHON` (if set)
- `./.venv/bin/python`
- `python3` in PATH
- `python` in PATH
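The same selection order can be sketched in Python (illustrative only; `release.sh` implements it in shell):

```python
import os
import shutil
from pathlib import Path

# Illustrative rendering of the documented order: $YOLOZU_PYTHON, then
# ./.venv/bin/python, then python3/python on PATH. Not the actual release.sh.
def pick_python(env, venv_exists, which=shutil.which):
    if env.get("YOLOZU_PYTHON"):
        return env["YOLOZU_PYTHON"]
    if venv_exists:
        return "./.venv/bin/python"
    for name in ("python3", "python"):
        path = which(name)
        if path:
            return path
    return None

print(pick_python(dict(os.environ), Path(".venv/bin/python").exists()))
```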
Auto bump policy (current X.Y.Z -> next version):

- small change: `X.Y.(Z+1)` (e.g. `1.1.1` -> `1.1.2`)
- medium change: `X.(Y+1).0` (e.g. `1.1.1` -> `1.2.0`)
- large change: `(X+1).0.0` (e.g. `1.1.1` -> `2.0.0`)
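The bump rules above amount to a small pure function. A sketch (illustrative; `release.sh` decides the change size itself):

```python
# Sketch of the auto bump policy: small → patch, medium → minor,
# large → major. Illustrative only, not the release.sh implementation.
def next_version(current, change):
    x, y, z = (int(part) for part in current.split("."))
    if change == "small":
        return f"{x}.{y}.{z + 1}"
    if change == "medium":
        return f"{x}.{y + 1}.0"
    if change == "large":
        return f"{x + 1}.0.0"
    raise ValueError(f"unknown change size: {change}")

print(next_version("1.1.1", "small"))   # 1.1.2
print(next_version("1.1.1", "medium"))  # 1.2.0
print(next_version("1.1.1", "large"))   # 2.0.0
```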
Dry-run preview:
```bash
bash release.sh --dry-run --allow-dirty --allow-non-main --output reports/release_report.dry_run.json
```

MCP settings check (manifest + generated MCP/Actions references):
```bash
python3 tools/check_mcp_settings.py --output reports/mcp_settings_check.json
```

Ultralytics/DETR support (trainer/repo/export 3-layer helpers):
```bash
python3 tools/support_ultralytics_detr.py ls -j
python3 tools/support_ultralytics_detr.py tu -P smoke -n -o reports/support_ultralytics_detr.train_ultralytics.json
python3 tools/support_ultralytics_detr.py th -P smoke -n -o reports/support_ultralytics_detr.train_hf_detr.json
python3 tools/support_ultralytics_detr.py eo -P smoke -o models/yolo11n.onnx -n -r reports/support_ultralytics_detr.export_onnx.json
```

See: docs/ultralytics_detr_support.md
Printable manual source: manual/.
- Contact: develop@toppymicros.com
- © 2026 ToppyMicroServices OÜ
Full support/legal: docs/support.md.
Code in this repository is licensed under the Apache License, Version 2.0. See LICENSE.
