Summary
src/opentau/policies/ has accumulated significant copy-paste between policy families. Each new policy (pi0 → pi05 → pi05_mem → pi06 → pi07 → pi07_paligemma) has been added as a fresh subdirectory rather than by extending shared modules, so utilities like video_encoder.py, paligemma_with_expert.py, and gemma3_with_expert.py exist as near-identical forks. There are no shared base classes; each modeling_*.py re-implements the same set of prepare_* / embed_* / sample_actions / denoise_step methods.
This issue audits the duplication, quantifies it, and proposes concrete deduplication targets. Roughly 7-8k lines could be unified.
Findings (quantified)
Numbers are wc -l of each file and diff A B | wc -l (raw diff line count, lower = more similar).
| # | Files | Lines (A / B) | Diff lines | Notes |
|---|-------|---------------|------------|-------|
| 1 | pi05_mem/video_encoder.py vs pi07/low_level_planner/video_encoder.py | 460 / 505 | ~115 | ~88% identical. Diff is mostly docstring (PI05Mem* vs PI07*), one contextmanager import, and the suppress_spacetime_temporal helper that pi07 adds. |
| 2 | pi06/gemma3_with_expert.py vs pi07/gemma3_with_expert.py | 861 / 919 | ~100 | ~94% identical. Diff is one new config flag (load_pretrained_gemma3), updated docstrings (π0.6 → π0.7), and a _multi_modal_projector lookup helper. |
| 3 | pi0/paligemma_with_expert.py vs pi05/paligemma_with_expert.py | | | |
| 4 | pi07/low_level_planner/modeling_pi07_low_level.py vs pi07_paligemma/low_level_planner/modeling_pi07_low_level.py | 1879 / 1744 | ~636 | Vision-encoder swap (SpaceTimeSiglip ↔ V-JEPA2) + class-name changes. See also #210, #211, #192. |
| 5 | pi07/high_level_planner/modeling_pi07_high_level.py vs pi07_paligemma/high_level_planner/modeling_pi07_high_level.py | 1487 / 1440 | ~391 | Same situation as #4 but for the high-level planner. |
| 6 | pi05/modeling_pi05.py vs pi05_mem/modeling_pi05.py | 1733 / 1194 | ~1570 | pi05_mem is an intentional memory variant and the biggest divergence of the six, but the shared scaffolding (forward, prepare_*, sample_actions) is still copy-paste. |
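For anyone re-running the measurement without shell tools, the following is a minimal Python sketch of a comparable metric. It is an approximation, not the method used for the table: it counts added plus removed lines from difflib opcodes, so its numbers will not match `diff A B | wc -l` exactly (that count also includes hunk headers and separators). The function name `similarity_report` and the toy inputs are made up for illustration.

```python
import difflib


def similarity_report(a_text: str, b_text: str) -> dict:
    """Compare two source files line by line."""
    a_lines = a_text.splitlines()
    b_lines = b_text.splitlines()
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines, autojunk=False)

    # Count every line that is removed from A or added in B.
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += (i2 - i1) + (j2 - j1)

    return {
        "lines_a": len(a_lines),
        "lines_b": len(b_lines),
        "changed_lines": changed,
        "similarity": matcher.ratio(),  # 1.0 means identical line sequences
    }


# Toy stand-ins for two forked files.
fork_a = "import torch\nclass Encoder:\n    pass\n"
fork_b = "import torch\nclass Encoder:\n    pass\ndef helper():\n    pass\n"
report = similarity_report(fork_a, fork_b)
```

In practice the two strings would come from `pathlib.Path(...).read_text()` on the file pair being compared.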
Method-level duplication in modeling_*.py
Every flow-matching policy reimplements the same surface. Grepping method signatures across pi0, pi05, pi06, and pi07/low_level_planner shows the same method set in each policy class and again inside each *FlowMatching submodule.
The bodies of sample_noise, sample_time, select_action, predict_action_chunk, and denoise_step differ only trivially across policies.
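A survey like this can also be done structurally instead of with grep. The sketch below parses each file with the standard ast module and reports the method names that every file defines. The helper names (`methods_by_class`, `shared_methods`) and the two toy class sources are hypothetical and stand in for the real modeling_*.py files.

```python
import ast


def methods_by_class(source: str) -> dict[str, set[str]]:
    """Map each class defined in `source` to the names of its methods."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            result[node.name] = {
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            }
    return result


def shared_methods(sources: dict[str, str]) -> set[str]:
    """Return method names defined by at least one class in every file."""
    per_file = []
    for source in sources.values():
        names = set()
        for methods in methods_by_class(source).values():
            names |= methods
        per_file.append(names)
    return set.intersection(*per_file) if per_file else set()


# Toy sources; the real inputs would be the text of each modeling_*.py.
toy_a = (
    "class PolicyA:\n"
    "    def sample_noise(self): ...\n"
    "    def denoise_step(self): ...\n"
    "    def embed_prefix(self): ...\n"
)
toy_b = (
    "class PolicyB:\n"
    "    def sample_noise(self): ...\n"
    "    def denoise_step(self): ...\n"
    "    def embed_tokens(self): ...\n"
)
common = shared_methods({"a.py": toy_a, "b.py": toy_b})
```

A name-level match only shows that the surface is shared. Confirming that the bodies are near-identical still needs a diff of the method sources.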
Bonus finding: class-name typo
pi07_paligemma/low_level_planner/configuration_pi07_low_level.py:38 defines PI07lowlevelPlannerConfig (lowercase "lowlevel"), while pi07/low_level_planner/configuration_pi07_low_level.py:40 defines PI07LowLevelPlannerConfig. factory.py:50 papers over this with an as alias. This is a downstream symptom of fork-and-edit duplication.
Cross-import map
grep -rn "from opentau.policies\." src/opentau/policies/ shows each policy directory only imports from itself and from policies/{pretrained,normalize,utils,factory}. There is zero sharing between sibling policy folders — the only cross-policy import is pi07/gemma3_with_expert.py lazily importing pi07/low_level_planner/video_encoder.py. This silo structure is what causes the duplication to grow with each new policy.
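The same map can be built programmatically, which makes it easy to turn into a CI check later. The sketch below is an assumption-laden illustration: it takes file contents keyed by their path relative to src/opentau/policies/, treats the first path component as the policy folder, and only looks at `from opentau.policies.<name> import ...` statements, mirroring the grep pattern above. The function name and the toy inputs are hypothetical.

```python
import ast
from collections import defaultdict

PREFIX = "opentau.policies."
# Shared modules named in the issue; imports of these are not cross-policy.
SHARED = {"pretrained", "normalize", "utils", "factory"}


def cross_policy_imports(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each policy folder to the sibling policy folders it imports from."""
    policies = {p.split("/")[0] for p in sources if "/" in p} - SHARED
    edges = defaultdict(set)
    for path, text in sources.items():
        owner = path.split("/")[0]
        if owner not in policies:
            continue  # skip shared modules and top-level files
        # ast.walk also reaches lazy imports nested inside functions.
        for node in ast.walk(ast.parse(text)):
            if not isinstance(node, ast.ImportFrom) or not node.module:
                continue
            if not node.module.startswith(PREFIX):
                continue
            target = node.module[len(PREFIX):].split(".")[0]
            if target in policies and target != owner:
                edges[owner].add(target)
    return dict(edges)


# Toy inputs; module and symbol names here are made up.
toy_sources = {
    "pi0/modeling.py": "from opentau.policies.pretrained import Base\n",
    "pi05/modeling.py": (
        "def build():\n"
        "    from opentau.policies.pi0.expert import Expert\n"
        "    return Expert\n"
    ),
}
edges = cross_policy_imports(toy_sources)
```

An empty result corresponds to the fully siloed layout described above. Once shared modules exist, the same function can confirm that policies import from the shared location rather than from each other.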
Proposed deduplication targets
Ordered by ROI (lines saved / behavior risk):
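1. shared/video_encoder.py: extract the SpaceTime-SigLIP wrapper used by both pi05_mem and pi07/low_level_planner. pi07's suppress_spacetime_temporal context manager is a strict superset of pi05_mem's behavior, so pi05_mem can adopt it for free. Smallest, lowest-risk win (~460 lines deleted).
2. shared/gemma3_with_expert.py: fold the pi06 and pi07 versions together. The pi07 superset is the load_pretrained_gemma3 flag plus a vision/projector-locator helper; both are safe additions for pi06. ~860 lines deleted.
3. shared/paligemma_with_expert.py: fold the pi0 and pi05 versions together; pi05's discrete-action and AdaRMS additions are gated by config flags that pi0 simply doesn't set. ~690 lines deleted.
4. pi07_paligemma removal/merge: already tracked in #211 (pi07_paligemma: re-implement V-JEPA2 video encoder, or delete the policy). Extracting (1)–(3) makes that merge trivial: pi07_paligemma becomes a config variant (vision encoder choice) of pi07, not a forked codebase.
5. BaseFlowMatchingPolicy / BaseFlowMatchingExpert mixins: pull sample_noise, sample_time, select_action, predict_action_chunk, denoise_step and the standard prepare_* skeleton into base classes. Subclasses override only what genuinely differs (vision tower, action head, prefix/suffix layout). Largest payoff but highest risk; it should land after the byte-equivalence regression tests from #226 (pi07 low-level planner forward is not byte-equivalent to pi05 when all optional inputs are dropped) and #230 (pi06: align prompt template + add future byte-identity tests against pi07 planners) are in place, so we can prove the refactor is identity-preserving.
6. BaseVLMPolicyConfig: configuration_pi0.py, configuration_pi05.py, configuration_pi06.py share most fields (vision/state/action shapes, optimizer block, normalization mapping). A shared dataclass base with policy-specific subclasses overriding only divergent fields would shrink each config to <50 lines.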
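To make the BaseVLMPolicyConfig idea concrete, here is a rough sketch of what a shared dataclass base with thin subclasses could look like. Every field name, default value, and subclass name below is a placeholder invented for illustration; the real fields would be whatever the existing configuration_*.py files have in common.

```python
from dataclasses import dataclass, field


@dataclass
class BaseVLMPolicyConfig:
    # Placeholder fields standing in for the shared shape, optimizer,
    # and normalization settings.
    state_dim: int = 32
    action_dim: int = 32
    chunk_size: int = 50
    optimizer_lr: float = 1e-4
    normalization_mapping: dict[str, str] = field(
        default_factory=lambda: {"STATE": "MEAN_STD", "ACTION": "MEAN_STD"}
    )


@dataclass
class ExampleConfigA(BaseVLMPolicyConfig):
    # Inherits every shared field unchanged.
    pass


@dataclass
class ExampleConfigB(BaseVLMPolicyConfig):
    # Overrides one shared default and adds one policy-specific flag.
    chunk_size: int = 10
    use_discrete_actions: bool = True


cfg_a = ExampleConfigA()
cfg_b = ExampleConfigB(optimizer_lr=3e-5)
```

Giving every base field a default keeps subclass field ordering unconstrained, and `default_factory` avoids sharing one mutable mapping across config instances.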
Suggested rollout
Land 1 → 2 → 3 as independent PRs, each gated on byte-identical loss/forward output for at least one smoke config per affected policy (pi0/pi05/pi06/pi07). Defer 5 until #226 and #230 give us a regression net. 4 happens naturally as a follow-up to 1–3.
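The byte-identity gate could take roughly the following shape. This is a dependency-free sketch under stated assumptions: `legacy_forward` and `refactored_forward` are stand-ins for a policy's forward pass before and after a refactor, and the outputs are plain Python floats. A real test in this repo would presumably compare tensors directly (for example with `torch.equal`) on a fixed seed and smoke config.

```python
import hashlib
import random
import struct


def output_digest(forward, seed: int = 0, n_inputs: int = 16) -> str:
    """Hash the exact bytes of `forward`'s outputs on seeded inputs."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
    outputs = forward(inputs)
    # Pack as IEEE-754 doubles so any bit-level change alters the digest.
    payload = struct.pack(f"<{len(outputs)}d", *outputs)
    return hashlib.sha256(payload).hexdigest()


def legacy_forward(xs):
    # Stand-in for the pre-refactor computation.
    return [0.5 * x + 0.25 for x in xs]


def refactored_forward(xs):
    # Stand-in for the same computation after moving code into a shared module.
    scale, shift = 0.5, 0.25
    return [scale * x + shift for x in xs]


# The gate: the refactor must not change a single output bit.
assert output_digest(legacy_forward) == output_digest(refactored_forward)
```

Comparing digests rather than using a tolerance is deliberate: a refactor that only moves code should be bit-exact, so any difference signals an unintended behavior change.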
Related
pi07_paligemma: re-implement V-JEPA2 or delete the policy
pi07_paligemma low-level planner broken import pi07/low_level_planner → low_level