refactor: migrate pl.at(optimization=) to optimizations=[pl.auto_chunk] by lyfne123 · Pull Request #373 · hw-native-sys/pypto-lib

lyfne123 · 2026-05-25T09:35:27Z

Summary

pypto#1504 removed the deprecated pl.at(optimization=, split=) kwargs and the chunked_loop_optimizer sentinel. This migrates every callsite in pypto-lib to the supported optimizations=[pl.auto_chunk] form and refreshes stale comments.

18 files: examples + qwen3/deepseek/kimi/milm kernels
No split= usage existed, so all become a plain optimizations=[pl.auto_chunk]
pl.auto_chunk itself emits a deprecation warning but still works; it is kept so the examples remain runnable
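
One way to confirm that no call site still uses the removed keywords is to scan the sources with Python's `ast` module. This is a minimal sketch, not the tooling used in this PR: the helper name `find_removed_kwargs` is hypothetical, and the `level=pl.Level.CORE_GROUP` argument in the sample is illustrative only. Only the names `pl.at`, `optimization`, `split`, `optimizations`, `pl.chunked_loop_optimizer`, and `pl.auto_chunk` come from the description above.

```python
import ast

# Keyword names that pypto#1504 removed from pl.at, per the PR summary.
REMOVED_KWARGS = {"optimization", "split"}


def find_removed_kwargs(source: str) -> list[tuple[int, str]]:
    """Return (line, kwarg) pairs for pl.at(...) calls using a removed kwarg."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only attribute calls spelled exactly `pl.at(...)`.
        is_pl_at = (
            isinstance(func, ast.Attribute)
            and func.attr == "at"
            and isinstance(func.value, ast.Name)
            and func.value.id == "pl"
        )
        if not is_pl_at:
            continue
        for kw in node.keywords:
            if kw.arg in REMOVED_KWARGS:
                hits.append((node.lineno, kw.arg))
    return sorted(hits)


# Hypothetical snippet; the `level=` argument is an assumption for illustration.
sample = (
    "with pl.at(level=pl.Level.CORE_GROUP, optimization=pl.chunked_loop_optimizer):\n"
    "    pass\n"
    "with pl.at(level=pl.Level.CORE_GROUP, optimizations=[pl.auto_chunk]):\n"
    "    pass\n"
)
print(find_removed_kwargs(sample))  # [(1, 'optimization')]
```

Because the scan only parses the code and never imports it, it can run on files that do not import cleanly against the new pypto.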

Test plan

lint: English-only + headers pass
golden harness on examples/models compiles & runs

coderabbitai · 2026-05-25T09:35:35Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e33f0916-ef65-4cd1-ab28-410ab04ddf9c

📥 Commits

Reviewing files that changed from the base of the PR and between 09a33f0 and 5a2b095.

📒 Files selected for processing (15)

examples/advanced/gemm_eltwise.py
examples/beginner/hello_world.py
examples/beginner/matmul.py
examples/intermediate/gemm.py
examples/intermediate/layer_norm.py
examples/intermediate/rope.py
examples/intermediate/softmax.py
models/deepseek/v3_2/deepseek_v3_2_prefill_front_draft.py
models/deepseek/v4/decode_indexer.py
models/deepseek/v4/hc_post.py
models/deepseek/v4/qkv_proj_rope.py
models/kimi/kimi_k2_decode_draft.py
models/milm/milm_decode_draft.py
models/qwen3/14b/qwen3_14b_l3_generate.py
models/qwen3/32b/qwen3_32b_prefill_draft.py

📝 Walkthrough

Walkthrough

This PR replaces PyPTO's singular pl.at(..., optimization=pl.chunked_loop_optimizer) keyword with the list-based pl.at(..., optimizations=[pl.auto_chunk]) form across the examples and model kernels, so every migrated call site passes the loop-optimization hint the same way.
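
The replacement described here is mechanical enough to express as a textual rewrite. The sketch below assumes the old hint always appears as `optimization=pl.chunked_loop_optimizer` on one line; `migrate_hint` is a hypothetical helper, not something shipped in pypto-lib, and the `level=` argument in the sample is illustrative.

```python
import re

# Matches the removed singular keyword bound to the removed sentinel.
OLD_HINT = re.compile(r"\boptimization\s*=\s*pl\.chunked_loop_optimizer\b")
NEW_HINT = "optimizations=[pl.auto_chunk]"


def migrate_hint(source: str) -> tuple[str, int]:
    """Rewrite old hints textually; return the new source and the edit count."""
    return OLD_HINT.subn(NEW_HINT, source)


# Hypothetical call site; only the hint spelling is taken from the PR.
before = "with pl.at(level=pl.Level.CORE_GROUP, optimization=pl.chunked_loop_optimizer):\n"
after, count = migrate_hint(before)
print(after, end="")
print(count)
```

The rewrite is idempotent, since the new spelling no longer matches the pattern. It does not update comments or docstrings that mention `chunked_loop_optimizer`, and the later commits in this PR show that the change is not always behavior-preserving, so each rewritten file still needs to be compiled and run.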

Changes

Loop optimization hint unification

Layer / File(s)	Summary
Documentation and docstring updates `examples/advanced/gemm_eltwise.py`	Module-level comments and function docstrings describing the optimizer configuration are updated to reference `auto_chunk` instead of `chunked_loop_optimizer`.
Beginner and intermediate example programs `examples/beginner/hello_world.py`, `examples/beginner/matmul.py`, `examples/intermediate/gemm.py`, `examples/intermediate/layer_norm.py`, `examples/intermediate/rope.py`, `examples/intermediate/softmax.py`	Six example programs update their core-group `pl.at` contexts to use the new optimizer hint format via single-line parameter changes.
Single-region model kernel updates `models/deepseek/v3_2/deepseek_v3_2_prefill_front_draft.py`, `models/deepseek/v4/decode_attention_hca.py`, `models/deepseek/v4/hc_post.py`, `models/kimi/kimi_k2_decode_draft.py`, `models/milm/milm_decode_draft.py`	Deepseek v3_2, v4 (hca, hc_post), Kimi, and MiLM decode programs update their individual core-group scheduling regions via straightforward one-line parameter changes.
Multi-region models with updated comments `models/deepseek/v4/decode_attention_swa.py`, `models/deepseek/v4/qkv_proj_rope.py`, `models/deepseek/v4/decode_indexer.py`	Deepseek v4 attention SWA (three KV cache and assembly regions), qkv_proj_rope (RMSNorm partial-sum paths), and decode_indexer (score_quant path) each switch multiple `pl.at` contexts and update related inline documentation to reflect `auto_chunk` usage.
Large-scale Qwen3 model refactoring `models/qwen3/14b/qwen3_14b_l3_generate.py`, `models/qwen3/32b/qwen3_32b_prefill_draft.py`	Qwen3 14B and 32B models update all core-group `pl.at` directives across prefill and decode paths (Q/K/V projection, normalization, padding, RoPE, attention matmul/softmax, and MLP stages) to use the new `optimizations=[pl.auto_chunk]` format.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

hw-native-sys/pypto-lib#332: Both PRs modify models/deepseek/v4/qkv_proj_rope.py by replacing pl.at(..., optimization=pl.chunked_loop_optimizer) hints—this PR switches to optimizations=[pl.auto_chunk] while the related PR refactors the qkv_proj_rope scope structure.
hw-native-sys/pypto-lib#350: Both PRs modify models/deepseek/v4/decode_indexer.py in the score_quant path—this PR updates the CORE_GROUP loop scheduling hint to pl.auto_chunk while the related PR refactors the scoring computation logic.
hw-native-sys/pypto-lib#276: Both PRs update Qwen3-14B pl.at directives across the same core-group regions in prefill and decode paths; this PR switches the optimizer strategy while the related PR adds name_hint labeling.

Poem

🐰 With optimizer hints now crystalline clear,
From chunked loops to auto-chunks we steer,
Across all examples and models so grand,
A unified PyPTO, refactored so planned! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.69% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
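
For context on the figure above, docstring coverage is usually the share of definitions that carry a docstring. The sketch below is one plausible way to compute it and is an assumption, not CodeRabbit's documented method: it counts functions, async functions, and classes, and ignores the module docstring. The helper name `docstring_coverage` is hypothetical.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Percentage of function and class definitions that carry a docstring."""
    tree = ast.parse(source)
    defs = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not defs:
        return 100.0  # nothing to document
    documented = sum(1 for node in defs if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(defs)


# Five definitions (a, b, c, D, e), of which only `a` is documented.
sample = '''
def a():
    """Documented."""

def b():
    pass

def c():
    pass

class D:
    def e(self):
        pass
'''
print(docstring_coverage(sample))  # 20.0
```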

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main change: migrating deprecated pl.at(optimization=) syntax to the new optimizations=[pl.auto_chunk] form across the codebase.
Description check	✅ Passed	The description is directly related to the changeset, explaining the migration rationale, scope (18 files), technical details, and test plan.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


gemini-code-assist

Code Review

This pull request performs a widespread refactor to replace the chunked_loop_optimizer with auto_chunk across multiple example scripts and model implementations. The changes primarily involve updating the pl.at context manager to use the optimizations list parameter instead of the single optimization parameter, along with corresponding updates to docstrings and comments. I have no feedback to provide as there were no review comments.

…izations=[pl.auto_chunk]

pypto#1504 removed the pl.at(optimization=, split=) kwargs and the chunked_loop_optimizer sentinel. Switch all callsites to the supported optimizations=[pl.auto_chunk] form and update stale comments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

auto_chunk requests a ~27 GB static arena for rms_norm's two-pass manually-chunked kernel under the pinned pto-isa, failing CI runtime. softmax/layer_norm migrate cleanly (single full-hidden tile); rms_norm is the only example already manually chunked, so revert it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
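
To make "two-pass manually-chunked" concrete, here is a plain-Python sketch of the general pattern. It is not the repository's rms_norm kernel, which is not shown in this PR; the function name, the `eps` default, and the list-based data layout are all assumptions made for illustration.

```python
import math


def rms_norm_two_pass(x, weight, chunk, eps=1e-6):
    """RMSNorm over one row, touching `chunk` elements at a time."""
    hidden = len(x)
    # Pass 1: accumulate the sum of squares chunk by chunk.
    sq_sum = 0.0
    for start in range(0, hidden, chunk):
        sq_sum += sum(v * v for v in x[start:start + chunk])
    inv_rms = 1.0 / math.sqrt(sq_sum / hidden + eps)
    # Pass 2: scale each chunk by the shared inverse RMS and its weights.
    out = []
    for start in range(0, hidden, chunk):
        xs = x[start:start + chunk]
        ws = weight[start:start + chunk]
        out.extend(v * inv_rms * w for v, w in zip(xs, ws))
    return out


row = [1.0, 2.0, 3.0, 4.0]
ones = [1.0] * len(row)
chunked = rms_norm_two_pass(row, ones, chunk=2)
whole = rms_norm_two_pass(row, ones, chunk=len(row))
print(all(math.isclose(a, b) for a, b in zip(chunked, whole)))
```

The hidden dimension is already tiled by hand in both passes, which is the situation the commit message describes; by its account, softmax and layer_norm avoid the problem because they work on a single full-hidden tile.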

…mizer

decode_attention_hca/swa wrap pl.at inside an outer pl.range with explicit chunk= args; auto_chunk re-chunks the loop and gives a ~34% simulator mismatch on x_out (the device run passes). This is the same already-manually-chunked case as rms_norm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gemini-code-assist Bot reviewed May 25, 2026


lyfne123 and others added 2 commits May 26, 2026 10:57

lyfne123 force-pushed the refactor/auto-chunk-kwarg branch from 09a33f0 to 2ea0c63 on May 26, 2026 02:57

lyfne123 mentioned this pull request May 26, 2026

refactor: drop deprecated pl.auto_chunk / chunked_loop_optimizer #372

Closed

zhangqi-chen merged commit a71551e into hw-native-sys:main May 26, 2026
7 checks passed

This was referenced May 26, 2026

refactor: migrate rms_norm + dsv4 attention to auto_chunk #386

Open

[Feature] Remove chunked_loop_optimizer from dsv4 attention (hca/swa), migrate to auto_chunk #388

Open

coderabbitai Bot mentioned this pull request May 26, 2026

chore(dsv4): migrate chunked_loop_optimizer to auto_chunk (#388) #389

Open

zhangqi-chen merged 3 commits into hw-native-sys:main from lyfne123:refactor/auto-chunk-kwarg