Sanitizer Performance Benchmark
Iterations: 1 warmup + 40 measured |
|
I will review after 4pm because my token is used up... |
|
It's not done yet 😮💨. You can skim it if you'd like, but this is likely to change some amount before it's ready for review. By the way, we may need to fix up code quality in another PR because jialiang needs this ready soon. |
…ools/triton-viz into nki-beta-2-pt2
Which PR? |
Oh, I guess I didn't say. This PR is kinda messy, although I'm cleaning it up a bit and I don't think we'll need a separate PR (this'll stay messy, but it's not core functionality so it's probably fine). |
Code review
Found 4 issues:
triton-viz/triton_viz/core/nki_beta2.py Lines 656 to 662 in df6dbc8
triton-viz/triton_viz/core/nki_beta2.py Lines 759 to 770 in df6dbc8
triton-viz/triton_viz/core/trace.py Lines 174 to 180 in df6dbc8
Lines 61 to 64 in df6dbc8
🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
|
@latentCall145 Claude has given some comments but with low confidence / priority. Feel free to disregard them. |
|
suggestions were pretty reasonable btw |
|
@claude review? |
…ools/triton-viz into nki-beta-2-pt2
|
@codex review. |
|
[P1] Register the beta2 ops used by the new demos When |
@latentCall145 Can you verify this issue from codex-cli? Not sure if it's hallucination or real problem. |
Code review
Found 3 issues:
triton-viz/triton_viz/core/trace.py Lines 312 to 315 in 7a4d6f7 vs exact matching in triton-viz/triton_viz/core/patch.py Lines 312 to 322 in 7a4d6f7
triton-viz/tests/nki/test_nki_beta2.py Lines 7 to 9 in 7a4d6f7
Lines 61 to 64 in 7a4d6f7
🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
It’s correct, but we don’t trace these operations in Triton either (they’re basically just elementwise ops), so it's not an issue for now. Also, I’m planning to connect the NKI Beta 2 interpreter to triton-viz in another PR. |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7a4d6f7f02
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
  else:
-     self.interpreter_fn = NKIInterpretedFunction(kernel)
+     self.interpreter_fn = nki_fn_cls(kernel)
      self.func = kernel
Restore GenericKernel unwrapping for nki backend
For backend="nki", this branch now wraps any non-interpreted object directly (self.interpreter_fn = nki_fn_cls(kernel)), but NKIInterpretedFunction.run expects to operate on the underlying Python kernel function (it inspects/calls self.fn as a function). Passing a compiled NKI GenericKernel here no longer uses kernel.func as before, which can bypass interpreter execution or fail argument/source handling when users trace jitted kernels; please unwrap compiled kernels to their .func before constructing the interpreter wrapper.
Useful? React with 👍 / 👎.
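The unwrapping that the comment above asks for can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeGenericKernel` and `FakeInterpretedFunction` are hypothetical stand-ins for the compiled NKI kernel type and `NKIInterpretedFunction`, and `make_interpreter` is a made-up helper name, not code from this PR. The only detail taken from the review is that a compiled kernel exposes its Python function as `.func`.

```python
class FakeGenericKernel:
    """Hypothetical stand-in for a compiled NKI kernel that exposes `.func`."""

    def __init__(self, func):
        self.func = func


class FakeInterpretedFunction:
    """Hypothetical stand-in for NKIInterpretedFunction; it calls `self.fn`."""

    def __init__(self, fn):
        self.fn = fn

    def run(self, *args, **kwargs):
        return self.fn(*args, **kwargs)


def make_interpreter(kernel, fn_cls=FakeInterpretedFunction):
    """Build an interpreter wrapper around the plain Python kernel function."""
    if isinstance(kernel, fn_cls):
        # Already wrapped: reuse it as-is.
        return kernel
    if isinstance(kernel, FakeGenericKernel):
        # Compiled kernel: unwrap to the underlying Python function first.
        kernel = kernel.func
    return fn_cls(kernel)


def add(a, b):
    return a + b


interp = make_interpreter(FakeGenericKernel(add))
print(interp.fn is add)
```

Checking the concrete type, rather than probing for any `.func` attribute, avoids accidentally unwrapping unrelated objects such as `functools.partial`, which also has a `.func` attribute.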
triton_viz/core/trace.py
Outdated
    kwargs.pop("warmup", None)
    grid = kwargs.pop("grid", None)
    nki.trace(self.func, grid=grid, platform_target="trn1").specialize(
Do not hard-code beta2 pre-trace to trn1
The beta2 pre-trace path always calls nki.trace(..., platform_target="trn1") and provides no way for callers to override the target, so kernels that are valid on newer NeuronCore targets can be rejected before interpreter execution when pre_trace=True (the default). This creates false failures for trn2/gen4 workflows even when the interpreter path itself supports those semantics; the platform target should be configurable or derived instead of fixed to trn1.
Useful? React with 👍 / 👎.
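One way to make the target configurable, as the comment above suggests, is to resolve it from the call's keyword arguments with a fallback. The sketch below is an assumption-laden illustration, not the PR's implementation: the helper name `resolve_platform_target`, the `platform_target` keyword, and the `TRITON_VIZ_NKI_PLATFORM_TARGET` environment variable are all hypothetical. Only the `"trn1"` default and the `platform_target=` argument to `nki.trace` come from the snippet under review.

```python
import os

# "trn1" matches the value currently hard-coded in the snippet under review.
DEFAULT_PLATFORM_TARGET = "trn1"

# Hypothetical environment variable name, used only for this illustration.
ENV_VAR = "TRITON_VIZ_NKI_PLATFORM_TARGET"


def resolve_platform_target(kwargs, env=os.environ):
    """Pick the pre-trace target: explicit kwarg, then env var, then default.

    The key is popped from `kwargs` so it is not forwarded to the kernel,
    mirroring how "warmup" and "grid" are popped in the original code.
    """
    target = kwargs.pop("platform_target", None)
    if target is None:
        target = env.get(ENV_VAR, DEFAULT_PLATFORM_TARGET)
    return target


launch_kwargs = {"grid": (2,), "platform_target": "trn2"}
target = resolve_platform_target(launch_kwargs, env={})
print(target, launch_kwargs)
```

The resolved value would then replace the literal in the existing call, for example `nki.trace(self.func, grid=grid, platform_target=target)`.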
@latentCall145 Those issues are from Claude Code with a confidence of 75. The default threshold is 80, so they weren't published previously. You may also want to take a look. |
|
@latentCall145 Did you find Claude's findings useful? If not, I'm good to stamp it. |
Summary
Adds core NKI Beta 2 examples (elementwise add, tiled matmul, RoPE, attention, RMSNorm, softmax, and MLP kernels), the interpreter functions needed to handle them, and unit tests that check functional parity with the NKI docs.
Test Plan
examples/nki_beta to make sure they run on the trn1.2xlarge node and also on the interpreter
main
Related Issues
#296
Breaking Changes
N/A
Checklist
npm run build:frontend if the PR modified any TypeScript code.