
[DO NOT MERGE][TEST ONLY] Create AGENTS.md#3281

Draft
Stonepia wants to merge 5 commits into main from tong/agents

Conversation


@Stonepia commented Apr 8, 2026

This PR is for test only

@laifenxiawucha

@laifenxiawucha 1

@mengfei25 added the disable_all label (Disable all ci test jobs for the PR, just keep basic lint check) on Apr 8, 2026
@laifenxiawucha

@laifenxiawucha review


@laifenxiawucha

@laifenxiawucha review


@laifenxiawucha

@laifenxiawucha 1


@laifenxiawucha

@laifenxiawucha review


@laifenxiawucha

Requested in this mention.

High

  • Build failure in src/ATen/native/sparse/xpu/SparseSoftmax.cpp: softmax_sparse_xpu now calls softmax_forward_sparse_xpu_kernel, which does not exist in the header or implementation files. This will fail to link.
  • API breakage in src/ATen/native/sparse/xpu/SparseSoftmax.cpp: The log_softmax_sparse_xpu function is completely removed without replacement. This is a publicly registered API (registered in yaml/native/native_functions.yaml) and will cause linking failures for any code calling it.

Medium

  • Removed safety check in src/ATen/native/transformers/Attention.cpp:129-131: The TORCH_CHECK that prevented NestedTensor with mask combinations is removed without replacement. If this combination is not supported, silently allowing it could lead to crashes or incorrect behavior.
  • Breaking API change in src/ATen/native/transformers/Attention.cpp:148-149: The dimension requirement for qkv_weight is changed from 2-D to 4-D. While the comment says "This aligns with CUDA", this breaks the existing API contract—callers passing 2-D tensors will now fail. There's no evidence that existing callers have been updated to handle 4-D tensors.

Given that this is marked [DO NOT MERGE][TEST ONLY], the changes are clearly not ready for production. The build-breaking issues alone make this non-mergeable.

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Requested in this mention.

Based on the diff, this PR contains substantial code changes beyond the stated "TEST ONLY" purpose. Here's my review:

High

  • API breakage in SparseSoftmax.cpp: The log_softmax_sparse_xpu function is entirely removed without checking for existing callers. This will break any code that was using this function.

  • Changed dimension semantics in Attention.cpp: qkv_weight dimension check changed from dim() == 2 to dim() == 4. This is a breaking change that will cause existing code to fail TORCH_CHECK. The comment "This aligns with CUDA" suggests this is intentional, but the PR description does not mention this API change.

  • Removed safety check: The TORCH_CHECK for "NestedTensor with mask is not supported yet" was removed. This could lead to undefined behavior if someone passes a NestedTensor with a mask, rather than failing with a clear error message.

Medium

  • Inconsistent kernel naming: Changed from softmax_sparse_xpu_kernel to softmax_forward_sparse_xpu_kernel. While this may be intentional (suggesting a forward-only kernel), the PR has no context on why this change is needed.

Test Coverage

  • No tests are added or modified to verify the new qkv_weight.dim() == 4 constraint.
  • No tests added to verify the removed log_softmax_sparse_xpu was unused.
  • No tests to verify the behavior when NestedTensor with mask is passed (now silently accepted but unsupported).

Note: The creation of AGENTS.md appears to be unrelated scaffolding, but the substantive code changes in sparse softmax and attention should be addressed before this could be considered.

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Requested in this mention.

Status: Request Changes

High (Must-Fix)

Missing function definition will break build

  • Problem: softmax_sparse_xpu now calls xpu::softmax_forward_sparse_xpu_kernel which has no declaration or definition in the codebase
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp line changes from xpu::softmax_sparse_xpu_kernel to xpu::softmax_forward_sparse_xpu_kernel, but src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h declares only softmax_sparse_xpu_kernel and log_softmax_sparse_xpu_kernel
  • Why it matters: This will cause a linker error and prevent the code from building
  • Suggestion: Restore the original function call or verify the intended function exists and is properly declared

API breakage: removed backward compatibility function

  • Problem: The entire log_softmax_sparse_xpu function has been removed
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp diff removes all 7 lines of the log_softmax_sparse_xpu function definition
  • Why it matters: If log_softmax_sparse_xpu is declared as a public API (even if just a wrapper), removing it without deprecation breaks backward compatibility for any downstream consumers
  • Suggestion: Verify if this function is part of the public API. If yes, use deprecation. If no, confirm it's not called anywhere else

Removed input validation guard

  • Problem: TORCH_CHECK for NestedTensor with mask is removed without explanation
  • Evidence: src/ATen/native/transformers/Attention.cpp removes 3 lines: TORCH_CHECK(!mask || !query.is_nested(), "NestedTensor with mask is not supported yet");
  • Why it matters: This could allow unsupported configurations to crash at runtime instead of failing with a clear error message
  • Suggestion: If NestedTensor with mask is now supported, add a test demonstrating this. If not, restore the guard or explain in the commit why it's safe

Contradictory API contract change

  • Problem: qkv_weight dimension check changes from 2 to 4, but the comment still says shape is [3 * D, D]
  • Evidence: src/ATen/native/transformers/Attention.cpp changes qkv_weight.dim() == 2 to qkv_weight.dim() == 4, while the comment above it still states // qkv_weight shape: [3 * D, D] which describes a 2D tensor
  • Why it matters: This creates an API contract mismatch. If qkv_weight is now 4D, the comment is misleading and the original shape description is wrong
  • Suggestion: Update the comment to reflect the new expected 4D shape, or verify if the dimension change is correct by checking the CUDA implementation you're aligning with

Medium

Insufficient verification documentation

  • Problem: The changes include a comment "This aligns with CUDA" but no evidence or verification is provided
  • Evidence: src/ATen/native/transformers/Attention.cpp adds comment // This aligns with CUDA before the dimension check change
  • Why it matters: Without citing the specific CUDA code or tests, reviewers cannot verify if the alignment claim is correct
  • Suggestion: Reference the specific CUDA implementation file and line number, or add a test that demonstrates the new dimension expectation

Residual Risk

  • Blast radius: These changes affect sparse softmax and transformer attention paths. The missing function definition will prevent building. The dimension contract change could break existing users of native_multi_head_attention_xpu who pass 2D qkv_weight.
  • Test gaps: No tests added to verify the new qkv_weight.dim() == 4 behavior or to demonstrate that NestedTensor with mask is now supported.
  • CI verification: Given the build-breaking change, CI would fail immediately, so no functional pipeline can run.

If these code changes were intentional (e.g., for testing the review process), please clarify. If they were accidental, consider reverting the code changes and keeping only the AGENTS.md addition.

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Requested in this mention.

grep -B 5 -A 10 "softmax_sparse_xpu" src/ATen/native/sparse/xpu/SparseSoftmax.cpp | head -40

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Review of PR #3281:

This PR is titled [DO NOT MERGE][TEST ONLY] but contains substantive code changes mixed with a test file. Here are the findings:

1. AGENTS.md

Not valid — The file contains a single placeholder line (This is the test pr for adding AGENTS.md for AI usage) with no actual agent configuration or useful content. If this is intended to define AI coding agent instructions for the repo, it needs real content (build instructions, test commands, code style rules, etc.).

2. SparseSoftmax.cpp — Removal of log_softmax_sparse_xpu

Not valid — The log_softmax_sparse_xpu function is deleted entirely, and softmax_sparse_xpu is renamed to call softmax_forward_sparse_xpu_kernel. If any dispatch table or native_functions.yaml entry still references log_softmax_sparse_xpu, this will cause a linker error. The PR does not show corresponding updates to the kernel header or registration, which must be verified.

3. Attention.cpp — qkv_weight.dim() check changed from 2 to 4

Not valid — The TORCH_CHECK error message still says "expected 2-D qkv_weight" but now asserts dim() == 4. This is a direct contradiction: either the message must be updated to "expected 4-D qkv_weight" or the dimension check is wrong. Additionally, the comment // This aligns with CUDA needs a reference (link to upstream CUDA code or commit) to be verifiable.

4. Attention.cpp — Removal of NestedTensor+mask guard

Not valid — Removing the TORCH_CHECK(!mask || !query.is_nested(), ...) guard without adding actual NestedTensor+mask support will silently pass invalid inputs through to kernels that cannot handle them, likely producing incorrect results or crashes. If XPU now supports this path, the PR should include corresponding test coverage.

Overall: This PR bundles unrelated changes (AGENTS.md, sparse softmax refactor, attention validation changes) into a single [TEST ONLY] PR with no tests and contradictory error messages. Should be split into separate PRs with proper test coverage.

[AI-assisted reply]

@laifenxiawucha

@laifenxiawucha please help me review the code changes in this PR

@laifenxiawucha

This PR has been reviewed extensively already, but since you asked again — the core issues remain unchanged:

Build failure: softmax_sparse_xpu now calls xpu::softmax_forward_sparse_xpu_kernel at SparseSoftmax.cpp:23, but the header src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h only declares softmax_sparse_xpu_kernel. This will cause a linker error.

Broken dispatch registration: log_softmax_sparse_xpu is entirely removed from SparseSoftmax.cpp, but it is still registered in yaml/native/native_functions.yaml. The dispatcher will reference a missing symbol at link time.

Inconsistent dimension check: Attention.cpp changes qkv_weight.dim() == 2 to qkv_weight.dim() == 4, but the error message at L148 still says "expected 2-D qkv_weight" and the comment at L128 still says // qkv_weight shape: [3 * D, D] (which is 2-D). Subsequent indexing at L152-156 uses sizes()[0] / sizes()[1] assuming 2-D layout. All three are inconsistent with a 4-D check.

Removed safety guard: The TORCH_CHECK(!mask || !query.is_nested()) guard was removed with no evidence the kernel now supports that combination. Without a test proving it works, this allows silent incorrect behavior.

Given the PR is marked [DO NOT MERGE][TEST ONLY], these are expected test artifacts — but they would all be blockers for a production merge.

[AI-assisted reply]

@laifenxiawucha

Requested in this mention.

Status: Request Changes

High (Must-Fix)

Non-existent kernel name in SparseSoftmax.cpp

  • Problem: Rename to softmax_forward_sparse_xpu_kernel calls a kernel that does not exist. The kernel declaration is softmax_sparse_xpu_kernel in src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h.
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp line: return xpu::softmax_forward_sparse_xpu_kernel(input_, dim_, half_to_float); changed from xpu::softmax_sparse_xpu_kernel.
  • Why it matters: This will cause a link error or runtime failure when softmax_sparse_xpu is invoked.
  • Suggestion: Revert the call name to xpu::softmax_sparse_xpu_kernel.

Incorrect TORCH_CHECK message for qkv_weight dimensionality

  • Problem: The error message still says "expected 2-D qkv_weight" while the check enforces 4-D, making diagnostics misleading.
  • Evidence: src/ATen/native/transformers/Attention.cpp:
TORCH_CHECK(
    qkv_weight.dim() == 4,
    "expected 2-D `qkv_weight`, got ",
    qkv_weight.dim(),
    "-D tensor");
  • Why it matters: Error messages must match the actual check to help developers debug failures.
  • Suggestion: Fix the error message to "expected 4-D qkv_weight".

Medium

Guard removal for NestedTensor with mask requires test

  • Problem: Removing the TORCH_CHECK that blocked NestedTensor with mask should be validated by a test to avoid silent regressions.
  • Evidence: src/ATen/native/transformers/Attention.cpp removed:
TORCH_CHECK(
    !mask || !query.is_nested(),
    "NestedTensor with mask is not supported yet");
  • Why it matters: Without a test, unsupported use-cases may yield incorrect results instead of an explicit error.
  • Suggestion: Add a test in test/xpu/ or test/test_ops_xpu.py for the new NestedTensor-with-mask behavior.

Suspicious qkv_weight dimensionality change not verified

  • Problem: Changing qkv_weight.dim() from 2 to 4 is a behavioral change that effectively enables a different call path. The comment "This aligns with CUDA" claims parity, but that parity is not verified here; I did not inspect the CUDA implementation.
  • Evidence: src/ATen/native/transformers/Attention.cpp changed qkv_weight.dim() == 2 to qkv_weight.dim() == 4 and added // This aligns with CUDA.
  • Why it matters: A dimensionality requirement change without context can break existing callers and deviate from the documented API contract.
  • Suggestion: Verify the signature and usage from other backends or upstream. Update the PR description to justify this change (which tensor shapes are now expected).

Low / Suggestions

Missing tests for blocked sparse softmax function removal

  • Problem: log_softmax_sparse_xpu was removed. No tests were added or updated to confirm the new behavior.
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp removed whole function log_softmax_sparse_xpu.
  • Why it matters: Code without tests is harder to validate for regressions.
  • Suggestion: Add or extend tests in test/xpu/ or test/test_ops_xpu.py to cover the relevant semantics after changes.

Incomplete PR description

  • Problem: The PR is marked "DO NOT MERGE / TEST ONLY", but the description doesn’t explain the purpose, verification steps, or blast radius.
  • Evidence: PR body: "This PR is for test only".
  • Why it matters: Reviewers need context to know why these specific changes are being made and how to verify them.
  • Suggestion: Expand the description with: what changed, why, how you verified (device/build/CI), and impact on existing callers and tests.

Residual Risk

  • The renamed softmax path will fail link/runtime if callers hit it; callers should be checked after fixing the name.
  • No test coverage exists for the new NestedTensor-with-mask path after guard removal.
  • The qkv_weight dimensionality change may break existing callers if they pass 2-D weights; verify with callers and update documentation if the API contract changes.

@laifenxiawucha

Requested in this mention.

Here is a summary of what I found before the structured review:

Issues:

  1. High Severity - Undefined function call: In SparseSoftmax.cpp, the kernel was renamed from softmax_sparse_xpu_kernel to softmax_forward_sparse_xpu_kernel, but this function doesn't exist anywhere else in the codebase.

  2. High Severity - Logic inconsistency: In Attention.cpp, changed qkv_weight.dim() == 2 to dim() == 4 with comment "aligns with CUDA", but the subsequent code logic uses qkv_weight.sizes()[0] and qkv_weight.sizes()[1], which would be incorrect for a 4D tensor. This would break the tensor logic.

  3. High Severity - Safety check removed: Removed the TORCH_CHECK for nested tensor with mask combinations without explanation or replacement.

  4. Medium Severity - Function removal: log_softmax_sparse_xpu function was removed. While no callers were found in the current search, if this is part of a public API, it could break external code.

Status: Request Changes

High (Must-Fix)

Undefined kernel function

  • Problem: Changed kernel call to softmax_forward_sparse_xpu_kernel which does not exist
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp:23 - calls xpu::softmax_forward_sparse_xpu_kernel, but grep shows this function only appears in this file (no definition in kernel headers or implementation files)
  • Why it matters: Linker error will occur at build time
  • Suggestion: Revert to softmax_sparse_xpu_kernel or provide the new kernel implementation

Inconsistent tensor dimension logic

  • Problem: Changed qkv_weight.dim() == 2 to dim() == 4 with comment "aligns with CUDA", but subsequent code uses 2D indexing (sizes()[0], sizes()[1])
  • Evidence: src/ATen/native/transformers/Attention.cpp:129 checks dim() == 4, but line 132-136 access qkv_weight.sizes()[0] and qkv_weight.sizes()[1] expecting a 2D tensor shape:
    TORCH_CHECK(D * 3 == qkv_weight.sizes()[0], ...);
    TORCH_CHECK(D == qkv_weight.sizes()[1], ...);
    
  • Why it matters: Logic expects 2D shape but validates for 4D; runtime failures or incorrect tensor access will occur
  • Suggestion: Either revert to dim() == 2 or update all downstream logic for 4D tensors

Removed safety check without justification

  • Problem: Deleted TORCH_CHECK that validated NestedTensor combinations with mask
  • Evidence: src/ATen/native/transformers/Attention.cpp:127-130 removed:
    TORCH_CHECK(!mask || !query.is_nested(),
                "NestedTensor with mask is not supported yet");
  • Why it matters: Safety guard removed for unsupported edge case; could lead to silent incorrect behavior or crashes
  • Suggestion: Restore the check or explain in description why the combination is now supported with evidence (e.g., added tests)

Medium

Removed public API function

  • Problem: log_softmax_sparse_xpu function deleted without verification of external call sites
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp:24-27 - the entire function implementation was removed
  • Why it matters: May break external code or internal registrations not visible in this workspace (dispatch tables, ATen bindings, Python wrappers)
  • Suggestion: Verify if this function is part of a public API (check PyTorch Python exports, native_functions.yaml, or test calls) before removal; if unused, document why

Low / Suggestion

None

Residual Risk

  • Kernel function rename may have duplicate unused kernel implementations if softmax_forward_sparse_xpu_kernel is meant to replace softmax_sparse_xpu_kernel but was not added
  • Dimension change in Attention.cpp suggests potential unverified CUDA parity claim—if CUDA uses 4D, the downstream logic should also match CUDA's indexing scheme; this was not verified in this review
  • No tests provided to verify new tensor shapes or removed guard checks

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Requested in this mention.

I'm unable to access the repository files to complete your review request due to a technical issue with the sandbox environment. Could you please provide the diff or the files you'd like me to review, or confirm if this is a test of the mention review bot behavior itself?

If you'd like me to review the AGENTS.md changes once the technical issue is resolved, I can provide a detailed review with file-scoped comments and line-specific findings.

@laifenxiawucha

This PR has been reviewed extensively in the comments above. The core issues remain unchanged:

Build failure: softmax_sparse_xpu calls xpu::softmax_forward_sparse_xpu_kernel which does not exist in src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h — only softmax_sparse_xpu_kernel is declared. This will fail to link.

Broken dispatch registration: log_softmax_sparse_xpu is removed from SparseSoftmax.cpp but still registered in yaml/native/native_functions.yaml as SparseXPU: log_softmax_sparse_xpu. Dispatch will fail at runtime.

Contradictory dimension check: qkv_weight.dim() == 4 check in Attention.cpp:L147 but error message still says "expected 2-D qkv_weight" and comment at L143 still says // qkv_weight shape: [3 * D, D] (which is 2-D). Subsequent indexing at L152-L156 uses sizes()[0]/sizes()[1] assuming 2-D layout.

Removed safety guard: TORCH_CHECK(!mask || !query.is_nested()) deleted without evidence that NestedTensor+mask is now supported by downstream kernels.

Given this is marked [DO NOT MERGE][TEST ONLY], these are expected test artifacts. No action needed unless this PR is intended for production.

[AI-assisted reply]

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Status: Request Changes

This PR has been reviewed multiple times. The same blocking issues persist:

  1. Build failure: softmax_sparse_xpu calls xpu::softmax_forward_sparse_xpu_kernel which is not declared in src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h — only softmax_sparse_xpu_kernel exists. This is a linker error.

  2. Broken dispatch registration: log_softmax_sparse_xpu is deleted but still registered in yaml/native/native_functions.yaml as SparseXPU: log_softmax_sparse_xpu. The backward function log_softmax_backward_sparse_xpu also still exists, creating an orphaned backward path.

  3. Contradictory dimension check: qkv_weight.dim() == 4 at Attention.cpp:147 but error message still says "expected 2-D qkv_weight" and the comment at line 143 still says // qkv_weight shape: [3 * D, D]. Subsequent indexing at lines 152-156 uses sizes()[0] and sizes()[1] which is only meaningful for the original 2-D layout.

  4. Removed safety guard: TORCH_CHECK(!mask || !query.is_nested()) deleted without adding actual NestedTensor+mask support or tests.

None of these have been addressed since first flagged. PR remains non-mergeable.

[AI-assisted reply]


@laifenxiawucha left a comment


Requested in this mention.

Status: Comment

Summary

This PR introduces build errors and semantic regressions with incomplete refactoring in sparse softmax kernels and mismatched error messaging in the attention module.

Cross-Cutting / Residual Risk

  • No tests added for the dimensionality change from 2-D to 4-D for qkv_weight in Attention.cpp. Verify existing Transformer tests pass with the new shape requirement.

  • Sparse softmax kernel naming change is incomplete—header declarations and kernel implementations are out of sync.

  • Verify whether rename from softmax_sparse_xpu_kernel to softmax_forward_sparse_xpu_kernel is intentional and propagated to all registration files (e.g., yaml/xpu_functions.yaml).

  • src/ATen/native/sparse/xpu/SparseSoftmax.cpp (L25-L29 (removed)): Medium : Removed wrapper function without API analysis. Kept in the summary instead of an inline review comment because the referenced line is not a changed RIGHT-side line in the PR diff.

  • src/ATen/native/transformers/Attention.cpp (L129-L131 (removed)): Medium : Removed guard check without verification. Kept in the summary instead of an inline review comment because the referenced line is not a changed RIGHT-side line in the PR diff.

  const int64_t dim_,
  const bool half_to_float) {
-   return xpu::log_softmax_sparse_xpu_kernel(input_, dim_, half_to_float);
+   return xpu::softmax_forward_sparse_xpu_kernel(input_, dim_, half_to_float);


High: Call to non-existent kernel function

  • Problem: xpu::softmax_forward_sparse_xpu_kernel is called but this function does not exist.
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp:23 calls xpu::softmax_forward_sparse_xpu_kernel, but src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h only declares softmax_sparse_xpu_kernel, not softmax_forward_sparse_xpu_kernel.
  • Why it matters: This will cause a linker error at build time.
  • Suggestion: Either create the missing kernel function or revert to calling the existing softmax_sparse_xpu_kernel.

+ // This aligns with CUDA
  TORCH_CHECK(
-     qkv_weight.dim() == 2,
+     qkv_weight.dim() == 4,


High: Error message mismatched to dimensionality check

  • Problem: Error message says "expected 2-D" but the check validates dim() == 4.
  • Evidence: src/ATen/native/transformers/Attention.cpp:146 has qkv_weight.dim() == 4 check but the error message on L147 says "expected 2-D 'qkv_weight'".
  • Why it matters: Users will see confusing error messages that don't match the actual requirement, making debugging harder.
  • Suggestion: Update error message to "expected 4-D qkv_weight" to match the new check.

@laifenxiawucha

@laifenxiawucha review


@laifenxiawucha

Review requested in this mention.

Based on the xpu-pr-review skill and reviewing both changed files:

Summary
This PR makes two API surface changes: it renames a softmax sparse kernel call that does not exist, and it removes the log_softmax_sparse_xpu forward function while leaving its backward path intact. Both changes will cause linker or runtime failures. Additionally, an error message in Attention.cpp is now incorrect after a dimension check change. The PR description does not explain any of these changes or their verification strategy.

src/ATen/native/sparse/xpu/SparseSoftmax.cpp

High: Kernel function name will fail to link

  • Line / Scope: L23
  • Problem: Call references softmax_forward_sparse_xpu_kernel but the kernel is still declared and defined as softmax_sparse_xpu_kernel in the base branch
  • Evidence: src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h:11 declares softmax_sparse_xpu_kernel, not softmax_forward_sparse_xpu_kernel; src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.cpp:512 defines softmax_sparse_xpu_kernel with the original name
  • Why it matters: Linker error at build time
  • Suggestion: Either rename the kernel function in SparseSoftmaxKernels.h and SparseSoftmaxKernels.cpp, or revert to the existing softmax_sparse_xpu_kernel name

High: Removing forward implementation breaks backward compatibility

  • Line / Scope: L26-L28
  • Problem: log_softmax_sparse_xpu is deleted, but log_softmax_backward_sparse_xpu remains in the same file
  • Evidence: src/ATen/native/sparse/xpu/SparseSoftmax.cpp:41-46 still defines and calls log_softmax_backward_sparse_xpu_kernel in the backward path, creating a broken forward/backward pair
  • Why it matters: Removing a public entrypoint without deprecation or coverage of downstream callers is a BC break; runtime error or undefined behavior may occur
  • Suggestion: Either restore log_softmax_sparse_xpu or remove both forward and backward together with evidence that no code paths use the pair

src/ATen/native/transformers/Attention.cpp

High: Error message text mismatch after validation change

  • Line / Scope: L148
  • Problem: Error message still says "expected 2-D" but the check now requires qkv_weight.dim() == 4
  • Evidence: Check on L146 was changed from qkv_weight.dim() == 2 to qkv_weight.dim() == 4, but the error message on L148 still reads "expected 2-D qkv_weight, got ...-D tensor"
  • Why it matters: Misleading error message wastes debugging time and violates UI correctness
  • Suggestion: Update the error message to "expected 4-D" or revert to dim() == 2 if 2-D is still the correct shape

Cross-Cutting / Residual Risk

  • No test changes are present. These logic and API changes must have corresponding unit tests (perhaps within test/xpu/test_sparse_xpu.py for sparse softmax and test/xpu/nn/test_multihead_attention_xpu.py for attention).
  • No description of why the qkv_weight dimension requirement changed from 2 to 4, or how callers are expected to pass 4-D weights now.
  • The title suffix "[TEST ONLY]" should be removed at merge since these look like functional changes.

@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

Status: Do Not Merge

This PR is explicitly marked [DO NOT MERGE][TEST ONLY]. The code changes introduce intentional defects for testing purposes. Key issues already identified in inline comments:

  1. SparseSoftmax.cpp:23 — calls softmax_forward_sparse_xpu_kernel which does not exist in sycl/SparseSoftmaxKernels.h. Build will fail with linker error.
  2. Attention.cpp:148 — error message says "expected 2-D" but the check enforces dim() == 4. Misleading diagnostic.
  3. SparseSoftmax.cpp: log_softmax_sparse_xpu removed entirely while still registered in dispatch yaml. Breaks the dispatch table.
  4. Attention.cpp:129-131 — removed TORCH_CHECK(!mask || !query.is_nested()) guard without adding NestedTensor+mask support.

No new code changes since last review. All previously identified issues remain unresolved, which is expected given the [TEST ONLY] designation.

[AI-assisted reply]

@laifenxiawucha

@laifenxiawucha review


@laifenxiawucha left a comment


Review Summary

This test PR has three substantive code issues beyond the AGENTS.md addition. All three are build-breaking or API-breaking changes that would block merge regardless of the [DO NOT MERGE] label.

[AI-assisted reply]

  const int64_t dim_,
  const bool half_to_float) {
-   return xpu::log_softmax_sparse_xpu_kernel(input_, dim_, half_to_float);
+   return xpu::softmax_forward_sparse_xpu_kernel(input_, dim_, half_to_float);


Two issues here:

  1. softmax_forward_sparse_xpu_kernel does not exist — src/ATen/native/sparse/xpu/sycl/SparseSoftmaxKernels.h only declares softmax_sparse_xpu_kernel. This causes a linker error.

  2. The entire log_softmax_sparse_xpu function is removed but remains registered in yaml/native/native_functions.yaml as SparseXPU: log_softmax_sparse_xpu. The backward function log_softmax_backward_sparse_xpu also still exists, creating an orphaned backward without a forward.

[AI-assisted reply]

+ // This aligns with CUDA
  TORCH_CHECK(
-     qkv_weight.dim() == 2,
+     qkv_weight.dim() == 4,


Agreed — dimension check is inconsistent with error message and downstream usage. The check now requires dim() == 4, but the error message on line 149 still says "expected 2-D 'qkv_weight'", and the comment at line 128 still documents shape as [3 * D, D] (2-D). Subsequent indexing at qkv_weight.sizes()[0]/[1] also assumes 2-D layout.

[AI-assisted reply]

  // qkv_weight shape: [3 * D, D]

- TORCH_CHECK(
-     !mask || !query.is_nested(),


Removing this guard without adding support is unsafe. If downstream XPU kernels do not handle NestedTensor with mask, this silently allows undefined behavior instead of failing with a clear error. Restore the check or add a test proving the combination now works.

[AI-assisted reply]

@laifenxiawucha

@laifenxiawucha review


@laifenxiawucha

@laifenxiawucha review

@laifenxiawucha

@laifenxiawucha review
