
Fix notation and clarification of Puzzle 9 #16

Open
alexzhang13 wants to merge 1 commit into gpu-mode:main from alexzhang13:main

Conversation

@alexzhang13 alexzhang13 commented Jul 17, 2024

The notation for the softmax in Puzzle 9 is both confusing and wrong. The current indexing does not represent the outer product, and the inclusion of an extra variable B1 is a bit ambiguous. I think the new description (a minor change) is clearer.

The new notation also makes clear the relationship between the k vector and the v vector, which is important for understanding how full flash attention is done.
@VachanVY

Hi @alexzhang13 ,

The function arguments are also incomplete... B1 is missing.

But I found it here:
https://github.com/SiriusNEO/Triton-Puzzles-Lite/blob/main/puzzles_ans.py#L614-L617

@triton.jit
def flashatt_kernel(
    q_ptr, k_ptr, v_ptr, z_ptr, N0, T, B0: tl.constexpr, B1: tl.constexpr
):

@skimberk
Contributor

+1 that the current notation for Puzzle 9 is confusing/potentially incorrect, and that B1 is missing.

As @VachanVY mentioned, it looks like it's been fixed/improved in Triton-Puzzles-Lite: https://github.com/SiriusNEO/Triton-Puzzles-Lite/blob/main/puzzles.md#puzzle-9-simple-flashattention

$$z_{i} = \sum_{j=1}^{T} \text{softmax}(q_i k_1, \ldots, q_i k_T)_j v_{j} \text{ for } i = 1\ldots N_0$$
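For reference, the spec that this equation describes can be sketched in plain NumPy. This is a hypothetical reference implementation based only on the formula above (1-D q, k, v of length T, as in the puzzle's simple flash attention), not the repo's actual `flashatt_spec`:

```python
import numpy as np

def flashatt_spec(q, k, v):
    """z_i = sum_j softmax(q_i k_1, ..., q_i k_T)_j * v_j  (sketch, not the repo's code)."""
    x = q[:, None] * k[None, :]           # outer product q_i k_j, shape (T, T)
    x = x - x.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    p = np.exp(x)
    p = p / p.sum(axis=1, keepdims=True)  # row-wise softmax over j
    return p @ v                          # weighted sum of v_j for each i
```

Note that with this notation the softmax weights for row i and the values v_j share the same index j, which is exactly the k/v relationship the PR description calls out.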

The missing B1 has been fixed there too: https://github.com/SiriusNEO/Triton-Puzzles-Lite/blob/2990dc91ab0495c5d0306609806f0b455b0555f2/puzzles.py#L466

@triton.jit
def flashatt_kernel(
    q_ptr, k_ptr, v_ptr, z_ptr, N0, T, B0: tl.constexpr, B1: tl.constexpr
):

and then calls it with:

test(
    flashatt_kernel,
    flashatt_spec,
    B={"B0": 64, "B1": 32},
    nelem={"N0": 200, "T": 200},
    # other lite specific params removed
)

@msaroufim
Member

This was fixed.

@msaroufim msaroufim closed this Mar 18, 2026
@msaroufim msaroufim reopened this Apr 1, 2026
