
Flash Attention Integration Tests Fail #52

@weiT1993

Description


Describe the bug

The Flash Attention integration tests fail with a TypeError.

Expected Behavior

The Flash Attention integration correctness tests (flash_attention_correctness.py) should pass.

Current Behavior

FAILED flash_attention_correctness.py::test_attention[True-dtype0] - TypeError: multiple values for argument 'softmax_scale'
FAILED flash_attention_correctness.py::test_attention[True-dtype1] - TypeError: multiple values for argument 'softmax_scale'
FAILED flash_attention_correctness.py::test_attention[False-dtype0] - TypeError: multiple values for argument 'softmax_scale'
FAILED flash_attention_correctness.py::test_attention[False-dtype1] - TypeError: multiple values for argument 'softmax_scale'

Reproduction Steps

From the test/integration/flash_attention directory, run:

pytest flash_attention_correctness.py

Regression Issue

  • Select this option if this issue appears to be a regression.

Possible Solution

The test currently calls the flash attention NKI kernel bundled with the compiler. Should it be upgraded to use the kernel from this repository instead?
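
For context, this kind of TypeError usually means a value is bound to softmax_scale positionally and then passed again as a keyword, which can happen when a caller written against one kernel signature dispatches to another. A minimal hypothetical sketch of that mismatch (the signatures below are placeholders, not the actual NKI kernel interfaces):

# Hypothetical signatures, for illustration only; not the real kernel APIs.
def bundled_kernel(q, k, v, softmax_scale=None):
    # Assumed older bundled signature: softmax_scale sits in the 4th positional slot.
    pass

def repo_kernel(q, k, v, use_causal_mask, softmax_scale=None):
    # Assumed newer repo signature: an extra positional parameter comes before softmax_scale.
    pass

q = k = v = object()  # stand-ins for the attention tensors

# A call written for repo_kernel but executed against bundled_kernel binds
# True to softmax_scale positionally and also passes softmax_scale by keyword:
try:
    bundled_kernel(q, k, v, True, softmax_scale=0.125)
except TypeError as err:
    print(err)  # bundled_kernel() got multiple values for argument 'softmax_scale'

If that is the cause, pointing the test at the repo kernel (or matching the call to the bundled kernel's signature) should clear the error.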

Additional Information/Context

No response

neuronx-cc version used

2.21

Framework(s) and their versions used (JAX, PyTorch, etc..)

No response

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
