Commit 0a8917e


[ExecuTorch][WebGPU] SDPA test suite: replay + dynamic input_pos + in-graph KV cache

Pull Request resolved: #20087

Adds the WebGPU SDPA test coverage as its own diff, stacked on the SDPA op (which already carries the dynamic-`input_pos` consumption) and the SymInt mechanism below it: multi-step prefill->mt->decode replay, runtime-dynamic `input_pos` (autoregressive decode), and an in-graph mutable KV cache, each compared against a torch `F.scaled_dot_product_attention` golden.

- `test/ops/sdpa/test_sdpa.py`: `ReplaySeq`/`REPLAY_SEQS` + per-step replay export/golden; `DynamicSdpaModule` + `export_dynamic_decode` (one `.pte`, `input_pos` supplied at runtime as a SymInt); `DecodeCacheModule` + `export_incache_decode` (KV cache as `register_buffer` mutable buffers, so the cache persists in-graph and forward() feeds only the new token + `input_pos`).
- `test/test_webgpu_native.cpp`: `test_sdpa_replay`, `test_sdpa_dynamic_decode` (+ negative control: a pinned `input_pos` diverges), `test_sdpa_incache_decode` (+ static control: a fresh Module per step diverges, proving in-graph accumulation is real), `test_symint_roundtrip`, `test_resize_hook`; shared per-element tolerance `sdpa_within_tol` (abs 1e-4 OR rel 1e-3).
- `test/test_build_webgpu.sh`: export the replay / dynamic / in-graph-cache models for the native test.

ghstack-source-id: 391373155
@exported-using-ghexport

Differential Revision: [D107595144](https://our.internmc.facebook.com/intern/diff/D107595144/)
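
The shared per-element tolerance (`sdpa_within_tol`, abs 1e-4 OR rel 1e-3) lives in the C++ native test; the commit message gives only the two thresholds. As a rough illustration of what an "absolute OR relative" check looks like, here is a minimal Python sketch. The names `within_tol` and `all_within_tol` are hypothetical, and measuring the relative error against the golden value is an assumption, not something the commit confirms.

```python
import math

ABS_TOL = 1e-4  # absolute tolerance quoted in the commit message
REL_TOL = 1e-3  # relative tolerance quoted in the commit message


def within_tol(actual: float, expected: float) -> bool:
    """Pass if either the absolute or the relative error is small enough."""
    # Treat NaN as a failure rather than letting comparisons decide silently.
    if math.isnan(actual) or math.isnan(expected):
        return False
    diff = abs(actual - expected)
    # Assumption: relative error is scaled by the golden (expected) value.
    return diff <= ABS_TOL or diff <= REL_TOL * abs(expected)


def all_within_tol(actual, expected) -> bool:
    """Apply the per-element check across two equal-length sequences."""
    if len(actual) != len(expected):
        return False
    return all(within_tol(a, e) for a, e in zip(actual, expected))
```

The OR matters at both ends of the value range: the absolute bound keeps near-zero outputs from failing on a tiny relative denominator, while the relative bound keeps large-magnitude outputs from failing on ordinary floating-point error.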

1 parent 7c2ed45 commit 0a8917e

3 files changed

backends/webgpu/test
- ops/sdpa
  - test_sdpa.py
- test_build_webgpu.sh
- test_webgpu_native.cpp
