[cuda backend] optimized L_kv threshold for sdpa implementation selection. #13858

Job Run time
5s
32s
46m 25s
45m 47s
18m 17s
17m 44s
31m 16s
31m 29s
22m 7s
26m 59s
21m 25s
17m 33s
20m 58s
34m 5s
30m 40s
0s
28m 33s
33m 25s
28m 1s
34m 48s
21m 33s
0s
23m 49s
3s
8h 55m 34s