Commit 4e76d24
ggml : fix AMX and add batched support (ggml-org#19925)
llama-perplexity -hf ggml-org/Qwen3-0.6B-GGUF:Q4_0 -f wikitext-2-raw/wiki.test.raw -c 2048 -b 2048 --chunks 2
before this commit:
```
perplexity: calculating perplexity over 2 chunks, n_ctx=2048, batch_size=2048, n_seq=1
perplexity: 2.31 seconds per pass - ETA 0.07 minutes
[1]17.3868,[2]22.2199,
Final estimate: PPL = 22.2199 +/- 1.59692
llama_perf_context_print: load time = 878.56 ms
llama_perf_context_print: prompt eval time = 2037.82 ms / 4096 tokens ( 0.50 ms per token, 2009.99 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 6403.17 ms / 4097 tokens
llama_perf_context_print: graphs reused = 0
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - Host | 845 = 318 + 224 + 302 |
llama_memory_breakdown_print: | - CPU_REPACK | 288 = 288 + 0 + 0 |
llama_memory_breakdown_print: | - AMX | 31 = 31 + 0 + 0 |
```
after this commit:
```
perplexity: calculating perplexity over 2 chunks, n_ctx=2048, batch_size=2048, n_seq=1
perplexity: 1.98 seconds per pass - ETA 0.05 minutes
[1]17.2005,[2]21.8220,
Final estimate: PPL = 21.8220 +/- 1.56485
llama_perf_context_print: load time = 719.23 ms
llama_perf_context_print: prompt eval time = 1676.23 ms / 4096 tokens ( 0.41 ms per token, 2443.58 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 4258.74 ms / 4097 tokens
llama_perf_context_print: graphs reused = 0
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - Host | 845 = 318 + 224 + 302 |
llama_memory_breakdown_print: | - AMX | 319 = 319 + 0 + 0 |
```
(no more CPU_REPACK)
after this commit, disabling amx:
```
perplexity: calculating perplexity over 2 chunks, n_ctx=2048, batch_size=2048, n_seq=1
perplexity: 2.34 seconds per pass - ETA 0.07 minutes
[1]17.2005,[2]21.8220,
Final estimate: PPL = 21.8220 +/- 1.56485
llama_perf_context_print: load time = 841.91 ms
llama_perf_context_print: prompt eval time = 2057.28 ms / 4096 tokens ( 0.50 ms per token, 1990.98 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 6454.51 ms / 4097 tokens
llama_perf_context_print: graphs reused = 0
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - Host | 845 = 318 + 224 + 302 |
llama_memory_breakdown_print: | - CPU_REPACK | 319 = 319 + 0 + 0 |
```
=> same perplexity.
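As a rough cross-check of the three logs above, the sketch below recomputes prompt throughput from the reported `prompt eval time` values and compares the runs. The figures are copied from the logs. The helper names (`tokens_per_second`, `speedup`) and the `runs` dictionary are illustrative only and are not part of llama.cpp.

```python
# Figures copied from the llama_perf_context_print / memory breakdown lines above.
# The structure and helper names here are hypothetical, for illustration only.
N_TOKENS = 4096  # prompt tokens evaluated in each run (2 chunks of 2048)

runs = {
    "before":               {"prompt_ms": 2037.82, "ppl": 22.2199},
    "after (AMX)":          {"prompt_ms": 1676.23, "ppl": 21.8220},
    "after (AMX disabled)": {"prompt_ms": 2057.28, "ppl": 21.8220},
}


def tokens_per_second(prompt_ms: float, n_tokens: int = N_TOKENS) -> float:
    """Throughput implied by a prompt eval time given in milliseconds."""
    return n_tokens / (prompt_ms / 1000.0)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    for name, r in runs.items():
        print(f"{name:22s} {tokens_per_second(r['prompt_ms']):8.2f} tok/s  PPL={r['ppl']}")

    amx_ms = runs["after (AMX)"]["prompt_ms"]
    print(f"AMX vs before:       {speedup(runs['before']['prompt_ms'], amx_ms):.3f}x")
    print(f"AMX vs AMX disabled: {speedup(runs['after (AMX disabled)']['prompt_ms'], amx_ms):.3f}x")

    # Buffer sizes in MiB from the memory breakdown: the old CPU_REPACK + AMX
    # split adds up to the size of the single AMX buffer after the commit.
    print("288 + 31 == 319:", 288 + 31 == 319)
```

On these numbers the AMX path is roughly 1.2x faster at prompt evaluation than either the pre-commit run or the post-commit run with AMX disabled. These are single runs over 2 chunks, so the ratio is only indicative.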
Signed-off-by: Adrien Gallouët <angt@huggingface.co>

1 parent 723c710 commit 4e76d24
2 files changed, +124 -101 lines changed