Performance Regression Detected
Commit: 117f7d2c
Run: https://github.com/ROCm/ATOM/actions/runs/23503063060
Date: 2026-03-25T02:06:23.707234+00:00
Regressed Configurations
| Model | ISL/OSL | Conc | Tput (cur) | Tput (base) | Tput Δ% | TPOT (cur) | TPOT (base) | TPOT Δ% |
|---|---|---|---|---|---|---|---|---|
| DeepSeek-R1-0528 | 1024/1024 | 1 | 85.6 | 96.2 | -11.0% | 11.58 | 10.32 | 12.1% |
| DeepSeek-R1-0528 | 1024/1024 | 32 | 1657.5 | 1744.9 | -5.0% | 18.72 | 17.74 | 5.5% |
| DeepSeek-R1-0528 | 1024/1024 | 64 | 2645.3 | 2832.8 | -6.6% | 23.25 | 21.73 | 7.0% |
| DeepSeek-R1-0528 | 8192/1024 | 1 | 76.6 | 85.5 | -10.4% | 12.75 | 11.42 | 11.7% |
| DeepSeek-R1-0528 | 8192/1024 | 4 | 282.2 | 316.4 | -10.8% | 13.40 | 12.02 | 11.4% |
| DeepSeek-R1-0528 | 8192/1024 | 8 | 523.1 | 556.5 | -6.0% | 14.68 | 13.67 | 7.4% |
| DeepSeek-R1-0528 | 8192/1024 | 64 | 1668.5 | 1764.4 | -5.4% | 36.48 | 34.46 | 5.9% |
| DeepSeek-R1-0528-mtp3 | 1024/1024 | 8 | 756.7 | 878.9 | -13.9% | 10.21 | 8.74 | 16.9% |
| DeepSeek-R1-0528-mtp3 | 8192/1024 | 4 | 438.6 | 551.2 | -20.4% | 8.03 | 6.54 | 22.9% |
| DeepSeek-R1-0528-mtp3 | 8192/1024 | 8 | 773.2 | 735.2 | 5.2% | 9.63 | 10.11 | -4.7% |
| GLM-5-FP8 | 1024/1024 | 4 | 162.6 | 176.7 | -8.0% | 23.57 | 21.77 | 8.3% |
| GLM-5-FP8 | 8192/1024 | 2 | 83.4 | 83.4 | -0.1% | 23.29 | 23.40 | -0.5% |
| gpt-oss-120b | 1024/1024 | 16 | 2394.8 | 2420.9 | -1.1% | 6.46 | 6.41 | 0.8% |
| gpt-oss-120b | 1024/1024 | 128 | 8684.9 | 8716.7 | -0.4% | 14.10 | 14.14 | -0.3% |
| gpt-oss-120b | 1024/8192 | 4 | 950.7 | 936.9 | 1.5% | 4.14 | 4.20 | -1.4% |
| gpt-oss-120b | 1024/8192 | 128 | 8998.8 | 9073.1 | -0.8% | 13.84 | 13.75 | 0.6% |
| gpt-oss-120b | 8192/1024 | 8 | 1336.1 | 1342.8 | -0.5% | 5.69 | 5.68 | 0.3% |
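The Δ% columns appear to be the relative change of the current value against the baseline. The sketch below reproduces that arithmetic for two rows of the table. The helper name `pct_change` is hypothetical and not part of the CI tooling, and because the inputs here are the rounded table values, a recomputed figure can differ from the printed one by about 0.1 percentage point.

```python
def pct_change(cur: float, base: float) -> float:
    """Relative change of `cur` versus `base`, in percent."""
    return (cur - base) / base * 100.0


# DeepSeek-R1-0528, 1024/1024, concurrency 1: throughput (cur, base)
tput_delta = pct_change(85.6, 96.2)
# DeepSeek-R1-0528, 1024/1024, concurrency 32: TPOT (cur, base)
tpot_delta = pct_change(18.72, 17.74)

# Lower throughput is worse, so a negative throughput delta is a regression.
# Higher TPOT is worse, so a positive TPOT delta is a regression.
print(f"Tput delta: {tput_delta:.1f}%")
print(f"TPOT delta: {tpot_delta:.1f}%")
```

Note the opposite sign conventions: the DeepSeek-R1-0528-mtp3 8192/1024 row at concurrency 8 shows a positive throughput delta and a negative TPOT delta, which under this reading is an improvement rather than a regression.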
Performance Summary
# Trace Performance Summary
**File:** `DeepSeek-R1-0528_ts_20260325_021655_879.pt.trace.json.gz`
## Prefill
| # | Label | Duration |
|---|-------|----------|
| 0 | `prefill[bs=1 tok=991 ctx=991]` | 78.10 ms |
| 1 | `prefill[bs=1 tok=866 ctx=866]` | 77.14 ms |
**Total prefill:** 155.24 ms
## Decode
- **Iterations:** 1947
- **Mean:** 827.4 us
- **Min:** 672.2 us
- **Max:** 1.74 ms
- **Total:** 1610.92 ms
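As a quick sanity check on the summary above, the reported totals can be recomputed from the other figures. This is only a sketch using the rounded numbers printed in this report; the variable names are illustrative.

```python
# Values copied from the trace summary in this report.
prefill_ms = [78.10, 77.14]
decode_iterations = 1947
decode_mean_us = 827.4
decode_total_ms = 1610.92

# Prefill total should equal the sum of the individual prefill durations.
prefill_total_ms = sum(prefill_ms)

# Decode total should be close to iterations * mean (the mean is rounded
# to 0.1 us, so a small discrepancy is expected).
implied_total_ms = decode_iterations * decode_mean_us / 1000.0

print(f"prefill total: {prefill_total_ms:.2f} ms")
print(f"decode total implied by mean: {implied_total_ms:.2f} ms")
print(f"difference from reported total: {implied_total_ms - decode_total_ms:.3f} ms")
```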
Profiler Traces
Download from workflow artifacts.
Open in Perfetto UI or Chrome (`chrome://tracing`) for analysis.
Next Steps
- Download the profiler-analysis-23503063060 artifact
- Open trace files in Perfetto UI
- Compare kernel durations against previous traces
- Identify bottleneck changes
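The kernel-duration comparison in the steps above can also be scripted rather than done by eye. The following is a minimal sketch, assuming the `.pt.trace.json.gz` files follow the Chrome Trace Event format (complete events have `"ph": "X"` and a `dur` field in microseconds). The function names and the file paths in the usage comment are hypothetical, and the aggregation covers every complete event, not only GPU kernels.

```python
import gzip
import json
from collections import defaultdict


def load_events(path):
    """Load trace events from a (optionally gzipped) Chrome-format trace."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    # The format allows either {"traceEvents": [...]} or a bare list.
    return data["traceEvents"] if isinstance(data, dict) else data


def total_duration_by_name(events):
    """Sum `dur` (microseconds) of complete events, grouped by event name."""
    totals = defaultdict(float)
    for ev in events:
        if ev.get("ph") == "X" and "dur" in ev:
            totals[ev.get("name", "<unnamed>")] += float(ev["dur"])
    return dict(totals)


def compare(cur, base, top=10):
    """Return (name, base_us, cur_us) rows with the largest increase first."""
    names = set(cur) | set(base)
    rows = [(n, base.get(n, 0.0), cur.get(n, 0.0)) for n in names]
    rows.sort(key=lambda r: r[2] - r[1], reverse=True)
    return rows[:top]


# Usage with hypothetical paths to the current and previous traces:
#   cur = total_duration_by_name(load_events("current.pt.trace.json.gz"))
#   base = total_duration_by_name(load_events("previous.pt.trace.json.gz"))
#   for name, base_us, cur_us in compare(cur, base):
#       print(f"{name}: {base_us:.0f} us -> {cur_us:.0f} us")
```

Sorting by absolute increase surfaces the events that contribute most to the added time; if the two traces cover different numbers of iterations, dividing the totals by the event count first gives a fairer comparison.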