Merged
1,076 changes: 1,039 additions & 37 deletions docker/patch/latest/sglang.patch

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docker/version.txt
@@ -1 +1 @@
-nightly-dev-20260327a
+nightly-dev-20260329a
Binary file added docs/_static/image/trace.png
65 changes: 65 additions & 0 deletions docs/en/developer_guide/trace.md
@@ -0,0 +1,65 @@
# Trace Viewer

slime can attach lightweight execution traces to each rollout sample. These traces capture span-style events such as generation and reward-model calls, and they can be inspected later from a saved rollout debug dump.

![trace timeline viewer](../../_static/image/trace.png)

## Save rollout trace data

To inspect traces later, save rollout debug data during a run:

```bash
python train.py \
... \
--save-debug-rollout-data /path/to/debug/rollout_{rollout_id}.pt
```

Each saved `.pt` file contains the rollout samples together with their `trace` payloads. You can also replay the same dump later with `--load-debug-rollout-data`.

## Open the timeline viewer

Use the trace viewer script on a saved rollout dump:

```bash
python tools/trace_timeline_viewer.py /path/to/debug/rollout_0.pt
```

The script generates:

- `rollout_0.trace_timeline_cache.json`
- `rollout_0.trace_timeline_viewer.html`

By default it also starts a local static server so you can open the generated HTML immediately. If you only want the files, use `--no-serve`.
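If you generated the files with `--no-serve` and want to view them later, any static file server should do. The following is a minimal sketch using only the Python standard library; `serve_directory` is a hypothetical helper, not part of slime, and it assumes the generated HTML loads the JSON cache from the same directory.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on localhost from a background thread.

    Hypothetical helper for illustration; not part of slime.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Port 0 asks the operating system for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `server = serve_directory("/path/to/debug")` and reading `server.server_address` gives the host and port to open in a browser; `server.shutdown()` stops it.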

## How to read the viewer

- Each row corresponds to one sample.
- Bars represent spans, while point markers represent instant events.
- Span attributes recorded at the start or end of `trace_span(...)` are shown in the details panel.
- When SGLang returns prefill/decode (PD) disaggregation timings, the viewer adds synthetic `[P]` and `[D]` lanes that show prefill and decode work separately.
- When PD is not enabled, those virtual lanes are omitted automatically and the base trace still renders normally.
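The lane rules above can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Lane` class, the `build_lanes` function, and the shape of `pd_timings` are hypothetical and do not describe the viewer's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Lane:
    label: str
    spans: list = field(default_factory=list)  # (name, start, end) tuples


def build_lanes(sample_label, spans, pd_timings=None):
    """Return the lanes drawn for one sample row.

    `pd_timings` is a hypothetical dict such as
    {"prefill": (start, end), "decode": (start, end)}.
    None (or an empty dict) means no PD timings were returned.
    """
    lanes = [Lane(sample_label, list(spans))]  # the base lane always exists
    if pd_timings:
        for tag, key in (("[P]", "prefill"), ("[D]", "decode")):
            if key in pd_timings:
                start, end = pd_timings[key]
                lanes.append(Lane(f"{sample_label} {tag}", [(key, start, end)]))
    return lanes
```

With no PD timings the function returns only the base lane, which mirrors the behavior described in the last bullet.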

## Instrument custom code

For custom rollout or reward code, reuse helpers from `slime.utils.trace_utils`:

- `trace_span(target, name, attrs=...)`: record a duration span.
- `trace_event(target, name, attrs=...)`: record an instant event.
- `bind_trace(sample)`: ensure a sample already has a trace carrier before passing it across helpers or tasks.
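To make the span/event distinction concrete, here is a toy stand-in for the pattern. It is not slime's implementation: `ToySpan`, `toy_trace_span`, `toy_trace_event`, and the use of a plain dict as the sample are all hypothetical, chosen only to show how a span records a start and end around a block while an event records a single instant.

```python
import time
from contextlib import contextmanager


class ToySpan:
    """Holds one span record; attributes can be added while the span is open."""

    def __init__(self, name, attrs=None):
        self.record = {
            "name": name,
            "attrs": dict(attrs or {}),
            "start": time.monotonic(),
            "end": None,
        }

    def update(self, attrs):
        self.record["attrs"].update(attrs)


@contextmanager
def toy_trace_span(target, name, attrs=None):
    # Hypothetical: the trace lives in a list stored on a dict-based sample.
    span = ToySpan(name, attrs)
    target.setdefault("trace", []).append(span.record)
    try:
        yield span
    finally:
        span.record["end"] = time.monotonic()  # closed even if the body raises


def toy_trace_event(target, name, attrs=None):
    now = time.monotonic()
    target.setdefault("trace", []).append(
        {"name": name, "attrs": dict(attrs or {}), "start": now, "end": now}
    )


sample = {}
with toy_trace_span(sample, "reward_model", attrs={"batch": 1}) as span:
    span.update({"status": "ok"})  # attributes recorded at the end of the span
toy_trace_event(sample, "aborted")
```

A context manager is a natural fit here because the end timestamp is written in `finally`, so a span is closed even when the wrapped call fails.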

If you want to record SGLang generation metadata in a consistent way, reuse `build_sglang_meta_trace_attrs`:

```python
from slime.utils.trace_utils import build_sglang_meta_trace_attrs, trace_span

with trace_span(sample, "sglang_generate") as span:
    output = await post(url, payload)
    span.update(build_sglang_meta_trace_attrs(output["meta_info"]))
```

## Tips

- Save a small number of rollouts first; the viewer is easiest to read when each dump contains a manageable number of samples.
- The viewer is built from the saved `.pt` dump, so traces can be inspected offline on another machine.
- For GPU/kernel-level SGLang profiling traces, see [Profiling](./profiling.md).

1 change: 1 addition & 0 deletions docs/en/index.rst
@@ -66,6 +66,7 @@ slime is the RL-framework behind GLM-4.7, GLM-4.6 and GLM-4.5. Apart from models

    developer_guide/ci.md
    developer_guide/debug.md
+   developer_guide/trace.md
    developer_guide/profiling.md

 .. toctree::
65 changes: 65 additions & 0 deletions docs/zh/developer_guide/trace.md
@@ -0,0 +1,65 @@
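# Trace Visualization

slime can attach a lightweight execution trace to each rollout sample. The trace records span events such as generation and reward-model calls, and it can be inspected offline from a saved rollout debug dump.

![trace timeline viewer](../../_static/image/trace.png)

## Save rollout trace data

To inspect traces after a run finishes, enable the rollout debug dump during training: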

```bash
python train.py \
... \
--save-debug-rollout-data /path/to/debug/rollout_{rollout_id}.pt
```

Each saved `.pt` file contains the rollout samples together with their `trace` data. The same dump can be reused later via `--load-debug-rollout-data`.

## Open the timeline viewer

Run the following on a saved rollout dump:

```bash
python tools/trace_timeline_viewer.py /path/to/debug/rollout_0.pt
```

The script generates:

- `rollout_0.trace_timeline_cache.json`
- `rollout_0.trace_timeline_viewer.html`

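By default, it also starts a local static file server so the result can be opened directly in a browser. To generate the files only, add `--no-serve`.

## How to read the visualization

- Each row corresponds to one sample.
- Bars represent spans; dots represent instant events.
- Attributes recorded at the start and at the end of `trace_span(...)` are all shown in the details panel.
- When SGLang returns prefill/decode (PD) disaggregation timings, the viewer automatically adds two virtual lanes, `[P]` and `[D]`, to show prefill and decode separately.
- If PD is not enabled, these two virtual lanes do not appear, and the base trace still renders normally.

## Instrument custom code

In custom rollout or reward logic, you can directly reuse the utilities in `slime.utils.trace_utils`:

- `trace_span(target, name, attrs=...)`: record a duration span.
- `trace_event(target, name, attrs=...)`: record an instant event.
- `bind_trace(sample)`: ensure the sample already has a trace carrier bound before it is passed to other helpers or tasks.

To record the generation metadata returned by SGLang in a uniform way, reuse `build_sglang_meta_trace_attrs`: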

```python
from slime.utils.trace_utils import build_sglang_meta_trace_attrs, trace_span

with trace_span(sample, "sglang_generate") as span:
    output = await post(url, payload)
    span.update(build_sglang_meta_trace_attrs(output["meta_info"]))
```

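## Usage tips

- Save a small number of rollouts first; the viewer is easier to read when a single dump holds a moderate number of samples.
- The viewer works directly from the saved `.pt` dump, so the file can be copied to another machine for offline analysis.
- For SGLang's own GPU/kernel-level profiling traces, see [Profiling](./profiling.md).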

1 change: 1 addition & 0 deletions docs/zh/index.rst
@@ -66,6 +66,7 @@ slime is the RL training framework behind GLM-4.7, GLM-4.6, and GLM-4.5. Beyond that

    developer_guide/ci.md
    developer_guide/debug.md
+   developer_guide/trace.md
    developer_guide/profiling.md

 .. toctree::
2 changes: 2 additions & 0 deletions slime/ray/rollout.py
@@ -120,6 +120,8 @@ def start_engines(self, port_cursors: dict[int, int] | None = None) -> tuple[lis
                 "SGLANG_BATCH_INVARIANT_OPS_ENABLE_MM_FALLBACK_VARIANT": "true",
                 "SGLANG_ENABLE_HEALTH_ENDPOINT_GENERATION": "false",
                 "SGLANG_ENABLE_STRICT_MEM_CHECK_DURING_IDLE": "false",
+                "SGLANG_TRANSFER_PROFILING_INFO": "true",
+                "SLIME_ENABLE_PROFILING": "true",
             }.items()
         }

20 changes: 16 additions & 4 deletions slime/rollout/sglang_rollout.py
@@ -26,6 +26,7 @@
     load_processor,
     load_tokenizer,
 )
+from slime.utils.trace_utils import build_sglang_meta_trace_attrs, trace_function, trace_span
 from slime.utils.types import Sample

 from .rm_hub import async_rm, batched_async_rm
@@ -174,7 +175,9 @@ async def generate(args: Namespace, sample: Sample, sampling_params: dict[str, A
     if getattr(args, "router_policy", None) == "consistent_hashing":
         headers = {"X-SMG-Routing-Key": sample.session_id}

-    output = await post(url, payload, headers=headers)
+    with trace_span(sample, "sglang_generate", attrs={"max_new_tokens": sampling_params["max_new_tokens"]}) as span:
+        output = await post(url, payload, headers=headers)
+        span.update(build_sglang_meta_trace_attrs(output["meta_info"]))

     if "output_token_logprobs" in output["meta_info"]:
         new_response_tokens = [item[1] for item in output["meta_info"]["output_token_logprobs"]]
@@ -211,6 +214,7 @@ async def generate(args: Namespace, sample: Sample, sampling_params: dict[str, A
     return sample


+@trace_function("generate_and_rm", target="sample")
 async def generate_and_rm(
     args: Namespace,
     sample: Sample | list[Sample],
@@ -261,7 +265,8 @@ async def generate_and_rm(

         # for multi agent system, the reward of some sample is calculated during generation.
         samples_need_reward = [sample for sample in samples if sample.reward is None]
-        rewards = await batched_async_rm(args, samples_need_reward)
+        with trace_span(samples_need_reward, "reward_model"):
+            rewards = await batched_async_rm(args, samples_need_reward)
         for sample, reward in zip(samples_need_reward, rewards, strict=False):
             sample.reward = reward
         return samples
Expand All @@ -270,11 +275,17 @@ async def generate_and_rm(
return sample
# for multi-turn environment, a reward could be assigned to the agent.
if sample.reward is None:
sample.reward = await async_rm(args, sample)
with trace_span(sample, "reward_model"):
sample.reward = await async_rm(args, sample)

return sample


@trace_function(
"generate_and_rm_group",
target="group",
attrs_getter=lambda args, group, sampling_params, evaluation=False: {"group_size": len(group)},
)
async def generate_and_rm_group(
args: Namespace, group: list[Sample], sampling_params: dict[str, Any], evaluation: bool = False
) -> list[Sample]:
@@ -302,7 +313,8 @@ async def generate_and_rm_group(

     # for the rm that need the whole group, we will do the rm here
     if not state.aborted and args.group_rm:
-        rewards = await batched_async_rm(args, group)
+        with trace_span(group, "group_reward_model"):
+            rewards = await batched_async_rm(args, group)
         for sample, reward in zip(group, rewards, strict=False):
             sample.reward = reward

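For readers unfamiliar with the decorator form used in these hunks, the sketch below shows one way a decorator with a `target` argument name and an `attrs_getter` callback could work. It is an illustration under stated assumptions, not the code in `slime.utils.trace_utils`: `toy_trace_function`, `RECORDS`, and `fake_group` are hypothetical names, and the real helper may resolve targets and store records differently.

```python
import asyncio
import functools
import inspect

RECORDS = []  # hypothetical sink; the real helper attaches traces to samples


def toy_trace_function(name, target, attrs_getter=None):
    """Wrap an async function and log one record per call."""

    def decorator(fn):
        sig = inspect.signature(fn)

        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            # Look up the traced argument by its parameter name.
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            attrs = attrs_getter(*args, **kwargs) if attrs_getter else {}
            record = {
                "name": name,
                "target": bound.arguments[target],
                "attrs": attrs,
                "done": False,
            }
            RECORDS.append(record)
            try:
                return await fn(*args, **kwargs)
            finally:
                record["done"] = True  # marked even if the call raises

        return wrapper

    return decorator


@toy_trace_function(
    "generate_and_rm_group",
    target="group",
    attrs_getter=lambda args, group, sampling_params, evaluation=False: {"group_size": len(group)},
)
async def fake_group(args, group, sampling_params, evaluation=False):
    return group


asyncio.run(fake_group(None, ["a", "b"], {}))
```

Giving `attrs_getter` the same signature as the wrapped function, as the hunk above does, lets the wrapper forward the call's arguments to it unchanged.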