AgentTrace

AgentTrace is an open dataset of tool-using language-model agent traces with execution telemetry. Each trace records model-generation steps, tool calls, wall-clock timing, OS-level resource usage, tool inputs and outputs, reasoning content, and reproducibility metadata.

The repository contains the dataset, collection code, analysis scripts, and the deterministic NL2Bash fixture needed to replay the local command-line tasks.

Links

GitHub: https://github.com/pagarsky/agent-trace
Hugging Face: https://huggingface.co/datasets/pagarsky/agent-trace

Dataset Files

| File | Source tasks | Model | Traces |
| --- | --- | --- | --- |
| `datasets/mbpp_0_6B_20260328T133144Z.jsonl` | MBPP test split | Qwen3-0.6B | 500 |
| `datasets/mbpp_1_7B_20260403T211347Z.jsonl` | MBPP test split | Qwen3-1.7B | 500 |
| `datasets/nl2bash_0_6B_20260328T133144Z.jsonl` | NL2Bash / InterCode curated | Qwen3-0.6B | 200 |
| `datasets/nl2bash_1_7B_20260403T211347Z.jsonl` | NL2Bash / InterCode curated | Qwen3-1.7B | 200 |

Total: 1,400 traces.

Trace Schema

Each JSONL row is one agent run with:

trace_id, timestamp_utc, prompt, model, total_duration_ms
spans[]: tool invocations with tool_name, tool_input, tool_output, timing, exit code, and resource telemetry
llm_steps[]: model-generation steps with visible output, reasoning content, parsed tool calls, and token counts when available
metadata: dataset source, task id, run id, serving configuration, model artifact, platform, hardware, and collection version metadata
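
As a minimal sketch of how the schema above can be walked, the following counts tool invocations per tool_name across JSONL rows. It relies only on the spans and tool_name fields listed above; the function name count_tool_calls and the sample tool names are hypothetical and used for illustration only.

```python
import json
from collections import Counter
from typing import Iterable


def count_tool_calls(lines: Iterable[str]) -> Counter:
    """Count tool invocations per tool_name across JSONL trace rows."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between rows
        trace = json.loads(line)
        # Each row is one agent run; spans[] holds its tool invocations.
        for span in trace.get("spans", []):
            counts[span.get("tool_name", "<unknown>")] += 1
    return counts


if __name__ == "__main__":
    # Synthetic rows for illustration only; not taken from the dataset.
    sample = [
        json.dumps({"trace_id": "t1", "spans": [{"tool_name": "bash"}, {"tool_name": "bash"}]}),
        json.dumps({"trace_id": "t2", "spans": [{"tool_name": "python"}]}),
    ]
    print(count_tool_calls(sample))
```

To run it on a release file, pass an open file handle, for example `count_tool_calls(open("datasets/mbpp_0_6B_20260328T133144Z.jsonl"))`.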

Telemetry fields include user CPU time, system CPU time, peak resident-set size, disk read bytes, and disk write bytes. Memory accounting differs between macOS and Linux at the OS API level: getrusage reports ru_maxrss in bytes on macOS and in kilobytes on Linux. The collector normalizes ru_maxrss to bytes.
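
The normalization can be illustrated with a short sketch. It assumes the usual getrusage behavior (ru_maxrss in bytes on macOS, kilobytes on Linux) and is not the collector's actual code; the helper names maxrss_to_bytes and peak_rss_children_bytes are hypothetical.

```python
import resource
import sys


def maxrss_to_bytes(raw: int, platform: str = sys.platform) -> int:
    """Convert a getrusage ru_maxrss value to bytes.

    Assumption: macOS ("darwin") reports bytes; every other platform
    is treated as reporting kilobytes, as Linux does.
    """
    if platform == "darwin":
        return raw
    return raw * 1024


def peak_rss_children_bytes() -> int:
    """Peak resident-set size of waited-for child processes, in bytes."""
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return maxrss_to_bytes(usage.ru_maxrss)


if __name__ == "__main__":
    print(peak_rss_children_bytes())
```

Passing the platform explicitly keeps the conversion testable on either OS.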

Quick Start

uv sync
uv run python analyze.py datasets/mbpp_0_6B_20260328T133144Z.jsonl
uv run python analyze_deep.py datasets/mbpp_0_6B_20260328T133144Z.jsonl datasets/nl2bash_0_6B_20260328T133144Z.jsonl

Load the Hugging Face dataset:

import json
from datasets import load_dataset

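# Select the "train" split of the Hub dataset.
ds = load_dataset("pagarsky/agent-trace")["train"]
row = ds[0]

# Nested fields are stored as JSON-encoded strings; decode them before use.
spans = json.loads(row["spans_json"])
llm_steps = json.loads(row["llm_steps_json"])
metadata = json.loads(row["metadata_json"])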

The data/agenttrace.parquet file is a normalized convenience view for the Hugging Face dataset viewer and the datasets library. The raw release artifacts remain available under datasets/*.jsonl.
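
When working with many rows from the normalized view, it can be convenient to decode the JSON-encoded columns in one step. The sketch below assumes the three column names used in the loading example (spans_json, llm_steps_json, metadata_json); the helper decode_row is hypothetical and not part of the repository.

```python
import json
from typing import Any, Mapping

# Column names taken from the loading example; other columns pass through.
JSON_COLUMNS = ("spans_json", "llm_steps_json", "metadata_json")


def decode_row(row: Mapping[str, Any]) -> dict:
    """Return a copy of row with *_json string columns parsed.

    Each decoded value is stored under the column name without the
    "_json" suffix, and the original string column is dropped.
    """
    out = dict(row)
    for column in JSON_COLUMNS:
        value = out.get(column)
        if isinstance(value, str):
            out[column[: -len("_json")]] = json.loads(value)
            del out[column]
    return out


if __name__ == "__main__":
    # Synthetic row for illustration only; not taken from the dataset.
    example = {"trace_id": "t1", "spans_json": "[]", "llm_steps_json": "[]", "metadata_json": "{}"}
    print(decode_row(example))
```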

Generate plots into a local directory:

uv run python plots.py --outdir figures

Replay And Collection

Start a local llama-server that supports OpenAI-style tool calling, then run the collector:

./scripts/llama-server.sh start
uv run python collect.py --dataset mbpp -n 10 --model Qwen/Qwen3-0.6B --output datasets/traces.jsonl
uv run python collect.py --dataset nl2bash -n 10 --model Qwen/Qwen3-0.6B --output datasets/traces.jsonl

For full runs:

./scripts/collect-all.sh Qwen/Qwen3-1.7B

The NL2Bash tasks use a deterministic fixture under testdata/. Verify it with --check, or regenerate it by running the script without flags:

python scripts/generate-testdata.py --check
python scripts/generate-testdata.py

Data Sources And Licensing

Code in this repository is released under Apache-2.0. The trace dataset is released as openly as possible under the same license, subject to any applicable terms of the upstream prompt sources used to generate traces: MBPP and the curated NL2Bash/InterCode task set. Users should respect the licenses and terms of those upstream datasets.

The traces contain model-generated reasoning and tool outputs. Host-specific paths have been normalized to /testdata or redacted placeholders where applicable. A small number of early NL2Bash traces accidentally captured output lines from repo-local files outside the /testdata fixture; those line contents are masked with <redacted ...> markers, but no trace files or rows were removed.

Citation

A paper citation will be added after publication. Until then, please cite the dataset and code repository:

@misc{paharskyi_agenttrace_2026,
  title        = {AgentTrace: Tool-Using Model Telemetry Dataset},
  author       = {Paharskyi, Oleksii and Haina, Heorhii},
  year         = {2026},
  howpublished = {GitHub and Hugging Face},
  url          = {https://github.com/pagarsky/agent-trace},
  note         = {Dataset and code: https://huggingface.co/datasets/pagarsky/agent-trace}
}
