AgentTrace

Open-source runtime tracing and diagnostics for AI agent execution flows.

AgentTrace helps you understand what your agent actually did at runtime — not just whether the final answer looks good.

It is built for people who want to answer questions like:

Why did the agent call this tool twice?
Where did the latency actually come from?
Which fallback path was triggered?
What did the LLM see before it made this decision?
Was the execution flow correct, redundant, or suspicious?

If you want something closer to pprof + tracing + agent diagnostics, AgentTrace is designed for that.

Why AgentTrace

Most agent tooling focuses on one of two things:

output evaluation — “was the answer good?”
framework abstraction — “how do I build the agent?”

AgentTrace focuses on a different question:

What exactly happened during execution, and why did the agent behave that way?

That makes it especially useful for:

debugging execution flow
diagnosing redundancy and fallback behavior
inspecting LLM prompts / responses in context
understanding tool usage patterns
tracing runtime state across a run

Core capabilities

Trace LLM / Tool / Skill execution flows
Capture parallel, retry, fallback, and repeated-call patterns
Record Prompt / Response / Context / Plan / Execution snapshots
Persist runs locally and inspect them in a built-in dashboard
Review runs with an LLM after execution
Generate structured diagnostics: critical path, recovery chains, redundant calls, suspicious decisions

What you get

Execution tracing

AgentTrace records a runtime trace for each run, including:

span type
start / end time
latency
status
input parameters
grouping and parent-child relationships
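To make the fields above concrete, here is a minimal sketch of what one span record could look like. The `Span` class, its field names, and the `children_of` helper are hypothetical and chosen only for illustration; they are not AgentTrace's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Span:
    """Hypothetical span record; field names are illustrative only."""
    span_id: str
    span_type: str                      # e.g. "llm", "tool", "skill"
    start: float                        # start time in seconds
    end: float                          # end time in seconds
    status: str = "ok"
    inputs: Dict[str, Any] = field(default_factory=dict)
    parent_id: Optional[str] = None     # parent-child relationship
    group: Optional[str] = None         # e.g. a parallel or retry group

    @property
    def latency(self) -> float:
        # Latency is derived from the recorded start and end times.
        return self.end - self.start


def children_of(spans: List[Span], parent_id: str) -> List[Span]:
    """Return the spans whose parent is `parent_id`."""
    return [s for s in spans if s.parent_id == parent_id]


root = Span("s1", "llm", start=0.0, end=2.5)
tool = Span("s2", "tool", start=0.5, end=1.75,
            inputs={"city": "Beijing"}, parent_id="s1")
spans = [root, tool]
```

Keeping latency as a derived property rather than a stored field avoids the two values drifting apart.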

Structured state snapshots

For LLM spans, AgentTrace can capture:

ContextSnapshot
MemorySnapshot
PlanSnapshot
DecisionSnapshot
ResumeSnapshot
ExecutionSnapshot

Diagnostics

AgentTrace builds a diagnostics layer on top of the raw trace:

critical path
failed tool calls
recovery chains
redundant tool clusters
suspicious decisions
filtered review findings
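As an illustration of one of these diagnostics, redundant tool clusters can be approximated by grouping calls that share a tool name and identical arguments. This is a sketch under that assumption, not AgentTrace's own detection logic; the `redundant_clusters` function and the call dictionaries are hypothetical.

```python
import json
from collections import defaultdict


def redundant_clusters(calls):
    """Return index groups of calls that share a tool name and arguments."""
    groups = defaultdict(list)
    for index, call in enumerate(calls):
        # Serialize arguments with sorted keys so key order does not matter.
        key = (call["tool"], json.dumps(call["args"], sort_keys=True))
        groups[key].append(index)
    # Only groups with more than one call count as redundant.
    return [indexes for indexes in groups.values() if len(indexes) > 1]


calls = [
    {"tool": "get_weather", "args": {"city": "Beijing"}},
    {"tool": "calculate", "args": {"expr": "1+2"}},
    {"tool": "get_weather", "args": {"city": "Beijing"}},
]
clusters = redundant_clusters(calls)
print(clusters)  # [[0, 2]]
```

Exact-match grouping is the simplest rule; a real diagnostic would likely also need to tell intentional retries apart from accidental repeats.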

LLM review

After each run, AgentTrace can ask an LLM to review the recorded execution flow and flag:

redundant tool calls
wrong tool choices
suspicious fallback behavior
unnecessary skill execution
likely execution-flow issues

Review strictness is configurable:

review_level=1 → tolerant
review_level=2 → balanced (default)
review_level=3 → strict

At review_level=1 and review_level=2, the UI hides low-severity findings by default. At review_level=3, all findings are shown.
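The visibility rule can be sketched as a small filter. The `visible_findings` function and the `severity` key are hypothetical names used for illustration; only the behavior per review level comes from the description above.

```python
def visible_findings(findings, review_level=2):
    """Hide low-severity findings at levels 1 and 2; show all at level 3."""
    if review_level not in (1, 2, 3):
        raise ValueError("review_level must be 1, 2, or 3")
    if review_level == 3:
        return list(findings)
    return [f for f in findings if f["severity"] != "low"]


findings = [
    {"severity": "low", "message": "repeated grep call"},
    {"severity": "high", "message": "wrong tool choice"},
]
default_view = visible_findings(findings)          # balanced (default)
strict_view = visible_findings(findings, review_level=3)
```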

Dashboard

AgentTrace includes a local dashboard at:

http://localhost:3500

Current UI features include:

session list
execution timeline
parallel-lane view
collapsed repeated-tool clusters
prompt / response modal for LLM spans
execution-state tabs
diagnostics panel
LLM review panel
collapsible final agent output

Quick start

1. Install

PyPI:

https://pypi.org/project/agenttrace-runtime/0.1.0/

pip install agenttrace-runtime

If your mirror has not synced the package yet, install from the official PyPI index:

pip install -i https://pypi.org/simple agenttrace-runtime

If you want to install from source instead:

git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e .

2. Patch once, trace every run

import agenttrace
from my_agent import run

agenttrace.patch(
    "my_agent.tools",
    "my_agent.skills",
    "my_agent.llm",
    llm_modules=["my_agent.llm"],
    skill_modules=["my_agent.skills"],
    review_level=2,
)

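prompt = "Check the weather in Beijing and calculate 1+2"

# session(...) wraps the agent entry point so the whole run is recorded
traced_run = agenttrace.session(prompt)(run)
output = traced_run(prompt)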
print(output)
print(agenttrace.last_result().summary())
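For intuition about what module-level patching involves, the sketch below wraps every public function in a module with a timing wrapper. It illustrates the general monkeypatching technique only. `TRACE`, `traced`, and `patch_module` are hypothetical names, and this should not be read as how `agenttrace.patch` is implemented.

```python
import functools
import time
import types

TRACE = []  # span dictionaries collected by the wrappers (hypothetical store)


def traced(fn, span_type):
    """Wrap `fn` so each call appends a span record to TRACE."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return fn(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            TRACE.append({
                "name": fn.__name__,
                "type": span_type,
                "status": status,
                "latency": time.perf_counter() - start,
            })
    return wrapper


def patch_module(module, span_type):
    """Replace each public function in `module` with a traced wrapper."""
    for name, obj in list(vars(module).items()):
        if isinstance(obj, types.FunctionType) and not name.startswith("_"):
            setattr(module, name, traced(obj, span_type))


# Stand-in for a real tools module such as my_agent.tools.
def calculate(expr):
    return sum(int(part) for part in expr.split("+"))


tools = types.ModuleType("demo_tools")
tools.calculate = calculate

patch_module(tools, "tool")
result = tools.calculate("1+2")
```

After the call, `TRACE` holds one record for `calculate` with its status and measured latency.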

3. Start the dashboard

from agenttrace.dashboard.server import start_server

start_server(port=3500)

Open:

http://localhost:3500

Demo agent

This repo includes a demo agent that intentionally exercises multiple tracing scenarios:

bash
read
grep
calculate
get_weather
flaky_weather
weather_report_skill
parallel weather queries
fallback to stable tools

Run it:

python examples/demo_agent/main.py

Stress prompt:

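Analyze the project in the current directory; bash pwd; read examples/demo_agent/tools.py; grep calculate examples/demo_agent; check the weather in Beijing and Xi'an, and calculate 1123123123+1283123; generate a Beijing weather report; finally, summarize.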

Protocol ingest (for non-Python agents)

AgentTrace includes an initial protocol-based ingestion path for non-Python agents.

Start the ingest server:

from agenttrace import start_ingest_server

start_ingest_server(port=7760)

Then send protocol events to:

POST /api/v1/events
POST /api/v1/events/batch

The protocol draft lives in:

docs/protocol-v0.1.md

This is the recommended direction for Go / Node / Java style agents that cannot use the native Python patch/session integration.
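A client for the single-event endpoint might be prepared as follows. This is a sketch only: it assumes the endpoint accepts a JSON body, and every field in `event` is a hypothetical placeholder. The real event schema is defined in docs/protocol-v0.1.md and should be used instead. The host and port match the `start_ingest_server(port=7760)` call above.

```python
import json
import urllib.request


def build_event_request(event, base_url="http://localhost:7760"):
    """Build (but do not send) a POST request for one protocol event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/events",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed JSON payload
        method="POST",
    )


# Hypothetical event fields; consult docs/protocol-v0.1.md for the real ones.
event = {
    "run_id": "run-1",
    "span_id": "span-1",
    "type": "tool",
    "name": "get_weather",
    "status": "ok",
}
request = build_event_request(event)

# Uncomment once the ingest server is running:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The same pattern ports directly to Go, Node, or Java, since it relies only on an HTTP POST.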

Integration model

AgentTrace works best for:

custom Python agents with source code
local development environments
CLI / hook-based agents
runtime debugging and diagnostics workflows

The default integration style is intentionally lightweight:

patch modules once
wrap runs with session(...)
inspect results locally

For non-Python agents, AgentTrace is evolving toward a protocol-based model. Current repository drafts include:

docs/protocol-v0.1.md
docs/agenttrace-go-adapter-v0.md
docs/agenttrace-go-api-sketch.md
sdk/go/agenttracego/ (prototype)

Project scope

AgentTrace is currently optimized as:

a runtime tracing tool
a local-first diagnostics tool
a developer-facing execution inspector

It is not currently focused on being:

a hosted eval platform
a benchmark leaderboard
a dataset management system
a full SaaS observability suite

Who this is for

AgentTrace is especially useful for:

engineers building custom agents
teams debugging real runtime behavior
people who need local-first execution visibility
anyone who wants to inspect agent decisions beyond final output quality

Roadmap direction

Current direction is intentionally focused:

stronger execution tracing
better diagnostics and issue localization
cleaner runtime state modeling
broader integration patterns for source-based agents
more production-friendly export / observability hooks

The goal is to keep AgentTrace useful as a general execution-flow listener, not to turn it into a bloated all-in-one platform too early.

Still useful for objective metrics

Although AgentTrace centers on tracing and diagnostics, it still retains objective runtime metrics such as:

total latency
avg / p95 step latency
tool success rate
token usage
estimated cost
step efficiency
correctness (if expected_output is provided)
regression tracking
comparison helpers
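A few of these metrics can be computed from recorded steps as sketched below. The `summarize` and `percentile` functions and the step dictionaries are hypothetical. The sketch assumes steps run sequentially, so total latency is the sum of step latencies, and it uses the nearest-rank definition for p95. AgentTrace's own definitions may differ.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(steps):
    """Compute basic latency and tool-success metrics for one run."""
    latencies = [s["latency"] for s in steps]
    tool_steps = [s for s in steps if s["type"] == "tool"]
    succeeded = sum(1 for s in tool_steps if s["status"] == "ok")
    return {
        "total_latency": sum(latencies),  # assumes sequential steps
        "avg_latency": sum(latencies) / len(latencies),
        "p95_latency": percentile(latencies, 95),
        "tool_success_rate": succeeded / len(tool_steps) if tool_steps else None,
    }


steps = [
    {"type": "llm", "latency": 1.0, "status": "ok"},
    {"type": "tool", "latency": 2.0, "status": "ok"},
    {"type": "tool", "latency": 3.0, "status": "error"},
    {"type": "llm", "latency": 4.0, "status": "ok"},
]
metrics = summarize(steps)
```

For runs with parallel steps, wall-clock time (last end minus first start) would be a better total than the sum used here.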

Contributing

Contributions are welcome — especially around:

new agent integrations
richer diagnostics
runtime state capture
dashboard usability
packaging and release polish

For local development:

git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e ".[dev]"
pytest tests/

If you want to contribute, small focused improvements are preferred over large platform-style expansions.

License

MIT
