Open-source runtime tracing and diagnostics for AI agent execution flows.
AgentTrace helps you understand what your agent actually did at runtime — not just whether the final answer looks good.
It is built for people who want to answer questions like:
- Why did the agent call this tool twice?
- Where did the latency actually come from?
- Which fallback path was triggered?
- What did the LLM see before it made this decision?
- Was the execution flow correct, redundant, or suspicious?
If you want something closer to pprof + tracing + agent diagnostics, AgentTrace is designed for that.
Most agent tooling focuses on one of two things:
- output evaluation — “was the answer good?”
- framework abstraction — “how do I build the agent?”
AgentTrace focuses on a different question:
What exactly happened during execution, and why did the agent behave that way?
That makes it especially useful for:
- debugging execution flow
- diagnosing redundancy and fallback behavior
- inspecting LLM prompts / responses in context
- understanding tool usage patterns
- tracing runtime state across a run
- Trace LLM / Tool / Skill execution flows
- Capture parallel, retry, fallback, and repeated-call patterns
- Record Prompt / Response / Context / Plan / Execution snapshots
- Persist runs locally and inspect them in a built-in dashboard
- Review runs with an LLM after execution
- Generate structured diagnostics: critical path, recovery chains, redundant calls, suspicious decisions
AgentTrace records a runtime trace for each run, including:
- span type
- start / end time
- latency
- status
- input parameters
- grouping and parent-child relationships
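To make the recorded fields concrete, here is a minimal sketch of what one span record could look like. The `Span` class, its field names, and the `children_of` helper are illustrative assumptions, not AgentTrace's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Span:
    """Hypothetical span record; field names are illustrative only."""
    span_id: str
    span_type: str                      # e.g. "llm", "tool", or "skill"
    start: float                        # seconds since the run started
    end: float
    status: str = "ok"                  # e.g. "ok" or "error"
    inputs: dict[str, Any] = field(default_factory=dict)
    parent_id: Optional[str] = None     # parent-child relationship
    group: Optional[str] = None         # e.g. a parallel or retry group

    @property
    def latency(self) -> float:
        return self.end - self.start


def children_of(spans: list[Span], parent_id: Optional[str]) -> list[Span]:
    """Return the direct children of a span, ordered by start time."""
    return sorted((s for s in spans if s.parent_id == parent_id), key=lambda s: s.start)


spans = [
    Span("run", "skill", 0.0, 2.5),
    Span("llm-1", "llm", 0.1, 0.9, parent_id="run"),
    Span("tool-1", "tool", 1.0, 2.4, inputs={"city": "Beijing"}, parent_id="run"),
]
for child in children_of(spans, "run"):
    print(child.span_id, child.span_type, round(child.latency, 2))
```

Storing only `parent_id` on each span keeps the record flat and easy to persist; the tree is rebuilt on demand when a view needs it.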
For LLM spans, AgentTrace can capture:
- ContextSnapshot
- MemorySnapshot
- PlanSnapshot
- DecisionSnapshot
- ResumeSnapshot
- ExecutionSnapshot
AgentTrace builds a diagnostics layer on top of the raw trace:
- critical path
- failed tool calls
- recovery chains
- redundant tool clusters
- suspicious decisions
- filtered review findings
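Two of the diagnostics listed above can be approximated with very little code. The sketch below is not AgentTrace's implementation: it assumes spans are plain dicts with hypothetical keys, treats calls with the same name and identical arguments as redundant, and approximates the critical path by following the child that finishes last at each level.

```python
import json
from collections import defaultdict

# Hypothetical span dicts; the keys are illustrative, not AgentTrace's schema.
spans = [
    {"id": "run", "parent": None, "name": "agent_run", "args": {}, "start": 0.0, "end": 5.0},
    {"id": "a", "parent": "run", "name": "get_weather", "args": {"city": "Beijing"}, "start": 0.2, "end": 1.0},
    {"id": "b", "parent": "run", "name": "get_weather", "args": {"city": "Beijing"}, "start": 1.1, "end": 2.0},
    {"id": "c", "parent": "run", "name": "calculate", "args": {"expr": "1+2"}, "start": 2.1, "end": 4.8},
    {"id": "d", "parent": "c", "name": "llm", "args": {}, "start": 2.2, "end": 4.7},
]


def redundant_clusters(spans):
    """Group spans that share a name and identical arguments; keep groups of 2+."""
    groups = defaultdict(list)
    for s in spans:
        key = (s["name"], json.dumps(s["args"], sort_keys=True))
        groups[key].append(s["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def critical_path(spans, root_id):
    """Walk down from the root, always following the child that finishes last."""
    children = defaultdict(list)
    for s in spans:
        children[s["parent"]].append(s)
    path, current = [root_id], root_id
    while children[current]:
        current = max(children[current], key=lambda s: s["end"])["id"]
        path.append(current)
    return path


print(redundant_clusters(spans))     # [['a', 'b']]
print(critical_path(spans, "run"))   # ['run', 'c', 'd']
```

Exact-argument matching is the strictest possible definition of redundancy. A real diagnostics layer would likely also need to consider near-duplicate arguments and whether a repeat was a deliberate retry.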
After each run, AgentTrace can ask an LLM to review the recorded execution flow and flag:
- redundant tool calls
- wrong tool choices
- suspicious fallback behavior
- unnecessary skill execution
- likely execution-flow issues
Review strictness is configurable:
- review_level=1 → tolerant
- review_level=2 → balanced (default)
- review_level=3 → strict
At review_level=1 or review_level=2, the UI hides low-severity findings by default.
At review_level=3, all findings are shown.
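The default visibility rule can be expressed as a small filter. This is a sketch under stated assumptions: the `Finding` class and the `"medium"` / `"high"` severity names are hypothetical, and only the rule itself (levels 1 and 2 hide low-severity findings, level 3 shows everything) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical review finding; not AgentTrace's actual data model."""
    message: str
    severity: str  # assumed values: "low", "medium", or "high"


def visible_findings(findings, review_level=2):
    """Apply the default rule: levels 1 and 2 hide low-severity findings."""
    if review_level not in (1, 2, 3):
        raise ValueError("review_level must be 1, 2, or 3")
    if review_level == 3:
        return list(findings)
    return [f for f in findings if f.severity != "low"]


findings = [
    Finding("get_weather called twice with the same city", "low"),
    Finding("fallback triggered without a preceding failure", "high"),
]
print(len(visible_findings(findings, review_level=2)))  # 1
print(len(visible_findings(findings, review_level=3)))  # 2
```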
AgentTrace includes a local dashboard at:
http://localhost:3500
Current UI features include:
- session list
- execution timeline
- parallel-lane view
- collapsed repeated-tool clusters
- prompt / response modal for LLM spans
- execution-state tabs
- diagnostics panel
- LLM review panel
- collapsible final agent output
PyPI:
pip install agenttrace-runtime
If your mirror has not synced the package yet, install from the official PyPI index:
pip install -i https://pypi.org/simple agenttrace-runtime
If you want to install from source instead:
git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e .

import agenttrace
from my_agent import run
agenttrace.patch(
"my_agent.tools",
"my_agent.skills",
"my_agent.llm",
llm_modules=["my_agent.llm"],
skill_modules=["my_agent.skills"],
review_level=2,
)
output = agenttrace.session("Check the weather in Beijing and compute 1+2")(run)("Check the weather in Beijing and compute 1+2")
print(output)
print(agenttrace.last_result().summary())

from agenttrace.dashboard.server import start_server
start_server(port=3500)
Open:
http://localhost:3500
This repo includes a demo agent that intentionally exercises multiple tracing scenarios:
- bash
- read
- grep
- calculate
- get_weather
- flaky_weather
- weather_report_skill
- parallel weather queries
- fallback to stable tools
Run it:
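python examples/demo_agent/main.py
Stress prompt:
Analyze the project in the current directory; bash pwd; read examples/demo_agent/tools.py; grep calculate examples/demo_agent; check the weather in Beijing and Xi'an, and compute 1123123123+1283123; generate a Beijing weather report; finally, summarize.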
AgentTrace now includes an initial protocol-based ingestion path for non-Python agents.
Start the ingest server:
from agenttrace import start_ingest_server
start_ingest_server(port=7760)
Then send protocol events to:
- POST /api/v1/events
- POST /api/v1/events/batch
The protocol draft lives in:
docs/protocol-v0.1.md
This is the recommended direction for Go / Node / Java style agents that cannot use the native Python patch/session integration.
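For a sense of what a client might do, the sketch below builds a request against the two endpoints listed above. Python is used only for illustration; a Go, Node, or Java agent would make the equivalent HTTP call. The JSON encoding, the use of a JSON array for batches, and every field in the example payload are assumptions; the authoritative event format is whatever docs/protocol-v0.1.md specifies.

```python
import json
import urllib.request


def build_event_request(events, host="localhost", port=7760):
    """Build (but do not send) a POST request for one event or a list of events."""
    is_batch = isinstance(events, list)
    path = "/api/v1/events/batch" if is_batch else "/api/v1/events"
    body = json.dumps(events).encode("utf-8")  # assumption: the server accepts JSON
    return urllib.request.Request(
        f"http://{host}:{port}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder payload: these field names are hypothetical, not taken from the protocol draft.
event = {"type": "span_start", "span_id": "tool-1", "name": "get_weather"}
req = build_event_request(event)
print(req.get_method(), req.full_url)

# To actually send it, with the ingest server running:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.status)
```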
AgentTrace works best for:
- custom Python agents with source code
- local development environments
- CLI / hook-based agents
- runtime debugging and diagnostics workflows
The default integration style is intentionally lightweight:
- patch modules once
- wrap runs with session(...)
- inspect results locally
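The "patch modules once" idea can be illustrated with a generic monkey-patching sketch. This is not AgentTrace's implementation: `TRACE`, `traced`, and `patch_module` are hypothetical names, and the code only shows the general technique of wrapping a module's functions so each call leaves a record behind.

```python
import functools
import time
import types

TRACE = []  # collected span records for the current run


def traced(fn, span_type):
    """Wrap a function so each call appends a span record to TRACE."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return fn(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            TRACE.append({
                "name": fn.__name__,
                "type": span_type,
                "status": status,
                "latency": time.perf_counter() - start,
            })
    return wrapper


def patch_module(module, span_type="tool"):
    """Replace every public plain function on a module with a traced wrapper."""
    for name, obj in list(vars(module).items()):
        if isinstance(obj, types.FunctionType) and not name.startswith("_"):
            setattr(module, name, traced(obj, span_type))


# Stand-in for a user's tools module.
tools = types.ModuleType("my_tools")
def calculate(a, b):
    return a + b
tools.calculate = calculate

patch_module(tools)
print(tools.calculate(1, 2))
print(TRACE[0]["name"], TRACE[0]["status"])
```

One general caveat with this technique: code that ran `from my_tools import calculate` before patching keeps a reference to the unwrapped function, so in a sketch like this the patch has to happen before such imports are resolved.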
For non-Python agents, AgentTrace is evolving toward a protocol-based model. Current repository drafts include:
- docs/protocol-v0.1.md
- docs/agenttrace-go-adapter-v0.md
- docs/agenttrace-go-api-sketch.md
- sdk/go/agenttracego/ (prototype)
AgentTrace is currently optimized as:
- a runtime tracing tool
- a local-first diagnostics tool
- a developer-facing execution inspector
It is not currently focused on being:
- a hosted eval platform
- a benchmark leaderboard
- a dataset management system
- a full SaaS observability suite
AgentTrace is especially useful for:
- engineers building custom agents
- teams debugging real runtime behavior
- people who need local-first execution visibility
- anyone who wants to inspect agent decisions beyond final output quality
Current direction is intentionally focused:
- stronger execution tracing
- better diagnostics and issue localization
- cleaner runtime state modeling
- broader integration patterns for source-based agents
- more production-friendly export / observability hooks
The goal is to keep AgentTrace useful as a general execution-flow listener, not to turn it into a bloated all-in-one platform too early.
Although AgentTrace centers on tracing and diagnostics, it still retains objective runtime metrics such as:
- total latency
- avg / p95 step latency
- tool success rate
- token usage
- estimated cost
- step efficiency
- correctness (if expected_output is provided)
- regression tracking
- comparison helpers
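A few of these metrics are simple aggregates over per-step records. The sketch below assumes each step is a dict with hypothetical `latency` and `ok` keys and uses the nearest-rank definition of p95; AgentTrace may compute these differently.

```python
import math
import statistics


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(steps):
    """steps: list of dicts with 'latency' (seconds) and 'ok' (bool); keys are illustrative."""
    latencies = [s["latency"] for s in steps]
    return {
        "total_latency": sum(latencies),
        "avg_latency": statistics.mean(latencies),
        "p95_latency": percentile_nearest_rank(latencies, 95),
        "success_rate": sum(s["ok"] for s in steps) / len(steps),
    }


steps = [
    {"latency": 0.5, "ok": True},
    {"latency": 1.0, "ok": True},
    {"latency": 1.5, "ok": False},
    {"latency": 2.0, "ok": True},
]
print(summarize(steps))
```

Summing step latencies matches wall-clock time only when steps run sequentially. For runs with parallel steps, total latency would need to come from the run's own start and end timestamps.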
Contributions are welcome — especially around:
- new agent integrations
- richer diagnostics
- runtime state capture
- dashboard usability
- packaging and release polish
For local development:
git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e ".[dev]"
pytest tests/
If you want to contribute, small focused improvements are preferred over large platform-style expansions.
MIT