happli-sys/AgentTrace

AgentTrace

Open-source runtime tracing and diagnostics for AI agent execution flows.

AgentTrace helps you understand what your agent actually did at runtime — not just whether the final answer looks good.

It is built for people who want to answer questions like:

  • Why did the agent call this tool twice?
  • Where did the latency actually come from?
  • Which fallback path was triggered?
  • What did the LLM see before it made this decision?
  • Was the execution flow correct, redundant, or suspicious?

If you want something closer to pprof + tracing + agent diagnostics, AgentTrace is designed for that.


Why AgentTrace

Most agent tooling focuses on one of two things:

  • output evaluation — “was the answer good?”
  • framework abstraction — “how do I build the agent?”

AgentTrace focuses on a different question:

What exactly happened during execution, and why did the agent behave that way?

That makes it especially useful for:

  • debugging execution flow
  • diagnosing redundancy and fallback behavior
  • inspecting LLM prompts / responses in context
  • understanding tool usage patterns
  • tracing runtime state across a run

Core capabilities

  • Trace LLM / Tool / Skill execution flows
  • Capture parallel, retry, fallback, and repeated-call patterns
  • Record Prompt / Response / Context / Plan / Execution snapshots
  • Persist runs locally and inspect them in a built-in dashboard
  • Review runs with an LLM after execution
  • Generate structured diagnostics: critical path, recovery chains, redundant calls, suspicious decisions

What you get

Execution tracing

AgentTrace records a runtime trace for each run, including:

  • span type
  • start / end time
  • latency
  • status
  • input parameters
  • grouping and parent-child relationships
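To make the recorded fields concrete, here is a minimal sketch of what one span record might look like. The class and field names (`Span`, `span_id`, `parent_id`, `group`) are illustrative assumptions, not AgentTrace's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Span:
    """Illustrative span record; names are assumptions, not AgentTrace's schema."""
    span_id: str
    span_type: str                      # e.g. "llm", "tool", "skill"
    name: str
    start: float                        # seconds since the run started
    end: float
    status: str = "ok"                  # e.g. "ok" or "error"
    inputs: dict[str, Any] = field(default_factory=dict)
    parent_id: Optional[str] = None     # parent-child relationship
    group: Optional[str] = None         # e.g. a label shared by parallel spans

    @property
    def latency(self) -> float:
        # Latency is derived from the start and end timestamps.
        return self.end - self.start


def children_of(spans: list[Span], parent_id: str) -> list[Span]:
    """Return the spans whose parent is the given span id."""
    return [s for s in spans if s.parent_id == parent_id]
```

Storing only a parent id on each span keeps records flat and easy to persist; the tree can be rebuilt on demand with a helper like `children_of`.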

Structured state snapshots

For LLM spans, AgentTrace can capture:

  • ContextSnapshot
  • MemorySnapshot
  • PlanSnapshot
  • DecisionSnapshot
  • ResumeSnapshot
  • ExecutionSnapshot

Diagnostics

AgentTrace builds a diagnostics layer on top of the raw trace:

  • critical path
  • failed tool calls
  • recovery chains
  • redundant tool clusters
  • suspicious decisions
  • filtered review findings
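As an illustration of one such diagnostic, redundant tool calls can be approximated by grouping calls that share a tool name and identical arguments. This is a simplified sketch of the idea, not AgentTrace's own detection logic; the `(tool_name, args)` input shape is an assumption.

```python
import json
from collections import defaultdict


def find_redundant_calls(calls):
    """Group tool calls that share the same tool name and arguments.

    `calls` is a list of (tool_name, args_dict) tuples in execution order
    (an assumed shape for this sketch). Returns a dict mapping
    (tool_name, canonical_args_json) to the indices of the matching calls,
    keeping only groups that occur more than once.
    """
    groups = defaultdict(list)
    for index, (tool, args) in enumerate(calls):
        # sort_keys makes the key independent of argument ordering.
        key = (tool, json.dumps(args, sort_keys=True))
        groups[key].append(index)
    return {key: idxs for key, idxs in groups.items() if len(idxs) > 1}
```

A repeated call is not always a mistake (retries after a failure are legitimate), so a real diagnostic would likely also look at call status and timing before flagging a cluster.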

LLM review

After each run, AgentTrace can ask an LLM to review the recorded execution flow and flag:

  • redundant tool calls
  • wrong tool choices
  • suspicious fallback behavior
  • unnecessary skill execution
  • likely execution-flow issues

Review strictness is configurable:

  • review_level=1 → tolerant
  • review_level=2 → balanced (default)
  • review_level=3 → strict

At review_level=1 or review_level=2, the UI hides low-severity findings by default. At review_level=3, all findings are shown.
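The default visibility rule can be sketched as a small filter. The finding shape (a dict with a `"severity"` key) and the function name are assumptions made for illustration only.

```python
def visible_findings(findings, review_level=2):
    """Return the findings shown by default for a given review level.

    Hypothetical helper: each finding is assumed to be a dict with a
    "severity" key such as "low", "medium", or "high".
    """
    if review_level not in (1, 2, 3):
        raise ValueError("review_level must be 1, 2, or 3")
    if review_level == 3:
        # Strict mode: nothing is hidden.
        return list(findings)
    # Tolerant and balanced modes hide low-severity findings by default.
    return [f for f in findings if f["severity"] != "low"]
```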


Dashboard

AgentTrace includes a local dashboard at:

  • http://localhost:3500

Current UI features include:

  • session list
  • execution timeline
  • parallel-lane view
  • collapsed repeated-tool clusters
  • prompt / response modal for LLM spans
  • execution-state tabs
  • diagnostics panel
  • LLM review panel
  • collapsible final agent output

Quick start

1. Install

PyPI:

pip install agenttrace-runtime

If your mirror has not synced the package yet, install from the official PyPI index:

pip install -i https://pypi.org/simple agenttrace-runtime

If you want to install from source instead:

git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e .

2. Patch once, trace every run

import agenttrace
from my_agent import run

agenttrace.patch(
    "my_agent.tools",
    "my_agent.skills",
    "my_agent.llm",
    llm_modules=["my_agent.llm"],
    skill_modules=["my_agent.skills"],
    review_level=2,
)

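prompt = "Check the weather in Beijing and calculate 1+2"
# Wrap run with session(...), then call the wrapped function with the prompt.
traced_run = agenttrace.session(prompt)(run)
output = traced_run(prompt)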
print(output)
print(agenttrace.last_result().summary())

3. Start the dashboard

from agenttrace.dashboard.server import start_server

start_server(port=3500)

Open:

  • http://localhost:3500

Demo agent

This repo includes a demo agent that intentionally exercises multiple tracing scenarios:

  • bash
  • read
  • grep
  • calculate
  • get_weather
  • flaky_weather
  • weather_report_skill
  • parallel weather queries
  • fallback to stable tools

Run it:

python examples/demo_agent/main.py

Stress prompt:

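Analyze the project in the current directory; bash pwd; read examples/demo_agent/tools.py; grep calculate examples/demo_agent; check the weather in Beijing and Xi'an, and calculate 1123123123+1283123; generate a Beijing weather report; finally, summarize.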

Protocol ingest (for non-Python agents)

AgentTrace now includes an initial protocol-based ingestion path for non-Python agents.

Start the ingest server:

from agenttrace import start_ingest_server

start_ingest_server(port=7760)

Then send protocol events to:

  • POST /api/v1/events
  • POST /api/v1/events/batch

The protocol draft lives in:

  • docs/protocol-v0.1.md

This is the recommended direction for agents written in Go, Node, Java, or similar languages that cannot use the native Python patch/session integration.
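For a rough idea of what a client might do, the sketch below builds a POST request for the single-event endpoint using only the standard library. The host (`localhost`), the JSON content type, and every field in `example_event` are assumptions; the real event schema is defined in docs/protocol-v0.1.md.

```python
import json
import urllib.request

# Port taken from the start_ingest_server(port=7760) snippet; host is assumed.
INGEST_URL = "http://localhost:7760/api/v1/events"


def build_event_request(event: dict, url: str = INGEST_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying one event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload: these field names are placeholders, not the protocol.
example_event = {"type": "span_start", "run_id": "run-1", "name": "get_weather"}
request = build_event_request(example_event)
# To actually send it: urllib.request.urlopen(request, timeout=5)
```

A client in Go, Node, or Java would follow the same pattern with its own HTTP library, using the batch endpoint when several events are buffered.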


Integration model

AgentTrace works best for:

  • custom Python agents with source code
  • local development environments
  • CLI / hook-based agents
  • runtime debugging and diagnostics workflows

The default integration style is intentionally lightweight:

  • patch modules once
  • wrap runs with session(...)
  • inspect results locally
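To show why patching a module once is enough, here is a minimal sketch of the general monkey-patching technique: every public function in a module is replaced by a wrapper that records timing and status. This illustrates the idea only and is not AgentTrace's implementation; `patch_module` and the record fields are hypothetical.

```python
import functools
import time
import types


def _traced(name, func, records):
    """Wrap func so that each call appends a timing record to `records`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = func(*args, **kwargs)
            status = "ok"
            return result
        finally:
            records.append({
                "name": name,
                "status": status,
                "latency": time.perf_counter() - start,
            })
    wrapper._is_traced = True  # marker so a second patch does not double-wrap
    return wrapper


def patch_module(module, records):
    """Replace every public function found in `module` with a traced wrapper."""
    for name, obj in list(vars(module).items()):
        if name.startswith("_") or not isinstance(obj, types.FunctionType):
            continue
        if getattr(obj, "_is_traced", False):
            continue
        setattr(module, name, _traced(name, obj, records))
```

Because the wrapper is installed on the module itself, callers that look the function up through the module at call time are traced without any change to agent code. Callers that imported the function by name before patching keep the original reference, which is one reason to patch before importing the agent entry point.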

For non-Python agents, AgentTrace is evolving toward a protocol-based model. Current repository drafts include:

  • docs/protocol-v0.1.md
  • docs/agenttrace-go-adapter-v0.md
  • docs/agenttrace-go-api-sketch.md
  • sdk/go/agenttracego/ (prototype)

Project scope

AgentTrace is currently optimized as:

  • a runtime tracing tool
  • a local-first diagnostics tool
  • a developer-facing execution inspector

It is not currently focused on being:

  • a hosted eval platform
  • a benchmark leaderboard
  • a dataset management system
  • a full SaaS observability suite

Who this is for

AgentTrace is especially useful for:

  • engineers building custom agents
  • teams debugging real runtime behavior
  • people who need local-first execution visibility
  • anyone who wants to inspect agent decisions beyond final output quality

Roadmap direction

Current direction is intentionally focused:

  • stronger execution tracing
  • better diagnostics and issue localization
  • cleaner runtime state modeling
  • broader integration patterns for source-based agents
  • more production-friendly export / observability hooks

The goal is to keep AgentTrace useful as a general execution-flow listener, not to turn it into a bloated all-in-one platform too early.


Still useful for objective metrics

Although AgentTrace centers on tracing and diagnostics, it still retains objective runtime metrics such as:

  • total latency
  • avg / p95 step latency
  • tool success rate
  • token usage
  • estimated cost
  • step efficiency
  • correctness (if expected_output is provided)
  • regression tracking
  • comparison helpers
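A few of these metrics are simple to derive from recorded steps. The sketch below assumes each step is a dict with `"latency"`, `"kind"`, and `"ok"` keys and uses the nearest-rank percentile definition; both are illustrative choices, and AgentTrace may compute its metrics differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(steps):
    """Summarize steps given as dicts with "latency" (seconds), "kind", "ok"."""
    if not steps:
        raise ValueError("steps must not be empty")
    latencies = [s["latency"] for s in steps]
    tools = [s for s in steps if s["kind"] == "tool"]
    return {
        # Sum of step latencies; wall-clock time is lower when steps overlap.
        "total_latency": sum(latencies),
        "avg_latency": sum(latencies) / len(latencies),
        "p95_latency": percentile(latencies, 95),
        "tool_success_rate": (
            sum(1 for s in tools if s["ok"]) / len(tools) if tools else None
        ),
    }
```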

Contributing

Contributions are welcome — especially around:

  • new agent integrations
  • richer diagnostics
  • runtime state capture
  • dashboard usability
  • packaging and release polish

For local development:

git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e ".[dev]"
pytest tests/

If you want to contribute, small focused improvements are preferred over large platform-style expansions.


License

MIT
