feat(observability): Add Weave integration for agent tracing #1446
Draft: morganmcg1 wants to merge 8 commits into OpenHands:main from morganmcg1:add-weave-observability
Add Weights & Biases Weave integration to the SDK observability module, providing comprehensive tracing capabilities for agent operations.
New features:
- weave_op decorator for tracing functions
- observe_weave decorator with Laminar-compatible interface
- weave_thread context manager for grouping related operations
- WeaveSpanManager for manual span management
- Auto-initialization via environment variables (WANDB_API_KEY, WEAVE_PROJECT)
Files added:
- openhands-sdk/openhands/sdk/observability/weave.py
- examples/weave_observability_demo.py
- tests/sdk/observability/test_weave.py
Dependencies:
- Added weave>=0.52.22 to openhands-sdk dependencies
Co-authored-by: openhands <openhands@all-hands.dev>
Key improvements:
- Simplified integration by leveraging Weave's built-in LiteLLM autopatching
- When init_weave() is called, all LiteLLM calls are automatically traced
- No manual decoration needed for LLM calls; just call init_weave()
- Added weave_attributes() context manager for conversation grouping
- Added get_weave_op() for dynamic decorator access
- Updated @weave_op to support both @weave_op and @weave_op(...) syntax
- Improved documentation explaining the autopatching approach
- Updated demo script to showcase automatic LLM tracing
- Added tests for autopatching behavior
The SDK uses LiteLLM for all LLM calls. Weave automatically patches LiteLLM when initialized, so users get full tracing with minimal setup.
Co-authored-by: openhands <openhands@all-hands.dev>
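For readers unfamiliar with the dual-syntax pattern mentioned above, here is a minimal sketch of a decorator that accepts both `@weave_op` and `@weave_op(...)` and degrades to a plain call when tracing is off. It is not the code from this PR: the `_weave_initialized` flag and the lazy wrapping are assumptions made for illustration, and only `weave.op` (with its `name` argument) is taken from the Weave package.

```python
import functools
from typing import Any, Callable, Optional

# Hypothetical module-level flag; the real module may track state differently.
_weave_initialized = False


def weave_op(func: Optional[Callable] = None, *, name: Optional[str] = None):
    """Usable as a bare decorator (@weave_op) or with arguments (@weave_op(name=...))."""

    def decorate(fn: Callable) -> Callable:
        traced: Optional[Callable] = None  # built lazily on the first traced call

        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            nonlocal traced
            if not _weave_initialized:
                # No-op path: behave exactly like the undecorated function.
                return fn(*args, **kwargs)
            if traced is None:
                import weave  # imported only when tracing is actually enabled

                traced = weave.op(name=name)(fn) if name else weave.op(fn)
            return traced(*args, **kwargs)

        return wrapper

    # Bare usage passes the function directly; parameterized usage passes None.
    if func is not None:
        return decorate(func)
    return decorate


@weave_op
def add(a: int, b: int) -> int:
    return a + b


@weave_op(name="multiply")
def mul(a: int, b: int) -> int:
    return a * b
```

Deferring the `weave` import to call time is one way to keep the decorator safe to apply at module import, before any initialization has happened.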
Integrates Weave threading into LocalConversation.run() to automatically group all operations (LLM calls, traced functions) under the conversation ID.
Key changes:
- Added _get_weave_thread_context() helper that returns weave.thread() if Weave is initialized, otherwise a nullcontext (no-op)
- Wrapped the run loop with the Weave thread context
- All LLM calls (autopatched via Weave's LiteLLM integration) and @weave_op decorated functions are now grouped by conversation
This enables conversation-level tracing in the Weave UI, similar to the OpenHands PR #12056 approach but adapted for the SDK architecture.
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Introduces a unified observability context management system that allows
multiple observability tools (Weave, Laminar, etc.) to work together
seamlessly.
Key changes:
- Added context.py with provider registry pattern
- get_conversation_context() composes context managers from all enabled tools
- Built-in providers for Weave (weave.thread) and Laminar (span with session_id)
- LocalConversation.run() now uses the generic get_conversation_context()
- Easy to add new observability tools via register_conversation_context_provider()
Design benefits:
- SDK is agnostic to which observability tools are enabled
- Graceful degradation when tools are not initialized
- Exception in one provider doesn't break others
- Single integration point in LocalConversation
Usage for adding new tools:
    from contextlib import nullcontext

    from openhands.sdk.observability import register_conversation_context_provider

    def get_my_tool_context(conversation_id: str):
        # Fall back to a no-op context when the tool is not initialized.
        if not is_my_tool_initialized():
            return nullcontext()
        return my_tool.thread(conversation_id)

    register_conversation_context_provider(get_my_tool_context)
Co-authored-by: openhands <openhands@all-hands.dev>
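To make the provider registry pattern concrete, here is a minimal, self-contained sketch of how context managers from several providers could be composed while isolating failures. The function names `register_conversation_context_provider` and `get_conversation_context` come from the commit message; the internals (a module-level list, `ExitStack`, debug logging on failure) are assumptions and may differ from context.py in the PR.

```python
import logging
from contextlib import AbstractContextManager, ExitStack, contextmanager
from typing import Callable, Iterator, List

logger = logging.getLogger(__name__)

# A provider maps a conversation ID to a context manager.
ContextProvider = Callable[[str], AbstractContextManager]

# Assumed storage: a simple module-level list of providers.
_providers: List[ContextProvider] = []


def register_conversation_context_provider(provider: ContextProvider) -> None:
    _providers.append(provider)


@contextmanager
def get_conversation_context(conversation_id: str) -> Iterator[None]:
    """Enter every provider's context; skip providers that fail to set up."""
    with ExitStack() as stack:
        for provider in _providers:
            try:
                stack.enter_context(provider(conversation_id))
            except Exception:
                # One failing tool should not prevent the others from tracing.
                logger.debug("provider %r failed; skipping", provider, exc_info=True)
        yield
```

`ExitStack` unwinds the entered contexts in reverse order on exit, so each tool sees a properly nested span even when the set of active tools varies at runtime.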
Introduces a unified tool tracing system that works across all enabled observability tools (Weave, Laminar, etc.).
Key additions:
- trace_tool_call(): Context manager for tracing tool executions
- traced_tool(): Decorator for tracing tool functions
- trace_mcp_list_tools(): Context manager for MCP tool listing
- trace_mcp_call_tool(): Context manager for MCP tool calls
- Tool trace provider registry (similar to conversation context providers)
Integration points:
- Agent._execute_action_event() now uses trace_tool_call() for all tools
- MCPToolExecutor.call_tool() uses trace_mcp_call_tool()
- MCP utils._list_tools() uses trace_mcp_list_tools()
What gets traced:
- Tool name
- Tool inputs (safely serialized)
- Tool type (TOOL, MCP_TOOL, MCP_LIST)
- Execution duration (via context manager)
Design benefits:
- Single API for all observability tools
- Easy to add new observability providers
- Graceful degradation when tools not initialized
- Backward compatible with existing Laminar @observe decorators
Co-authored-by: openhands <openhands@all-hands.dev>
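As an illustration of the "what gets traced" list, a context manager along these lines could capture the tool name, safely serialized inputs, tool type, and duration. This is a sketch only: the signature of `trace_tool_call`, the record layout, and the `_safe_serialize` helper are assumptions, and the sketch records into a plain dict instead of forwarding to any observability backend.

```python
import json
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator


def _safe_serialize(value: Any) -> str:
    """Hypothetical helper: never raise while serializing tool inputs."""
    try:
        # default=repr covers objects that json cannot encode natively.
        return json.dumps(value, default=repr)
    except (TypeError, ValueError):
        return repr(value)


@contextmanager
def trace_tool_call(
    tool_name: str, inputs: Any, tool_type: str = "TOOL"
) -> Iterator[Dict[str, Any]]:
    record: Dict[str, Any] = {
        "tool_name": tool_name,
        "tool_type": tool_type,  # e.g. TOOL, MCP_TOOL, MCP_LIST
        "inputs": _safe_serialize(inputs),
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        # Duration is recorded even if the tool raises.
        record["duration_s"] = time.perf_counter() - start
```

Measuring in a `finally` block means failed tool calls still report how long they ran.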
Simplifies the Weave observability integration to be more elegant:
1. **Removed complex tool tracing system**:
   - Removed trace_tool_call, traced_tool, trace_mcp_* functions
   - Removed tool trace provider registry
   - These were over-engineered; Weave's autopatching handles LLM tracing
2. **Simplified weave.py**:
   - Kept only essential functions: init_weave, maybe_init_weave, weave_op
   - Removed WeaveSpanManager, observe_weave, weave_attributes, weave_thread
   - Users can use weave.op and weave.thread directly from the weave package
3. **Key exports**:
   - init_weave(): Initialize Weave with autopatching
   - maybe_init_weave(): Conditional init based on env vars
   - weave_op(): Decorator wrapper that's a no-op when not initialized
   - get_weave_op(): Get weave.op or no-op decorator
4. **Design philosophy**:
   - Weave autopatching traces all LiteLLM calls automatically
   - Use @weave.op directly for custom function tracing
   - Use weave.thread() directly for conversation grouping
   - Keep SDK integration minimal and non-invasive
This matches the approach in the OpenHands PR #12056.
Co-authored-by: openhands <openhands@all-hands.dev>
- Make weave an optional dependency (install with pip install openhands-sdk[weave])
- Add auto-init via maybe_init_weave() at module load (matches Laminar pattern)
- Simplify demo to use only existing functions
- Fix tests for optional dependency handling
- Remove unrelated changes from PR scope
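A rough sketch of how an optional dependency and environment-driven auto-init can fit together is shown below. It is not the PR's implementation: the `env` parameter and the rule that both WANDB_API_KEY and WEAVE_PROJECT must be set are assumptions made here; the variable names, `maybe_init_weave`, and `weave.init` come from the PR text and the Weave package.

```python
import os
from typing import Mapping, Optional

try:
    import weave  # optional extra: pip install openhands-sdk[weave]
except ImportError:
    weave = None


def maybe_init_weave(env: Optional[Mapping[str, str]] = None) -> bool:
    """Initialize Weave only when it is installed and configured.

    Returns True if initialization was attempted, False otherwise.
    The `env` argument is a hypothetical hook that makes the check testable.
    """
    env = os.environ if env is None else env
    project = env.get("WEAVE_PROJECT")
    api_key = env.get("WANDB_API_KEY")
    # Assumption: both variables are required before auto-init kicks in.
    if weave is None or not project or not api_key:
        return False
    # weave.init is expected to pick up the API key from the process environment.
    weave.init(project)
    return True
```

Calling `maybe_init_weave()` at module load then costs nothing for users who have not installed the extra or set the variables.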
neubig (Contributor) reviewed on Dec 19, 2025 and left a comment:
Nice, thanks @morganmcg1! Tell me when this is ready for a look and I'll review.
Summary
This PR adds Weights & Biases Weave integration to the SDK observability module, providing comprehensive tracing capabilities for agent operations. This is a port of the observability code from OpenHands PR #12056, adapted to work cleanly with the existing SDK architecture.
Features
New Decorators
- @weave_op: Decorator for tracing functions with Weave, supporting custom names, display names, and input/output postprocessing
- @observe_weave: Laminar-compatible decorator that provides a unified interface for observability
Context Management
- weave_thread: Context manager for grouping related operations (e.g., all events in a conversation) under a single trace hierarchy
Manual Span Management
- WeaveSpanManager: Class for manual span lifecycle management
- start_weave_span / end_weave_span: Global functions for manual span control
Configuration
- WANDB_API_KEY: Your Weights & Biases API key
- WEAVE_PROJECT: The Weave project name (e.g., "my-team/my-project")
- init_weave(project, api_key)
Usage Example
Files Changed
- openhands-sdk/openhands/sdk/observability/weave.py - Main Weave integration module
- examples/weave_observability_demo.py - Demo script showing usage
- tests/sdk/observability/test_weave.py - Unit tests (16 tests, all passing)
- openhands-sdk/openhands/sdk/observability/__init__.py - Export new functions
- openhands-sdk/pyproject.toml - Add weave>=0.52.22 dependency
Design Decisions
- Graceful Degradation: All decorators and functions work as no-ops when Weave is not initialized, allowing code to run without tracing when credentials are not available.
- Laminar Compatibility: The observe_weave decorator provides a similar interface to the existing observe decorator from Laminar, making it easy to switch between backends.
- Thread Grouping: The weave_thread context manager allows grouping related operations (like all events in a conversation) under a single trace hierarchy, similar to the pattern used in the OpenHands PR.
- Automatic wandb Login: The init_weave function automatically handles wandb login when an API key is provided.
Testing
All 16 unit tests pass.
Related