feat(observability): Add Weave integration for agent tracing #1446

morganmcg1 · 2025-12-18T22:31:33Z

Summary

This PR adds Weights & Biases Weave integration to the SDK observability module, providing comprehensive tracing capabilities for agent operations. This is a port of the observability code from OpenHands PR #12056, adapted to work cleanly with the existing SDK architecture.

Features

New Decorators

@weave_op: Decorator for tracing functions with Weave, supporting custom names, display names, and input/output postprocessing
@observe_weave: Laminar-compatible decorator that provides a unified interface for observability

Context Management

weave_thread: Context manager for grouping related operations (e.g., all events in a conversation) under a single trace hierarchy

Manual Span Management

WeaveSpanManager: Class for manual span lifecycle management
start_weave_span / end_weave_span: Global functions for manual span control

Configuration

Auto-initialization via environment variables:
- WANDB_API_KEY: Your Weights & Biases API key
- WEAVE_PROJECT: The Weave project name (e.g., "my-team/my-project")
Programmatic initialization via init_weave(project, api_key)
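
The environment-variable gate can be pictured with a minimal sketch. The function name `should_enable_weave` is inferred from the test names listed under Testing; its signature and the optional `env` parameter are assumptions made for illustration, not the module's confirmed API.

```python
import os
from typing import Mapping, Optional


def should_enable_weave(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when both variables are set and non-empty.

    `env` is a hypothetical injection point that makes the check easy to
    test; by default the process environment is used.
    """
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY")) and bool(env.get("WEAVE_PROJECT"))
```

With this shape, a missing API key or a missing project name each disables auto-initialization on its own.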

Usage Example

from openhands.sdk.observability import (
    maybe_init_weave,
    weave_op,
    weave_thread,
    observe_weave,
)

# Auto-initialize from environment variables
maybe_init_weave()

# Trace a function
@weave_op(name="process_message")
def process_message(message: str) -> dict:
    return {"processed": True, "message": message}

# Group operations under a conversation thread
with weave_thread("conversation-123"):
    result = process_message("Hello!")

Files Changed

New: openhands-sdk/openhands/sdk/observability/weave.py - Main Weave integration module
New: examples/weave_observability_demo.py - Demo script showing usage
New: tests/sdk/observability/test_weave.py - Unit tests (16 tests, all passing)
Modified: openhands-sdk/openhands/sdk/observability/__init__.py - Export new functions
Modified: openhands-sdk/pyproject.toml - Add weave>=0.52.22 dependency

Design Decisions

Graceful Degradation: All decorators and functions work as no-ops when Weave is not initialized, allowing code to run without tracing when credentials are not available.
Laminar Compatibility: The observe_weave decorator provides a similar interface to the existing observe decorator from Laminar, making it easy to switch between backends.
Thread Grouping: The weave_thread context manager allows grouping related operations (like all events in a conversation) under a single trace hierarchy, similar to the pattern used in the OpenHands PR.
Automatic wandb Login: The init_weave function automatically handles wandb login when an API key is provided.
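
The graceful-degradation idea can be sketched as a decorator that simply calls through when tracing is off. Everything here is hypothetical: `noop_safe_op`, the `_tracing_enabled` flag, and the `traced_calls` list stand in for the real module's state and for the hand-off to Weave, which this sketch does not import.

```python
import functools
from typing import Any, Callable, Optional

# Hypothetical flag; the real module tracks initialization in its own way.
_tracing_enabled = False
# Hypothetical stand-in for a tracing backend: names of traced calls.
traced_calls: list = []


def noop_safe_op(func: Optional[Callable] = None, *, name: Optional[str] = None):
    """Decorator usable as @noop_safe_op or @noop_safe_op(name=...)."""

    def decorate(fn: Callable) -> Callable:
        @functools.wraps(fn)  # keep __name__ and __doc__ of the wrapped function
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if _tracing_enabled:
                # A real integration would delegate to the tracing library here.
                traced_calls.append(name or fn.__name__)
            # Exceptions propagate unchanged in both modes.
            return fn(*args, **kwargs)

        return wrapper

    # Bare usage passes the function directly; parameterized usage does not.
    return decorate(func) if func is not None else decorate
```

Because the wrapper never swallows exceptions and preserves metadata, decorated code behaves the same whether or not credentials are present.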

Testing

All 16 unit tests pass:

tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_should_enable_weave_with_both_vars PASSED
tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_should_enable_weave_missing_api_key PASSED
tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_should_enable_weave_missing_project PASSED
tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_is_weave_initialized_default PASSED
tests/sdk/observability/test_weave.py::TestWeaveOpDecorator::test_weave_op_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestWeaveOpDecorator::test_weave_op_preserves_function_metadata PASSED
tests/sdk/observability/test_weave.py::TestWeaveOpDecorator::test_weave_op_handles_exceptions PASSED
tests/sdk/observability/test_weave.py::TestObserveWeaveDecorator::test_observe_weave_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestObserveWeaveDecorator::test_observe_weave_with_ignore_inputs PASSED
tests/sdk/observability/test_weave.py::TestWeaveThread::test_weave_thread_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestWeaveSpanManager::test_span_manager_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestWeaveSpanManager::test_global_span_functions PASSED
tests/sdk/observability/test_weave.py::TestWeaveExports::test_all_exports_available PASSED
tests/sdk/observability/test_weave.py::TestInitWeave::test_init_weave_requires_project PASSED
tests/sdk/observability/test_weave.py::TestInitWeave::test_init_weave_uses_env_project PASSED
tests/sdk/observability/test_weave.py::TestInitWeave::test_init_weave_already_initialized PASSED

Add Weights & Biases Weave integration to the SDK observability module, providing comprehensive tracing capabilities for agent operations.

New features:
- weave_op decorator for tracing functions
- observe_weave decorator with Laminar-compatible interface
- weave_thread context manager for grouping related operations
- WeaveSpanManager for manual span management
- Auto-initialization via environment variables (WANDB_API_KEY, WEAVE_PROJECT)

Files added:
- openhands-sdk/openhands/sdk/observability/weave.py
- examples/weave_observability_demo.py
- tests/sdk/observability/test_weave.py

Dependencies:
- Added weave>=0.52.22 to openhands-sdk dependencies

Co-authored-by: openhands <openhands@all-hands.dev>

Key improvements:
- Simplified integration by leveraging Weave's built-in LiteLLM autopatching
- When init_weave() is called, all LiteLLM calls are automatically traced
- No manual decoration needed for LLM calls - just call init_weave()
- Added weave_attributes() context manager for conversation grouping
- Added get_weave_op() for dynamic decorator access
- Updated @weave_op to support both @weave_op and @weave_op(...) syntax
- Improved documentation explaining the autopatching approach
- Updated demo script to showcase automatic LLM tracing
- Added tests for autopatching behavior

The SDK uses LiteLLM for all LLM calls. Weave automatically patches LiteLLM when initialized, so users get full tracing with minimal setup.

Co-authored-by: openhands <openhands@all-hands.dev>

Integrates Weave threading into LocalConversation.run() to automatically group all operations (LLM calls, traced functions) under the conversation ID.

Key changes:
- Added _get_weave_thread_context() helper that returns weave.thread() if Weave is initialized, otherwise a nullcontext (no-op)
- Wrapped the run loop with the Weave thread context
- All LLM calls (autopatched via Weave's LiteLLM integration) and @weave_op decorated functions are now grouped by conversation

This enables conversation-level tracing in the Weave UI, similar to the OpenHands PR #12056 approach but adapted for the SDK architecture.

Co-authored-by: openhands <openhands@all-hands.dev>
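
A rough sketch of the helper described in this commit follows. The name `get_thread_context` and the `weave_module` parameter are hypothetical; the real helper checks whether Weave is initialized rather than taking the module as an argument, which is done here only so the sketch runs without Weave installed.

```python
from contextlib import nullcontext
from typing import Any, ContextManager, Optional


def get_thread_context(
    conversation_id: str, weave_module: Optional[Any] = None
) -> ContextManager:
    """Return a thread context when tracing is available, else a no-op.

    `weave_module` is assumed to expose `thread(thread_id)` returning a
    context manager, as the commit message describes for weave.thread().
    """
    if weave_module is None:
        # Tracing unavailable: the caller's `with` block still works.
        return nullcontext()
    return weave_module.thread(conversation_id)
```

The run loop can then be wrapped in `with get_thread_context(conversation_id, ...):` without branching on whether tracing is enabled.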

Co-authored-by: openhands <openhands@all-hands.dev>

Introduces a unified observability context management system that allows multiple observability tools (Weave, Laminar, etc.) to work together seamlessly.

Key changes:
- Added context.py with provider registry pattern
- get_conversation_context() composes context managers from all enabled tools
- Built-in providers for Weave (weave.thread) and Laminar (span with session_id)
- LocalConversation.run() now uses the generic get_conversation_context()
- Easy to add new observability tools via register_conversation_context_provider()

Design benefits:
- SDK is agnostic to which observability tools are enabled
- Graceful degradation when tools are not initialized
- Exception in one provider doesn't break others
- Single integration point in LocalConversation

Usage for adding new tools:

    from contextlib import nullcontext

    from openhands.sdk.observability import register_conversation_context_provider

    def get_my_tool_context(conversation_id: str):
        if not is_my_tool_initialized():
            return nullcontext()
        return my_tool.thread(conversation_id)

    register_conversation_context_provider(get_my_tool_context)

Co-authored-by: openhands <openhands@all-hands.dev>
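
One way to picture the provider registry is a list of factories whose context managers are entered together through `contextlib.ExitStack`. This is a sketch under assumptions: `register_provider` and `conversation_context` are hypothetical names, not the functions added in context.py, and the error handling shown is only one possible reading of "exception in one provider doesn't break others".

```python
from contextlib import ExitStack, contextmanager
from typing import Callable, ContextManager, Iterator, List

# A provider takes a conversation ID and returns a context manager.
Provider = Callable[[str], ContextManager]

_providers: List[Provider] = []  # hypothetical module-level registry


def register_provider(provider: Provider) -> None:
    """Add a provider; it is consulted on every conversation run."""
    _providers.append(provider)


@contextmanager
def conversation_context(conversation_id: str) -> Iterator[None]:
    """Enter every provider's context, skipping providers that fail."""
    with ExitStack() as stack:
        for provider in _providers:
            try:
                stack.enter_context(provider(conversation_id))
            except Exception:
                # A broken provider must not take down the others.
                continue
        # Contexts entered above are exited in reverse order on exit.
        yield
```

A provider that returns `nullcontext()` when its tool is not initialized fits this registry unchanged.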

@observe

Introduces a unified tool tracing system that works across all enabled observability tools (Weave, Laminar, etc.).

Key additions:
- trace_tool_call(): Context manager for tracing tool executions
- traced_tool(): Decorator for tracing tool functions
- trace_mcp_list_tools(): Context manager for MCP tool listing
- trace_mcp_call_tool(): Context manager for MCP tool calls
- Tool trace provider registry (similar to conversation context providers)

Integration points:
- Agent._execute_action_event() now uses trace_tool_call() for all tools
- MCPToolExecutor.call_tool() uses trace_mcp_call_tool()
- MCP utils._list_tools() uses trace_mcp_list_tools()

What gets traced:
- Tool name
- Tool inputs (safely serialized)
- Tool type (TOOL, MCP_TOOL, MCP_LIST)
- Execution duration (via context manager)

Design benefits:
- Single API for all observability tools
- Easy to add new observability providers
- Graceful degradation when tools not initialized
- Backward compatible with existing Laminar @observe decorators

Co-authored-by: openhands <openhands@all-hands.dev>
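
To illustrate the four traced fields listed in this commit, here is a hedged sketch of a timing context manager. The name `trace_tool_call_sketch`, the `sink` parameter, and the record layout are all assumptions for illustration; they do not reproduce the `trace_tool_call()` signature from the commit.

```python
import json
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Optional


@contextmanager
def trace_tool_call_sketch(
    tool_name: str,
    inputs: Dict[str, Any],
    tool_type: str = "TOOL",
    sink: Optional[List[dict]] = None,
) -> Iterator[dict]:
    """Record tool name, serialized inputs, type, and duration."""
    record = {
        "tool_name": tool_name,
        "tool_type": tool_type,
        # default=repr keeps serialization from failing on arbitrary objects.
        "inputs": json.dumps(inputs, default=repr),
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        # Duration is recorded even if the tool raises.
        record["duration_s"] = time.perf_counter() - start
        if sink is not None:
            sink.append(record)
```

Passing `sink=None` makes the context manager a pure timing wrapper with no side effects, which mirrors the no-op behavior expected when no observability tool is enabled.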

Simplifies the Weave observability integration to be more elegant:

1. **Removed complex tool tracing system**:
   - Removed trace_tool_call, traced_tool, trace_mcp_* functions
   - Removed tool trace provider registry
   - These were over-engineered; Weave's autopatching handles LLM tracing

2. **Simplified weave.py**:
   - Kept only essential functions: init_weave, maybe_init_weave, weave_op
   - Removed WeaveSpanManager, observe_weave, weave_attributes, weave_thread
   - Users can use weave.op and weave.thread directly from the weave package

3. **Key exports**:
   - init_weave(): Initialize Weave with autopatching
   - maybe_init_weave(): Conditional init based on env vars
   - weave_op(): Decorator wrapper that's a no-op when not initialized
   - get_weave_op(): Get weave.op or no-op decorator

4. **Design philosophy**:
   - Weave autopatching traces all LiteLLM calls automatically
   - Use @weave.op directly for custom function tracing
   - Use weave.thread() directly for conversation grouping
   - Keep SDK integration minimal and non-invasive

This matches the approach in the OpenHands PR #12056.

Co-authored-by: openhands <openhands@all-hands.dev>

- Make weave an optional dependency (install with pip install openhands-sdk[weave])
- Add auto-init via maybe_init_weave() at module load (matches Laminar pattern)
- Simplify demo to use only existing functions
- Fix tests for optional dependency handling
- Remove unrelated changes from PR scope
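
Optional-dependency handling of this kind is commonly done by probing for the package before touching it. The sketch below shows that general pattern only; `optional_package_available` and `maybe_init` are hypothetical helpers, and the PR's actual `maybe_init_weave()` may be structured differently.

```python
import importlib.util
from typing import Callable


def optional_package_available(name: str) -> bool:
    """True if the top-level package `name` is importable, without importing it."""
    return importlib.util.find_spec(name) is not None


def maybe_init(package: str, init: Callable[[], None]) -> bool:
    """Run `init` only when the optional package is installed.

    Returns whether initialization ran, so callers can fall back to no-ops.
    """
    if not optional_package_available(package):
        return False
    init()
    return True
```

Calling something like `maybe_init("weave", ...)` at module load then costs nothing for users who did not install the extra.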

neubig

Nice, thanks @morganmcg1! Tell me when this is ready for a look and I'll review.

openhands-agent and others added 8 commits December 18, 2025 22:26

docs: Update demo to highlight conversation threading feature

b7b791d

Co-authored-by: openhands <openhands@all-hands.dev>

neubig reviewed Dec 19, 2025

3 participants
