@morganmcg1

Summary

This PR adds Weights & Biases Weave integration to the SDK observability module, providing comprehensive tracing capabilities for agent operations. This is a port of the observability code from OpenHands PR #12056, adapted to work cleanly with the existing SDK architecture.

Features

New Decorators

  • @weave_op: Decorator for tracing functions with Weave, supporting custom names, display names, and input/output postprocessing
  • @observe_weave: Laminar-compatible decorator that provides a unified interface for observability

Context Management

  • weave_thread: Context manager for grouping related operations (e.g., all events in a conversation) under a single trace hierarchy

Manual Span Management

  • WeaveSpanManager: Class for manual span lifecycle management
  • start_weave_span / end_weave_span: Global functions for manual span control

Configuration

  • Auto-initialization via environment variables:
    • WANDB_API_KEY: Your Weights & Biases API key
    • WEAVE_PROJECT: The Weave project name (e.g., "my-team/my-project")
  • Programmatic initialization via init_weave(project, api_key)
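The environment-variable check can be sketched as follows. This is a minimal illustration, not the PR's implementation: the name should_enable_weave is borrowed from the test names listed under Testing, but its exact signature and behavior are assumptions.

```python
import os


def should_enable_weave() -> bool:
    """Return True only when both settings are present and non-empty.

    Assumption: auto-initialization requires WANDB_API_KEY and WEAVE_PROJECT
    together, as the configuration list above suggests.
    """
    has_key = bool(os.environ.get("WANDB_API_KEY"))
    has_project = bool(os.environ.get("WEAVE_PROJECT"))
    return has_key and has_project
```

A caller such as maybe_init_weave() could consult a check like this and return early when it is False, leaving tracing disabled.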

Usage Example

from openhands.sdk.observability import (
    maybe_init_weave,
    weave_op,
    weave_thread,
    observe_weave,
)

# Auto-initialize from environment variables
maybe_init_weave()

# Trace a function
@weave_op(name="process_message")
def process_message(message: str) -> dict:
    return {"processed": True, "message": message}

# Group operations under a conversation thread
with weave_thread("conversation-123"):
    result = process_message("Hello!")

Files Changed

  • New: openhands-sdk/openhands/sdk/observability/weave.py - Main Weave integration module
  • New: examples/weave_observability_demo.py - Demo script showing usage
  • New: tests/sdk/observability/test_weave.py - Unit tests (16 tests, all passing)
  • Modified: openhands-sdk/openhands/sdk/observability/__init__.py - Export new functions
  • Modified: openhands-sdk/pyproject.toml - Add weave>=0.52.22 dependency

Design Decisions

  1. Graceful Degradation: All decorators and functions work as no-ops when Weave is not initialized, allowing code to run without tracing when credentials are not available.

  2. Laminar Compatibility: The observe_weave decorator provides a similar interface to the existing observe decorator from Laminar, making it easy to switch between backends.

  3. Thread Grouping: The weave_thread context manager allows grouping related operations (like all events in a conversation) under a single trace hierarchy, similar to the pattern used in the OpenHands PR.

  4. Automatic wandb Login: The init_weave function automatically handles wandb login when an API key is provided.
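The graceful-degradation idea in item 1 can be illustrated with a small decorator sketch. The names weave_op_sketch and _weave_initialized are hypothetical stand-ins, and the traced branch assumes weave.op accepts a name keyword; the real weave.py may be structured differently.

```python
import functools
from typing import Any, Callable, Optional

# Hypothetical module-level flag; the real module may track state differently.
_weave_initialized = False


def weave_op_sketch(func: Optional[Callable] = None, *, name: Optional[str] = None):
    """Decorator usable as @weave_op_sketch or @weave_op_sketch(name=...)."""

    def decorate(fn: Callable) -> Callable:
        traced: Optional[Callable] = None

        @functools.wraps(fn)  # keep __name__, __doc__, etc. of the original
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            nonlocal traced
            if not _weave_initialized:
                # No-op path: call the function directly, no tracing.
                return fn(*args, **kwargs)
            if traced is None:
                import weave  # imported lazily so the dependency stays optional

                # Assumption: weave.op supports both bare and name=... forms.
                traced = weave.op(name=name)(fn) if name else weave.op(fn)
            return traced(*args, **kwargs)

        return wrapper

    # Bare usage passes the function directly; parameterized usage does not.
    return decorate(func) if func is not None else decorate
```

Checking the flag at call time rather than at decoration time means functions decorated at import still get traced if initialization happens later.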

Testing

All 16 unit tests pass:

tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_should_enable_weave_with_both_vars PASSED
tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_should_enable_weave_missing_api_key PASSED
tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_should_enable_weave_missing_project PASSED
tests/sdk/observability/test_weave.py::TestWeaveConfiguration::test_is_weave_initialized_default PASSED
tests/sdk/observability/test_weave.py::TestWeaveOpDecorator::test_weave_op_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestWeaveOpDecorator::test_weave_op_preserves_function_metadata PASSED
tests/sdk/observability/test_weave.py::TestWeaveOpDecorator::test_weave_op_handles_exceptions PASSED
tests/sdk/observability/test_weave.py::TestObserveWeaveDecorator::test_observe_weave_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestObserveWeaveDecorator::test_observe_weave_with_ignore_inputs PASSED
tests/sdk/observability/test_weave.py::TestWeaveThread::test_weave_thread_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestWeaveSpanManager::test_span_manager_without_initialization PASSED
tests/sdk/observability/test_weave.py::TestWeaveSpanManager::test_global_span_functions PASSED
tests/sdk/observability/test_weave.py::TestWeaveExports::test_all_exports_available PASSED
tests/sdk/observability/test_weave.py::TestInitWeave::test_init_weave_requires_project PASSED
tests/sdk/observability/test_weave.py::TestInitWeave::test_init_weave_uses_env_project PASSED
tests/sdk/observability/test_weave.py::TestInitWeave::test_init_weave_already_initialized PASSED

openhands-agent and others added 8 commits December 18, 2025 22:26
Add Weights & Biases Weave integration to the SDK observability module,
providing comprehensive tracing capabilities for agent operations.

New features:
- weave_op decorator for tracing functions
- observe_weave decorator with Laminar-compatible interface
- weave_thread context manager for grouping related operations
- WeaveSpanManager for manual span management
- Auto-initialization via environment variables (WANDB_API_KEY, WEAVE_PROJECT)

Files added:
- openhands-sdk/openhands/sdk/observability/weave.py
- examples/weave_observability_demo.py
- tests/sdk/observability/test_weave.py

Dependencies:
- Added weave>=0.52.22 to openhands-sdk dependencies

Co-authored-by: openhands <openhands@all-hands.dev>
Key improvements:
- Simplified integration by leveraging Weave's built-in LiteLLM autopatching
- When init_weave() is called, all LiteLLM calls are automatically traced
- No manual decoration needed for LLM calls - just call init_weave()
- Added weave_attributes() context manager for conversation grouping
- Added get_weave_op() for dynamic decorator access
- Updated @weave_op to support both @weave_op and @weave_op(...) syntax
- Improved documentation explaining the autopatching approach
- Updated demo script to showcase automatic LLM tracing
- Added tests for autopatching behavior

The SDK uses LiteLLM for all LLM calls. Weave automatically patches
LiteLLM when initialized, so users get full tracing with minimal setup.

Co-authored-by: openhands <openhands@all-hands.dev>
Integrates Weave threading into LocalConversation.run() to automatically
group all operations (LLM calls, traced functions) under the conversation ID.

Key changes:
- Added _get_weave_thread_context() helper that returns weave.thread() if
  Weave is initialized, otherwise a nullcontext (no-op)
- Wrapped the run loop with the Weave thread context
- All LLM calls (autopatched via Weave's LiteLLM integration) and
  @weave_op decorated functions are now grouped by conversation

This enables conversation-level tracing in the Weave UI, similar to
the OpenHands PR #12056 approach but adapted for the SDK architecture.
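
A helper of the kind described above might look like the sketch below. The names get_thread_context_sketch and _weave_initialized are hypothetical; only the fallback to nullcontext and the use of weave.thread() come from the commit message.

```python
from contextlib import AbstractContextManager, nullcontext

# Hypothetical flag standing in for the module's real initialization state.
_weave_initialized = False


def get_thread_context_sketch(conversation_id: str) -> AbstractContextManager:
    """Return a Weave thread context, or a no-op context when Weave is off."""
    if not _weave_initialized:
        return nullcontext()
    import weave  # lazy import keeps the dependency optional

    return weave.thread(conversation_id)


def run_sketch(conversation_id: str) -> str:
    """Stand-in for a run loop wrapped in the thread context."""
    with get_thread_context_sketch(conversation_id):
        return f"ran {conversation_id}"
```

Because both branches return a context manager, the run loop needs no conditional logic of its own.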

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Introduces a unified observability context management system that allows
multiple observability tools (Weave, Laminar, etc.) to work together
seamlessly.

Key changes:
- Added context.py with provider registry pattern
- get_conversation_context() composes context managers from all enabled tools
- Built-in providers for Weave (weave.thread) and Laminar (span with session_id)
- LocalConversation.run() now uses the generic get_conversation_context()
- Easy to add new observability tools via register_conversation_context_provider()

Design benefits:
- SDK is agnostic to which observability tools are enabled
- Graceful degradation when tools are not initialized
- Exception in one provider doesn't break others
- Single integration point in LocalConversation

Usage for adding new tools:
    from contextlib import nullcontext

    from openhands.sdk.observability import register_conversation_context_provider

    def get_my_tool_context(conversation_id: str):
        # Fall back to a no-op context when the tool is not initialized
        if not is_my_tool_initialized():
            return nullcontext()
        return my_tool.thread(conversation_id)

    register_conversation_context_provider(get_my_tool_context)
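
For illustration, a registry that composes providers and tolerates a failing one could be sketched with contextlib.ExitStack. The names register_provider_sketch and conversation_context_sketch are hypothetical and are not the functions in context.py; this only shows one way the described behavior could be achieved.

```python
import logging
from contextlib import ExitStack, contextmanager
from typing import Callable, ContextManager, Iterator, List

logger = logging.getLogger(__name__)

# A provider maps a conversation ID to a context manager.
Provider = Callable[[str], ContextManager]
_providers: List[Provider] = []


def register_provider_sketch(provider: Provider) -> None:
    """Add a provider to the (hypothetical) registry."""
    _providers.append(provider)


@contextmanager
def conversation_context_sketch(conversation_id: str) -> Iterator[None]:
    """Enter every registered provider's context; skip providers that fail."""
    with ExitStack() as stack:
        for provider in _providers:
            try:
                stack.enter_context(provider(conversation_id))
            except Exception:
                # One broken provider should not prevent the others from running.
                logger.debug("provider %r failed; skipping", provider, exc_info=True)
        yield
```

ExitStack unwinds the entered contexts in reverse order on exit, so each tool sees a properly nested enter/exit pair.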

Co-authored-by: openhands <openhands@all-hands.dev>
Introduces a unified tool tracing system that works across all enabled
observability tools (Weave, Laminar, etc.).

Key additions:
- trace_tool_call(): Context manager for tracing tool executions
- traced_tool(): Decorator for tracing tool functions
- trace_mcp_list_tools(): Context manager for MCP tool listing
- trace_mcp_call_tool(): Context manager for MCP tool calls
- Tool trace provider registry (similar to conversation context providers)

Integration points:
- Agent._execute_action_event() now uses trace_tool_call() for all tools
- MCPToolExecutor.call_tool() uses trace_mcp_call_tool()
- MCP utils._list_tools() uses trace_mcp_list_tools()

What gets traced:
- Tool name
- Tool inputs (safely serialized)
- Tool type (TOOL, MCP_TOOL, MCP_LIST)
- Execution duration (via context manager)

Design benefits:
- Single API for all observability tools
- Easy to add new observability providers
- Graceful degradation when tools not initialized
- Backward compatible with existing Laminar @observe decorators
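
The traced fields listed above could be captured by a context manager along these lines. The commit message does not show the actual trace_tool_call() signature, so trace_tool_call_sketch, safe_serialize, and the record layout here are assumptions made purely for illustration.

```python
import json
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator


def safe_serialize(value: Any) -> str:
    """Serialize to JSON, falling back to repr() for unsupported objects."""
    try:
        return json.dumps(value, default=repr)
    except (TypeError, ValueError):
        return repr(value)


@contextmanager
def trace_tool_call_sketch(
    tool_name: str, inputs: Any, tool_type: str = "TOOL"
) -> Iterator[Dict[str, Any]]:
    """Record name, type, serialized inputs, and duration of a tool call."""
    record: Dict[str, Any] = {
        "tool_name": tool_name,
        "tool_type": tool_type,  # e.g. TOOL, MCP_TOOL, MCP_LIST
        "inputs": safe_serialize(inputs),
        "duration_s": None,
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        # Duration is recorded even if the tool raises.
        record["duration_s"] = time.perf_counter() - start
```

A real provider would forward the record to Weave or Laminar instead of just filling in a dictionary.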

Co-authored-by: openhands <openhands@all-hands.dev>
Simplifies the Weave observability integration to be more elegant:

1. **Removed complex tool tracing system**:
   - Removed trace_tool_call, traced_tool, trace_mcp_* functions
   - Removed tool trace provider registry
   - These were over-engineered; Weave's autopatching handles LLM tracing

2. **Simplified weave.py**:
   - Kept only essential functions: init_weave, maybe_init_weave, weave_op
   - Removed WeaveSpanManager, observe_weave, weave_attributes, weave_thread
   - Users can use weave.op and weave.thread directly from the weave package

3. **Key exports**:
   - init_weave(): Initialize Weave with autopatching
   - maybe_init_weave(): Conditional init based on env vars
   - weave_op(): Decorator wrapper that's a no-op when not initialized
   - get_weave_op(): Get weave.op or no-op decorator

4. **Design philosophy**:
   - Weave autopatching traces all LiteLLM calls automatically
   - Use @weave.op directly for custom function tracing
   - Use weave.thread() directly for conversation grouping
   - Keep SDK integration minimal and non-invasive

This matches the approach in the OpenHands PR #12056.

Co-authored-by: openhands <openhands@all-hands.dev>
- Make weave an optional dependency (install with pip install openhands-sdk[weave])
- Add auto-init via maybe_init_weave() at module load (matches Laminar pattern)
- Simplify demo to use only existing functions
- Fix tests for optional dependency handling
- Remove unrelated changes from PR scope
@neubig neubig left a comment

Nice, thanks @morganmcg1! Tell me when this is ready for a look and I'll review.