LiteLLM + AgentOps integration: Provider-specific tracking issue with Anthropic models
Summary
When using LiteLLM's `success_callback = ["agentops"]` feature, LLM events appear correctly in AgentOps dashboard traces for OpenAI models (e.g., GPT-4o) but do NOT appear for Anthropic models (e.g., Claude 3.5 Sonnet). This is a provider-specific tracking issue, not a universal dependency problem.
Root Cause
The issue is provider-specific and affects how LiteLLM's callback system interacts with AgentOps' existing instrumentation:
- OpenAI models: LLM events appear correctly in AgentOps dashboard traces when using `litellm.success_callback = ["agentops"]`
- Anthropic models: LLM events do NOT appear in AgentOps dashboard traces with the same callback configuration
- API calls succeed: both providers complete API calls successfully, but only OpenAI events are tracked in AgentOps
- Silent failure: no error messages indicate the tracking problem; events simply don't appear in dashboard traces
Issue Details
The problem manifests as missing LLM events in AgentOps dashboard traces for Anthropic models:
- OpenAI behavior: when using `litellm.completion(model="gpt-4o", ...)` with `success_callback = ["agentops"]`, LLM events appear in the AgentOps dashboard trace
- Anthropic behavior: when using `litellm.completion(model="anthropic/claude-3-5-sonnet-20240620", ...)` with the same callback, LLM events do NOT appear in the dashboard trace (see the minimal contrast after this list)
- API calls work: both providers successfully complete API calls and return responses
- Sessions created: AgentOps sessions are created for both providers, but only OpenAI sessions show LLM events
- Silent failure: no error messages or warnings indicate that Anthropic events are not being tracked
- Instrumentation conflict: likely caused by a conflict between LiteLLM's callback system and AgentOps' existing Anthropic instrumentation
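Distilled to its essentials, the contrast looks like this (a minimal sketch; the full reproduction script in Steps to Reproduce below is the authoritative version, and valid API keys are assumed to be set):

```python
import agentops
import litellm

agentops.init()  # requires AGENTOPS_API_KEY in the environment
litellm.success_callback = ["agentops"]

# Tracked: the LLM event appears in the AgentOps dashboard trace
litellm.completion(model="gpt-4o",
                   messages=[{"role": "user", "content": "hi"}])

# Not tracked: the call succeeds, but no LLM event appears in the trace
litellm.completion(model="anthropic/claude-3-5-sonnet-20240620",
                   messages=[{"role": "user", "content": "hi"}])
```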
Steps to Reproduce
1. Install AgentOps and LiteLLM:

```bash
pip install agentops litellm openai anthropic
```

2. Set up environment variables:

```bash
export AGENTOPS_API_KEY="your_agentops_key"
export OPENAI_API_KEY="your_openai_key"
export ANTHROPIC_API_KEY="your_anthropic_key"
```

3. Create the reproduction script (e.g. `reproduce_script.py`):

```python
#!/usr/bin/env python3
import litellm
import agentops


def test_provider(provider_name, model, message):
    print(f"\n=== Testing {provider_name} ===")
    agentops.init(auto_start_session=False)
    tracer = agentops.start_trace(
        trace_name=f"{provider_name} Test",
        tags=[f"{provider_name.lower()}-test"],
    )
    litellm.success_callback = ["agentops"]
    try:
        response = litellm.completion(
            model=model,
            messages=[{"role": "user", "content": message}],
            max_tokens=30,
        )
        print(f"✅ {provider_name} API call successful")
        print(f"   Response: {response.choices[0].message.content}")
        agentops.end_trace(tracer, end_state="Success")
        print(
            f"   ^ Check AgentOps dashboard - {provider_name} events should "
            f"{'appear' if provider_name == 'OpenAI' else 'be MISSING'}"
        )
    except Exception as e:
        print(f"❌ {provider_name} Error: {e}")
        agentops.end_trace(tracer, end_state="Fail")


# Test OpenAI (should show LLM events in dashboard)
test_provider("OpenAI", "gpt-4o", "Say hello from OpenAI!")

# Test Anthropic (LLM events will be missing from dashboard)
test_provider("Anthropic", "anthropic/claude-3-5-sonnet-20240620", "Say hello from Anthropic!")
```

4. Run the script:

```bash
python reproduce_script.py
```

5. Check the AgentOps dashboard sessions:
   - OpenAI session: should show LLM events in the trace
   - Anthropic session: should NOT show LLM events in the trace (events missing)
Expected Behavior
- LLM calls made through LiteLLM with `success_callback = ["agentops"]` should be tracked and visible in the AgentOps dashboard for ALL supported providers
- Both OpenAI and Anthropic models should show LLM events in AgentOps dashboard traces
- The callback integration should work consistently across different LLM providers
Actual Behavior
- OpenAI models: LLM events appear correctly in AgentOps dashboard traces ✅
- Anthropic models: LLM events do NOT appear in AgentOps dashboard traces ❌
- API calls succeed for both providers, but tracking behavior differs
- No error messages indicate the tracking failure for Anthropic models
- AgentOps sessions are created for both providers, but only OpenAI sessions contain LLM events
Environment
- AgentOps version: 0.4.14
- LiteLLM version: 1.72.6
- Python version: 3.12
- Testing confirmed with valid API keys for both providers
Potential Solutions
Option 1: Fix LiteLLM callback integration for Anthropic models (Recommended)
Investigate and fix the conflict between LiteLLM's callback system and AgentOps' existing Anthropic instrumentation:
- Examine how LiteLLM's `success_callback = ["agentops"]` interacts with AgentOps' direct Anthropic instrumentation (see the diagnostic sketch after this list)
- Ensure callback events are properly forwarded to AgentOps for Anthropic models
- Test that the fix doesn't break existing direct AgentOps instrumentation
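As a starting point for that investigation, the following diagnostic sketch uses LiteLLM's documented `CustomLogger` interface to check whether LiteLLM dispatches success callbacks for Anthropic models at all. If the probe fires, the failure is likely on the AgentOps forwarding side rather than in LiteLLM's callback dispatch. (Illustrative only; assumes a valid ANTHROPIC_API_KEY is set.)

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger


class ProbeLogger(CustomLogger):
    """Print whenever LiteLLM dispatches a success callback."""

    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        # kwargs["model"] carries the provider-prefixed model name
        print(f"success callback fired for model={kwargs.get('model')}")


litellm.callbacks = [ProbeLogger()]

litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=10,
)
```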
Option 2: Update AgentOps instrumentation priority
Modify AgentOps' instrumentation system to properly handle LiteLLM callback events (a sketch of the deduplication idea follows this list):
- Ensure LiteLLM callback events take precedence over direct instrumentation when both are present
- Add proper event deduplication to prevent conflicts between callback and direct instrumentation
- Update instrumentation order to prioritize callback-based tracking when configured
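The deduplication bullet might look something like the following hypothetical helper. This is not AgentOps' actual internals, only a sketch of the idea: key events on the provider's response id so that a completion recorded by the callback path is not recorded again by the direct instrumentation path.

```python
# Hypothetical sketch only -- not AgentOps' real API or internals.
from typing import Callable

_seen_response_ids: set[str] = set()


def record_llm_event_once(response_id: str, record_fn: Callable[[], None]) -> bool:
    """Record an LLM event unless the same completion (identified by the
    provider's response id) was already recorded by the other path."""
    if response_id in _seen_response_ids:
        return False  # duplicate: already tracked via the other path
    _seen_response_ids.add(response_id)
    record_fn()
    return True
```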
Option 3: Document the limitation and provide workaround
If the integration conflict cannot be easily resolved:
- Document that LiteLLM callback integration doesn't work with Anthropic models
- Recommend using AgentOps' direct Anthropic instrumentation instead of the LiteLLM callback for Anthropic models (as sketched after this list)
- Provide clear guidance on when to use callback vs. direct instrumentation
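A sketch of that workaround, assuming AgentOps' automatic Anthropic instrumentation (which, per Additional Notes below, works when the LiteLLM callback is not involved): call the Anthropic SDK directly for Anthropic models.

```python
import agentops
import anthropic

agentops.init()  # auto-instruments supported providers, including Anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=30,
    messages=[{"role": "user", "content": "Say hello from Anthropic!"}],
)
print(response.content[0].text)
```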
Impact
- Users cannot reliably use LiteLLM's `success_callback = ["agentops"]` feature for Anthropic models
- Anthropic LLM calls made through LiteLLM are not tracked in AgentOps, leading to incomplete observability
- The tracking failure is silent - no error messages indicate that Anthropic events are missing
- Users may not realize their Anthropic models are not being tracked until they check the dashboard
- Mixed provider applications will have inconsistent tracking (OpenAI tracked, Anthropic not tracked)
- This affects observability and monitoring for applications using multiple LLM providers through LiteLLM
Additional Notes
- The issue is provider-specific: OpenAI models work correctly, Anthropic models do not
- AgentOps' direct instrumentation works correctly for both providers when not using LiteLLM callback
- The LiteLLM integration test in AgentOps is currently skipped with reason "TODO: instrumentation for callback handlers and external integrations"
- This suggests the callback integration has known limitations that need to be addressed
- The tracking failure is silent - no error messages or warnings indicate the problem
- Both providers successfully complete API calls, but only OpenAI events appear in AgentOps dashboard traces
- This likely indicates a conflict between LiteLLM's callback system and AgentOps' existing Anthropic instrumentation
Reproduction Script
A complete reproduction script is available that demonstrates the provider-specific tracking behavior difference. The script tests both OpenAI and Anthropic models with LiteLLM callback integration and generates AgentOps session URLs for dashboard verification.
Test Results
When running the reproduction script:
- OpenAI test: API call succeeds, LLM events appear in AgentOps dashboard trace
- Anthropic test: API call succeeds, but LLM events do NOT appear in AgentOps dashboard trace
- Dashboard verification: Checking the session URLs confirms the tracking behavior difference
This demonstrates that the issue is provider-specific and affects dashboard trace visibility, not API call success.