Langsmith example by benjibc · Pull Request #176 · eval-protocol/python-sdk

benjibc · 2025-09-15T05:10:43Z

Adds first-class LangSmith integration for seeding traces and evaluating LangGraph apps, plus a hardened adapter with tests.

Highlights

New adapter: eval_protocol/adapters/langsmith.py
- Pulls root runs from a LangSmith project and converts to EvaluationRows.
- Prefers outputs.messages (when present) to avoid duplicated prompts.
- Robust input/output mapping:
  - Inputs: messages, prompt, user_input, input, raw string/list
  - Outputs: messages, content, result, answer, output, raw string/list
- Preserves tool info (tool_calls, tool_call_id, function_call).
- Normalizes provider-native tool_calls into OpenAI-typed ChatCompletionMessageToolCall/Function objects.
- De-duplicates only consecutive identical user messages (whitespace- and case-normalized); preserves system-first and legitimate multiple user turns.
Quickstart using LangSmith: eval_protocol/quickstart/llm_judge_langsmith.py
- Mirrors quickstart/llm_judge.py (Langfuse), but uses LangSmithAdapter.
- SingleTurnRolloutProcessor + split_multi_turn_rows + Arena-style LLM judge (Gemini/OpenAI).
- Persists results via existing EP persist flow.
Bootstrap scripts (non-production, seeding only): examples/langsmith/
- README.md: clarifies these scripts only dump traces for adapter/quickstart.
- dump_traces_langsmith.py: emits synthetic @traceable runs and tiny echo LangGraph runs.
- emit_tool_calls.py: emits traces with assistant tool_calls and a tool response.

How to validate locally

Seed traces into LangSmith:

cd python-sdk
source .venv/bin/activate
export LANGSMITH_API_KEY=...
export LS_PROJECT=ep-langgraph-examples
export LANGSMITH_TRACING=true

python examples/langsmith/dump_traces_langsmith.py
python examples/langsmith/emit_tool_calls.py

Trace a tool-enabled LangGraph (optional, requires Fireworks):

export FIREWORKS_API_KEY=...
pytest examples/langgraph/test_tools_langsmith_trace.py -q -s

Run LangSmith quickstart eval:

export GEMINI_API_KEY=...   # or OPENAI_API_KEY=...
pytest eval_protocol/quickstart/llm_judge_langsmith.py -q -s

eval_protocol/adapters/langsmith.py

eval_protocol/quickstart/llm_judge_langsmith.py

eval_protocol/adapters/langsmith.py

Langsmith example

5d4daa6

benjibc requested review from dphuang2 and xzrderek September 15, 2025 05:11

xzrderek approved these changes Sep 15, 2025

View reviewed changes

dphuang2 reviewed Sep 15, 2025

View reviewed changes

eval_protocol/quickstart/llm_judge_langsmith.py Show resolved Hide resolved

dphuang2 reviewed Sep 15, 2025

View reviewed changes

eval_protocol/adapters/langsmith.py Outdated Show resolved Hide resolved

benjibc added 3 commits September 16, 2025 01:21

langsmith changes

a65ab80

update lock

f71658c

formatting

e85ac0a

benjibc merged commit c40dc2e into main Sep 16, 2025
6 of 7 checks passed

benjibc deleted the langsmith_example branch September 16, 2025 04:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Langsmith example#176

Langsmith example#176
benjibc merged 4 commits intomainfrom
langsmith_example

benjibc commented Sep 15, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

benjibc commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Highlights

How to validate locally

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

benjibc commented Sep 15, 2025 •

edited

Loading