Description
Now that the agent stack is being rebuilt on LangChain/LangGraph, and there's already per-run JSONL logging for agent messages and MCP tool calls (#1499), a natural next step would be plugging into the LangChain-native observability ecosystem: LangSmith, Langfuse, Opik, Arize, etc.
The JSONL logs are great for replay and post-hoc debugging, but they don't give you what those tools do well:
- Hierarchical trace visualization across agent, tool, and sub-agent calls
- Prompt versioning and side-by-side run comparison
- Dataset and eval iteration loops, which seem especially useful given the breadth of skills and embodiments in DimOS
- Token, latency, and cost tracking out of the box
Because the new agents are LangChain/LangGraph-based, this can be added as optional extras (dimos[langsmith], dimos[langfuse]) gated on env vars, so the dependency only loads when the user opts in. Nothing in the core path changes for users who don't enable it.
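To make the opt-in gating concrete, here is a minimal sketch of how the lazy loading could work. It is not the existing integration: the `DIMOS_TRACING` variable, the `load_tracing_callbacks` helper, and the backend registry are all hypothetical names, and the Langfuse import path is an assumption based on its v3 Python SDK that should be checked against the installed version.

```python
import importlib
import os
from typing import Any, Dict, List, Mapping, Optional

# Hypothetical registry: backend name -> "module:attribute" of a LangChain
# callback handler. The Langfuse path is an assumption (v3 SDK layout).
DEFAULT_BACKENDS: Dict[str, str] = {
    "langfuse": "langfuse.langchain:CallbackHandler",
}


def load_tracing_callbacks(
    env: Optional[Mapping[str, str]] = None,
    backends: Optional[Mapping[str, str]] = None,
) -> List[Any]:
    """Return callback handlers for the backends named in DIMOS_TRACING.

    DIMOS_TRACING is a hypothetical comma-separated env var. When it is
    unset or empty, nothing is imported and an empty list is returned, so
    the core path is unchanged for users who do not opt in.
    """
    env = os.environ if env is None else env
    backends = DEFAULT_BACKENDS if backends is None else backends

    raw = env.get("DIMOS_TRACING", "")
    requested = [name.strip().lower() for name in raw.split(",") if name.strip()]

    callbacks: List[Any] = []
    for name in requested:
        target = backends.get(name)
        if target is None:
            raise ValueError(f"Unknown tracing backend: {name!r}")
        module_path, _, attr = target.partition(":")
        try:
            # The optional dependency is imported only at this point.
            module = importlib.import_module(module_path)
        except ImportError as exc:
            raise RuntimeError(
                f"Tracing backend {name!r} requested but not installed; "
                f"try: pip install 'dimos[{name}]'"
            ) from exc
        callbacks.append(getattr(module, attr)())
    return callbacks
```

The returned list could then be passed through LangChain's standard run config, for example `graph.invoke(inputs, config={"callbacks": callbacks})`. LangSmith is left out of the registry because, as far as I know, LangChain enables LangSmith tracing directly from its own environment variables once the package is installed, so it may need no handler at all.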
If there's interest from the maintainers, I'm happy to put up a PR for LangSmith and Langfuse. I already have a working integration from another feature I'm building on top of DimOS, so it would mostly be cleaning it up and adding tests.