feat: Upgrade LangWatch SDK from 0.1 to 0.16 #5896
rogeriochaves wants to merge 2 commits into FlowiseAI:main
… tracing
Migrate the LangWatch integration from the legacy SDK (0.1.x) to the new
OpenTelemetry-based SDK (0.16.x). The new SDK uses standard OTel spans
with LangWatch-specific attributes instead of the old proprietary API.
Changes:
- Upgrade langwatch dependency from ^0.1.1 to ^0.16.1
- Replace LangWatch/LangWatchTrace/autoconvertTypedValues imports with
LangWatchCallbackHandler and getLangWatchTracerFromProvider
- Add ensureLangWatchOtel() to register a v1 OTel tracer provider with
OTLP HTTP exporter pointing at LangWatch's ingestion endpoint
- additionalCallbacks path: use LangWatchCallbackHandler which handles
all LangChain callback events internally via OTel spans
- AnalyticHandler path: use getLangWatchTracerFromProvider to create
spans with LangWatch methods (setType, setInput, setOutput, etc.)
- Update span lifecycle: use setOutput/setMetrics/setRequestModel/end()
instead of the old span.end({output, metrics, model}) API
- Update error handling: use recordException + setStatus + end() instead
of span.end({error})
- Fix pre-existing type error in speechToText.ts for Groq client
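The span lifecycle change listed above (separate setters followed by a bare end(), instead of span.end({output, metrics, model})) can be sketched roughly as follows. SketchSpan, RecordingSpan and finishLlmSpan are hypothetical names for illustration; the interface only mirrors the method names mentioned in this PR and is not the SDK's real type.

```typescript
// Hypothetical minimal span shape, based only on the method names listed in
// this PR description. The real LangWatch SDK types may differ.
interface SketchSpan {
    setOutput(output: unknown): void
    setMetrics(metrics: { promptTokens?: number; completionTokens?: number }): void
    setRequestModel(model: string): void
    end(): void
}

// New-style completion: one setter per field, then end() with no arguments.
function finishLlmSpan(
    span: SketchSpan,
    output: unknown,
    model: string,
    usage: { promptTokens?: number; completionTokens?: number }
): void {
    span.setOutput(output)
    span.setMetrics(usage)
    span.setRequestModel(model)
    span.end()
}

// Test double that records which methods were called, in order.
class RecordingSpan implements SketchSpan {
    calls: string[] = []
    setOutput(_output: unknown): void {
        this.calls.push('setOutput')
    }
    setMetrics(_metrics: { promptTokens?: number; completionTokens?: number }): void {
        this.calls.push('setMetrics')
    }
    setRequestModel(_model: string): void {
        this.calls.push('setRequestModel')
    }
    end(): void {
        this.calls.push('end')
    }
}
```

Calling end() last matters in this sketch because attributes set after a span has ended are normally ignored by OpenTelemetry.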
Code Review
This pull request upgrades the langwatch SDK from v0.1 to v0.16, migrating from the legacy API to the new OpenTelemetry-based SDK. However, this implementation introduces a critical security vulnerability related to global state management in the OpenTelemetry provider initialization, which can lead to cross-tenant data leakage. Additionally, there is a risk of data exfiltration and SSRF due to a lack of validation on the user-supplied analytics endpoint. While the changes in handler.ts are comprehensive, adapting AnalyticHandler to the new span management methods, I've added a few suggestions to enhance type safety and reduce code duplication.
I am having trouble creating individual review comments, so my feedback is listed here.
packages/components/src/handler.ts (47-63)
The ensureLangWatchOtel function registers a global OpenTelemetry tracer provider and uses a module-level variable langWatchOtelInitialized to ensure it's only initialized once per process. In a multi-user or multi-tenant environment, the first user to run a chatflow with LangWatch configured will set the global tracer provider for the entire Node.js process. Subsequent chatflows from other users will use this same global provider, causing their traces (including sensitive prompts and LLM outputs) to be sent to the first user's LangWatch account. Furthermore, provider.register() makes this the global provider for all OpenTelemetry instrumentations in the application, potentially leaking internal application data (HTTP requests, database queries) to the external LangWatch endpoint.
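To make the concern concrete, here is a minimal sketch contrasting a once-per-process flag with a cache keyed by credentials. It does not use the real OpenTelemetry or LangWatch APIs: SketchProvider, ensureGlobalProvider and getProviderFor are hypothetical stand-ins, and whether Flowise would actually adopt a keyed cache is an assumption.

```typescript
// Hypothetical stand-in for a tracer provider plus its exporter config.
interface SketchProvider {
    endpoint: string
    apiKey: string
}

// Pattern described in the review: a module-level flag, so the first caller's
// configuration is reused by every later caller in the same process.
let initialized = false
let globalProvider: SketchProvider | undefined

function ensureGlobalProvider(endpoint: string, apiKey: string): SketchProvider {
    if (!initialized) {
        globalProvider = { endpoint, apiKey }
        initialized = true
    }
    return globalProvider as SketchProvider
}

// Alternative sketch: one provider per (endpoint, apiKey) pair, never
// registered globally. A real implementation would likely hash the key
// rather than hold the raw secret in the map key.
const providers = new Map<string, SketchProvider>()

function getProviderFor(endpoint: string, apiKey: string): SketchProvider {
    const cacheKey = `${endpoint}\u0000${apiKey}`
    let provider = providers.get(cacheKey)
    if (!provider) {
        provider = { endpoint, apiKey }
        providers.set(cacheKey, provider)
    }
    return provider
}
```

With the flag, a second tenant calling ensureGlobalProvider receives the first tenant's provider; with the keyed cache, each credential pair gets its own instance.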
packages/components/src/handler.ts (51-60)
The langWatchEndpoint is obtained from user-controlled node inputs and used to construct the URL for the HttpOTLPTraceExporter without any validation. An attacker can provide a malicious URL to exfiltrate sensitive trace data (prompts, outputs) or perform SSRF attacks against internal services. Since the exporter sends POST requests with trace data, it can be used to probe internal endpoints or exfiltrate sensitive information to an attacker-controlled server.
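One possible mitigation, sketched under the assumption that operators can supply a host allowlist, is to validate the endpoint before it is handed to the exporter. isAllowedEndpoint and the example hosts are hypothetical and are not part of Flowise or the LangWatch SDK.

```typescript
// Returns true only for https URLs whose hostname is on the allowlist.
// The allowlist itself is an assumption; this PR does not define one.
function isAllowedEndpoint(raw: string, allowedHosts: readonly string[]): boolean {
    let url: URL
    try {
        url = new URL(raw)
    } catch {
        // Not a parseable absolute URL.
        return false
    }
    if (url.protocol !== 'https:') return false
    // URL normalizes the hostname to lowercase, so compare against lowercase entries.
    return allowedHosts.includes(url.hostname)
}
```

An exact-match allowlist is deliberately strict: it rejects IP literals and internal hostnames without needing a separate private-range check, at the cost of requiring self-hosted endpoints to be registered explicitly.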
packages/components/src/handler.ts (40)
The type for LangWatchSpan is set to any, which reduces type safety. While the langwatch SDK might not export this type directly, you could define a more specific local type to improve code clarity and maintainability. This would also make it easier to work with span objects, leveraging auto-completion and compile-time checks.
type LangWatchSpan = import('@opentelemetry/api').Span & {
    setType: (type: 'llm' | 'chain' | 'tool') => void
    setInput: (input: any) => void
    setOutput: (output: any) => void
    setMetrics: (metrics: { promptTokens?: number; completionTokens?: number }) => void
    setRequestModel: (modelName: string) => void
}
packages/components/src/handler.ts (1294-1298)
This error handling logic is duplicated in onLLMError (lines 1659-1663) and onToolError (lines 1944-1948). To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider extracting this block into a private helper method within the AnalyticHandler class. For example: private _endSpanWithError(span: LangWatchSpan, error: any) { ... }.
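A minimal sketch of such a helper, written as a standalone function so it runs without the SDK. ErrorSpan is a hypothetical local interface covering only the three OTel span methods named in this PR (recordException, setStatus, end), and the numeric status code mirrors SpanStatusCode.ERROR from @opentelemetry/api.

```typescript
// SpanStatusCode.ERROR in @opentelemetry/api has the numeric value 2.
const STATUS_ERROR = 2

// Hypothetical local interface: just the span methods this helper needs.
interface ErrorSpan {
    recordException(error: Error): void
    setStatus(status: { code: number; message?: string }): void
    end(): void
}

// Records the error on the span, marks the span as failed, and ends it.
// Non-Error values (strings, plain objects) are wrapped so the span always
// receives a real Error instance.
function endSpanWithError(span: ErrorSpan | undefined, error: unknown): void {
    if (!span) return
    const err = error instanceof Error ? error : new Error(String(error))
    span.recordException(err)
    span.setStatus({ code: STATUS_ERROR, message: err.message })
    span.end()
}
```

The chain, LLM and tool error handlers could then each call endSpanWithError(span, error) instead of repeating the three calls.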
packages/components/src/speechToText.ts (116)
Using as any bypasses type checking and can hide potential issues. While this might be a necessary workaround due to issues with the groq-sdk typings, it would be best to investigate if a more accurate type can be used for groqClient. If not, consider adding a // TODO: comment explaining why as any is used and what needs to be fixed in the future for better long-term maintainability.
Hey, this is Claude (AI assistant) responding on behalf of @rogeriochaves to the Gemini code review:
1. Global state / cross-tenant leakage (critical): Overstated for Flowise's deployment model. The …
2. SSRF via endpoint (high): Not a new concern introduced by this PR. Every analytics provider in this file (LangFuse, LangSmith, Lunary, Arize, Phoenix, Opik) takes an endpoint URL from credential data and makes HTTP requests to it without validation. For example, Arize: …
3. …
4. DRY error handling (medium): Fair suggestion. The error-to-exception pattern is repeated 3 times. Could be extracted into a helper. Happy to address if desired.
5. …
Summary
- Upgrade langwatch dependency from ^0.1.1 to ^0.16.1, migrating from the legacy proprietary API to the new OpenTelemetry-based SDK
- Replace langwatch.getTrace().getLangChainCallback() with LangWatchCallbackHandler, which handles all LangChain callback events internally via OTel spans
- Replace langwatch.getTrace().startSpan()/startLLMSpan() with getLangWatchTracerFromProvider to create spans with LangWatch-specific methods (setType, setInput, setOutput, setMetrics, setRequestModel)
- Add ensureLangWatchOtel() helper that registers a v1 OTel tracer provider with an OTLP HTTP exporter pointing at LangWatch's ingestion endpoint; this is required because Flowise uses OTel v1 while the LangWatch SDK ships OTel v2, making setupObservability() incompatible
- Fix pre-existing type error in speechToText.ts for Groq client
Test plan