Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 15 additions & 11 deletions langfuse/_client/propagation.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@
*,
user_id: Optional[str] = None,
session_id: Optional[str] = None,
metadata: Optional[Dict[str, str]] = None,
metadata: Optional[Dict[str, Any]] = None,
version: Optional[str] = None,
tags: Optional[List[str]] = None,
trace_name: Optional[str] = None,
Expand Down Expand Up @@ -125,10 +125,11 @@
Must be US-ASCII string, ≤200 characters. Use this to group related traces
within a user session (e.g., a conversation thread, multi-turn interaction).
metadata: Additional key-value metadata to propagate to all spans.
- Keys and values must be US-ASCII strings
- All values must be ≤200 characters
- Keys must be US-ASCII strings
- Values are coerced to strings
- Coerced values must be ≤200 characters
- Use for dimensions like internal correlating identifiers
- AVOID: large payloads, sensitive data, non-string values (will be dropped with warning)
- AVOID: large payloads or sensitive data
version: Version identfier for parts of your application that are independently versioned, e.g. agents
tags: List of tags to categorize the group of observations
trace_name: Name to assign to the trace. Must be US-ASCII string, ≤200 characters.
Expand Down Expand Up @@ -204,9 +205,10 @@
```

Note:
- **Validation**: All attribute values (user_id, session_id, metadata values)
must be strings ≤200 characters. Invalid values will be dropped with a
warning logged. Ensure values meet constraints before calling.
- **Validation**: Attribute values (user_id, session_id, version, tags,
trace_name) must be strings ≤200 characters. Metadata values are
coerced to strings before the 200 character limit is applied. Invalid
values will be dropped with a warning logged.
- **OpenTelemetry**: This uses OpenTelemetry context propagation under the hood,
making it compatible with other OTel-instrumented libraries.

Expand All @@ -229,7 +231,7 @@
*,
user_id: Optional[str] = None,
session_id: Optional[str] = None,
metadata: Optional[Dict[str, str]] = None,
metadata: Optional[Dict[str, Any]] = None,
version: Optional[str] = None,
tags: Optional[List[str]] = None,
trace_name: Optional[str] = None,
Expand All @@ -247,7 +249,7 @@
"trace_name": trace_name,
}

propagated_metadata_attributes: Dict[str, Optional[Dict[str, str]]] = {
propagated_metadata_attributes: Dict[str, Optional[Dict[str, Any]]] = {
"metadata": metadata,
}

Expand Down Expand Up @@ -285,9 +287,11 @@

validated_metadata: Dict[str, str] = {}

for key, value in metadata_value.items():
if _validate_string_value(value=value, key=f"{metadata_key}.{key}"):
validated_metadata[key] = value
coerced_value = value if isinstance(value, str) else str(value)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep None metadata values out of propagated attributes

When callers build propagated metadata with optional fields such as {"tenant": None}, this line now coerces the value to the literal string "None" and propagates langfuse.trace.metadata.tenant to every child span. Before this change those non-string values were dropped, and the SDK's metadata serialization helpers omit None, so optional/absent metadata now becomes a misleading filterable value instead of remaining absent; skip None before string coercion.

Useful? React with 👍 / 👎.


if _validate_string_value(value=coerced_value, key=f"{metadata_key}.{key}"):
validated_metadata[key] = coerced_value

Check failure on line 294 in langfuse/_client/propagation.py

View check run for this annotation

Claude / Claude Code Review

Metadata coercion edge cases: exception from __str__ and None → 'None'

The new metadata coercion at line 291 (`coerced_value = value if isinstance(value, str) else str(value)`) has two regression edge cases vs. the pre-PR `_validate_string_value` path: (1) **`str(value)` can raise**, propagating out of `_propagate_attributes` before the `yield` and breaking the entire `with` block — violating the documented "No exceptions are raised" contract; (2) **`None` is silently coerced to the literal string `'None'`** (4 chars, passes the 200-char check) and propagated to ev
Comment on lines 290 to +294

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The new metadata coercion at line 291 (coerced_value = value if isinstance(value, str) else str(value)) has two regression edge cases vs. the pre-PR _validate_string_value path: (1) str(value) can raise, propagating out of _propagate_attributes before the yield and breaking the entire with block — violating the documented "No exceptions are raised" contract; (2) None is silently coerced to the literal string 'None' (4 chars, passes the 200-char check) and propagated to every child span, where pre-PR it was dropped with a warning. Wrap str(value) in try/except (log + skip on failure) and add if value is None: continue before the coercion — mirroring the None-filtering already done for top-level attrs at lines 267-270.

Extended reasoning...

Bug 1: str(value) can raise and breaks the context manager

Line 291 now unconditionally calls str(value) on any non-string metadata value:

coerced_value = value if isinstance(value, str) else str(value)

This sits before the try: yield block at line 308, with no exception handling. If __str__ raises, the exception propagates out of _propagate_attributes and the user's wrapped with block never executes.

This violates the explicit docstring contract at lines 213-214:

Raises: No exceptions are raised. Invalid values are logged as warnings and dropped.

Pre-PR, _validate_string_value only ever called isinstance(value, str) and len() on already-confirmed strings — neither can raise on user-provided objects. The PR widens the type signature from Dict[str, str]Dict[str, Any], inviting arbitrary objects, but loses the defensive check.

Realistic triggers (not just theoretical):

  • SQLAlchemy detached instance whose backing session is closed (DetachedInstanceError on __str__)
  • Mock objects with side_effect=Exception in tests
  • Custom domain objects with broken __repr__/__str__
  • Third-party objects that lazy-load on stringification

Step-by-step proof:

class Broken:
    def __str__(self):
        raise RuntimeError("oops")

executed = False
try:
    with propagate_attributes(metadata={"k": Broken()}):
        executed = True  # NEVER runs
except RuntimeError:
    pass

assert executed is False  # User's wrapped code never executed
  1. _propagate_attributes enters the metadata loop at line 290
  2. value = Broken(), isinstance(value, str) is False
  3. str(value) is evaluated → raises RuntimeError("oops")
  4. Exception propagates up; yield at line 309 never reached
  5. User's wrapped code never runs, even though a tracing helper should be transparent

This affects experiment_metadata and experiment_item_metadata as well (same loop, lines 256-261), so internal Langfuse experiment runs are also exposed.

Fix:

try:
    coerced_value = value if isinstance(value, str) else str(value)
except Exception as e:
    langfuse_logger.warning(
        f"Propagated metadata '{metadata_key}.{key}' could not be coerced to string: {e}. Dropping value."
    )
    continue

Bug 2: None silently coerced to literal string 'None'

str(None) == 'None' (4 chars), which passes the len(value) <= 200 check in _validate_string_value. So a metadata entry like {'optional_field': None} now propagates the literal four-character string 'None' to every child span and into Langfuse UI/aggregations.

Pre-PR behavior: None failed isinstance(value, str) in _validate_string_value, logged a warning, and was dropped — the field was simply absent.

Internal inconsistency in this same function: lines 267-270 explicitly filter None from top-level propagated string attributes (user_id, session_id, etc.) via:

propagated_string_attributes = {
    k: v for k, v in propagated_string_attributes.items() if v is not None
}

So the established convention in this very function is "None means absent". The new metadata path breaks that convention silently for individual dict values.

Step-by-step proof:

  1. User calls propagate_attributes(metadata={'user_tier': user.tier if user else None}) — an idiomatic Python pattern
  2. Loop at line 290: value = None, isinstance(None, str) is False
  3. coerced_value = str(None) = 'None' (a 4-char string)
  4. _validate_string_value('None', 'metadata.user_tier') returns True (it IS a string, len ≤ 200)
  5. validated_metadata['user_tier'] = 'None' — literal string set on every child span
  6. Langfuse UI shows user_tier='None', aggregations count it as a real value

The PR docstring/changelog says values are "coerced to strings" but does not call out this None special case, and the PR description specifically mentions LangGraph's int/list/tuple values — never None.

Fix:

for key, value in metadata_value.items():
    if value is None:
        continue  # match the convention at lines 267-270
    # ... coercion ...

Why this is one merged finding

Both regressions live on the same new line (291) and have the same root cause: the coercion path lost the defensive checks that _validate_string_value previously provided. Fixing each in isolation is a one-line change; addressing them together as a small refactor of the metadata loop is cleaner.


if validated_metadata:
context = _set_propagated_attribute(
Expand Down
29 changes: 29 additions & 0 deletions tests/unit/test_propagate_attributes.py
Original file line number Diff line number Diff line change
Expand Up @@ -461,6 +461,35 @@ def test_non_string_user_id_dropped(self, langfuse_client, memory_exporter):
child_span, LangfuseOtelSpanAttributes.TRACE_USER_ID
)

def test_non_string_metadata_values_coerced(
self, langfuse_client, memory_exporter, caplog
):
"""Verify non-string metadata values are coerced instead of dropped."""

caplog.set_level("WARNING", logger="langfuse")
metadata = {
"langgraph_step": 1,
"langgraph_triggers": ["branch:agent"],
"langgraph_path": ("root", "agent"),
"max_search_results": 5,
}

with langfuse_client.start_as_current_observation(name="parent-span"):
with propagate_attributes(metadata=metadata):
child = langfuse_client.start_observation(name="child-span")
child.end()

child_span = self.get_span_by_name(memory_exporter, "child-span")

for key, value in metadata.items():
self.verify_span_attribute(
child_span,
f"{LangfuseOtelSpanAttributes.TRACE_METADATA}.{key}",
str(value),
)

assert "value is not a string. Dropping value." not in caplog.text

def test_mixed_valid_invalid_metadata(self, langfuse_client, memory_exporter):
"""Verify mixed valid/invalid metadata - valid entries kept, invalid dropped."""
with langfuse_client.start_as_current_observation(name="parent-span"):
Expand Down