Skip to content

fix(propagation): coerce propagated metadata values#1709

Merged
hassiebp merged 1 commit into
mainfrom
hassiebbot/lfe-8935-coerce-propagated-metadata
Jun 15, 2026
Merged

fix(propagation): coerce propagated metadata values#1709
hassiebp merged 1 commit into
mainfrom
hassiebbot/lfe-8935-coerce-propagated-metadata

Conversation

@hassiebp

@hassiebp hassiebp commented Jun 15, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Coerce propagated metadata values to strings before applying the existing 200-character validation.
  • Keep non-metadata propagated fields on the existing strict string validation path.
  • Add regression coverage for LangGraph-style metadata values (int, list, tuple) and the absence of the old non-string warning.

Root Cause

_propagate_attributes validated metadata values with _validate_string_value before conversion. LangGraph injects non-string metadata such as langgraph_step, langgraph_triggers, and langgraph_path, so those values were dropped and warned on every root callback invocation despite the v4 migration docs promising string coercion.

Linear

Fixes LFE-8935

Verification

  • uv run --frozen pytest tests/unit/test_propagate_attributes.py::TestPropagateAttributesValidation
  • uv run --frozen pytest tests/unit/test_propagate_attributes.py
  • uv run --frozen ruff check .
  • uv run --frozen ruff format --check langfuse/_client/propagation.py tests/unit/test_propagate_attributes.py
  • uv run --frozen mypy langfuse --no-error-summary

Greptile Summary

Fixes the silent dropping of non-string metadata values (such as those injected by LangGraph — langgraph_step: int, langgraph_triggers: list, langgraph_path: tuple) during attribute propagation by coercing them to strings before the existing 200-character validation rather than validating the raw value and warning.

  • _propagate_attributes now converts any non-string metadata value via str(value) before calling _validate_string_value, so LangGraph-style int/list/tuple values are preserved rather than dropped.
  • propagate_attributes and _propagate_attributes public signatures updated from Dict[str, str] to Dict[str, Any] for the metadata parameter to match the new coercion behaviour.
  • A regression test verifies that int, list, and tuple metadata values are stored as their str() representations and that no "not a string" warning is emitted.

Confidence Score: 5/5

The change is narrowly scoped to the metadata coercion path and does not touch string-validated fields (user_id, session_id, version, tags, trace_name).

The fix is a one-liner coercion (str(value)) applied only to metadata values before the existing length check, leaving all other validation paths intact. The public type signature update is consistent with the runtime behavior. The new test exercises all three non-string categories called out in the root cause (int, list, tuple) and also asserts the absence of the old spurious warning. No edge cases appear to be missed.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(propagation): coerce propagated meta..." | Re-trigger Greptile

@hassiebp hassiebp marked this pull request as ready for review June 15, 2026 11:22
@hassiebp hassiebp merged commit 7f82264 into main Jun 15, 2026
21 checks passed
@hassiebp hassiebp deleted the hassiebbot/lfe-8935-coerce-propagated-metadata branch June 15, 2026 11:22
@github-actions

Copy link
Copy Markdown

@claude review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7a2429639

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

for key, value in metadata_value.items():
if _validate_string_value(value=value, key=f"{metadata_key}.{key}"):
validated_metadata[key] = value
coerced_value = value if isinstance(value, str) else str(value)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep None metadata values out of propagated attributes

When callers build propagated metadata with optional fields such as {"tenant": None}, this line now coerces the value to the literal string "None" and propagates langfuse.trace.metadata.tenant to every child span. Before this change those non-string values were dropped, and the SDK's metadata serialization helpers omit None, so optional/absent metadata now becomes a misleading filterable value instead of remaining absent; skip None before string coercion.

Useful? React with 👍 / 👎.

Comment on lines 290 to +294
for key, value in metadata_value.items():
if _validate_string_value(value=value, key=f"{metadata_key}.{key}"):
validated_metadata[key] = value
coerced_value = value if isinstance(value, str) else str(value)

if _validate_string_value(value=coerced_value, key=f"{metadata_key}.{key}"):
validated_metadata[key] = coerced_value

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The new metadata coercion at line 291 (coerced_value = value if isinstance(value, str) else str(value)) has two regression edge cases vs. the pre-PR _validate_string_value path: (1) str(value) can raise, propagating out of _propagate_attributes before the yield and breaking the entire with block — violating the documented "No exceptions are raised" contract; (2) None is silently coerced to the literal string 'None' (4 chars, passes the 200-char check) and propagated to every child span, where pre-PR it was dropped with a warning. Wrap str(value) in try/except (log + skip on failure) and add if value is None: continue before the coercion — mirroring the None-filtering already done for top-level attrs at lines 267-270.

Extended reasoning...

Bug 1: str(value) can raise and breaks the context manager

Line 291 now unconditionally calls str(value) on any non-string metadata value:

coerced_value = value if isinstance(value, str) else str(value)

This sits before the try: yield block at line 308, with no exception handling. If __str__ raises, the exception propagates out of _propagate_attributes and the user's wrapped with block never executes.

This violates the explicit docstring contract at lines 213-214:

Raises: No exceptions are raised. Invalid values are logged as warnings and dropped.

Pre-PR, _validate_string_value only ever called isinstance(value, str) and len() on already-confirmed strings — neither can raise on user-provided objects. The PR widens the type signature from Dict[str, str]Dict[str, Any], inviting arbitrary objects, but loses the defensive check.

Realistic triggers (not just theoretical):

  • SQLAlchemy detached instance whose backing session is closed (DetachedInstanceError on __str__)
  • Mock objects with side_effect=Exception in tests
  • Custom domain objects with broken __repr__/__str__
  • Third-party objects that lazy-load on stringification

Step-by-step proof:

class Broken:
    def __str__(self):
        raise RuntimeError("oops")

executed = False
try:
    with propagate_attributes(metadata={"k": Broken()}):
        executed = True  # NEVER runs
except RuntimeError:
    pass

assert executed is False  # User's wrapped code never executed
  1. _propagate_attributes enters the metadata loop at line 290
  2. value = Broken(), isinstance(value, str) is False
  3. str(value) is evaluated → raises RuntimeError("oops")
  4. Exception propagates up; yield at line 309 never reached
  5. User's wrapped code never runs, even though a tracing helper should be transparent

This affects experiment_metadata and experiment_item_metadata as well (same loop, lines 256-261), so internal Langfuse experiment runs are also exposed.

Fix:

try:
    coerced_value = value if isinstance(value, str) else str(value)
except Exception as e:
    langfuse_logger.warning(
        f"Propagated metadata '{metadata_key}.{key}' could not be coerced to string: {e}. Dropping value."
    )
    continue

Bug 2: None silently coerced to literal string 'None'

str(None) == 'None' (4 chars), which passes the len(value) <= 200 check in _validate_string_value. So a metadata entry like {'optional_field': None} now propagates the literal four-character string 'None' to every child span and into Langfuse UI/aggregations.

Pre-PR behavior: None failed isinstance(value, str) in _validate_string_value, logged a warning, and was dropped — the field was simply absent.

Internal inconsistency in this same function: lines 267-270 explicitly filter None from top-level propagated string attributes (user_id, session_id, etc.) via:

propagated_string_attributes = {
    k: v for k, v in propagated_string_attributes.items() if v is not None
}

So the established convention in this very function is "None means absent". The new metadata path breaks that convention silently for individual dict values.

Step-by-step proof:

  1. User calls propagate_attributes(metadata={'user_tier': user.tier if user else None}) — an idiomatic Python pattern
  2. Loop at line 290: value = None, isinstance(None, str) is False
  3. coerced_value = str(None) = 'None' (a 4-char string)
  4. _validate_string_value('None', 'metadata.user_tier') returns True (it IS a string, len ≤ 200)
  5. validated_metadata['user_tier'] = 'None' — literal string set on every child span
  6. Langfuse UI shows user_tier='None', aggregations count it as a real value

The PR docstring/changelog says values are "coerced to strings" but does not call out this None special case, and the PR description specifically mentions LangGraph's int/list/tuple values — never None.

Fix:

for key, value in metadata_value.items():
    if value is None:
        continue  # match the convention at lines 267-270
    # ... coercion ...

Why this is one merged finding

Both regressions live on the same new line (291) and have the same root cause: the coercion path lost the defensive checks that _validate_string_value previously provided. Fixing each in isolation is a one-line change; addressing them together as a small refactor of the metadata loop is cleaner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant