Skip to content

fix(extraction): use extra_body for response_format, fix content parsing#10

Open
qiaoranlaodou-cpu wants to merge 1 commit into
lancedb:mainfrom
qiaoranlaodou-cpu:fix/extraction-response-format
Open

fix(extraction): use extra_body for response_format, fix content parsing#10
qiaoranlaodou-cpu wants to merge 1 commit into
lancedb:mainfrom
qiaoranlaodou-cpu:fix/extraction-response-format

Conversation

@qiaoranlaodou-cpu

Copy link
Copy Markdown

Bug

extraction.extract() calls call_llm(..., response_format={"type": "json_object"}). call_llm() has no top-level response_format kwarg, only extra_body — passing it directly raises TypeError, which the surrounding except Exception silently swallows. Net effect: automatic extraction never runs at all, but there's no visible error — it just looks like every conversation is being filtered out as "trivial".

Even after fixing that, the response is parsed wrong: text = getattr(response, "content", response) assumes the response has a top-level .content attribute, but call_llm() returns the raw .choices[0].message.content object (same shape used everywhere else in Hermes — see oneshot.py). The getattr falls through to its default (response itself), gets stringified, and fails json.loads, again silently caught.

Fix

  • Pass response_format through extra_body instead of as a top-level kwarg.
  • Parse the response with agent.auxiliary_client.extract_content_or_reasoning(), the same helper other auxiliary callers use (also handles reasoning-model content=None fallback).

Testing

  • Added tests/test_extraction.py (there was no coverage for this module before) — mocks agent.auxiliary_client per the repo's offline-test convention.
  • Verified end-to-end against a real DeepSeek call through the actual extract() function with a mixed noisy/durable conversation — confirmed extraction now actually executes and returns valid JSON facts (previously: silent no-op).
  • ruff check and pytest both pass on the changed files. Note: tests/test_memory_loop.py, tests/test_embeddings.py::test_client_uses_configured_base_url_and_key, and tests/test_longmemeval_benchmark.py have 12 pre-existing failures in my environment unrelated to this change (confirmed via git stash — they fail identically on main), looks like a local lancedb package version mismatch, not something this PR touches.

call_llm() has no top-level response_format kwarg, only extra_body --
passing it directly raised a TypeError that the except-clause silently
swallowed, so automatic extraction never actually ran (looked like
"filtering everything out" but was really a crash on every call).

Even with that fixed, the old getattr(response, "content", response)
couldn't read the result either: call_llm() returns the raw
.choices[0].message.content response object, not something with a
top-level .content attribute. Use extract_content_or_reasoning(), the
same helper every other Hermes auxiliary caller uses.

Added tests/test_extraction.py since there was no test coverage for
this module at all.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant