Skip to content

Refactor: extract LLMProvider abstraction (no behaviour change)#3

Open
m31527 wants to merge 1 commit into
mainfrom
spike/llm-provider-abstraction
Open

Refactor: extract LLMProvider abstraction (no behaviour change)#3
m31527 wants to merge 1 commit into
mainfrom
spike/llm-provider-abstraction

Conversation

@m31527

@m31527 m31527 commented May 6, 2026

Copy link
Copy Markdown
Owner

Summary

  • Pure refactor — extract every cloud LLM call site behind a single LLMProvider interface so adding OpenRouter / OpenAI / Gemini later is a one-file change instead of a five-file rewrite.
  • No behaviour change, no new dependencies, no new config flag. All four call sites (escalation.resolve, escalation.resolve_whole_task, escalation.synthesize_summary, skill_factory._generate_code, main.optimize_prompt) hit the same Anthropic API the same way as before — they just route through the new AnthropicProvider instead of building anthropic.AsyncAnthropic ad-hoc.
  • Sets up the seam for the OpenRouter follow-up that was discussed earlier this session: subclass LLMProvider once, wire it into get_default_provider(), done.

What's in the new module

llm_provider.py introduces:

  • LLMProvider (ABC) — model_name, provider_name, is_configured, complete_text, complete_with_tools, new_history.
  • LLMHistory (ABC) — provider-managed conversation state. Subclasses store messages in their native shape (Anthropic content blocks today; OpenAI tool_calls in a future implementation). Caller mutates only via add_user_message / add_assistant_turn / add_tool_results, so the loop logic in escalation.resolve_whole_task stays provider-agnostic.
  • Value objectsLLMUsage, LLMTextResponse, LLMToolUse, LLMToolTurnResponse, LLMToolResult. Callers never see SDK-specific types.
  • AnthropicProvider — wraps the Anthropic SDK. Lazy-imports anthropic so import llm_provider works in environments where the SDK isn't installed (landing-page bundle, local-only mode).
  • get_default_provider() — module-level singleton factory. Today always returns AnthropicProvider; later switches on a future config.llm_provider setting.

Refactor details

File Before After
escalation.py import anthropic, owned AsyncAnthropic client, hard-coded "Claude" in 4 log/error strings Takes optional LLMProvider in __init__. Logs/errors use provider.provider_name. Tool-use loop uses history.add_assistant_turn() / add_tool_results() instead of constructing Anthropic content-block dicts inline.
skill_factory.py import anthropic, owned AsyncAnthropic client Takes optional LLMProvider. _generate_code() routes through provider.complete_text(). RuntimeError contract for missing-key preserved (callers catch it specifically).
main.py optimize_prompt built its own anthropic.AsyncAnthropic ad-hoc Routes through get_default_provider().
llm_provider.py (new) 460 lines, fully docstring'd.

What did NOT change (intentional, for the OpenRouter follow-up)

  • config.pyanthropic_api_key + claude_model settings stay as-is. Renaming would break user-stored settings; a future llm_provider setting will sit beside them.
  • settings_store.py — same reason.
  • tracker.py + memory.py — Claude pricing tables are hardcoded for cost estimation. When a non-Anthropic provider is wired the pricing dict gets extended; today the model_provider='anthropic' SQL filters in tracker.py:643,654 still match correctly.
  • No OpenRouter / OpenAI / Gemini implementation — that's the next deliverable. This PR just makes it a single-file add.

Why ship this now (without the second provider yet)

We discussed adding OpenRouter support in this session and concluded it's worth doing after Lifetime ships — but doing the abstraction work upfront has zero ship-blocker risk (pure refactor, identical behaviour) and removes the biggest drag on the future change: rewriting four call sites in lockstep with the new SDK. With this seam in place, the OpenRouter PR is "add OpenRouterProvider + a settings toggle" — touched-files count goes from 5 → 2.

Verification

  • python -c "import main" succeeds.
  • python -c "import llm_provider; p = llm_provider.get_default_provider(); print(p.provider_name, p.is_configured())"anthropic False (no key in test env, expected).
  • grep -n "^import anthropic" *.py → only llm_provider.py matches.
  • All four call sites still import + run.

Test plan

  • Smoke-test escalation in dev: trigger a low-confidence subtask, confirm Claude resolves it, confirm a row appears in usage_records with model_provider='anthropic'.
  • Smoke-test the whole-task takeover (the "Claude fix" button on a failed task) — confirm the multi-turn tool-use loop still completes and tool results round-trip correctly.
  • Smoke-test skill generation — Settings → Skills → Generate, confirm a new skill file lands in skills/ and is syntactically valid.
  • Smoke-test the prompt optimiser button on a draft task description — confirm the rewritten text comes back and tokens_in/out are non-zero.
  • Confirm the Usage tab still shows Claude rows correctly (the model_provider='anthropic' filter in tracker.py still applies).

🤖 Generated with Claude Code

…iction

Pure refactor — no behaviour change, no new dependencies, no config
flag. Sets up the seam so a follow-up can add OpenRouter / OpenAI /
Gemini in one file without touching the call sites.

What changed
------------
New module llm_provider.py:
  * LLMProvider ABC — model_name / provider_name / is_configured /
    complete_text / complete_with_tools / new_history.
  * LLMHistory ABC — provider-managed conversation state. Subclasses
    store messages in their native shape (Anthropic content blocks
    today; OpenAI tool_calls in a future implementation).
  * Value objects (LLMUsage, LLMTextResponse, LLMToolUse,
    LLMToolTurnResponse, LLMToolResult) so callers never see SDK-
    specific types.
  * AnthropicProvider — wraps the Anthropic SDK, lazy-imports it so
    "import llm_provider" works even when anthropic isn't installed
    (landing-page bundle, local-only mode).
  * get_default_provider() — module-level singleton factory. Today
    always returns AnthropicProvider; later switches on a future
    config.llm_provider setting.

escalation.py:
  * Removed "import anthropic". EscalationAgent now takes an optional
    LLMProvider in __init__ (defaults to get_default_provider()).
  * resolve(), resolve_whole_task(), synthesize_summary() all route
    through provider.complete_text / complete_with_tools.
  * Tool-use loop in resolve_whole_task() uses provider.new_history()
    + history.add_assistant_turn() / add_tool_results() — same
    Anthropic message shape today, but the loop logic itself is
    provider-agnostic.
  * Logger / error messages now reference provider.provider_name
    instead of hard-coding "Claude" so swapped providers self-label.

skill_factory.py:
  * Removed "import anthropic". SkillFactory takes optional
    LLMProvider; _generate_code() routes through provider.complete_text.
  * Preserved the RuntimeError contract for missing-key (callers
    catch it specifically).

main.py:
  * optimize_prompt endpoint routes through get_default_provider()
    instead of building its own anthropic.AsyncAnthropic ad-hoc.
  * Drops the inline "import anthropic".

What did NOT change (follow-ups for the OpenRouter implementation)
-------------------------------------------------------------------
  * config.py — anthropic_api_key + claude_model settings stay as-is
    (renaming would break user-stored settings; a future llm_provider
    setting will sit beside them).
  * settings_store.py — same reason.
  * tracker.py + memory.py — Claude pricing tables hardcoded for cost
    estimation. When a non-Anthropic provider is wired we will extend
    the pricing dict; today the model_provider='anthropic' SQL filters
    in tracker.py:643,654 still match correctly.
  * No OpenRouter / OpenAI / Gemini implementation — that's the next
    deliverable; this commit just makes it a single-file add.

Verification
------------
python -c "import main" succeeds.
grep -n "^import anthropic" *.py shows no matches outside llm_provider.py.
Behaviour identical to pre-refactor: all four call sites hit the same
Anthropic API the same way.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
m31527 added a commit that referenced this pull request Jun 9, 2026
Bug report: "整理 threads.com 上最近討論黴菌的熱門推文 top 20" →
4m 40s, 4 plausibly-shaped posts, claimed produced threads_mold_
discussion.html, file didn't exist on download attempt. Three failure
modes stacked:

  1. URL: threads.com is NOT Meta's Threads — it's an unrelated US
     insurance company that redirects somewhere benign. The Meta
     domain is threads.NET. Anonymous browser_fetch on threads.com
     hits the wrong site, the page has no posts, the planner had
     nothing to work with.
  2. No dedicated Threads tool. Planner fell back to python_exec
     for scraping. Small model + unknown DOM = wrong selectors
     producing either empty results or fully hallucinated content
     (which is what the "矽利康發霉、老舊房屋健康影響" categories
     look like — too neatly summarised to be real scrape output).
  3. Synthesizer claimed "📎 產出:threads_mold_discussion.html"
     without verifying. python_exec didn't actually write the file
     (or wrote it to a path that file_tool can't see). When the user
     asked to download, file_tool returned "查無檔案".

This commit fixes #1 + #2 with a dedicated tool, same pattern as
x_search:

  - Canonical URL hard-coded to https://www.threads.net/search?q=
    — `threads_search(query='foo')` ignores whatever the user typed
    and goes to the right place.
  - DOM extractor pinned to data-testid / role / dir=auto attributes
    that Meta keeps stable across UI refreshes. Extracts handle,
    displayName, text, permalink URL, posted_at timestamp, replies/
    likes/reposts counts via aria-label parsing.
  - Login check before extracting — if Threads bounced us to /login,
    return a clear "log in via Continue with Instagram in
    login-helper.sh" error instead of returning empty results.
  - v1.1.6 hardening reused via _open_attached_page() (stealth init,
    cookies inject, etc.); rate limit registered at import time so
    chrome_attach_check reports it (cap 20/hr same as x_search).
  - Real mouse-wheel scrolling + ±15% jitter for the lazy-load loop,
    same as x_search.

Planner prompt:

  - New "【特殊:Threads(Meta)任務】" routing rule. Includes the
    threads.com vs threads.net trap so the planner doesn't reproduce
    the user's typo.
  - New "【特殊:產出 HTML / 報告檔案】" rule that mandates a
    write_file subtask when the goal asks for a file. Adds the
    "DON'T claim 📎 產出 in the synthesizer without actually
    write_file'ing" guardrail explicitly, so future hallucinated-
    artifact scenarios get caught at planning time.

#3 is partially addressed by the planner rule but the synthesizer
itself still has no automatic "did the claimed file actually land
in workspace" verification. That's a follow-up — for now the
planner rule covers the common path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@ShreyasP10 ShreyasP10 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LLMProvider / LLMHistory abstract base classes make the system truly provider‑agnostic – no Anthropic‑specific types leak outside llm_provider.py.

LLMProviderError unifies all SDK errors, so callers never need to import a specific SDK.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants