Progress: core runtime refactor checkpoints #192
- Add docs/architecture/ with 11 deep-dive docs covering CC patterns: query loop, tool execution, state/agents, security/permissions, API/prompt infra, PowerShell, plugins, settings/platform, compaction pipeline (4-layer, SM-Compact, Legacy Compact details)
- Add cc-patterns.md master blueprint with LangChain mapping, implementation priority roadmap (Phase 1-5), and PARTIAL gap registry
- Refactor core agent modules: chat_tool_service, delivery, service, agent runtime, registry, filesystem/search/wechat tool services
- Add core/runtime/prompts.py
- Phase 1: slim system prompt; move tool usage guidance to descriptions, keep only sub-agent type routing in system prompt
- Phase 2: rewrite all tool descriptions to convey non-intuitive boundary conditions (Read/Write/Edit/Glob/Grep/Bash/Agent/WebSearch/WebFetch/TaskOutput/TaskStop/TaskCreate/tool_search/load_skill)
- Phase 3: add pages param to Read schema; add line_numbers param to Grep schema and handler; add subagent_type enum to Agent schema
- Phase 4: mark WebSearch/WebFetch/tool_search/load_skill/TaskGet/TaskList/wechat_contacts as is_concurrency_safe + is_read_only
- Phase 5: sub-agent tool filtering: AGENT_DISALLOWED/EXPLORE_ALLOWED/PLAN_ALLOWED/BASH_ALLOWED constants; LeonAgent accepts extra_blocked_tools and allowed_tools; _run_agent applies per-type filters
- Phase 6: add LSP placeholder to tool_catalog (deferred, default=False)
- Extras: search_hint for Agent/TaskOutput/TaskStop/chat tools/wechat_send; TaskOutput marked is_read_only; Edit description adds .ipynb workaround; fix prompt caching to place cache_control on system_message content block; add forkContext parent message inheritance with _filter_fork_messages; expose set_current_messages ContextVar for sub-agent context passing
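The Phase 5 filtering could look roughly like the sketch below. The constant names come from the commit message, but their contents, the sub-agent type strings, and the `filter_tools` helper are assumptions made for illustration, not the actual implementation.

```python
from typing import Iterable, List

# Constant names are from the commit message; the members are hypothetical.
AGENT_DISALLOWED = frozenset({"Agent", "TaskCreate"})
EXPLORE_ALLOWED = frozenset({"Read", "Glob", "Grep"})
PLAN_ALLOWED = frozenset({"Read", "Glob", "Grep", "WebSearch"})
BASH_ALLOWED = frozenset({"Bash", "Read"})

# Hypothetical mapping from sub-agent type to its allow-list.
_ALLOWED_BY_TYPE = {
    "explore": EXPLORE_ALLOWED,
    "plan": PLAN_ALLOWED,
    "bash": BASH_ALLOWED,
}


def filter_tools(
    all_tools: Iterable[str],
    subagent_type: str,
    extra_blocked_tools: Iterable[str] = (),
) -> List[str]:
    """Return the tool names a sub-agent of the given type may use."""
    blocked = AGENT_DISALLOWED | frozenset(extra_blocked_tools)
    # Types without an entry get no allow-list, only the block-list.
    allowed = _ALLOWED_BY_TYPE.get(subagent_type)
    return [
        name
        for name in all_tools
        if name not in blocked and (allowed is None or name in allowed)
    ]
```

The block-list is applied first and the allow-list second, so a tool that appears in both is still excluded.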
- Add --max-columns 500 to suppress minified/base64 output
- Add missing VCS excludes: .svn, .hg, .bzr, .jj, .sl
- Default head_limit 250 (matches CC's undocumented cap)
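As a rough illustration of how these three changes might combine, here is a sketch of a ripgrep argument builder. The function names are hypothetical, and including `.git` in the exclude list is an assumption (the commit only lists the directories that were missing).

```python
from typing import List, Sequence

# .git is assumed to have been excluded already; the rest are from the commit.
VCS_EXCLUDES = (".git", ".svn", ".hg", ".bzr", ".jj", ".sl")
DEFAULT_HEAD_LIMIT = 250


def build_rg_args(pattern: str, path: str = ".", line_numbers: bool = True) -> List[str]:
    """Build a ripgrep command line (hypothetical helper)."""
    # --max-columns makes rg omit overly long lines such as minified JS or base64.
    args = ["rg", "--max-columns", "500"]
    if line_numbers:
        args.append("--line-number")
    for directory in VCS_EXCLUDES:
        # A leading "!" negates the glob, excluding the directory.
        args += ["--glob", f"!{directory}"]
    # "--" keeps patterns that start with "-" from being parsed as flags.
    args += ["--", pattern, path]
    return args


def apply_head_limit(lines: Sequence[str], head_limit: int = DEFAULT_HEAD_LIMIT) -> List[str]:
    """Keep only the first head_limit result lines."""
    return list(lines[:head_limit])
```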
Registers a DEFERRED LSP tool providing code intelligence: goToDefinition, findReferences, hover, documentSymbol, workspaceSymbol.
- _LSPSession: holds multilspy LanguageServer alive in a background asyncio task using start_server() context manager + Event-based lifecycle control
- LSPService: lazy per-language session pool, auto-detects language from file extension, converts absolute paths to workspace-relative
- Integrated into LeonAgent._init_services() with CleanupRegistry at priority 1
- Optional dep: pip install multilspy (or leonai[lsp])
- Supported: python, typescript, javascript, go, rust, java, ruby, kotlin, csharp
- Language servers auto-downloaded on first use per multilspy design
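The language auto-detection and path conversion described for LSPService might be sketched as follows. The extension table and both function names are assumptions; only the list of supported languages comes from the commit message.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table covering the languages listed in the commit.
EXT_TO_LANGUAGE = {
    ".py": "python",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".go": "go",
    ".rs": "rust",
    ".java": "java",
    ".rb": "ruby",
    ".kt": "kotlin",
    ".cs": "csharp",
}


def detect_language(file_path: str) -> Optional[str]:
    """Map a file extension to a language id, or None if unsupported."""
    return EXT_TO_LANGUAGE.get(Path(file_path).suffix.lower())


def to_workspace_relative(file_path: str, workspace_root: str) -> str:
    """Convert an absolute path to a workspace-relative POSIX path.

    Raises ValueError if the file lies outside the workspace root.
    """
    return Path(file_path).relative_to(workspace_root).as_posix()
```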
- multilspy moved from optional to core dependencies (avoid restart cost)
- Add 10 MB file size limit (matches CC LSP spec)
- Add gitignore filtering on returned locations via git check-ignore, batched in groups of 50 (matches CC batch size)
- Remove multilspy availability check from handler (always available now)
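A minimal sketch of the batched gitignore filtering, assuming the paths are passed to `git check-ignore` as arguments. The function names and the injectable `check` parameter are hypothetical, and the sketch ignores the quoting git applies to unusual path names.

```python
import subprocess
from typing import Callable, List, Sequence, Set

BATCH_SIZE = 50  # batch size stated in the commit message


def _git_check_ignore(paths: Sequence[str], cwd: str) -> Set[str]:
    """Return the subset of paths that git reports as ignored."""
    # git check-ignore prints each ignored path on its own line and
    # exits with status 1 when none of the paths are ignored.
    proc = subprocess.run(
        ["git", "check-ignore", "--", *paths],
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return set(proc.stdout.splitlines())


def filter_ignored(
    paths: Sequence[str],
    cwd: str = ".",
    check: Callable[[Sequence[str], str], Set[str]] = _git_check_ignore,
) -> List[str]:
    """Drop ignored paths, querying git in batches of BATCH_SIZE."""
    ignored: Set[str] = set()
    for start in range(0, len(paths), BATCH_SIZE):
        ignored |= check(paths[start:start + BATCH_SIZE], cwd)
    return [p for p in paths if p not in ignored]
```

Passing the checker as a parameter keeps the batching logic testable without a git repository.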
Adds 4 missing LSP operations via multilspy internal API:
- goToImplementation (textDocument/implementation)
- prepareCallHierarchy (textDocument/prepareCallHierarchy)
- incomingCalls (callHierarchy/incomingCalls)
- outgoingCalls (callHierarchy/outgoingCalls)
Total supported operations: 9 (matches CC LSP tool surface). incomingCalls/outgoingCalls take the 'item' output from prepareCallHierarchy. Language auto-detected from item.uri for call hierarchy ops.
- _fmt_symbol: handle both SymbolInformation (workspaceSymbol, has location.uri) and DocumentSymbol (documentSymbol, has top-level range/selectionRange)
- request_definition/references/hover/document_symbols: catch AssertionError from multilspy when server returns None (maps to empty result / no hover)
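The two response shapes can be told apart by the presence of a `location` key. The sketch below is a guess at how such a formatter might branch; `fmt_symbol`, its `default_uri` parameter, and the output format are illustrative assumptions, while the field names follow the LSP specification.

```python
def fmt_symbol(sym: dict, default_uri: str = "") -> str:
    """Format an LSP symbol as 'name (kind K) uri:line:col'."""
    if "location" in sym:
        # SymbolInformation (workspaceSymbol): position nested under location.
        uri = sym["location"]["uri"]
        start = sym["location"]["range"]["start"]
    else:
        # DocumentSymbol (documentSymbol): no uri of its own, so the caller
        # supplies the document uri; selectionRange points at the name.
        uri = default_uri
        start = sym["selectionRange"]["start"]
    # LSP positions are zero-based; convert to one-based for display.
    line = start["line"] + 1
    col = start["character"] + 1
    return f"{sym['name']} (kind {sym['kind']}) {uri}:{line}:{col}"
```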
…langserver
Python's Jedi server doesn't support goToImplementation or call hierarchy.
Add _PyrightSession — a minimal asyncio LSP client over stdio — that talks to
pyright-langserver (bundled with `pip install pyright`, already a core dep).
Changes:
- _PyrightSession: JSON-RPC/Content-Length stdio client, initialize handshake,
textDocument/didOpen, callHierarchy/{incomingCalls,outgoingCalls},
textDocument/{implementation,prepareCallHierarchy}
- Acks server-to-client requests (window/workDoneProgress/create etc.)
- Keeps files open for session lifetime (required for call hierarchy)
- LSPService routes Python advanced ops to pyright, other languages to multilspy
- Fix _fmt_symbol: handle both SymbolInformation (workspaceSymbol) and
DocumentSymbol (documentSymbol) response formats
- Fix AssertionError from multilspy null responses → empty result
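For readers unfamiliar with the LSP base protocol, the Content-Length framing and the request ack mentioned above can be sketched as below. This is not the _PyrightSession code; the helper names are hypothetical and the sketch works on byte buffers rather than a live stdio pipe.

```python
import json
from typing import Optional, Tuple


def encode_message(payload: dict) -> bytes:
    """Frame one JSON-RPC message with a Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body


def decode_message(buffer: bytes) -> Tuple[Optional[dict], bytes]:
    """Return (message, rest), or (None, buffer) if the frame is incomplete."""
    header_end = buffer.find(b"\r\n\r\n")
    if header_end == -1:
        return None, buffer
    length = None
    for line in buffer[:header_end].split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    start = header_end + 4
    if len(buffer) < start + length:
        return None, buffer
    return json.loads(buffer[start:start + length]), buffer[start + length:]


def ack_for(message: dict) -> Optional[dict]:
    """Build an empty-result reply for a server-to-client request."""
    # A request carries both "method" and "id"; notifications have no id.
    if "method" in message and "id" in message:
        return {"jsonrpc": "2.0", "id": message["id"], "result": None}
    return None
```

Replying with a null result is enough for requests such as window/workDoneProgress/create, which only need an acknowledgement.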
- pyproject.toml: add core.tools.lsp to packages list (was missing; would cause lsp tool to be absent after pip install leonai)
- pyproject.toml: add pyright>=1.1.0 as core dep (required by _PyrightSession)
- lsp/service.py: remove unused _wait_for_idle, _active_progress, _idle_event, _progress_started from _PyrightSession (pyright doesn't send $/progress)
- plan-tool-alignment.md: replace Phase 6 placeholder with actual implementation summary (9 operations, dual-backend architecture, deps)
Language servers (multilspy + pyright) now live in a module-level _LSPSessionPool instead of per-LSPService instances. Sessions are keyed by (language, workspace_root), start lazily on first use, and survive agent restarts. Cleanup moved from CleanupRegistry to the backend lifespan finally block via `await lsp_pool.close_all()`.
- Add _LSPSessionPool with asyncio.Task-based dedup for concurrent starts
- Simplify LSPService to delegate all session management to lsp_pool
- Remove _cleanup_lsp_service from LeonAgent and CleanupRegistry
- Add lsp_pool.close_all() to backend/web/core/lifespan.py shutdown
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
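One way to get Task-based dedup is to cache the start task itself, keyed by (language, workspace_root), so concurrent callers await the same task. The sketch below assumes a hypothetical `SessionPool` class, an injected `start` coroutine, and sessions that expose an async `close()`; it is not the actual _LSPSessionPool.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple

Key = Tuple[str, str]


class SessionPool:
    """Lazily starts one session per (language, workspace_root)."""

    def __init__(self, start: Callable[[str, str], Awaitable[Any]]) -> None:
        self._start = start
        self._tasks: Dict[Key, "asyncio.Task[Any]"] = {}

    async def get(self, language: str, workspace_root: str) -> Any:
        key = (language, workspace_root)
        task = self._tasks.get(key)
        if task is None:
            # The first caller creates the task; later callers reuse it.
            task = asyncio.create_task(self._start(language, workspace_root))
            self._tasks[key] = task
        try:
            # shield() keeps one cancelled caller from killing the shared start.
            return await asyncio.shield(task)
        except Exception:
            # Forget failed starts so a later call can retry.
            if self._tasks.get(key) is task:
                del self._tasks[key]
            raise

    async def close_all(self) -> None:
        tasks = list(self._tasks.values())
        self._tasks.clear()
        for task in tasks:
            try:
                session = await task
            except Exception:
                continue  # a failed start has nothing to close
            await session.close()
```

Because the task is stored before the first await, two callers racing on the same key cannot both start a server.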
Processed the concrete review items on this PR and merged latest
Addressed slices:
Verification run on the relevant slices:
I also had a second hostile pass on each narrow slice outside the main loop; those all came back green. The remaining higher-level concerns from the review (e.g.
Follow-up on the review items from #4058409950:
Handled slices now on this branch:
Local verification on the current head:
Independent hostile re-checks on the narrowed slices are green as well. The only residual caveat I would call out is scope honesty: the router-test drift fixes prove the router contract, not deeper AuthService/Supabase behavior by themselves.
Simplification Suggestions: Same Power, Less Complexity
Following up on the architecture review. These are patterns in the PR that could be simpler AND more extensible with relatively small changes. Ordered by impact.
1. Eliminate sync/async twins in
| # | Change | Lines saved | Extensibility gain | Effort |
|---|---|---|---|---|
| 1 | Kill sync/async twins | ~400 | High — removes footgun | Medium |
| 2 | CheckpointStore interface | ~60 | Very high — unblocks langgraph removal | Medium |
| 3 | Recovery strategy chain | 0 (reorg) | High — new strategies = 1 function | Low |
| 4 | Explicit middleware declaration | ~20 | Medium | Very low |
| 5 | Executor DI | ~10 | Medium — enables unit testing | Low |
| 6 | Constructor injection | ~10 | Low — prevents silent failures | Very low |
Total: ~500 lines of refactoring to go from "works but only the author can extend" to "anyone can add a recovery strategy / middleware / checkpoint backend".
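To make row 3 concrete, a recovery strategy chain where each strategy is a single function might look like the sketch below. Every name here (`ContextTooLong`, `compact_on_overflow`, `recover`, the state shape) is hypothetical, since the review comment's details are not shown above.

```python
from typing import Callable, Optional, Sequence

# A strategy returns a repaired state, or None if it does not apply.
RecoveryStrategy = Callable[[Exception, dict], Optional[dict]]


class ContextTooLong(Exception):
    """Hypothetical error raised when the prompt exceeds the context window."""


def compact_on_overflow(error: Exception, state: dict) -> Optional[dict]:
    """Hypothetical strategy: keep only the two most recent messages."""
    if isinstance(error, ContextTooLong):
        return {**state, "messages": state["messages"][-2:]}
    return None


def recover(error: Exception, state: dict, strategies: Sequence[RecoveryStrategy]) -> dict:
    """Try each strategy in order; re-raise if none can handle the error."""
    for strategy in strategies:
        new_state = strategy(error, state)
        if new_state is not None:
            return new_state
    raise error
```

Adding a new recovery path then means writing one function and appending it to the list passed to `recover`.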
Follow-up closeout for review thread 4058409950. Everything I could address in this PR has now been handled on the branch, including the later CI fallout from those changes. The last follow-up slices were:
Current status on this PR head (950f3e5):
I also ran the relevant local verification before pushing, and the partner hostile/review loop came back green on the narrow review slices, including the final Windows-only boundary. If there is still a remaining reviewer concern, it should now be a new narrow item rather than an unaddressed piece of the original thread.
# Conflicts:
#	docs/en/configuration.md
#	docs/zh/configuration.md
🚀 Staging deployment triggered. Branch:
Deployment note for reviewers (sanitized):
Current status
Important caveat: staging CD is currently misleading
Local deployment: what may trip you up
Minimal local verification steps
What is normal vs. not normal
Recommended next step
Nice work on the review items — the fork encapsulation ( Quick follow-up on the simplification suggestions thread (#4187099517) — any thoughts on those? Specifically interested in your take on:
Not suggesting these go into this PR if you want to keep scope tight; just want to know if you see them as valid follow-ups or if there's a reason the current shape is intentional.
Deployment follow-up (sanitized):
One remaining shared-staging note that is separate from the thread-detail/runtime 500:
Summary
- `QueryLoop.aget_state/aupdate_state` bridge for backend/web callers after the reopened ql-06 regression
- `__start__` appends and `RemoveMessage`-based repair updates
- `_repair_incomplete_tool_calls()` and `get_thread_history()` so the caller contract stays locked
Test Plan
- `uv run pytest tests/unit/test_loop.py tests/test_query_loop_backend_bridge.py -q`
- `:8010` reported the original caller-surface blocker no longer reproduces