feat(memory_tree): consolidate module + add agentic walk tool + tests #2556
…l impls into one module

Move three locations into a single first-class `src/openhuman/memory_tree/` module:
- `src/openhuman/memory/tree/` -> `src/openhuman/memory_tree/`
- `src/openhuman/tree_summarizer/` -> `src/openhuman/memory_tree/summarizer/`
- `src/openhuman/tools/impl/memory/tree/` -> `src/openhuman/memory_tree/tools/`

Public RPC method names (`openhuman.memory_tree_*`), tool names (`memory_tree`), and controller-schema symbol names are unchanged - this is a code-location refactor only. `core/all.rs` keeps the same imports via re-exports on the new module.
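As a rough illustration of how re-exports can keep call sites stable after a move like this, here is a minimal single-file sketch. The module contents (`summarize`, `TOOL_NAME`) are hypothetical stand-ins, not the symbols the PR actually re-exports.

```rust
// Hypothetical miniature of the consolidated layout; bodies are invented.
mod openhuman {
    // New first-class home for everything memory-tree related.
    pub mod memory_tree {
        pub mod summarizer {
            // Placeholder for the relocated summarizer code.
            pub fn summarize(text: &str) -> String {
                format!("summary({})", text)
            }
        }
        pub mod tools {
            // The tool name stays the same even though the code moved.
            pub const TOOL_NAME: &str = "memory_tree";
        }
        // Re-export from the module root so importers need only one path.
        pub use self::summarizer::summarize;
        pub use self::tools::TOOL_NAME;
    }
}

// An aggregator (the role `core/all.rs` plays in the PR) imports from the root.
use openhuman::memory_tree::{summarize, TOOL_NAME};

fn main() {
    println!("{} -> {}", TOOL_NAME, summarize("notes"));
}
```

With this shape, a later move inside `memory_tree` only requires updating the `pub use` lines, not every importer.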
New `MemoryTreeWalkTool` that, given a free-text query, runs a turn-based inner LLM loop over inner navigation primitives (`descend`, `peek`, `fetch_leaves`, `answer`) to walk the memory tree and return a synthesized answer + step trace. Uses the lightweight summarization model from `config.local_ai.chat_model_id`; capped at 6 turns by default (hard cap 20). Wired as a new `"walk"` mode on the consolidated `MemoryTreeTool` dispatcher and registered as a standalone `MemoryTreeWalkTool` in `tools/ops.rs`. Inner tool-calling uses the existing XML `<tool_call>` convention from the agent harness rather than structured `tool_calls`, since `Provider::chat_with_history` returns a plain `String`. Includes 3 unit tests (happy-path walk, max-turn cap, unknown-node recovery) driven by a scripted stub `Provider`.
…er and walk tool
- summarizer/engine.rs: mark `group_by_hour` and `propagate_node` `pub(crate)`, wire `engine_tests.rs` as a sibling test module.
- summarizer/engine_tests.rs: 14 unit tests covering group_by_hour edge cases, propagate_node (no-children noop, day/month from children, created_at preservation), and run_summarization (empty buffer, single-hour drain, ancestor chain, multi-hour grouping), all driven by a scripted stub Provider.
- tests/memory_tree_summarizer_e2e.rs: 3 e2e tests calling run_summarization directly with a ScriptedProvider stub. Covers full hour->day->month->year->root build, merge-into-existing-hour with created_at preservation, and partial-progress retention on mid-run LLM error.
- tests/memory_tree_walk_e2e.rs: 3 e2e tests for the walk tool driving an OpenAI-compatible Provider against a wiremock server scripted with XML `<tool_call>` responses. Covers happy-path descend->fetch->answer, max-turn cap, and graceful unknown-node recovery.

All 20 new tests pass.
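The hour-level grouping that these tests exercise can be pictured with a small sketch. The real `group_by_hour` signature is not shown in this PR text, so the function below is an assumption: it takes `(unix_seconds, text)` pairs and buckets them by the start of their UTC hour.

```rust
use std::collections::BTreeMap;

const SECS_PER_HOUR: i64 = 3_600;

/// Hypothetical stand-in for `group_by_hour`: buckets entries by hour start.
/// The key is the unix timestamp of the first second of the hour.
fn group_by_hour(entries: &[(i64, &str)]) -> BTreeMap<i64, Vec<String>> {
    let mut buckets: BTreeMap<i64, Vec<String>> = BTreeMap::new();
    for &(ts, text) in entries {
        // div_euclid floors toward negative infinity, so timestamps
        // before 1970 still land in the correct hour bucket.
        let hour_start = ts.div_euclid(SECS_PER_HOUR) * SECS_PER_HOUR;
        buckets.entry(hour_start).or_default().push(text.to_string());
    }
    buckets
}

fn main() {
    // 14:00:00, 14:59:59 and 15:00:00 on 1970-01-01 UTC.
    let grouped = group_by_hour(&[(50_400, "a"), (53_999, "b"), (54_000, "c")]);
    for (hour_start, texts) in &grouped {
        println!("{} -> {:?}", hour_start, texts);
    }
}
```

A `BTreeMap` keeps the buckets in chronological order, which is convenient if each hour is then summarized before its day, month, and year ancestors.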
📝 Walkthrough
Large-scale namespace migration from `openhuman::memory::tree` to `openhuman::memory_tree` across controllers, CLI, jobs, stores, retrieval, scoring, and agents. Introduces `memory_tree::summarizer`, removes legacy exports, and adds a new multi-turn MemoryTree Walk tool with registration and comprehensive unit/e2e tests.
Changes
Memory-tree consolidation and tooling
Sequence Diagram(s)
sequenceDiagram
participant Client as MemoryTreeWalkTool
participant LLM as Provider
participant Store as Retrieval/Store
Client->>LLM: chat_with_history(query + node context)
LLM-->>Client: text + <tool_call name="descend/peek/fetch_leaves/answer">
Client->>Store: fetch node/children or leaves (per tool_call)
Store-->>Client: results (context or leaf text)
Client->>LLM: next turn with updated history
LLM-->>Client: <tool_call "answer"> final text
Client-->>Client: assemble WalkOutcome(trace, answer)
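The turn loop in the diagram can be sketched as follows. This is a simplified model, not the PR's `run_walk`: the provider is a scripted stub, the `<tool_call>` payload is a plain `verb argument` string rather than the real format, the store lookup is replaced by a placeholder, and all type and function names here are hypothetical.

```rust
use std::collections::VecDeque;

/// Why the loop stopped (hypothetical, loosely modeled on `WalkStopReason`).
#[derive(Debug, PartialEq)]
enum StopReason {
    Answered,
    MaxTurns,
}

/// Stand-in for the LLM provider: replays canned replies in order.
struct ScriptedProvider {
    replies: VecDeque<String>,
}

impl ScriptedProvider {
    fn chat(&mut self, _history: &[String]) -> String {
        self.replies.pop_front().unwrap_or_default()
    }
}

/// Returns the trimmed text between the first `<tool_call>` pair, if any.
fn first_tool_call(reply: &str) -> Option<&str> {
    let start = reply.find("<tool_call>")? + "<tool_call>".len();
    let end = reply[start..].find("</tool_call>")? + start;
    Some(reply[start..end].trim())
}

/// Runs at most `max_turns` provider turns and records each tool call.
fn run_walk(
    provider: &mut ScriptedProvider,
    query: &str,
    max_turns: usize,
) -> (StopReason, Option<String>, Vec<String>) {
    let mut history = vec![format!("query: {}", query)];
    let mut trace = Vec::new();
    for _ in 0..max_turns {
        let reply = provider.chat(&history);
        history.push(reply.clone());
        let call = match first_tool_call(&reply) {
            Some(c) => c.to_string(),
            None => {
                // Tell the model what went wrong and spend the turn.
                history.push("no tool call found".to_string());
                continue;
            }
        };
        trace.push(call.clone());
        if let Some(text) = call.strip_prefix("answer ") {
            return (StopReason::Answered, Some(text.to_string()), trace);
        }
        // Placeholder for the store lookup (descend / peek / fetch_leaves).
        history.push(format!("result of `{}`: <store output>", call));
    }
    (StopReason::MaxTurns, None, trace)
}

fn main() {
    let mut provider = ScriptedProvider {
        replies: VecDeque::from(vec![
            "<tool_call>descend root/2024</tool_call>".to_string(),
            "<tool_call>fetch_leaves root/2024/05</tool_call>".to_string(),
            "<tool_call>answer shipped in May</tool_call>".to_string(),
        ]),
    };
    let (reason, answer, trace) = run_walk(&mut provider, "when did it ship?", 6);
    println!("{:?} {:?} ({} steps)", reason, answer, trace.len());
}
```

The sketch returns no answer when the turn cap is hit; the review text suggests the actual tool synthesizes a fallback answer in that case.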
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/openhuman/composio/providers/profile.rs (1)
1-847: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftSplit this module to get back under the repository size threshold.
src/openhuman/composio/providers/profile.rsis now 847 lines, which exceeds the ~500-line limit and makes further changes riskier to review and maintain. Please split this into focused sibling modules (e.g., identity types/parsing, persistence, read paths, prompt rendering, tests).As per coding guidelines
**/*.{ts,tsx,rs}: “File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/composio/providers/profile.rs` around lines 1 - 847, Split the oversized profile.rs into smaller focused modules: create identity.rs (define IdentityKind, canonicalize, parse_skill_identity_key), persist.rs (persist_provider_profile, expand_identity_rows, json_str, and any profile::profile_upsert/learning_candidate usage), read.rs (load_connected_identities, is_self_identity, is_self_identity_any_toolkit, delete_connected_identity_facets), render.rs (render_connected_identities_section, ConnectedIdentity struct), and helpers.rs (normalize_token, title_case, sanitize_prompt_value, now_secs). Move the corresponding unit tests into matching test modules or a tests/ submodule and update all internal references (e.g., ProviderUserProfile, FacetType, profile_upsert, learning_candidate::global) to import from the new modules; re-export public symbols from a new mod profile { pub use self::identity::*, self::persist::*, ... } in the original path so external callers keep the same API. Ensure visibility (pub/pub(crate)) and fix imports (use super:: or crate::openhuman::composio::providers::profile::...) and run cargo test to resolve any naming/borrow changes.src/openhuman/memory_tree/retrieval/source.rs (1)
1-690: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Please split this module to comply with the repo’s file-size rule.
This file is substantially above the ~500-line target; moving tests and/or semantic rerank helpers into sibling modules would make it easier to maintain.
As per coding guidelines `**/*.{ts,tsx,rs}`: “File size should not exceed approximately 500 lines.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/retrieval/source.rs` around lines 1 - 690, The module is too large — split out the semantic rerank logic and the tests into sibling modules: move rerank_by_semantic_similarity (and its helper imports like build_embedder_from_config, cosine_similarity and any HashMap/embedding lookup code) into a new retrieval/semantic.rs module and export it (pub(crate) or pub as needed), and move the entire #[cfg(test)] mod tests into retrieval/tests.rs (or retrieval/source_tests.rs) as a test-only module that imports the public helpers from source.rs; update source.rs to declare the new submodules (mod semantic; #[cfg(test)] mod tests;) and replace internal calls like rerank_by_semantic_similarity(...) and any moved helper references with the re-exported symbols, adjust visibility of collect_hits_and_nodes/select_trees/scope_matches_kind if tests need access (make them pub(crate) instead of fn), and fix imports/usages so compilation and tests still pass.
src/openhuman/memory_tree/tools/walk.rs (1)
1-957: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Split this module to stay under the repository file-size threshold.
This file is ~957 lines; please split operational pieces (parser/helpers/tests/adapter) into focused sibling modules.
As per coding guidelines `**/*.{ts,tsx,rs}`: “File size should not exceed approximately 500 lines.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/tools/walk.rs` around lines 1 - 957, Split this large module into focused sibling modules to keep file size under ~500 lines: keep the public API and core loop (MemoryTreeWalkTool, run_walk, WalkOptions, WalkOutcome, WalkStep, WalkStopReason) in the original file and move parsing, adapter, inner helpers, and tests into new modules; specifically extract parse_walk_tool_calls and InnerCall into a parser module, move ChatProviderAdapter into an adapter module, move dispatch_inner_call, build_node_context, build_system_prompt, build_inner_tools_text, and synthesize_fallback_answer into a helpers (or primitives) module, and relocate the #[cfg(test)] test module to a tests module/file; update the original file to import these with mod/use and re-export symbols if needed so run_walk and the Tool implementation still call parser::parse_walk_tool_calls, adapter::ChatProviderAdapter, and helpers::dispatch_inner_call (and helpers::build_node_context, build_system_prompt, build_inner_tools_text, synthesize_fallback_answer) with minimal changes to function signatures.
src/openhuman/memory_tree/jobs/handlers/mod.rs (1)
1-1521: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
This module should be split before merge.
At ~1521 lines, this file is far above the repository threshold and is now difficult to reason about; please break handlers/tests into smaller focused modules.
As per coding guidelines `**/*.{ts,tsx,rs}`: “File size should not exceed approximately 500 lines.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/jobs/handlers/mod.rs` around lines 1 - 1521, Large single-file module (~1521 lines) violates the ~500-line guideline; split it into smaller modules. Extract per-kind handlers (handle_extract, handle_append_buffer, handle_seal, handle_topic_route, handle_digest_daily, handle_flush_stale, handle_reembed_backfill), their related constants (L0_DEFAULT_FLUSH_AGE_SECS, REEMBED_BACKFILL_BATCH, REEMBED_BACKFILL_REVISIT_MS) and helper functions (try_mark_chunk_reembed_skipped, try_mark_summary_reembed_skipped) into a new handlers/*.rs (or multiple files) and re-export or call them from this mod.rs's handle_job dispatcher; move the #[cfg(test)] block/tests into a tests/ submodule or separate test files preserving test functions (e.g., source_tree_seal_handler_enqueues_summary_topic_route, reembed_backfill_repopulates_then_completes, reembed_backfill_tombstones_orphan_and_terminates) and adapt visibility (pub(crate) or pub) and use/import paths accordingly; update mod declarations and use paths in this file so existing callers (handle_job, worker::wake_workers, chunk_store::tree_active_signature, etc.) continue to compile. Ensure transactional helpers (chunk_store::with_connection calls) and logging remain reachable after the split and run cargo test to fix any visibility/import issues.
🧹 Nitpick comments (12)
src/openhuman/channels/runtime/startup.rs (1)
42-635: 🏗️ Heavy lift
Split this startup module into smaller focused units before adding more wiring.
This file is already well beyond the size threshold, which makes channel boot flow ownership and maintenance harder. Please break it into focused modules (e.g., bus/subscriber registration, provider/memory bootstrap, channel construction).
As per coding guidelines "`**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/channels/runtime/startup.rs` around lines 42 - 635, The start_channels function has grown too large; split its responsibilities into smaller functions/modules: extract bus/subscriber registration into a new module/function (e.g., register_startup_subscribers) that encapsulates calls to event_bus::init_global, TracingSubscriber subscribe, register_health_subscriber, register_skill_cleanup_subscriber, the Phase 2–4 learning OnceLock blocks and cron/proactive/tree subscribers; extract provider/memory/bootstrap logic into a new function (e.g., bootstrap_provider_and_memory) that returns (provider, mem, runtime, security, audit, provider_runtime_options) and contains create_intelligent_routing_provider, provider.warmup, host_runtime::create_runtime, SecurityPolicy::from_config, get_or_create_workspace_audit_logger, memory::create_memory_with_local_ai; extract channel list construction into its own function (e.g., build_channel_list) that returns Vec<Arc<dyn Channel>> and contains all config.channels_config.* branch logic and spawn_supervised_listener wiring; keep start_channels to orchestrate the high-level flow (call the new functions, compute backoff/limits, create runtime_ctx, and call run_message_dispatch_loop). Refactor by moving extracted code into new files/modules, keeping existing symbol names (start_channels, spawn_supervised_listener, run_message_dispatch_loop, build_system_prompt, tools::all_tools_with_runtime) so callers remain unchanged and tests compile.
src/openhuman/memory_tree/store_tests.rs (1)
47-771: 🏗️ Heavy lift
Split this test module into smaller focused suites.
The current test file is large enough that targeted maintenance and failure triage are getting expensive; grouping by behavior (connection cache, journaling, re-embed, schema/init) would improve clarity.
As per coding guidelines "`**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/store_tests.rs` around lines 47 - 771, The test file is too large; split it into focused test modules (e.g., connection_cache, journaling/migration, reembed, schema_init) by moving related tests into new test files and keeping shared helpers in a single test_helpers module. For each group, create a new test module/file containing the tests that reference the same behavior (e.g., move with_connection_serialises_concurrent_schema_init, is_transient_cold_start_classifies_known_extended_codes, with_connection_keeps_foreign_keys_on_for_every_call, memory_tree_uses_truncate_journal_not_wal, existing_wal_db_migrates_to_truncate into a journaling/schema module; move connection_cache_returns_same_arc_for_same_workspace, connection_cache_uses_separate_connections_for_different_workspaces, circuit_breaker_trips_after_threshold into a connection_cache module; move clear_chunk_reembed_skipped_is_idempotent, clear_reembed_skipped_for_signature_removes_all_tombstones_for_sig, validate_reembed_skip_key into a reembed module; keep legacy_embeddings_migrate_to_sidecar_once with embedding-related helpers), update imports to use the shared helpers (e.g., test_config, sample_chunk, with_connection, get_or_init_connection, clear_connection_cache, try_cleanup_stale_files, clear_connection_cache, mark_chunk_reembed_skipped, clear_reembed_skipped_for_signature, validate_reembed_skip_key), and add mod declarations so Cargo runs them as tests; ensure visibility of helper functions (pub(crate) or move to a common tests/helpers module) so the split files compile and tests run.
src/bin/slack_backfill.rs (1)
149-577: 🏗️ Heavy lift
Break this CLI binary into subcommand-focused modules.
`main` now spans too much behavior (probe modes, backfill modes, seal probe, ratelimit probe), making changes harder to reason about and test in isolation.
As per coding guidelines "`**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/bin/slack_backfill.rs` around lines 149 - 577, main is too large and should be split into focused subcommand handlers; extract the big conditional blocks into separate functions/modules and dispatch from a small main that only does init and CLI parsing. Specifically: move the seal-probe block into a new handler function (e.g. handle_seal_probe) that takes (&Cli, &Config) and uses ingest_chat; move the SLACK_SEARCH_MESSAGES probe into handle_probe_search(&Cli, &Config, &client_kind) which calls execute_action; move the probe_ratelimit loop into handle_probe_ratelimit(&Cli, &Config, &client_kind) (preserve Outcome enum and list_connections_via_kind usage); extract the search backfill loop that calls run_backfill_via_search into handle_search_backfill(&Cli, &Config, &connections); and extract the non-search per-connection backfill (provider.sync loop) into handle_sync_backfill(&Cli, &Config, &provider, &candidates). Keep init_default_providers, memory::global::init, tracing/env_logger setup and create_composio_client in main, then dispatch to these handlers based on CLI flags; wire return Result<()> through each handler and move related helper imports (chrono, ingest_chat, execute_action, list_connections_via_kind, ProviderContext) into their new modules.
tests/memory_tree_summarizer_e2e.rs (1)
1-579: 🏗️ Heavy lift
Split this test module below the 500-line threshold.
This new file is ~579 lines, which exceeds the repository’s module-size guideline for Rust files. Please split it into focused modules (for example: env/provider harness helpers vs scenario tests) to keep maintenance and reviewability manageable.
As per coding guidelines: "`**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/memory_tree_summarizer_e2e.rs` around lines 1 - 579, The file exceeds the 500-line guideline; split helper/harness code out of the test module into a smaller helper module and keep only the three scenario tests in the e2e test file. Move EnvVarGuard, ENV_LOCK/env_lock, ScriptedProvider (and its Provider impl), build_config, ts_hour14/ts_hour15, and NS into a new module (e.g., summarizer_harness) and make those items public, then in the original tests file replace the moved definitions with a mod/use to import summarizer_harness::{EnvVarGuard, env_lock, ScriptedProvider, build_config, ts_hour14, ts_hour15, NS}; ensure visibility changes (pub) where needed and update imports so engine::run_summarization and store::* usages in the three test functions remain unchanged.
src/openhuman/context/segment_recap_summarizer_tests.rs (1)
19-19: 🏗️ Heavy lift
Split this test module to align with the Rust file-size guideline.
This file exceeds the ~500-line target; please break it into focused test modules.
As per coding guidelines: `**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/context/segment_recap_summarizer_tests.rs` at line 19, The test module in segment_recap_summarizer_tests.rs has grown past the ~500-line guideline; split it into smaller focused test modules (e.g., segment_recap_summarizer_unit_tests.rs, segment_recap_summarizer_integration_tests.rs) by moving related test functions into new files, preserving imports like ChatPrompt and any helper fixtures, exporting or re-exporting shared helpers via a common mod (or a tests/util.rs) so tests still compile, and update the parent mod declarations (pub mod ... or mod ...) so the test suite runs unchanged; ensure each new file contains the appropriate use crate::openhuman::memory_tree::chat::ChatPrompt import and adjust visibility of helpers as needed.
src/openhuman/memory_tree/read_rpc.rs (1)
34-39: 🏗️ Heavy lift
Decompose this RPC module before merge to satisfy size constraints.
This module is far beyond the ~500-line limit and should be split by concern (list/search/recall, mutation endpoints, graph export, LLM config).
As per coding guidelines: `**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/read_rpc.rs` around lines 34 - 39, The file src/openhuman/memory_tree/read_rpc.rs is too large and must be split by concern; create smaller modules (e.g., list_search_recall.rs for listing/search/recall logic that uses content_read and NodeKind/SourceKind, mutations.rs for mutation endpoints that use chunk_store/with_connection and score_store, graph_export.rs for graph export code, and llm_config.rs for LLM configuration and related RPCs), move the corresponding functions/types into those files, export them from a new mod.rs or update the parent mod to pub use the new modules, update imports in callers to reference the new module paths instead of read_rpc.rs, and run cargo check to fix any visibility or import issues.
src/openhuman/memory_tree/retrieval/topic.rs (1)
20-28: 🏗️ Heavy lift
Split this retrieval module to meet the repository size guideline.
The file is now above the ~500-line threshold; splitting query/rerank/hydration/test helpers into submodules will keep maintenance safer.
As per coding guidelines: `**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/retrieval/topic.rs` around lines 20 - 28, The topic.rs module has grown too large; split it into focused submodules (e.g., query.rs, rerank.rs, hydrate.rs, test_helpers.rs) and move the corresponding functionality into them: relocate query-related functions/types into query.rs, reranking logic into rerank.rs, hydration/assembly code into hydrate.rs, and any test helpers into test_helpers.rs; then add mod declarations in topic.rs (mod query; mod rerank; mod hydrate; mod test_helpers;) and re-export the public APIs you need (pub use query::..., etc.), update all internal uses/imports (e.g., hit_from_summary, QueryResponse, RetrievalHit, build_embedder_from_config, cosine_similarity, lookup_entity, EntityHit, Tree/TreeKind) to the new module paths, and adjust visibility (pub/pub(crate)) so existing callers keep working and tests compile.
src/openhuman/memory_tree/tree_global/seal.rs (1)
20-33: 🏗️ Heavy lift
Split this module to stay within the repository size ceiling.
This file is now above the ~500-line limit; please split sealing logic/tests into smaller focused modules.
As per coding guidelines: `**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. When a module grows beyond this threshold, split it into smaller, more focused modules with clear responsibilities.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/tree_global/seal.rs` around lines 20 - 33, This file exceeds the repository size ceiling; split its sealing logic and tests into smaller modules: extract core sealing functions and types (the main seal implementation that uses stage_summary, SummaryComposeInput, SummaryTreeKind, new_summary_id, with_connection, and store) into a focused seal_core.rs, move configuration/threshold constants (GLOBAL_TOKEN_BUDGET, MONTHLY_SEAL_THRESHOLD, WEEKLY_SEAL_THRESHOLD, YEARLY_SEAL_THRESHOLD) into a seal_config.rs, isolate embedding/score helper logic that uses build_embedder_from_config into seal_embed.rs, and place summariser-related code (Summariser, SummaryContext, SummaryInput) and Tree/Buffer types (Buffer, SummaryNode, Tree, TreeKind) in a seal_summariser.rs or re-export them from the new modules; also move large test cases into a parallel tests/ module/file so each source file stays under ~500 lines and update module declarations and re-exports accordingly.
src/core/all.rs (1)
188-190: 🏗️ Heavy lift
Split controller/schema aggregation out of this oversized module.
Adding more registry entries here keeps growing a single hotspot that is already well beyond the size cap; please break registration/schema builders into smaller focused modules and compose them from this file.
As per coding guidelines: `**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines.
Also applies to: 229-229, 316-317, 335-335
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/core/all.rs` around lines 188 - 190, The all.rs module is becoming too large due to direct aggregation of many controllers/schemas; extract the registration logic into smaller focused modules (e.g., create new modules like openhuman::memory_tree::registration and openhuman::memory_tree::retrieval_registration) that each expose functions such as all_memory_tree_registered_controllers and all_retrieval_registered_controllers (or analogous names) and move the schema/controller builder code into those modules, then in all.rs simply call controllers.extend(...) with those exported helper functions so all.rs only composes registrations instead of containing their implementations.
src/openhuman/memory_tree/retrieval/rpc.rs (1)
310-313: 🏗️ Heavy lift
Move this large inline test block into a sibling `rpc_test.rs`.
This module is already over the size threshold; extracting the `#[cfg(test)]` section will keep handler code focused and reduce churn in one file.
As per coding guidelines: `**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines. and `src/**/*.rs`: ... prefer a sibling *_test.rs file ....
Also applies to: 321-324
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/retrieval/rpc.rs` around lines 310 - 313, Move the large inline #[cfg(test)] test block out of this module into a sibling rpc_test.rs to reduce rpc.rs size: cut the entire test module from src/openhuman/memory_tree/retrieval/rpc.rs and paste it into src/openhuman/memory_tree/retrieval/rpc_test.rs, update imports inside the new file to use the same symbols (content_store, upsert_chunks, and types::{chunk_id, Chunk, Metadata, SourceRef}, chrono::Utc) and any crate-relative paths so the tests compile, and remove the #[cfg(test)] section from rpc.rs so only the production handler code remains in that file. Ensure the new rpc_test.rs has the appropriate use declarations and #[cfg(test)] mod so cargo test picks it up.
tests/memory_tree_walk_e2e.rs (1)
1-536: ⚡ Quick win
Split this test module to stay within the file-size cap.
This file is ~536 lines, above the ~500-line threshold. Please extract shared test utilities (e.g., `ScriptedResponder`, seeding/provider helpers) into a sibling test-support module to keep this file focused on scenario assertions.
As per coding guidelines: "`**/*.{ts,tsx,rs}`: File size should not exceed approximately 500 lines."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/memory_tree_walk_e2e.rs` around lines 1 - 536, The module is too large; extract shared test utilities into a sibling test-support module: move ScriptedResponder, env_lock/ENV_LOCK, test_config, make_node, seed_tree, make_provider and any helper imports (e.g., derive_parent_id, level_from_node_id, estimate_tokens, write_node) into a new test-support file/module, re-export or pub use the necessary symbols, then update this test file to import those helpers and keep only the scenario tests (walks_descend_fetch_answer, respects_max_turns_cap_with_mock, handles_unknown_node_gracefully) plus their local setup; ensure run_walk, WalkOptions and WalkStopReason usages remain unchanged and update module paths so the tests compile.
src/openhuman/memory_tree/tools/walk.rs (1)
452-466: ⚡ Quick win
Add diagnostics when a `<tool_call>` block is malformed/invalid JSON.
Right now malformed blocks are silently skipped, which makes walk failures hard to debug in production traces.
As per coding guidelines `**/*.rs`: “Debug logging must follow these rules… log … branches … and errors… All changes lacking diagnosis logging are incomplete.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory_tree/tools/walk.rs` around lines 452 - 466, The parser currently silently skips malformed <tool_call> blocks; update the block in walk.rs (the branch handling `None => break` and the `Some(close_idx)` branch that parses `after_open`) to emit diagnostic logs: when `None` occurs log a debug/error with context (e.g., the remaining `after_open` content) indicating an unclosed/malformed tool_call, when `serde_json::from_str::<Value>(inner.trim())` returns Err log the JSON parse error and the `inner` string, and when `val.get("name")` is missing or not a string log a debug entry showing the parsed `val`; reference the variables `after_open`, `close_idx`, `inner`, the `calls` push of `InnerCall`, and ensure logs use the crate logger (e.g., log::debug!/log::error!) with concise contextual messages.
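A dependency-free sketch of the kind of diagnostics this comment asks for is shown below. It is not the code in `walk.rs`: `scan_tool_calls` and `ParseReport` are hypothetical names, `eprintln!` stands in for the crate logger, and the JSON parsing step (`serde_json` in the real parser) is left out, so only unclosed and empty blocks are reported.

```rust
/// Outcome of scanning one LLM reply for `<tool_call>` blocks (hypothetical type).
#[derive(Debug, Default)]
struct ParseReport {
    /// Raw text of each well-delimited, non-empty block.
    payloads: Vec<String>,
    /// One message per block that had to be skipped.
    diagnostics: Vec<String>,
}

fn scan_tool_calls(reply: &str) -> ParseReport {
    const OPEN: &str = "<tool_call>";
    const CLOSE: &str = "</tool_call>";
    let mut report = ParseReport::default();
    let mut rest = reply;
    while let Some(open_idx) = rest.find(OPEN) {
        let after_open = &rest[open_idx + OPEN.len()..];
        match after_open.find(CLOSE) {
            None => {
                // Previously a silent `break`; now the leftover text is recorded.
                report
                    .diagnostics
                    .push(format!("unclosed <tool_call>; remaining text: {:?}", after_open));
                break;
            }
            Some(close_idx) => {
                let inner = after_open[..close_idx].trim();
                if inner.is_empty() {
                    report
                        .diagnostics
                        .push("empty <tool_call> block skipped".to_string());
                } else {
                    report.payloads.push(inner.to_string());
                }
                rest = &after_open[close_idx + CLOSE.len()..];
            }
        }
    }
    // Stand-in for log::debug!/log::error! in the real crate.
    for message in &report.diagnostics {
        eprintln!("[walk] {}", message);
    }
    report
}

fn main() {
    let reply = "ok <tool_call>{\"name\":\"peek\"}</tool_call> <tool_call>{\"name\":";
    let report = scan_tool_calls(reply);
    println!("{} payload(s), {} diagnostic(s)", report.payloads.len(), report.diagnostics.len());
}
```

Returning the diagnostics alongside the payloads also makes the skipped-block cases straightforward to assert on in unit tests.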
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/core/jsonrpc.rs`:
- Line 1561: Remove the direct domain bootstrap call
`crate::openhuman::memory_tree::jobs::start(config.clone())` from the transport
layer (jsonrpc.rs) and instead invoke that startup from the application/domain
wiring layer (controller/bootstrap code) where domain initialization belongs;
delete the call in `jsonrpc.rs`, ensure `memory_tree::jobs::start` remains a
public domain API (or add a thin public wrapper) that accepts the same `config`,
and call that API from your domain/controller initialization path (e.g., the
central bootstrap or controller init function) so transport code remains
transport-only and the domain start is performed by the wiring layer.
In `@src/openhuman/composio/providers/slack/ingest.rs`:
- Around line 33-39: Docstrings still reference the old module name
"memory::tree"; update them to the new "memory_tree" naming to match the imports
(e.g., symbols ChatBatch, ChatMessage, raw_store/raw_rel_path/RawItem/RawKind,
ingest_chat, set_chunk_raw_refs/RawRef, redact). Search for occurrences of
"memory::tree" in this file and replace with "memory_tree" (and adjust any
surrounding phrasing) so the comments accurately reflect the current module
layout and imported symbols.
In `@src/openhuman/memory_tree/chunker.rs`:
- Around line 19-20: This file is too large; split chunker.rs into focused
modules by moving the core chunking implementation (functions/types that perform
token counting, chunk creation, and public APIs—e.g., functions referencing
approx_token_count, Chunk, Metadata, SourceKind, and redact) into a new
src/openhuman/memory_tree/chunker_core.rs and move helpers and test utilities
into src/openhuman/memory_tree/chunker_helpers.rs (and tests into
chunker_tests.rs or the tests/ directory); declare the new modules and have
chunker.rs re-export their public items with pub use (e.g., pub use
chunker_core::*; pub use chunker_helpers::*;) so existing callers keep using the
same symbols, and adjust use paths to import approx_token_count, Chunk, Metadata,
SourceKind, and redact from the new modules.
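A minimal sketch of the split-and-re-export pattern this comment asks for, assuming inline modules in place of separate files. The function bodies and `split_by_budget` are hypothetical stand-ins, not the crate's actual chunking logic; only the module and `approx_token_count` names come from the comment.

```rust
// In the real crate these would be separate files; inline modules keep the
// sketch in one file. All bodies below are illustrative assumptions.
mod chunker {
    mod chunker_core {
        /// Stand-in token estimate: counts whitespace-separated words.
        /// The real `approx_token_count` may use a different heuristic.
        pub fn approx_token_count(text: &str) -> usize {
            text.split_whitespace().count()
        }
    }

    mod chunker_helpers {
        /// Hypothetical helper: splits `text` into pieces of at most
        /// `max_tokens` words each (a budget of 0 is treated as 1).
        pub fn split_by_budget(text: &str, max_tokens: usize) -> Vec<String> {
            let words: Vec<&str> = text.split_whitespace().collect();
            words
                .chunks(max_tokens.max(1))
                .map(|piece| piece.join(" "))
                .collect()
        }
    }

    // Re-exports keep `chunker::approx_token_count` valid for existing callers
    // even though the implementation now lives in a private submodule.
    pub use chunker_core::approx_token_count;
    pub use chunker_helpers::split_by_budget;
}
```

Callers keep writing `chunker::approx_token_count(...)`; only the `pub use` lines need to change if the internals move again.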
In `@src/openhuman/memory_tree/store.rs`:
- Around line 38-40: The file memory_tree/store.rs is too large and mixes
schema, connection/cache lifecycle, migrations, and embedding logic; split it
into focused modules (e.g., schema.rs, connection.rs, migrations.rs,
embedding.rs, content_store.rs) and keep memory_tree/store.rs as a lightweight
wiring and re-exports file. Move definitions and implementations that deal with
Chunk, Metadata, SourceKind, SourceRef, and StagedChunk out into the appropriate
new modules, export the necessary types from those modules, and update mod
declarations and use sites to re-export the public API from store.rs so callers
see the same symbols but the implementation is decomposed. Ensure each new file
stays under the ~500-line target and preserve existing function signatures and
visibility so tests/builds remain unchanged.
In `@src/openhuman/memory_tree/tools/walk.rs`:
- Around line 775-776: The code in StubProvider uses
self.responses.lock().unwrap().drain(0..1).next(), which panics when the vector
is empty; replace that pattern with a safe check-and-remove: lock the mutex into
a mutable variable (let mut responses = self.responses.lock().unwrap()), return
Err(anyhow::anyhow!("StubProvider: no more scripted responses")) if
responses.is_empty(), otherwise call responses.remove(0) (or
responses.pop_front() if you change the collection to VecDeque) and return that
value; apply the same change to the other occurrence at the later block so both
sites never panic and instead return the intended error.
- Around line 591-621: The branch fetching leaves calls
retrieval::drill_down(config, &node_id, 1, None, Some(10)) and
do_fetch_leaves(...) without passing the active namespace, which can mix data
across namespaces; update the calls to pass the same namespace used by
run_walk/node reads (e.g., the namespace field or variable in scope) into
retrieval::drill_down and into do_fetch_leaves so both queries are
namespace-aware (adjust argument order/signature if necessary to supply
namespace to retrieval::drill_down and do_fetch_leaves).
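For the `StubProvider` finding above (lines 775-776), the check-and-remove pattern might look like the sketch below. `Vec::drain(0..1)` panics on an empty vector because the range end exceeds the length, so the empty case is handled first. The struct layout and the `next_response` method name are assumptions, and `String` stands in for the `anyhow` error type to keep the sketch dependency-free.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the test-only `StubProvider` in walk.rs.
struct StubProvider {
    responses: Mutex<Vec<String>>,
}

impl StubProvider {
    fn new(scripted: &[&str]) -> Self {
        Self {
            responses: Mutex::new(scripted.iter().map(|s| s.to_string()).collect()),
        }
    }

    /// Returns the next scripted response, or an error once the script is
    /// exhausted. The real code would return `anyhow::anyhow!(...)` instead
    /// of a plain `String` error.
    fn next_response(&self) -> Result<String, String> {
        let mut responses = self.responses.lock().unwrap();
        if responses.is_empty() {
            return Err("StubProvider: no more scripted responses".to_string());
        }
        // Safe: the vector is known to be non-empty at this point.
        Ok(responses.remove(0))
    }
}
```

`remove(0)` is O(n), which is fine for a short test script; switching to `VecDeque` and `pop_front()` is the alternative the comment mentions.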
In `@src/openhuman/subconscious/engine.rs`:
- Around line 24-27: The module is too large and mixes responsibilities; split
it into smaller focused sibling modules by extracting the tick loop, provider
routing, parsing, and persistence helpers into their own files; create new
modules (e.g., tick_loop.rs, provider_router.rs, parser.rs, persistence.rs) and
move related functions and types out of the large engine.rs while keeping public
APIs intact, updating imports where MemoryClientRef, build_chat_provider,
ChatConsumer, ChatPrompt, and ChatProvider are referenced so engine.rs composes
these smaller modules rather than containing all logic.
---
Outside diff comments:
In `@src/openhuman/composio/providers/profile.rs`:
- Around line 1-847: Split the oversized profile.rs into smaller focused
modules: create identity.rs (define IdentityKind, canonicalize,
parse_skill_identity_key), persist.rs (persist_provider_profile,
expand_identity_rows, json_str, and any
profile::profile_upsert/learning_candidate usage), read.rs
(load_connected_identities, is_self_identity, is_self_identity_any_toolkit,
delete_connected_identity_facets), render.rs
(render_connected_identities_section, ConnectedIdentity struct), and helpers.rs
(normalize_token, title_case, sanitize_prompt_value, now_secs). Move the
corresponding unit tests into matching test modules or a tests/ submodule and
update all internal references (e.g., ProviderUserProfile, FacetType,
profile_upsert, learning_candidate::global) to import from the new modules;
re-export public symbols from a new mod profile { pub use self::identity::*,
self::persist::*, ... } in the original path so external callers keep the same
API. Ensure visibility (pub/pub(crate)) and fix imports (use super:: or
crate::openhuman::composio::providers::profile::...) and run cargo test to
resolve any naming/borrow changes.
In `@src/openhuman/memory_tree/jobs/handlers/mod.rs`:
- Around line 1-1521: Large single-file module (~1521 lines) violates the
~500-line guideline; split it into smaller modules. Extract per-kind handlers
(handle_extract, handle_append_buffer, handle_seal, handle_topic_route,
handle_digest_daily, handle_flush_stale, handle_reembed_backfill), their related
constants (L0_DEFAULT_FLUSH_AGE_SECS, REEMBED_BACKFILL_BATCH,
REEMBED_BACKFILL_REVISIT_MS) and helper functions
(try_mark_chunk_reembed_skipped, try_mark_summary_reembed_skipped) into a new
handlers/*.rs (or multiple files) and re-export or call them from this mod.rs's
handle_job dispatcher; move the #[cfg(test)] block/tests into a tests/ submodule
or separate test files preserving test functions (e.g.,
source_tree_seal_handler_enqueues_summary_topic_route,
reembed_backfill_repopulates_then_completes,
reembed_backfill_tombstones_orphan_and_terminates) and adapt visibility
(pub(crate) or pub) and use/import paths accordingly; update mod declarations
and use paths in this file so existing callers (handle_job,
worker::wake_workers, chunk_store::tree_active_signature, etc.) continue to
compile. Ensure transactional helpers (chunk_store::with_connection calls) and
logging remain reachable after the split and run cargo test to fix any
visibility/import issues.
In `@src/openhuman/memory_tree/retrieval/source.rs`:
- Around line 1-690: The module is too large — split out the semantic rerank
logic and the tests into sibling modules: move rerank_by_semantic_similarity
(and its helper imports like build_embedder_from_config, cosine_similarity and
any HashMap/embedding lookup code) into a new retrieval/semantic.rs module and
export it (pub(crate) or pub as needed), and move the entire #[cfg(test)] mod
tests into retrieval/tests.rs (or retrieval/source_tests.rs) as a test-only
module that imports the public helpers from source.rs; update source.rs to
declare the new submodules (mod semantic; #[cfg(test)] mod tests;) and replace
internal calls like rerank_by_semantic_similarity(...) and any moved helper
references with the re-exported symbols, adjust visibility of
collect_hits_and_nodes/select_trees/scope_matches_kind if tests need access
(make them pub(crate) instead of fn), and fix imports/usages so compilation and
tests still pass.
In `@src/openhuman/memory_tree/tools/walk.rs`:
- Around line 1-957: Split this large module into focused sibling modules to
keep file size under ~500 lines: keep the public API and core loop
(MemoryTreeWalkTool, run_walk, WalkOptions, WalkOutcome, WalkStep,
WalkStopReason) in the original file and move parsing, adapter, inner helpers,
and tests into new modules; specifically extract parse_walk_tool_calls and
InnerCall into a parser module, move ChatProviderAdapter into an adapter module,
move dispatch_inner_call, build_node_context, build_system_prompt,
build_inner_tools_text, and synthesize_fallback_answer into a helpers (or
primitives) module, and relocate the #[cfg(test)] test module to a tests
module/file; update the original file to import these with mod/use and re-export
symbols if needed so run_walk and the Tool implementation still call
parser::parse_walk_tool_calls, adapter::ChatProviderAdapter, and
helpers::dispatch_inner_call (and helpers::build_node_context,
build_system_prompt, build_inner_tools_text, synthesize_fallback_answer) with
minimal changes to function signatures.
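As a rough picture of the core loop that would stay in walk.rs after the split, here is a sketch of a turn-capped walk. The default of 6 turns and the hard cap of 20 come from the PR description; everything else is an assumption made for illustration: the `ask` closure stands in for the provider call, the `answer:` prefix stands in for parsing an `answer` tool call, and the type and variant names are not the crate's `WalkOutcome` or `WalkStopReason`.

```rust
/// Why the walk ended. Variant names are illustrative only.
#[derive(Debug, PartialEq)]
enum StopReason {
    Answered,
    MaxTurns,
    ProviderError,
}

const DEFAULT_MAX_TURNS: usize = 6; // default stated in the PR description
const HARD_CAP_TURNS: usize = 20; // hard cap stated in the PR description

struct Outcome {
    answer: Option<String>,
    steps: Vec<String>,
    stop: StopReason,
}

/// `ask` receives the step trace so far and returns the model's next reply.
fn run_walk_sketch<F>(max_turns: Option<usize>, mut ask: F) -> Outcome
where
    F: FnMut(&[String]) -> Result<String, String>,
{
    // Clamp the requested budget; the lower bound of 1 is an assumption.
    let cap = max_turns.unwrap_or(DEFAULT_MAX_TURNS).clamp(1, HARD_CAP_TURNS);
    let mut steps = Vec::new();
    for _ in 0..cap {
        let reply = match ask(&steps) {
            Ok(reply) => reply,
            Err(err) => {
                steps.push(format!("error: {err}"));
                return Outcome { answer: None, steps, stop: StopReason::ProviderError };
            }
        };
        // Stand-in for recognising an `answer` tool call in the reply.
        if let Some(text) = reply.strip_prefix("answer:") {
            steps.push("answer".to_string());
            return Outcome {
                answer: Some(text.trim().to_string()),
                steps,
                stop: StopReason::Answered,
            };
        }
        // Any other reply is recorded as one navigation step.
        steps.push(reply);
    }
    Outcome { answer: None, steps, stop: StopReason::MaxTurns }
}
```

The sketch returns no answer when the cap is hit; the comment above names a `synthesize_fallback_answer` helper, which suggests the real tool handles that case differently.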
---
Nitpick comments:
In `@src/bin/slack_backfill.rs`:
- Around line 149-577: main is too large and should be split into focused
subcommand handlers; extract the big conditional blocks into separate
functions/modules and dispatch from a small main that only does init and CLI
parsing. Specifically: move the seal-probe block into a new handler function
(e.g. handle_seal_probe) that takes (&Cli, &Config) and uses ingest_chat; move
the SLACK_SEARCH_MESSAGES probe into handle_probe_search(&Cli, &Config,
&client_kind) which calls execute_action; move the probe_ratelimit loop into
handle_probe_ratelimit(&Cli, &Config, &client_kind) (preserve Outcome enum and
list_connections_via_kind usage); extract the search backfill loop that calls
run_backfill_via_search into handle_search_backfill(&Cli, &Config,
&connections); and extract the non-search per-connection backfill (provider.sync
loop) into handle_sync_backfill(&Cli, &Config, &provider, &candidates). Keep
init_default_providers, memory::global::init, tracing/env_logger setup and
create_composio_client in main, then dispatch to these handlers based on CLI
flags; wire return Result<()> through each handler and move related helper
imports (chrono, ingest_chat, execute_action, list_connections_via_kind,
ProviderContext) into their new modules.
In `@src/core/all.rs`:
- Around line 188-190: The all.rs module is becoming too large due to direct
aggregation of many controllers/schemas; extract the registration logic into
smaller focused modules (e.g., create new modules like
openhuman::memory_tree::registration and
openhuman::memory_tree::retrieval_registration) that each expose functions such
as all_memory_tree_registered_controllers and
all_retrieval_registered_controllers (or analogous names) and move the
schema/controller builder code into those modules, then in all.rs simply call
controllers.extend(...) with those exported helper functions so all.rs only
composes registrations instead of containing their implementations.
In `@src/openhuman/channels/runtime/startup.rs`:
- Around line 42-635: The start_channels function has grown too large; split its
responsibilities into smaller functions/modules: extract bus/subscriber
registration into a new module/function (e.g., register_startup_subscribers)
that encapsulates calls to event_bus::init_global, TracingSubscriber subscribe,
register_health_subscriber, register_skill_cleanup_subscriber, the Phase 2–4
learning OnceLock blocks and cron/proactive/tree subscribers; extract
provider/memory/bootstrap logic into a new function (e.g.,
bootstrap_provider_and_memory) that returns (provider, mem, runtime, security,
audit, provider_runtime_options) and contains
create_intelligent_routing_provider, provider.warmup,
host_runtime::create_runtime, SecurityPolicy::from_config,
get_or_create_workspace_audit_logger, memory::create_memory_with_local_ai;
extract channel list construction into its own function (e.g.,
build_channel_list) that returns Vec<Arc<dyn Channel>> and contains all
config.channels_config.* branch logic and spawn_supervised_listener wiring; keep
start_channels to orchestrate the high-level flow (call the new functions,
compute backoff/limits, create runtime_ctx, and call run_message_dispatch_loop).
Refactor by moving extracted code into new files/modules, keeping existing
symbol names (start_channels, spawn_supervised_listener,
run_message_dispatch_loop, build_system_prompt, tools::all_tools_with_runtime)
so callers remain unchanged and tests compile.
In `@src/openhuman/context/segment_recap_summarizer_tests.rs`:
- Line 19: The test module in segment_recap_summarizer_tests.rs has grown past
the ~500-line guideline; split it into smaller focused test modules (e.g.,
segment_recap_summarizer_unit_tests.rs,
segment_recap_summarizer_integration_tests.rs) by moving related test functions
into new files, preserving imports like ChatPrompt and any helper fixtures,
exporting or re-exporting shared helpers via a common mod (or a tests/util.rs)
so tests still compile, and update the parent mod declarations (pub mod ... or
mod ...) so the test suite runs unchanged; ensure each new file contains the
appropriate use crate::openhuman::memory_tree::chat::ChatPrompt import and
adjust visibility of helpers as needed.
In `@src/openhuman/memory_tree/read_rpc.rs`:
- Around line 34-39: The file src/openhuman/memory_tree/read_rpc.rs is too large
and must be split by concern; create smaller modules (e.g.,
list_search_recall.rs for listing/search/recall logic that uses content_read and
NodeKind/SourceKind, mutations.rs for mutation endpoints that use
chunk_store/with_connection and score_store, graph_export.rs for graph export
code, and llm_config.rs for LLM configuration and related RPCs), move the
corresponding functions/types into those files, export them from a new mod.rs or
update the parent mod to pub use the new modules, update imports in callers to
reference the new module paths instead of read_rpc.rs, and run cargo check to
fix any visibility or import issues.
In `@src/openhuman/memory_tree/retrieval/rpc.rs`:
- Around line 310-313: Move the large inline #[cfg(test)] test block out of this
module into a sibling rpc_test.rs to reduce rpc.rs size: cut the entire test
module from src/openhuman/memory_tree/retrieval/rpc.rs and paste it into
src/openhuman/memory_tree/retrieval/rpc_test.rs, update imports inside the new
file to use the same symbols (content_store, upsert_chunks, and
types::{chunk_id, Chunk, Metadata, SourceRef}, chrono::Utc) and any
crate-relative paths so the tests compile, and remove the #[cfg(test)] section
from rpc.rs so only the production handler code remains in that file. Ensure the
new rpc_test.rs has the appropriate use declarations and #[cfg(test)] mod so
cargo test picks it up.
In `@src/openhuman/memory_tree/retrieval/topic.rs`:
- Around line 20-28: The topic.rs module has grown too large; split it into
focused submodules (e.g., query.rs, rerank.rs, hydrate.rs, test_helpers.rs) and
move the corresponding functionality into them: relocate query-related
functions/types into query.rs, reranking logic into rerank.rs,
hydration/assembly code into hydrate.rs, and any test helpers into
test_helpers.rs; then add mod declarations in topic.rs (mod query; mod rerank;
mod hydrate; mod test_helpers;) and re-export the public APIs you need (pub use
query::..., etc.), update all internal uses/imports (e.g., hit_from_summary,
QueryResponse, RetrievalHit, build_embedder_from_config, cosine_similarity,
lookup_entity, EntityHit, Tree/TreeKind) to the new module paths, and adjust
visibility (pub/pub(crate)) so existing callers keep working and tests compile.
In `@src/openhuman/memory_tree/store_tests.rs`:
- Around line 47-771: The test file is too large; split it into focused test
modules (e.g., connection_cache, journaling/migration, reembed, schema_init) by
moving related tests into new test files and keeping shared helpers in a single
test_helpers module. For each group, create a new test module/file containing
the tests that reference the same behavior (e.g., move
with_connection_serialises_concurrent_schema_init,
is_transient_cold_start_classifies_known_extended_codes,
with_connection_keeps_foreign_keys_on_for_every_call,
memory_tree_uses_truncate_journal_not_wal, existing_wal_db_migrates_to_truncate
into a journaling/schema module; move
connection_cache_returns_same_arc_for_same_workspace,
connection_cache_uses_separate_connections_for_different_workspaces,
circuit_breaker_trips_after_threshold into a connection_cache module; move
clear_chunk_reembed_skipped_is_idempotent,
clear_reembed_skipped_for_signature_removes_all_tombstones_for_sig,
validate_reembed_skip_key into a reembed module; keep
legacy_embeddings_migrate_to_sidecar_once with embedding-related helpers),
update imports to use the shared helpers (e.g., test_config, sample_chunk,
with_connection, get_or_init_connection, clear_connection_cache,
try_cleanup_stale_files, mark_chunk_reembed_skipped,
clear_reembed_skipped_for_signature, validate_reembed_skip_key), and add mod
declarations so Cargo runs them as tests; ensure visibility of helper functions
(pub(crate) or move to a common tests/helpers module) so the split files compile
and tests run.
In `@src/openhuman/memory_tree/tools/walk.rs`:
- Around line 452-466: The parser currently silently skips malformed <tool_call>
blocks; update the block in walk.rs (the branch handling `None => break` and the
`Some(close_idx)` branch that parses `after_open`) to emit diagnostic logs: when
`None` occurs log a debug/error with context (e.g., the remaining `after_open`
content) indicating an unclosed/malformed tool_call, when
`serde_json::from_str::<Value>(inner.trim())` returns Err log the JSON parse
error and the `inner` string, and when `val.get("name")` is missing or not a
string log a debug entry showing the parsed `val`; reference the variables
`after_open`, `close_idx`, `inner`, the `calls` push of `InnerCall`, and ensure
logs use the crate logger (e.g., log::debug!/log::error!) with concise
contextual messages.
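One way to picture the diagnostics requested above is a scanner that records a message for every block it skips instead of dropping it silently. This is a sketch under stated assumptions: `scan_tool_calls` and `ScanResult` are hypothetical names, the brace check is a crude stand-in for `serde_json::from_str`, and the collected strings stand in for `log::debug!` calls.

```rust
/// Outcome of scanning one model response for `<tool_call>` blocks.
#[derive(Debug, Default, PartialEq)]
struct ScanResult {
    /// Trimmed payloads of well-formed blocks, in order of appearance.
    payloads: Vec<String>,
    /// One message per skipped block (the real code would log these).
    diagnostics: Vec<String>,
}

const OPEN: &str = "<tool_call>";
const CLOSE: &str = "</tool_call>";

fn scan_tool_calls(text: &str) -> ScanResult {
    let mut out = ScanResult::default();
    let mut rest = text;
    while let Some(open_idx) = rest.find(OPEN) {
        let after_open = &rest[open_idx + OPEN.len()..];
        match after_open.find(CLOSE) {
            None => {
                // Unclosed block: report what was left, then stop scanning.
                out.diagnostics.push(format!(
                    "unclosed <tool_call>; remaining: {:?}",
                    after_open.trim()
                ));
                break;
            }
            Some(close_idx) => {
                let inner = after_open[..close_idx].trim();
                // Crude stand-in for JSON parsing: accept only `{...}` payloads.
                if inner.starts_with('{') && inner.ends_with('}') {
                    out.payloads.push(inner.to_string());
                } else {
                    out.diagnostics
                        .push(format!("payload is not a JSON object: {:?}", inner));
                }
                rest = &after_open[close_idx + CLOSE.len()..];
            }
        }
    }
    out
}
```

Returning the diagnostics alongside the payloads also makes the malformed-block paths easy to assert on in unit tests.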
In `@src/openhuman/memory_tree/tree_global/seal.rs`:
- Around line 20-33: This file exceeds the repository size ceiling; split its
sealing logic and tests into smaller modules: extract core sealing functions and
types (the main seal implementation that uses stage_summary,
SummaryComposeInput, SummaryTreeKind, new_summary_id, with_connection, and
store) into a focused seal_core.rs, move configuration/threshold constants
(GLOBAL_TOKEN_BUDGET, MONTHLY_SEAL_THRESHOLD, WEEKLY_SEAL_THRESHOLD,
YEARLY_SEAL_THRESHOLD) into a seal_config.rs, isolate embedding/score helper
logic that uses build_embedder_from_config into seal_embed.rs, and place
summariser-related code (Summariser, SummaryContext, SummaryInput) and
Tree/Buffer types (Buffer, SummaryNode, Tree, TreeKind) in a seal_summariser.rs
or re-export them from the new modules; also move large test cases into a
parallel tests/ module/file so each source file stays under ~500 lines and
update module declarations and re-exports accordingly.
In `@tests/memory_tree_summarizer_e2e.rs`:
- Around line 1-579: The file exceeds the 500-line guideline; split
helper/harness code out of the test module into a smaller helper module and keep
only the three scenario tests in the e2e test file. Move EnvVarGuard,
ENV_LOCK/env_lock, ScriptedProvider (and its Provider impl), build_config,
ts_hour14/ts_hour15, and NS into a new module (e.g., summarizer_harness) and
make those items public, then in the original tests file replace the moved
definitions with a mod/use to import summarizer_harness::{EnvVarGuard, env_lock,
ScriptedProvider, build_config, ts_hour14, ts_hour15, NS}; ensure visibility
changes (pub) where needed and update imports so engine::run_summarization and
store::* usages in the three test functions remain unchanged.
In `@tests/memory_tree_walk_e2e.rs`:
- Around line 1-536: The module is too large; extract shared test utilities into
a sibling test-support module: move ScriptedResponder, env_lock/ENV_LOCK,
test_config, make_node, seed_tree, make_provider and any helper imports (e.g.,
derive_parent_id, level_from_node_id, estimate_tokens, write_node) into a new
test-support file/module, re-export or pub use the necessary symbols, then
update this test file to import those helpers and keep only the scenario tests
(walks_descend_fetch_answer, respects_max_turns_cap_with_mock,
handles_unknown_node_gracefully) plus their local setup; ensure run_walk,
WalkOptions and WalkStopReason usages remain unchanged and update module paths
so the tests compile.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5ba91539-c9af-4891-8751-7a160758246e
📒 Files selected for processing (174)
- src/bin/gmail_backfill_3d.rs
- src/bin/memory_tree_init_smoke.rs
- src/bin/slack_backfill.rs
- src/core/all.rs
- src/core/cli.rs
- src/core/jsonrpc.rs
- src/openhuman/agent/harness/archivist.rs
- src/openhuman/agent/harness/archivist_tests.rs
- src/openhuman/agent/harness/payload_summarizer.rs
- src/openhuman/agent/harness/session/turn.rs
- src/openhuman/agent/harness/subagent_runner/handoff.rs
- src/openhuman/agent/tree_loader.rs
- src/openhuman/channels/runtime/startup.rs
- src/openhuman/composio/ops_test.rs
- src/openhuman/composio/providers/gmail/ingest.rs
- src/openhuman/composio/providers/profile.rs
- src/openhuman/composio/providers/slack/ingest.rs
- src/openhuman/config/ops.rs
- src/openhuman/context/segment_recap_summarizer_tests.rs
- src/openhuman/doctor/core.rs
- src/openhuman/doctor/core_tests.rs
- src/openhuman/inference/local/model_requirements.rs
- src/openhuman/memory/mod.rs
- src/openhuman/memory/ops/learn.rs
- src/openhuman/memory/stm_recall/recall_tests.rs
- src/openhuman/memory/sync_status/rpc.rs
- src/openhuman/memory_tree/README.md
- src/openhuman/memory_tree/canonicalize/README.md
- src/openhuman/memory_tree/canonicalize/chat.rs
- src/openhuman/memory_tree/canonicalize/document.rs
- src/openhuman/memory_tree/canonicalize/email.rs
- src/openhuman/memory_tree/canonicalize/email_clean.rs
- src/openhuman/memory_tree/canonicalize/mod.rs
- src/openhuman/memory_tree/chat/cloud.rs
- src/openhuman/memory_tree/chat/local.rs
- src/openhuman/memory_tree/chat/mod.rs
- src/openhuman/memory_tree/chunker.rs
- src/openhuman/memory_tree/content_store/README.md
- src/openhuman/memory_tree/content_store/atomic.rs
- src/openhuman/memory_tree/content_store/compose.rs
- src/openhuman/memory_tree/content_store/mod.rs
- src/openhuman/memory_tree/content_store/obsidian.rs
- src/openhuman/memory_tree/content_store/obsidian_defaults/graph.json
- src/openhuman/memory_tree/content_store/obsidian_defaults/types.json
- src/openhuman/memory_tree/content_store/paths.rs
- src/openhuman/memory_tree/content_store/raw.rs
- src/openhuman/memory_tree/content_store/read.rs
- src/openhuman/memory_tree/content_store/tags.rs
- src/openhuman/memory_tree/ingest.rs
- src/openhuman/memory_tree/jobs/README.md
- src/openhuman/memory_tree/jobs/handlers/README.md
- src/openhuman/memory_tree/jobs/handlers/mod.rs
- src/openhuman/memory_tree/jobs/mod.rs
- src/openhuman/memory_tree/jobs/redact.rs
- src/openhuman/memory_tree/jobs/scheduler.rs
- src/openhuman/memory_tree/jobs/store.rs
- src/openhuman/memory_tree/jobs/testing.rs
- src/openhuman/memory_tree/jobs/types.rs
- src/openhuman/memory_tree/jobs/worker.rs
- src/openhuman/memory_tree/mod.rs
- src/openhuman/memory_tree/read_rpc.rs
- src/openhuman/memory_tree/retrieval/README.md
- src/openhuman/memory_tree/retrieval/benchmarks.rs
- src/openhuman/memory_tree/retrieval/drill_down.rs
- src/openhuman/memory_tree/retrieval/fetch.rs
- src/openhuman/memory_tree/retrieval/global.rs
- src/openhuman/memory_tree/retrieval/integration_test.rs
- src/openhuman/memory_tree/retrieval/mod.rs
- src/openhuman/memory_tree/retrieval/rpc.rs
- src/openhuman/memory_tree/retrieval/schemas.rs
- src/openhuman/memory_tree/retrieval/search.rs
- src/openhuman/memory_tree/retrieval/source.rs
- src/openhuman/memory_tree/retrieval/topic.rs
- src/openhuman/memory_tree/retrieval/types.rs
- src/openhuman/memory_tree/rpc.rs
- src/openhuman/memory_tree/schemas.rs
- src/openhuman/memory_tree/score/README.md
- src/openhuman/memory_tree/score/embed/README.md
- src/openhuman/memory_tree/score/embed/cloud.rs
- src/openhuman/memory_tree/score/embed/factory.rs
- src/openhuman/memory_tree/score/embed/inert.rs
- src/openhuman/memory_tree/score/embed/mod.rs
- src/openhuman/memory_tree/score/embed/ollama.rs
- src/openhuman/memory_tree/score/extract/README.md
- src/openhuman/memory_tree/score/extract/extractor.rs
- src/openhuman/memory_tree/score/extract/llm.rs
- src/openhuman/memory_tree/score/extract/llm_tests.rs
- src/openhuman/memory_tree/score/extract/mod.rs
- src/openhuman/memory_tree/score/extract/regex.rs
- src/openhuman/memory_tree/score/extract/types.rs
- src/openhuman/memory_tree/score/mod.rs
- src/openhuman/memory_tree/score/mod_tests.rs
- src/openhuman/memory_tree/score/resolver.rs
- src/openhuman/memory_tree/score/signals/README.md
- src/openhuman/memory_tree/score/signals/interaction.rs
- src/openhuman/memory_tree/score/signals/metadata_weight.rs
- src/openhuman/memory_tree/score/signals/mod.rs
- src/openhuman/memory_tree/score/signals/ops.rs
- src/openhuman/memory_tree/score/signals/source_weight.rs
- src/openhuman/memory_tree/score/signals/token_count.rs
- src/openhuman/memory_tree/score/signals/types.rs
- src/openhuman/memory_tree/score/signals/unique_words.rs
- src/openhuman/memory_tree/score/store.rs
- src/openhuman/memory_tree/score/store_tests.rs
- src/openhuman/memory_tree/store.rs
- src/openhuman/memory_tree/store_tests.rs
- src/openhuman/memory_tree/summarizer/bus.rs
- src/openhuman/memory_tree/summarizer/cli.rs
- src/openhuman/memory_tree/summarizer/engine.rs
- src/openhuman/memory_tree/summarizer/engine_tests.rs
- src/openhuman/memory_tree/summarizer/mod.rs
- src/openhuman/memory_tree/summarizer/ops.rs
- src/openhuman/memory_tree/summarizer/schemas.rs
- src/openhuman/memory_tree/summarizer/store.rs
- src/openhuman/memory_tree/summarizer/store_tests.rs
- src/openhuman/memory_tree/summarizer/types.rs
- src/openhuman/memory_tree/tools/drill_down.rs
- src/openhuman/memory_tree/tools/fetch_leaves.rs
- src/openhuman/memory_tree/tools/ingest_document.rs
- src/openhuman/memory_tree/tools/mod.rs
- src/openhuman/memory_tree/tools/query_global.rs
- src/openhuman/memory_tree/tools/query_source.rs
- src/openhuman/memory_tree/tools/query_topic.rs
- src/openhuman/memory_tree/tools/search_entities.rs
- src/openhuman/memory_tree/tools/walk.rs
- src/openhuman/memory_tree/tree_global/README.md
- src/openhuman/memory_tree/tree_global/digest.rs
- src/openhuman/memory_tree/tree_global/digest_tests.rs
- src/openhuman/memory_tree/tree_global/mod.rs
- src/openhuman/memory_tree/tree_global/recap.rs
- src/openhuman/memory_tree/tree_global/registry.rs
- src/openhuman/memory_tree/tree_global/seal.rs
- src/openhuman/memory_tree/tree_source/README.md
- src/openhuman/memory_tree/tree_source/bucket_seal.rs
- src/openhuman/memory_tree/tree_source/bucket_seal_tests.rs
- src/openhuman/memory_tree/tree_source/flush.rs
- src/openhuman/memory_tree/tree_source/mod.rs
- src/openhuman/memory_tree/tree_source/registry.rs
- src/openhuman/memory_tree/tree_source/source_file.rs
- src/openhuman/memory_tree/tree_source/store.rs
- src/openhuman/memory_tree/tree_source/store_tests.rs
- src/openhuman/memory_tree/tree_source/summariser/README.md
- src/openhuman/memory_tree/tree_source/summariser/inert.rs
- src/openhuman/memory_tree/tree_source/summariser/llm.rs
- src/openhuman/memory_tree/tree_source/summariser/mod.rs
- src/openhuman/memory_tree/tree_source/types.rs
- src/openhuman/memory_tree/tree_topic/README.md
- src/openhuman/memory_tree/tree_topic/backfill.rs
- src/openhuman/memory_tree/tree_topic/curator.rs
- src/openhuman/memory_tree/tree_topic/hotness.rs
- src/openhuman/memory_tree/tree_topic/mod.rs
- src/openhuman/memory_tree/tree_topic/registry.rs
- src/openhuman/memory_tree/tree_topic/routing.rs
- src/openhuman/memory_tree/tree_topic/store.rs
- src/openhuman/memory_tree/tree_topic/types.rs
- src/openhuman/memory_tree/types.rs
- src/openhuman/memory_tree/util/README.md
- src/openhuman/memory_tree/util/mod.rs
- src/openhuman/memory_tree/util/redact.rs
- src/openhuman/mod.rs
- src/openhuman/subconscious/engine.rs
- src/openhuman/subconscious/situation_report/digest.rs
- src/openhuman/subconscious/situation_report/hotness.rs
- src/openhuman/subconscious/situation_report/query_window.rs
- src/openhuman/subconscious/situation_report/summaries.rs
- src/openhuman/subconscious/source_chunk.rs
- src/openhuman/test_support/rpc.rs
- src/openhuman/tools/impl/memory/mod.rs
- src/openhuman/tools/ops.rs
- src/openhuman/whatsapp_data/sqlite_retry.rs
- tests/agent_retrieval_e2e.rs
- tests/json_rpc_e2e.rs
- tests/memory_tree_summarizer_e2e.rs
- tests/memory_tree_walk_e2e.rs
💤 Files with no reviewable changes (1)
- src/openhuman/memory/mod.rs
- walk.rs StubProvider: replace `drain(0..1)` (panics when queue empty) with an explicit empty-check + `remove(0)`, returning the intended `"no more scripted responses"` error.
- docs/whatsapp-data-flow.md: point at the new `memory_tree/tools/` path (broken Lychee link from the consolidation refactor).
- composio/providers/slack/ingest.rs: update docstring references from `memory::tree::*` to `memory_tree::*` to match the post-refactor imports.
Summary
- Consolidate `src/openhuman/memory/tree/`, `src/openhuman/tree_summarizer/`, and `src/openhuman/tools/impl/memory/tree/` into a single first-class `src/openhuman/memory_tree/` module. Public RPC method names (`openhuman.memory_tree_*`), tool names (`memory_tree`), and controller-schema symbols are unchanged; this is a code-location refactor only.
- `memory_tree_walk`: a new agentic tool that, given a free-text query, runs a turn-based inner LLM loop (lightweight summarization model from `config.local_ai.chat_model_id`) over inner navigation primitives (`descend`, `peek`, `fetch_leaves`, `answer`) and returns a synthesized answer + step trace. Wired both as a standalone tool and as a new `"walk"` mode on the consolidated `MemoryTreeTool` dispatcher.
- `summarizer/engine` unit tests (`group_by_hour` edge cases, `propagate_node`, `run_summarization`), 3 tree-build e2e tests, 3 walk-tool e2e tests (wiremock-backed Provider scripted with XML `<tool_call>` responses).
Problem
The pre-existing layout split closely-coupled memory-tree code across three locations, and the only way to query the tree was through individual retrieval primitives invoked one at a time by the orchestrator. There was no integration test of the summarizer's full `hour → day → month → year → root` build path with a controlled LLM, and no agentic helper that could synthesize an answer from a single `(query, namespace)` call.
Solution
- `git mv` the three trees, update wiring in `src/openhuman/mod.rs`, `src/core/all.rs`, `memory/mod.rs`, `tools/impl/memory/mod.rs`, and rewrite ~90 `use` paths.
- `src/openhuman/memory_tree/tools/walk.rs` defines `MemoryTreeWalkTool`, `run_walk`, `WalkOptions`, `WalkOutcome`. Lifts only the LLM-call-and-parse pattern from `agent/harness/tool_loop.rs` (skipping progress/approval/stop-hooks). Uses the existing XML `<tool_call>` harness convention since `Provider::chat_with_history` returns `String`.
- `pub(crate)` test-access on `engine::group_by_hour` and `engine::propagate_node`.
- `ScriptedProvider` stub in unit + summarizer e2e tests; wiremock-backed `OpenAiCompatibleProvider` in walk e2e.
Submission Checklist
- New tests (`engine_tests.rs`, `memory_tree_summarizer_e2e.rs`, `memory_tree_walk_e2e.rs`) target the changed lines; CI `coverage.yml` will compute the actual percentage.
Impact
- No changes to `openhuman.memory_tree_*` RPCs or the existing `memory_tree` tool modes. New `"walk"` mode + `memory_tree_walk` tool are additive.
- `max_turns: 6` (hard cap 20) at temp 0.3 against the lightweight summarization model; bounded per-call cost.
- `ReadOnly` permission, no external effect.
Related
- `canonicalize/` and `retrieval/` primitives: survey indicated existing in-file tests, not re-audited in this PR.
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
- `feat/memory-tree-walk`
- `969ca5ad517a4b210220468cff6389b0a6dd7105`
Validation Run
- `pnpm --filter openhuman-app format:check`: not run in this branch (will run in CI).
- `pnpm typecheck`: pre-existing failures on iOS files (`PairPhoneModal.tsx`, `tunnel/crypto.ts`, `PairScreen.tsx`) untouched by this PR, from PR #1420 (feat(ios): iOS client with QR pairing, E2E-encrypted tunnel, and push-to-talk); this PR's pre-push hook was bypassed with `--no-verify` for that reason.
- `cargo test --lib memory_tree::summarizer::engine` (14 pass), `tests/memory_tree_summarizer_e2e.rs` (3 pass), `tests/memory_tree_walk_e2e.rs` (3 pass), `memory_tree::tools::walk` unit tests (3 pass).
- `cargo fmt` applied; `cargo check --bin openhuman-core` clean.
- No `app/src-tauri/` changes.
Validation Blocked
- command: `pnpm compile` (typecheck)
- error: Missing modules `qrcode.react`, `@noble/ciphers/chacha`, `@noble/ciphers/webcrypto`, `@tauri-apps/plugin-barcode-scanner`, all in `app/src/{components/settings/panels/devices,lib/tunnel,pages/ios}/...`.
- impact: Pre-existing on `main` from PR #1420 (feat(ios): iOS client with QR pairing, E2E-encrypted tunnel, and push-to-talk). None of the failing files are touched by this PR. Pre-push hook bypassed with `--no-verify` per `CLAUDE.md` policy.
Behavior Changes
- `memory_tree_walk` tool / `"walk"` mode on consolidated dispatcher.
Parity Contract
- `openhuman.memory_tree_*` RPCs, tool names, and controller-schema symbols unchanged across the refactor.
- `src/core/all.rs` still imports the same `all_memory_tree_*` and `all_tree_summarizer_*` symbol names from the new module path via re-export.
Duplicate / Superseded PR Handling
Summary by CodeRabbit
Release Notes