Skip to content

proofofwork-agency/reporecall

Repository files navigation

Reporecall

██████╗ ███████╗██████╗  ██████╗ ██████╗ ███████╗ ██████╗ █████╗ ██╗     ██╗
██╔══██╗██╔════╝██╔══██╗██╔═══██╗██╔══██╗██╔════╝██╔════╝██╔══██╗██║     ██║
██████╔╝█████╗  ██████╔╝██║   ██║██████╔╝█████╗  ██║     ███████║██║     ██║
██╔══██╗██╔══╝  ██╔═══╝ ██║   ██║██╔══██╗██╔══╝  ██║     ██╔══██║██║     ██║
██║  ██║███████╗██║     ╚██████╔╝██║  ██║███████╗╚██████╗██║  ██║███████╗███████╗
╚═╝  ╚═╝╚══════╝╚═╝      ╚═════╝ ╚═╝  ╚═╝╚══════╝ ╚═════╝╚═╝  ╚═╝╚══════╝╚══════╝

Local codebase memory and retrieval for Claude Code and MCP.

Reporecall indexes your repository locally, classifies each query by intent, and injects focused code context, wiki knowledge, and persistent memory before Claude answers - no cloud, no embeddings API, everything stays on your machine.


Quick Start

npm install -g @proofofwork-agency/reporecall

reporecall init          # Create .memory/, hooks, MCP config
reporecall index         # Index the codebase
reporecall serve         # Start daemon + file watcher

That's it. Context is injected automatically into every Claude Code prompt via hooks.


Features

  • Intent-based retrieval - query mode selected by local rule-based classification, no LLM cost
  • Multi-signal search - FTS keywords, vector similarity, AST metadata, semantic features, imports, call graphs
  • Wiki layer - auto-generated knowledge pages from codebase topology, always-on injection with token budgeting
  • Topology analysis - Louvain community detection, hub node identification, surprise scoring, investigation suggestions
  • Bug localization - dedicated pipeline with subject profiling, contradiction pruning, and graph expansion
  • Local memory - persistent rules, facts, episodes, and working context across sessions
  • Delivery modes - code_context (focused chunks) or summary_only (structured summary when confidence is low)
  • Hook guidance - context strength, execution surface, missing evidence, and recommended next reads
  • Streaming indexer - bounded file windows, adaptive embedding batches, lower peak heap
  • SQLite ABI self-repair - detects native module mismatch and attempts automatic rebuild
  • MCP server - 26 tools for code search, call graphs, topology, memory, and wiki

How It Works

Every user prompt flows through a three-layer pipeline: code retrieval, wiki knowledge, and project memory.

flowchart TB
  User["User Prompt"]
  Hook["Prompt Hook"]
  Daemon["Local Daemon"]
  Intent["Intent Classifier"]

  subgraph Retrieval["Retrieval Pipeline"]
    Decompose["Query Decomposition"]
    Resolve["Target Resolution"]
    FTS["FTS Search"]
    Vector["Vector Search"]
    Semantic["Semantic Feature Search"]
    Graph["Caller / Neighbor Expansion"]
    Prune["Contradiction Pruning"]
    Select["Bundle Selection"]
  end

  subgraph Storage["Local Storage"]
    Index["Chunk / Target / Feature Indexes"]
    Topology["Topology Store (Communities, Hubs, Surprises)"]
    Memory["Memory Store"]
    Wiki["Wiki Store (Auto-generated Pages)"]
  end

  User --> Hook
  Hook --> Daemon
  Daemon --> Intent
  Intent --> Decompose
  Decompose --> Resolve
  Resolve --> FTS
  Resolve --> Vector
  Resolve --> Semantic
  FTS --> Graph
  Vector --> Graph
  Semantic --> Graph
  Graph --> Prune
  Prune --> Select
  Select --> Hook
  Index --> Resolve
  Index --> FTS
  Index --> Vector
  Index --> Semantic
  Memory --> Daemon
  Wiki --> Hook
Loading

Intent classification

Queries are classified locally (no LLM) into one of six modes that determine retrieval strategy:

Mode Purpose
lookup Exact symbol, file, endpoint, or module lookup
trace Implementation path - "how does X work", "what calls Y"
bug Causal debugging - symptom descriptions, "why does this fail"
architecture Broad inventory - "which files implement...", "full flow from A to B"
change Cross-cutting edits - "add logging across the auth flow"
skip Meta/chat/non-code prompts
flowchart LR
  Old["Old: R0 / R1 / R2"] --> Problem1["Prompt shape chosen too early"]
  Old --> Problem2["Broad prompts over-injected"]
  Old --> Problem3["Bug reports drifted into lexical noise"]
  Problem1 --> New["New: Intent-Based Retrieval"]
  Problem2 --> New
  Problem3 --> New
  New --> Lookup["lookup"]
  New --> Trace["trace"]
  New --> Bug["bug"]
  New --> Architecture["architecture"]
  New --> Change["change"]
  Bug --> Evidence["Evidence-chain scoring"]
  Architecture --> Summary["summary_only when weak"]
  Trace --> Path["seed + graph path reconstruction"]
Loading

Wiki layer

Auto-generated wiki pages from codebase topology are indexed alongside code and injected into every prompt context within a configurable token budget. No manual authoring required - pages are created during indexing from hub nodes, flow traces, and architectural surprises.

Inspired by Andrej Karpathy's LLM Wiki concept - structured markdown knowledge bases that LLMs can query efficiently. Reporecall automates wiki generation from code topology and enriches pages with optional LLM summaries.

Memory layer

Persistent project memory across sessions. Stores feedback rules, project decisions, user preferences, and reference pointers. Memory search uses FTS5 with RRF scoring, recency decay, access frequency penalties, and type-based boosts.


CLI

reporecall init          # Create .memory/, hooks, MCP config
reporecall index         # Index the codebase
reporecall serve         # Start daemon + file watcher
reporecall explain       # Inspect retrieval for a query
reporecall mcp           # Run as MCP server (stdio)
reporecall doctor        # Health checks
reporecall search        # Direct search
reporecall stats         # Index statistics
reporecall graph         # Call graph queries
reporecall conventions   # Detected conventions

MCP Tools

Code search & navigation

Tool Description
search_code Multi-signal code search across the indexed codebase
find_callers Find all callers of a function
find_callees Find all functions called by a function
get_symbol Get full source of a symbol by name
get_imports Get import/dependency graph for a file
explain_flow Trace execution path: callers -> seed -> callees
build_stack_tree Full call hierarchy tree (configurable depth)
resolve_seed Resolve a query to the best matching symbol

Topology & architecture

Tool Description
get_communities Module clusters with cohesion scores and auto-generated labels
get_hub_nodes Most-connected nodes (architectural hubs) in the call graph
get_surprises Unexpected cross-boundary connections ranked by surprise score
suggest_investigations Auto-generated investigation questions about weak spots

Wiki

Tool Description
wiki_query Search wiki pages by topic
wiki_read Read a specific wiki page
wiki_write Create or update a wiki page
wiki_check_staleness Find wiki pages that may be outdated

Memory

Tool Description
recall_memories Retrieve memories relevant to a query
store_memory Save a new memory
forget_memory Remove a memory
list_memories List all stored memories
explain_memory Explain why a memory was returned
compact_memories Merge redundant memories
clear_working_memory Clear ephemeral working memories

Index management

Tool Description
index_codebase Trigger a full or incremental re-index
get_stats Index statistics (files, symbols, chunks)
clear_index Reset the index

Configuration

Configuration lives in .memory/config.json in your project root.

Key Default Description
wikiBudget 400 Max tokens for wiki injection per prompt
wikiMaxPages 3 Max wiki pages injected per prompt
memoryBudget 500 Max tokens for memory injection per prompt
shutdownTimeoutMs 10000 Graceful shutdown timeout (1000-60000)
embeddingProvider "keyword" Embedding backend (keyword for FTS-only)

Showcase

Real output from a 1,140-file production codebase.

get_communities - Module clusters with cohesion scores
[
  {
    "id": "c_0",
    "nodeCount": 305,
    "cohesion": 0.01,
    "label": "api: json, getCorsHeaders"
  },
  {
    "id": "c_1",
    "nodeCount": 283,
    "cohesion": 0.01,
    "label": "components: updateNode, runFromNode"
  },
  {
    "id": "c_2",
    "nodeCount": 252,
    "cohesion": 0.01,
    "label": "components: observe, disconnect"
  },
  {
    "id": "c_3",
    "nodeCount": 227,
    "cohesion": 0.02,
    "label": "components+hooks: success, useAuth"
  },
  {
    "id": "c_4",
    "nodeCount": 220,
    "cohesion": 0.02,
    "label": "lib: isArray, updateNodeRun"
  },
  {
    "id": "c_5",
    "nodeCount": 153,
    "cohesion": 0.02,
    "label": "api: assertEquals, sanitizePrompt"
  },
  {
    "id": "c_6",
    "nodeCount": 132,
    "cohesion": 0.02,
    "label": "lib: warn, FloatingActionBar"
  },
  {
    "id": "c_7",
    "nodeCount": 121,
    "cohesion": 0.02,
    "label": "components: render, useTheme"
  }
]

Automatically detects tightly-coupled module clusters using Louvain community detection on the call graph.

get_hub_nodes - Architectural hubs (most-connected functions)
 #  Name              File                                            Edges  Community
 1  json              api/gateway/index.ts                               135  api
 2  updateNode        lib/execution/workflow/WorkflowStore.ts            127  components
 3  isArray           lib/flow/typeGuards.ts                              94  lib
 4  success           lib/events/eventBus.ts                              70  hooks
 5  render            components/ErrorBoundary.tsx                        62  components
 6  asNumber          lib/editor/effects/registry.ts                      55  editor
 7  runFromNode       lib/execution/workflow/starter.ts                   52  components
 8  getCorsHeaders    api/_shared/cors.ts                                 50  api
 9  sanitizePrompt    api/_shared/prompt-sanitizer.ts                     46  api
10  processStep       lib/flow/graphBuilder/core/router.ts                44  graphBuilder

Identifies functions that would cause the most disruption if changed - the structural load-bearing walls of your codebase.

get_surprises - Unexpected cross-boundary connections
Score  Source → Target                                           Why
  7    saveToLibrary → ExtensionCard                             weakly-resolved, crosses backend ↔ UI,
       (job-completion.ts → card.tsx)                            crosses execution surfaces

  6    transcribe → TranscriptionPanel                           weakly-resolved, crosses services ↔ components
       (transcription-service.ts → transcription-panel.tsx)

  6    useToast → Toaster                                        crosses hooks ↔ components,
       (use-toast.ts → toaster.tsx)                              peripheral node reaches hub

  6    getExecutionPathNodeIds → getDownstreamNodeIds            bridges communities 11 → 1,
       (useWorkflowCost.ts → graphTraversal.ts)                 crosses state ↔ shared execution surfaces

  6    compileSystemPrompt → memo_handler                        weakly-resolved, crosses lib ↔ components
       (promptTemplates.ts → PromptNode.tsx)

Surfaces connections that shouldn't exist or deserve closer inspection - potential coupling violations, false positives in the graph, or legitimate but non-obvious architectural bridges.

suggest_investigations - Auto-generated investigation questions
Type              Question
weak_resolution   What is the exact relationship between RecoveryPollingQueue
                  and useRecovery? (alias_path across services ↔ hooks)

weak_resolution   What is the exact relationship between useTemplates and
                  EditorSidebar? (alias_path across hooks ↔ components)

bridge_node       Why does `cn` connect 8 structurally distant communities?
                  (High betweenness centrality)

bridge_node       Why does `updateNode` connect Inspector, ActionBar,
                  isArray, and useAuth? (Bridges distant modules)

verify_inferred   Are the 18 weakly-resolved relationships involving `error`
                  actually correct? (Hub node with alias-resolved edges)

isolated_nodes    What connects defineConfig_handler, SitemapEntry,
                  generateSitemapXml to the rest of the system?
                  (5 weakly-connected nodes - possible documentation gaps)

Tells you where to look next - no prompt engineering required.

explain_flow - Trace execution across files

Query: handleRetryAction

Callers (who invokes this):
  ← handleBatchExecution           (batchProcessor.ts)
  ← triggerDownstreamNodes         (downstreamTrigger.ts)
  ← executeNode                    (nodeExecutor.ts)

Seed:
  ► handleRetryAction              (retryManager.ts:126-196)
    Extracts retry action, clears downstream execution state,
    resets node statuses, re-executes from target node

Callees (what this invokes):
  → getDownstreamNodeIds           (graphTraversal.ts:15-32)
  → addLog                         (workflowStore.ts:134-139)

Returns the full function source of the seed plus caller/callee code - one MCP call, 899 tokens, 6 files traced.

build_stack_tree - Full call hierarchy

Query: runWorkflow (depth: 2, direction: both)

                 StartNode.tsx (memo_handler)
                 InpaintNode.tsx (useCallback_handler)
                 ActionBar.tsx
                        │
                        ▼
              ► runWorkflow (starter.ts)
                        │
            ┌───────────┼───────────┐
            ▼           ▼           ▼
     ensureFlowSaved  addLog   workflowStarted
     (ensureFlowSaved.ts) (workflowStore.ts) (activityLogger.ts)
            │                       │
       ┌────┴────┐                  ▼
       ▼         ▼              info (logger.ts)
 serializeFlow  saveToDatabase
 (serialization.ts) (dataService.ts)

10 nodes, 9 edges, 2 levels deep. Pure static analysis - zero LLM cost.


Changelog

v0.6.1 - Wiki Startup Generation & Version Sync

Wiki startup generation. Wiki pages now auto-generate on MCP server and daemon startup, not just during index_codebase. Ensures wiki context is always available without requiring a full re-index.

Wiki generator freshness guard. writePage now checks sourceCommit from disk to skip unchanged pages. Fixed surprisesPage flag not being set on the generation result.

Version sync. package-lock.json synced to match package.json.

v0.6.0 - Wiki Layer & Memory Precision

This release adds an always-on wiki layer for persistent codebase knowledge and fixes three memory retrieval bugs that caused noisy or missing context injection.

Wiki layer. Auto-generated wiki pages from codebase topology are indexed alongside code and injected into every prompt context within a configurable token budget.

5 new MCP tools for wiki management: wiki_query, wiki_read, wiki_write, wiki_check_staleness.

FTS5 phrase query fix. Stop words ("how", "does", "work") were included in phrase queries, causing FTS5 to match only exact phrases and short-circuit before AND/OR fallback. Queries like "how does image generation work" now correctly find wiki pages matching "image generation".

Memory type isolation. Wiki pages (type wiki) no longer leak into memory search results. The memory search pipeline now explicitly filters to user, feedback, project, and reference types.

Access count penalty. Over-accessed memories are now penalized: >15 accesses -> 0.5x score, >8 accesses -> 0.75x. Prevents generic feedback rules from drowning out topic-specific results.

Tighter relevance threshold. Raised from 0.70 to 0.85, filtering out low-relevance tail results.

Benchmark results (1,140-file production codebase, 30 queries):

Layer Precision Hit Rate Notes
Code 57% 100% Always injected
Wiki 100% 50% Only injects when relevant pages exist
Memory 73% ~60% After access penalty + threshold fix
v0.5.0 - Topology-Aware Search & Architecture Decomposition

Added codebase topology analysis and decomposed the search engine into focused strategy modules.

Topology analysis pipeline. After each index, reporecall runs Louvain community detection on the call graph, identifies architectural hub nodes, scores surprising cross-boundary connections, and generates investigation questions. Results are persisted in SQLite and injected into prompt context automatically.

4 new MCP tools for exploring codebase structure: get_communities, get_hub_nodes, get_surprises, suggest_investigations.

Community-aware search scoring. Results from the same Louvain community as the query seed receive a locality boost, improving architecture and trace queries.

Daemon hardening. Index scheduler queues are bounded at 50k entries. File watcher has backpressure at 10k pending events. Shutdown timeout is now configurable via shutdownTimeoutMs.

Hook request validation. All hook endpoints now validate request bodies with Zod schemas, returning 400 with details on malformed payloads instead of silently misbehaving.

Search architecture decomposition. The monolithic hybrid.ts (~6,800 lines) was split into 7 focused modules: pipeline-core, bug-strategy, architecture-strategy, trace-strategy, lookup-strategy, context-prioritization, and the thin hybrid orchestrator. No public API changes.

Other improvements:

  • Tree-sitter parse timeout (5s) prevents hangs on malformed files
  • reporecall mcp warns when a daemon is already running (SQLite lock contention risk)
  • Ollama health check added to the mcp command
  • Bug intent classifier now recognizes plural forms ("bugs", "issues", "problems")
  • New dependencies: graphology, graphology-communities-louvain for graph analysis
v0.4.1 - Claude Hook Compatibility Fix

This patch fixes Claude hook token lookup for real claude -p / headless sessions. Reporecall-generated hooks now fall back to $PWD when $CLAUDE_PROJECT_DIR is unavailable, so injected context reaches Claude reliably in local CLI sessions after re-running reporecall init.

v0.4.0 - Intent-Based Retrieval Overhaul

This release replaces the old R0 / R1 / R2 routing model with intent-based query modes. The old model described retrieval shape (exact, trace, broad), the new model describes what the user actually wants:

Mode Purpose
lookup Exact symbol, file, endpoint, or module lookup
trace Implementation path - "how does X work", "what calls Y"
bug Causal debugging - symptom descriptions, "why does this fail"
architecture Broad inventory - "which files implement...", "full flow from A to B"
change Cross-cutting edits - "add logging across the auth flow"
skip Meta/chat/non-code prompts

Other changes in this release: streaming windowed indexing, adaptive embedding batches, semantic feature extraction, summary_only delivery for low-confidence bundles, PreToolUse hook guidance, and SQLite ABI self-repair.


Development

npm install
npm run build
npm test

Acknowledgments

The wiki layer is inspired by Andrej Karpathy's LLM Wiki concept - organizing codebase knowledge as structured markdown files that LLMs can query efficiently.

License

MIT