Technical design document for RustCoder — an autonomous Rust coding agent that uses local LLMs to generate, scaffold, and fix Rust projects.
RustCoder generates monolithic src/main.rs files with inline pub mod blocks instead of multi-file project layouts. This is a deliberate design choice driven by three constraints:
Context fragmentation. Small local LLMs (7B–27B parameters) struggle when relevant code is split across files they can't see simultaneously. A single file means the model always has the full picture when fixing errors — no missing imports from unseen modules, no phantom type mismatches across file boundaries.
Rust module system complexity. Multi-file layouts require correct mod.rs files, path attributes, and pub(crate) visibility — all things small models routinely get wrong. A single-file layout sidesteps mod declaration bugs entirely because all modules are inline.
Atomic compilation. One file means one compilation unit with one error stream. The fix loop can read the entire file, locate the error, and patch it without coordinating across files.
The tradeoff is a soft limit of roughly 5,000 lines, beyond which context windows saturate and model quality degrades.
scaffold.rs handles the structural mechanics of generating and assembling a multi-module project into a single file.
generate_single_file_main() (scaffold.rs:1061) builds the initial stub file. It uses a BTreeMap (line 1078) to group modules by parent path, ensuring deterministic output order regardless of HashMap iteration. Nested module names like core::token are split on :: and emitted as nested pub mod blocks.
The recursive emit_module_tree() (line 1103) walks the BTreeMap and emits each module with 4-space indentation per nesting level. Leaf modules get a todo!() placeholder; parent modules recurse into their children.
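The grouping-and-emission logic can be sketched as a small standalone version (a simplification; the real generate_single_file_main()/emit_module_tree() also handle stub bodies and module metadata omitted here):

```rust
use std::collections::BTreeMap;

/// A module tree: each node maps a module name to its children.
/// BTreeMap keeps sibling order deterministic, as the text describes.
#[derive(Default)]
struct ModTree(BTreeMap<String, ModTree>);

/// Build a tree from paths like "core::token", then emit nested
/// `pub mod` blocks with 4-space indentation per nesting level.
fn emit_module_tree(paths: &[&str]) -> String {
    let mut root = ModTree::default();
    for path in paths {
        // Walk/extend the tree one `::` segment at a time.
        let mut node = &mut root;
        for part in path.split("::") {
            node = node.0.entry(part.to_string()).or_default();
        }
    }
    fn emit(tree: &ModTree, depth: usize, out: &mut String) {
        let pad = "    ".repeat(depth);
        for (name, children) in &tree.0 {
            out.push_str(&format!("{pad}pub mod {name} {{\n"));
            if children.0.is_empty() {
                // Leaf module: placeholder body, per the text above.
                out.push_str(&format!("{pad}    todo!()\n"));
            } else {
                // Parent module: recurse into children.
                emit(children, depth + 1, out);
            }
            out.push_str(&format!("{pad}}}\n"));
        }
    }
    let mut out = String::new();
    emit(&root, 0, &mut out);
    out
}
```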
insert_module_into_main() (line 1143) replaces a module's body in the assembled main.rs. It finds pub mod name {, then counts braces to locate the matching }. For nested modules like core::token, it first navigates into the parent's brace scope via find_nested_mod_start() before searching for the leaf module.
This approach is intentionally simple — no AST parsing, just string search + brace counting. It's robust enough for machine-generated code where module blocks are well-formed.
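A minimal sketch of the brace-counting half of that search (it deliberately ignores braces inside string literals and comments, which is fine for the well-formed machine-generated code described above):

```rust
/// Given source text and the byte offset of an opening '{', return the
/// offset of its matching '}' by counting brace depth. Returns None if
/// the braces are unbalanced.
fn find_matching_brace(src: &str, open: usize) -> Option<usize> {
    let bytes = src.as_bytes();
    debug_assert_eq!(bytes[open], b'{');
    let mut depth = 0usize;
    for (i, &b) in bytes.iter().enumerate().skip(open) {
        match b {
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i); // matching close brace found
                }
            }
            _ => {}
        }
    }
    None
}
```

insert_module_into_main() would pair this with a plain string search for `pub mod name {` to delimit the body to replace.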
When implementing module B that depends on module A, RustCoder injects only A's public API signatures (extracted by extract_public_api(), line 204), not its full implementation. This keeps prompts small and focused. The build_context() function (line 680) assembles the dependency APIs, working memory notes, and external docs into a single prompt.
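A line-based sketch of the kind of signature harvesting extract_public_api() performs (the real extractor's matching rules are not documented here; the patterns below are assumptions):

```rust
/// Pull public item signatures out of a module body, dropping function
/// bodies. Only `pub fn` / `pub struct` / `pub enum` headers are
/// matched in this sketch.
fn extract_public_api(source: &str) -> Vec<String> {
    source
        .lines()
        .map(str::trim)
        .filter(|l| {
            l.starts_with("pub fn ")
                || l.starts_with("pub struct ")
                || l.starts_with("pub enum ")
        })
        // Strip the trailing `{` so only the signature survives.
        .map(|l| l.trim_end_matches('{').trim().to_string())
        .collect()
}
```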
The scaffold() method in rig_agent.rs:864 orchestrates project generation through five phases:
In the first phase, the LLM decomposes a project description into a ProjectSpec — a JSON structure with module names, purposes, dependency edges, and public API signatures. The ModuleGraph (scaffold.rs:70) builds a DAG and runs Kahn's algorithm (line 110) for topological sort, producing an implementation order where dependencies come first.
Invalid cross-references (where the model confuses external crates with internal modules) are silently dropped during graph construction (line 96).
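The ordering step can be sketched with a minimal Kahn's algorithm over (module, dependency) edges, including the silent drop of unknown references (a simplified standalone version, not the real ModuleGraph):

```rust
use std::collections::{BTreeMap, VecDeque};

/// Kahn's algorithm over module dependency edges, returning an
/// implementation order where dependencies come first, or None if the
/// graph contains a cycle. `deps` holds (module, its_dependency) pairs.
fn topo_order(modules: &[&str], deps: &[(&str, &str)]) -> Option<Vec<String>> {
    let mut indegree: BTreeMap<&str, usize> = modules.iter().map(|m| (*m, 0)).collect();
    let mut edges: BTreeMap<&str, Vec<&str>> = BTreeMap::new(); // dep -> dependents
    for &(module, dep) in deps {
        // Silently drop edges naming unknown modules (e.g. the model
        // confused an external crate with an internal module).
        if !indegree.contains_key(module) || !indegree.contains_key(dep) {
            continue;
        }
        edges.entry(dep).or_default().push(module);
        *indegree.get_mut(module).unwrap() += 1;
    }
    // Start from modules with no unmet dependencies.
    let mut queue: VecDeque<&str> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(m, _)| *m)
        .collect();
    let mut order = Vec::new();
    while let Some(m) = queue.pop_front() {
        order.push(m.to_string());
        for &dependent in edges.get(m).map(|v| v.as_slice()).unwrap_or(&[]) {
            let d = indegree.get_mut(dependent).unwrap();
            *d -= 1;
            if *d == 0 {
                queue.push_back(dependent);
            }
        }
    }
    (order.len() == modules.len()).then_some(order)
}
```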
The second phase creates the output directory, writes a placeholder Cargo.toml, initializes WORKING_MEMORY.md, generates the single-file stub with todo!() bodies, and makes an initial git commit. Each subsequent phase commits its changes atomically.
In the third phase, modules are implemented in topological order. For each module:
- context_for() builds a prompt with the module's spec, its dependency APIs, relevant working memory notes, and gotcha hints
- The LLM generates code + structured notes (no tools — all context is pre-fed in the prompt to avoid search loops)
- The response is parsed into an ImplementationResponse with code, notes, trait impls, and key signatures
- Code is injected into main.rs via insert_module_into_main()
- Notes are appended to WORKING_MEMORY.md for downstream modules
- Git commit per module
The no-tools decision (lines 979–983) is deliberate: giving small models a search_crate_docs tool during implementation causes infinite loops when they search for project-internal types that don't exist in any crate index.
The fourth phase involves no LLM. extract_external_crates_from_sources() (scaffold.rs:450) scans all generated source code for use statements, #[derive(Serialize)] patterns, and #[tokio::main] attributes to deterministically detect external crate usage. generate_cargo_toml_from_sources() (line 525) writes the Cargo.toml with pinned versions for known crates and "*" for unknown ones.
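A sketch of the use-statement portion of that scan (derive and attribute detection omitted; the skip list below is an assumption):

```rust
/// Detect external crate usage by scanning `use` statements, skipping
/// standard-library and crate-local roots. Returns each crate name once,
/// in first-seen order.
fn external_crates(source: &str) -> Vec<String> {
    let mut crates = Vec::new();
    for line in source.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("use ") {
            // The path root identifies the crate (e.g. `serde` in
            // `use serde::Serialize;`).
            let root: String = rest
                .chars()
                .take_while(|c| c.is_alphanumeric() || *c == '_')
                .collect();
            let internal = matches!(
                root.as_str(),
                "std" | "core" | "alloc" | "crate" | "self" | "super"
            );
            if !internal && !root.is_empty() && !crates.contains(&root) {
                crates.push(root);
            }
        }
    }
    crates
}
```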
A cleanup_llm_source() pass (rig_agent.rs:1518) removes common LLM artifacts like orphaned doc comments not attached to any item.
Before the fix loop, detected crate dependencies are indexed into a local LanceDB-backed docs database (docs_rag.rs). This gives the Phase 5 fix loop access to real API documentation via a SearchDocsTool, reducing hallucinated API usage.
The fifth phase runs up to 7 iterations of focused error repair:
- Run cargo check --message-format=json for structured diagnostics
- Auto-apply machine-applicable fixes from cargo's suggestions
- Sort remaining errors by priority (see Error Prioritization below)
- Take the single highest-priority error
- Use tree-sitter to extract just the affected function (not the whole file)
- Build a focused prompt: one error + one function + gotcha hints + API corrections
- LLM fixes via the patch_file tool, then runs cargo_check to verify
- Git commit, next iteration
The one-error-at-a-time strategy prevents small models from being overwhelmed by dozens of errors. Fixing root causes (imports, types) first cascades into resolving dependent errors.
tree_sitter_extract.rs:123–148 assigns a priority tier to each compiler error. Lower number = fix first:
| Priority | Category | Error Codes | Rationale |
|---|---|---|---|
| 1 | Syntax/parse | No E-code | Blocks all other analysis |
| 2 | Import/module | E0432, E0433 | Missing imports cascade into type errors |
| 3 | Type definition | E0412, E0422, E0425, E0609 | Default for unknown E-codes |
| 4 | Trait/impl | E0046, E0277, E0599 | Requires types to exist first |
| 5 | Borrow/lifetime | E0106, E0382, E0505, E0507 | Often resolves after type fixes |
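The tier assignment can be sketched as a direct mapping from rustc error codes, mirroring the table above (a simplified version of the logic in tree_sitter_extract.rs:123–148):

```rust
/// Map a rustc error code to a fix-priority tier. Lower = fix first.
/// `None` means the diagnostic carried no E-code (syntax/parse error).
fn error_priority(code: Option<&str>) -> u8 {
    match code {
        None => 1,                                                // syntax/parse
        Some("E0432" | "E0433") => 2,                             // imports/modules
        Some("E0046" | "E0277" | "E0599") => 4,                   // traits/impls
        Some("E0106" | "E0382" | "E0505" | "E0507") => 5,         // borrows/lifetimes
        Some(_) => 3, // type-definition errors and unknown E-codes default here
    }
}
```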
Rather than sending the entire (potentially 3,000+ line) file to the LLM for each error, extract_function_at_line() (tree_sitter_extract.rs:24) uses tree-sitter to parse the Rust source and extract only the function containing the error line. It also captures:
- Use statements from the file header (for import context)
- Impl block header if the function is a method (e.g., impl MyStruct {)
For import errors (E0432, E0433, E0405), the first 50 lines of the file are included regardless, since the fix is typically in use statements rather than function bodies.
This "keyhole" approach keeps fix prompts small enough for 7B models to handle effectively, while providing enough context for accurate fixes.
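The keyhole idea can be illustrated with a deliberately simplified line-and-brace-based extractor (the real extract_function_at_line() uses a tree-sitter parse, which is robust to formatting in ways this sketch is not):

```rust
/// Extract just the function containing `error_line` (1-based), plus the
/// file's `use` statements for import context. Assumes rustfmt-style
/// layout where `fn` headers start their own lines.
fn keyhole(source: &str, error_line: usize) -> Option<String> {
    let lines: Vec<&str> = source.lines().collect();
    // Scan backward from the error line for the enclosing `fn` header.
    let start = (0..error_line).rev().find(|&i| {
        lines.get(i).map_or(false, |l| {
            let t = l.trim_start();
            t.starts_with("fn ") || t.starts_with("pub fn ")
        })
    })?;
    // Brace-count forward from the header to find the function's end.
    let mut depth = 0i32;
    let mut end = start;
    for (i, line) in lines.iter().enumerate().skip(start) {
        depth += line.matches('{').count() as i32 - line.matches('}').count() as i32;
        if depth == 0 && i > start {
            end = i;
            break;
        }
    }
    // Include the file's use statements for import context.
    let uses: Vec<&str> = lines
        .iter()
        .copied()
        .filter(|l| l.trim_start().starts_with("use "))
        .collect();
    Some(format!("{}\n{}", uses.join("\n"), lines[start..=end].join("\n")))
}
```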
WORKING_MEMORY.md serves as episodic memory across module implementations during Phase 3. After implementing each module, the LLM produces structured notes:
- Notes for dependents: API usage gotchas, return type details
- Trait implementations: What traits are available (Clone, FromStr, etc.)
- Key signatures: Method signatures that downstream modules need
When building the prompt for module B that depends on module A, extract_dependency_notes() (scaffold.rs:934) pulls only A's section from working memory. This compresses what might be a 500-line module into a ~20-line API summary, keeping prompts within context limits.
The working memory file is append-only during a scaffold run — each module's notes accumulate for later modules to reference.
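Assuming each module's notes live under a `## <module>` heading (the actual WORKING_MEMORY.md layout is an assumption here), a hypothetical sketch of per-module section extraction in the spirit of extract_dependency_notes():

```rust
/// Pull a single module's section out of a working-memory markdown file,
/// returning everything between `## <module>` and the next `## ` heading.
/// Returns None if the module has no section.
fn dependency_notes(memory: &str, module: &str) -> Option<String> {
    let heading = format!("## {module}");
    let mut in_section = false;
    let mut out = Vec::new();
    for line in memory.lines() {
        if line.trim() == heading {
            in_section = true; // entered the requested module's section
            continue;
        }
        if in_section && line.starts_with("## ") {
            break; // next module's section starts
        }
        if in_section {
            out.push(line);
        }
    }
    in_section.then(|| out.join("\n").trim().to_string())
}
```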
- Single-file scale: Projects beyond ~5,000 lines risk context window saturation, where the model loses coherence on distant parts of the file
- No proc-macro authoring: The single-file layout cannot support procedural macro crates (which require separate crate targets)
- Fix loop ceiling: 7 iterations is a hard cap — deeply entangled errors may not resolve in time
- Model-dependent quality: The architecture compensates for small model weaknesses but cannot eliminate them; larger models produce better results with the same pipeline
- No incremental rebuilds: Each fix iteration runs a full cargo check, which scales with project size
- Initial release with core scaffold, fix, and implement commands
- LM Studio and Ollama provider support
- LanceDB-based documentation RAG
- rust-analyzer semantic analysis integration
- Single-file project generation architecture
- Tree-sitter based error extraction and prioritization