Technical design document for RustCoder — an autonomous Rust coding agent that uses local LLMs to generate, scaffold, and fix Rust projects.
RustCoder generates monolithic src/main.rs files with inline pub mod blocks instead of multi-file project layouts. This is a deliberate design choice driven by three constraints:
Context fragmentation. Small local LLMs (7B–27B parameters) struggle when relevant code is split across files they can't see simultaneously. A single file means the model always has the full picture when fixing errors — no missing imports from unseen modules, no phantom type mismatches across file boundaries.
Rust module system complexity. Multi-file layouts require correct mod.rs files, path attributes, and pub(crate) visibility — all things small models routinely get wrong. A single-file layout sidesteps mod declaration bugs entirely because all modules are inline.
Atomic compilation. One file means one compilation unit with one error stream. The fix loop can read the entire file, locate the error, and patch it without coordinating across files.
The tradeoff is a soft limit of roughly 5,000 lines, beyond which context windows saturate and model quality degrades.
scaffold.rs handles the structural mechanics of generating and assembling a multi-module project into a single file.
generate_single_file_main() (scaffold.rs:1061) builds the initial stub file. It uses a BTreeMap (line 1078) to group modules by parent path, ensuring deterministic output order regardless of HashMap iteration. Nested module names like core::token are split on :: and emitted as nested pub mod blocks.
The recursive emit_module_tree() (line 1103) walks the BTreeMap and emits each module with 4-space indentation per nesting level. Leaf modules get a todo!() placeholder; parent modules recurse into their children.
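The grouping-and-emission logic can be sketched as a small standalone version (a simplification; the real generate_single_file_main()/emit_module_tree() also handle stub bodies and module metadata omitted here):

```rust
use std::collections::BTreeMap;

/// A module tree: each node maps a module name to its children.
/// BTreeMap keeps sibling order deterministic, as the text describes.
#[derive(Default)]
struct ModTree(BTreeMap<String, ModTree>);

/// Build a tree from paths like "core::token", then emit nested
/// `pub mod` blocks with 4-space indentation per nesting level.
fn emit_module_tree(paths: &[&str]) -> String {
    let mut root = ModTree::default();
    for path in paths {
        // Walk/extend the tree one `::` segment at a time.
        let mut node = &mut root;
        for part in path.split("::") {
            node = node.0.entry(part.to_string()).or_default();
        }
    }
    fn emit(tree: &ModTree, depth: usize, out: &mut String) {
        let pad = "    ".repeat(depth);
        for (name, children) in &tree.0 {
            out.push_str(&format!("{pad}pub mod {name} {{\n"));
            if children.0.is_empty() {
                // Leaf module: placeholder body, per the text above.
                out.push_str(&format!("{pad}    todo!()\n"));
            } else {
                // Parent module: recurse into children.
                emit(children, depth + 1, out);
            }
            out.push_str(&format!("{pad}}}\n"));
        }
    }
    let mut out = String::new();
    emit(&root, 0, &mut out);
    out
}
```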
insert_module_into_main() (line 1143) replaces a module's body in the assembled main.rs. It finds pub mod name {, then counts braces to locate the matching }. For nested modules like core::token, it first navigates into the parent's brace scope via find_nested_mod_start() before searching for the leaf module.
This approach is intentionally simple — no AST parsing, just string search + brace counting. It's robust enough for machine-generated code where module blocks are well-formed.
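A minimal sketch of the brace-counting half of that search (it deliberately ignores braces inside string literals and comments, which is fine for the well-formed machine-generated code described above):

```rust
/// Given source text and the byte offset of an opening '{', return the
/// offset of its matching '}' by counting brace depth. Returns None if
/// the braces are unbalanced.
fn find_matching_brace(src: &str, open: usize) -> Option<usize> {
    let bytes = src.as_bytes();
    debug_assert_eq!(bytes[open], b'{');
    let mut depth = 0usize;
    for (i, &b) in bytes.iter().enumerate().skip(open) {
        match b {
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i); // matching close brace found
                }
            }
            _ => {}
        }
    }
    None
}
```

insert_module_into_main() would pair this with a plain string search for `pub mod name {` to delimit the body to replace.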
When implementing module B that depends on module A, RustCoder injects only A's public API signatures (extracted by extract_public_api(), line 204), not its full implementation. This keeps prompts small and focused. The build_context() function (line 680) assembles the dependency APIs, working memory notes, and external docs into a single prompt.
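A line-based sketch of the kind of signature harvesting extract_public_api() performs (the real extractor's matching rules are not documented here; the patterns below are assumptions):

```rust
/// Pull public item signatures out of a module body, dropping function
/// bodies. Only `pub fn` / `pub struct` / `pub enum` headers are
/// matched in this sketch.
fn extract_public_api(source: &str) -> Vec<String> {
    source
        .lines()
        .map(str::trim)
        .filter(|l| {
            l.starts_with("pub fn ")
                || l.starts_with("pub struct ")
                || l.starts_with("pub enum ")
        })
        // Strip the trailing `{` so only the signature survives.
        .map(|l| l.trim_end_matches('{').trim().to_string())
        .collect()
}
```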
The scaffold() method in rig_agent.rs:864 orchestrates project generation through five phases:
In the first phase, the LLM decomposes a project description into a ProjectSpec — a JSON structure with module names, purposes, dependency edges, and public API signatures. The ModuleGraph (scaffold.rs:70) builds a DAG and runs Kahn's algorithm (line 110) for topological sort, producing an implementation order where dependencies come first.
Invalid cross-references (where the model confuses external crates with internal modules) are silently dropped during graph construction (line 96).
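The ordering step can be sketched with a minimal Kahn's algorithm over (module, dependency) edges, including the silent drop of unknown references (a simplified standalone version, not the real ModuleGraph):

```rust
use std::collections::{BTreeMap, VecDeque};

/// Kahn's algorithm over module dependency edges, returning an
/// implementation order where dependencies come first, or None if the
/// graph contains a cycle. `deps` holds (module, its_dependency) pairs.
fn topo_order(modules: &[&str], deps: &[(&str, &str)]) -> Option<Vec<String>> {
    let mut indegree: BTreeMap<&str, usize> = modules.iter().map(|m| (*m, 0)).collect();
    let mut edges: BTreeMap<&str, Vec<&str>> = BTreeMap::new(); // dep -> dependents
    for &(module, dep) in deps {
        // Silently drop edges naming unknown modules (e.g. the model
        // confused an external crate with an internal module).
        if !indegree.contains_key(module) || !indegree.contains_key(dep) {
            continue;
        }
        edges.entry(dep).or_default().push(module);
        *indegree.get_mut(module).unwrap() += 1;
    }
    // Start from modules with no unmet dependencies.
    let mut queue: VecDeque<&str> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(m, _)| *m)
        .collect();
    let mut order = Vec::new();
    while let Some(m) = queue.pop_front() {
        order.push(m.to_string());
        for &dependent in edges.get(m).map(|v| v.as_slice()).unwrap_or(&[]) {
            let d = indegree.get_mut(dependent).unwrap();
            *d -= 1;
            if *d == 0 {
                queue.push_back(dependent);
            }
        }
    }
    (order.len() == modules.len()).then_some(order)
}
```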
The second phase creates the output directory, writes a placeholder Cargo.toml, initializes WORKING_MEMORY.md, generates the single-file stub with todo!() bodies, and makes an initial git commit. Each subsequent phase commits its changes atomically.
In the third phase, modules are implemented in topological order. For each module:
- context_for() builds a prompt with the module's spec, its dependency APIs, relevant working memory notes, and gotcha hints
- The LLM generates code + structured notes (no tools — all context is pre-fed in the prompt to avoid search loops)
- The response is parsed into an ImplementationResponse with code, notes, trait impls, and key signatures
- Code is injected into main.rs via insert_module_into_main()
- Notes are appended to WORKING_MEMORY.md for downstream modules
- Git commit per module
The no-tools decision (lines 979–983) is deliberate: giving small models a search_crate_docs tool during implementation causes infinite loops when they search for project-internal types that don't exist in any crate index.
The fourth phase involves no LLM. extract_external_crates_from_sources() (scaffold.rs:450) scans all generated source code for use statements, #[derive(Serialize)] patterns, and #[tokio::main] attributes to deterministically detect external crate usage. generate_cargo_toml_from_sources() (line 525) writes the Cargo.toml with pinned versions for known crates and "*" for unknown ones.
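A sketch of the use-statement portion of that scan (derive and attribute detection omitted; the skip list below is an assumption):

```rust
/// Detect external crate usage by scanning `use` statements, skipping
/// standard-library and crate-local roots. Returns each crate name once,
/// in first-seen order.
fn external_crates(source: &str) -> Vec<String> {
    let mut crates = Vec::new();
    for line in source.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("use ") {
            // The path root identifies the crate (e.g. `serde` in
            // `use serde::Serialize;`).
            let root: String = rest
                .chars()
                .take_while(|c| c.is_alphanumeric() || *c == '_')
                .collect();
            let internal = matches!(
                root.as_str(),
                "std" | "core" | "alloc" | "crate" | "self" | "super"
            );
            if !internal && !root.is_empty() && !crates.contains(&root) {
                crates.push(root);
            }
        }
    }
    crates
}
```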
A cleanup_llm_source() pass (rig_agent.rs:1518) removes common LLM artifacts like orphaned doc comments not attached to any item.
Before the fix loop, detected crate dependencies are indexed into a local LanceDB-backed docs database (docs_rag.rs). This gives the Phase 5 fix loop access to real API documentation via a SearchDocsTool, reducing hallucinated API usage.
The fifth phase runs up to 7 iterations of focused error repair:
- Run cargo check --message-format=json for structured diagnostics
- Auto-apply machine-applicable fixes from cargo's suggestions
- Sort remaining errors by priority (see Error Prioritization below)
- Take the single highest-priority error
- Use tree-sitter to extract just the affected function (not the whole file)
- Build a focused prompt: one error + one function + gotcha hints + API corrections
- LLM fixes via the patch_file tool, then runs cargo_check to verify
- Git commit, next iteration
The one-error-at-a-time strategy prevents small models from being overwhelmed by dozens of errors. Fixing root causes (imports, types) first cascades into resolving dependent errors.
tree_sitter_extract.rs:123–148 assigns a priority tier to each compiler error. Lower number = fix first:
| Priority | Category | Error Codes | Rationale |
|---|---|---|---|
| 1 | Syntax/parse | No E-code | Blocks all other analysis |
| 2 | Import/module | E0432, E0433 | Missing imports cascade into type errors |
| 3 | Type definition | E0412, E0422, E0425, E0609 | Default for unknown E-codes |
| 4 | Trait/impl | E0046, E0277, E0599 | Requires types to exist first |
| 5 | Borrow/lifetime | E0106, E0382, E0505, E0507 | Often resolves after type fixes |
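The tier assignment can be sketched as a direct mapping from rustc error codes, mirroring the table above (a simplified version of the logic in tree_sitter_extract.rs:123–148):

```rust
/// Map a rustc error code to a fix-priority tier. Lower = fix first.
/// `None` means the diagnostic carried no E-code (syntax/parse error).
fn error_priority(code: Option<&str>) -> u8 {
    match code {
        None => 1,                                                // syntax/parse
        Some("E0432" | "E0433") => 2,                             // imports/modules
        Some("E0046" | "E0277" | "E0599") => 4,                   // traits/impls
        Some("E0106" | "E0382" | "E0505" | "E0507") => 5,         // borrows/lifetimes
        Some(_) => 3, // type-definition errors and unknown E-codes default here
    }
}
```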
Rather than sending the entire (potentially 3,000+ line) file to the LLM for each error, extract_function_at_line() (tree_sitter_extract.rs:24) uses tree-sitter to parse the Rust source and extract only the function containing the error line. It also captures:
- Use statements from the file header (for import context)
- Impl block header if the function is a method (e.g., impl MyStruct {)
For import errors (E0432, E0433, E0405), the first 50 lines of the file are included regardless, since the fix is typically in use statements rather than function bodies.
This "keyhole" approach keeps fix prompts small enough for 7B models to handle effectively, while providing enough context for accurate fixes.
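The keyhole idea can be illustrated with a deliberately simplified line-and-brace-based extractor (the real extract_function_at_line() uses a tree-sitter parse, which is robust to formatting in ways this sketch is not):

```rust
/// Extract just the function containing `error_line` (1-based), plus the
/// file's `use` statements for import context. Assumes rustfmt-style
/// layout where `fn` headers start their own lines.
fn keyhole(source: &str, error_line: usize) -> Option<String> {
    let lines: Vec<&str> = source.lines().collect();
    // Scan backward from the error line for the enclosing `fn` header.
    let start = (0..error_line).rev().find(|&i| {
        lines.get(i).map_or(false, |l| {
            let t = l.trim_start();
            t.starts_with("fn ") || t.starts_with("pub fn ")
        })
    })?;
    // Brace-count forward from the header to find the function's end.
    let mut depth = 0i32;
    let mut end = start;
    for (i, line) in lines.iter().enumerate().skip(start) {
        depth += line.matches('{').count() as i32 - line.matches('}').count() as i32;
        if depth == 0 && i > start {
            end = i;
            break;
        }
    }
    // Include the file's use statements for import context.
    let uses: Vec<&str> = lines
        .iter()
        .copied()
        .filter(|l| l.trim_start().starts_with("use "))
        .collect();
    Some(format!("{}\n{}", uses.join("\n"), lines[start..=end].join("\n")))
}
```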
WORKING_MEMORY.md serves as episodic memory across module implementations during Phase 3. After implementing each module, the LLM produces structured notes:
- Notes for dependents: API usage gotchas, return type details
- Trait implementations: What traits are available (Clone, FromStr, etc.)
- Key signatures: Method signatures that downstream modules need
When building the prompt for module B that depends on module A, extract_dependency_notes() (scaffold.rs:934) pulls only A's section from working memory. This compresses what might be a 500-line module into a ~20-line API summary, keeping prompts within context limits.
The working memory file is append-only during a scaffold run — each module's notes accumulate for later modules to reference.
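Assuming each module's notes live under a `## <module>` heading (the actual WORKING_MEMORY.md layout is an assumption here), a hypothetical sketch of per-module section extraction in the spirit of extract_dependency_notes():

```rust
/// Pull a single module's section out of a working-memory markdown file,
/// returning everything between `## <module>` and the next `## ` heading.
/// Returns None if the module has no section.
fn dependency_notes(memory: &str, module: &str) -> Option<String> {
    let heading = format!("## {module}");
    let mut in_section = false;
    let mut out = Vec::new();
    for line in memory.lines() {
        if line.trim() == heading {
            in_section = true; // entered the requested module's section
            continue;
        }
        if in_section && line.starts_with("## ") {
            break; // next module's section starts
        }
        if in_section {
            out.push(line);
        }
    }
    in_section.then(|| out.join("\n").trim().to_string())
}
```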
- Single-file scale: Projects beyond ~5,000 lines risk context window saturation, where the model loses coherence on distant parts of the file
- No proc-macro authoring: The single-file layout cannot support procedural macro crates (which require separate crate targets)
- Fix loop ceiling: 7 iterations is a hard cap — deeply entangled errors may not resolve in time
- Model-dependent quality: The architecture compensates for small model weaknesses but cannot eliminate them; larger models produce better results with the same pipeline
- No incremental rebuilds: Each fix iteration runs a full cargo check, which scales with project size
- Initial release with core scaffold, fix, and implement commands
- LM Studio and Ollama provider support
- LanceDB-based documentation RAG
- rust-analyzer semantic analysis integration
- Single-file project generation architecture
- Tree-sitter based error extraction and prioritization