The ATLAS CLI launches all required services, connects to the local LLM, and drops you into an interactive coding session powered by the V3 pipeline.
```bash
cd /path/to/your/project
atlas
```

The `atlas` command automatically detects what's available and launches the best configuration:
- Proxy already running (any method) → connects Aider immediately
- Go installed (1.24+) → builds and launches the proxy locally as a background process, then connects Aider. The proxy runs in your current directory with full file access.
- Docker Compose proxy only (no Go) → connects to the containerized proxy. File access is limited to the directory set in `ATLAS_PROJECT_DIR` (defaults to the ATLAS repo root).
- Nothing available → falls back to the built-in REPL (`/solve` and `/bench` only, no file operations)
For the best experience, install Go 1.24+. The local proxy runs in whatever directory you're in when you type `atlas`, so it can always see your project files. The Docker Compose proxy can only see the directory that was mounted when the containers started. See Proxy File Access below.
```bash
atlas                          # Interactive session
atlas somefile.py              # Add file to chat on launch
atlas --message "fix the bug"  # Non-interactive (runs and exits)
echo "solve this" | atlas      # Pipe mode (stdin as problem)
```

Any arguments after `atlas` are passed through to Aider.
```mermaid
flowchart TD
    Start["atlas command"] --> ProxyUp{"Proxy already\nrunning?"}
    ProxyUp -->|"Yes"| Launch["Launch Aider\nconnected to proxy"]
    ProxyUp -->|"No"| GoCheck{"Go 1.24+\ninstalled?"}
    GoCheck -->|"Yes"| Build["Build proxy from source\n(~10s first time)"]
    Build --> Local["Launch proxy locally\n(runs in your CWD)"]
    Local --> Launch
    GoCheck -->|"No"| Docker{"Docker proxy\nrunning?"}
    Docker -->|"Yes"| Launch
    Docker -->|"No"| REPL["Fall back to built-in REPL\n(/solve, /bench only)"]
    Launch --> Aider["Aider session\nFull file access\nV3 pipeline\nTool calls"]
    style Start fill:#1a3a5c,color:#fff
    style Aider fill:#333,color:#fff
    style Launch fill:#2d5016,color:#fff
    style REPL fill:#5c3a1a,color:#fff
```
```text
    _ _____ _    _   ___
   /_\|_ _| |   /_\ / __|
  / _ \ | | | |__ / _ \\__ \
 /_/ \_\|_| |____/_/ \_\___/

✓ llama-server (port 8080)
✓ Geometric Lens (port 8099)
✓ V3 Pipeline (port 8070)
✓ Proxy v2 (port 8090)

[atlas] Stack ready. Launching aider...

llama-server → V3 Pipeline → Proxy v2 → Aider
Grammar: response_format:json_object | V3 on T2+ files
Context: 32K | GPU: RTX 5060 Ti | ~51 tok/s
```
Each service is health-checked via GET /health before proceeding:
| Service | Port | Health Timeout |
|---|---|---|
| llama-server | 8080 | 120s (model loading is slow) |
| Geometric Lens | 8099 | 30s |
| V3 Pipeline | 8070 | 15s |
| Proxy v2 | 8090 | 30s |
If a service is already running, ATLAS skips it and shows "(already running)". Logs for each service are written to logs/ in the ATLAS directory.
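The launcher itself is a Go binary, but the polling it performs is simple enough to sketch. A minimal Python equivalent of the per-service check, using the URLs and timeouts from the table above (function name is illustrative, not part of ATLAS):

```python
import time
import urllib.request

def wait_healthy(url: str, timeout_s: float) -> bool:
    """Poll GET /health once per second until it returns 200 or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # connection refused / timed out: service not up yet
        time.sleep(1)
    return False

# llama-server gets the long timeout because model loading is slow:
# wait_healthy("http://localhost:8080/health", 120)
```

The generous llama-server timeout matters: loading a ~8 GB GGUF into VRAM can easily take a minute, and failing fast there would abort an otherwise healthy startup.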
Every tool call, V3 pipeline stage, and build verification is streamed in real-time:
```text
[Turn 1/30] 📋 planning subtasks...
[Turn 2/30] ✎ writing package.json (T1, direct)
✓ wrote successfully (1.2ms)
[Turn 3/30] ✎ writing app.py (T2, V3 pipeline)
┌─ V3 Pipeline ─────────────────────────────
│ Baseline: 134 lines, scoring...
│ [probe] Generating probe candidate...
│ [probe_scored] C(x)=0.72
│ [plansearch] Generating 3 plans...
│ [sandbox_test] Testing candidates...
└──── V3 complete: phase1, 3 candidates
✓ wrote successfully
[Turn 4/30] 🔧 running: python -m py_compile app.py
✓ exit code 0 (0.3s)
[Turn 5/30] 📖 reading requirements.txt
└─ 12 lines loaded
═══════════════════════════════════════════
✓ Complete (5 turns, 47s)
Files created: 3 (package.json, app.py, requirements.txt)
Commands run: 1
V3 pipeline: 1 file enhanced
Tokens: 8432
═══════════════════════════════════════════
```
The proxy wraps each status update in OpenAI-compatible SSE chunks:
```mermaid
sequenceDiagram
    participant A as Aider
    participant P as atlas-proxy
    participant L as llama-server
    A->>P: POST /v1/chat/completions (stream=true)
    P->>L: POST /v1/chat/completions (json_object)
    L-->>P: {"type":"tool_call","name":"write_file",...}
    Note over P: Classify tier, execute tool
    P-->>A: SSE: [Turn 1/30] ✎ writing app.py (T2)
    P-->>A: SSE: V3 pipeline progress events
    P-->>A: SSE: ✓ wrote successfully
    P-->>A: SSE: completion summary
    P-->>A: data: [DONE]
```
All status lines are injected as delta.content in standard OpenAI SSE chunks, so any OpenAI-compatible client can display them.
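Because the chunks follow the standard streaming shape, pulling the status lines out takes no special client. A minimal sketch of extracting `delta.content` from the SSE lines (the chunk structure here is the generic OpenAI streaming format, nothing ATLAS-specific):

```python
import json

def iter_status_lines(sse_lines):
    """Yield delta.content text from OpenAI-style SSE chat-completion chunks."""
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if delta.get("content"):
            yield delta["content"]
```

Feed it the lines read off the streaming HTTP response and print each yielded string; that is essentially all Aider does to render the proxy's progress output.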
| Icon | Tool | Example |
|---|---|---|
| ✎ | `write_file` | `[Turn 2/30] ✎ writing app.py (T1, direct)` |
| ✏️ | `edit_file` | `[Turn 3/30] ✏️ editing auth.py` |
| 🔧 | `run_command` | `[Turn 4/30] 🔧 running: npm test` |
| 📖 | `read_file` | `[Turn 5/30] 📖 reading config.json` |
| 🔍 | `search_files` | `[Turn 6/30] 🔍 searching "handleAuth"` |
| 📁 | `list_directory` | `[Turn 7/30] 📁 listing src/` |
| 📋 | `plan_tasks` | `[Turn 1/30] 📋 planning subtasks...` |
| Symbol | Meaning | Example |
|---|---|---|
| ✓ | Success | ✓ wrote successfully (1.2ms) |
| ✗ | Failure | ✗ failed: SyntaxError on line 12 (0.4s) |
| └─ | Read/search result | └─ 42 lines loaded |
When the model uses edit_file, the proxy shows what changed:
```text
[Turn 3/30] ✏️ editing auth.py
- def authenticate(user, password):
+ def authenticate(user: str, password: str) -> bool:
(1 lines replaced with 1 lines)
✓ edit applied (0.8ms)
```
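The proxy's implementation is in Go; as a rough Python sketch of the `old_str`/`new_str` edit semantics (assuming the common exactly-once matching rule — treat the uniqueness check as illustrative, not a documented guarantee):

```python
def apply_edit(text: str, old_str: str, new_str: str) -> str:
    """Replace old_str with new_str, requiring an unambiguous (single) match."""
    count = text.count(old_str)
    if count != 1:
        raise ValueError(f"old_str must match exactly once, found {count}")
    return text.replace(old_str, new_str, 1)
```

Exact-string matching is what makes these edits "surgical": the model must quote the existing code verbatim, so a stale or hallucinated snippet fails loudly instead of corrupting the file.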
After the agent finishes, a summary box shows:
- Files created/edited/deleted with names (max 5 shown, then "+N more")
- Commands run count
- V3 pipeline count (only shown if V3 was used)
- Tokens total consumed
```text
> Create a Flask REST API with user authentication, SQLite database,
  and input validation using Pydantic

[Turn 1/30] 📋 planning subtasks...
[Turn 2/30] ✎ writing requirements.txt (T1, direct)
✓ wrote successfully
[Turn 3/30] ✎ writing app.py (T2, V3 pipeline)
┌─ V3 Pipeline ─────────────────────────────
│ [probe] C(x)=0.68, testing...
│ [probe_sandbox] ✓ probe passed
└──── V3 complete: phase0 (probe pass)
✓ wrote successfully
[Turn 4/30] ✎ writing models.py (T1, direct)
✓ wrote successfully
[Turn 5/30] 🔧 running: python -c "import app; print('ok')"
✓ exit code 0 (0.5s)
═══════════════════════════════════════════
✓ Complete (5 turns, 23s)
Files created: 3 (requirements.txt, app.py, models.py)
Commands run: 1
V3 pipeline: 1 file enhanced
═══════════════════════════════════════════
```
Notice app.py (complex logic) went through V3, while requirements.txt and models.py (simple/short) were written directly as T1.
```text
> The login endpoint returns 500 when the email field is missing.
  Fix the input validation.

[Turn 1/30] 📖 reading app.py
└─ 187 lines loaded
[Turn 2/30] 📖 reading models.py
└─ 42 lines loaded
[Turn 3/30] ✏️ editing app.py
- data = request.json
+ data = request.json or {}
+ if not data.get("email"):
+     return jsonify({"error": "email required"}), 400
(1 lines replaced with 3 lines)
✓ edit applied
[Turn 4/30] 🔧 running: python -m pytest tests/ -q
✓ exit code 0 (1.2s)
═══════════════════════════════════════════
✓ Complete (4 turns, 8s)
Files edited: 1 (app.py)
Commands run: 1
═══════════════════════════════════════════
```
The model reads files first, uses edit_file for surgical changes (not full rewrites), and verifies the fix by running tests.
All standard Aider commands work through ATLAS:
| Command | Description |
|---|---|
| `/add <file>` | Add a file to the chat context |
| `/drop <file>` | Remove a file from context |
| `/clear` | Clear chat history |
| `/tokens` | Show token usage |
| `/undo` | Undo last change |
| `/run <command>` | Run a shell command |
| `/help` | Show all commands |
The proxy executes all file operations (`read_file`, `write_file`, `edit_file`, `run_command`, etc.) on the filesystem where it's running. How it accesses your project files depends on how it's launched:
When Go 1.24+ is installed, atlas builds and launches the proxy as a local process. It runs in your current working directory, so it can read and write any file you can. This works from any directory:
```bash
cd ~/projects/my-app
atlas   # proxy sees ~/projects/my-app/

cd ~/projects/other-app
atlas   # proxy sees ~/projects/other-app/
```

Install Go: https://go.dev/dl/ — the proxy builds automatically on first run (~10 seconds).
Without Go, the proxy runs inside a Docker container. It can only see files in the directory that was mounted when the containers started. By default, this is the ATLAS repo root (wherever you ran docker compose up).
To use ATLAS in a different project directory, set ATLAS_PROJECT_DIR in your .env before starting:
```bash
# In your .env file:
ATLAS_PROJECT_DIR=/home/username/projects/my-app

# Then restart the proxy to pick up the new mount:
docker compose up -d atlas-proxy
```

Limitation: You must update `ATLAS_PROJECT_DIR` and restart the proxy each time you switch projects. This is why Go is recommended — the local proxy has no such limitation.
| Sign | Mode |
|---|---|
| `atlas` prints "Starting local proxy..." | Local (Go) — full CWD access |
| `atlas` connects immediately without printing proxy startup | Pre-existing proxy (local or Docker) |
| Proxy lists files from wrong directory or `/tmp` | Docker proxy without correct mount — install Go or set `ATLAS_PROJECT_DIR` |
ATLAS also includes a standalone Python REPL that talks directly to services without Aider:
```bash
pip install -e .
atlas   # Falls back to Python REPL if no Docker stack and no bare-metal launcher
```

| Command | Description |
|---|---|
| `/solve <file>` | Solve a coding problem from a file |
| `/bench [--tasks N] [--dataset NAME] [--strategy TYPE]` | Run benchmarks |
| `/status` | Check service health |
| `/help` | Show available commands |
| `/quit`, `/exit`, `/q` | Exit |
Plain text input (no / prefix) is treated as a coding problem and solved directly.
On startup, the REPL checks:
- llama-server at `ATLAS_INFERENCE_URL` (default: localhost:8080) — required, exits if unavailable
- Geometric Lens at `ATLAS_RAG_URL` (default: localhost:8099) — optional, warns "Lens unavailable — verification disabled"
- Sandbox at `ATLAS_SANDBOX_URL` (default: localhost:30820) — optional, warns "Sandbox unavailable — code testing disabled"
When you type a problem or use /solve:
- Generate code from llama-server (streaming if interactive, batch if piped)
- Extract code (handles `<think>` blocks, markdown fences, raw code)
- Score via Geometric Lens (C(x)/G(x) energy + verdict)
- Test via sandbox (if test cases available)
- Display results with token count and elapsed time
Generation parameters: max_tokens=8192, temperature=0.6, top_k=20, top_p=0.95, stop=["<|im_end|>"]
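These parameters map directly onto the JSON body of a llama-server `/v1/chat/completions` request. A sketch of the payload the REPL sends (shape only; the helper name is illustrative):

```python
def build_solve_request(problem: str) -> dict:
    """Chat-completions request body using the REPL's generation parameters."""
    return {
        "messages": [{"role": "user", "content": problem}],
        "max_tokens": 8192,
        "temperature": 0.6,
        "top_k": 20,
        "top_p": 0.95,
        "stop": ["<|im_end|>"],
        "stream": True,  # streaming if interactive; False when input is piped
    }
```

POST this body to `ATLAS_INFERENCE_URL` (default `http://localhost:8080`) with `Content-Type: application/json` and the server responds in the standard OpenAI-compatible format.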
- Single-file creation: Python scripts, Rust CLIs, Go servers, C programs, shell scripts — first-shot, compiles and runs
- Multi-file project scaffolding: Next.js, Flask, Express — correct dependency order, config files included
- Bug fixes: Reads existing files, identifies issues, applies targeted edits via `edit_file`
- Feature additions: Reads project context, adds features using surgical `old_str`/`new_str` changes
- Code analysis: Reads entire codebases and explains implementation details
- V3-enhanced quality: Files with complex logic (T2) get diverse candidates, build verification, and energy-based selection — producing measurably better code
- Very large existing codebases (50+ files): The 32K context window limits how much project context the model can process at once
- Visual output verification: CSS styling, layout issues, and design quality cannot be verified by the sandbox
- Real-time interactive applications: The model cannot run a browser or test interactive UIs
- Adding features to existing projects: ~67% reliability (L6 test) — the 9B model sometimes over-explores instead of writing code
- Be specific: "Create a Flask API with /users GET and POST endpoints, SQLite backend, input validation with Pydantic" works better than "Create a web app"
- Provide file context: When modifying existing code, `/add` files to the Aider chat so ATLAS can read them
- Complex tasks take longer: V3 pipeline fires on feature files (50+ lines with logic), adding 2-5 minutes but producing better code
- Watch the terminal: Streaming shows every tool call, V3 step, and build verification in real-time
- Use edit_file hints: For large existing files, ask for specific changes rather than full rewrites — the proxy rejects `write_file` for existing files over 100 lines
Symptom: ✗ llama-server failed to start (120s timeout)
Common causes:
- GPU not detected: Check `nvidia-smi` — driver must be installed and GPU visible
- Model file missing: Check that the GGUF model exists at the expected path (`ATLAS_MODEL_PATH` or `./models/Qwen3.5-9B-Q6_K.gguf`)
- Insufficient VRAM: The 9B Q6_K model needs ~8.2 GB VRAM. Run `nvidia-smi` to check available memory. Close other GPU processes.
- Port conflict: Another process may be using port 8080. Check with `lsof -i :8080`

Debug: Check `logs/llama-server.log` for the actual error.
Symptom: ! Lens unavailable — verification disabled
This is non-fatal. ATLAS still works but skips C(x)/G(x) scoring and Lens-based candidate selection. The V3 pipeline falls back to sandbox-only verification.
Common causes:
- Lens service failed to connect to llama-server (check `logs/geometric-lens.log`)
- Model weight files missing from the models directory (service degrades gracefully)
Symptom: ! Sandbox unavailable — code testing disabled
Non-fatal but significantly impacts quality. Without sandbox, V3 cannot verify candidates by executing them.
Common causes:
- Docker/Podman not installed (sandbox runs in a container)
- Port 30820 already in use
Check that both config files exist in the ATLAS root:
- `.aider.model.settings.yml` — model configuration
- `.aider.model.metadata.json` — token limits and cost

These are included in the repo. If they're missing, the launcher's `--model-settings-file` and `--model-metadata-file` flags will fail.
The proxy's error loop breaker triggers after 3 consecutive tool failures. This usually means:
- The model is generating truncated output (file too large for one `write_file`)
- The file doesn't exist where the model expects it
Fix: Try rephrasing your request to be more specific. For large files, ask for targeted edits rather than full rewrites.
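The breaker itself is simple state inside the Go proxy; a Python sketch of the logic (class name illustrative, threshold from the text above):

```python
class ErrorLoopBreaker:
    """Abort the agent loop after N consecutive tool failures (the proxy uses 3)."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.streak = 0  # current run of consecutive failures

    def record(self, success: bool) -> bool:
        """Record one tool result; return True when the loop should abort."""
        self.streak = 0 if success else self.streak + 1
        return self.streak >= self.limit
```

Any successful tool call resets the streak, so the breaker only fires when the model is genuinely stuck, not when it recovers after a one-off failure.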
V3 fires on T2 files (50+ lines with logic). If Phase 3 repair engages, it can take several minutes. This is normal for complex code generation.
If it's consistently slow:
- Check GPU utilization with `nvidia-smi` — should be near 100% during generation
- Ensure no other services are competing for GPU VRAM
All ports and URLs are configurable:
| Variable | Default | Used By | Purpose |
|---|---|---|---|
| `ATLAS_INFERENCE_URL` | `http://localhost:8080` | proxy, v3-service, Python CLI | llama-server endpoint |
| `ATLAS_RAG_URL` | `http://localhost:8099` | Python CLI | Geometric Lens endpoint |
| `ATLAS_LENS_URL` | `http://localhost:8099` | proxy, v3-service | Geometric Lens endpoint |
| `ATLAS_SANDBOX_URL` | `http://localhost:30820` | proxy, v3-service, Python CLI | Sandbox endpoint |
| `ATLAS_V3_URL` | `http://localhost:8070` | proxy | V3 Pipeline endpoint |
| Variable | Default | Purpose |
|---|---|---|
| `ATLAS_MODEL_NAME` | `Qwen3.5-9B-Q6_K` | Model identifier for API responses |
| `ATLAS_MODEL_FILE` | `Qwen3.5-9B-Q6_K.gguf` | GGUF filename in models directory |
| `ATLAS_MODELS_DIR` | `./models` | Host path to model weights |
| `ATLAS_CTX_SIZE` | `32768` | Context window size (tokens) |
| `ATLAS_AGENT_LOOP` | `1` | Enable agent loop in proxy (1 = on) |
| `ATLAS_PROXY_PORT` | `8090` | Proxy listening port |
| `ATLAS_V3_PORT` | `8070` | V3 service listening port |
| `ATLAS_LLAMA_PORT` | `8080` | llama-server listening port |
| `ATLAS_LENS_PORT` | `8099` | Geometric Lens listening port |
| `ATLAS_SANDBOX_PORT` | `30820` | Sandbox host port |
| `GEOMETRIC_LENS_ENABLED` | `true` | Enable/disable Lens scoring |
| Variable | Default | Purpose |
|---|---|---|
| `ATLAS_LLAMA_BIN` | `~/llama-cpp-mtp/build/bin/llama-server` | Path to llama-server binary |
| `ATLAS_MODEL_PATH` | `~/models/Qwen3.5-9B-Q6_K.gguf` | Full path to model file |
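All of these resolve the same way: the environment variable wins, otherwise the documented default applies. A quick Python sketch of that lookup (helper name illustrative):

```python
import os

def atlas_setting(name: str, default: str) -> str:
    """Resolve an ATLAS_* variable: the environment overrides the default."""
    return os.environ.get(name, default)

# Examples using the documented defaults:
inference_url = atlas_setting("ATLAS_INFERENCE_URL", "http://localhost:8080")
ctx_size = int(atlas_setting("ATLAS_CTX_SIZE", "32768"))
```

Put overrides in your `.env` (for Docker Compose) or export them in your shell (for the local proxy and bare-metal launcher) before starting the stack.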
Controls how Aider interacts with the ATLAS proxy:
```yaml
- name: openai/atlas
  edit_format: whole            # Aider sends full file content (not diffs)
  weak_model_name: openai/atlas # Use same model for all tasks
  use_repo_map: true            # Send repo structure to model
  send_undo_reply: true         # Notify model when user undoes changes
  examples_as_sys_msg: true     # Include examples in system prompt
  extra_params:
    max_tokens: 32768           # Match llama-server context window
    temperature: 0.3            # Low temp for deterministic output
  cache_control: false          # No Anthropic-style caching
  caches_by_default: false
  streaming: true               # Enable SSE streaming
  reminder: sys                 # Put reminders in system prompt
```

`.aider.model.metadata.json` tells Aider the model's token limits and cost (local = free):
```jsonc
{
  "openai/atlas": {
    "max_tokens": 32768,           // Max output tokens
    "max_input_tokens": 32768,     // Max input context
    "max_output_tokens": 32768,    // Max generation length
    "input_cost_per_token": 0,     // Free (local inference)
    "output_cost_per_token": 0,
    "litellm_provider": "openai",  // OpenAI-compatible API
    "mode": "chat"                 // Chat completion mode
  }
}
```

Both files are included in the repo and referenced by the launcher via `--model-settings-file` and `--model-metadata-file`.
