
itigges22/ATLAS


A.T.L.A.S.

Adaptive Test-time Learning and Autonomous Specialization


🌎 What is ATLAS?

ATLAS is a self-hosted coding assistant built on intelligent inference infrastructure. You point it at an open-weight model running locally, and it turns that model into something that competes with frontier systems, with no fine-tuning, no API calls, and no cloud in between.

Instead of training a larger model or routing to a hosted one, ATLAS wraps a frozen local model in a pipeline that plans before generating, verifies its own output against constraints it extracts from the problem, scores candidates with an energy-based lens, and repairs failures through self-generated test feedback. The weights never change. The intelligence lives in the scaffolding around them.
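The plan/verify/score/repair loop described above can be sketched roughly as follows. This is an illustrative toy, not ATLAS's actual API: every helper name here (extract_constraints, run_self_tests, energy, solve) is a hypothetical stand-in, and the real pipeline's constraint extraction, sandboxed verification, and energy-based lens are far richer.

```python
"""Toy sketch of test-time scaffolding around a frozen model."""

def extract_constraints(prompt):
    # Toy constraint extraction: words marked with '!' must appear in the output.
    return [w.strip("!") for w in prompt.split() if w.startswith("!")]

def run_self_tests(candidate, constraints):
    # Toy verification: report every constraint the candidate fails to satisfy.
    return [c for c in constraints if c not in candidate]

def energy(candidate, failures):
    # Toy energy score: fewer failures and shorter code means lower energy.
    return len(failures) * 10 + len(candidate) / 100

def solve(prompt, generate, max_rounds=3):
    """Plan -> generate -> verify -> score -> repair; the model stays frozen."""
    constraints = extract_constraints(prompt)
    feedback = ""
    best, best_energy = None, float("inf")
    for _ in range(max_rounds):
        candidate = generate(prompt + feedback)       # frozen model call
        failures = run_self_tests(candidate, constraints)
        e = energy(candidate, failures)
        if e < best_energy:                           # keep the lowest-energy candidate
            best, best_energy = candidate, e
        if not failures:                              # verified: stop early
            break
        feedback = f"\nRepair: the output must mention {failures}"
    return best
```

The key property, as in ATLAS, is that all improvement happens in the loop around the model, not in its weights.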

The result is a serious coding assistant that runs on a single consumer GPU for fractions of a cent per task. Nothing leaves your machine, no vendor can pull the model out from under you, and the entire stack is open source. One model, one GPU, no one else's infrastructure in the loop.



🧱 What ATLAS Does

  1. atlas-proxy - Go-based agent loop that orchestrates the entire system.
  2. V3 Pipeline - multi-phase code generation that turns a single prompt into verified, high-quality output.
  3. Geometric Lens - energy-based scoring and retrieval without external oracles. (What is a "Geometric Lens"?)
  4. Sandbox - isolated execution environment for build verification.
     a. Multi-language execution - Python, Rust, Go, C, Shell, and more
     b. Compilation and linting - syntax verification before scoring
     c. Test running - executes generated and existing test suites
  5. llama-server - local LLM inference on a single consumer GPU.
     a. CUDA acceleration - quantized model inference (Q6_K / Q4_K_M)
     b. Grammar-constrained decoding - structured output at the token level
     c. Self-embeddings - embedding extraction without a separate model
  6. Interactive CLI - type atlas in any project directory and start building.
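As a concrete illustration of grammar-constrained decoding, the sketch below builds a request body for llama-server's /completion endpoint with a GBNF grammar. The endpoint and its "grammar" field exist in llama.cpp's server, but the grammar, URL, and parameter values here are illustrative assumptions - check the docs for your build.

```python
"""Sketch: constraining llama-server output with a GBNF grammar."""
import json

# Illustrative GBNF grammar that only admits a JSON object like {"answer": 42}.
GRAMMAR = r'''
root ::= "{" ws "\"answer\"" ws ":" ws int ws "}"
int  ::= [0-9]+
ws   ::= [ \t\n]*
'''

def completion_request(prompt, grammar=GRAMMAR, n_predict=64):
    """Build the JSON body for POST http://localhost:8080/completion."""
    return json.dumps({
        "prompt": prompt,
        "grammar": grammar,      # token-level constraint: output must match the GBNF
        "n_predict": n_predict,  # cap on generated tokens
        "temperature": 0.2,
    })

# Usage (requires a running llama-server), e.g. with the requests library:
#   body = completion_request("What is 6 * 7? Answer as JSON.")
#   r = requests.post("http://localhost:8080/completion", data=body)
#   print(r.json()["content"])
```

Because the constraint is enforced at the token level during sampling, malformed output is impossible by construction rather than filtered after the fact.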

Full documentation - setup guides, architecture, configuration, troubleshooting, and benchmark reports - lives in the docs/ directory.


🚀 Get Started

ATLAS requires a GPU with 16GB+ VRAM, Docker (with nvidia-container-toolkit) or Podman, and Python 3.9+. It is currently tested only on NVIDIA GPUs, though it is not NVIDIA-specific; ROCm support for AMD GPUs is on the roadmap. See SETUP.md for full installation instructions covering Docker Compose, bare-metal, and K3s deployment. Once running, type atlas in any project directory and start building.


⚠️ Known Limitations

  • Tested on NVIDIA only - ATLAS uses llama.cpp for inference, which supports multiple accelerator backends. ROCm support is a V3.1 priority.
  • 9B model not formally benchmarked - the CLI ships Qwen3.5-9B with the full V3 pipeline, but formal LiveCodeBench scores are from the 14B model. 9B benchmarks are V3.1 work.
  • Complex feature additions can fail - adding features to existing projects succeeds ~67% of the time. The model sometimes over-explores instead of writing code.
  • Grammar-constrained inference speed - ~51 tok/s on llama-server. Faster grammar integration is planned for V3.1.

🗺️ Roadmap

V3.0.1 - Current release. Interactive CLI, Docker Compose deployment, V3 pipeline integration.

V3.1 - In progress.

  • ROCm support - AMD GPU inference via llama.cpp ROCm backend
  • Formal 9B benchmarks - LiveCodeBench, GPQA Diamond, SciCode on Qwen3.5-9B
  • CLI reliability - expanded testing, targeting L6 ≥ 90%
  • Grammar speed - C-side sampler chain for faster constrained decoding

🤝 Contributing

We're building ATLAS in the open, and we're actively looking for contributors and core maintainers. Whether you're fixing a bug, adding accelerator support, or rethinking a whole subsystem, there's a place for you here. If you believe open models deserve better infrastructure, come build with us.

Found a bug or hit a wall? Open an issue - you don't need to submit a fix. Bug reports and feedback help just as much as code.

See CONTRIBUTING.md for guidelines.


📄 License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
