Local-first TUI tool that takes you from zero to a fully working local LLM stack.
Async, cross-platform, zero-overhead. Production-grade Rust.
LocalForge is a terminal-based application that guides you through:
- Setup Flow (Wizard): Hardware detection → Backend selection → llama.cpp build → Model download
- Main Workspace: Chat interface with streaming + Settings for server exposure
All steps are resumable and expertise-aware (Beginner/Intermediate/Expert).
- Hardware Detection — Auto-detect CPU, GPU (NVIDIA/AMD/Apple), VRAM, CUDA, ROCm, WSL
- Optimal Backend Selection — Ranked recommendations (CUDA, Metal, ROCm, Vulkan, CPU)
- Automated llama.cpp Build — Async build with tuned CMake flags per backend
- Optional llama-swap — Hot-swap models without restarting server
- Curated Model Registry — Hardware-fitted GGUF recommendations
- HuggingFace Search — Async search with GGUF filtering
- Robust Downloads — Async downloads with resume support
- Chat Interface — Streaming token generation, context meter, history
- Model Hot-Swap — Switch models instantly (if llama-swap installed)
- HTTPS Server — OpenAI-compatible API with API key authentication
- KV Cache Advisor — Intelligent slot/VRAM management for 5-10 concurrent users
- Expertise-Aware UI — Dynamic tooltips based on user level
- Async Runtime — Built on Tokio for responsive I/O
- Structured Logging — Tracing with rotating file appenders
- Typed Errors — Thiserror for robust error handling
- Retry Logic — Exponential backoff for all network ops
- Minimal Dependencies — Single binary, no heavy runtimes
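The retry behavior above can be sketched as follows. This is an illustrative model of exponential backoff with jitter, not LocalForge's actual implementation; the random factor is passed in as a parameter so the sketch stays dependency-free and deterministic:

```rust
use std::time::Duration;

/// Illustrative retry policy; mirrors the shape of the `RetryPolicy`
/// struct shown in the architecture section below.
pub struct RetryPolicy {
    pub max_attempts: u32,
    pub initial_backoff: Duration,
    pub backoff_multiplier: f64,
    pub jitter: bool,
}

impl RetryPolicy {
    /// Delay before retry `attempt` (0-based): initial * multiplier^attempt,
    /// optionally scaled by a factor in [0.5, 1.0] when jitter is on.
    /// `rand01` stands in for a random draw in [0.0, 1.0].
    pub fn backoff(&self, attempt: u32, rand01: f64) -> Duration {
        let base = self.initial_backoff.as_secs_f64()
            * self.backoff_multiplier.powi(attempt as i32);
        let scaled = if self.jitter { base * (0.5 + 0.5 * rand01) } else { base };
        Duration::from_secs_f64(scaled)
    }
}

fn main() {
    let policy = RetryPolicy {
        max_attempts: 4,
        initial_backoff: Duration::from_millis(250),
        backoff_multiplier: 2.0,
        jitter: false,
    };
    // Without jitter: 250ms, 500ms, 1s, 2s.
    for attempt in 0..policy.max_attempts {
        println!("attempt {attempt}: wait {:?}", policy.backoff(attempt, 0.0));
    }
}
```

Jitter spreads out retries from many clients so they don't hammer a recovering server in lockstep.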
Before building llama.cpp, you need Git, CMake, and a C/C++ compiler.

Linux (Ubuntu/Debian):

```sh
sudo apt update
sudo apt install git cmake build-essential pkg-config libssl-dev
```

Linux (Fedora):

```sh
sudo dnf install git cmake gcc gcc-c++ make
```

macOS:

```sh
xcode-select --install
brew install cmake pkg-config
```

Windows:

```sh
winget install Git.Git Kitware.CMake
# Install Visual Studio Build Tools (C++ workload)
```

Optional GPU backend SDKs:
- CUDA (NVIDIA): NVIDIA CUDA Toolkit
- HIP (AMD): AMD ROCm
- Vulkan: Vulkan SDK
One-line install:

```sh
curl -fsSL https://raw.githubusercontent.com/madbeast-16/LocalForge/localforge-v1/install.sh | bash
```

The install script auto-detects your OS/architecture, downloads the v0.1.5 binary from GitHub Releases, and falls back to building from source if needed.

You can control the installation:

```sh
# Install to a specific directory
INSTALL_DIR=/usr/local/bin curl -fsSL https://raw.githubusercontent.com/madbeast-16/LocalForge/localforge-v1/install.sh | bash

# Install a specific version
VERSION=v0.1.5 curl -fsSL https://raw.githubusercontent.com/madbeast-16/LocalForge/localforge-v1/install.sh | bash
```

Pre-built binaries are available at: https://github.com/madbeast-16/LocalForge/releases/tag/v0.1.5

Supported platforms:
- `x86_64-unknown-linux-gnu` (Linux x86_64)
- `aarch64-unknown-linux-gnu` (Linux ARM64)
- `x86_64-apple-darwin` (macOS Intel)
- `aarch64-apple-darwin` (macOS Apple Silicon)
- `x86_64-pc-windows-msvc` (Windows)
To build from source:

```sh
git clone https://github.com/madbeast-16/LocalForge
cd LocalForge
cargo build --release
```

The binary will be at `target/release/localforge` (~3-4 MB). Alternatively, install with Cargo:

```sh
cargo install --path .
```

Then launch it:

```sh
localforge
```

The TUI guides you through the complete setup flow.
Headless mode:

```sh
localforge --headless build --cuda   # Force CUDA build
localforge --headless build --metal  # Force Metal build
```

Detect hardware:

```sh
localforge detect
```

Show model recommendations:

```sh
localforge models        # Hardware-fitted
localforge models --all  # Show all curated
```

Search HuggingFace:

```sh
localforge search "llama 3"
```

Build llama.cpp:

```sh
localforge build           # Auto-detect
localforge build --cuda    # Force CUDA
localforge build --vulkan  # Force Vulkan
```

Configuration:

```sh
localforge config show
localforge config --init  # Write default config
```

Location: `~/.config/localforge/config.toml`
```toml
install_prefix = "~/.local"
models_dir = "~/.local/share/localforge/models"
headless = false
# backend = "Cuda"
# parallel_jobs = 8
```

Clean, layered module structure:
```
src/
├── main.rs      # Entry point, CLI args
├── lib.rs       # Library root
├── app.rs       # App state & routing
├── config/      # Config/state persistence
├── error.rs     # Typed errors + retry
├── logger.rs    # Tracing setup
├── hardware/    # Hardware detection
├── build/       # llama.cpp pipeline
├── models/      # Model discovery & download
├── inference/   # Inference engines
├── server/      # HTTP API server
├── compute/     # KV cache calculator
└── tui/         # Terminal UI
```
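The `compute/` module is described as a KV cache calculator. A back-of-the-envelope sketch of that sizing arithmetic follows; the struct, field names, and model shape here are hypothetical, not taken from LocalForge's code:

```rust
/// Illustrative KV-cache sizing. Per token, the K and V caches each store
/// layers × kv_heads × head_dim elements, so total bytes scale linearly
/// with context length and with the number of concurrent slots.
struct KvCacheSpec {
    n_layers: u64,
    n_kv_heads: u64,
    head_dim: u64,
    bytes_per_elem: u64, // 2 for an f16 KV cache
}

impl KvCacheSpec {
    /// Total KV-cache bytes for `slots` concurrent users, each with `ctx` tokens.
    fn total_bytes(&self, ctx: u64, slots: u64) -> u64 {
        // Factor of 2: one cache for K, one for V.
        2 * self.n_layers * self.n_kv_heads * self.head_dim
            * self.bytes_per_elem * ctx * slots
    }
}

fn main() {
    // Hypothetical 8B-class shape: 32 layers, 8 KV heads (GQA), head_dim 128, f16.
    let spec = KvCacheSpec { n_layers: 32, n_kv_heads: 8, head_dim: 128, bytes_per_elem: 2 };
    let bytes = spec.total_bytes(4096, 8); // 4k context, 8 concurrent slots
    println!("KV cache: {:.2} GiB", bytes as f64 / (1024.0 * 1024.0 * 1024.0));
}
```

With this shape, 8 slots at 4k context consume 4 GiB of VRAM for KV cache alone, on top of the model weights; budgeting that against detected VRAM is the kind of trade-off the advisor manages for 5-10 concurrent users.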
AppState: Central state with expertise level and persistence

```rust
struct AppState {
    expertise: ExpertiseLevel, // Beginner/Intermediate/Expert
    setup_complete: bool,
    build_complete: bool,
    downloaded_models: Vec<String>,
    server_running: bool,
}
```

RetryPolicy: Exponential backoff with jitter
```rust
pub struct RetryPolicy {
    pub max_attempts: u32,
    pub initial_backoff: Duration,
    pub backoff_multiplier: f64,
    pub jitter: bool,
}
```

InferenceEngine: Pluggable inference backend
```rust
#[async_trait]
pub trait InferenceEngine: Send + Sync {
    async fn load_model(&mut self, model: &ModelEntry) -> Result<()>;
    async fn generate(&mut self, session: &ChatSession) -> Result<String>;
}
```

Build, test, and generate docs:

```sh
cargo build --release
cargo test
cargo doc --no-deps --open
```

| Platform | Status | Notes |
|---|---|---|
| Linux x86_64 | ✅ Full | Primary target |
| Linux ARM64 | ✅ Full | Tested on AWS Graviton |
| macOS Intel | ✅ Full | Tested |
| macOS Apple Silicon | ✅ Full | Native Metal support |
| Windows x86_64 | ✅ Full | Native + WSL |
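To illustrate the pluggable-engine design, here is a minimal stand-in engine. This is a simplified synchronous analogue of the async `InferenceEngine` trait shown earlier; `EchoEngine` and the bodies of `ModelEntry` and `ChatSession` are hypothetical, not LocalForge's real types:

```rust
// Simplified stand-ins for the project's model and session types.
struct ModelEntry { name: String }
struct ChatSession { last_user_message: String }

// Synchronous analogue of the async InferenceEngine trait.
trait Engine {
    fn load_model(&mut self, model: &ModelEntry) -> Result<(), String>;
    fn generate(&mut self, session: &ChatSession) -> Result<String, String>;
}

/// Toy engine that echoes the prompt; a real engine would drive llama.cpp.
struct EchoEngine { loaded: Option<String> }

impl Engine for EchoEngine {
    fn load_model(&mut self, model: &ModelEntry) -> Result<(), String> {
        self.loaded = Some(model.name.clone());
        Ok(())
    }
    fn generate(&mut self, session: &ChatSession) -> Result<String, String> {
        let model = self.loaded.as_ref().ok_or("no model loaded")?;
        Ok(format!("[{model}] {}", session.last_user_message))
    }
}

fn main() {
    let mut engine = EchoEngine { loaded: None };
    engine.load_model(&ModelEntry { name: "demo.gguf".into() }).unwrap();
    let reply = engine
        .generate(&ChatSession { last_user_message: "hi".into() })
        .unwrap();
    println!("{reply}"); // [demo.gguf] hi
}
```

Because callers depend only on the trait, a llama.cpp-backed engine and a llama-swap-backed engine can be swapped without touching the chat UI.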
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Run `cargo fmt` and `cargo clippy`
- Submit a PR
- Binary Size: < 12 MB (release, stripped)
- Idle RAM: < 30 MB
- Zero Dependencies: Single static binary
MIT
Built with: