LAI is a local-first AI orchestration platform scaffold for running and routing between multiple models, from lightweight classifiers to AirLLM-backed 70B-class execution models. The repository is intentionally structured for serious long-running workloads, disk-heavy model sharding, and future expansion into a full platform rather than a single script.
- Small models for request classification, safety checks, summarization, and routing.
- Large models for deep execution, overnight jobs, and high-quality final outputs.
- AirLLM-backed inference for large Hugging Face models on constrained hardware.
- A model registry and routing policy layer so the platform can choose the right model for the right phase of work.
- Clear boundaries between product code, runtime configuration, research, evaluation, and operations.
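The registry and routing-policy layer is still to be built under `src/lai/`, but the idea can be sketched in a few lines. Everything here is hypothetical illustration, not the project's actual API: the class names, tier labels, and `backend` field are assumptions.

```python
from dataclasses import dataclass

# Hypothetical sketch of a model catalog keyed by execution tier.
# The real src/lai/ implementation may look quite different.

@dataclass(frozen=True)
class ModelEntry:
    name: str     # Hugging Face model id or local alias
    tier: str     # "small", "medium", or "large"
    backend: str  # e.g. "transformers" or "airllm" (assumed labels)

class ModelRegistry:
    def __init__(self) -> None:
        self._models: dict[str, ModelEntry] = {}

    def register(self, entry: ModelEntry) -> None:
        self._models[entry.name] = entry

    def for_tier(self, tier: str) -> list[ModelEntry]:
        return [m for m in self._models.values() if m.tier == tier]

registry = ModelRegistry()
registry.register(ModelEntry("tiny-router", tier="small", backend="transformers"))
registry.register(ModelEntry("llama-70b", tier="large", backend="airllm"))
print([m.name for m in registry.for_tier("large")])  # -> ['llama-70b']
```

A routing policy then only needs to map a request classification to a tier and ask the registry for candidates.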
```
.
|-- .github/          GitHub issue templates, CI, reviews, ownership
|-- apps/             Deployable applications and future service surfaces
|   |-- api/          Control plane and external API
|   |-- web/          Future frontend or dashboard
|   `-- worker/       Long-running local execution workers
|-- configs/          Model catalog, routing policies, prompt assets
|-- data/             Local-only caches, model shards, artifacts
|-- docs/             Architecture, setup guides, runbooks, ADRs
|-- evals/            Evaluation scenarios and saved benchmark outputs
|-- logs/             Local runtime logs
|-- notebooks/        Exploratory research notebooks
|-- scripts/          Bootstrap and developer automation
|-- src/lai/          Core Python package
|-- tests/            Unit, integration, and end-to-end validation
|-- CONTRIBUTING.md   Contribution workflow
|-- GOVERNANCE.md     Decision process and ownership model
|-- ROADMAP.md        Delivery phases and milestones
|-- SECURITY.md       Disclosure and hardening expectations
|-- SUPPORT.md        Support channels and expectations
|-- pyproject.toml    Python package and tooling entrypoint
`-- ruff.toml         Linting rules
```
The current repository is a foundation for the following request flow:
- A user request is received by the API or CLI.
- A small routing model classifies intent, complexity, urgency, and safety needs.
- The orchestration layer chooses an execution tier from the routing policy.
- The selected runtime executes the task:
  - a small or medium model for fast tasks
  - an AirLLM-backed large model for heavyweight generation, reasoning, or overnight jobs
- The platform stores artifacts, logs, and evaluation traces for later review.
This matches the project goal of spending cheap compute on planning and reserving the biggest models for the parts of the work that truly benefit from them.
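The tier-selection step in that flow can be sketched as a small pure function. The field names and thresholds below are illustrative assumptions, not the project's routing policy:

```python
# Illustrative tier selection: maps the routing model's classification of a
# request to an execution tier. Thresholds and semantics are assumptions.

def choose_tier(complexity: float, urgent: bool, needs_quality: bool) -> str:
    """Return "small", "medium", or "large" for a classified request."""
    if needs_quality and not urgent:
        return "large"   # reserve the biggest models for work that benefits
    if complexity < 0.3:
        return "small"   # cheap classification/summarization territory
    if urgent:
        return "medium"  # fast enough, still reasonably capable
    return "large"

print(choose_tier(complexity=0.1, urgent=True, needs_quality=False))  # -> small
print(choose_tier(complexity=0.8, urgent=False, needs_quality=True))  # -> large
```

Keeping this a pure function makes the policy trivial to unit-test and to swap for a config-driven version later.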
- Windows 11 or Linux
- Python 3.11 for project tooling stability
- Git and GitHub CLI
- NVIDIA drivers and CUDA-capable GPU when using large local inference
- 32 GB RAM minimum for comfortable local experimentation
- Large free disk budget for model downloads and AirLLM layer shards
- Hugging Face account and token for gated models
- Install AirLLM separately with `pip install airllm`.
- AirLLM can split a model into layer shards during first use, so the Hugging Face cache and shard directory must have substantial free disk space.
- Optional compression support may require `bitsandbytes`.
- CPU inference is possible, but the large-model path is designed around patience rather than interactivity.
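Because a shard download that runs out of disk midway is painful to recover from, it can be worth checking free space before the first large pull. A minimal stdlib sketch; the default threshold is an arbitrary illustration, tune it for the model you intend to shard:

```python
import shutil
from pathlib import Path

def check_shard_space(cache_dir: str, required_gb: float = 200.0) -> bool:
    """Return True if cache_dir has at least required_gb of free disk space."""
    path = Path(cache_dir).expanduser()
    path.mkdir(parents=True, exist_ok=True)  # disk_usage needs an existing path
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < required_gb:
        print(f"warning: only {free_gb:.0f} GB free in {path}, "
              f"need ~{required_gb:.0f} GB")
        return False
    return True

# Example: check the default Hugging Face cache location before downloading.
check_shard_space("~/.cache/huggingface", required_gb=200.0)
```

The same check fits naturally into a pre-flight command before kicking off an overnight job.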
```powershell
git clone <your-repo-url>
cd LAI
py -3.11 -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -e .[dev]
Copy-Item .env.example .env
python -m lai.cli doctor
```

To add the large-model runtime later:

```powershell
python -m pip install -e .[dev,api]
python -m pip install airllm
```

- Pull request template and issue forms for consistent planning.
- CI that runs linting and unit tests on every push and pull request.
- `CODEOWNERS` so review responsibility is explicit from day one.
- `SECURITY.md`, `SUPPORT.md`, and contribution guidance for a public-ready repository.
- Dependabot updates for Python and GitHub Actions.
- Implement the model registry and routing engine under `src/lai/`.
- Add the first AirLLM runtime adapter and smoke-test workflows.
- Introduce an API surface in `apps/api`.
- Add evaluation scenarios that compare small-model routing against large-model final execution.
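The routing-versus-final-execution comparison could start as simple paired records saved under `evals/`. A hypothetical sketch of such a trace; the schema and field names are assumptions, not a defined project format:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical eval trace: pairs one scenario with outputs from two tiers so
# saved runs can be diffed later. Not the project's actual schema.

@dataclass
class EvalTrace:
    scenario: str
    routed_tier: str
    small_output: str
    large_output: str

trace = EvalTrace(
    scenario="summarize support ticket",
    routed_tier="small",
    small_output="short summary",
    large_output="detailed summary",
)
print(json.dumps(asdict(trace), indent=2))
```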
- AirLLM quickstart: https://github.com/lyogavin/airllm?tab=readme-ov-file#quickstart
- AirLLM requirements snapshot: https://raw.githubusercontent.com/lyogavin/airllm/main/requirements.txt