N3uralCreativity/LAI

LAI

LAI is a local-first AI orchestration platform scaffold for running and routing between multiple models, from lightweight classifiers to AirLLM-backed 70B-class execution models. The repository is intentionally structured for serious long-running workloads, disk-heavy model sharding, and future expansion into a full platform rather than a single script.

What this repository is designed to support

  • Small models for request classification, safety checks, summarization, and routing.
  • Large models for deep execution, overnight jobs, and high-quality final outputs.
  • AirLLM-backed inference for large Hugging Face models on constrained hardware.
  • A model registry and routing policy layer so the platform can choose the right model for the right phase of work.
  • Clear boundaries between product code, runtime configuration, research, evaluation, and operations.
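To make the registry and routing-policy idea concrete, here is a minimal in-memory sketch. All names, fields, and tier labels below are hypothetical illustrations, not the project's actual API.

```python
from dataclasses import dataclass

# Hypothetical sketch: class names, fields, and tiers are illustrative only.
@dataclass(frozen=True)
class ModelSpec:
    name: str          # Hugging Face id or local alias
    tier: str          # "small", "medium", or "large"
    max_context: int   # context window in tokens

class ModelRegistry:
    """Minimal in-memory catalog mapping tiers to model specs."""

    def __init__(self) -> None:
        self._models: dict[str, ModelSpec] = {}

    def register(self, spec: ModelSpec) -> None:
        self._models[spec.name] = spec

    def for_tier(self, tier: str) -> list[ModelSpec]:
        return [m for m in self._models.values() if m.tier == tier]

registry = ModelRegistry()
registry.register(ModelSpec("tiny-router", tier="small", max_context=4096))
registry.register(ModelSpec("llama-70b", tier="large", max_context=8192))
print([m.name for m in registry.for_tier("small")])  # -> ['tiny-router']
```

In the real repository this catalog would presumably live in configs/ and be loaded at startup, with the routing policy selecting among registered tiers.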

Repository map

.
|-- .github/                 GitHub issue templates, CI, reviews, ownership
|-- apps/                    Deployable applications and future service surfaces
|   |-- api/                 Control plane and external API
|   |-- web/                 Future frontend or dashboard
|   `-- worker/              Long-running local execution workers
|-- configs/                 Model catalog, routing policies, prompt assets
|-- data/                    Local-only caches, model shards, artifacts
|-- docs/                    Architecture, setup guides, runbooks, ADRs
|-- evals/                   Evaluation scenarios and saved benchmark outputs
|-- logs/                    Local runtime logs
|-- notebooks/               Exploratory research notebooks
|-- scripts/                 Bootstrap and developer automation
|-- src/lai/                 Core Python package
|-- tests/                   Unit, integration, and end-to-end validation
|-- CONTRIBUTING.md          Contribution workflow
|-- GOVERNANCE.md            Decision process and ownership model
|-- ROADMAP.md               Delivery phases and milestones
|-- SECURITY.md              Disclosure and hardening expectations
|-- SUPPORT.md               Support channels and expectations
|-- pyproject.toml           Python package and tooling entrypoint
`-- ruff.toml                Linting rules

Architecture direction

The current repository is a foundation for the following request flow:

  1. A user request is received by the API or CLI.
  2. A small routing model classifies intent, complexity, urgency, and safety needs.
  3. The orchestration layer chooses an execution tier from the routing policy.
  4. The selected runtime executes the task:
    • small or medium model for fast tasks
    • AirLLM-backed large model for heavyweight generation, reasoning, or overnight jobs
  5. The platform stores artifacts, logs, and evaluation traces for later review.

This matches the project goal of spending cheap compute on planning and reserving the biggest models for the parts of the work that truly benefit from them.

Prerequisites

Recommended baseline

  • Windows 11 or Linux
  • Python 3.11 for project tooling stability
  • Git and GitHub CLI
  • NVIDIA drivers and CUDA-capable GPU when using large local inference
  • 32 GB RAM minimum for comfortable local experimentation
  • Large free disk budget for model downloads and AirLLM layer shards
  • Hugging Face account and token for gated models

AirLLM-specific notes

  • Install AirLLM separately with pip install airllm.
  • AirLLM can split a model into layer shards during first use, so the Hugging Face cache and shard directory must have substantial free disk space.
  • Optional compression support may require bitsandbytes.
  • CPU inference is possible, but the large-model path is designed around patience rather than interactivity.
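One way to keep the large-model path swappable is a thin runtime adapter. The interface and class names below are hypothetical, and the actual AirLLM call appears only in a comment, since running it triggers a multi-gigabyte download and disk-heavy layer sharding.

```python
from typing import Protocol

class Runtime(Protocol):
    """Hypothetical adapter interface; not the project's actual API."""
    def generate(self, prompt: str, max_new_tokens: int) -> str: ...

class AirLLMRuntime:
    """Defers loading so shard creation (disk-heavy) happens on first use."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self._model = None

    def generate(self, prompt: str, max_new_tokens: int) -> str:
        if self._model is None:
            # Real loading would look roughly like this (not run here):
            # from airllm import AutoModel
            # self._model = AutoModel.from_pretrained(self.model_id)
            raise RuntimeError("airllm is not loaded in this sketch")
        raise NotImplementedError

class EchoRuntime:
    """Trivial stand-in for wiring tests before any weights exist."""
    def generate(self, prompt: str, max_new_tokens: int) -> str:
        return prompt[:max_new_tokens]

print(EchoRuntime().generate("hello world", 5))  # -> hello
```

An adapter like this lets the orchestration layer and tests run against EchoRuntime on any machine, while the AirLLM-backed runtime is only exercised on hardware with the disk and GPU budget for it.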

Quickstart

The commands below assume PowerShell on Windows; on Linux, use python3.11 -m venv .venv and source .venv/bin/activate instead of the py launcher and Activate.ps1.

git clone <your-repo-url>
cd LAI
py -3.11 -m venv .venv
.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
Copy-Item .env.example .env
python -m lai.cli doctor

To add the large-model runtime later:

python -m pip install -e ".[dev,api]"
python -m pip install airllm

Initial GitHub rules encoded in this repo

  • Pull request template and issue forms for consistent planning.
  • CI that runs linting and unit tests on every push and pull request.
  • CODEOWNERS so review responsibility is explicit from day one.
  • SECURITY.md, SUPPORT.md, and contribution guidance for a public-ready repository.
  • Dependabot updates for Python and GitHub Actions.

Near-term priorities

  1. Implement the model registry and routing engine under src/lai/.
  2. Add the first AirLLM runtime adapter and smoke-test workflows.
  3. Introduce an API surface in apps/api.
  4. Add evaluation scenarios that compare small-model routing against large-model final execution.
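Priority 4 could start as a tiny harness that replays the same scenarios against both strategies and records how often the small tier would have sufficed. The scenario data and function below are invented purely for illustration.

```python
# Illustrative eval sketch: scenarios labeled with whether the small tier
# was judged sufficient. Data and names are invented for illustration.
SCENARIOS = [
    {"prompt": "classify this email", "small_sufficient": True},
    {"prompt": "write a detailed migration plan", "small_sufficient": False},
]

def small_tier_hit_rate(scenarios: list[dict]) -> float:
    """Fraction of scenarios the small tier could have handled alone."""
    hits = sum(1 for s in scenarios if s["small_sufficient"])
    return hits / len(scenarios)

print(small_tier_hit_rate(SCENARIOS))  # -> 0.5
```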
