A declarative standard for describing what an AI model needs to run — and how it should behave in production.
When a model moves from development to production, critical knowledge gets lost:
- What GPU does it actually need?
- What's the max batch size before latency degrades?
- How many replicas does it need at peak?
- What compliance rules apply to its inputs and outputs?
This knowledge lives in someone's head, a Slack thread, or a runbook nobody reads. When something breaks at 2am, nobody knows what "correct" looks like.
ModelSpec makes this knowledge explicit, machine-readable, and version-controlled.
A minimal ModelSpec, covering only model identity and GPU requirement:

```yaml
apiVersion: piqc.ai/v1alpha1
kind: ModelSpec
metadata:
  name: minimal-llm
spec:
  identity:
    model:
      id: example-llm
      family: llama
      task: text-generation
      framework: transformers
  runtime:
    accelerator:
      vendor: nvidia
      type: a10
      count: 1
```

A production ModelSpec, covering the full operational contract:

```yaml
apiVersion: piqc.ai/v1alpha1
kind: ModelSpec
metadata:
  name: llama-2-70b-chat-prod
  version: "2025-01"
  description: Production chat LLM for customer support
  labels:
    team: ml-platform
    environment: prod
spec:
  identity:
    model:
      id: llama-2-70b-chat
      family: llama-2
      task: chat-completion
      framework: vllm
      precision: fp16
  runtime:
    accelerator:
      vendor: nvidia
      type: a100-80gb
      count: 4
    batch:
      maxBatchSize: 64
      maxSequenceLengthTokens: 4096
  operations:
    serving:
      protocol: http
      port: 8000
      maxConcurrency: 32
      timeoutSeconds: 60
    scaling:
      minReplicas: 2
      maxReplicas: 10
      targetLatencyMsP95: 800
      targetRps: 50
  governance:
    compliance:
      pii:
        allowed: false
        policy: internal-pii-policy-v3
    retention:
      logsDays: 30
```

| Section | What it declares |
|---|---|
| `identity` | Model family, task, framework, precision, artifact locations |
| `runtime` | GPU type, count, batch limits, sequence length, memory |
| `operations.serving` | Protocol, port, concurrency, health probes, timeouts |
| `operations.scaling` | Min/max replicas, latency targets, RPS targets |
| `operations.observability` | Metrics, logging, tracing expectations |
| `pipeline` | Dependencies: guardrails, embeddings, RAG components |
| `governance` | PII policy, data retention, compliance rules |

Not all fields are required. ModelSpec is designed to grow with your deployment's maturity.
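
Some relationships in a contract like the production example span more than one field; for instance, `minReplicas` should not exceed `maxReplicas`. The Python sketch below shows what such cross-field checks might look like. It assumes the spec has already been parsed into a plain dict, that `scaling` sits under `spec.operations`, and that `accelerator` sits under `spec.runtime`. The function `check_consistency` and the two rules it applies are illustrative assumptions, not rules defined by the ModelSpec schema or the repository's validator.

```python
def check_consistency(doc):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper: the rules below are illustrative, not part of ModelSpec.
    Sections that are absent are skipped, since not all fields are required.
    """
    spec = doc.get("spec", {})
    problems = []

    # Rule 1 (assumed): the replica floor must not exceed the replica ceiling.
    scaling = spec.get("operations", {}).get("scaling", {})
    lo, hi = scaling.get("minReplicas"), scaling.get("maxReplicas")
    if lo is not None and hi is not None and lo > hi:
        problems.append(f"minReplicas ({lo}) exceeds maxReplicas ({hi})")

    # Rule 2 (assumed): a declared accelerator count must be at least 1.
    count = spec.get("runtime", {}).get("accelerator", {}).get("count")
    if count is not None and count < 1:
        problems.append(f"accelerator count must be at least 1, got {count}")

    return problems


prod = {
    "spec": {
        "runtime": {"accelerator": {"vendor": "nvidia", "type": "a100-80gb", "count": 4}},
        "operations": {"scaling": {"minReplicas": 2, "maxReplicas": 10}},
    }
}
print(check_consistency(prod))  # []

bad = {"spec": {"operations": {"scaling": {"minReplicas": 5, "maxReplicas": 3}}}}
print(check_consistency(bad))  # ['minReplicas (5) exceeds maxReplicas (3)']
```

Because every lookup falls back to an empty dict, a minimal spec with no `operations` section passes without errors rather than being rejected for missing fields.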

```shell
python3 -m venv .venv
source .venv/bin/activate
pip install -r tooling/validator/requirements.txt
python tooling/validator/validate.py --schema schema/modelspec.v0.1.json examples/
```

The `examples/` directory has a progressive set of specs from minimal to full production:
| Example | What it adds |
|---|---|
| `00-minimal` | Model identity + GPU requirement |
| `01-artifacts` | Weight and tokenizer locations |
| `02-serving` | HTTP interface, health probes |
| `03-batching` | Batch size and sequence constraints |
| `04-scaling` | Replica targets, latency SLOs |
| `05-observability` | Metrics, logs, tracing |
| `06-dependencies-rag` | RAG pipeline with model dependencies |
| `07-governance-minimal` | PII policy, data retention |
| `08-full-production` | Complete production contract |

Start with 00 and work down — each example builds on the previous one.
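
One way to see where a given spec sits on this progression is to list which sections it declares. The Python sketch below is a minimal illustration. It assumes the spec has already been parsed into a plain dict (for example by a YAML loader) and that the section names `identity`, `runtime`, `operations.*`, `pipeline`, and `governance` live under the top-level `spec` key. `SECTIONS` and `declared_sections` are hypothetical helpers, not part of the repository's tooling.

```python
# Section paths as named in this README; assumed to sit under the top-level `spec` key.
SECTIONS = (
    "identity",
    "runtime",
    "operations.serving",
    "operations.scaling",
    "operations.observability",
    "pipeline",
    "governance",
)


def declared_sections(doc):
    """Return the section paths that are present in a parsed ModelSpec dict."""
    spec = doc.get("spec", {})
    found = []
    for path in SECTIONS:
        node = spec
        # Walk the dotted path one key at a time; stop as soon as a key is missing.
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is not None:
            found.append(path)
    return found


minimal = {
    "apiVersion": "piqc.ai/v1alpha1",
    "kind": "ModelSpec",
    "metadata": {"name": "minimal-llm"},
    "spec": {
        "identity": {"model": {"id": "example-llm", "family": "llama"}},
        "runtime": {"accelerator": {"vendor": "nvidia", "type": "a10", "count": 1}},
    },
}

print(declared_sections(minimal))  # ['identity', 'runtime']
```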
ModelSpec is one layer in a three-part system:
- Knowledge Base: what should be true (best practices, GPU compatibility)
- ModelSpec: what was intended (declared model contract) ← this repo
- piqc scan: what is actually running (runtime inspection)
Used alone, ModelSpec is a documentation and validation standard. Paired with piqc, it becomes the basis for detecting drift between what a model was declared to need and what it's actually running on.
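
The drift comparison described above can be sketched in a few lines of Python. This is an illustration only: the shape of the `observed` dict is an assumption (the output format of `piqc scan` is not described here), and `flatten` and `find_drift` are hypothetical helpers rather than piqc APIs.

```python
def flatten(node, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def find_drift(declared, observed):
    """Return {path: (declared_value, observed_value)} for every declared field
    whose observed value is missing or different."""
    seen = flatten(observed)
    drift = {}
    for path, want in flatten(declared).items():
        got = seen.get(path)  # None when the field was not observed at all
        if got != want:
            drift[path] = (want, got)
    return drift


declared = {"runtime": {"accelerator": {"vendor": "nvidia", "type": "a100-80gb", "count": 4}}}
# Hypothetical runtime observation; not real scanner output.
observed = {"runtime": {"accelerator": {"vendor": "nvidia", "type": "a100-40gb", "count": 4}}}

print(find_drift(declared, observed))
# {'runtime.accelerator.type': ('a100-80gb', 'a100-40gb')}
```

Only declared fields are compared, so extra details reported at runtime do not count as drift; that choice matches the idea that the spec states intent and everything else is unconstrained.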
- `schema/`: ModelSpec JSON schema (v0.1)
- `examples/`: Validated example ModelSpecs (00 through 08)
- `tooling/`: Validator and supporting tools
- `docs/`: Versioning guide and reference documentation
ModelSpec is an open standard. Contributions are welcome — new fields, new examples, validator improvements, or corrections.
- Read the Contributing Guide
- Check open Issues
- Join the discussion on GitHub Discussions
This repository targets ModelSpec v0.1 (v1alpha1). See the versioning guide for schema version and compatibility details.
Apache License 2.0 — see LICENSE for details.
