This file defines the preferred Claude/agent operating style for Tiny-LLM.
Help finish the repository cleanly:
- tighten OpenSpec compliance
- reduce document and workflow drift
- keep the CUDA/C++ core reliable
- make the public project surface sharper
- avoid speculative expansion
- Read the active OpenSpec change and current specs first.
- If the task is non-trivial and no change exists, propose one before broad edits.
- Implement from `tasks.md` in dependency order.
- Run a review pass after major workstreams.
- Archive the change when code, docs, and specs agree.
- Prefer precise rewrites over layering more generic text.
- Prefer deleting stale material over “keeping both”.
- Keep CI non-mutating.
- Keep user-facing claims conservative and verifiable.
- Respect existing user changes in the working tree; reconcile before overwriting.
- Use `gh` for repository metadata work.
- Treat `clangd` + `compile_commands.json` as the canonical LSP baseline.
- Keep MCP/plugin usage optional and narrow.
- Prefer long single-session execution over `/fleet`.
- C++17 + CUDA C++17
- `Result<T>` instead of control-flow exceptions
- RAII for resource ownership
- `clang-format-18`
- `nvcc` required for real configure/build/test passes
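
The `Result<T>` and RAII conventions above can be sketched roughly as follows. This is a minimal illustration, not the project's actual type: the `Error` struct, the `ok`/`fail`/`is_ok` names, the `Buffer` alias, and the `allocate` helper are all assumptions, and `std::malloc`/`std::free` stand in for device allocation so the sketch compiles without `nvcc`.

```cpp
// Hypothetical sketch: the project's real Result<T> may differ in names and shape.
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Error payload returned instead of thrown (assumed shape).
struct Error {
    std::string message;
};

// Minimal Result<T>: holds either a T (index 0) or an Error (index 1).
template <typename T>
class Result {
public:
    static Result ok(T value) {
        return Result(Storage(std::in_place_index<0>, std::move(value)));
    }
    static Result fail(std::string message) {
        return Result(Storage(std::in_place_index<1>, Error{std::move(message)}));
    }

    bool is_ok() const { return data_.index() == 0; }

    // Calling the wrong accessor is a programming error, checked in debug builds.
    const T& value() const { assert(is_ok()); return std::get<0>(data_); }
    const Error& error() const { assert(!is_ok()); return std::get<1>(data_); }

private:
    using Storage = std::variant<T, Error>;
    explicit Result(Storage data) : data_(std::move(data)) {}
    Storage data_;
};

// RAII owner for a raw buffer: the memory is released when Buffer goes out of scope.
// std::free is a host-side stand-in for a device deallocation call.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};
using Buffer = std::unique_ptr<void, FreeDeleter>;

// Allocation failure travels back through Result rather than an exception.
inline Result<Buffer> allocate(std::size_t bytes) {
    if (bytes == 0) {
        return Result<Buffer>::fail("allocation size must be non-zero");
    }
    void* p = std::malloc(bytes);
    if (p == nullptr) {
        return Result<Buffer>::fail("out of memory");
    }
    return Result<Buffer>::ok(Buffer(p));
}
```

A caller checks `is_ok()` before touching `value()`; ownership of the buffer stays inside the returned `Result`, so no manual cleanup path is needed on either branch.
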
| Component | Responsibility |
|---|---|
| `Result<T>` | No-exception error propagation |
| `ModelConfig` | Model hyperparameters (`vocab_size`, `hidden_dim`, etc.) |
| `QuantizedWeight` | INT8 weights with per-group scales |
| `TransformerLayer` | W8A16 quantized attention + FFN |
| `KVCacheManager` | Pre-allocated cache slots for sequences |
| `InferenceEngine` | Public API: `load()`, `generate()` |
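
To make "INT8 weights with per-group scales" concrete, here is a host-side reference sketch of per-group dequantization. The `QuantizedWeightSketch` struct, its field names, the flat layout, and the symmetric (no zero point) scheme are assumptions for illustration; the real `QuantizedWeight` and its CUDA kernels may be organized differently, and `float` is used here in place of a 16-bit activation type so the code runs on any host compiler.

```cpp
// Hypothetical layout and helper; not the project's actual QuantizedWeight.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct QuantizedWeightSketch {
    std::vector<std::int8_t> values;  // flattened INT8 weights
    std::vector<float> scales;        // one scale per group of `group_size` weights
    std::size_t group_size;           // assumed to divide values.size() evenly
};

// Reference dequantization: out[i] = scales[i / group_size] * values[i].
// Assumes symmetric quantization, i.e. no zero point is subtracted.
inline std::vector<float> dequantize(const QuantizedWeightSketch& w) {
    assert(w.group_size > 0);
    assert(w.scales.size() * w.group_size == w.values.size());

    std::vector<float> out(w.values.size());
    for (std::size_t i = 0; i < w.values.size(); ++i) {
        out[i] = w.scales[i / w.group_size] * static_cast<float>(w.values[i]);
    }
    return out;
}
```

With `group_size = 2`, weights `{2, -4, 1, 3}` and scales `{0.5, 2.0}` dequantize to `{1, -2, 2, 6}`: the first two weights share the first scale and the last two share the second.
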

    cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTS=ON
    cmake --build build -j$(nproc)
    ctest --test-dir build --output-on-failure --timeout 300

Personal overrides may live in `CLAUDE.local.md`, but that file is local-only and must stay untracked.