| 1 | +# INKFORGE — AI Development Guidelines |
| 2 | + |
| 3 | +> **These guidelines are for AI coding assistants (GitHub Copilot, Cursor, Cline, Claude, etc.) |
| 4 | +> working on the Inkforge codebase.** Follow these conventions strictly. |
| 5 | + |
| 6 | +--- |
| 7 | + |
| 8 | +## 1. Project Overview |
| 9 | + |
| 10 | +**Inkforge** is a human-like handwriting synthesis engine powered by a stroke-level generative ML model (LSTM + Mixture Density Network). It is **not** a font renderer. The system generates handwriting as sequences of pen strokes with learned distributions over pressure, velocity, slant, and spacing. |
| 11 | + |
| 12 | +### Architecture (3-Tier) |
| 13 | + |
| 14 | +``` |
| 15 | +React Frontend → FastAPI Backend → PyTorch Inference Engine |
| 16 | + ↓ |
| 17 | + Celery + Redis (async task queue) |
| 18 | + ↓ |
| 19 | + CairoSVG + Pillow (rendering) |
| 20 | +``` |
| 21 | + |
| 22 | +### Key Reference |
| 23 | + |
| 24 | +- **Paper:** Graves (2013) — "Generating Sequences with Recurrent Neural Networks" (arXiv:1308.0850) |
| 25 | +- **Dataset:** IAM On-Line Handwriting Database (13,049 text lines, 221 writers) |
| 26 | + |
| 27 | +--- |
| 28 | + |
| 29 | +## 2. Stroke Representation (CRITICAL) |
| 30 | + |
| 31 | +All handwriting is represented as sequences of **5-tuples**: |
| 32 | + |
| 33 | +``` |
| 34 | +(Δx, Δy, p₁, p₂, p₃) |
| 35 | + |
| 36 | +Δx, Δy = relative pen displacements from previous position |
| 37 | +p₁ = pen-down (actively drawing) |
| 38 | +p₂ = pen-up (moving without drawing) |
| 39 | +p₃ = end-of-sequence sentinel |
| 40 | +``` |
| 41 | + |
| 42 | +**Rules:** |
| 43 | +- Exactly one of `p₁, p₂, p₃` is 1 at any timestep; the others are 0 |
| 44 | +- `Δx, Δy` are relative (delta) coordinates, NOT absolute |
| 45 | +- When converting to absolute for rendering, accumulate deltas |
| 46 | +- Stroke sequences are variable-length; pad/truncate to `max_seq_len=700` for training |
| 47 | + |
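The rules above can be sketched in plain Python. This is a minimal illustration, not the project's preprocessing code: the names `Stroke`, `EOS`, `validate_pen_state`, `to_absolute`, and `pad_or_truncate` are hypothetical, and padding with an end-of-sequence tuple is an assumption (the guidelines only fix `max_seq_len=700`).

```python
from itertools import accumulate

# (dx, dy, p1, p2, p3); the alias name is illustrative.
Stroke = tuple[float, float, int, int, int]

# Assumed padding value: zero displacement with the end-of-sequence flag set.
EOS: Stroke = (0.0, 0.0, 0, 0, 1)


def validate_pen_state(stroke: Stroke) -> None:
    """Raise ValueError unless exactly one of p1, p2, p3 is 1 and the others are 0."""
    pen = stroke[2:]
    if sorted(pen) != [0, 0, 1]:
        raise ValueError(f"pen state must be one-hot, got {pen}")


def to_absolute(
    strokes: list[Stroke], origin: tuple[float, float] = (0.0, 0.0)
) -> list[tuple[float, float]]:
    """Accumulate relative displacements into absolute pen positions for rendering."""
    xs = accumulate((s[0] for s in strokes), initial=origin[0])
    ys = accumulate((s[1] for s in strokes), initial=origin[1])
    # accumulate() yields the origin first; drop it so there is one point per stroke.
    return list(zip(xs, ys))[1:]


def pad_or_truncate(strokes: list[Stroke], max_seq_len: int = 700) -> list[Stroke]:
    """Return exactly max_seq_len strokes, truncating or padding with EOS."""
    clipped = strokes[:max_seq_len]
    return clipped + [EOS] * (max_seq_len - len(clipped))
```

For example, `to_absolute([(1.0, 2.0, 1, 0, 0), (0.5, -1.0, 1, 0, 0), (3.0, 0.0, 0, 1, 0)])` returns `[(1.0, 2.0), (1.5, 1.0), (4.5, 1.0)]`.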
| 48 | +--- |
| 49 | + |
| 50 | +## 3. Model Architecture Constants |
| 51 | + |
| 52 | +Do NOT change these values without explicit approval — they are baked into the PRD: |
| 53 | + |
| 54 | +| Parameter | Value | Location | |
| 55 | +|-----------|-------|----------| |
| 56 | +| Character embedding dim | d=256 | `model.py` | |
| 57 | +| Style latent dim | z ∈ ℝ¹²⁸ | `model.py`, `style_encoder.py` | |
| 58 | +| LSTM hidden dim | 512 | `model.py` | |
| 59 | +| LSTM layers | 3 | `model.py` | |
| 60 | +| Dropout | 0.2 | `model.py` | |
| 61 | +| MDN mixtures (M) | 20 | `model.py` | |
| 62 | +| MDN params per mixture | 6 (π, μx, μy, σx, σy, ρ) | `model.py` | |
| 63 | +| Pen state outputs | 3 (p₁, p₂, p₃) | `model.py` | |
| 64 | + |
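As a quick sanity check on these constants, the sketch below computes the per-timestep output size (M × 6 mixture values plus 3 pen-state values) and slices a flat vector into named groups. It rests on assumptions: the class and field names are hypothetical, the guidelines do not say that mixture and pen outputs share one vector or in what order, and real code should load the values from config YAML rather than literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConstants:
    """Values copied from the table above; literals are used only to keep the sketch short."""

    char_embedding_dim: int = 256
    style_latent_dim: int = 128
    lstm_hidden_dim: int = 512
    lstm_layers: int = 3
    dropout: float = 0.2
    mdn_mixtures: int = 20
    mdn_params_per_mixture: int = 6
    pen_state_outputs: int = 3

    @property
    def output_dim(self) -> int:
        """M * 6 mixture parameters plus 3 pen-state values."""
        return self.mdn_mixtures * self.mdn_params_per_mixture + self.pen_state_outputs


# Assumed ordering of the parameter groups inside the flat vector.
GROUPS = ("pi", "mu_x", "mu_y", "sigma_x", "sigma_y", "rho")


def split_output(
    vec: list[float], consts: ModelConstants = ModelConstants()
) -> dict[str, list[float]]:
    """Slice one flat output vector into six mixture groups and the pen-state values."""
    if len(vec) != consts.output_dim:
        raise ValueError(f"expected {consts.output_dim} values, got {len(vec)}")
    m = consts.mdn_mixtures
    parts = {name: vec[i * m:(i + 1) * m] for i, name in enumerate(GROUPS)}
    parts["pen"] = vec[len(GROUPS) * m:]
    return parts
```

With the table's values, `ModelConstants().output_dim` is 123 (20 × 6 + 3).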
| 65 | +--- |
| 66 | + |
| 67 | +## 4. Humanization Parameters |
| 68 | + |
| 69 | +These 7 parameters are exposed to users via UI sliders and toggles. Apart from Ink Bleed, which is applied after rendering, they are NOT post-processing: they operate at the model/latent level. |
| 70 | + |
| 71 | +| Parameter | Default | Range | Implementation | |
| 72 | +|-----------|---------|-------|----------------| |
| 73 | +| Stroke Width Variation | 0.5 | 0.0–1.0 | Derived from pen velocity | |
| 74 | +| Character Inconsistency | 0.4 | 0.0–1.0 | Noise in style vector z | |
| 75 | +| Slant Angle | 5° | -30° to +30° | Global bias + per-word variance | |
| 76 | +| Baseline Drift | 0.3 | 0.0–1.0 | Sinusoidal y-axis noise | |
| 77 | +| Ligature Formation | Enabled | On/Off | Contextual stroke connections | |
| 78 | +| Fatigue Simulation | Disabled | On/Off | Increasing latent noise over position | |
| 79 | +| Ink Bleed | 0.2 | 0.0–1.0 | Post-render Gaussian diffusion | |
| 80 | + |
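The defaults and ranges in the table can be enforced at the boundary. The sketch below uses a stdlib dataclass only to stay dependency-free; in the API layer the same constraints would be expressed as a Pydantic schema with `ge`/`le`. The class and field names are hypothetical.

```python
from dataclasses import dataclass

# Parameters whose valid range in the table is 0.0 to 1.0.
UNIT_RANGE_FIELDS = (
    "stroke_width_variation",
    "character_inconsistency",
    "baseline_drift",
    "ink_bleed",
)


@dataclass(frozen=True)
class HumanizationParams:
    """Defaults mirror the table above; field names are illustrative."""

    stroke_width_variation: float = 0.5
    character_inconsistency: float = 0.4
    slant_angle_deg: float = 5.0
    baseline_drift: float = 0.3
    ligature_formation: bool = True
    fatigue_simulation: bool = False
    ink_bleed: float = 0.2

    def __post_init__(self) -> None:
        # Reject out-of-range values instead of silently clamping them.
        for name in UNIT_RANGE_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
        if not -30.0 <= self.slant_angle_deg <= 30.0:
            raise ValueError(f"slant_angle_deg must be in [-30, 30], got {self.slant_angle_deg}")
```

Rejecting invalid values (rather than clamping) is a design choice made for this sketch; either behavior is compatible with the table.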
| 81 | +--- |
| 82 | + |
| 83 | +## 5. Python Code Style (Backend + ML) |
| 84 | + |
| 85 | +### General |
| 86 | +- **Python 3.10+** — use modern type hints (`list[str]`, `dict[str, int]`, `X | None`) |
| 87 | +- **PEP 8** — enforced via `ruff` |
| 88 | +- **Line length:** 100 characters max |
| 89 | +- **Imports:** sorted with `isort` (ruff handles this) |
| 90 | + |
| 91 | +### Type Hints |
| 92 | +```python |
| 93 | +# ✅ Good — all args and returns typed |
| 94 | +def generate(self, text: str, style_z: torch.Tensor, temperature: float = 0.4) -> list[tuple]: |
| 95 | +    ... |
| 96 | + |
| 97 | +# ❌ Bad — missing types |
| 98 | +def generate(self, text, style_z, temperature=0.4): |
| 99 | +    ... |
| 100 | +``` |
| 101 | + |
| 102 | +### Docstrings (Google Style) |
| 103 | +```python |
| 104 | +def compute_mdn_loss( |
| 105 | +    mdn_params: torch.Tensor, |
| 106 | +    target: torch.Tensor, |
| 107 | +) -> torch.Tensor: |
| 108 | +    """ |
| 109 | +    Compute MDN negative log-likelihood loss. |
| 110 | + |
| 111 | +    Args: |
| 112 | +        mdn_params: Predicted mixture parameters [batch, seq, M*6]. |
| 113 | +        target: Ground truth strokes [batch, seq, 2]. |
| 114 | + |
| 115 | +    Returns: |
| 116 | +        Scalar loss tensor. |
| 117 | +    """ |
| 118 | +``` |
| 119 | + |
| 120 | +### Pydantic Models |
| 121 | +- Use `pydantic.BaseModel` for all API schemas |
| 122 | +- Use `Field(...)` with descriptions for all fields |
| 123 | +- Use enums for fixed choice sets |
| 124 | +- Validate constraints with `ge`, `le`, `min_length`, `max_length` |
| 125 | + |
| 126 | +### FastAPI Patterns |
| 127 | +- Use `APIRouter` per domain (generate, export, styles, health) |
| 128 | +- All route functions must be `async` |
| 129 | +- Use dependency injection for services |
| 130 | +- Return proper HTTP status codes (202 for async jobs, 404 for not found) |
| 131 | + |
| 132 | +--- |
| 133 | + |
| 134 | +## 6. JavaScript/JSX Code Style (Frontend) |
| 135 | + |
| 136 | +- **React 18** with functional components and hooks only (no class components) |
| 137 | +- **Zustand** for state management (no Redux) |
| 138 | +- **Tailwind CSS** for styling (utility-first) |
| 139 | +- Use `const` by default; `let` only when reassignment is needed |
| 140 | +- Destructure props and state |
| 141 | +- File naming: `PascalCase.jsx` for components, `camelCase.js` for utils/hooks/stores |
| 142 | + |
| 143 | +### Component Structure |
| 144 | +```jsx |
| 145 | +// 1. Imports |
| 146 | +import { useState, useEffect } from "react"; |
| 147 | + |
| 148 | +// 2. Component |
| 149 | +function TextInputPanel({ onTextChange, maxLength = 2000 }) { |
| 150 | +  const [text, setText] = useState(""); |
| 151 | + |
| 152 | +  // 3. Handlers |
| 153 | +  const handleChange = (e) => { |
| 154 | +    // ... |
| 155 | +  }; |
| 156 | + |
| 157 | +  // 4. Render |
| 158 | +  return ( |
| 159 | +    <div>...</div> |
| 160 | +  ); |
| 161 | +} |
| 162 | + |
| 163 | +// 5. Export |
| 164 | +export default TextInputPanel; |
| 165 | +``` |
| 166 | + |
| 167 | +--- |
| 168 | + |
| 169 | +## 7. File Organization Rules |
| 170 | + |
| 171 | +``` |
| 172 | +backend/ |
| 173 | +  app/ |
| 174 | +    api/routes/ → One file per endpoint group |
| 175 | +    models/ → Pydantic schemas only (NOT ML models) |
| 176 | +    services/ → Business logic (inference, rendering) |
| 177 | +    ml/ → PyTorch model definitions and training code |
| 178 | +  tests/ → Mirror app/ structure with test_ prefix |
| 179 | + |
| 180 | +frontend/ |
| 181 | +  src/ |
| 182 | +    components/ → React components (PascalCase.jsx) |
| 183 | +    hooks/ → Custom hooks (useXxx.js) |
| 184 | +    stores/ → Zustand stores (xxxStore.js) |
| 185 | +    utils/ → Helper functions (camelCase.js) |
| 186 | +    assets/ → Static assets (images, icons) |
| 187 | +``` |
| 188 | + |
| 189 | +**Rules:** |
| 190 | +- Never put ML model code in `models/` (that's for Pydantic schemas) |
| 191 | +- ML code goes in `app/ml/` |
| 192 | +- One React component per file |
| 193 | +- Keep components under 200 lines; extract sub-components if longer |
| 194 | + |
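The "mirror `app/` with a `test_` prefix" rule can be made concrete with a small path helper. This is a sketch under assumptions: `mirror_test_path` is a hypothetical name, `tests/` is assumed to sit next to `app/` under `backend/`, and the module name in the example is made up.

```python
from pathlib import PurePosixPath


def mirror_test_path(module_path: str) -> str:
    """Map backend/app/<pkg>/<name>.py to backend/tests/<pkg>/test_<name>.py."""
    path = PurePosixPath(module_path)
    parts = path.parts
    if parts[:2] != ("backend", "app") or path.suffix != ".py":
        raise ValueError(f"not a backend app module: {module_path}")
    # Keep the sub-package directories, prefix only the file name.
    return str(PurePosixPath("backend", "tests", *parts[2:-1], f"test_{path.name}"))
```

For a hypothetical module `backend/app/services/inference.py`, this yields `backend/tests/services/test_inference.py`.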
| 195 | +--- |
| 196 | + |
| 197 | +## 8. API Conventions |
| 198 | + |
| 199 | +### Endpoints (MVP) |
| 200 | +| Method | Path | Purpose | |
| 201 | +|--------|------|---------| |
| 202 | +| POST | `/generate` | Submit async generation job | |
| 203 | +| GET | `/job/{job_id}` | Poll job status | |
| 204 | +| POST | `/export` | Render to PNG/PDF/SVG | |
| 205 | +| GET | `/styles` | List style presets | |
| 206 | +| GET | `/health` | Service health check | |
| 207 | + |
| 208 | +### Response Format |
| 209 | +- Always return JSON |
| 210 | +- Use `202 Accepted` for async jobs (not 200) |
| 211 | +- Include `job_id` in generation responses |
| 212 | +- Error responses must include `detail` field |
| 213 | + |
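The response rules translate into two small payload builders. This is a framework-free sketch: the function names and the `status` field are assumptions, and in the real service these shapes would be Pydantic schemas returned from FastAPI routes.

```python
import json
from http import HTTPStatus


def accepted_response(job_id: str) -> tuple[int, dict[str, str]]:
    """Status code and JSON body for a newly queued generation job."""
    # 202 Accepted, not 200: the work has been queued, not completed.
    return int(HTTPStatus.ACCEPTED), {"job_id": job_id, "status": "queued"}


def error_response(status_code: int, detail: str) -> tuple[int, dict[str, str]]:
    """Status code and JSON body for an error; the body always carries `detail`."""
    if not 400 <= status_code <= 599:
        raise ValueError(f"not an error status code: {status_code}")
    return status_code, {"detail": detail}


def to_json(body: dict[str, str]) -> str:
    """Serialize a response body; every response is JSON."""
    return json.dumps(body, sort_keys=True)
```

For example, `error_response(404, "job not found")` returns `(404, {"detail": "job not found"})`.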
| 214 | +--- |
| 215 | + |
| 216 | +## 9. Git & Commit Conventions |
| 217 | + |
| 218 | +### Branch Naming |
| 219 | +- `feat/` — new features |
| 220 | +- `fix/` — bug fixes |
| 221 | +- `refactor/` — code restructuring |
| 222 | +- `docs/` — documentation |
| 223 | +- `ml/` — ML model changes |
| 224 | + |
| 225 | +### Commit Messages (Conventional Commits) |
| 226 | +``` |
| 227 | +feat(api): add WebSocket stroke streaming endpoint |
| 228 | +fix(ml): correct MDN loss gradient computation |
| 229 | +docs: update README with training instructions |
| 230 | +refactor(frontend): extract CanvasPreview component |
| 231 | +``` |
| 232 | + |
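A lightweight check for these conventions could look like the sketch below, for instance in a local hook. The allowed commit types are taken only from the examples above, and the scope and branch-slug character sets are assumptions; adjust both to match the project's actual policy.

```python
import re

# Commit types seen in the examples above; this list is an assumption, not a full policy.
COMMIT_TYPES = ("feat", "fix", "docs", "refactor")
BRANCH_PREFIXES = ("feat", "fix", "refactor", "docs", "ml")

# type(optional-scope): subject
COMMIT_RE = re.compile(
    r"(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[a-z0-9-]+)\))?"
    r": (?P<subject>\S.*)"
)

# prefix/slug, where the slug character set is an assumed convention.
BRANCH_RE = re.compile(r"(?:" + "|".join(BRANCH_PREFIXES) + r")/[a-z0-9][a-z0-9._-]*")


def is_valid_commit_header(header: str) -> bool:
    """True if the first line of a commit message follows type(scope): subject."""
    return COMMIT_RE.fullmatch(header) is not None


def is_valid_branch(name: str) -> bool:
    """True if the branch name starts with one of the agreed prefixes."""
    return BRANCH_RE.fullmatch(name) is not None
```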
| 233 | +--- |
| 234 | + |
| 235 | +## 10. Testing Requirements |
| 236 | + |
| 237 | +- **Backend:** pytest with `pytest-asyncio` for async endpoints |
| 238 | +- **ML:** Test model instantiation, output shapes, and MDN sampling |
| 239 | +- **API:** Use `TestClient` from `fastapi.testclient` |
| 240 | +- All new features must include tests |
| 241 | +- Maintain >80% coverage on core modules |
| 242 | + |
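To illustrate the kind of MDN sampling test meant here, the sketch below pairs a pure-Python stand-in sampler with two pytest-style tests. `sample_mixture` is hypothetical and is not the project's sampler; a real test would call the PyTorch model and also assert tensor shapes.

```python
import math
import random


def sample_mixture(
    pi: list[float],
    mu: list[tuple[float, float]],
    sigma: list[tuple[float, float]],
    rho: list[float],
    rng: random.Random,
) -> tuple[float, float]:
    """Draw one (dx, dy) from a mixture of bivariate Gaussians."""
    # Pick a component according to the mixture weights.
    k = rng.choices(range(len(pi)), weights=pi)[0]
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Correlate the two standard normals using rho, then scale and shift.
    dx = mu[k][0] + sigma[k][0] * z1
    dy = mu[k][1] + sigma[k][1] * (rho[k] * z1 + math.sqrt(1.0 - rho[k] ** 2) * z2)
    return dx, dy


def test_sample_is_a_pair_of_finite_floats() -> None:
    rng = random.Random(0)  # seeded so the test is reproducible
    dx, dy = sample_mixture(
        [0.3, 0.7], [(0.0, 0.0), (1.0, -1.0)], [(1.0, 1.0), (0.5, 0.5)], [0.0, 0.9], rng
    )
    assert math.isfinite(dx) and math.isfinite(dy)


def test_zero_variance_component_returns_its_mean() -> None:
    rng = random.Random(0)
    assert sample_mixture([1.0], [(2.0, -3.0)], [(0.0, 0.0)], [0.0], rng) == (2.0, -3.0)
```

Seeding the generator and using a degenerate (zero-variance) component keeps stochastic code testable without flaky assertions.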
| 243 | +--- |
| 244 | + |
| 245 | +## 11. Common Pitfalls — AVOID THESE |
| 246 | + |
| 247 | +1. **DO NOT** use absolute coordinates for strokes — always use deltas `(Δx, Δy)` |
| 248 | +2. **DO NOT** treat this as a font rendering system — strokes are generated, not looked up |
| 249 | +3. **DO NOT** put ML model Python code in `app/models/` — that's for Pydantic schemas |
| 250 | +4. **DO NOT** use the `any` type if TypeScript is introduced; the frontend is currently JavaScript/JSX (see Section 6) |
| 251 | +5. **DO NOT** commit model checkpoints (`.pt`, `.pth`) — they are gitignored |
| 252 | +6. **DO NOT** commit `.env` files — only `.env.example` |
| 253 | +7. **DO NOT** hardcode model hyperparameters — use config YAML files |
| 254 | +8. **DO NOT** use synchronous inference in API routes — always queue via Celery |
| 255 | +9. **DO NOT** mix pen states — exactly one of `(p₁, p₂, p₃)` must be 1 at each timestep |
| 256 | +10. **DO NOT** use class-based React components — only functional + hooks |
| 257 | + |
| 258 | +--- |
| 259 | + |
| 260 | +## 12. Security & Ethics |
| 261 | + |
| 262 | +- Never generate content that simulates signatures |
| 263 | +- Include watermark metadata in all exports |
| 264 | +- Sanitize all user text input before processing |
| 265 | +- Rate-limit generation endpoints (future: API key auth) |
| 266 | +- No PII stored in generation artifacts |
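Input sanitization could start from something like the sketch below. The specifics are assumptions, not project policy: the function name is hypothetical, the 2000-character limit is borrowed from the `maxLength` default in the component example, and dropping all Unicode "C" categories also removes zero-width joiners, which may be too strict for some scripts.

```python
import unicodedata

# Assumed limit, borrowed from the maxLength default in the TextInputPanel example.
MAX_TEXT_LENGTH = 2000


def sanitize_text(raw: str, max_length: int = MAX_TEXT_LENGTH) -> str:
    """Normalize user text and drop characters that should not reach the model."""
    text = unicodedata.normalize("NFC", raw)
    # Keep newlines and tabs as layout; drop other control, format, surrogate,
    # private-use, and unassigned code points (Unicode categories starting with "C").
    kept = [ch for ch in text if ch in "\n\t" or unicodedata.category(ch)[0] != "C"]
    cleaned = "".join(kept).strip()
    if not cleaned:
        raise ValueError("text is empty after sanitization")
    if len(cleaned) > max_length:
        raise ValueError(f"text exceeds {max_length} characters")
    return cleaned
```

For example, `sanitize_text("  Hello\x00 wor\u200bld\n")` returns `"Hello world"`.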