F.0: Caddy federation helpers + hardware gate + rate limiter + storage translators #8
Closed
added 9 commits
April 12, 2026 12:43
Documents findings from the five blocking spikes before Phase 1 schema/tool signatures lock:
- Spike 1 (WS routing): backend is per-connection (client_contexts dict with model_copy(deep=True)). backend-0002 patch slot collapses to empty; no change needed.
- Spike 2 (persona swap): Open-LLM-VTuber already supports per-connection persona via the switch-config WS message + /app/characters/*.yaml. web-0006 patch slot collapses to empty.
- Spike 3 (tutor-event handler): lives in the Python backend (websocket_handler.py) next to text-input/switch-config. backend-0001 owns it; web side untouched.
- Spike 4 (Cubism SDK): CDN confirmed, SHA-256 pinned (94278358…), EULA prompt drafted, CUBISM-LICENSE.md drafted. Install-time fetch posture: we are not redistributing.
- Spike 5 (Ollama benchmark): NUM_PARALLEL=4 is the sweet spot on RTX 5060 Ti (27 tok/s aggregate at N=8; p95=36s at N=25). NUM_PARALLEL=8 crashes. Supports the decision to broaden the manifest dep from "ollama" to "any OpenAI-compatible endpoint" with vLLM recommended for classroom mode (Phase 4 sibling bundle).
Phase 1 of the maker-lab bundle: a scaffolded AI learning companion paired with FOSS maker surfaces. Adds the MCP server (21 tools), admin panel (solo/family/classroom view modes + guest sidecar), schema, skill file, registry entry, and the cross-cutting audit patches required so learner_profile rows don't bleed into generic projects/memory surfaces.
Security posture (enforced in code, not convention):
- All kid-session tools take session_token, never learner_id, so the LLM can't cross profiles by hallucinating a different id.
- filterHint() runs on every hint: Flesch-Kincaid grade cap (kid-tutor only), kid-safe blocklist, per-persona word budget. Failed filters fall back to canned lesson hints.
- rateLimitCheck() caps hints at 6/min per session.
- Session state machine: active -> ending (5s flush) -> revoked.
- Guest sessions: learner_id NULL + is_guest=1, CHECK constraint enforces exclusivity, boot-time sweep removes orphans.
MCP server (bundles/maker-lab/server/):
- server.js: 21 tools (learner CRUD admin-only, sessions, hint, progress, artifact, export, validate_lesson, etc).
- filters.js: shared output filter, rate limiter, persona resolvers.
- hint-pipeline.js: OpenAI-compat chat-completions call with filter + canned fallback + transcript write.
- init-tables.js: 6 tables (sessions, bound_devices, redemption_codes, batches, transcripts, learner_settings). Learner profiles reuse research_projects.type='learner_profile' (no schema change there).
Cross-bundle audit (Phase 1 deliverable): learner_profile rows must not appear in generic projects/memory surfaces. Patched:
- servers/memory/server.js: crow_recall_by_context excludes source='maker-lab' by default; opt in via include_maker_lab=true.
- servers/memory/crow-context.js: active-projects section excludes learner_profile.
- servers/research/server.js: crow_list_projects, crow_project_stats, projects://list resource hide learner_profile unless caller explicitly filters by type.
- servers/gateway/dashboard/panels/projects.js: count, list, detail all exclude learner_profile.
- servers/gateway/dashboard/panels/nest/data-queries.js: dashboard project count excludes learner_profile.
Platform registration:
- registry/add-ons.json: new maker-lab entry.
- extensions.js: ICON_MAP graduation-cap; CATEGORY_COLORS/_LABELS education.
- nav-registry.js: CATEGORY_TO_GROUP education -> content.
- i18n.js: extensions.categoryEducation en + es.
- skills/superpowers.md: trigger row for maker-lab.
- CLAUDE.md: server factory list + skills reference entry.
Hard deps declared in manifest: companion bundle + any OpenAI-compatible local-LLM endpoint (recommended: ollama for solo/family, vllm for classroom). Phase 0 report has the benchmark numbers.
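For reviewers who want a feel for the 6/min hint cap, here is a minimal sketch of a per-session sliding-window limiter. It is an assumption about how rateLimitCheck() could work, not a copy of filters.js; `createHintLimiter` and the returned shape are hypothetical names.

```javascript
// Hypothetical sketch: a sliding-window limiter keyed by session token.
// The real rateLimitCheck() in filters.js may use a different mechanism.
const HINT_LIMIT = 6;      // hints allowed per window (from the commit message)
const WINDOW_MS = 60_000;  // one minute

function createHintLimiter(limit = HINT_LIMIT, windowMs = WINDOW_MS) {
  const hits = new Map(); // sessionToken -> timestamps (ms) of recent hints

  return function check(sessionToken, now = Date.now()) {
    // Keep only the timestamps that still fall inside the window.
    const recent = (hits.get(sessionToken) || []).filter((t) => now - t < windowMs);
    if (recent.length >= limit) {
      hits.set(sessionToken, recent);
      // The oldest hit leaving the window frees the next slot.
      return { allowed: false, retryAfterMs: windowMs - (now - recent[0]) };
    }
    recent.push(now);
    hits.set(sessionToken, recent);
    return { allowed: true, retryAfterMs: 0 };
  };
}
```

Keying on the session token rather than a learner id matches the rule above that kid-session tools never accept learner_id.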
Phase 2 wires the first maker surface and its security-critical handoff.
Kiosk HTTP routes (bundles/maker-lab/panel/routes.js):
- GET /kiosk/r/:code: atomic redemption. Uses UPDATE ... WHERE used_at IS NULL AND expires_at > now() RETURNING so an expired or already-used code fails in the same WHERE clause. No TOCTOU. Issues an HttpOnly, SameSite=Strict, __Host--prefixed cookie signed with HMAC-SHA256; payload = sessionToken.fingerprint.
- Fingerprint = sha256(UA + Accept-Language + per-device localStorage salt echoed via x-maker-kiosk-salt). Every subsequent /kiosk/* hit re-verifies signature AND fingerprint, so lifting a cookie to a different browser fails.
- Cookie secret persists at ~/.crow/maker-lab.cookie.secret. Rotating the secret invalidates all kiosks (force re-bind).
- /kiosk/api/context, /api/lesson/:id, /api/progress, /api/hint, /api/end: all session-cookie-guarded, no Nest password required, state machine enforced (ending/revoked responses).
Blockly kiosk (bundles/maker-lab/public/blockly/):
- Minimal index.html + kiosk.css + tutor-bridge.js.
- tutor-bridge.js: session-cookie API client, "?" hint button with level escalation, "I'm done!" progress POST, IndexedDB offline queue with online-replay, browser speechSynthesis TTS for the Phase 2 MVP audio path, kid-visible transcript indicator from /api/context.
- Blockly loaded from pinned jsDelivr (self-host for air-gap, noted in SCHEMA.md).
Curriculum (bundles/maker-lab/curriculum/):
- SCHEMA.md documents the lesson JSON shape for teacher/parent authors.
- Three starter 5-9 lessons: move-cat, repeat, on-click.
Companion patch drafts (bundles/companion/patches/backend/):
- 0001-tutor-event-handler.patch: typed WS message for scaffolded hints. The handler NEVER treats the payload as user text; only the filtered return from maker_hint reaches TTS.
- 0003-maker-lab-mcp-registration.patch: optional direct MCP bridge (tools already reachable via the existing crow router bridge).
- README documents that these apply via the Phase 3 submodule build pipeline; Phase 2 MVP runs via HTTP + browser TTS without them.
- 0002 slot empty (Spike 1: backend already per-connection).
Peer-sharing guard (servers/shared/kiosk-guard.js + servers/sharing/server.js): crow_generate_invite, crow_share, and crow_send_message refuse to run while any maker_sessions row is active. This is defense-in-depth for the rule "no peer-sharing ever initiated from inside a kid session." Cached 1s to avoid per-call DB churn; silent no-op on installs without the maker_sessions table.
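The kiosk cookie scheme described above (HMAC-SHA256 over sessionToken.fingerprint, with the fingerprint re-checked on every hit) can be sketched roughly as follows. The function names are hypothetical, the hex encoding is an assumption, and the sketch assumes the session token itself contains no "." character.

```javascript
const crypto = require("node:crypto");

// Fingerprint = sha256(UA + Accept-Language + per-device salt), per the commit message.
function fingerprint(userAgent, acceptLanguage, salt) {
  return crypto.createHash("sha256").update(userAgent + acceptLanguage + salt).digest("hex");
}

// Cookie value = sessionToken.fingerprint.mac (hex encoding is an assumption).
function signCookie(secret, sessionToken, fp) {
  const payload = `${sessionToken}.${fp}`;
  const mac = crypto.createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}.${mac}`;
}

// Returns the session token when both signature and fingerprint match, else null.
function verifyCookie(secret, cookieValue, currentFp) {
  const parts = String(cookieValue).split(".");
  if (parts.length !== 3) return null;
  const [sessionToken, fp, mac] = parts;
  const expected = crypto.createHmac("sha256", secret).update(`${sessionToken}.${fp}`).digest("hex");
  const a = Buffer.from(mac, "hex");
  const b = Buffer.from(expected, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (a.length !== b.length || !crypto.timingSafeEqual(a, b)) return null;
  // A valid signature is not enough: the browser presenting the cookie
  // must also reproduce the fingerprint that was bound at redemption time.
  if (fp !== currentFp) return null;
  return sessionToken;
}
```

Because the MAC is keyed by the persisted secret, rotating that secret makes every existing cookie fail verification, which is the "force re-bind" behavior noted above.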
Wires the admin panel's full session lifecycle: Start session, Bulk
Start (printable QR sheet), Guest flow (age picker), and live session
controls (End / Force End / Unlock Idle / Revoke Batch).
Session minting (new bundles/maker-lab/server/sessions.js):
- Single source of truth for mintSessionForLearner, mintGuestSession,
mintBatchSessions. Both the MCP tools and the admin panel now call
into the same functions — no duplicate INSERT paths.
- Returns { sessionToken, redemptionCode, shortUrl, codeExpiresAt,
sessionExpiresAt, learnerId, learnerName, batchId }.
Panel (panel/maker-lab.js — full rewrite):
- Three view modes share a single handler. Family: per-card Start +
duration input. Classroom: multi-select checkboxes + Bulk Start form
with batch_label + printable QR sheet. Solo: simplified tile.
- Guest picker: "Try it without saving" → age band buttons (5-9 /
10-13 / 14+) → mint + redirect to QR handoff page.
- QR handoff page: inlines SVG (via qrcode npm pkg, pinned 1.5.3),
renders the redemption code + full URL + Print button. Uses
CROW_GATEWAY_URL for the public URL embedded in the QR; falls back
to relative if unset.
- Batch sheet: grid of per-learner QR cards, Print button, revoke-batch
form that requires a reason.
- Active sessions section: End / Force End / Unlock Idle / batch link
buttons on every active session. Pre-fetches the latest redemption
code per session so the "QR" link on active cards works.
- Error banner with friendly messages for each err= query param.
Schema (maker_learner_settings):
- Added age + avatar columns (research_projects has no metadata column;
previous attempt to use one was a bug).
- Boot-time addColumnIfMissing migrates existing installs safely.
Refactor (server/server.js):
- maker_start_session, maker_start_sessions_bulk, maker_start_guest_session
delegate to sessions.js helpers. ~100 lines of duplicated INSERT logic
removed.
- maker_create_learner, maker_list_learners, maker_get_learner,
maker_update_learner read/write age+avatar from maker_learner_settings.
- server/filters.js getLearnerAge() reads from maker_learner_settings.
- panel/routes.js /kiosk/api/context joins maker_learner_settings for age.
Dep:
- qrcode@^1.5.3 (pure JS, SVG output). Pinned for air-gap friendliness.
End-to-end sanity-checked against live DB: create_learner → list_learners
→ start_session returns well-formed redemption_code + short_url.
…TA-HANDLING
Per-learner settings editor (panel/maker-lab.js):
- New ?edit=<learner_id> view with form for name, age, avatar, transcripts_enabled, transcripts_retention_days, idle_lock_default_min, auto_resume_min, voice_input_enabled.
- New action=update_learner POST handler.
- "Settings" and "Transcripts" buttons on every learner card (Transcripts only when recording is enabled).
Kiosk idle lock (panel/routes.js + public/blockly/tutor-bridge.js):
- /api/context is now PASSIVE (read-only): it no longer touches last_activity_at. That would have made idle-lock impossible since the client polls it.
- New /api/heartbeat endpoint: the only client-initiated path besides /api/hint and /api/progress that counts as activity (per plan's allowlist: hint, progress, Blockly workspace change, heartbeat; NOT mouse-move or scroll).
- Idle-lock state machine runs inline on every /api/context hit: (1) lock when last_activity > idle_lock_min ago; (2) auto-resume when locked > auto_resume_min ago.
- tutor-bridge.js: 15s context poll, lock screen with auto-resume countdown (built with createElement, no innerHTML), throttled heartbeat on Blockly workspace changes (create/delete/change/move only; ignores UI events).
Transcripts viewer (panel/maker-lab.js):
- ?transcripts=<learner_id> view groups turns by session_token, shows role-coded turn bubbles, retention banner.
- Up to 500 most-recent turns across all sessions for one learner.
Transcripts retention sweep (server/retention-sweep.js):
- Hourly sweep deletes maker_transcripts older than per-learner transcripts_retention_days (default 30, 0 = purge on session end).
- Also sweeps orphaned guest sessions hourly (belt + suspenders on top of the boot-time sweep).
- Timer uses unref() so the process can still exit cleanly when stdin closes.
- startRetentionSweep() is process-globally idempotent; both the stdio MCP entry and the panel router call it.
DATA-HANDLING.md (bundles/maker-lab/DATA-HANDLING.md):
- Ships with the bundle. Plain-language summary for parents/teachers + legal-reference section for school administrators.
- Exhaustive field inventory across every table, COPPA + GDPR-K posture, incident response procedure, deployment checklist.
- Maker Lab's consent checkbox is a timestamped audit record, explicitly NOT a substitute for the school's own VPC process. The doc says so.
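The two idle-lock rules above (lock after idle_lock_min of inactivity, auto-resume after auto_resume_min locked) can be expressed as a small pure function. This is a sketch under assumptions: the field names `lastActivityAt` and `lockedAt` are hypothetical, and resetting the activity clock on resume is a guess at how the real handler avoids re-locking immediately.

```javascript
// Hypothetical field names; the real maker_sessions columns may differ.
function nextIdleState(session, settings, nowMs) {
  const idleMs = settings.idleLockMin * 60_000;
  const resumeMs = settings.autoResumeMin * 60_000;

  if (session.lockedAt == null) {
    // Rule 1: lock once the last counted activity is older than the idle window.
    if (nowMs - session.lastActivityAt > idleMs) {
      return { ...session, lockedAt: nowMs };
    }
    return session;
  }

  // Rule 2: auto-resume once the lock itself is older than the resume window.
  if (nowMs - session.lockedAt > resumeMs) {
    // Assumption: restart the activity clock so the session does not re-lock at once.
    return { ...session, lockedAt: null, lastActivityAt: nowMs };
  }
  return session;
}
```

Running this on each /api/context poll only works if that poll is passive, which is why the commit stops /api/context from touching last_activity_at.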
Solo-mode kiosk security (bundles/maker-lab/server/device-binding.js):
- isLoopback(req): detects same-host requests across IPv4, IPv6, and IPv4-mapped-IPv6. Checks req.ip, req.socket.remoteAddress, and req.connection.remoteAddress for defense in depth.
- getSoloLanExposure() / setSoloLanExposure(): dashboard_settings key maker_lab.solo_lan_exposure. Default "off" (loopback-only).
- getBoundDevice(), bindDevice(), unbindDevice(), listBoundDevices(): CRUD for the maker_bound_devices table.
- hasAdminSession(req): validates the req's crow_session cookie against oauth_tokens via the existing dashboard auth helper.
- ensureDefaultLearner(db): creates a "Default learner" with consent timestamp if no learners exist yet. Used by the solo auto-redeem path.
Solo kiosk auto-redeem (panel/routes.js GET /kiosk/):
- If a valid session cookie is present → serve Blockly.
- Else in solo mode:
  - Loopback → auto-mint default-learner session and set cookie.
  - LAN exposure off → 403 "loopback-only" page.
  - LAN exposure on + known bound device → auto-mint + touch last_seen_at.
  - LAN exposure on + admin crow_session present → bind device + mint.
  - LAN exposure on + unknown device → "sign in to Nest first" page.
- Non-solo modes continue to require a redemption code handoff.
Panel Settings section (panel/maker-lab.js ?settings=1):
- Solo LAN exposure toggle (auto-submit on change).
- Bound devices table with Unbind button per row.
- Data handling pointer to DATA-HANDLING.md.
- Accessible from a new ⚙ Settings button at the top of the main view.
Lesson authoring (panel/maker-lab.js ?lessons=1):
- Lists bundled lessons grouped by age band.
- Lists custom lessons from ~/.crow/bundles/maker-lab/curriculum/custom/.
- Import form: paste JSON → validate via shared lesson-validator.js (same path the maker_validate_lesson MCP tool uses) → write to the custom dir. No restart needed; the /kiosk/api/lesson/:id route already picks up custom lessons.
- Specific error messages surfaced inline on validation failure.
- Delete button per custom lesson (confirm dialog).
- Accessible from a new 📚 Lessons button at the top of the main view.
Shared validator (server/lesson-validator.js):
- Extracted from the inline maker_validate_lesson tool so both the MCP tool and the panel import flow use identical rules. Adds stricter checks: id regex, canned_hints non-empty, steps non-empty, reading_level <= 3 for age_band '5-9', length caps on prompt and canned_hints, tag array typing.
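To make the stricter validator checks concrete, here is a rough sketch of what such rules could look like. The id pattern, the length caps, and the assumption that `prompt` and `tags` are top-level fields are all hypothetical; the commit message names the checks but not their exact values.

```javascript
// Hypothetical limits: the real id regex and length caps are not stated in the commit.
const ID_PATTERN = /^[a-z0-9][a-z0-9-]*$/;
const MAX_PROMPT_CHARS = 500;
const MAX_HINT_CHARS = 200;

function validateLesson(lesson) {
  if (lesson === null || typeof lesson !== "object") {
    return { valid: false, errors: ["lesson must be an object"] };
  }
  const errors = [];
  if (typeof lesson.id !== "string" || !ID_PATTERN.test(lesson.id)) {
    errors.push("id must match " + ID_PATTERN);
  }
  if (!Array.isArray(lesson.steps) || lesson.steps.length === 0) {
    errors.push("steps must be a non-empty array");
  }
  if (!Array.isArray(lesson.canned_hints) || lesson.canned_hints.length === 0) {
    errors.push("canned_hints must be a non-empty array");
  } else if (lesson.canned_hints.some((h) => typeof h !== "string" || h.length > MAX_HINT_CHARS)) {
    errors.push("each canned hint must be a string within the length cap");
  }
  // Stated rule: the youngest age band is capped at reading level 3.
  if (lesson.age_band === "5-9" && !(lesson.reading_level <= 3)) {
    errors.push("reading_level must be <= 3 for age_band '5-9'");
  }
  if (typeof lesson.prompt === "string" && lesson.prompt.length > MAX_PROMPT_CHARS) {
    errors.push("prompt exceeds the length cap");
  }
  if (lesson.tags !== undefined &&
      !(Array.isArray(lesson.tags) && lesson.tags.every((t) => typeof t === "string"))) {
    errors.push("tags must be an array of strings");
  }
  return { valid: errors.length === 0, errors };
}
```

Collecting every error instead of stopping at the first one fits the panel behavior of surfacing specific messages inline.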
Closes the remaining Phase 2 items before moving to Phase 3.
Curriculum — 10 lessons total (was 3) for ages 5-9:
- blockly-01-move-cat, -02-repeat, -03-on-click (existing, rewritten to
declare a toolbox + success_check)
- blockly-04-two-in-a-row sequences: stack two Do blocks
- blockly-05-count-to-ten loops: repeat with count
- blockly-06-change-the-words sequences: edit text literals
- blockly-07-big-and-small conditions: compare with numbers
- blockly-08-loops-inside-loops loops: nesting, multiplicative feel
- blockly-09-yes-or-no conditions: if / else
- blockly-10-capstone-party capstone combining all prior concepts
Lesson schema + validator (server/lesson-validator.js):
- `toolbox`: either a flat array of block-type strings or a
{ categories: [{ name, colour, blocks }] } structure. Validator checks
both forms.
- `success_check.required_blocks`: array of block-type strings. If any
are missing from the workspace when the kid hits "I'm done!", the
progression is blocked with `success_check.message_missing`.
- SCHEMA.md documents both fields with examples.
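A minimal sketch of the `success_check.required_blocks` rule, assuming the caller has already collected the block-type strings present in the workspace. `checkSuccess` and its return shape are hypothetical; only the field names come from the schema described above.

```javascript
// Returns ok=false plus the lesson's own message when required block types are absent.
function checkSuccess(lesson, workspaceBlockTypes) {
  const check = lesson.success_check;
  if (!check || !Array.isArray(check.required_blocks)) {
    return { ok: true, missing: [] }; // lessons without a check always pass
  }
  const present = new Set(workspaceBlockTypes);
  const missing = check.required_blocks.filter((type) => !present.has(type));
  if (missing.length === 0) return { ok: true, missing };
  return { ok: false, missing, message: check.message_missing };
}
```

In the kiosk the type list would presumably come from the Blockly workspace (for example by mapping each block to its `type`); that wiring is left out here because it depends on tutor-bridge.js internals.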
Blockly kiosk (public/blockly/):
- Removed the hard-coded <xml id="toolbox"> in index.html.
- tutor-bridge.js now builds the toolbox dynamically from the active
lesson's `toolbox` field. Default shadows for common block types
(controls_repeat_ext → TIMES=4, text_print → TEXT="Hi!", logic_compare
→ A=5 B=3) so blocks drop in ready to run instead of empty.
- "I'm done!" now enforces success_check before POSTing progress —
missing blocks surface the lesson's `message_missing` in the hint
bubble and speak it via TTS instead of marking complete.
Companion tutor-event patch (applied at container build time):
- bundles/companion/scripts/patch-tutor-event.py — idempotent Python
patcher. Detects a marker string and exits early if already patched.
Registers a new "tutor-event" WS message type and appends an
`_handle_tutor_event` method to WebSocketHandler. The handler NEVER
treats the payload as user text; it POSTs {session_token, question,
level, ...} to http://127.0.0.1:3004/maker-lab/api/hint-internal,
then speaks the filtered reply via the per-client TTS engine.
- entrypoint.sh invokes the patcher after patch-auto-group.py.
- Dockerfile copies patch-tutor-event.py into /app/scripts/.
- Activation requires a companion rebuild. Until then the Phase 2 MVP
`speechSynthesis` path continues to work.
Maker Lab internal hint endpoint (panel/routes.js):
- New POST /maker-lab/api/hint-internal — loopback-only (refuses non-
loopback IPs with 403), accepts {session_token, surface, question,
level, lesson_id?, canned_hints?}, validates the token directly
(no cookie/fingerprint binding — access is loopback-restricted), runs
the same handleHintRequest pipeline as /kiosk/api/hint.
- This is the endpoint the companion's patched handler calls.
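Since this endpoint trades cookie binding for a loopback restriction, the loopback test carries the whole guarantee. A rough sketch of such a check follows; it is not the actual device-binding.js code, and requiring every reported address to agree is one interpretation of "defense in depth".

```javascript
const net = require("node:net");

// True for ::1, 127.0.0.0/8, and the dotted IPv4-mapped form (::ffff:127.x.y.z).
// The hex mapped form (::ffff:7f00:1) is not handled in this sketch.
function isLoopbackAddress(addr) {
  if (typeof addr !== "string") return false;
  let a = addr.trim().toLowerCase();
  if (a.startsWith("::ffff:")) a = a.slice("::ffff:".length);
  if (a === "::1") return true;
  return net.isIPv4(a) && a.startsWith("127.");
}

// Assumption: every address the request reports must be loopback,
// so one spoofable field cannot outvote the socket address.
function isLoopback(req) {
  const candidates = [
    req.ip,
    req.socket && req.socket.remoteAddress,
    req.connection && req.connection.remoteAddress,
  ].filter((v) => v !== undefined && v !== null);
  return candidates.length > 0 && candidates.every(isLoopbackAddress);
}
```

A route guard would then answer 403 whenever `isLoopback(req)` is false, before the session token is even looked at.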
Phase 2.5 kiosk launcher (scripts/launch-kiosk.sh):
- Opens the Blockly kiosk tile-left (2/3) and the AI Companion web UI
tile-right (1/3) in Chromium --app windows (or Firefox with
xdotool-assisted positioning). Same-host solo-mode deployment.
- Until Phase 3 ships pet-mode, this is the documented "Phase 2.5
visual layout via crow-wm" path.
README.md: first-pass bundle docs covering quick start, modes, hint
pipeline, lesson authoring, companion integration (Phase 2 MVP vs
post-rebuild upgrade), and a Phase 3 preview.
Lays the groundwork for the 8 federated-app bundles enumerated in the
Phase 2 plan (Matrix-Dendrite, Mastodon, GoToSocial, Pixelfed, PeerTube,
Funkwhale, Lemmy, WriteFreely). No federated app ships in this PR — this
is pure platform infra so the apps in F.1+ are thin wrappers.
Caddy side:
- bundles/caddy/server/federation-profiles.js — 4 canned profiles
(matrix, activitypub, peertube, generic-ws) with the directives each
app family needs (websocket upgrade, 40MB/8GB body, forwarded headers,
300s/1800s timeouts) plus builders for the standard /.well-known/
JSON payloads (matrix-server, matrix-client, nodeinfo)
- bundles/caddy/server/caddyfile.js — upsertRawSite() helper for
idempotent site-block replacement. Parser already handled :8448 and
inner blocks; round-trip verified via smoke test
- bundles/caddy/server/server.js — 4 new MCP tools:
  - caddy_add_federation_site: profile-aware site block
  - caddy_set_wellknown: standalone /.well-known/<path> handler (e.g. matrix-server delegation on an apex domain)
  - caddy_add_matrix_federation_port: a :8448 site block with its own LE cert (refuses if the same domain already has matrix-server delegation; enforces "one or the other, not both")
  - caddy_cert_health: ok/warning/error per domain, surfaces staging-cert use and near-expiry that would otherwise stay silent until outage
- bundles/caddy/scripts/post-install.sh — creates the crow-federation
external docker network (idempotent)
- bundles/caddy/docker-compose.yml — Caddy now joins crow-federation so
federated apps in F.1+ become reachable by docker service name with
no host port publish
- bundles/caddy/skills/caddy.md — full doc for the 4 new tools,
covering the matrix-8448-vs-well-known either/or and the shared-
network model
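For context on the /.well-known/ builders, the standard JSON shapes look roughly like this. The function names are hypothetical and may not match federation-profiles.js; the payload keys follow the Matrix and NodeInfo discovery conventions, and the NodeInfo `href` path is an assumption because it varies per app.

```javascript
// /.well-known/matrix/server: tells other homeservers where federation traffic goes.
function matrixServerPayload(delegatedHost, port = 443) {
  return { "m.server": `${delegatedHost}:${port}` };
}

// /.well-known/matrix/client: tells clients which base URL to use.
function matrixClientPayload(homeserverBaseUrl) {
  return { "m.homeserver": { base_url: homeserverBaseUrl } };
}

// /.well-known/nodeinfo: points at the app's NodeInfo document.
// Assumption: the document lives at <origin>/nodeinfo/<version>.
function nodeinfoPayload(origin, version = "2.0") {
  return {
    links: [
      {
        rel: `http://nodeinfo.diaspora.software/ns/schema/${version}`,
        href: `${origin}/nodeinfo/${version}`,
      },
    ],
  };
}
```

Serving the matrix-server payload on port 443 is the delegation route; the :8448 site block is the alternative, which is why caddy_add_matrix_federation_port refuses when delegation already exists.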
Gateway side:
- servers/gateway/hardware-gate.js — checkInstall() refuses installs
whose min_ram_mb exceeds effective RAM (MemAvailable + 0.5 ×
SSD-backed swap + 0.5 × zram; SD-card swap explicitly excluded)
minus already-committed RAM (sum of recommended_ram_mb across
installed bundles) minus a flat 512 MB host reserve. Warns but
allows when under the recommended threshold. Unit-tested against
Pi-with-SD-swap, Pi-with-SSD-swap, and Pi-with-zram fixtures
- servers/gateway/routes/bundles.js — hardware gate wired into
POST /bundles/api/install before the consent-token check. CLI-only
force_install bypass (request body, never UI)
Design notes:
- Caddyfile on-disk remains the source of truth. All new tools go
through upsertRawSite() so hand-edits outside the managed blocks
survive round-trips, and re-running a tool with the same domain
replaces instead of duplicating
- Effective RAM is deliberately pessimistic: SD-card swap does not
count as headroom even though /proc/meminfo reports it in SwapFree.
zram counts at half-weight because it's compressed RAM, not true
extra capacity. Host-reserve keeps the base OS + gateway responsive
- Hardware gate is a refuse-by-default mechanism with a CLI override;
the web UI never surfaces --force-install
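The effective-RAM arithmetic from the notes above can be sketched as below. This is an illustration rather than the hardware-gate.js source: the input field names are hypothetical, and "warn" is interpreted here as headroom falling below the bundle's recommended_ram_mb.

```javascript
// Effective RAM = MemAvailable + 0.5 x SSD-backed swap + 0.5 x zram.
// SD-card swap is accepted as input but deliberately contributes nothing.
function effectiveRamMb({ memAvailableMb, ssdSwapFreeMb = 0, zramFreeMb = 0, sdSwapFreeMb = 0 }) {
  return memAvailableMb + 0.5 * ssdSwapFreeMb + 0.5 * zramFreeMb;
}

function checkInstall(manifest, host, installedBundles, hostReserveMb = 512) {
  // RAM already promised to installed bundles.
  const committedMb = installedBundles.reduce(
    (sum, bundle) => sum + (bundle.recommended_ram_mb || 0), 0);
  const headroomMb = effectiveRamMb(host) - committedMb - hostReserveMb;

  if (manifest.min_ram_mb > headroomMb) {
    return { allow: false, level: "refuse", headroomMb };
  }
  if ((manifest.recommended_ram_mb || 0) > headroomMb) {
    return { allow: true, level: "warn", headroomMb };
  }
  return { allow: true, level: "ok", headroomMb };
}
```

As a worked case: 3000 MB available plus 2048 MB of SD-card swap, with one installed bundle recommending 1024 MB, leaves 3000 - 1024 - 512 = 1464 MB, so a bundle with min_ram_mb of 2048 is refused. The same host with 2048 MB of SSD swap instead has 3000 + 1024 - 1024 - 512 = 2488 MB and is allowed.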
Part 2/2 (follow-up): shared rate-limiter (SQLite-backed token bucket),
storage-translators (per-app S3 env-var schema mapping), init-db entry
for rate_limit_buckets, end-to-end verification on grackle with the
cert-health panel card.
Closes out F.0 so the federated-app bundles in F.1+ can plug in without rebuilding shared infra. Builds on part 1/2 (222e175).
- servers/shared/rate-limiter.js: SQLite-backed token-bucket helper. Persistence in the rate_limit_buckets table survives bundle restart (round-2 reviewer flagged bypass-by-restart). Bucket key resolves to conversation_id when MCP supplies one, else a hash of transport identity, else a per-tool global. Defaults gated by tool-name suffix: *_post 10/hr, *_follow 30/hr, *_search 60/hr, *_block_* / *_defederate 5/hr, *_import_blocklist 2/hr. Overrides via ~/.crow/rate-limits.json with fs.watch hot-reload. Exposed as wrapRateLimited(db) returning limiter(toolId, handler) so bundles wrap their MCP handlers with one import. Smoke-tested: 11th call in the window is denied with retry_after_seconds.
- servers/gateway/storage-translators.js: per-app S3 env-var schema mapping. Canonical Crow S3 credentials { endpoint, bucket, accessKey, secretKey, region?, forcePathStyle? } translate into Mastodon's S3_*, PeerTube's PEERTUBE_OBJECT_STORAGE_* (env-var path, not YAML patching), Pixelfed's AWS_*, and Funkwhale's AWS_*/AWS_S3_* schemas. One function per app, validated fixtures.
- scripts/init-db.js: rate_limit_buckets table (tool_id, bucket_key, tokens, refilled_at) with PRIMARY KEY (tool_id, bucket_key) and a refilled_at index for GC later. Verified via `npm run init-db && npm run check`.
- bundles/caddy/panel/caddy.js + panel/routes.js: cert-health card on the Caddy panel. New GET /api/caddy/cert-health endpoint surfaces ok/warning/error per domain with issuer, expiry, and staging-cert detection. Panel shows an Overall badge + per-domain rows with colored status dots; textContent + createElement only (XSS-safe pattern).
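The suffix-gated defaults listed for the rate limiter could be resolved along these lines. This is a sketch, not the rate-limiter.js source: the override shape (a flat toolId-to-limit map), the `null` result for unmatched tools, and the example tool names are assumptions.

```javascript
// Default per-hour limits by tool-name pattern, taken from the commit message.
const DEFAULT_LIMITS = [
  { pattern: "*_post", perHour: 10 },
  { pattern: "*_follow", perHour: 30 },
  { pattern: "*_search", perHour: 60 },
  { pattern: "*_block_*", perHour: 5 },
  { pattern: "*_defederate", perHour: 5 },
  { pattern: "*_import_blocklist", perHour: 2 },
];

// Turn a simple glob ("*" only) into an anchored regular expression.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Assumption: overrides is a flat { toolId: perHour } map loaded from the JSON file.
function resolveLimit(toolId, overrides = {}) {
  if (Object.prototype.hasOwnProperty.call(overrides, toolId)) return overrides[toolId];
  const hit = DEFAULT_LIMITS.find(({ pattern }) => globToRegExp(pattern).test(toolId));
  return hit ? hit.perHour : null; // null: no default applies (assumed behavior)
}
```

With these patterns a hypothetical `gts_import_blocklist` tool resolves to 2/hr and does not fall into the `*_block_*` rule, because "blocklist" has no underscore after "block".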
Verified (this PR):
- node --check on all changed files: OK
- Rate-limit token bucket: 10 calls through, 11th denied with retry_after_seconds=360 on a 10/3600 bucket
- Storage translator: mastodon/peertube/pixelfed/funkwhale all produce valid env vars with credentials present; unknown app and missing credentials both throw
- `npm run init-db` creates rate_limit_buckets cleanly
- `npm run check` passes
Deferred to F.1:
- Live end-to-end install of the updated Caddy bundle on grackle (requires the uninstall + reinstall flow; will exercise as part of F.1 GoToSocial pilot where the shared network + cert-health path is actually used in anger)
- Panel cert-health card live-verification with a real issued cert (waits on F.1 for a federated site to exist)
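The retry_after_seconds=360 figure follows from the bucket arithmetic: a 10/3600 bucket refills one token every 3600 / 10 = 360 seconds. A minimal in-memory sketch of one token-bucket step is below; `takeToken` is a hypothetical name, and the `row` object only mirrors the tokens and refilled_at columns rather than touching SQLite.

```javascript
// One token-bucket step over a row shaped like { tokens, refilled_at }.
// Returns the decision plus the row that would be written back.
function takeToken(row, capacity, windowSeconds, nowSec) {
  // Refill proportionally to elapsed time, never above capacity.
  const elapsed = Math.max(0, nowSec - row.refilled_at);
  const tokens = Math.min(capacity, row.tokens + (elapsed * capacity) / windowSeconds);

  if (tokens >= 1) {
    return {
      allowed: true,
      retry_after_seconds: 0,
      row: { tokens: tokens - 1, refilled_at: nowSec },
    };
  }
  // Seconds until the bucket holds one whole token again.
  const retry = Math.ceil(((1 - tokens) * windowSeconds) / capacity);
  return { allowed: false, retry_after_seconds: retry, row: { tokens, refilled_at: nowSec } };
}
```

Persisting the returned row after every call is what closes the bypass-by-restart gap: a restarted bundle reads the drained bucket back instead of starting full.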
Recreating to refresh diff scope after origin/main caught up with prior work.
Summary
Phase 2 federation foundation — lays the groundwork for the 8 federated-app bundles planned for F.1 onward (GoToSocial, WriteFreely, Matrix-Dendrite, Funkwhale, Pixelfed, Lemmy, Mastodon, PeerTube). No federated app ships in this PR; this is pure platform infra so F.1+ can be thin wrappers.
Part 2/2 (follow-up): shared rate-limiter, per-app S3 storage translators, `rate_limit_buckets` schema, end-to-end verification + cert-health panel card.
Design notes
Test plan