fix(ai): correct on-device model size labels #86
Merged
Registry `approxSizeBytes` values had drifted from the real on-disk
weights — the Quality tier was showing "≈ 1.8 GB total" in the gate
when the actual download is ~1.88 GB, and the Compact tier overstated
by ~390 MB on the chat model alone. Verified against HuggingFace file
listings for each pinned dtype:
- LFM2.5-1.2B (q4): 1.2 GB → 810 MB (model_q4.onnx_data is 850 MB)
- LFM2-2.6B (q4f16): 1.5 GB → 1.55 GB
- EmbeddingGemma q8: 309 MB → 320 MB (was missing the 26 MB Gemma
SentencePiece tokenizer)
- Embed peak RAM: 400 MB → 500 MB (int8 dequant overhead)
Aggregate now reads "≈ 1.1 GB total" on Compact and "≈ 1.9 GB total"
on Quality. Prose updated in README, docs/local-ai.md, tool-registry,
ChatModelPicker, AiModelDetailsModal, ai-runtime, useRagModels, and
the AskPdf timing-weight comment.
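As a rough illustration of how the aggregate labels follow from the corrected per-model values, the sketch below sums a chat model and the embedder and formats the result. The `ModelEntry` shape, the `totalLabel` helper, the binary-gigabyte formatting, and the choice to sum only chat + embed are assumptions made for this example, not the project's actual registry code; only the sizes come from the list above.

```typescript
// Hypothetical registry shape; only the size figures come from the commit message.
interface ModelEntry {
  id: string;
  approxSizeBytes: number;
}

const MIB = 1024 * 1024;
const GIB = 1024 * MIB;

const chatCompact: ModelEntry = { id: "LFM2.5-1.2B (q4)", approxSizeBytes: Math.round(810 * MIB) };
const chatQuality: ModelEntry = { id: "LFM2-2.6B (q4f16)", approxSizeBytes: Math.round(1.55 * GIB) };
const embedder: ModelEntry = { id: "EmbeddingGemma q8", approxSizeBytes: Math.round(320 * MIB) };

// Assumed label format: binary gigabytes, one decimal place.
function totalLabel(models: ModelEntry[]): string {
  const totalBytes = models.reduce((sum, m) => sum + m.approxSizeBytes, 0);
  return `≈ ${(totalBytes / GIB).toFixed(1)} GB total`;
}

const compactLabel = totalLabel([chatCompact, embedder]);
const qualityLabel = totalLabel([chatQuality, embedder]);

console.log(compactLabel); // ≈ 1.1 GB total
console.log(qualityLabel); // ≈ 1.9 GB total
```

Under these assumptions the two labels match the aggregates quoted in the commit message; a formatter that divides by powers of 1000 instead would round the Compact total up to 1.2.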
feat(ai): on-device hybrid RAG for Ask your PDF
Deploying with
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | cloakpdf | db278fa | Commit Preview URL, Branch Preview URL | May 17 2026, 12:45 PM |
Pull request overview
Updates the Ask PDF on-device AI model size metadata and related prose so consent/download labels better reflect the actual model bundle sizes.
Changes:
- Corrects `approxSizeBytes` and embed peak RAM metadata in the AI model registry.
- Updates user-facing README/docs and inline comments for Compact/Quality bundle sizes.
- Refreshes picker/modal/tool copy to match the revised model footprints.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/utils/ai-models.ts` | Updates AI registry model size/RAM metadata and explanatory comments. |
| `src/utils/ai-runtime.ts` | Updates cache eviction size documentation. |
| `src/hooks/useRagModels.ts` | Updates full-evict size documentation. |
| `src/tools/AskPdf.tsx` | Updates indexing progress comment for embedder size. |
| `src/config/tool-registry.ts` | Updates Ask PDF tool registry comments with revised tier/bundle sizes. |
| `src/components/ChatModelPicker.tsx` | Updates picker documentation for chat tier sizes. |
| `src/components/AiModelDetailsModal.tsx` | Updates modal deletion/storage documentation. |
| `README.md` | Updates public Ask PDF feature and Local AI size descriptions. |
| `docs/local-ai.md` | Updates implementation docs and cache diagram with revised model bundle sizes. |

      * Evict the Transformers.js model bytes from the browser's
    - * CacheStorage. Frees ~1.5 GB of disk for the current AI bundle
    - * (chat + embed + rerank) and forces a fresh download on next use.
    + * CacheStorage. Frees roughly 1.2 GB on the Compact tier / 1.9 GB on

      * `cloakpdf:ai-model-ready:*` flag so the consent dialog re-
    - * appears on next use. Frees ~1.5 GB of disk for the current AI
    - * bundle; the user pays a full re-download next time they touch
    + * appears on next use. Frees roughly 1.2 GB (Compact) / 1.9 GB

      * (release RAM, keep the downloaded weights cached on disk so the
      * next use warm-loads in seconds) and a destructive "Delete cached
    - * models" (also evict the CacheStorage bytes, ~1.5 GB).
    + * models" (also evict the CacheStorage bytes — roughly 1.2 GB on the

      // digits/emails instead of copying from the retrieved chunk.
      // Sticking to 1.2B keeps the discipline guarantee.
    - approxSizeBytes: Math.round(1.2 * 1024 * 1024 * 1024),
    + approxSizeBytes: Math.round(810 * 1024 * 1024),

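The new registry value in the last hunk is written in binary multiples (`810 * 1024 * 1024`), while the commit message quotes `model_q4.onnx_data` as 850 MB. One possible reading, offered here as an assumption rather than something the PR states, is that the two figures describe nearly the same byte count in different units. A minimal sketch of the conversion:

```typescript
// Value copied from the diff above; the unit interpretation is an assumption.
const BYTES_PER_MIB = 1024 * 1024;
const BYTES_PER_DECIMAL_MB = 1_000_000;

const registryBytes = Math.round(810 * BYTES_PER_MIB);

const asMiB = registryBytes / BYTES_PER_MIB;
const asDecimalMB = registryBytes / BYTES_PER_DECIMAL_MB;

console.log(registryBytes);          // 849346560
console.log(asMiB.toFixed(1));       // 810.0
console.log(asDecimalMB.toFixed(1)); // 849.3
```

If that reading holds, a label formatter that divides by 1024 would show roughly 810 for a file that a decimal listing reports as roughly 850.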
Summary
Promotes the size-label fix from `dev` to `main`. Single change in flight: registry `approxSizeBytes` corrected against the actual HuggingFace file listings so the consent gate's "X GB total" matches what users actually download.
Prose updated in README, docs/local-ai.md, tool-registry, ChatModelPicker, AiModelDetailsModal, ai-runtime, useRagModels, and the AskPdf timing-weight comment to match.
Came in via #85.
Test plan
- `vp check` passes (format + lint + type-check)
- `vp test` passes (unit suite)
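The manual comparison against file listings described in the summary could, in principle, be backed by a small unit check that flags registry sizes drifting too far from measured ones. The sketch below is hypothetical: `relativeDrift`, `findDrifted`, the 5% tolerance, and the `chat-compact` / `embedder` keys are invented for illustration, and the measured sizes would have to be supplied by whoever runs the check.

```typescript
const MEBIBYTE = 1024 * 1024;

// Relative difference between a declared size and a measured one.
function relativeDrift(declaredBytes: number, actualBytes: number): number {
  return Math.abs(declaredBytes - actualBytes) / actualBytes;
}

// Returns the ids whose declared size is missing a measurement
// or differs from it by more than `tolerance` (a fraction, e.g. 0.05).
function findDrifted(
  declared: Record<string, number>,
  actual: Record<string, number>,
  tolerance = 0.05,
): string[] {
  return Object.entries(declared)
    .filter(([id, declaredBytes]) => {
      const actualBytes = actual[id];
      return actualBytes === undefined || relativeDrift(declaredBytes, actualBytes) > tolerance;
    })
    .map(([id]) => id);
}

// Illustrative inputs based on the figures in this PR (keys are hypothetical).
const measured = { "chat-compact": 810 * MEBIBYTE, embedder: 320 * MEBIBYTE };
const beforeFix = { "chat-compact": 1.2 * 1024 * MEBIBYTE, embedder: 320 * MEBIBYTE };
const afterFix = { "chat-compact": 810 * MEBIBYTE, embedder: 320 * MEBIBYTE };

const flaggedBefore = findDrifted(beforeFix, measured);
const flaggedAfter = findDrifted(afterFix, measured);

console.log(flaggedBefore); // [ 'chat-compact' ]
console.log(flaggedAfter);  // []
```

A tolerance-based comparison is used here because the registry values are approximations by design; an exact-match check would fail on every upstream re-export of the weights.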