fix(ai): correct on-device model size labels by sumitsahoo · Pull Request #86 · cloakyard/cloakpdf

sumitsahoo · 2026-05-17T12:46:05Z

Summary

Promotes the size-label fix from dev to main. Single change in flight: registry approxSizeBytes corrected against the actual HuggingFace file listings so the consent gate's "X GB total" matches what users actually download.

Compact tier: 1.13 GB total → displays "≈ 1.1 GB" (was implicitly 1.5 GB)
Quality tier: 1.88 GB total → displays "≈ 1.9 GB" (was 1.8 GB)
LFM2.5-1.2B (q4): 1.2 GB → 810 MB
LFM2-2.6B (q4f16): 1.5 GB → 1.55 GB
EmbeddingGemma q8: 309 MB → 320 MB (the 26 MB Gemma SentencePiece tokenizer was previously missing from the estimate)
Embed peak RAM: 400 MB → 500 MB (int8 dequant overhead)

Prose updated in README, docs/local-ai.md, tool-registry, ChatModelPicker, AiModelDetailsModal, ai-runtime, useRagModels, and the AskPdf timing-weight comment to match.

Came in via #85.

Test plan

vp check passes (format + lint + type-check)
vp test passes (unit suite)
Manual smoke on production preview: open Ask PDF on a fresh browser profile and confirm the gate's aggregate label matches the picker's per-tile sum on each tier

Registry `approxSizeBytes` values had drifted from the real on-disk weights — the Quality tier was showing "≈ 1.8 GB total" in the gate when the actual download is ~1.88 GB, and the Compact tier overstated by ~390 MB on the chat model alone. Verified against HuggingFace file listings for each pinned dtype: - LFM2.5-1.2B (q4): 1.2 GB → 810 MB (model_q4.onnx_data is 850 MB) - LFM2-2.6B (q4f16): 1.5 GB → 1.55 GB - EmbeddingGemma q8: 309 MB → 320 MB (was missing the 26 MB Gemma SentencePiece tokenizer) - Embed peak RAM: 400 MB → 500 MB (int8 dequant overhead) Aggregate now reads "≈ 1.1 GB total" on Compact and "≈ 1.9 GB total" on Quality. Prose updated in README, docs/local-ai.md, tool-registry, ChatModelPicker, AiModelDetailsModal, ai-runtime, useRagModels, and the AskPdf timing-weight comment.

feat(ai): on-device hybrid RAG for Ask your PDF

cloudflare-workers-and-pages · 2026-05-17T12:46:11Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Preview URL	Updated (UTC)
✅ Deployment successful! View logs	cloakpdf	`db278fa`	Commit Preview URL Branch Preview URL	May 17 2026, 12:45 PM

Copilot

Pull request overview

Updates the Ask PDF on-device AI model size metadata and related prose so consent/download labels better reflect the actual model bundle sizes.

Changes:

Corrects approxSizeBytes and embed peak RAM metadata in the AI model registry.
Updates user-facing README/docs and inline comments for Compact/Quality bundle sizes.
Refreshes picker/modal/tool copy to match the revised model footprints.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`src/utils/ai-models.ts`	Updates AI registry model size/RAM metadata and explanatory comments.
`src/utils/ai-runtime.ts`	Updates cache eviction size documentation.
`src/hooks/useRagModels.ts`	Updates full-evict size documentation.
`src/tools/AskPdf.tsx`	Updates indexing progress comment for embedder size.
`src/config/tool-registry.ts`	Updates Ask PDF tool registry comments with revised tier/bundle sizes.
`src/components/ChatModelPicker.tsx`	Updates picker documentation for chat tier sizes.
`src/components/AiModelDetailsModal.tsx`	Updates modal deletion/storage documentation.
`README.md`	Updates public Ask PDF feature and Local AI size descriptions.
`docs/local-ai.md`	Updates implementation docs and cache diagram with revised model bundle sizes.

 * Evict the Transformers.js model bytes from the browser's
- * CacheStorage. Frees ~1.5 GB of disk for the current AI bundle
- * (chat + embed + rerank) and forces a fresh download on next use.
+ * CacheStorage. Frees roughly 1.2 GB on the Compact tier / 1.9 GB on


   * `cloakpdf:ai-model-ready:*` flag so the consent dialog re-
-   * appears on next use. Frees ~1.5 GB of disk for the current AI
-   * bundle; the user pays a full re-download next time they touch
+   * appears on next use. Frees roughly 1.2 GB (Compact) / 1.9 GB


 * (release RAM, keep the downloaded weights cached on disk so the
 * next use warm-loads in seconds) and a destructive "Delete cached
- * models" (also evict the CacheStorage bytes, ~1.5 GB).
+ * models" (also evict the CacheStorage bytes — roughly 1.2 GB on the


  // digits/emails instead of copying from the retrieved chunk.
  // Sticking to 1.2B keeps the discipline guarantee.
-  approxSizeBytes: Math.round(1.2 * 1024 * 1024 * 1024),
+  approxSizeBytes: Math.round(810 * 1024 * 1024),


sumitsahoo added 2 commits May 17, 2026 18:13

Merge pull request #85 from cloakyard/feature/ai-local-hybrid-rag

db278fa

feat(ai): on-device hybrid RAG for Ask your PDF

Copilot AI review requested due to automatic review settings May 17, 2026 12:46

Copilot started reviewing on behalf of sumitsahoo May 17, 2026 12:46 View session

Copilot AI reviewed May 17, 2026

View reviewed changes

sumitsahoo merged commit 977a534 into main May 17, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(ai): correct on-device model size labels#86

fix(ai): correct on-device model size labels#86
sumitsahoo merged 2 commits into
mainfrom
dev

sumitsahoo commented May 17, 2026

Uh oh!

cloudflare-workers-and-pages Bot commented May 17, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

sumitsahoo commented May 17, 2026

Summary

Test plan

Uh oh!

cloudflare-workers-and-pages Bot commented May 17, 2026

Deploying with Cloudflare Workers

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants