Create a website chatbot framework that runs LLM inference in the browser using WebLLM (WebGPU). This enables small LLMs to run locally on any WebGPU-capable device (PCs and smartphones) and improves user privacy by keeping interactions on-device.
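Since the goal depends on the device actually exposing WebGPU, a capability check would likely be the first thing an embedding component runs. The following is a minimal sketch, not part of WebLLM: the names `NavigatorLike`, `GpuLike`, `Backend`, `hasWebGpuApi`, and `chooseBackend` are hypothetical, and the only real browser API assumed is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable adapter is available.

```typescript
// Minimal structural types so the sketch compiles without WebGPU typings.
// These interfaces are illustrative assumptions, not library types.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Hypothetical labels for where inference would run.
type Backend = "local-webgpu" | "server-fallback";

// Cheap synchronous check: is the WebGPU entry point exposed at all?
function hasWebGpuApi(nav: NavigatorLike | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined;
}

// Full check: the API can exist while no usable adapter is available,
// so an adapter is requested before committing to local inference.
async function chooseBackend(nav: NavigatorLike | undefined): Promise<Backend> {
  if (nav === undefined || nav.gpu === undefined) {
    return "server-fallback";
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null ? "local-webgpu" : "server-fallback";
  } catch (err) {
    // Treat any failure while probing the GPU as "cannot run locally".
    return "server-fallback";
  }
}
```

In a browser, the global `navigator` would be passed in (a cast may be needed if the TypeScript DOM typings in use do not declare `gpu`). Taking the navigator as a parameter keeps the check testable outside a browser.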
Key challenges
Content size may exceed in-browser memory limits.
Ensuring response accuracy and relevance.
Supporting multiple languages.
Seamless integration with existing sites: avoid a separate crawler-based chat service. Prefer a library that exports chatbot components (Vue, React, plain HTML/JS) for easy embedding.
Possible approach
Provide a lightweight embeddable component (Vue/React/vanilla JS) that accepts site content via an API (e.g., indexed snippets, DOM extraction, or developer-provided documents) rather than crawling externally.
Implement streaming/chunked loading and retrieval-augmented generation (RAG) to handle large content within memory constraints.
Offer configurable model size and precision, with a fallback to server-side inference for devices that cannot run the model locally.
Include multilingual tokenizers and language detection with automatic selection of the appropriate model or prompt templates.
Provide tools for indexing site content (on demand or at build time) and secure, privacy-preserving options for opt-in server-side augmentation.
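The chunked-loading and RAG items above can be sketched as follows. This is a minimal illustration under stated assumptions: site content arrives as plain text, retrieval uses simple keyword overlap instead of embeddings, and the prompt budget is counted in characters, not model tokens. All names (`Chunk`, `chunkText`, `retrieve`, `buildPrompt`) are hypothetical and not part of WebLLM.

```typescript
// Hypothetical chunk record produced by the indexing step.
interface Chunk {
  id: number;
  text: string;
}

// ASCII-only tokenizer for brevity. A multilingual deployment would need
// Unicode-aware segmentation instead (an assumption left out of this sketch).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Split content into fixed-size word windows so that only small pieces,
// never the whole site, have to be placed in the prompt.
function chunkText(text: string, maxWords: number): Chunk[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: Chunk[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push({ id: chunks.length, text: words.slice(i, i + maxWords).join(" ") });
  }
  return chunks;
}

// Score = number of distinct query terms that occur in the chunk.
function score(queryTerms: string[], chunk: Chunk): number {
  const chunkTerms = tokenize(chunk.text);
  return queryTerms.filter((t) => chunkTerms.indexOf(t) !== -1).length;
}

// Rank chunks by score, then greedily keep those that fit the character budget.
function retrieve(query: string, chunks: Chunk[], maxChars: number): Chunk[] {
  const all = tokenize(query);
  const queryTerms = all.filter((t, i) => all.indexOf(t) === i);
  const ranked = chunks
    .map((c) => ({ chunk: c, s: score(queryTerms, c) }))
    .filter((x) => x.s > 0)
    .sort((a, b) => b.s - a.s || a.chunk.id - b.chunk.id);
  const picked: Chunk[] = [];
  let used = 0;
  ranked.forEach((x) => {
    if (used + x.chunk.text.length <= maxChars) {
      picked.push(x.chunk);
      used += x.chunk.text.length;
    }
  });
  return picked;
}

// Assemble the text that would be handed to the in-browser model.
function buildPrompt(query: string, context: Chunk[]): string {
  const body = context.map((c) => c.text).join("\n---\n");
  return "Answer using only this site content:\n" + body + "\n\nQuestion: " + query;
}
```

A component could call `chunkText` once at build time or on first load, then call `retrieve` and `buildPrompt` per user question. Swapping the keyword score for embedding similarity would change only `score`.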
Note
Applied-research idea, not a GSoC topic yet.