ondevice-llm

A chat interface powered by Chrome's built-in Prompt API — running entirely on your device.

No server. No API key. No data leaving your machine.

What Is This?

ondevice-llm is a minimal, production-quality chat UI built on top of Chrome's experimental Prompt API — a browser-native interface to Gemini Nano, Google's on-device language model.

Everything runs locally. Inference happens on your GPU/NPU. No cloud round-trips, no per-token billing, and no prompts sent to a remote server.

Features

Feature	Details
Text chat	Typewriter streaming effect, copy & regenerate actions
Image input	Attach PNG / JPEG, ask questions about it
Voice input	Click to record, auto-submits audio to the model
Download progress	Live progress bar while Gemini Nano downloads
Send / Pause	Cancel generation mid-response
Timestamps	`HH:MM:SS` monospace timestamp on every message
Input locking	UI fully disabled until model is ready
Dark mode	Automatic via `prefers-color-scheme`
Single file build	Entire app inlines to one `index.html` via `vite-plugin-singlefile`
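
The Send / Pause row can be sketched with an `AbortController`, since `prompt()` accepts a `signal` option. This is a minimal sketch under assumptions, not necessarily how `src/main.js` does it: `createCancellablePrompt` is a hypothetical helper, `session` is assumed to come from `LanguageModel.create()`, and only one prompt is assumed to be in flight at a time.

```javascript
// Hypothetical helper: wraps a session so the in-flight prompt can be cancelled.
function createCancellablePrompt(session) {
  let controller = null

  return {
    // Send a prompt; resolves to the reply, or null if the user cancelled.
    async send(input) {
      controller = new AbortController()
      try {
        return await session.prompt(input, { signal: controller.signal })
      } catch (err) {
        // An aborted prompt rejects with an AbortError; treat it as "no reply".
        if (err && err.name === 'AbortError') return null
        throw err
      } finally {
        controller = null
      }
    },
    // Cancel the in-flight prompt, if there is one.
    cancel() {
      if (controller) controller.abort()
    },
  }
}
```

A Send button would call `send()` and a Pause button would call `cancel()`; a `null` result tells the UI to keep whatever was already rendered.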

Demo

rahuldas-dev.github.io/ondevice-llm

Note: Requires Chrome Canary with the Prompt API flag enabled. See Setup below.

Setup

1. Enable the Prompt API in Chrome (optional, in case the LLM is crashing)

The API is experimental and requires Chrome Canary:

Download Chrome Canary
Go to `chrome://flags`
Search for "Prompt API for Gemini Nano" → set it to "Enabled BypassPerfRequirement"
Go to `chrome://components` → find "Optimization Guide On Device Model" → click "Check for update"
Restart Chrome

The model is ~1–2 GB and downloads in the background. The UI shows a progress bar while it loads.
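
Before unlocking the input, a page can ask the API whether the model is ready. The sketch below assumes `LanguageModel.availability()` resolves to one of `'unavailable'`, `'downloadable'`, `'downloading'` or `'available'`; these state names have changed between Chrome builds, so treat them as an assumption. `checkModel` and its status strings are hypothetical, not taken from the app.

```javascript
// Hypothetical helper: maps the API's availability state to a simple UI state.
// `LM` defaults to the global LanguageModel; passing it in keeps this testable.
async function checkModel(LM = globalThis.LanguageModel) {
  if (!LM) {
    return { ready: false, status: 'Prompt API not found in this browser' }
  }
  const availability = await LM.availability()
  switch (availability) {
    case 'available':
      return { ready: true, status: 'Model ready' }
    case 'downloading':
      return { ready: false, status: 'Model is downloading' }
    case 'downloadable':
      return { ready: false, status: 'Model will download on first use' }
    default:
      // 'unavailable' or any state this sketch does not know about
      return { ready: false, status: 'Model unavailable on this device' }
  }
}
```

The UI would keep the input disabled while `ready` is `false` and show `status` next to the progress bar.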

2. Run Locally

git clone https://github.com/RahulDas-dev/ondevice-llm.git
cd ondevice-llm
npm install
npm run dev

Open http://localhost:5173 in Chrome Canary.

3. Build for Production

npm run build
# Output: dist/index.html (single inlined file)

How It Works

Core API Usage

// Create a session with multimodal support
const session = await LanguageModel.create({
  expectedInputs: [
    { type: 'text' },
    { type: 'image' },
    { type: 'audio' },
  ],
  expectedOutputs: [{ type: 'text' }],
  monitor(m) {
    m.addEventListener('downloadprogress', (e) => {
      console.log(`Downloading: ${(e.loaded / e.total * 100).toFixed(1)}%`)
    })
  }
})

// Text prompt
const text = await session.prompt('Explain quantum entanglement simply.')

// Image prompt
const blob = await fetch('photo.jpg').then(r => r.blob())
const description = await session.prompt([{
  role: 'user',
  content: [
    { type: 'text',  value: 'What is in this image?' },
    { type: 'image', value: blob },
  ]
}])

// Always destroy when done
session.destroy()

Architecture

index.html          ← single entry point, SVG icons in <template>
src/
  main.js           ← all app logic (vanilla JS, no framework)
  style.css         ← CSS custom properties, dark mode, animations
vite.config.js      ← viteSingleFile plugin, base path for GitHub Pages
.github/
  workflows/
    deploy.yaml     ← build → gh-pages branch → GitHub Pages

No React. No Vue. No dependencies at runtime. Just the browser and the model.
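
For orientation, a `vite.config.js` matching the description above might look roughly like this. It is a sketch, not the repository's actual file: the `base` value is an assumption derived from the demo URL, and the plugin is used with its default options.

```javascript
// vite.config.js (sketch; the real file may differ)
import { defineConfig } from 'vite'
import { viteSingleFile } from 'vite-plugin-singlefile'

export default defineConfig({
  // Assumed base path for a GitHub Pages project site served from /ondevice-llm/
  base: '/ondevice-llm/',
  // Inlines the bundled JS and CSS into dist/index.html
  plugins: [viteSingleFile()],
})
```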

Known Limitations

The Prompt API is experimental. Here's what to expect:

Session crashes: Chrome's on-device model process can crash if multiple tabs use `LanguageModel` simultaneously. The app handles this by calling `session.destroy()` on `beforeunload` and by retrying automatically with exponential backoff (up to 3 attempts).
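
Retry with exponential backoff can be sketched as below. `withRetry` and its defaults (3 attempts, 500 ms base delay) are hypothetical; only the limit of 3 attempts comes from the description above, and the app's real delays are not documented here.

```javascript
// Hypothetical helper: retry an async operation with exponential backoff.
async function withRetry(operation, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation(attempt)
    } catch (err) {
      lastError = err
      if (attempt < attempts - 1) {
        // Wait base, then 2x base, then 4x base, ... before the next try
        const delay = baseDelayMs * 2 ** attempt
        await new Promise((resolve) => setTimeout(resolve, delay))
      }
    }
  }
  // Every attempt failed: surface the last error to the caller
  throw lastError
}

// Usage sketch (browser only):
//   const session = await withRetry(() => LanguageModel.create())
//   window.addEventListener('beforeunload', () => session.destroy())
```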

Streaming instability: `promptStreaming()` behaviour varies across Chrome builds (cumulative vs. delta chunks, an empty final tick). This app uses `prompt()` with a typewriter effect instead.
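
A typewriter effect over a complete `prompt()` result could be sketched like this. `typewriter`, `chunkSize` and `delayMs` are hypothetical names and values; the app's actual timing may differ.

```javascript
// Hypothetical helper: reveal `text` a few characters at a time.
// `render` receives the visible prefix on every tick. `chunkSize` must be >= 1.
async function typewriter(text, render, { chunkSize = 3, delayMs = 15 } = {}) {
  // Split by code point so emoji and other surrogate pairs are never cut in half
  const chars = Array.from(text)
  for (let end = chunkSize; ; end += chunkSize) {
    render(chars.slice(0, end).join(''))
    if (end >= chars.length) return
    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }
}

// Usage sketch (browser only):
//   const reply = await session.prompt('Hello')
//   await typewriter(reply, (visible) => { bubble.textContent = visible })
```

Because the full reply is known before rendering starts, the effect does not depend on whether a given Chrome build streams cumulative or delta chunks.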

Audio support: `{ type: 'audio' }` is the newest addition to the API and may not work on all Chrome Canary builds. The app falls back gracefully to text-only.
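
One way to implement such a fallback is to try the multimodal session first and retry with text-only options if creation fails. This is a sketch under assumptions: `createSessionWithFallback` is a hypothetical helper, and it treats any rejection from `create()` as a reason to fall back, since the exact error a given build raises is not documented here.

```javascript
// Hypothetical helper: try multimodal first, then fall back to text-only.
// `LM` defaults to the global LanguageModel; passing it in keeps this testable.
async function createSessionWithFallback(LM = globalThis.LanguageModel) {
  const optionSets = [
    { expectedInputs: [{ type: 'text' }, { type: 'image' }, { type: 'audio' }] },
    { expectedInputs: [{ type: 'text' }] },
  ]
  let lastError
  for (const options of optionSets) {
    try {
      const session = await LM.create(options)
      // Report which input types the session was actually created with
      return { session, inputs: options.expectedInputs.map((i) => i.type) }
    } catch (err) {
      lastError = err
    }
  }
  throw lastError
}
```

The returned `inputs` list lets the UI hide the microphone and attachment buttons when only `'text'` is available.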

Hardware requirements: Gemini Nano needs a capable GPU and ≥8 GB RAM. The model process will be killed on underpowered hardware.

Tech Stack


Layer	Technology
Runtime	Chrome Prompt API (Gemini Nano)
Frontend	Vanilla JS, CSS custom properties
Build	Vite + vite-plugin-singlefile
Deploy	GitHub Actions → GitHub Pages

Deployment

Every push to main triggers the build pipeline:

git push origin main
  → npm ci + npm run build
  → dist/ pushed to gh-pages branch
  → GitHub Pages serves the live site

Workflow file: .github/workflows/deploy.yaml

Contributing

Issues and PRs are welcome. If you're experimenting with the Prompt API and hit something interesting (or broken), open an issue — happy to compare notes.

License

Built with the Chrome Prompt API · Deployed on GitHub Pages
