Solve a Rubik's cube from two phone photos, a pasted state string, or a flat-net image.
Live demo: https://jeffhuber.github.io/cube-snap/
This is a fork of the algorithmic core from
jeffhuber/rubiks-solver (a
flat-net image solver that's at v0.4.4 and considered shipped). cube-snap is
the more ambitious "real-camera capture" follow-up:
- v0.0.1: scaffold
- v0.1: camera capture shell, on-screen alignment guide, paste/upload/share entry points, PWA assets, and cloud recognizer wiring are in progress on main
- v0.2: automatic cube-face detection (replaces the alignment guide)
Point your phone at a scrambled Rubik's cube held at an isometric angle,
showing three faces (U + R + F). Snap one photo. Rotate the cube 180°,
snap a second photo (D + L + B). The current app captures those two photos
with a fixed alignment overlay and sends them to the cloud recognizer by
default; ?recognizer=local still exercises the in-browser rigid-overlay
sampler for offline/debug work. The classical cv-local recognizer is exposed
through Fixer for corpus/debugging workflows. The resulting 54-character state
feeds the existing Kociemba solver for a step-by-step solution shown on a 3D
cube.
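The assembly of the two captures into one state string can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the 54 characters use the Kociemba facelet convention (faces in U, R, F, D, L, B order, nine letters each, every letter naming the face whose centre colour the sticker matches). The helper names assembleState and validateState are hypothetical.

```typescript
// Assumed face order (Kociemba facelet convention): U, R, F, D, L, B.
const FACE_ORDER = ["U", "R", "F", "D", "L", "B"] as const;
type Face = (typeof FACE_ORDER)[number];

// Nine sticker letters per face, read row by row.
type FaceStickers = Record<Face, string>;

// Hypothetical helper: merge the faces seen in photo 1 (U + R + F)
// and photo 2 (D + L + B) into a single 54-character string.
function assembleState(
  first: Pick<FaceStickers, "U" | "R" | "F">,
  second: Pick<FaceStickers, "D" | "L" | "B">,
): string {
  const faces: FaceStickers = { ...first, ...second };
  return FACE_ORDER.map((face) => faces[face]).join("");
}

// Hypothetical helper: cheap structural checks to run before solving.
// Returns a list of problems; an empty list means the string is well formed.
function validateState(state: string): string[] {
  const problems: string[] = [];
  if (state.length !== 54) {
    problems.push(`expected 54 characters, got ${state.length}`);
    return problems;
  }
  FACE_ORDER.forEach((face, i) => {
    // Each face letter must appear exactly nine times.
    const count = state.split(face).length - 1;
    if (count !== 9) {
      problems.push(`${face} appears ${count} times, expected 9`);
    }
    // The centre sticker (index 4 within each block of 9) defines the face.
    const centre = state[i * 9 + 4];
    if (centre !== face) {
      problems.push(`centre of face ${face} is ${centre}`);
    }
  });
  return problems;
}
```

Passing these checks only shows that the string is well formed. A state can still be unsolvable (for example a single twisted corner), which only the solver or a full parity check would reveal.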
npm install
npm run dev # http://localhost:5173/cube-snap/
npm test # vitest
npm run build # production bundle

Cubesnap's photo recognizer calls a multimodal LLM via a small /api/recognize endpoint deployed to Vercel. The static site stays on GitHub Pages and calls the API via CORS.
- Connect this repo to a Vercel project (already done; the deployed recognizer function is reachable at https://cube-snap-liard.vercel.app/api/recognize, which is what the frontend defaults to in src/Fixer.tsx and src/recognizeCubeRemote.ts). The bare cube-snap.vercel.app host currently serves the SPA HTML at /api/recognize, not the function; if you're configuring a local client or operator-facing override (VITE_RECOGNIZER_ENDPOINT=...), use the cube-snap-liard host. See cube-snap#106 for the history.
- In Vercel → Settings → Environment Variables, set:

  | Variable | Required for |
  | --- | --- |
  | GEMINI_KEY | Gemini providers (default gemini-flash-lite) |
  | ANTHROPIC_KEY | claude-sonnet provider |
  | OPENAI_KEY | Direct gpt-5 provider |
  | OPENROUTER_KEY | OpenRouter-backed providers (or-* aliases) |
  | RECOGNIZER_DEFAULT | Optional. Provider name to use when client doesn't pass ?provider=. Default: gemini-flash-lite |

- The vercel.json config skips the Vite build (the static site is on GitHub Pages) and only deploys the /api/ directory as functions.
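The relationship between the provider name and the environment variables above could be expressed roughly like this. It is a sketch under assumptions, not the deployed function's code: the helper names are hypothetical, the provider id gpt-5 for the direct OpenAI provider is inferred from the variable list, and the environment is passed in as a plain object so the logic stays testable.

```typescript
type Env = Record<string, string | undefined>;

// Documented fallback when neither the request nor RECOGNIZER_DEFAULT names a provider.
const FALLBACK_PROVIDER = "gemini-flash-lite";

// Hypothetical helper: ?provider= wins, then RECOGNIZER_DEFAULT, then the fallback.
function resolveProvider(queryProvider: string | null, env: Env): string {
  return queryProvider || env.RECOGNIZER_DEFAULT || FALLBACK_PROVIDER;
}

// Hypothetical helper: which API key a provider needs, per the variable list.
function requiredKeyFor(provider: string): string | undefined {
  // Check the OpenRouter prefix first: or-gemini-pro must not match the Gemini rule.
  if (provider.startsWith("or-")) return "OPENROUTER_KEY";
  if (provider.startsWith("gemini-")) return "GEMINI_KEY";
  if (provider === "claude-sonnet") return "ANTHROPIC_KEY";
  if (provider === "gpt-5") return "OPENAI_KEY"; // assumed id for the direct provider
  return undefined; // unknown provider
}

// Hypothetical helper: name of the key that is required but not set, if any.
function missingKey(provider: string, env: Env): string | undefined {
  const key = requiredKeyFor(provider);
  if (key === undefined) return undefined;
  return env[key] ? undefined : key;
}
```

A check like missingKey lets a function fail early with a clear message instead of surfacing an opaque authentication error from the upstream model API.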
Frontend: ?recognizer=local falls back to the in-browser CV pipeline.
?provider=<id> overrides the server's default provider for one request.
Useful IDs:
- gemini-flash-lite (cheapest, ~4s, decent accuracy)
- gemini-flash (~7s)
- claude-sonnet (~5s, best structural accuracy)
- or-claude-sonnet, or-gemini-pro, or-gpt5, etc.: same models via OpenRouter
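For a local client, the per-request override amounts to appending a provider query parameter to the recognizer endpoint. The snippet below is a minimal sketch: recognizerUrl is a hypothetical helper, and the request body format is deliberately left out because it is not described here.

```typescript
// Deployed recognizer function documented above.
const DEFAULT_ENDPOINT = "https://cube-snap-liard.vercel.app/api/recognize";

// Hypothetical helper: build the endpoint URL, optionally pinning a provider.
// With no provider, the server falls back to its own default.
function recognizerUrl(
  provider?: string,
  endpoint: string = DEFAULT_ENDPOINT,
): string {
  if (!provider) return endpoint;
  // Respect an endpoint override that already carries a query string.
  const separator = endpoint.includes("?") ? "&" : "?";
  return `${endpoint}${separator}provider=${encodeURIComponent(provider)}`;
}
```

For example, recognizerUrl("claude-sonnet") yields the default endpoint with ?provider=claude-sonnet appended, while recognizerUrl() returns the endpoint unchanged.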
GEMINI_KEY=... ANTHROPIC_KEY=... npx tsx api/eval/compare.ts
# or specific providers:
npx tsx api/eval/compare.ts --providers=claude-sonnet,gemini-flash-lite

Reads photos from /tmp/cube-photos/ (override with PHOTO_DIR=), sends each pair through the chosen provider(s), and reports validity + solvability per cube.
Vite · React 19 · TypeScript · Vitest · cubejs (Kociemba) · three.js + react-three-fiber + drei · GitHub Pages · Vercel Functions (LLM recognizer)
MIT — see LICENSE.