jeffhuber/cube-snap

cube-snap

Solve a Rubik's cube from two phone photos, a pasted state string, or a flat-net image.

Live demo: https://jeffhuber.github.io/cube-snap/

Status

This is a fork of the algorithmic core from jeffhuber/rubiks-solver (a flat-net image solver that's at v0.4.4 and considered shipped). cube-snap is the more ambitious "real-camera capture" follow-up:

  • v0.0.1 — scaffold
  • v0.1 — camera capture shell, on-screen alignment guide, paste/upload/share entry points, PWA assets, and cloud recognizer wiring are in progress on main
  • v0.2 — automatic cube-face detection (replaces the alignment guide)

How it works

Point your phone at a scrambled Rubik's cube held at an isometric angle, showing three faces (U + R + F). Snap one photo. Rotate the cube 180°, snap a second photo (D + L + B). The current app captures those two photos with a fixed alignment overlay and sends them to the cloud recognizer by default; ?recognizer=local still exercises the in-browser rigid-overlay sampler for offline/debug work. The classical cv-local recognizer is exposed through Fixer for corpus/debugging workflows. The resulting 54-character state feeds the existing Kociemba solver for a step-by-step solution shown on a 3D cube.
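Before a recognized state reaches the solver, it can be sanity-checked cheaply. The sketch below is illustrative and not taken from the repo: it assumes the 54 characters use the U, R, F, D, L, B face order (nine stickers per face) that cubejs accepts, and checkFaceletString is a hypothetical helper name. Passing these checks does not prove the cube is solvable; it only catches obviously bad recognizer output.

```typescript
// Assumed face order: U, R, F, D, L, B, nine stickers each (Kociemba-style
// facelet layout as accepted by cubejs). Hypothetical helper, not repo code.
const FACES = ["U", "R", "F", "D", "L", "B"];

// Returns a list of problems; an empty list means the basic checks pass.
function checkFaceletString(state: string): string[] {
  const problems: string[] = [];
  if (state.length !== 54) {
    problems.push(`expected 54 characters, got ${state.length}`);
    return problems;
  }

  // Count how often each character appears.
  const counts: Record<string, number> = {};
  for (const ch of state.split("")) {
    counts[ch] = (counts[ch] ?? 0) + 1;
  }

  // Any character outside the six face letters is an error.
  for (const ch of Object.keys(counts)) {
    if (FACES.indexOf(ch) === -1) {
      problems.push(`unexpected character "${ch}"`);
    }
  }

  FACES.forEach((face, i) => {
    const n = counts[face] ?? 0;
    if (n !== 9) {
      problems.push(`face ${face} appears ${n} times, expected 9`);
    }
    // The center sticker sits at offset 4 inside each block of nine and
    // never moves, so it should match the face letter of its block.
    const center = state[i * 9 + 4];
    if (center !== face) {
      problems.push(`center of block ${i} is ${center}, expected ${face}`);
    }
  });

  return problems;
}
```

A state that passes could then be handed to the solver; one that fails is a candidate for a retake or manual correction.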

Run locally

npm install
npm run dev      # http://localhost:5173/cube-snap/
npm test         # vitest
npm run build    # production bundle

Cloud LLM recognizer (Vercel)

cube-snap's photo recognizer calls a multimodal LLM via a small /api/recognize endpoint deployed to Vercel. The static site stays on GitHub Pages and calls the API via CORS.

Setup

  1. Connect this repo to a Vercel project. This is already done: the deployed recognizer function is reachable at https://cube-snap-liard.vercel.app/api/recognize, which is the default the frontend uses in src/Fixer.tsx and src/recognizeCubeRemote.ts. The bare cube-snap.vercel.app host currently serves the SPA HTML at /api/recognize rather than the function, so if you're configuring a local client or an operator-facing override (VITE_RECOGNIZER_ENDPOINT=...), use the cube-snap-liard host. See cube-snap#106 for the history.

  2. In Vercel → Settings → Environment Variables, set:

    Variable              Required for
    GEMINI_KEY            Gemini providers (default gemini-flash-lite)
    ANTHROPIC_KEY         claude-sonnet provider
    OPENAI_KEY            Direct gpt-5 provider
    OPENROUTER_KEY        OpenRouter-backed providers (or-* aliases)
    RECOGNIZER_DEFAULT    Optional. Provider name to use when client doesn't pass ?provider=. Default: gemini-flash-lite

  3. The vercel.json config skips the Vite build (the static site is on GitHub Pages) and only deploys the /api/ directory as functions.
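The relationship between provider names and environment variables can be sketched as a small lookup. This is an illustration only, not the code in /api/: the function names are hypothetical, and the "gpt-" prefix for the direct OpenAI provider is an assumption, since the exact provider ID is not listed here.

```typescript
// Hypothetical helpers showing how a provider ID maps to the key it needs.
type KeyName = "GEMINI_KEY" | "ANTHROPIC_KEY" | "OPENAI_KEY" | "OPENROUTER_KEY";
type Env = Record<string, string | undefined>;

// Pick the provider: an explicit ?provider= value wins, then
// RECOGNIZER_DEFAULT, then the documented default gemini-flash-lite.
function resolveProvider(requested: string | null, env: Env): string {
  return requested || env.RECOGNIZER_DEFAULT || "gemini-flash-lite";
}

// Map a provider ID to the environment variable it requires.
function requiredKeyFor(provider: string): KeyName | undefined {
  if (/^or-/.test(provider)) return "OPENROUTER_KEY"; // or-* aliases
  if (/^gemini-/.test(provider)) return "GEMINI_KEY";
  if (provider === "claude-sonnet") return "ANTHROPIC_KEY";
  if (/^gpt-/.test(provider)) return "OPENAI_KEY"; // assumed ID shape
  return undefined; // unknown provider
}

// Report which key is missing for a provider, if any.
function missingKey(provider: string, env: Env): KeyName | undefined {
  const key = requiredKeyFor(provider);
  return key !== undefined && !env[key] ? key : undefined;
}
```

With only GEMINI_KEY set, the default provider works, while a request for claude-sonnet would report ANTHROPIC_KEY as missing.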

Switching providers

On the frontend, ?recognizer=local falls back to the in-browser CV pipeline, and ?provider=<id> overrides the server's default provider for one request. Useful IDs:

  • gemini-flash-lite (cheapest, ~4s, decent accuracy)
  • gemini-flash (~7s)
  • claude-sonnet (~5s, best structural accuracy)
  • or-claude-sonnet, or-gemini-pro, or-gpt5, etc. — same models via OpenRouter
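As a rough sketch of how a client might combine these switches, the function below reads the page's query string and decides between the local recognizer and a cloud request URL. It is not the code in src/recognizeCubeRemote.ts: planRecognizer and RecognizerPlan are hypothetical names, and forwarding ?provider= from the page URL to the API call is an assumption about how the frontend behaves.

```typescript
// Default endpoint as documented in the setup notes.
const DEFAULT_ENDPOINT = "https://cube-snap-liard.vercel.app/api/recognize";

// Hypothetical result shape: either run locally or call the given URL.
interface RecognizerPlan {
  mode: "local" | "cloud";
  url?: string;
}

// pageSearch is the page's query string, e.g. window.location.search.
// endpointOverride stands in for a VITE_RECOGNIZER_ENDPOINT value.
function planRecognizer(pageSearch: string, endpointOverride?: string): RecognizerPlan {
  const params = new URLSearchParams(pageSearch);

  // ?recognizer=local skips the cloud call entirely.
  if (params.get("recognizer") === "local") {
    return { mode: "local" };
  }

  const url = new URL(endpointOverride || DEFAULT_ENDPOINT);

  // Forward an explicit provider; otherwise the server picks its default.
  const provider = params.get("provider");
  if (provider) {
    url.searchParams.set("provider", provider);
  }
  return { mode: "cloud", url: url.toString() };
}
```

For example, a page opened with ?provider=claude-sonnet would plan a cloud request to the default endpoint with that provider attached, while ?recognizer=local would never touch the network.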

Eval harness

GEMINI_KEY=... ANTHROPIC_KEY=... npx tsx api/eval/compare.ts
# or specific providers:
npx tsx api/eval/compare.ts --providers=claude-sonnet,gemini-flash-lite

Reads photos from /tmp/cube-photos/ (override with PHOTO_DIR=), sends each pair through the chosen provider(s), and reports validity + solvability per cube.

Tech stack

Vite · React 19 · TypeScript · Vitest · cubejs (Kociemba) · three.js + react-three-fiber + drei · GitHub Pages · Vercel Functions (LLM recognizer)

License

MIT — see LICENSE.
