A gestural synthesizer for live performance — play with your hands, mouth, and face.
Move your hands, open your mouth, smile, frown, point, swipe — a generative music brain responds in real time. Designed for stage: webcam in, low-latency audio out, no MIDI hardware needed. Cyberpunk visualizer · MediaPipe webcam tracking · Tone.js audio engine.
A web app + Electron desktop app that turns your body into an expressive musical instrument for live performance. No keys, no knobs to learn — your hands, head, mouth, and face are the interface. Setup is a webcam and a browser. The audio output works through any system route (laptop speakers, USB interface, BlackHole into a DAW, mixer).
📖 Install on macOS: INSTALL.md (🇮🇹 italiano) — detailed walkthrough including Gatekeeper, audio routing for stage, webcam choice tips.
- Hand position sweeps the master filter
- Hand height drives note density and brightness
- Hand openness controls reverb, delay, drive, and resonance
- Hand depth (toward / away from camera) raises and lowers the master volume
- Palm rotation modulates brightness and grit
- Per-finger curl — each of the 10 fingers maps to its own audio dimension (right hand: delay feedback, filter cutoff, reverb, brightness, delay wet; left hand: drive, Q, reverb extra, master volume, tremolo). Play your hands like an instrument.
- Pinch right triggers a harmony-aware lead stab
- Pinch left advances the chord progression
- Mouth open sweeps delay wet, filter cutoff, reverb, and brightness simultaneously
- Smile / frown / surprise / anger modulate brightness, cutoff, drive, and resonance — your face is the fourth controller
- Both fists mutes; both hands above head is the "drop"
The music brain stays inside the active scale — it is impossible to play a wrong note — and four built-in vibes (Tycho, Bonobo, Hopkins, Floating Points) set tonality, BPM, and timbre. The PATCH editor lets you override the key (12 chromatic roots) and scale (Major, Minor, Harmonic / Melodic Minor, the seven church modes, Pentatonic Major / Minor, Blues, Chromatic) independently of the vibe — keep the Bonobo style but play in C-minor, or run Tycho through Phrygian for a darker drift. Eight factory PATCH presets (LUSH, ACID, DUB, BRIGHT, DARK, TAPE, SPACE, INIT) flip whole-sound character with one click — each preset re-shapes the oscillators and envelopes of the pad/lead/bass voices on top of the FX chain, so they sound genuinely different (not just "same patch through different reverb"). Save your own patches to localStorage.
Intelligent voicing. Each pitched voice (pad/lead/bass) runs two oscillator stacks through a Tone.js CrossFade — the analog-synth "WAVE" knob feel. The PATCH editor's VOICE section gives you BOTH a waveform dropdown to explicitly pick the A-side oscillator per voice (13 options: sine, triangle, sawtooth, square, pulse, plus the fat/FM/AM variants) AND a Pad/Lead/Bass Morph dial to crossfade continuously between that pick and a clean morph destination (sine for pad, pulse for lead, triangle for bass). On top of that, a SMART toggle (ON by default) nudges each voice's morph by up to ±0.15 based on the running FX state — low cutoff biases the pad toward sine, high drive pulls everything toward harmonic-rich waveforms for the saturator to bite, high reverb pushes the pad glassy. Subtle, never an override — your knob always wins.
pnpm install
pnpm dev

Open http://localhost:5173, click Allow webcam and begin (or Permetti webcam e iniziare if your browser is set to Italian), then raise your hands.
For live shows, run pnpm build && pnpm preview instead — production mode, no HMR overhead, much steadier audio.
Tested in Chrome, Edge, Safari 15+, Firefox 114+. Webcam permission required. Full Mac install walkthrough including Gatekeeper, audio routing, and stage tips: INSTALL.md · INSTALL.it.md.
The entire interface is available in English and Italian. The browser
locale picks the starting language automatically (it, it-IT, it-CH →
Italian; everything else → English), and a small IT / EN pill in the
bottom-right HUD strip flips the UI live without a reload. Your choice is
remembered in localStorage. The in-app manual itself is fully translated —
see USER_MANUAL.md (English) and
USER_MANUAL.it.md (Italian).
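
The language selection rule above can be sketched as a pure function. This is a minimal illustration, not the app's actual code: `pickLanguage` and its parameters are hypothetical names, and the sketch assumes a stored choice always takes priority over the browser locale.

```typescript
type UiLanguage = "it" | "en";

// Hypothetical helper: a remembered choice wins, otherwise the browser locale decides.
function pickLanguage(browserLocale: string, stored: string | null): UiLanguage {
  if (stored === "it" || stored === "en") return stored;
  // "it", "it-IT" and "it-CH" all share the primary subtag "it".
  const primary = browserLocale.toLowerCase().split("-")[0];
  return primary === "it" ? "it" : "en";
}

console.log(pickLanguage("it-CH", null)); // "it"
console.log(pickLanguage("de-DE", null)); // "en"
console.log(pickLanguage("it-IT", "en")); // "en" (stored choice wins)
```

In a browser, the two inputs would typically come from `navigator.language` and `localStorage.getItem(...)`; the storage key the app uses is not stated here.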
# Windows .exe (NSIS installer with directory picker + shortcuts)
pnpm electron:build:win
# → release/HandSynth-Setup-0.1.0.exe (~87 MB)
# macOS .dmg (run on a Mac — cross-build from Windows is unreliable)
pnpm electron:build:mac
# → release/HandSynth-0.1.0-arm64.dmg + HandSynth-0.1.0-x64.dmg
# Dev with hot reload
pnpm electron:dev

See electron/BUILD.md for code-signing + notarization notes.
| Gesture | Audio target | Range |
|---|---|---|
| Hands distance (3D) | Filter cutoff | 200 Hz → 12 kHz, log |
| Mean hand height | Note density + brightness | 1/4 → 1/16, dim → bright |
| Right palm openness | Reverb wet + delay feedback | dry → wash |
| Left palm openness | Saturator drive + filter Q | clean → screaming |
| Right pinch (rising edge) | Trigger harmony-aware lead stab | one-shot |
| Left pinch (rising edge) | Advance chord progression | one-shot |
| Both fists held | Master mute (200 ms fade) | toggle |
| Both hands above head | Drop (max reverb, filter wide open) | held |
| Mean hand depth (Z) | Master volume (closer = louder) | additive |
| Right palm roll | Brightness fine-tune (±0.15) | additive |
| Left palm roll | Saturator drive fine-tune (±0.4) | additive |
| Mean palm pitch | Delay feedback fine-tune (±0.15) | additive |
Each finger on each hand emits its own continuous 0..1 curl scalar and drives its own audio dimension. The curl signal is tuned for high sensitivity (One-Euro β=0.08) so small deliberate motions register.
| Finger | Right (FX hand) | Left (drive hand) |
|---|---|---|
| Thumb | Delay feedback 0.7 → 0.1 | Saturator drive 2.6 → 0.8 |
| Index | Filter cutoff 14 kHz → 600 Hz (log) | Filter Q 12 → 1.5 |
| Middle | Reverb wet 0.85 → 0.05 | Reverb wet extra 0.7 → 0.05 |
| Ring | Brightness offset +0.15 → −0.15 | Master volume offset +0.10 → −0.10 |
| Pinky | Delay wet 0.6 → 0.05 | Tremolo depth (brightness LFO @5 Hz) 0 → 1 |
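
One way to read the per-finger table is as a declarative mapping from a curl scalar to a parameter range. The sketch below covers the right hand only and is an illustration under assumptions: the type and parameter names (`FingerTarget`, `delayFeedback`, and so on) are hypothetical, and it assumes the first value in each table cell applies at curl 0 and the second at curl 1.

```typescript
type Finger = "thumb" | "index" | "middle" | "ring" | "pinky";
type Curve = "lin" | "log";

interface FingerTarget {
  param: string; // hypothetical parameter name
  from: number;  // assumed value at curl = 0
  to: number;    // assumed value at curl = 1
  curve: Curve;
}

// Ranges copied from the "Right (FX hand)" column of the table.
const RIGHT_HAND: Record<Finger, FingerTarget> = {
  thumb:  { param: "delayFeedback",    from: 0.7,   to: 0.1,   curve: "lin" },
  index:  { param: "filterCutoffHz",   from: 14000, to: 600,   curve: "log" },
  middle: { param: "reverbWet",        from: 0.85,  to: 0.05,  curve: "lin" },
  ring:   { param: "brightnessOffset", from: 0.15,  to: -0.15, curve: "lin" },
  pinky:  { param: "delayWet",         from: 0.6,   to: 0.05,  curve: "lin" },
};

function mapCurl(curl: number, target: FingerTarget): number {
  const c = Math.min(1, Math.max(0, curl)); // clamp to 0..1
  if (target.curve === "log") {
    // Equal steps in curl give equal frequency ratios (only valid for positive ranges).
    return target.from * Math.pow(target.to / target.from, c);
  }
  return target.from + (target.to - target.from) * c;
}

console.log(mapCurl(0, RIGHT_HAND.index));   // 14000
console.log(mapCurl(0.5, RIGHT_HAND.thumb)); // about 0.4
```

The log curve matters for the cutoff because pitch perception is roughly logarithmic: a linear sweep would spend most of the finger's travel in the top octave.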
| Gesture | Hand | Effect |
|---|---|---|
| Point | right | Filter Q spike +6 (decays over 600 ms) |
| Peace (V) | right | Brightness pulse (vibrato approximation) |
| Rock on (horns) | right | Saturator drive +0.35 for 1.5 s |
| OK (ring) | right | Delay feedback +0.2 for 1 s (tape flutter) |
| Finger gun | right | Lead chord-tone stab |
| Thumbs up | right | Save quick-patch (logged) |
| Thumbs down | right | Reset to INIT factory preset |
| Three (I+M+R) | right | Apply factory preset slot 3 |
| Four (I+M+R+P) | right | Apply factory preset slot 4 |
| Call me (shaka) | right | Percussion one-shot |
| Snap (fast index extend) | either | Percussion one-shot |
| Swipe right | right | Next factory preset |
| Swipe left | right | Previous factory preset |
| Fist pump (both hands, fast down) | both | Drop bomb (delay fb +0.3 + reverb +0.3) |
| Wave (oscillating, ≥3 flips in 1.2 s) | either | Tremolo wobble (brightness LFO) |
Discrete gestures are gated by a 3-frame consensus + per-gesture cooldown so
flicker doesn't fire spurious events. See USER_MANUAL.md
for the full cheat sheet.
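
The consensus-plus-cooldown gate described above could look roughly like this. It is a sketch, not the project's implementation: `GestureGate` is a hypothetical name, and the 500 ms default cooldown is an assumed value (only the 3-frame consensus is stated in this README).

```typescript
class GestureGate {
  private candidate: string | null = null;
  private streak = 0;
  private readonly lastFired = new Map<string, number>();
  private readonly framesNeeded: number;
  private readonly cooldownMs: number;

  constructor(framesNeeded = 3, cooldownMs = 500) {
    this.framesNeeded = framesNeeded;
    this.cooldownMs = cooldownMs; // assumed default, not from the README
  }

  // Feed one classifier label per frame; returns the gesture name on the frame it fires.
  update(label: string | null, nowMs: number): string | null {
    if (label === this.candidate) {
      this.streak += 1;
    } else {
      this.candidate = label;
      this.streak = 1;
    }
    // Fire only on the exact frame the streak reaches the threshold, so a held pose fires once.
    if (label === null || this.streak !== this.framesNeeded) return null;
    const last = this.lastFired.get(label);
    if (last !== undefined && nowMs - last < this.cooldownMs) return null;
    this.lastFired.set(label, nowMs);
    return label;
  }
}

const gate = new GestureGate();
console.log(gate.update("point", 0));  // null
console.log(gate.update("point", 40)); // null
console.log(gate.update("point", 80)); // "point"
```

A label that flickers between two gestures never builds a streak, so it never fires; a gesture repeated inside the cooldown window is swallowed.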
| Gesture | Audio target | Visual |
|---|---|---|
| Apparent face size | Reverb wet blend | (closer = drier) |
| Head roll | Brightness offset (±0.15) | — |
| Head yaw | Filter resonance offset (±5 Q) | — |
| Head pitch | Note density boost | — |
| Mouth open | Delay wet + filter cutoff +6k + reverb +0.3 + brightness +0.4 | Mouth-emit particles |
| Mouth open (rising edge) | Lead chord-tone stab | — |
| Smile | Brightness +0.2, masterDuck −0.15 (brighter, louder) | — |
| Frown | Filter cutoff pulled toward 1.5 kHz (darkens) | — |
| Surprise | Reverb +0.3, delay feedback +0.1 (opens up) | — |
| Anger | Saturator drive +0.6 (clamped), filter Q +5 (clamped) | — |
| Face lost > 1.5 s | Master duck +0.15 | — |
| Key | Action |
|---|---|
| t | Toggle the live event terminal (left side) |
| p | Toggle the PATCH editor |
| h or F1 | Toggle the in-app manual (also exposed as the bottom-right help icon) |
| m | Flip the selfie mirror (only if your webcam stream is already pre-mirrored) |
| Escape | Mute / unmute the audio (also exposed as the bottom-right STOP icon) |
All shortcuts use letter or function/control keys so they remain reachable on every international keyboard layout (no symbol-key dependencies).
A small bottom-right HUD strip surfaces these as three tiny icon buttons (STOP / TERMINAL / HELP) for users who prefer a click. See USER_MANUAL.md for the long-form user guide.
The PATCH editor exposes two dropdowns above the knob grid: KEY (12 chromatic roots) and SCALE (13 modes, from Major and the church modes through Harmonic / Melodic Minor, Pentatonics, Blues, and Chromatic). Pick whatever you want and the music brain re-tunes the lead and bass generators on the next note — the chord progression keeps the vibe's character, the snap-to-consonance filter handles any clashes. The ↺ button resets to the current vibe's default; your selection persists across sessions in localStorage.
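
To illustrate how a key and scale selection can constrain generated notes, here is a simplified scale-snapping sketch. It is not the project's harmonic filter (which, per this README, snaps to chord tones and consonant tensions and uses @tonaljs/tonal); the interval tables are hand-written so the block stays self-contained, and `snapToScale` is a hypothetical name.

```typescript
// Semitone offsets from the root for a few of the scales named above.
const SCALES = {
  major:    [0, 2, 4, 5, 7, 9, 11],
  minor:    [0, 2, 3, 5, 7, 8, 10],
  phrygian: [0, 1, 3, 5, 7, 8, 10],
} as const;

// Move a MIDI note to the nearest note of the scale; ties resolve downward.
function snapToScale(midi: number, rootPc: number, intervals: readonly number[]): number {
  let best = midi;
  let bestDist = Infinity;
  for (let d = -6; d <= 6; d++) {
    const candidate = midi + d;
    const pc = (((candidate - rootPc) % 12) + 12) % 12; // pitch class relative to the root
    if (intervals.includes(pc) && Math.abs(d) < bestDist) {
      best = candidate;
      bestDist = Math.abs(d);
    }
  }
  return best;
}

console.log(snapToScale(64, 0, SCALES.minor)); // 63 (E becomes E-flat in C minor)
console.log(snapToScale(67, 0, SCALES.minor)); // 67 (G is already in the scale)
```

Because the snap runs per note, changing KEY or SCALE only needs to swap `rootPc` and `intervals` before the next note is generated.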
Eight modular subsystems with strictly-typed contracts in src/types/contracts.ts:
HandTracker ─┐
FaceTracker ─┴─→ InteractionMapper ─→ AudioEngine ─→ AnalyserNode ─→ Visualizer
└→ MusicBrain ─→ AudioEngine
- AudioEngine owns Tone.js: master FX chain (filter → saturator → EQ → compressor → widener → ping-pong delay → reverb → limiter), four voice engines (pad, lead, bass, perc), AudioWorklet saturator with Tone.Distortion fallback.
- MusicBrain is the generative composer: order-2 Markov chains on scale degrees with Bezier contour shaping, harmonic filter that snaps every note to a chord-tone or consonant tension, Bjorklund Euclidean rhythms for percussion, swing on 16ths.
- HandTracker / FaceTracker wrap MediaPipe Tasks Vision, do per-landmark One-Euro filtering, derive scalars (openness, pinch, depth, roll, pitch, eyesWide, mouthOpen, etc.), and emit events at 24 Hz / 8 Hz respectively to keep main-thread headroom for the audio scheduler.
- InteractionMapper is the patch bay: gesture state in, audio params + music inputs out. Per-param epsilon diff so unchanged knobs don't fire ramp events.
- Visualizer is a p5.js instance-mode sketch: low-poly orange-on-charcoal cyberpunk style, FFT-reactive triangle particles, hex grid backdrop, hand silhouettes with diamond fingertips, full face mesh with iris rings, fake-arm bezier connectors.
- SettingsPanel is the analog-synth-style PATCH editor with knobs, factory presets, and patch save/load.
- Terminal is the translucent left-side event log.
Full module map and data-flow diagram in ARCHITECTURE.md.
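
As an illustration of the Euclidean rhythms mentioned for MusicBrain, the sketch below spreads `onsets` hits as evenly as possible over `steps` slots. Note the swap: it uses a modular-arithmetic shortcut rather than the recursive Bjorklund procedure, which yields the same patterns up to rotation. The function name `euclid` is hypothetical and `steps` is assumed to be an integer.

```typescript
// Returns an array of 0/1 flags, one per step, with the onsets spread evenly.
function euclid(onsets: number, steps: number): number[] {
  if (steps <= 0) return [];
  const k = Math.min(Math.max(0, Math.floor(onsets)), steps); // clamp to 0..steps
  const pattern: number[] = [];
  for (let i = 0; i < steps; i++) {
    // Step i is an onset when the running multiple of k wraps past a step boundary.
    pattern.push((i * k) % steps < k ? 1 : 0);
  }
  return pattern;
}

console.log(euclid(3, 8).join(""));  // "10010010" (tresillo)
console.log(euclid(4, 16).join("")); // "1000100010001000" (four on the floor)
```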
pnpm typecheck # tsc strict, no errors
pnpm lint # eslint
pnpm test # vitest — 275 unit + integration tests
pnpm build # vite production build

CI runs all four on every push (see .github/workflows/ci.yml).
| Layer | Stack |
|---|---|
| Build | Vite 6, TypeScript strict (ES2022), pnpm |
| Audio | Tone.js v15, AudioWorklet, custom soft-clip saturator |
| Tracking | MediaPipe Tasks Vision (HandLandmarker + FaceLandmarker) |
| Visualizer | p5.js 1.x in instance mode, Canvas2D, custom bloom + scanline pipeline |
| Music | @tonaljs/tonal for scales / chords / voicings |
| Smoothing | One Euro Filter (Casiez, Roussel, Vogel — CHI 2012) |
| Desktop | Electron 33 + electron-builder |
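
For readers unfamiliar with the One Euro Filter listed under Smoothing, here is a compact sketch of the published algorithm for a single scalar. The class is illustrative, not the project's code: `beta = 0.08` is the value quoted for finger curl in this README, while `minCutoff` and `dCutoff` of 1.0 Hz are assumed defaults.

```typescript
class OneEuroFilter {
  private xPrev: number | null = null;
  private dxPrev = 0;
  private tPrev = 0;
  private readonly minCutoff: number;
  private readonly beta: number;
  private readonly dCutoff: number;

  constructor(minCutoff = 1.0, beta = 0.08, dCutoff = 1.0) {
    this.minCutoff = minCutoff; // assumed default (Hz)
    this.beta = beta;           // speed coefficient; 0.08 per this README
    this.dCutoff = dCutoff;     // assumed default (Hz)
  }

  // Smoothing factor of a first-order low-pass for a given cutoff and time step.
  private static alpha(cutoffHz: number, dt: number): number {
    const tau = 1 / (2 * Math.PI * cutoffHz);
    return 1 / (1 + tau / dt);
  }

  filter(x: number, tSeconds: number): number {
    if (this.xPrev === null) {
      this.xPrev = x;
      this.tPrev = tSeconds;
      return x;
    }
    const dt = tSeconds - this.tPrev;
    if (dt <= 0) return this.xPrev; // ignore duplicate or out-of-order timestamps
    const dx = (x - this.xPrev) / dt;
    const aD = OneEuroFilter.alpha(this.dCutoff, dt);
    const dxHat = aD * dx + (1 - aD) * this.dxPrev;
    // Faster motion raises the cutoff: less lag when moving, more smoothing when still.
    const cutoff = this.minCutoff + this.beta * Math.abs(dxHat);
    const a = OneEuroFilter.alpha(cutoff, dt);
    const xHat = a * x + (1 - a) * this.xPrev;
    this.xPrev = xHat;
    this.dxPrev = dxHat;
    this.tPrev = tSeconds;
    return xHat;
  }
}
```

A per-landmark setup would hold one filter instance per coordinate and feed each one the frame timestamp in seconds.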
The app runs heavy real-time work on the main thread (two MediaPipe inferences + p5 sketch + Tone.js scheduler). To keep audio glitch-free under cumulative load:
- HandLandmarker capped at 24 Hz; FaceLandmarker at 8 Hz
- p5 frameRate capped at 30
- Per-particle pool with no per-frame allocations
- Per-param epsilon diff so static gestures don't queue ramp events
- AudioContext built with latencyHint: 0.15 and Tone lookAhead: 0.4 s so short main-thread stalls don't audibly drop audio
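
The per-param epsilon diff from the list above can be sketched as follows. This is an assumption-laden illustration: `ParamDiffer`, the parameter names, and the epsilon values are all hypothetical. The idea is simply that a parameter is forwarded only when it has moved more than its threshold since the last value actually sent.

```typescript
class ParamDiffer {
  private readonly lastSent = new Map<string, number>();
  private readonly epsilons: Record<string, number>;
  private readonly fallback: number;

  constructor(epsilons: Record<string, number> = {}, fallback = 0.005) {
    this.epsilons = epsilons; // per-parameter thresholds (hypothetical values)
    this.fallback = fallback; // threshold for parameters without their own entry
  }

  // Returns only the parameters that changed enough to be worth a ramp event.
  diff(next: Record<string, number>): Record<string, number> {
    const out: Record<string, number> = {};
    for (const [name, value] of Object.entries(next)) {
      const prev = this.lastSent.get(name);
      const eps = this.epsilons[name] ?? this.fallback;
      if (prev === undefined || Math.abs(value - prev) > eps) {
        out[name] = value;
        this.lastSent.set(name, value); // compare against what was sent, not the last frame
      }
    }
    return out;
  }
}

const differ = new ParamDiffer({ cutoffHz: 5 });
console.log(differ.diff({ cutoffHz: 1000, reverbWet: 0.3 })); // both params (first frame)
console.log(differ.diff({ cutoffHz: 1002, reverbWet: 0.3 })); // {} (within epsilon)
console.log(differ.diff({ cutoffHz: 1002, reverbWet: 0.5 })); // { reverbWet: 0.5 }
```

Comparing against the last sent value rather than the previous frame means slow drift still gets through once it accumulates past the threshold.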
If you experience audio glitches, run pnpm build && pnpm preview (production build, no Vite HMR overhead) instead of pnpm dev. The production build is significantly lighter on the main thread.
PRs welcome. See CONTRIBUTING.md. The project uses a multi-agent contract pattern — modules in src/audio/, src/music/, src/hands/, src/face/, src/visual/, src/interaction/, src/ui/ are owned independently and connected only via interfaces in src/types/contracts.ts. Don't break the contract.
MIT — see LICENSE. Use it, fork it, install it on your friends' machines, run a live show, build a follow-on with it. Attribution is appreciated but not required.
Built with Claude Code.
