mitwilli-create/voice-os

voice-os

Six-axis voice scoring. Dual-persona routing. QA gates that enforce stylistic fidelity before a draft ever reaches a human. Built on Claude. Calibrated on 6.9M+ words of executive communication.

This is the engine behind the Voice DNA RAG pipeline I shipped at Google xGE — a system that functions as a digital twin for VP-level communications. It cut drafting latency by 90% and holds 99% stylistic fidelity across production volume.


What it does

Ingests a voice corpus, builds a scored representation across six stylistic axes, and routes drafts through dual personas (generative + adversarial) before a QA gate decides whether output clears or cycles back. The result is drafts that sound like the person they're supposed to sound like — not a generic LLM.
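The draft, critique, and gate cycle described above can be sketched as a small control loop. This is a minimal illustration, not the repository's code: `run_pipeline`, `GateResult`, `PipelineRun`, and the `max_cycles` cap are hypothetical names, and the two personas and the gate are passed in as plain callables so the sketch runs without any API calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GateResult:
    passed: bool        # True means "pass", False means "cycle"
    feedback: str = ""  # revision signal handed back to the generator


@dataclass
class PipelineRun:
    draft: str
    decision: str                                  # "pass" or "cycle"
    trace: List[str] = field(default_factory=list)


def run_pipeline(
    brief: str,
    generate: Callable[[str, str], str],     # (brief, feedback) -> draft
    critique: Callable[[str], str],          # draft -> adversarial notes
    gate: Callable[[str, str], GateResult],  # (draft, notes) -> decision
    max_cycles: int = 3,                     # hypothetical cap on retries
) -> PipelineRun:
    """Cycle a draft through both personas until the gate clears it."""
    if max_cycles < 1:
        raise ValueError("max_cycles must be at least 1")
    feedback = ""
    draft = ""
    trace: List[str] = []
    for cycle in range(1, max_cycles + 1):
        draft = generate(brief, feedback)  # generative persona
        notes = critique(draft)            # adversarial persona
        result = gate(draft, notes)        # QA gate decision
        trace.append(f"cycle {cycle}: {'pass' if result.passed else 'cycle'}")
        if result.passed:
            return PipelineRun(draft, "pass", trace)
        feedback = result.feedback         # feed the signal into the next draft
    # Gate never cleared: return the last draft, still marked "cycle".
    return PipelineRun(draft, "cycle", trace)
```

Capping the loop is a design assumption. Without it, a draft the gate never accepts would cycle indefinitely.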


Why it matters

Most "voice matching" is prompt engineering with a few examples. This is a calibrated scoring system. The six axes catch what vibes-based prompting misses: rhetorical pace, risk tolerance, sentence rhythm, escalation pattern, hedging behavior, and editorial register. The kill list — a curated set of rejected drafts — teaches the system what the voice refuses to do, not just what it does.
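The README names the six axes but does not say how each one is measured or combined. As a rough sketch only, assume every axis is scored as a float in [0, 1] for both the corpus baseline and the draft. The axis keys, `axis_deltas`, and the `fidelity` formula (one minus the mean absolute deviation) are illustrative assumptions, not the project's calibrated scoring.

```python
from statistics import mean
from typing import Dict

# Axis names taken from the README; the snake_case keys are an assumption.
AXES = (
    "rhetorical_pace",
    "risk_tolerance",
    "sentence_rhythm",
    "escalation_pattern",
    "hedging_behavior",
    "editorial_register",
)


def axis_deltas(baseline: Dict[str, float], draft: Dict[str, float]) -> Dict[str, float]:
    """Signed per-axis difference between a draft and the corpus baseline."""
    missing = [a for a in AXES if a not in baseline or a not in draft]
    if missing:
        raise KeyError(f"missing axis scores: {missing}")
    return {a: draft[a] - baseline[a] for a in AXES}


def fidelity(baseline: Dict[str, float], draft: Dict[str, float]) -> float:
    """Hypothetical aggregate: 1.0 means the draft matches the baseline on every axis."""
    deltas = axis_deltas(baseline, draft)
    return 1.0 - mean(abs(d) for d in deltas.values())


baseline = {axis: 0.5 for axis in AXES}
draft = dict(baseline, hedging_behavior=0.8)  # the draft hedges more than the voice does
print(round(fidelity(baseline, draft), 3))
```

Keeping the signed deltas alongside the aggregate shows which axis drifted and in which direction, which a single number would hide.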


Quick Start

# Clone the repo
git clone https://github.com/mitwilli-create/voice-os.git
cd voice-os

# Install dependencies
pip install -r requirements.txt

# Set your API key
export ANTHROPIC_API_KEY=your_key_here

# Run a scoring pass against the sample corpus
python score.py --corpus data/sample_corpus.txt --draft data/sample_draft.txt

# Run the full dual-persona pipeline with QA gate
python pipeline.py \
  --corpus data/sample_corpus.txt \
  --kill-list data/kill_list.txt \
  --draft data/sample_draft.txt \
  --output output/scored_draft.json

Output includes axis scores, persona deltas, QA gate decision (pass / cycle), and a revision trace.


Architecture

| Layer | Function |
| --- | --- |
| Corpus ingestion | Chunks and embeds voice corpus; builds axis score baseline |
| Six-axis scorer | Evaluates drafts against baseline across six stylistic dimensions |
| Dual-persona router | Generative persona drafts; adversarial persona stress-tests fidelity |
| QA gate | Blocks output below threshold; returns structured revision signal |
| Kill list enforcement | Flags patterns the voice explicitly rejects |
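The last two layers can be pictured as one function that returns a structured signal instead of a bare yes or no. The sketch below is an assumption-heavy illustration: `RevisionSignal`, `kill_list_hits`, `qa_gate`, the `0.8` threshold, and the treatment of the kill list as a list of phrases matched case-insensitively are all hypothetical, since the repository's actual threshold and kill list format are not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RevisionSignal:
    decision: str                                           # "pass" or "cycle"
    weak_axes: List[str] = field(default_factory=list)      # axes scoring below threshold
    kill_list_hits: List[str] = field(default_factory=list)  # rejected phrases found


def kill_list_hits(draft: str, phrases: List[str]) -> List[str]:
    """Return every kill list phrase that appears in the draft (case-insensitive)."""
    lowered = draft.lower()
    return [p for p in phrases if p.lower() in lowered]


def qa_gate(
    axis_scores: Dict[str, float],
    draft: str,
    phrases: List[str],
    threshold: float = 0.8,  # hypothetical value, not taken from the project
) -> RevisionSignal:
    """Block the draft if any axis is under threshold or any kill list phrase appears."""
    weak = sorted(a for a, score in axis_scores.items() if score < threshold)
    hits = kill_list_hits(draft, phrases)
    decision = "pass" if not weak and not hits else "cycle"
    return RevisionSignal(decision, weak, hits)


signal = qa_gate(
    {"sentence_rhythm": 0.9, "hedging_behavior": 0.6},
    "Let's circle back on this next week.",
    ["circle back", "synergy"],
)
print(signal.decision, signal.weak_axes, signal.kill_list_hits)
```

Returning the weak axes and matched phrases gives the generative persona something specific to revise on the next cycle.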

What this demonstrates

  • Production RAG design — not a demo, a system that ran at VP scale inside Google
  • Evaluation rigor — quantified fidelity scoring, not vibes
  • Agentic architecture — multi-step pipeline with conditional routing and gate logic
  • Domain depth — a decade in newsrooms and eight years at Google built the editorial judgment that makes the scoring axes meaningful

Status

Pipeline architecture and scoring logic are documented here. The core corpus and VP-identity data are proprietary and are not included. Sample data is synthetic but structurally representative.

CI/evaluation harness: in progress.


Built with

  • Claude (Anthropic) — generation and adversarial persona
  • Python
  • Custom embedding + scoring layer

Mitchell Williams · LinkedIn · GitHub · thestorytellermitch.com
