Voice Robustness Lab (Prototype)

This repository contains a small, personal proof-of-concept for a Voice Robustness Lab.

The goal is to experiment with a helper that can:

Take voice utterance transcripts (and light metadata)
Exercise prompt + response patterns under varied conditions (noise, phrasing, intent)
Summarize robustness issues (e.g., ambiguous intents, brittle prompts, mis-routes)
Append a minimal evidence record for each test run

This is a personal R&D prototype, not a production voice/IVR system.

Quick Start

Prerequisites

Python 3.8+
pip

Install

pip install -e ".[dev]"

Or install dependencies directly:

pip install click pyyaml pytest

Run a Test Set

# Text report (default)
python -m src.cli.main run --test-set fixtures/balance_inquiry.yaml

# JSON report
python -m src.cli.main run --test-set fixtures/balance_inquiry.yaml --format json

# Use the JSON fixture
python -m src.cli.main run --test-set fixtures/balance_inquiry.json --format json

# With a custom evidence log path and note
python -m src.cli.main run \
  --test-set fixtures/support_request.yaml \
  --log runs/my-evidence.jsonl \
  --note "experiment-1"

# Use the LLM classifier (requires LLM_API_KEY env var; falls back to ambiguous)
python -m src.cli.main run \
  --test-set fixtures/balance_inquiry.yaml \
  --classifier llm

# Verbose mode (debug logging to stderr)
python -m src.cli.main -v run --test-set fixtures/balance_inquiry.yaml

CLI Options

Usage: python -m src.cli.main run [OPTIONS]

Options:
  --test-set PATH          Path to a voice test set file (YAML or JSON). [required]
  --classifier [rule|llm]  Classifier backend to use.  [default: rule]
  --log PATH               Path to the evidence log file (JSONL).  [default: runs/evidence.log.jsonl]
  --format [text|json]     Report output format.  [default: text]
  --note TEXT              Optional note to attach to the evidence log entry.
  --help                   Show this message and exit.
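
As a rough illustration of what the --log and --note options could produce, the sketch below appends one JSON object per run to a JSONL file and reads the entries back. The field names (timestamp, test_set_id, counts, note) and the helper names append_evidence and read_evidence are assumptions made for this example. They are not taken from evidence/logger.py.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_evidence(log_path, test_set_id, counts, note=None):
    """Append one run record as a single JSON line (hypothetical schema)."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "test_set_id": test_set_id,
        "counts": counts,
    }
    if note:
        # The note is optional, mirroring the --note CLI option.
        entry["note"] = note
    path = Path(log_path)
    # Create the output directory (e.g. runs/) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def read_evidence(log_path):
    """Read all entries back, skipping blank lines."""
    with Path(log_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each run is a separate line, the file can be appended to without rewriting earlier records.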

Run Tests

python -m pytest tests/ -v

Project Structure

src/
  models.py              - Data models (Utterance, VoiceTestSet, ClassificationResult, etc.)
  classifier/
    base.py              - Classifier protocol + outcome determination
    rule_based.py        - Keyword-matching classifier stub
    llm_adapter.py       - LLM classifier scaffold (requires API key)
  runner/
    loader.py            - YAML/JSON test set loader with validation
    runner.py            - Test set runner (iterates utterances through classifier)
  report/
    reporter.py          - Report aggregation + JSON/text rendering
  evidence/
    logger.py            - JSONL evidence log writer
  cli/
    main.py              - Click CLI entry point
fixtures/                - Sample voice test case files
tests/                   - pytest test suite
runs/                    - Evidence log output directory
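
To make the "classifier protocol" and "keyword-matching classifier stub" entries above more concrete, here is a minimal sketch of how a pluggable classifier interface might look. The ClassificationResult fields, the classify method name, the KeywordClassifier class, and the confidence formula are all assumptions for illustration. The real definitions live in src/models.py and src/classifier/ and may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass
class ClassificationResult:
    label: Optional[str]  # None when no keyword matched
    confidence: float     # between 0.0 and 1.0


class Classifier(Protocol):
    """Anything with a matching classify() method can be plugged in."""

    def classify(self, text: str) -> ClassificationResult: ...


class KeywordClassifier:
    """Hypothetical rule-based backend: counts keyword hits per intent."""

    def __init__(self, keywords: Dict[str, List[str]]) -> None:
        self.keywords = keywords

    def classify(self, text: str) -> ClassificationResult:
        lowered = text.lower()
        # Number of keywords found in the utterance, per intent label.
        hits = {
            label: sum(1 for kw in kws if kw in lowered)
            for label, kws in self.keywords.items()
        }
        best = max(hits, key=hits.get, default=None)
        if best is None or hits[best] == 0:
            return ClassificationResult(label=None, confidence=0.0)
        # Confidence is the winning label's share of all keyword hits.
        return ClassificationResult(best, hits[best] / sum(hits.values()))
```

With a structural Protocol, an LLM-backed classifier only needs to expose the same classify method. No shared base class is required.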

How It Works

Load a voice test set (YAML or JSON) describing utterances and an expected intent label.
Classify each utterance using a pluggable classifier (rule-based keyword matching by default).
Determine outcome per utterance: pass (correct), fail (wrong label), or ambiguous (low confidence / no match).
Generate a report with aggregate stats (total, passes, fails, ambiguous, pass rate) and highlighted failures.
Append an evidence entry (JSONL) with timestamp, test set ID, and counts for tracking over time.
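
The outcome and reporting steps above can be sketched as follows. This is an illustrative sketch only: the 0.6 confidence threshold, the function names, the rule that low confidence takes precedence over a label match, and the zero pass rate for an empty run are all assumptions, not the behavior of classifier/base.py or report/reporter.py.

```python
from collections import Counter
from typing import Iterable, Optional

CONFIDENCE_THRESHOLD = 0.6  # hypothetical cutoff, not from the project


def determine_outcome(expected: str, predicted: Optional[str],
                      confidence: float,
                      threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Map one classification to pass, fail, or ambiguous."""
    # No match or low confidence is treated as ambiguous first.
    if predicted is None or confidence < threshold:
        return "ambiguous"
    return "pass" if predicted == expected else "fail"


def summarize(outcomes: Iterable[str]) -> dict:
    """Aggregate per-utterance outcomes into report statistics."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    passes = counts["pass"]
    return {
        "total": total,
        "passes": passes,
        "fails": counts["fail"],
        "ambiguous": counts["ambiguous"],
        # Avoid division by zero for an empty test set.
        "pass_rate": passes / total if total else 0.0,
    }
```

For example, summarize(["pass", "pass", "fail", "ambiguous"]) yields a pass rate of 0.5, since ambiguous results count toward the total but not toward passes under these assumptions.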

Non-goals

End-to-end telephony, ASR, or TTS integration
Real-time audio capture or streaming
Full NLU engine or production routing logic

Status

Initial specification (SPEC.md)
Minimal flow: voice test set -> prompts/queries -> robustness report
Evidence log of test runs
Basic CLI
Run instructions in README

See SPEC.md for the full specification.

