1dg618/quorum-gate


Quorum — a verification gate for code changes

Distributed on PyPI as quorum-gate; imported and invoked as quorum.

Quorum decides whether a code change is safe to keep. It is built around one idea: "it compiled" and "it passed all checks" are different claims, and only the second — backed by checks that can actually fail — is worth trusting.

You give it a list of independent check functions. Each one takes a throwaway copy of your codebase and returns pass/fail with a reason. Some checks are static ("does every file still compile?"), but the useful ones are behavioral: they spin up a subprocess, import the candidate's modified code in isolation, and actually run it — e.g. launch 16 threads at a spend tracker to check for lost updates, or feed a tool_use/tool_result pair through a message trimmer to confirm it's never split.

A change is promoted to your live files only if a quorum of checks agrees: it passes everything (or, in scored mode, strictly improves the score without breaking any check that was already passing). Otherwise it's discarded — and since it only ever touched a copy, there's nothing to roll back.

Why subprocesses

Two layers of isolation, and the second is the one that matters:

  1. Filesystem isolation: every check sees a fresh disposable copy. Nothing it writes survives or affects your real files.
  2. Process isolation: each check runs in its own subprocess with a wall-clock timeout. A broken candidate that loops forever, deadlocks 16 threads, segfaults, runs out of memory, or hard-exits takes down only a disposable child process and is reported back as TIMEOUT or CRASHED. It cannot crash or pollute the verifier that is judging it.

A naive test runner imports candidate code into its own process — one bad candidate can then hang or corrupt the judge. Quorum never does this.
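
The two layers can be sketched with the standard library alone. This is a minimal illustration, not Quorum's implementation: `run_isolated` is a hypothetical helper, and the exit-code convention (0 = pass, 1 = fail, anything else = crash) is an assumption made for the example. Only the outcome labels come from this README.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_isolated(source_dir: str, child_code: str, timeout_s: float) -> str:
    """Run child_code in a child process, inside a throwaway copy of source_dir.

    Hypothetical helper. Assumed convention: the child exits 0 to pass and
    1 to fail; any other exit status is treated as a crash.
    """
    with tempfile.TemporaryDirectory() as tmp:
        copy = Path(tmp) / "copy"
        shutil.copytree(source_dir, copy)  # layer 1: filesystem isolation
        try:
            proc = subprocess.run(  # layer 2: process isolation
                [sys.executable, "-c", child_code],
                cwd=copy,
                timeout=timeout_s,  # the child is killed when this expires
                capture_output=True,
            )
        except subprocess.TimeoutExpired:
            return "TIMEOUT"
    if proc.returncode == 0:
        return "PASSED"
    if proc.returncode == 1:
        return "FAILED"
    return "CRASHED"  # killed by a signal, hard exit, or unexpected status
```

Because the child is a separate interpreter, a hang or hard exit in `child_code` only ever surfaces in the parent as a return value, and anything the child writes lands in the temporary copy that is deleted on exit.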

Install

pip install quorum-gate          # PyYAML is only needed if you use --config
# or, from a checkout:
pip install -e .

The installed command is quorum; the import package is quorum.

Define checks

A function check takes the path to the throwaway copy and returns a CheckResult, a bool, or a (passed, reason[, score]) tuple. Register checks on a module-level gate object:

# checks.py
import importlib
import sys
import threading

from quorum import Gate, CheckResult, Outcome

gate = Gate()

@gate.check(name="spend_tracker_no_lost_updates", timeout_s=20)
def spend_tracker(codebase_path):
    # Make sure the import resolves to the throwaway copy rather than an
    # installed myapp (harmless if the copy is already first on sys.path).
    sys.path.insert(0, str(codebase_path))
    core = importlib.import_module("myapp.core")
    tracker = core.SpendTracker()

    def hammer():
        for _ in range(5000):
            tracker.add(1)

    # 16 threads x 5000 increments; any lost update shows up in the total.
    threads = [threading.Thread(target=hammer) for _ in range(16)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    expected = 16 * 5000
    ok = tracker.total == expected
    return CheckResult("spend_tracker_no_lost_updates", ok,
                       "all increments recorded" if ok else f"lost {expected - tracker.total}",
                       Outcome.PASSED if ok else Outcome.FAILED,
                       score=tracker.total / expected)
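
For comparison, a static check can be much smaller. The sketch below uses the `(passed, reason)` tuple return form mentioned above; the function name `all_files_compile` is made up for illustration, and it is written as a plain function so it runs without Quorum installed. Presumably it could be registered with `@gate.check(...)` or `gate.add_function(...)` like any other check.

```python
from pathlib import Path


def all_files_compile(codebase_path):
    """Static check: every .py file under the throwaway copy must compile."""
    for path in sorted(Path(codebase_path).rglob("*.py")):
        try:
            # Built-in compile() parses the source without executing it
            # and without writing .pyc files.
            compile(path.read_text(encoding="utf-8"), str(path), "exec")
        except SyntaxError as exc:
            return (False, f"{path.name}:{exc.lineno}: {exc.msg}")
    return (True, "every file compiles")
```

A check like this can only catch syntax errors, which is why the behavioral check above is the more informative of the two.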

A shell check lets you reuse tools you already have (pytest, mypy, ruff, a compile step), expressed in YAML:

# quorum.yaml
checks:
  - name: types
    command: "mypy myapp/"
    timeout_s: 60
  - name: unit_tests
    command: "pytest -q && echo passed=1"
    timeout_s: 120
    score_from: passed     # parses `passed=<number>` from stdout as the score
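
To make the `score_from` idea concrete, here is one way a `passed=<number>` line could be pulled out of a command's stdout. This is a sketch under assumptions: `parse_score` is a hypothetical helper, and details such as taking the last match or accepting decimals are choices made for the example, not documented Quorum behavior.

```python
import re


def parse_score(stdout: str, key: str):
    """Return the last `key=<number>` found in stdout as a float, or None."""
    pattern = re.compile(rf"\b{re.escape(key)}=(-?\d+(?:\.\d+)?)")
    matches = pattern.findall(stdout)
    return float(matches[-1]) if matches else None


# pytest's own summary ("5 passed in 0.10s") has no "=", so only the
# echoed line contributes a score.
print(parse_score("5 passed in 0.10s\npassed=1\n", "passed"))
```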

Run

# pass/fail mode: promote iff every check passes
quorum verify --candidate ./candidate --checks checks.py --config quorum.yaml

# scored mode: promote iff candidate strictly improves the total score
# without regressing any check that was passing on the baseline
quorum scored --candidate ./candidate --baseline ./live --checks checks.py

# actually copy a passing candidate over your live files
quorum verify --candidate ./candidate --checks checks.py --promote --live ./live

(python -m quorum.cli ... works identically if you prefer not to rely on the console script.)

Exit code is 0 if promoted, 1 otherwise — so it drops straight into CI or a patch-proposing loop.
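
As an illustration of the patch-proposing loop, the sketch below tries candidates in order and stops at the first one the gate accepts, relying only on the exit code. `gate_accepts` and `first_accepted` are hypothetical helpers, not part of Quorum; the command is passed in as a list so the same loop works with any gate that follows the 0/non-zero convention.

```python
import subprocess
import sys


def gate_accepts(command: list[str]) -> bool:
    """Run a gate command; exit code 0 means the candidate was accepted."""
    return subprocess.run(command).returncode == 0


def first_accepted(candidate_dirs, build_command):
    """Return the first candidate directory the gate accepts, or None."""
    for directory in candidate_dirs:
        if gate_accepts(build_command(directory)):
            return directory
    return None
```

With Quorum installed, `build_command` could be something like `lambda d: ["quorum", "verify", "--candidate", d, "--checks", "checks.py"]`, using the flags shown above.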

The two modes

pass/fail — promote iff every check passes. Simple and strict.

scored — each check can return a numeric score. Quorum runs the checks against your baseline (the current live code) first to learn which checks were already passing and what the baseline score was, then runs them against the candidate. It promotes only if the candidate's total score is strictly greater and no previously-passing check now fails. This is the mode for "make it better without making anything worse" — e.g. an optimizer or an agent proposing patches.
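
The scored-mode rule can be written down as a small pure function. This is a sketch of the rule as described, not Quorum's code: the `Result` type and `should_promote` are hypothetical, and summing per-check scores into a total is an assumption about how the total is computed.

```python
from dataclasses import dataclass


@dataclass
class Result:
    passed: bool
    score: float = 0.0


def should_promote(baseline: dict[str, Result], candidate: dict[str, Result]) -> bool:
    """Promote only on a strict score improvement with no regressions."""
    # A check that passed on the baseline must still pass on the candidate;
    # a check missing from the candidate run counts as a regression.
    for name, base in baseline.items():
        if base.passed and not candidate.get(name, Result(False)).passed:
            return False
    base_total = sum(r.score for r in baseline.values())
    cand_total = sum(r.score for r in candidate.values())
    return cand_total > base_total  # strictly greater: ties are rejected
```

Note that a candidate which raises the total while breaking a previously passing check is still rejected, and so is one that merely matches the baseline score.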

Use as a library

from quorum import Gate

g = Gate()
g.add_function(my_check)
g.add_shell("tests", "pytest -q")

report = g.verify("./candidate")
print(report.summary())
if g.promote("./candidate", "./live", report):
    print("shipped")

What a result tells you

Every check returns a reason, not just a verdict — that's the point. Outcomes are PASSED, FAILED, TIMEOUT (exceeded its budget), CRASHED (process died without a verdict), or ERROR (the check function itself raised).

Try the example

quorum verify --candidate examples/candidate_fixed  --checks examples/checks.py  # promoted
quorum verify --candidate examples/live_code         --checks examples/checks.py  # rejected: real bugs
quorum verify --candidate examples/candidate_broken  --checks examples/checks.py  # one check times out, Quorum survives

About

Speculate in isolation; promote on verification. Run checks on a throwaway branch; only promote proven changes.
