feat(bench): index performance benchmark suite #173
Open: tobyhede wants to merge 28 commits into main from index-performance-scheduled-benchmarks
28 commits
2f9dad0 feat(bench): add benchmark table with 10K rows, indexes, and verifica… (tobyhede)
44dabd7 refactor(bench): extract BENCH_ROW_COUNT constant from magic number (tobyhede)
1934a91 fix(bench): move 10K row INSERT from migration to opt-in fixture (tobyhede)
a57682a fix(test): correct pg_stat_statements_reset argument order (tobyhede)
89f86f6 fix(bench): address code review feedback (tobyhede)
493f085 fix(bench): address second code review round (tobyhede)
2e371e6 docs(bench): address CodeRabbit feedback on fixture docs (tobyhede)
53972f9 refactor(bench): use Zipf-like skew for bench fixture distribution (tobyhede)
86e2e14 feat(bench): add Tier 1 plan assertions for ORE range queries and P0 … (tobyhede)
cb22f6b feat(bench): add Tier 1 magnitude regression tests with timing thresh… (tobyhede)
b6133f4 fix(bench): address post-review code quality issues (tobyhede)
999d22a fix(bench): address third code review round (tobyhede)
808b789 chore(bench): scaffold tests/benchmarks/ directory with README and gi… (tobyhede)
41b95cb feat(bench): add docker-compose with Postgres + CipherStash Proxy for… (tobyhede)
721a3f0 feat(bench): add schema.sql with bench table and Proxy search configu… (tobyhede)
8fe6748 feat(bench): add generate.sh for 100K dataset generation via Proxy (tobyhede)
c58629c feat(bench): add mise tasks bench:up/down/generate/full (tobyhede)
a827486 feat(bench): add PerfResult struct and JSON/Markdown report writer (tobyhede)
780ac79 feat(bench): add Tier 2 perf test infrastructure and hmac_256 baselin… (tobyhede)
0941f2d feat(bench): add Tier 2 perf tests for P0/P1/P2 query patterns (tobyhede)
dc27f91 test(bench): add consistent assertion messages to Tier 2 perf tests (tobyhede)
285593b feat(bench): add scheduled GitHub Actions workflow for weekly Tier 2 … (tobyhede)
5b9dba5 fix(bench): write Proxy credentials safely via env block + printf (tobyhede)
6d98632 refactor(bench): use DATABASE_URL for Tier 2 tests, drop BENCH_DATABA… (tobyhede)
c2dc431 perf(ci): mark slow perf/O(n²) tests as #[ignore] to cut PR runtime (tobyhede)
511d97b style(bench): apply cargo fmt to reports.rs and bench_perf_tests.rs (tobyhede)
e5fff4d fix(bench): address CodeRabbit review feedback on PR #173 (tobyhede)
2ef4dec perf(bench): reduce RUNS from 1000 to 10 to fit CI timeout (tobyhede)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| name: "Scheduled Benchmarks (Tier 2)" | ||
|
|
||
| on: | ||
| schedule: | ||
| - cron: '0 3 * * 1' # Every Monday 03:00 UTC | ||
| workflow_dispatch: | ||
|
|
||
| # Prevent a scheduled run from racing a manual dispatch for the same ports. | ||
| concurrency: | ||
| group: scheduled-benchmarks | ||
| cancel-in-progress: false | ||
|
|
||
| env: | ||
| # Matches test-eql.yml — forces JS-based composite actions onto Node 24. | ||
| FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true" | ||
|
|
||
| jobs: | ||
| benchmark: | ||
| name: "100K dataset benchmark (Postgres 17)" | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 60 | ||
|
|
||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: Install postgresql-client | ||
| # generate.sh uses psql directly against Postgres (port 7433) and Proxy | ||
| # (port 6433). jdx/mise-action only installs Rust + Python. | ||
| run: | | ||
| sudo apt-get update | ||
| sudo apt-get install -y postgresql-client | ||
|
|
||
| - uses: jdx/mise-action@v3 | ||
| with: | ||
| version: 2026.4.0 | ||
| install: true | ||
| cache: true | ||
|
|
||
| - name: Write Proxy credentials to .env | ||
| env: | ||
| CS_CLIENT_ACCESS_KEY: ${{ secrets.CS_CLIENT_ACCESS_KEY }} | ||
| CS_DEFAULT_KEYSET_ID: ${{ secrets.CS_DEFAULT_KEYSET_ID }} | ||
| CS_CLIENT_KEY: ${{ secrets.CS_CLIENT_KEY }} | ||
| CS_CLIENT_ID: ${{ secrets.CS_CLIENT_ID }} | ||
| CS_WORKSPACE_CRN: ${{ secrets.CS_WORKSPACE_CRN }} | ||
| run: | | ||
| { | ||
| printf 'CS_CLIENT_ACCESS_KEY=%s\n' "$CS_CLIENT_ACCESS_KEY" | ||
| printf 'CS_DEFAULT_KEYSET_ID=%s\n' "$CS_DEFAULT_KEYSET_ID" | ||
| printf 'CS_CLIENT_KEY=%s\n' "$CS_CLIENT_KEY" | ||
| printf 'CS_CLIENT_ID=%s\n' "$CS_CLIENT_ID" | ||
| printf 'CS_WORKSPACE_CRN=%s\n' "$CS_WORKSPACE_CRN" | ||
| } > tests/benchmarks/.env | ||
|
|
||
| - name: Bring up Postgres + Proxy | ||
| run: mise run bench:up | ||
|
|
||
| - name: Generate 100K dataset | ||
| run: mise run bench:generate | ||
|
|
||
| - name: Run Tier 2 benchmark suite | ||
| run: | | ||
| BENCH_REPORT_DATE="$(date -u +%Y-%m-%d)-${{ github.run_id }}" | ||
| export BENCH_REPORT_DATE | ||
| mise run bench:full | ||
|
|
||
|
|
||
| - name: Tear down containers | ||
| if: always() | ||
| run: mise run bench:down | ||
|
|
||
| - name: Upload benchmark report | ||
| if: always() | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: benchmark-report-${{ github.run_id }} | ||
| path: tests/benchmarks/reports/ | ||
| retention-days: 90 | ||
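
The credentials step above passes each secret through the step's `env:` block and writes it with `printf '%s\n'`, so the secret value is never pasted into the script text itself. A minimal local sketch of the same pattern follows. The helper name `write_env_file` and its arguments are illustrative assumptions and are not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: write_env_file is a hypothetical helper, not part of the PR.
# It writes one NAME=value line per variable name given, reading each value
# from the current shell so the value is never re-parsed as shell syntax.
write_env_file() {
  local out=$1
  shift
  local name
  for name in "$@"; do
    # ${!name-} is bash indirect expansion: the value of the variable whose
    # name is stored in $name, or an empty string if it is unset.
    printf '%s=%s\n' "$name" "${!name-}"
  done > "$out"
}

# Example call (commented out so the sketch has no side effects):
# write_env_file tests/benchmarks/.env CS_CLIENT_ACCESS_KEY CS_CLIENT_KEY
```

Because the value only ever appears as a `printf` argument, characters such as `$`, quotes, or spaces in a credential are written literally.
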
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| ["bench:up"] | ||
| description = "Start Postgres + Proxy for benchmark data generation" | ||
| dir = "{{config_root}}" | ||
| run = """ | ||
| if [ ! -f tests/benchmarks/.env ]; then | ||
| echo "ERROR: tests/benchmarks/.env missing. Copy .env.example and fill in credentials." >&2 | ||
| exit 1 | ||
| fi | ||
| docker compose --env-file tests/benchmarks/.env -f tests/benchmarks/docker-compose.yml up -d --wait | ||
| """ | ||
|
|
||
| ["bench:down"] | ||
| description = "Stop benchmark Postgres + Proxy" | ||
| dir = "{{config_root}}" | ||
| run = """ | ||
| docker compose -f tests/benchmarks/docker-compose.yml down -v | ||
| """ | ||
|
|
||
| ["bench:generate"] | ||
| description = "Generate 100K encrypted bench dataset (requires bench:up first)" | ||
| # `build` produces release/cipherstash-encrypt.sql, which generate.sh | ||
| # installs into the bench Postgres container before applying schema.sql. | ||
| depends = ["build"] | ||
| dir = "{{config_root}}" | ||
| run = """ | ||
| tests/benchmarks/generate.sh 100k | ||
| """ | ||
|
|
||
| ["bench:full"] | ||
| description = "Run full Tier 2 benchmark suite against bench-postgres" | ||
| dir = "{{config_root}}/tests/sqlx" | ||
| env = { DATABASE_URL = "postgresql://cipherstash:password@localhost:7433/cipherstash" } | ||
| run = """ | ||
| cargo test --test bench_perf_tests run_all_benchmarks -- --ignored --nocapture | ||
| """ |
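
The four tasks above are meant to run in order, and the CI workflow tears the containers down with `if: always()`. A rough local equivalent is sketched below, assuming a hypothetical wrapper function `run_bench` that is not part of the PR. The `MISE` variable is an illustrative override so the sequence can be dry-run without Docker.

```shell
#!/usr/bin/env bash
# Sketch only: run_bench is a hypothetical wrapper, not part of the PR.
# It runs the bench tasks in order and always attempts bench:down afterwards,
# mirroring the `if: always()` teardown step in the workflow.
run_bench() {
  # MISE is left unquoted below on purpose so a value such as "echo mise"
  # splits into a command plus its first argument.
  local mise=${MISE:-mise}
  local status=0
  {
    $mise run bench:up &&
    $mise run bench:generate &&
    $mise run bench:full
  } || status=$?
  # Teardown runs regardless of the outcome above.
  $mise run bench:down || true
  return "$status"
}

# Dry run (commented out): MISE="echo mise" run_bench
```

The function returns the exit status of the first failing task, so a caller can still tell that the benchmark run failed even though teardown succeeded.
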
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| # CipherStash Proxy credentials | ||
| # Get these from https://dashboard.cipherstash.com | ||
| CS_CLIENT_ACCESS_KEY= | ||
| CS_DEFAULT_KEYSET_ID= | ||
| CS_CLIENT_KEY= | ||
| CS_CLIENT_ID= | ||
| CS_WORKSPACE_CRN= |
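
The `bench:up` task only checks that `tests/benchmarks/.env` exists, so a file copied from this template with the values left blank would pass that check. A minimal sketch of a stricter preflight follows, assuming a hypothetical `check_env_file` helper that is not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: check_env_file is a hypothetical helper, not part of the PR.
# It reports every required key that is missing or has an empty value.
check_env_file() {
  local file=$1 key missing=0
  for key in CS_CLIENT_ACCESS_KEY CS_DEFAULT_KEYSET_ID CS_CLIENT_KEY \
             CS_CLIENT_ID CS_WORKSPACE_CRN; do
    # Require "KEY=" followed by at least one character; -s silences the
    # error grep would print if the file does not exist.
    if ! grep -Eqs "^${key}=.+" "$file"; then
      echo "ERROR: $key is missing or empty in $file" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (commented out): check_env_file tests/benchmarks/.env
```
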
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| # Generated reports (too large for git, regenerated on demand) | ||
| reports/* | ||
| !reports/.gitkeep | ||
|
|
||
| # Local Proxy credentials | ||
| .env |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| # EQL Scheduled Benchmarks (Tier 2) | ||
|
|
||
| Heavy-weight performance benchmarks that run weekly in CI against 100K-row | ||
| encrypted datasets. Complements the Tier 1 tests in `tests/sqlx/tests/bench_*`. | ||
|
|
||
| ## What this is | ||
|
|
||
| - Brings up Postgres + CipherStash Proxy via docker-compose | ||
| - Inserts 100K plaintext rows through the Proxy (which encrypts them) | ||
| - Runs each P0/P1/P2 query pattern 10 times | ||
| - Reads `pg_stat_statements` for statistical aggregates | ||
| - Outputs JSON + Markdown reports | ||
|
|
||
| ## Local usage | ||
|
|
||
| ```bash | ||
| # Populate credentials | ||
| cp tests/benchmarks/.env.example tests/benchmarks/.env | ||
| # Edit .env with your CipherStash credentials | ||
|
|
||
| # Start Postgres + Proxy | ||
| mise run bench:up | ||
|
|
||
| # Build EQL and generate 100K dataset (bench:generate depends on build) | ||
| mise run bench:generate | ||
|
|
||
| # Run the full Tier 2 suite | ||
| mise run bench:full | ||
|
|
||
| # Results land in tests/benchmarks/reports/ | ||
| ``` | ||
|
|
||
| ## CI usage | ||
|
|
||
| Runs automatically every Monday at 03:00 UTC via | ||
| `.github/workflows/benchmark.yml`. It can also be triggered manually from the | ||
| GitHub Actions UI (Run workflow button). | ||
|
|
||
| ## Why a separate workflow | ||
|
|
||
| - 100K generation takes ~100 seconds via the Proxy | ||
| - The slowest pattern (`bench_ore_order_by_limit`) takes several seconds per run on 100K rows | ||
| - Regular PR CI must stay under 10 minutes; this suite would blow that budget | ||
|
|
||
| ## Output | ||
|
|
||
| `tests/benchmarks/reports/benchmark-YYYY-MM-DD.{json,md}`, uploaded as a | ||
| GitHub Actions artifact named `benchmark-report-<run-id>`. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| services: | ||
| postgres: | ||
| image: postgres:17 | ||
| container_name: bench-postgres | ||
| command: > | ||
| postgres | ||
| -c track_functions=all | ||
| -c shared_preload_libraries=pg_stat_statements | ||
| -c pg_stat_statements.track=all | ||
| -c pg_stat_statements.max=10000 | ||
| ports: | ||
| - "127.0.0.1:7433:5432" | ||
| environment: | ||
| POSTGRES_DB: cipherstash | ||
| POSTGRES_USER: cipherstash | ||
| POSTGRES_PASSWORD: password | ||
| healthcheck: | ||
| test: ["CMD-SHELL", "pg_isready -U cipherstash"] | ||
| interval: 1s | ||
| timeout: 5s | ||
| retries: 10 | ||
| networks: | ||
| - bench | ||
|
|
||
| proxy: | ||
| image: cipherstash/proxy:latest | ||
| container_name: bench-proxy | ||
| ports: | ||
| - "127.0.0.1:6433:6432" | ||
| environment: | ||
| CS_DATABASE__NAME: cipherstash | ||
| CS_DATABASE__USERNAME: cipherstash | ||
| CS_DATABASE__PASSWORD: password | ||
| CS_DATABASE__HOST: postgres | ||
| CS_DATABASE__PORT: 5432 | ||
| # EQL install is performed explicitly by generate.sh before schema.sql runs. | ||
| # Leaving Proxy's own install off avoids racing against generate.sh. | ||
| CS_DATABASE__INSTALL_EQL: "false" | ||
| CS_CLIENT_ACCESS_KEY: ${CS_CLIENT_ACCESS_KEY} | ||
| CS_DEFAULT_KEYSET_ID: ${CS_DEFAULT_KEYSET_ID} | ||
| CS_CLIENT_KEY: ${CS_CLIENT_KEY} | ||
| CS_CLIENT_ID: ${CS_CLIENT_ID} | ||
| CS_WORKSPACE_CRN: ${CS_WORKSPACE_CRN} | ||
| healthcheck: | ||
| # Probe the Proxy's pg-protocol listener (no auth handshake required). | ||
| # busybox `nc` is present in the cipherstash/proxy image. | ||
| test: ["CMD-SHELL", "nc -z localhost 6432"] | ||
| interval: 1s | ||
| timeout: 5s | ||
| retries: 30 | ||
| depends_on: | ||
| postgres: | ||
| condition: service_healthy | ||
| networks: | ||
| - bench | ||
|
|
||
| networks: | ||
| bench: | ||
| driver: bridge |
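
The Proxy healthcheck above polls `nc -z localhost 6432` inside the container, up to 30 times at one-second intervals, and `bench:up` relies on `docker compose up --wait` to block on it. If the stack is started without `--wait`, a similar probe can be sketched from the host with bash's `/dev/tcp` redirection. The function name `wait_for_port` and its arguments are illustrative assumptions; `/dev/tcp` is a bash feature and is not available in POSIX sh.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_port is a hypothetical helper, not part of the PR.
# Usage: wait_for_port HOST PORT [RETRIES] [INTERVAL_SECONDS]
wait_for_port() {
  local host=$1 port=$2 retries=${3:-30} interval=${4:-1}
  local attempt
  for (( attempt = 1; attempt <= retries; attempt++ )); do
    # The subshell tries to open a TCP connection on fd 3; the descriptor is
    # closed again when the subshell exits. Failure means nothing is listening.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep "$interval"
  done
  echo "ERROR: ${host}:${port} not reachable after ${retries} attempts" >&2
  return 1
}

# Example call (commented out), using the host-side Proxy port from the
# compose file: wait_for_port 127.0.0.1 6433
```

Like `nc -z`, this only confirms that the listener accepts TCP connections; it does not perform a Postgres protocol handshake.
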
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
|
|
||
| # Generates a 100K-row encrypted bench dataset via CipherStash Proxy. | ||
| # No dump is written in v1 — the Tier 2 workflow regenerates fresh each run. | ||
| # | ||
| # Prerequisites: | ||
| # - mise run build (produces release/cipherstash-encrypt.sql) | ||
| # - docker compose -f tests/benchmarks/docker-compose.yml up -d --wait | ||
| # - tests/benchmarks/.env populated with CipherStash credentials | ||
|
|
||
| REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" | ||
| SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | ||
| EQL_SQL="$REPO_ROOT/release/cipherstash-encrypt.sql" | ||
| SCALE="${1:-100k}" | ||
|
|
||
| case "$SCALE" in | ||
| 100k) ROWS=100000 ;; | ||
| *) echo "Unsupported scale: $SCALE (only 100k in v1)" >&2; exit 1 ;; | ||
| esac | ||
|
|
||
| if [ ! -f "$EQL_SQL" ]; then | ||
| echo "ERROR: $EQL_SQL not found. Run 'mise run build' first." >&2 | ||
| exit 1 | ||
| fi | ||
|
|
||
| PG_URL="postgresql://cipherstash:password@localhost:7433/cipherstash" | ||
| PROXY_URL="postgresql://cipherstash:password@localhost:6433/cipherstash" | ||
|
|
||
| echo "==> Installing EQL into bench-postgres" | ||
| psql "$PG_URL" -v ON_ERROR_STOP=1 -f "$EQL_SQL" >/dev/null | ||
|
|
||
| echo "==> Applying bench schema and Proxy search configuration" | ||
| psql "$PG_URL" -v ON_ERROR_STOP=1 -f "$SCRIPT_DIR/schema.sql" | ||
|
|
||
| echo "==> Inserting $ROWS plaintext rows through Proxy (this encrypts them)" | ||
| # generate_series emits plaintext rows; Proxy intercepts and encrypts each | ||
| # column per the search config applied in schema.sql. | ||
| psql "$PROXY_URL" -v ON_ERROR_STOP=1 -c " | ||
| INSERT INTO bench (encrypted_text, encrypted_int, encrypted_bigint) | ||
| SELECT | ||
| ('text_' || (((gs - 1) % 1000) + 1))::text, | ||
| (((gs - 1) % 1000) + 1)::int, | ||
| (((gs - 1) % 1000) + 1)::bigint * 1000000000 | ||
| FROM generate_series(1, $ROWS) AS gs; | ||
| " | ||
|
|
||
| echo "==> Creating indexes and running ANALYZE" | ||
| psql "$PG_URL" -v ON_ERROR_STOP=1 -c " | ||
| CREATE INDEX IF NOT EXISTS bench_text_hmac_idx ON bench USING hash (eql_v2.hmac_256(encrypted_text)); | ||
| CREATE INDEX IF NOT EXISTS bench_text_ore_idx ON bench USING btree (encrypted_text eql_v2.encrypted_operator_class); | ||
| CREATE INDEX IF NOT EXISTS bench_int_ore_idx ON bench USING btree (encrypted_int eql_v2.encrypted_operator_class); | ||
| CREATE INDEX IF NOT EXISTS bench_bigint_ore_idx ON bench USING btree (encrypted_bigint eql_v2.encrypted_operator_class); | ||
| CREATE INDEX IF NOT EXISTS bench_text_bloom_idx ON bench USING gin (eql_v2.bloom_filter(encrypted_text)); | ||
| ANALYZE bench; | ||
| " | ||
|
|
||
| echo "==> Done. Rows: $ROWS" |
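
The INSERT above maps each `generate_series` value `gs` to `((gs - 1) % 1000) + 1`, so the 100,000 rows cycle through 1,000 distinct plaintext values and each value appears 100 times. The same arithmetic in bash, as a quick sanity check (the function names `bench_value` and `bench_bigint` are illustrative and not part of the PR):

```shell
#!/usr/bin/env bash
# Sketch only: these helpers mirror the SQL expressions in generate.sh.

# Value stored in encrypted_int (and used as the text_ suffix) for row gs.
bench_value() {
  local gs=$1
  echo $(( ((gs - 1) % 1000) + 1 ))
}

# Value stored in encrypted_bigint for row gs: the same key scaled by 1e9.
bench_bigint() {
  local v
  v=$(bench_value "$1")
  echo $(( v * 1000000000 ))
}

bench_value 1        # prints 1
bench_value 1000     # prints 1000
bench_value 1001     # prints 1 (the cycle restarts)
bench_value 100000   # prints 1000
bench_bigint 1000    # prints 1000000000000
```

Each distinct value therefore matches 100000 / 1000 = 100 rows, which is the selectivity an equality lookup on any of the three columns should see.
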
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| -- Bench schema for Tier 2 benchmarks. | ||
| -- Applied against the bench-postgres container AFTER EQL has been explicitly | ||
| -- installed by generate.sh, which loads release/cipherstash-encrypt.sql | ||
| -- directly rather than relying on Proxy's async install. | ||
|
|
||
| DROP TABLE IF EXISTS bench; | ||
|
|
||
| CREATE TABLE bench ( | ||
| id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, | ||
| encrypted_text eql_v2_encrypted, | ||
| encrypted_int eql_v2_encrypted, | ||
| encrypted_bigint eql_v2_encrypted | ||
| ); | ||
|
|
||
| -- Proxy search configuration: tells Proxy which index terms to generate | ||
| -- for each column when plaintext is inserted. | ||
| -- | ||
| -- Signature: eql_v2.add_search_config(table, column, index, cast_as) | ||
| -- (see src/config/functions.sql). add_search_config calls activate_config | ||
| -- internally when migrating=false, so no explicit activate_config call. | ||
|
|
||
| -- text column: equality (hmac), pattern match (bloom), ordering (ore) | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_text', 'unique', 'text'); | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_text', 'match', 'text'); | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_text', 'ore', 'text'); | ||
|
|
||
| -- integer column: equality + ORE range/ordering | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_int', 'unique', 'int'); | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_int', 'ore', 'int'); | ||
|
|
||
| -- bigint column: equality + ORE range/ordering | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_bigint', 'unique', 'big_int'); | ||
| SELECT eql_v2.add_search_config('bench', 'encrypted_bigint', 'ore', 'big_int'); | ||
|
|
||
| -- Indexes are created in generate.sh after the data load, followed by ANALYZE. |