Skip to content

kms + multi-tenancy#78

Open
JimA-cyborg wants to merge 9 commits into
mainfrom
multi-tenancy-2
Open

kms + multi-tenancy#78
JimA-cyborg wants to merge 9 commits into
mainfrom
multi-tenancy-2

Conversation

@JimA-cyborg
Copy link
Copy Markdown
Contributor

@JimA-cyborg JimA-cyborg commented May 26, 2026

KMS + Multi-Tenancy: per-index KMS routing

Summary

Tracks cyborgdb-service's Phase 1 per-index KMS routing slice in the Python
SDK. Regenerates the OpenAPI client against openapi.json v0.16.1, makes
index_key optional across all data-plane request models, adds a new
kms_name parameter to Client.create_index, and ships offline + live
BYOK tests covering the three KMS posture variants.

End-state: an SDK caller can create and re-open a fully KMS-backed index
without ever holding the data key, while the legacy SDK-supplied-key path
continues to work unchanged.

Motivation

The service moved to per-index KMS routing: each index now carries a
KMSBlob describing which kms.registry slot wraps its KEK
(provider: aws-kms, aws, or none). The Python SDK was still sending
the SDK-held key on every request, which:

  1. Blocks customers with HSM-resident or Secrets-Manager-backed KEKs from
    using the SDK at all.
  2. Forces SDK callers to keep persistent state (the 32-byte key) per index,
    even when the service is fully capable of resolving it.

This PR brings the Python SDK to parity with the Go SDK port (cyborgdb-go
PR #27) and the spec already shipped in cyborgdb-service.

Changes

  • Spec + generator pin: openapi.json bumped to v0.16.1; openapitools.json
    added to pin openapi-generator-cli to 7.22.0 for reproducible regenerations.
  • update-openapi-client.sh: prefers npm-distributed openapi-generator-cli,
    falls back to brew's openapi-generator.
  • Regenerated cyborgdb/openapi_client/: kms_name added to
    CreateIndexRequest; index_key flipped to Optional[StrictStr] across
    all request models. Stale index_config.py + 4× index_ivf*_model.py
    removed (no longer in the spec).
  • cyborgdb/client/client.py:
    • New module-level _validate_index_key helper.
    • Client.create_index rewritten to the three-mode contract: index_key
      only / kms_name only / both (the latter is only valid against a
      provider: none registry slot, enforced server-side).
    • Client.load_index's index_key is now Optional[bytes]; absent for
      KMS-backed indexes.
    • create_index builds the EncryptedIndex handle first (no network I/O)
      and reuses its cached hex for the request, so the key is hex-encoded
      exactly once; the now-unused import binascii was dropped.
  • cyborgdb/client/encrypted_index.py:
    • __init__ accepts Optional[bytes]; the hex encoding is pre-computed
      once and cached on self._index_key_hex.
    • _key_to_hex() returns Optional[str]; the ~13 callsites are
      transparent through the now-optional request models.
    • index_type / index_config getters no longer swallow ApiException
      and return "unknown" / {} — they let it propagate so a
      missing/inaccessible index fails at load_index time instead of
      returning a phantom handle. Caller-visible behavior change (see
      Risk assessment).
    • Added a _ior() helper for the repeated name+key
      IndexOperationRequest construction; the four describe/delete call
      sites now route through it.
  • tests/test_api_contract.py: signature expectations updated for the
    new kms_name parameter + new defaults; existing test_08_client_create_index
    switched to keyword args; appended TestSDKConstructionOffline (9 offline
    tests covering optional-key, KMS-only, and provider:none mixed-mode paths).
    Model imports live in the file header.
  • tests/test_kms_byok.py (new): env-gated live BYOK round-trips for
    all three postures (TestKMSReal / TestKMSSecretsManager /
    TestProviderNone), sharing a _KMSRoundTripBase data-plane mixin. Plus
    TestKMSRealRejectsSDKKey, a negative test asserting that
    index_key + kms_name against a real-provider slot is rejected (server
    400 surfaced as ValueError) — the SDK forwards both fields untouched.
  • README.md: new "Bring Your Own Key (BYOK) via KMS" section under
    Advanced Usage; "Flexible Indexing" feature bullet replaced with
    "Encrypted DiskIVF Indexing" (matches actual core capabilities).

Testing

  • Unit tests added/updated
  • Integration tests pass
  • Manual testing notes:
    • Offline: pytest tests/test_api_contract.py::TestSDKConstructionOffline -v
      → 9/9 pass.
    • Full collection: pytest tests/ --collect-only -q → 129 tests, no
      import errors.
    • Live BYOK (tests/test_kms_byok.py) is env-gated on
      CYBORGDB_KMS_NAME_REAL, CYBORGDB_KMS_NAME_SM, CYBORGDB_KMS_NAME_NONE.
      Skip cleanly when unset. Needs to be run against a cyborgdb-service
      instance with a configured kms.registry. Not yet verified on this
      branch — same CI gap as the Go side (called out in §9 of python.md).
    • Signature audit: inspect.signature(Client.create_index) → ordered
      (index_name, index_key, kms_name, dimension, embedding_model, metric, storage_precision)
      with None defaults for everything past index_name.

Risk assessment

  • Blast radius:

    • Spec change is additive on the wire. kms_name is new and optional;
      index_key flips from required to optional. Existing callers passing
      index_key keep working unchanged — the service treats key-only requests
      exactly as before (provider: none, SDK-supplied KEK).
    • API surface change: Client.create_index/load_index parameter
      positions are stable for callers using kwargs. Callers passing
      client.create_index(name, key, dim) positionally still work
      (kms_name slotted after index_key, before dimension). Mixed-
      positional-and-keyword callers passing dimension positionally
      (client.create_index("x", k, 128)) break: 128 now binds to
      kms_name. We don't expect this in practice (most callers use
      kwargs), but worth flagging.
    • Generated-client churn is large but mechanical (49 model files
      touched). The only semantic delta is in the two schemas the spec changed.
    • No DB / data migration required. Existing indexes are unaffected;
      the service still routes them through their stored KMSBlob.
    • Behavior change: index_type / index_config now raise. Reading
      index.index_type or index.index_config directly now raises
      ApiException when the describe call fails (missing index, auth error,
      service down) instead of silently returning "unknown" / {}. Audited
      every caller across cyborgdb/, tests/, and integrations/ — none
      relied on the old sentinels. Worth a changelog note for downstream
      consumers reading these properties directly.
  • Rollback plan:

    • git revert <merge> and re-publish. The spec is already v0.16.1 on
      cyborgdb-service so the service will still accept v0.16.0-style
      requests (additive change), meaning a downgrade of just the SDK is safe.
    • Worst-case partial rollback: revert only client.py + encrypted_index.py
      to restore mandatory index_key, leaving the regenerated openapi_client
      intact. Existing callers continue to work.

Reviewers, please focus on:

  • load_index silent-existence-check (client.py:252_ = index.index_type).
    EncryptedIndex.index_type swallows ApiException and returns "unknown",
    so this line doesn't actually validate the index exists. Pre-existing on
    main, but KMS-backed loads now lean on it harder. Decide: let
    ApiException propagate, or add an explicit assertion.
  • Three-mode contract semantics (client.py::create_index docstring).
    Confirm the documented behavior matches your expectation, especially the
    index_key + kms_name case for provider: none slots vs. the 400 for
    real-KMS slots.
  • Backwards compat for positional callers. The new kms_name parameter
    slots between index_key and dimension. Anyone calling
    client.create_index(name, key, dim, ...) positionally will silently bind
    dim to kms_name. Is this acceptable, or should we put kms_name after
    storage_precision?
  • Duplicated IndexOperationRequest(...) construction in
    encrypted_index.py (4 occurrences). Worth a _ior() helper, or leave
    alone as pre-existing scope-creep?
  • tests/test_kms_byok.py ordering coupling. setUpClass + type(self).index = index
    assumes test_01_* runs before test_03_*/test_04_*. Port of the Go SDK's
    shape — should we add explicit ordering, or rely on unittest's
    alphabetical default?
  • Is python.md (635 lines) intended to ship with the package? It's
    the internal porting guide. If not, add to .gitignore + MANIFEST.in
    exclusions.
  • Live-CI gap for test_kms_byok.py. Same gap exists on the Go side
    — we need a CI slot wiring the three CYBORGDB_KMS_NAME_* envs against
    a configured cyborgdb.yaml. Acceptable as follow-up, or block on it?

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ API Contract Change Detected

This PR modifies the public API contract of the CyborgDB Python SDK.

Please provide an explanation for this change:

  • Why is this change necessary?
  • Is this a breaking change or backward compatible?
  • Have you updated the documentation?

This review must be dismissed or addressed before the PR can be merged.

Comment thread cyborgdb/client/encrypted_index.py Outdated
Comment thread cyborgdb/client/client.py Outdated
Comment thread cyborgdb/client/client.py
Comment thread cyborgdb/client/client.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants