Skip to content

coordinator: domain-separate transit engine keys from workload secrets#2500

Open
sespiros wants to merge 1 commit into
mainfrom
sse/transit-key-domain-separation
Open

coordinator: domain-separate transit engine keys from workload secrets#2500
sespiros wants to merge 1 commit into
mainfrom
sse/transit-key-domain-separation

Conversation

@sespiros

@sespiros sespiros commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

The transit engine derived its AES keys with DeriveWorkloadSecret("_"), the same HKDF namespace that hands each pod its workload secret. A pod deployed with WorkloadSecretID "0_vault_unsealing" thus received, in its own workload secret, the exact key the transit engine uses to wrap OpenBao's master key. That defeats the transit API's per-workload authorization on both directions: the pod can unwrap the auto-unseal blob offline, and (since it holds the key) forge a blob the Coordinator will decrypt, injecting an attacker-chosen master key on OpenBao's untrusted storage.

Derive transit keys from a dedicated seed and HKDF label so their keyspace can never alias a workload secret. No decrypt-side fallback to the old derivation is provided on purpose: honoring the legacy key would keep the forgery path open, since the server cannot tell an original legacy blob from an attacker-forged one. This is a breaking change for existing OpenBao auto-unseal deployments, which must migrate their master key (seal-migration or re-init) onto the new key on upgrade.

Fixes CON-201

TODO

  • Decide on rollout/backwards compatibility as this is a breaking change! This is now tagged as a breaking change for the release notes.

The transit engine derived its AES keys with DeriveWorkloadSecret("<version>_<name>"),
the same HKDF namespace that hands each pod its workload secret. A pod deployed with
WorkloadSecretID "0_vault_unsealing" thus received, in its own workload secret, the
exact key the transit engine uses to wrap OpenBao's master key. That defeats the
transit API's per-workload authorization on both directions: the pod can unwrap the
auto-unseal blob offline, and (since it holds the key) forge a blob the Coordinator
will decrypt, injecting an attacker-chosen master key on OpenBao's untrusted storage.

Derive transit keys from a dedicated seed and HKDF label so their keyspace can never
alias a workload secret. No decrypt-side fallback to the old derivation is provided on
purpose: honoring the legacy key would keep the forgery path open, since the server
cannot tell an original legacy blob from an attacker-forged one. This is a breaking
change for existing OpenBao auto-unseal deployments, which must migrate their master
key (seal-migration or re-init) onto the new key on upgrade.

Signed-off-by: Spyros Seimenis <sse@edgeless.systems>
@sespiros sespiros requested a review from burgerdev July 2, 2026 13:10
@sespiros sespiros added the changelog PRs that should be part of the release notes label Jul 2, 2026
@linear-code

linear-code Bot commented Jul 2, 2026

Copy link
Copy Markdown

CON-201

@burgerdev burgerdev left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this, lgtm!

As migration path, we could recommend starting a pod with the problematic 0_$ID to obtain the old sealing secret, then reseal using the transit secret engine. I don't think we need to keep a backwards-compatible implementation around.

}, logger)
return
}
key, err := deriveEncryptionKey(r.Context(), guard, fmt.Sprintf("%d_%s", encReq.KeyVersion, workloadSecretID))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think of it, I remember that we picked %d_%s because generated workloadSecretIDs don't contain an underscore: #1199 (comment). We could enforce that in manifest.Validate, but that would be even more breaking.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I didn't like that _ was essentially that load-bearing. I initially solved this by changing the HKDF info label but then I thought that a dedicated seed would split the two keyspaces even more cleanly.

Comment on lines +240 to +249
decryptReqBody, err := json.Marshal(map[string]string{"ciphertext": string(bytes.Trim(ciphertext, `"`))})
require.NoError(err)

req := httptest.NewRequestWithContext(t.Context(), http.MethodPut, "/v1/transit/decrypt/"+name, bytes.NewReader(decryptReqBody))
req.Header.Set("Content-Type", "application/json")

rec := httptest.NewRecorder()
mux.ServeHTTP(rec, req)
res := rec.Result()
t.Cleanup(func() { _ = res.Body.Close() })

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you choose HTTP instead of calling the function directly?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which function should I be calling? The rest of the test suite exercises the API this way exercising the full end-to-end path. I initially wrote the test and it proved meaningful because in some initial version I experimented with adding various types of fallbacks which obviously all failed this test (as the comment describes).

@sespiros sespiros added breaking change A user-affecting breaking change and removed changelog PRs that should be part of the release notes labels Jul 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking change A user-affecting breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants