Use AgentAssertion downstream behind use_agent_identity#17388
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a9459f7442
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| anyhow::ensure!( | ||
| stored_identity.agent_runtime_id == agent_task.agent_runtime_id, | ||
| "agent task runtime {} does not match stored agent identity {}", | ||
| agent_task.agent_runtime_id, | ||
| stored_identity.agent_runtime_id | ||
| ); |
There was a problem hiding this comment.
Re-register agent task when runtime IDs diverge
Handle runtime mismatches by refreshing the cached task instead of bailing. SessionState reuses persisted agent_task, but identities can be regenerated (e.g., auth/account change or missing secret on restart). This ensure! turns that recoverable state drift into a hard failure, causing every downstream request to fail before sending any API call.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Moved/fixed in the replacement PR: #17807.
We changed this from a hard ensure! failure into a recoverable stale-task path. Assertion construction now reports AgentTaskRuntimeMismatch when the persisted task points at an old runtime, client setup maps that to CodexErr::AgentTaskStale, and the sampling loop clears that cached task, registers a fresh one once, and retries with a new task-scoped session.
Code pointers in #17807:
- mismatch detection:
codex/codex-rs/core/src/agent_identity/assertion.rs
Lines 15 to 52 in a87e178
- refresh/retry path:
codex/codex-rs/core/src/codex.rs
Lines 7072 to 7089 in a87e178
- regression coverage:
codex/codex-rs/core/src/client_tests.rs
Lines 397 to 424 in a87e178
5f089f4 to
f530190
Compare
a9459f7 to
38e8366
Compare
f530190 to
b6ea8a1
Compare
38e8366 to
e91a717
Compare
b6ea8a1 to
fd5337b
Compare
0ae9407 to
03e0861
Compare
d25b68d to
321c151
Compare
a45208e to
934b1b9
Compare
321c151 to
56cda90
Compare
8b333ce to
fec1370
Compare
6a72f1b to
3cbb522
Compare
fec1370 to
00f376a
Compare
3cbb522 to
c1c3c37
Compare
00f376a to
17dcda8
Compare
c1c3c37 to
edb4613
Compare
a76e07b to
e3e80fb
Compare
99099a4 to
fa9d205
Compare
e3e80fb to
7c631ff
Compare
fa9d205 to
c66f085
Compare
fdaec7a to
bc9d0fe
Compare
| ) -> Self { | ||
| Self::new_with_agent_identity_manager( | ||
| auth_manager, | ||
| /*agent_identity_manager*/ None, |
There was a problem hiding this comment.
Just to clarify - it is plumbed but not fully at this point?
There was a problem hiding this comment.
I'm not sure what happened, but somehow this PR was merged. I'll respond anyway, and we can figure out what's going on, I'll re-open this on another PR, I think.
There was a problem hiding this comment.
Ah ok, my codex fucked up, and github closed this automatically. Will reopen here: #17807
There was a problem hiding this comment.
No worries - let me know if you want to continue the conversation here or there.
There was a problem hiding this comment.
Yep, that read is right: the path was plumbed, but we are intentionally not making API-key auth work with AgentAssertion in this PR.
In the replacement PR (#17807), the current scope is explicit: AgentAssertion is only for ChatGPT-backed Codex sessions, and API-key sessions continue using the API key until the registration service supports API-key identity binding. The downstream request setup still switches to AgentAssertion when a task exists for a ChatGPT-auth session.
Code pointers in #17807:
- ChatGPT-only scope comment:
codex/codex-rs/core/src/agent_identity.rs
Lines 454 to 459 in a87e178
- downstream AgentAssertion header selection:
codex/codex-rs/core/src/client.rs
Lines 751 to 775 in a87e178
- API-key task registration remains skipped:
codex/codex-rs/core/src/agent_identity/task_registration.rs
Lines 193 to 204 in a87e178
There was a problem hiding this comment.
Let me know if these followups answer your questions @shijie-oai
| client: ModelClient, | ||
| websocket_session: WebsocketSession, | ||
| agent_task: Option<RegisteredAgentTask>, | ||
| cache_websocket_session_on_drop: bool, |
There was a problem hiding this comment.
let me play it back a little bit - cache_websocket_session_on_drop is a bit that indicates if a websocket session can be cached and reused on drop.
There was a problem hiding this comment.
Exactly, and the replacement PR (#17807) makes that bit a little less load-bearing for auth correctness.
The cache now stores the RegisteredAgentTask alongside the WebSocket session. cache_websocket_session_on_drop only controls whether this session should be put back in the cache at all; actual reuse is allowed only when the cached connection's task matches the next session's task. That means same-thread/same-task WebSockets can be reused, while taskless vs task-scoped or different-task auth contexts are dropped instead of mixed.
Code pointers in #17807:
- task-keyed cache/reuse logic:
codex/codex-rs/core/src/client.rs
Lines 383 to 459 in a87e178
- cache-on-drop stores the task key:
codex/codex-rs/core/src/client.rs
Lines 917 to 923 in a87e178
- reuse coverage for the same task:
codex/codex-rs/core/src/client_tests.rs
Lines 503 to 578 in a87e178
| let mut prewarmed_client_session = prewarmed_client_session; | ||
| if agent_task.is_some() |
There was a problem hiding this comment.
For my personal understanding of this code block - why would an agent task associated with a thread not able to use the prewarmed client session of that thread? Would that be a possible scenario?
There was a problem hiding this comment.
Chatted with Codex a bit locally and correct me if I am wrong - in this case an initial prewarmed thread is created without agent_task (none) and if we want to reuse it, it has to have a matching task. The effectively meant that the initial prewarmed thread on thread creation is always discarded and we do not get the benefit of it?
There was a problem hiding this comment.
Yep, your follow-up was the key point here: a taskless startup-prewarmed WebSocket is not useful once the first real turn needs task-scoped AgentAssertion auth, because we would have to discard that taskless connection.
In the replacement PR (#17807), we stopped scheduling startup prewarm for ChatGPT/AgentIdentity sessions. Instead, task registration starts once a user turn is accepted and overlaps with the rest of turn prep, then we await it before the model request. That preserves the "do not create a task at startup" behavior while still warming up as late/early as core can observe. We did not hook literal keypress/typing events in this PR, since that would require app/TUI input plumbing rather than the core turn path.
Code pointers in #17807:
- skip taskless startup prewarm for AgentIdentity sessions:
codex/codex-rs/core/src/session_startup_prewarm.rs
Lines 159 to 168 in a87e178
- lazy task registration overlapped with turn prep:
codex/codex-rs/core/src/codex.rs
Lines 6357 to 6394 in a87e178
- task-aware WebSocket reuse after that:
codex/codex-rs/core/src/client.rs
Lines 383 to 459 in a87e178
There was a problem hiding this comment.
Sorry I am re-reading those and I am not 100% sure no longer leveraging prewarming for chatgpt auth like method is ideal. Correct me if I am wrong but in long term agent identity would be applicable to all user instance and if that is the case we take away all prewarming capability essentially and lose the performance benefit (we are still keeping it for the api auth path)
bc9d0fe
into
dev/adrian/codex/agent-identity-register-task
Summary
Stack PR 4 of 4 for feature-gated agent identity support.
This PR uses the registered agent task to attach
AgentAssertionauthorization on downstream backend paths whenfeatures.use_agent_identityis enabled.Stack
features.use_agent_identityValidation
Covered as part of the local stack validation pass:
just fmtcargo test -p codex-core --lib agent_identitycargo test -p codex-core --lib agent_assertioncargo test -p codex-core --lib websocket_agent_taskcargo test -p codex-api api_bridgecargo build -p codex-cli --bin codexNotes
The full local app-server E2E path is still being debugged after PR creation. The current branch stack is directionally ready for review while that follow-up continues.