Merged
258 commits
9c3abcd
[codex] Move config loading into codex-config (#19487)
pakrym-oai Apr 26, 2026
dda8199
permissions: migrate approval and sandbox consumers to profiles (#19393)
bolinfest Apr 26, 2026
ba159cb
Fix codex-core config test type paths (#19726)
pakrym-oai Apr 26, 2026
4c58e64
test: increase core-all-test shard count to 16 (#19727)
bolinfest Apr 26, 2026
0bda816
Split MCP connection modules (#19725)
aibrahim-oai Apr 26, 2026
35bc6e3
Delete unused ResponseItem::Message.end_turn (#19605)
andmis Apr 27, 2026
2cb8746
permissions: remove core legacy policy round trips (#19394)
bolinfest Apr 27, 2026
1f304dd
Allow agents.max_threads to work with multi_agent_v2 (#19733)
andmis Apr 27, 2026
ad57a3f
permissions: finish profile-backed app surfaces (#19395)
bolinfest Apr 27, 2026
c3e6084
inline hostname resolution for remote sandbox config (#19739)
abhinav-oai Apr 27, 2026
0d8cdc0
permissions: centralize legacy sandbox projection (#19734)
bolinfest Apr 27, 2026
8033b6a
Add /auto-review-denials retry approval flow (#19058)
won-openai Apr 27, 2026
0ccd659
permissions: store only constrained permission profiles (#19735)
bolinfest Apr 27, 2026
523e4aa
permissions: constrain requirements as profiles (#19736)
bolinfest Apr 27, 2026
a6ca39c
permissions: derive legacy exec policies at boundaries (#19737)
bolinfest Apr 27, 2026
4f1d5f0
Add Codex issue digest skill (#19779)
etraut-openai Apr 27, 2026
f8c527e
multi_agent_v2: move thread cap into feature config (#19792)
jif-oai Apr 27, 2026
01ab25d
feat: use git-backed workspace diffs for memory consolidation (#18982)
jif-oai Apr 27, 2026
5d314f3
Allow Phase 2 memory claims after retry exhaustion (#19809)
jif-oai Apr 27, 2026
79b4f69
Avoid rewriting Phase 2 selection on clean workspace (#19812)
jif-oai Apr 27, 2026
f431ec1
nit: one more fix (#19813)
jif-oai Apr 27, 2026
bb83eec
chore: split memories part 1 (#19818)
jif-oai Apr 27, 2026
6c51bf0
Hide rewind preview when no user message exists (#19510)
etraut-openai Apr 27, 2026
0e2300c
Persist shell mode commands in prompt history (#19618)
etraut-openai Apr 27, 2026
48dd7b5
Render delegated patch approval details (#19709)
etraut-openai Apr 27, 2026
4ed22fc
Streamline plugin, apps, and skills handlers (#19490)
pakrym-oai Apr 27, 2026
2009f6e
refactor: make auth loading async (#19762)
efrazer-oai Apr 27, 2026
c208455
ci: pin npm staging smoke test to a recent rust-release run (#19854)
bolinfest Apr 27, 2026
cafe717
ci: migrate Bazel setup away from archived setup-bazelisk (#19851)
bolinfest Apr 27, 2026
e5709db
Streamline account and command handlers (#19491)
pakrym-oai Apr 27, 2026
85c1500
fix: filter dynamic deferred tools from model_visible_specs (#19771)
sayan-oai Apr 27, 2026
215d5a8
[codex-analytics] remove ga flag (#19863)
rhan-oai Apr 27, 2026
277186e
Cap original-detail image token estimates (#19865)
fjord-oai Apr 27, 2026
850f035
Fix filtered thread-list resume regression in TUI (#19591)
etraut-openai Apr 27, 2026
0bd25ab
Delay approval prompts while typing (#19513)
etraut-openai Apr 27, 2026
52c06b8
Preserve TUI markdown list spacing after code blocks (#19706)
etraut-openai Apr 27, 2026
4b55979
permissions: remove cwd special path (#19841)
bolinfest Apr 27, 2026
c5e2921
Streamline thread start handler (#19492)
pakrym-oai Apr 27, 2026
798de22
[codex-backend] Prefer state git metadata in filtered thread lists (#…
joeytrasatti-openai Apr 27, 2026
5c30d79
Streamline thread mutation handlers (#19493)
pakrym-oai Apr 27, 2026
4ded800
[codex] Shard exec Bazel integration test (#19862)
starr-openai Apr 27, 2026
0f40261
Publish Python SDK with Codex-pinned versioning (#18996)
sdcoffey Apr 27, 2026
2be9fd5
Streamline thread read handlers (#19494)
pakrym-oai Apr 27, 2026
30c5c76
[codex] Trace cancelled inference streams (#19839)
cassirer-openai Apr 27, 2026
739ab6b
Streamline thread resume and fork handlers (#19495)
pakrym-oai Apr 27, 2026
e903d00
Streamline turn and realtime handlers (#19497)
pakrym-oai Apr 27, 2026
e64c765
Show action required in terminal title (#18372)
canvrno-oai Apr 27, 2026
dcd139b
Add MCP app feature flag (#19884)
mzeng-openai Apr 27, 2026
c5a495c
Streamline review and feedback handlers (#19498)
pakrym-oai Apr 27, 2026
755880e
permissions: derive config defaults as profiles (#19772)
bolinfest Apr 27, 2026
2f3b5ed
disallow fileparams metadata for custom mcps (#19836)
colby-oai Apr 28, 2026
a3350de
Refactor exec-server filesystem API into codex-file-system (#19892)
miz-openai Apr 28, 2026
7e8594f
Stabilize plugin MCP fixture tests (#19452)
dylan-hurd-oai Apr 28, 2026
4e05f30
Remove ghost snapshots (#19481)
pakrym-oai Apr 28, 2026
af95662
permissions: require profiles in TUI thread state (#19773)
bolinfest Apr 28, 2026
2307aa8
Allow /statusline and /title slash commands during active turns (#19917)
canvrno-oai Apr 28, 2026
c08177f
refactor: load agent identity runtime eagerly (#19763)
efrazer-oai Apr 28, 2026
6a8df2b
[codex-analytics] include user agent in default headers (#17689)
marksteinbrick-oai Apr 28, 2026
b7e5588
Clarify PR template invitation requirement (#19912)
etraut-openai Apr 28, 2026
5ba908d
Avoid persisting ShutdownComplete after thread shutdown (#19630)
etraut-openai Apr 28, 2026
bf38def
permissions: make SessionConfigured profile-only (#19774)
bolinfest Apr 28, 2026
fc2a691
permissions: derive snapshot sandbox projections (#19775)
bolinfest Apr 28, 2026
92fb848
Allow large remote app-server resume responses (#19920)
etraut-openai Apr 28, 2026
341550c
permissions: store thread sessions as profiles (#19776)
bolinfest Apr 28, 2026
0a32c8b
app-server-protocol: mark permission profiles experimental (#19899)
bolinfest Apr 28, 2026
b985768
Add `codex update` command (#19933)
etraut-openai Apr 28, 2026
7d72fc8
feat: Cache remote plugin bundles on install (#19914)
xl-openai Apr 28, 2026
803705f
Add remote plugin uninstall API (#19456)
xli-oai Apr 28, 2026
fd36838
Add MultiAgentV2 root and subagent context hints (#19805)
jif-oai Apr 28, 2026
431ebea
feat: split memories part 2 (#19860)
jif-oai Apr 28, 2026
b7c0f26
feat: fix hinting 2 (#19961)
jif-oai Apr 28, 2026
54d1401
feat: fix hinting 3 (#19963)
jif-oai Apr 28, 2026
fa127be
Stabilize memory Phase 2 input ordering (#19967)
jif-oai Apr 28, 2026
a9e5c34
feat: trigger memories from user turns with cooldown (#19970)
jif-oai Apr 28, 2026
0e8d6b8
fix: configure AgentIdentity AuthAPI base URL (#19904)
efrazer-oai Apr 28, 2026
1b74360
feat: skip memory startup when Codex rate limits are low (#19990)
jif-oai Apr 28, 2026
5a79dfa
feat: house-keeping memories 1 (#19998)
jif-oai Apr 28, 2026
21e1991
feat: house-keeping memories 2 (#20000)
jif-oai Apr 28, 2026
598bbcd
Preserve assistant phase for replayed messages (#19832)
friel-openai Apr 28, 2026
a61c785
Reset TUI keyboard reporting on exit (#19625)
etraut-openai Apr 28, 2026
5e73737
feat(tui): add configurable keymap support (#18593)
fcoury-oai Apr 28, 2026
0156b1e
[sandbox] Enforce protected workspace metadata paths (#19846)
evawong-oai Apr 28, 2026
5b7d6f5
feat: house-keeping memories 3 (#20005)
jif-oai Apr 28, 2026
087c9c1
TUI: use cumulative turn duration for worked-for separator (#19929)
etraut-openai Apr 28, 2026
4e0cf94
Terminate stdio MCP servers on shutdown to avoid process leaks (#19753)
etraut-openai Apr 28, 2026
ccec84b
Add turn start timestamp to turn metadata (#19473)
mchen-oai Apr 28, 2026
6138063
Strip connector provenance metadata from custom MCP tools (#19875)
colby-oai Apr 28, 2026
f6797c3
feat: verify agent identity JWTs with JWKS (#19764)
efrazer-oai Apr 28, 2026
0670d89
Enforce workspace metadata protections in Seatbelt (#19847)
evawong-oai Apr 28, 2026
01de13b
Record MCP result telemetry on mcp.tools.call spans (#19509)
mchen-oai Apr 28, 2026
273c2e2
Clarify network approval auto-review prompts (#19907)
maja-openai Apr 28, 2026
c6bcd27
feat(tui): suggest plan mode from composer drafts (#19901)
fcoury-oai Apr 28, 2026
bc5a1b9
Move local /resume cwd filtering into thread/list (#19931)
canvrno-oai Apr 28, 2026
a036584
fix(tui): let esc exit empty shell mode (#19986)
fcoury-oai Apr 28, 2026
4c68bd7
External agent session support (#19895)
stefanstokic-oai Apr 28, 2026
3afb185
fix(network-proxy): tighten network proxy bypass defaults (#20002)
viyatb-oai Apr 28, 2026
9e26613
permissions: add built-in default profiles (#19900)
bolinfest Apr 28, 2026
640a1b2
Fix plan mode nudge test after task completion signature change (#20045)
canvrno-oai Apr 28, 2026
de2ccf9
[codex] Add token usage to turn tracing spans (#19432)
charley-openai Apr 28, 2026
3377afd
fix(network-proxy): harden linux proxy bridge helpers (#20001)
viyatb-oai Apr 28, 2026
7f7c7c2
Fix log db batch flush flake (#19959)
dylan-hurd-oai Apr 28, 2026
0700f97
app-server: run initialized rpcs with keyed serialization (#17373)
euroelessar Apr 28, 2026
25ac0e4
Load cloud requirements for agent identity (#19708)
shijie-oai Apr 28, 2026
e1ba87c
fix(network-proxy): recheck network proxy connect targets (#19999)
viyatb-oai Apr 28, 2026
1de7a9b
app-server: allow remote_control runtime feature override (#20047)
euroelessar Apr 28, 2026
34d71d4
Make MultiAgentV2 wait minimum configurable (#20052)
jif-oai Apr 28, 2026
3b74a4d
tui: use permission profiles for sandbox state (#20008)
bolinfest Apr 28, 2026
10e2a73
app-server: disable remote control without sqlite (#20068)
euroelessar Apr 28, 2026
89698ad
[rollout-trace] Include x-request-id in rollout trace. (#20066)
cassirer-openai Apr 28, 2026
c6e7d56
Discover hooks bundled with plugins (#19705)
abhinav-oai Apr 28, 2026
66b0781
/plugins: add marketplace install flow (#18704)
canvrno-oai Apr 28, 2026
2e598df
fix: don't auto approve git -C ... (#20085)
owenlin0 Apr 28, 2026
3291463
Fix flaky plugin hook env test (#20088)
abhinav-oai Apr 28, 2026
2dbde94
fix(network-proxy): normalize network proxy host matching (#19995)
viyatb-oai Apr 28, 2026
8917228
core tests: submit turns with permission profiles (#20010)
bolinfest Apr 28, 2026
5e6cbba
Return None when auth refresh fails (#20092)
gpeal Apr 28, 2026
c6465c1
app-server: notify clients of remote-control status changes (#19919)
euroelessar Apr 28, 2026
2223b31
Refine Codex issue digest summaries (#20097)
etraut-openai Apr 28, 2026
7d15936
core tests: build user turns from permission profiles (#20011)
bolinfest Apr 29, 2026
52e79ee
core tests: migrate more turns to permission profiles (#20013)
bolinfest Apr 29, 2026
158b2a4
core tests: configure profiles directly (#20015)
bolinfest Apr 29, 2026
d6d79ff
core tests: send model turns with permission profiles (#20016)
bolinfest Apr 29, 2026
5b0d9df
Increase plugin hook env test timeout (#20100)
abhinav-oai Apr 29, 2026
d77d23d
core tests: migrate model/personality turns to profiles (#20018)
bolinfest Apr 29, 2026
2a8ce9b
core tests: migrate view image turns to profiles (#20021)
bolinfest Apr 29, 2026
162f4e3
core tests: migrate safety check turns to profiles (#20024)
bolinfest Apr 29, 2026
8d3992d
core tests: migrate plan item turns to profiles (#20026)
bolinfest Apr 29, 2026
3ef09c7
core tests: migrate tools tests to permission profiles (#20027)
bolinfest Apr 29, 2026
b599849
core tests: migrate permissions message tests to profiles (#20028)
bolinfest Apr 29, 2026
5d08315
core tests: migrate exec policy turns to profiles (#20030)
bolinfest Apr 29, 2026
af39e48
core tests: migrate prompt caching turns to profiles (#20032)
bolinfest Apr 29, 2026
1ea9041
core tests: migrate request permissions tool turns to profiles (#20033)
bolinfest Apr 29, 2026
026df71
core tests: migrate zsh-fork permissions to profiles (#20034)
bolinfest Apr 29, 2026
6662c0f
core tests: migrate compact turns to profiles (#20035)
bolinfest Apr 29, 2026
1dae578
core tests: migrate rmcp turns to profiles (#20037)
bolinfest Apr 29, 2026
1fed948
core tests: migrate apply patch turns to profiles (#20040)
bolinfest Apr 29, 2026
1211a90
core tests: migrate hook turns to profiles (#20041)
bolinfest Apr 29, 2026
ebdf3a8
Support disabling tool suggest for specific tools. (#20072)
mzeng-openai Apr 29, 2026
cb8b1bb
Support detect and import MCP, Subagents, hooks, commands from extern…
alexsong-oai Apr 29, 2026
f8fe96d
feat: disable capabilities by model provider (#19442)
celia-oai Apr 29, 2026
c9f7c88
fix: restore live event submit path for apply patch tests (#20108)
bolinfest Apr 29, 2026
24be9ac
Restore TUI working status after steer message is set (#19939)
canvrno-oai Apr 29, 2026
4c39ad3
Fix plugin list workspace settings test isolation (#20086)
canvrno-oai Apr 29, 2026
8c47e36
feat: expose provider capability bounds to app server clients (#20049)
celia-oai Apr 29, 2026
80fb070
feat: update Bedrock Mantle endpoint and GPT-5.4 model ID (#20109)
celia-oai Apr 29, 2026
e6db1a9
linux-sandbox: switch helper plumbing to PermissionProfile (#20106)
bolinfest Apr 29, 2026
6f328d5
Soften skill description budget warnings (#20112)
xl-openai Apr 29, 2026
e1ec9e6
Add environment provider snapshot (#20058)
starr-openai Apr 29, 2026
3d10ba9
chore(cli) deprecate --full-auto (#20133)
dylan-hurd-oai Apr 29, 2026
6ed0440
feat(cli): add explicit sandbox permission profiles (#20117)
viyatb-oai Apr 29, 2026
857146b
Delete multi_agent_v2 followup_task interrupt parameter (#20139)
andmis Apr 29, 2026
5597925
feat(cli): add sandbox profile config controls (#20118)
viyatb-oai Apr 29, 2026
d92c909
Fix migrated hook path rewriting (#20144)
alexsong-oai Apr 29, 2026
5cac3f8
Fix Windows pseudoconsole attribute handling for sandboxed PTY sessio…
iceweasel-oai Apr 29, 2026
c41b74c
nit: drop old memories things (#20186)
jif-oai Apr 29, 2026
70ac0f1
Make multi-agent v2 ignore agents.max_depth (#20180)
jif-oai Apr 29, 2026
91ca551
Use /goal resume for paused goals (#20082)
etraut-openai Apr 29, 2026
1c420a9
TUI: Remove core protocol dependency [1/7] (#20172)
etraut-openai Apr 29, 2026
cecca5a
Improve Windows process management edge cases (#19211)
iceweasel-oai Apr 29, 2026
df96699
[rollout-tracer] Match analysis messages on encrypted id. (#20123)
cassirer-openai Apr 29, 2026
4456298
TUI: Remove core protocol dependency [2/7] (#20173)
etraut-openai Apr 29, 2026
d0204c3
TUI: Remove core protocol dependency [3/7] (#20174)
etraut-openai Apr 29, 2026
47fba5d
[codex-backend] Prefer sqlite git info for rollout-path reads (#20228)
joeytrasatti-openai Apr 29, 2026
8356806
Add ThreadManager sample crate (#20141)
pakrym-oai Apr 29, 2026
05fd904
test protocol: lock inter-agent commentary phase (#20046)
friel-openai Apr 29, 2026
5cf0adb
Include auto-review rollout in feedback uploads (#20064)
won-openai Apr 29, 2026
73cd831
feat: Use remote installed plugin cache for skills and MCP (#20096)
xl-openai Apr 29, 2026
07c8b8c
fix: handle deferred network proxy denials (#19184)
viyatb-oai Apr 29, 2026
9d1e5df
expand the set of core shell env vars for Windows. (#20089)
iceweasel-oai Apr 29, 2026
0690ab0
[codex-analytics] ingest server requests and responses (#17088)
rhan-oai Apr 29, 2026
8ce48f9
[tool_suggest] Improve tool_suggest triggering conditions. (#20091)
mzeng-openai Apr 29, 2026
b15074d
app-server: fix outgoing sender test setup (#20258)
sayan-oai Apr 29, 2026
973c5c8
[app-server] type client response payloads (#20050)
rhan-oai Apr 29, 2026
afbddab
Require remote plugin detail before uninstall (#19966)
xli-oai Apr 29, 2026
72a39e3
[app-server] centralize client response analytics (#20059)
rhan-oai Apr 29, 2026
8d5da3f
Fallback login callback port when default is busy (#19334)
xli-oai Apr 29, 2026
f63b19b
[apps] Add apps MCP path override (#20231)
adaley-openai Apr 29, 2026
b154600
docs: discourage `#[async_trait]` and `#[allow(async_fn_in_trait)]` (…
bolinfest Apr 29, 2026
4241df4
Escape turn metadata headers as ASCII JSON (#19620)
etraut-openai Apr 29, 2026
e20391e
[mcp] Fix plugin MCP approval policy. (#19537)
mzeng-openai Apr 29, 2026
7821915
Add agent graph store interface (#19229)
rasmusrygaard Apr 29, 2026
8de2a7a
Add codex-core public API listing (#20243)
pakrym-oai Apr 29, 2026
13dbcda
stop blocking unified_exec on Windows (#19435)
iceweasel-oai Apr 29, 2026
74f06dc
Enforce workspace metadata protections in Linux sandbox (#19852)
evawong-oai Apr 29, 2026
98f67b1
Update Codex login success page UX (#20136)
rafael-jac Apr 29, 2026
6eab751
chore: increase release build timeout from 60 min to 90 (#20271)
bolinfest Apr 29, 2026
8774229
Add hooks/list app-server RPC (#19778)
abhinav-oai Apr 29, 2026
7bcd462
Consume ai-title from external sessions and add end marker (#20261)
alexsong-oai Apr 30, 2026
c8abcbf
Import external agent sessions in background (#20284)
stefanstokic-oai Apr 30, 2026
fedcefe
Reduce the surface of collaboration modes (#20149)
pakrym-oai Apr 30, 2026
515aa9a
tui: return from side chat on Ctrl-D (#20282)
etraut-openai Apr 30, 2026
8b07132
update codex_plugins_beta_setting (from workspace settings) (#20250)
zamoshchin-openai Apr 30, 2026
bb536d6
[codex-analytics] prevent stale guardian events from satisfying reuse…
rhan-oai Apr 30, 2026
4e677d6
app-server: remove dead api version handling from bespoke events (#20…
pakrym-oai Apr 30, 2026
ebe602d
[plugins] Allow MSFT curated plugins in tool_suggest (#20304)
mzeng-openai Apr 30, 2026
ac4332c
permissions: expose active profile metadata (#20095)
bolinfest Apr 30, 2026
8f3c06c
Add persisted hook enablement state (#19840)
abhinav-oai Apr 30, 2026
ae863e7
ci: increase Windows release workflow timeouts (#20343)
bolinfest Apr 30, 2026
87d0cf1
feat: Add workspace plugin sharing APIs (#20278)
xl-openai Apr 30, 2026
a73403a
Make missing config clears no-ops (#20334)
etraut-openai Apr 30, 2026
c37f743
Gate multi-agent v2 tools independently of collab (#20246)
jif-oai Apr 30, 2026
8a97f3c
realtime: rename provider session ids (#20361)
aibrahim-oai Apr 30, 2026
3516cb9
fix(core): truncate large mcp tool outputs in rollouts (#20260)
owenlin0 Apr 30, 2026
c02814c
Mark goals feature as experimental (#20083)
etraut-openai Apr 30, 2026
a85d265
/plugins: remove marketplace (#19843)
canvrno-oai Apr 30, 2026
487716a
[Extension] Allowlist Chrome Extension in the tool_suggest tool (#20458)
teddywyly-oai Apr 30, 2026
c70cdc1
Remove core protocol dependency [1/2] (#20324)
etraut-openai Apr 30, 2026
5cc5f12
Move item event mapping into app-server-protocol (#20299)
pakrym-oai Apr 30, 2026
f2bc2f2
Remove core protocol dependency [2/2] (#20325)
etraut-openai Apr 30, 2026
b520831
Stop emitting item/fileChange/outputDelta output delta notifications …
pakrym-oai Apr 30, 2026
719431d
[Codex] Add browser use external feature flag (#20245)
khoi-oai Apr 30, 2026
93d53f6
Add /hooks browser for lifecycle hooks (#19882)
abhinav-oai Apr 30, 2026
31f8813
fix: show correct Bedrock runtime endpoint in /status (#20275)
celia-oai Apr 30, 2026
06f3b48
[codex] Fix elevated Windows sandbox named-pipe access (#20270)
iceweasel-oai Apr 30, 2026
7dd08e3
feat(rollouts): store EventMsg::ApplyPatchEnd in limited history mode…
owenlin0 Apr 30, 2026
8121710
install WFP filters for Windows sandbox setup (#20101)
iceweasel-oai Apr 30, 2026
70090c9
[plugin] Add Canva to suggesteable list. (#20474)
mzeng-openai Apr 30, 2026
9121132
Send external import completion for sync imports (#20379)
alexsong-oai Apr 30, 2026
127be06
[codex] Migrate thread turns list to thread store (#19280)
wiltzius-openai Apr 30, 2026
7b3de63
Move plugin out of core. (#20348)
xl-openai Apr 30, 2026
8426edf
Stateful streaming apply_patch parser
akshaynathan Apr 30, 2026
6014b66
fix flaky test falls_back_to_registered_fallback_port_when_default_po…
owenlin0 Apr 30, 2026
9ddb267
fix: ignore dangerous project-level config keys (#20098)
owenlin0 Apr 30, 2026
2686873
Sync remote installed plugin bundles (#20268)
xli-oai Apr 30, 2026
5de7992
fix(tui): set persist_extended_history: false (#20502)
owenlin0 Apr 30, 2026
a5ebede
Bypass review for always-allow MCP tools in auto-review (#20069)
maja-openai Apr 30, 2026
b6f8125
feat(tui): add vim composer mode (#18595)
fcoury-oai May 1, 2026
acdf908
Emit analytics for remote plugin installs (#20267)
xli-oai May 1, 2026
5affb7f
fix(app-server): mark thread/turns/list and exclude_turns as experime…
owenlin0 May 1, 2026
0d9a5d2
Alias codex_hooks feature as hooks (#20522)
abhinav-oai May 1, 2026
4f96001
execpolicy: unwrap PowerShell -Command wrappers on Windows (#20336)
iceweasel-oai May 1, 2026
af089fb
fix(exec_policy) heredoc parsing file_redirect (#20113)
dylan-hurd-oai May 1, 2026
972b819
app-server: switch remote control to protocol v3 segmentation (#20341)
euroelessar May 1, 2026
6b1b227
[codex-analytics] centralize thread analytics state (#20300)
rhan-oai May 1, 2026
c39824c
[codex] Improve PR babysitter CI diagnostics and guardrails (#20484)
wiltzius-openai May 1, 2026
bb60b78
Surface admin-disabled remote plugin status (#20298)
xli-oai May 1, 2026
f50c02d
[codex] Remove unused event messages (#20511)
pakrym-oai May 1, 2026
fe05aca
Make thread store process-scoped (#19474)
wiltzius-openai May 1, 2026
d898cc8
Format multi-day goal durations in the TUI (#20558)
etraut-openai May 1, 2026
a93c89f
Color TUI statusline from active theme (#19631)
etraut-openai May 1, 2026
a62b52f
Refresh remote plugin cache on auth changes (#20265)
xli-oai May 1, 2026
96d2ea9
Add remote plugin skill read API (#20150)
xli-oai May 1, 2026
4879192
feat: Track local paths for shared plugins (#20560)
xl-openai May 1, 2026
87fe1ea
Refresh codex-rs mirror to upstream/main
zemaj May 1, 2026
fcbe0f5
Merge upstream/main: refresh upstream history
zemaj May 1, 2026
8dcde0d
fix(models): align upstream model metadata backports
zemaj May 1, 2026
c081eed
Merge branch 'main' of https://github.com/just-every/code
zemaj May 1, 2026
269ee19
chore(release): 0.6.97 [skip ci]
actions-user May 1, 2026
31240d1
docs(changelog): update for v0.6.97 [skip ci]
actions-user May 1, 2026
d5e00d7
Merge remote-tracking branch 'upstream/main' into local/cbusillo-overlay
cbusillo May 1, 2026
25 changes: 16 additions & 9 deletions .codex/skills/babysit-pr/SKILL.md
@@ -27,10 +27,10 @@ Accept any of the following:
2. Run the watcher script to snapshot PR/review/CI state (or consume each streamed snapshot from `--watch`).
3. Inspect the `actions` list in the JSON response.
4. If `diagnose_ci_failure` is present, inspect failed run logs and classify the failure.
-5. If the failure is likely caused by the current branch, patch code locally, commit, and push.
+5. If the failure is likely caused by the current branch, patch code locally, commit, and push. Do not patch random flaky tests, CI infrastructure, dependency outages, runner issues, or other failures that are unrelated to the branch.
6. If `process_review_comment` is present, inspect surfaced review items and decide whether to address them.
7. If a review item is actionable and correct, patch code locally, commit, push, and then mark the associated review thread/comment as resolved once the fix is on GitHub.
-8. If a review item from another author is non-actionable, already addressed, or not valid, post one reply on the comment/thread explaining that decision (for example answering the question or explaining why no change is needed). Prefix the GitHub reply body with `[codex]` so it is clear the response is automated. If the watcher later surfaces your own reply, treat that self-authored item as already handled and do not reply again.
+8. Do not post replies to human-authored review comments/threads unless the user explicitly confirms the exact response. If a human review item is non-actionable, already addressed, or not valid, surface the item and recommended response to the user instead of replying on GitHub.
9. If the failure is likely flaky/unrelated and `retry_failed_checks` is present, rerun failed jobs with `--retry-failed-now`.
10. If both actionable review feedback and `retry_failed_checks` are present, prioritize review feedback first; a new commit will retrigger CI, so avoid rerunning flaky checks on the old SHA unless you intentionally defer the review change.
11. On every loop, look for newly surfaced review feedback before acting on CI failures or mergeability state, then verify mergeability / merge-conflict status (for example via `gh pr view`) alongside CI.
@@ -69,12 +69,18 @@ python3 .codex/skills/babysit-pr/scripts/gh_pr_watch.py --pr <number-or-url> --o
Use `gh` commands to inspect failed runs before deciding to rerun.

- `gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha`
-- `gh run view <run-id> --log-failed`
+- `gh api repos/<owner>/<repo>/actions/runs/<run-id>/jobs -X GET -f per_page=100`
+- `gh api repos/<owner>/<repo>/actions/jobs/<job-id>/logs > /tmp/codex-gh-job-<job-id>-logs.zip`
+- `gh run view <run-id> --log-failed` as a fallback after the overall workflow run is complete

-Prefer treating failures as branch-related when logs point to changed code (compile/test/lint/typecheck/snapshots/static analysis in touched areas).
+`gh run view --log-failed` is workflow-run scoped and may not expose failed-job logs until the overall run finishes. For faster diagnosis, poll the run's jobs first and, as soon as a specific job has failed, fetch that job's logs directly from the Actions job logs endpoint. The watcher includes a `failed_jobs` list with each failed job's `job_id` and `logs_endpoint` when GitHub exposes one.

+Prefer treating failures as branch-related when failed-job logs point to changed code (compile/test/lint/typecheck/snapshots/static analysis in touched areas).

Prefer treating failures as flaky/unrelated when logs show transient infra/external issues (timeouts, runner provisioning failures, registry/network outages, GitHub Actions infra errors).

+Do not attempt to fix flaky/unrelated failures by changing tests, build scripts, CI configuration, dependency pins, or infrastructure-adjacent code unless the logs clearly connect the failure to the PR branch. For flaky/unrelated failures, rerun only when the watcher recommends `retry_failed_checks`; otherwise wait or stop for user help.

If classification is ambiguous, perform one manual diagnosis attempt before choosing rerun.

Read `.codex/skills/babysit-pr/references/heuristics.md` for a concise checklist.
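
The job-level diagnosis flow added in this hunk can be illustrated with a minimal Python sketch. It assumes the payload shape of the GitHub REST "list jobs for a workflow run" response (a top-level `jobs` array whose entries carry `id`, `name`, `status`, and `conclusion`). The helper names `failed_jobs` and `logs_command` are hypothetical, the output keys merely mirror the `job_id` and `logs_endpoint` fields the text mentions, and counting only `conclusion == "failure"` as failed is a simplifying assumption. This is not the watcher's actual implementation, which is not shown in the diff.

```python
from typing import Any


def failed_jobs(payload: dict[str, Any], owner: str, repo: str) -> list[dict[str, Any]]:
    """Pick failed jobs out of a `.../actions/runs/<run-id>/jobs` payload.

    Hypothetical helper: the returned keys mirror the `failed_jobs` fields
    named in the skill text, but the real watcher may differ.
    """
    results = []
    for job in payload.get("jobs", []):
        # A single job can be completed and failed while the overall
        # workflow run is still in progress, so inspect jobs individually.
        if job.get("status") == "completed" and job.get("conclusion") == "failure":
            job_id = job["id"]
            results.append(
                {
                    "job_id": job_id,
                    "name": job.get("name", ""),
                    "logs_endpoint": f"repos/{owner}/{repo}/actions/jobs/{job_id}/logs",
                }
            )
    return results


def logs_command(job: dict[str, Any]) -> list[str]:
    """Build the `gh api` argv that fetches one failed job's logs."""
    return ["gh", "api", job["logs_endpoint"]]
```

Feeding this the parsed output of the jobs endpoint lets the babysitter fetch logs for a failed job right away, rather than waiting for `gh run view <run-id> --log-failed` to become useful once the whole run completes.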
@@ -99,7 +105,8 @@ When you agree with a comment and it is actionable:
5. Resume watching on the new SHA immediately (do not stop after reporting the push).
6. If monitoring was running in `--watch` mode, restart `--watch` immediately after the push in the same turn; do not wait for the user to ask again.

-If you disagree or the comment is non-actionable/already addressed, reply once directly on the GitHub comment/thread so the reviewer gets an explicit answer, then continue the watcher loop. Prefix any GitHub reply to a code review comment/thread with `[codex]` so it is clear the response is automated and not from the human user. If the watcher later surfaces your own reply because the authenticated operator is treated as a trusted review author, treat that self-authored item as already handled and do not reply again.
+Do not post replies to human-authored GitHub review comments/threads automatically. If you disagree with a human comment, believe it is non-actionable/already addressed, or need to answer a question, report the item to the user with a suggested response and wait for explicit confirmation before posting anything on GitHub. If the user approves a response, prefix it with `[codex]` so it is clear the response is automated and not from the human user.
+If the watcher later surfaces your own approved reply because the authenticated operator is treated as a trusted review author, treat that self-authored item as already handled and do not reply again.
If a code review comment/thread is already marked as resolved in GitHub, treat it as non-actionable and safely ignore it unless new unresolved follow-up feedback appears.

## Git Safety Rules
@@ -125,11 +132,11 @@ Use this loop in a live Codex session:
2. Read `actions`.
3. First check whether the PR is now merged or otherwise closed; if so, report that terminal state and stop polling immediately.
4. Check CI summary, new review items, and mergeability/conflict status.
-5. Diagnose CI failures and classify branch-related vs flaky/unrelated.
-6. For each surfaced review item from another author, either reply once with an explanation if it is non-actionable or patch/commit/push and then resolve it if it is actionable. If a later snapshot surfaces your own reply, treat it as informational and continue without responding again.
+5. Diagnose CI failures and classify branch-related vs flaky/unrelated. If the overall run is still pending but `failed_jobs` already includes a failed job, fetch that job's logs and diagnose immediately instead of waiting for the whole workflow run to finish. Patch only when the failure is branch-related.
+6. For each surfaced review item from another author, patch/commit/push and then resolve it if it is actionable. If it is non-actionable, already addressed, or requires a written answer, surface it to the user with a suggested response instead of posting automatically. If a later snapshot surfaces your own approved reply, treat it as informational and continue without responding again.
7. Process actionable review comments before flaky reruns when both are present; if a review fix requires a commit, push it and skip rerunning failed checks on the old SHA.
-8. Retry failed checks only when `retry_failed_checks` is present and you are not about to replace the current SHA with a review/CI fix commit.
-9. If you pushed a commit, resolved a review thread, replied to a review comment, or triggered a rerun, report the action briefly and continue polling (do not stop).
+8. Retry failed checks only when `retry_failed_checks` is present and you are not about to replace the current SHA with a review/CI fix commit. Do not make code changes for unrelated flakes or infrastructure failures just to get CI green.
+9. If you pushed a commit, resolved a review thread, or triggered a rerun, report the action briefly and continue polling (do not stop). If a human review comment needs a written GitHub response, stop and ask for confirmation before posting.
10. After a review-fix push, proactively restart continuous monitoring (`--watch`) in the same turn unless a strict stop condition has already been reached.
11. If everything is passing, mergeable, not blocked on required review approval, and there are no unaddressed review items, report that the PR is currently ready to merge but keep the watcher running so new review comments are surfaced quickly while the PR remains open.
12. If blocked on a user-help-required issue (infra outage, exhausted flaky retries, unclear reviewer request, permissions), report the blocker and stop.
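
The ordering rules in the revised loop can be summarized as a small decision function. This is only a sketch under stated assumptions: the action names (`process_review_comment`, `diagnose_ci_failure`, `retry_failed_checks`) come from the skill text, while the function `next_step`, its boolean inputs, and the returned step labels are hypothetical stand-ins for judgments the agent makes while reading logs and review items.

```python
def next_step(
    actions: list[str],
    *,
    pr_closed: bool = False,
    has_actionable_review: bool = False,
    needs_written_reply: bool = False,
    failure_is_branch_related: bool = False,
) -> str:
    """Choose one step per polling iteration (labels are hypothetical)."""
    # Terminal state is checked first: a merged or closed PR ends polling.
    if pr_closed:
        return "report_and_stop"
    # Review feedback takes priority over CI work.
    if "process_review_comment" in actions:
        if has_actionable_review:
            return "patch_and_push"
        if needs_written_reply:
            # Never reply to a human reviewer without explicit confirmation.
            return "ask_user_before_replying"
    # Patch CI failures only when they are caused by the branch.
    if "diagnose_ci_failure" in actions and failure_is_branch_related:
        return "patch_and_push"
    # Rerun only when the watcher recommends it and no new commit is coming.
    if "retry_failed_checks" in actions:
        return "rerun_failed_checks"
    return "keep_polling"
```

Because both push branches return before the rerun check, a pending fix commit always takes precedence over rerunning checks on the old SHA, and an unrelated flake without `retry_failed_checks` never triggers a code change.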
2 changes: 1 addition & 1 deletion .codex/skills/babysit-pr/agents/openai.yaml
@@ -1,4 +1,4 @@
interface:
display_name: "PR Babysitter"
short_description: "Watch PR review comments, CI, and merge conflicts"
-default_prompt: "Babysit the current PR: monitor reviewer comments, CI, and merge-conflict status (prefer the watcher’s --watch mode for live monitoring); surface new review feedback before acting on CI or mergeability work, fix valid issues, push updates, and rerun flaky failures up to 3 times. Keep exactly one watcher session active for the PR (do not leave duplicate --watch terminals running). If you pause monitoring to patch review/CI feedback, restart --watch yourself immediately after the push in the same turn. If a watcher is still running and no strict stop condition has been reached, the task is still in progress: keep consuming watcher output and sending progress updates instead of ending the turn. Do not treat a green + mergeable PR as a terminal stop while it is still open; continue polling autonomously after any push/rerun so newly posted review comments are surfaced until a strict terminal stop condition is reached or the user interrupts."
+default_prompt: "Babysit the current PR: monitor reviewer comments, CI, and merge-conflict status (prefer the watcher’s --watch mode for live monitoring); surface new review feedback before acting on CI or mergeability work, fix valid issues, push updates, and rerun flaky failures up to 3 times. Do not post replies to human-authored review comments unless the user explicitly confirms the exact response. Do not patch unrelated flaky tests, CI infrastructure, dependency outages, runner issues, or other failures that are not caused by the branch. Keep exactly one watcher session active for the PR (do not leave duplicate --watch terminals running). If you pause monitoring to patch review/CI feedback, restart --watch yourself immediately after the push in the same turn. If a watcher is still running and no strict stop condition has been reached, the task is still in progress: keep consuming watcher output and sending progress updates instead of ending the turn. Do not treat a green + mergeable PR as a terminal stop while it is still open; continue polling autonomously after any push/rerun so newly posted review comments are surfaced until a strict terminal stop condition is reached or the user interrupts."
12 changes: 11 additions & 1 deletion .codex/skills/babysit-pr/references/github-api-notes.md
@@ -23,9 +23,11 @@ Used to discover failed workflow runs and rerunnable run IDs.
### Failed log inspection

- `gh run view <run-id> --json jobs,name,workflowName,conclusion,status,url,headSha`
- `gh api repos/{owner}/{repo}/actions/runs/{run_id}/jobs -X GET -f per_page=100`
- `gh api repos/{owner}/{repo}/actions/jobs/{job_id}/logs > /tmp/codex-gh-job-{job_id}-logs.zip`
- `gh run view <run-id> --log-failed`

-Used by Codex to classify branch-related vs flaky/unrelated failures.
+Used by Codex to classify branch-related vs flaky/unrelated failures. Prefer the direct job log endpoint as soon as a job has failed because `gh run view --log-failed` may not produce failed-job logs until the overall workflow run completes.

### Retry failed jobs only

@@ -70,3 +72,11 @@ Reruns only failed jobs (and dependencies) for a workflow run.
- `conclusion`
- `html_url`
- `head_sha`

### Actions run jobs API (`jobs[]`)

- `id`
- `name`
- `status`
- `conclusion`
- `html_url`
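
A minimal sketch of how the `jobs[]` fields listed above can be turned into direct log endpoints while a run is still in progress. The conclusion set and the sample payload are assumptions made for illustration; the diff does not show the actual contents of `FAILED_RUN_CONCLUSIONS`.

```python
# Assumption: stands in for FAILED_RUN_CONCLUSIONS, whose real value is not shown in this diff.
ASSUMED_FAILED_CONCLUSIONS = {"failure", "timed_out", "cancelled", "action_required"}


def failed_job_log_endpoints(repo, jobs_payload):
    """Map each failed job id to its direct job-log endpoint."""
    endpoints = {}
    for job in jobs_payload.get("jobs") or []:
        job_id = job.get("id")
        if job_id is None:
            continue
        if job.get("conclusion") in ASSUMED_FAILED_CONCLUSIONS:
            endpoints[job_id] = f"repos/{repo}/actions/jobs/{job_id}/logs"
    return endpoints


# Hypothetical payload: one job already failed, another is still running.
sample_payload = {
    "jobs": [
        {"id": 101, "name": "lint", "status": "completed", "conclusion": "success"},
        {"id": 102, "name": "unit", "status": "completed", "conclusion": "failure"},
        {"id": 103, "name": "e2e", "status": "in_progress", "conclusion": None},
    ]
}

print(failed_job_log_endpoints("example-org/example-repo", sample_payload))
```

Each resulting endpoint can be passed to `gh api` as in the log inspection commands above, without waiting for the whole workflow run to finish.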
10 changes: 9 additions & 1 deletion .codex/skills/babysit-pr/references/heuristics.md
@@ -18,16 +18,20 @@ Treat as **likely flaky or unrelated** when evidence points to transient or exte
- Cloud/service rate limits or transient API outages
- Non-deterministic failures in unrelated integration tests with known flake patterns

Do not patch likely flaky/unrelated failures. Use the retry budget for rerunnable failures, wait for pending jobs, or stop and report the blocker when the failure is persistent or infrastructure-owned.

If uncertain, inspect failed logs once before choosing rerun.

## Decision tree (fix vs rerun vs stop)

1. If PR is merged/closed: stop.
2. If there are failed checks:
    - Diagnose first.
+   - If checks are still pending but an individual job has already failed: fetch that job's logs and diagnose now.
    - If branch-related: fix locally, commit, push.
    - If likely flaky/unrelated and all checks for the current SHA are terminal: rerun failed jobs.
-   - If checks are still pending: wait.
+   - If likely flaky/unrelated and not safely rerunnable: stop and report the blocker; do not edit unrelated tests, build scripts, CI configuration, dependency pins, or infrastructure code.
+   - If checks are still pending and no failed job is available yet: wait.
3. If flaky reruns for the same SHA reach the configured limit (default 3): stop and report persistent failure.
4. Independently, process any new human review comments.

@@ -40,12 +44,15 @@ Address the comment when:
- The requested change does not conflict with the user’s intent or recent guidance.
- The change can be made safely without unrelated refactors.

Fix valid human review feedback in code when possible, but do not post a GitHub reply to a human-authored comment/thread unless the user explicitly confirms the exact response.

Do not auto-fix when:

- The comment is ambiguous and needs clarification.
- The request conflicts with explicit user instructions.
- The proposed change requires product/design decisions the user has not made.
- The codebase is in a dirty/unrelated state that makes safe editing uncertain.
- The comment only needs a written answer or disagreement response; propose the reply to the user instead of posting it automatically.

## Stop-and-ask conditions

@@ -56,3 +63,4 @@ Stop and ask the user instead of continuing automatically when:
- The PR branch cannot be pushed.
- CI failures persist after the flaky retry budget.
- Reviewer feedback requires a product decision or cross-team coordination.
- A human review comment requires a written GitHub reply instead of a code change.
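
The fix vs rerun vs stop decision tree above can be sketched as a pure function. This is an illustration only, not the watcher's implementation: the function name, its parameters, and every action string other than `stop_pr_closed` and `stop_exhausted_retries` are hypothetical.

```python
def next_ci_action(pr_open, has_failure, branch_related, rerunnable,
                   all_terminal, retries_used, max_retries=3):
    """Return one action name for the current CI state (hypothetical names)."""
    if not pr_open:
        return "stop_pr_closed"
    if not has_failure:
        # Nothing failed yet: keep watching, whether checks are pending or green.
        return "keep_watching"
    if branch_related:
        return "fix_and_push"
    # From here on the failure is classified as likely flaky or unrelated.
    if not rerunnable:
        return "stop_and_report"
    if not all_terminal:
        # Reruns wait until every check for the current SHA is terminal.
        return "wait"
    if retries_used >= max_retries:
        return "stop_exhausted_retries"
    return "rerun_failed_jobs"


print(next_ci_action(pr_open=True, has_failure=True, branch_related=False,
                     rerunnable=True, all_terminal=True, retries_used=1))
```

Review comments are handled independently of this function, matching step 4 of the decision tree.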
67 changes: 65 additions & 2 deletions .codex/skills/babysit-pr/scripts/gh_pr_watch.py
@@ -338,6 +338,66 @@ def failed_runs_from_workflow_runs(runs, head_sha):
    return failed_runs


def get_jobs_for_run(repo, run_id):
    endpoint = f"repos/{repo}/actions/runs/{run_id}/jobs"
    data = gh_json(["api", endpoint, "-X", "GET", "-f", "per_page=100"], repo=repo)
    if not isinstance(data, dict):
        raise GhCommandError("Unexpected payload from actions run jobs API")
    jobs = data.get("jobs") or []
    if not isinstance(jobs, list):
        raise GhCommandError("Expected `jobs` to be a list")
    return jobs


def failed_jobs_from_workflow_runs(repo, runs, head_sha):
    failed_jobs = []
    for run in runs:
        if not isinstance(run, dict):
            continue
        if str(run.get("head_sha") or "") != head_sha:
            continue
        run_id = run.get("id")
        if run_id in (None, ""):
            continue
        run_status = str(run.get("status") or "")
        run_conclusion = str(run.get("conclusion") or "")
        if run_status.lower() == "completed" and run_conclusion not in FAILED_RUN_CONCLUSIONS:
            continue
        jobs = get_jobs_for_run(repo, run_id)
        for job in jobs:
            if not isinstance(job, dict):
                continue
            conclusion = str(job.get("conclusion") or "")
            if conclusion not in FAILED_RUN_CONCLUSIONS:
                continue
            job_id = job.get("id")
            logs_endpoint = None
            if job_id not in (None, ""):
                logs_endpoint = f"repos/{repo}/actions/jobs/{job_id}/logs"
            failed_jobs.append(
                {
                    "run_id": run_id,
                    "workflow_name": run.get("name") or run.get("display_title") or "",
                    "run_status": run_status,
                    "run_conclusion": run_conclusion,
                    "job_id": job_id,
                    "job_name": str(job.get("name") or ""),
                    "status": str(job.get("status") or ""),
                    "conclusion": conclusion,
                    "html_url": str(job.get("html_url") or ""),
                    "logs_endpoint": logs_endpoint,
                }
            )
    failed_jobs.sort(
        key=lambda item: (
            str(item.get("workflow_name") or ""),
            str(item.get("job_name") or ""),
            str(item.get("job_id") or ""),
        )
    )
    return failed_jobs


def get_authenticated_login():
    data = gh_json(["api", "user"])
    if not isinstance(data, dict) or not data.get("login"):
@@ -568,7 +628,7 @@ def is_pr_ready_to_merge(pr, checks_summary, new_review_items):
    return True


-def recommend_actions(pr, checks_summary, failed_runs, new_review_items, retries_used, max_retries):
+def recommend_actions(pr, checks_summary, failed_runs, failed_jobs, new_review_items, retries_used, max_retries):
     actions = []
     if pr["closed"] or pr["merged"]:
         if new_review_items:
@@ -583,7 +643,7 @@ def recommend_actions(pr, checks_summary, failed_runs, new_review_items, retries
     if new_review_items:
         actions.append("process_review_comment")

-    has_failed_pr_checks = checks_summary["failed_count"] > 0
+    has_failed_pr_checks = checks_summary["failed_count"] > 0 or bool(failed_jobs)
    if has_failed_pr_checks:
        if checks_summary["all_terminal"] and retries_used >= max_retries:
            actions.append("stop_exhausted_retries")
@@ -621,12 +681,14 @@ def collect_snapshot(args):
    checks_summary = summarize_checks(checks)
    workflow_runs = get_workflow_runs_for_sha(pr["repo"], pr["head_sha"])
    failed_runs = failed_runs_from_workflow_runs(workflow_runs, pr["head_sha"])
    failed_jobs = failed_jobs_from_workflow_runs(pr["repo"], workflow_runs, pr["head_sha"])

    retries_used = current_retry_count(state, pr["head_sha"])
    actions = recommend_actions(
        pr,
        checks_summary,
        failed_runs,
        failed_jobs,
        new_review_items,
        retries_used,
        args.max_flaky_retries,
@@ -641,6 +703,7 @@ def collect_snapshot(args):
        "pr": pr,
        "checks": checks_summary,
        "failed_runs": failed_runs,
        "failed_jobs": failed_jobs,
        "new_review_items": new_review_items,
        "actions": actions,
        "retry_state": {
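
To illustrate the run-level filter in `failed_jobs_from_workflow_runs`, here is a small standalone sketch of which runs still warrant a jobs API call. The helper name `should_fetch_jobs` and the conclusion set are assumptions for illustration; the script's real `FAILED_RUN_CONCLUSIONS` is not shown in this diff.

```python
# Assumption: stands in for FAILED_RUN_CONCLUSIONS, whose real value is not shown in this diff.
ASSUMED_FAILED_CONCLUSIONS = {"failure", "timed_out", "cancelled", "action_required"}


def should_fetch_jobs(run, head_sha):
    """Return True when a workflow run may contain failed jobs for head_sha."""
    if str(run.get("head_sha") or "") != head_sha:
        return False  # run belongs to a different commit
    if run.get("id") in (None, ""):
        return False  # no usable run id to query
    status = str(run.get("status") or "").lower()
    conclusion = str(run.get("conclusion") or "")
    # A completed run with a non-failing conclusion cannot contain failures of interest.
    if status == "completed" and conclusion not in ASSUMED_FAILED_CONCLUSIONS:
        return False
    return True


# Hypothetical runs for one commit.
runs = [
    {"id": 1, "head_sha": "abc", "status": "in_progress", "conclusion": None},
    {"id": 2, "head_sha": "abc", "status": "completed", "conclusion": "success"},
    {"id": 3, "head_sha": "abc", "status": "completed", "conclusion": "failure"},
    {"id": 4, "head_sha": "def", "status": "in_progress", "conclusion": None},
]
print([run["id"] for run in runs if should_fetch_jobs(run, "abc")])
```

In-progress runs pass the filter, which is what lets the watcher surface a failed job before the overall run completes.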