Prefer explicit exec approvals over ToolStarted proxy-cancel in untrusted (always-ask) mode#3
Conversation
|
@copilot did this PR fix the issues#1 |
|
@poseidonchan I've opened a new pull request, #4, to work on those changes. Once the pull request is ready, I'll request review from you. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 069a14c5be
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if pending_untrusted_tool_start is not None and not isinstance(ev, ExecApprovalRequest): | ||
| await _prompt_proxy_exec_approval(command=pending_untrusted_tool_start.command) | ||
| pending_untrusted_tool_start = None | ||
| break |
There was a problem hiding this comment.
Prompt proxy approval before consuming next event
Deferring ToolStarted handling until a later non-ExecApprovalRequest event means the command can already finish before cancellation is attempted. With an event order like ToolStarted -> TurnCompleted (or any fast completion signal), this branch calls _prompt_proxy_exec_approval() only after the completion event is read, so rejecting no longer prevents execution and accepting can re-run the same command via the proxy_exec callback path. In approval_policy="untrusted", this weakens the intended always-ask gate for short/silent commands.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Summary
- Updated untrusted-mode tool handling to prompt proxy approval immediately on
ToolStarted(and then break), instead of deferring to a later event; this closes the race where short commands could finish before cancellation was attempted. src/tgcodex/bot/commands.pyL1265-L1280 - Kept proxy approval prompt/cancel behavior centralized in
_prompt_proxy_exec_approval, which still sets pending action state and cancels the run after sending the prompt. src/tgcodex/bot/commands.pyL1003-L1049 - Replaced the prior explicit-approval-preference regression test with a new regression test that simulates
ToolStarted -> TurnCompleted -> ExecApprovalRequestand asserts we still stop immediately for proxy approval in untrusted mode. tests/test_proxy_approval_flow.pyL521-L597 - Added
TurnCompletedto test imports for the new scenario. tests/test_proxy_approval_flow.pyL10
Testing
⚠️ curl -I -s 'https://duckduckgo.com/?q=python%20async%20event%20handling%20race%20condition%20tool%20started%20turn%20completed%20approval%20flow' | head(returned HTTP 403 in this environment, but web search was attempted per instruction)- ✅
python -m pytest tests/test_proxy_approval_flow.py tests/test_events.py
Committed on the current branch (e0b1cef) and created a follow-up PR via make_pr with title: “Follow-up: gate untrusted ToolStarted events immediately before completion races”.
|
This PR basically proposed a never working way to solve the problem |
Motivation
ToolStartedevents could trigger a proxy-cancel approval flow before an explicitExecApprovalRequestarrived, causing commands to execute (or be cancelled) incorrectly inapproval_policy="untrusted"(Always ask) mode.Description
ToolStartedevents inon_text_messageby introducingpending_untrusted_tool_startso an imminentExecApprovalRequestcan take precedence._prompt_proxy_exec_approval()to centralize the proxy-approval prompt, setActiveRunstate, send the Telegram prompt, and cancel the run as needed.ExecApprovalRequestis received so the explicit approval path wins; trigger the proxy prompt only if no explicit approval arrives (including at stream end).test_untrusted_prefers_explicit_exec_approval_after_tool_startedintests/test_proxy_approval_flow.pyand adjust test harness imports accordingly.Testing
python -m pytest tests/test_proxy_approval_flow.py tests/test_events.pywhich completed successfully with16 passed(regression test included).ExecApprovalRequestevents win over deferredToolStartedproxy logic.Codex Task