Skip to content

Addressing PR comments#8

Merged
poseidonchan merged 2 commits intomasterfrom
codex/review-and-minimize-tests
Feb 14, 2026
Merged

Addressing PR comments#8
poseidonchan merged 2 commits intomasterfrom
codex/review-and-minimize-tests

Conversation

@Codex
Copy link
Contributor

@Codex Codex AI commented Feb 14, 2026

Original prompt

This section details on the original issue you should resolve

<issue_title>Are these tests really helpful?</issue_title>
<issue_description>The codebase includes many tests, however the real world problem still exists after passing all tests. Do we still need to keep all tests?</issue_description>

<agent_instructions>Please review the tests and try to minimize the real gap between tests and real world user's experience. </agent_instructions>

Comments on the Issue (you are @codex[agent] in this section)

@Codex Codex AI changed the title [WIP] Review tests for real world relevance Handle exec_command dict arguments in event parsing Feb 14, 2026
@Codex Codex AI requested a review from poseidonchan February 14, 2026 03:52
@poseidonchan
Copy link
Owner

@claude[agent] review this PR, does it fully solve the issue?

@poseidonchan
Copy link
Owner

@claude[agent] review this PR, does it fully solve the issue?

@claude[agent] review this PR, does it fully solve the issue?

@Claude
Copy link
Contributor

Claude AI commented Feb 14, 2026

@claude[agent] review this PR, does it fully solve the issue?

@claude[agent] review this PR, does it fully solve the issue?

Yes, the PR solves the core parsing issue.

What was broken: The original code at events.py:310 only accepted arguments as a string (isinstance(args_s, str)), so when Codex sent dict-format arguments like {"cmd": ["ls", "-la"]}, those events were silently dropped—leaving the bot blind to tool executions.

What the fix does (51651fb):

  • Accepts both string and dict arguments formats
  • Properly converts list cmd values to shell strings using shlex.join()
  • Handles escalation metadata in both formats for approval requests

Test coverage: Added 2 regression tests covering dict arguments with/without escalation. Existing integration tests in test_proxy_approval_flow.py validate the full event loop (ThreadStarted → ExecApprovalRequest → approval UI), though they use structured event objects rather than raw JSONL payloads.

The broader issue of test-vs-reality gap exists because unit tests focused on structured event objects while real events arrive as JSONL. This PR closes that specific gap for exec_command parsing.

@Claude Claude AI changed the title Handle exec_command dict arguments in event parsing Addressing PR comments Feb 14, 2026
@poseidonchan poseidonchan marked this pull request as ready for review February 14, 2026 04:04
@poseidonchan poseidonchan merged commit ddad4f5 into master Feb 14, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Are these tests really helpful?

3 participants