test(uipath-agents): fill remaining coded-agent test gaps#633
|
Claude finished @radugheo's task in 4m 12s
Review: PR #633 - test(uipath-agents): fill remaining coded-agent test gaps
Summary
Adds 12 well-structured coded-agent test tasks (YAML + Python checkers) covering evaluator typeIds, edit lifecycle, anti-patterns, SDK services, and solution upload. Also extends
Change-by-Change Review
1.
|
| Area | Status | Notes |
|---|---|---|
| Frontmatter | N/A | No skill SKILL.md changes — test-only PR |
| E2E Tests | OK | All 12 tasks well-formed: correct task_id pattern (skill-agent-coded-*), first tag is uipath-agents, required tags present (skill + tier + lifecycle:), feature tags use established vocabulary |
| Skill Body | N/A | No skill content changes |
| References & Assets | N/A | No reference/asset changes |
| Repo Hygiene | OK | No secrets, no cross-skill refs, changes scoped to tests/tasks/uipath-agents/ + one command file |
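The task conventions listed in the E2E Tests row can be checked mechanically. The following is a minimal sketch, not the repository's actual lint: the dictionary shape (a `task_id` string and a `tags` list) and the function name `check_task_conventions` are assumptions for illustration, and only the rules stated in the table are covered.

```python
from fnmatch import fnmatch


def check_task_conventions(task: dict) -> list[str]:
    """Return a list of convention violations for one task definition.

    The dict layout (keys "task_id" and "tags") is a hypothetical stand-in
    for whatever the task YAML parses to.
    """
    problems = []
    tags = task.get("tags", [])
    # task_id must follow the skill-agent-coded-* pattern
    if not fnmatch(task.get("task_id", ""), "skill-agent-coded-*"):
        problems.append("task_id does not match skill-agent-coded-*")
    # the first tag must be the skill name
    if not tags or tags[0] != "uipath-agents":
        problems.append("first tag is not uipath-agents")
    # a lifecycle: tag must be present somewhere in the list
    if not any(t.startswith("lifecycle:") for t in tags):
        problems.append("missing lifecycle: tag")
    return problems


example = {
    "task_id": "skill-agent-coded-edit-add-node",
    "tags": ["uipath-agents", "lifecycle:edit"],
}
print(check_task_conventions(example))  # prints []
```

Returning a list of problems rather than raising on the first one lets a reviewer see every violation in a single pass.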
Issues for Manual Review
- CLI command correctness: The eval tasks instruct the agent to run `uip codedagent eval ... --no-report`. Verify this flag exists and behaves as described (saves results locally without uploading). Automated review cannot validate actual CLI surface.
- `uip codedagent run` vs `uip codedagent dev`: The `antipattern_dev_non_interactive` test claims `dev` auto-appends `--interactive`. Verify this is still the current behavior of the wrapper.
- `pack`/`publish` wrapper-disabled status: The `antipattern_pack_publish` test claims these commands fail with "command is disabled". Verify this matches the current `uip` wrapper behavior; the pre-existing `deploy_my_workspace` test seems to still accept `pack` + `publish` as valid (see What's Missing).
- `UiPath()` import path: `sdk_services_suite` checker asserts `from uipath.platform import UiPath`. Verify this is still the canonical import (not `from uipath import UiPath`).
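The import-path item above depends on how a checker decides which import form a generated file uses. The sketch below shows one way an AST-based assertion of that kind might look, using only the standard library; the helper name `imports_name_from` is hypothetical and this is not the PR's checker code.

```python
import ast


def imports_name_from(source: str, module: str, name: str) -> bool:
    """Return True if `source` contains `from <module> import <name>`.

    Only absolute imports (level 0) are considered; the source is parsed,
    never executed, so the named package does not need to be installed.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module == module:
            if any(alias.name == name for alias in node.names):
                return True
    return False


sample = "from uipath.platform import UiPath\n\nclient = UiPath()\n"
print(imports_name_from(sample, "uipath.platform", "UiPath"))  # prints True
print(imports_name_from(sample, "uipath", "UiPath"))  # prints False
```

If both import forms turn out to be valid, a checker written this way could accept either by calling the helper once per module path.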
Conclusion
Approve. This is a clean, well-structured PR that significantly improves test coverage. All 12 tasks follow established conventions, tag taxonomy is consistent, checkers are thorough with good AST-based validation, and shared helpers are reused without modification. The only actionable finding is a minor inconsistency (version check in one eval checker but not the others — Low severity). The pre-existing deploy_my_workspace description conflict is worth a follow-up but is not blocking.
|
Claude finished @radugheo's task in 6m 25s
Coder-eval task lint (advisory)
12 task YAMLs added; verdicts (after theme downgrade): 0 Critical, 0 High, 1 Medium, 2 Low, 9 OK. 1 theme at High severity. Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.
Evidence of passing run
❌ High: PR body claims
Per-task lint
|
2e05e6a to f02375f
|
Claude finished @radugheo's task in 4m 34s
Coder-eval task lint (advisory)
17 task YAMLs changed (12 added, 5 modified); verdicts: 0 Critical, 0 High, 0 Medium, 1 Low, 16 OK. Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.
Evidence of passing run
❌ High: PR body claims
Per-task lint
|
f02375f to eb1a4fd
|
Claude finished @radugheo's task in 4m 18s
Coder-eval task lint (advisory)
15 task YAMLs changed (11 added, 4 modified); verdicts: 0 Critical, 0 High, 0 Medium, 1 Low, 14 OK. Rubric: .claude/commands/lint-task.md. This check is advisory and never blocks merge.
Evidence of passing run
❌ High: PR body claims
Per-task lint
|
Summary
Adds 11 coded-agent tests covering remaining gaps in the uipath-agents test suite, plus small fixes to four existing tests.
Tests added
- `skill-agent-coded-antipattern-dev-non-interactive`: `uip codedagent dev` auto-appends `--interactive`; agent must switch to `uip codedagent run`.
- `skill-agent-coded-antipattern-output-aggregation`: `Annotated[list, operator.add]` reducers vanish from `--output-file`; agent must carry aggregate fields explicitly per node.
- `skill-agent-coded-antipattern-pack-publish`: `uip codedagent pack` and `publish` are wrapper-disabled; agent must consolidate to `uip codedagent deploy`.
- `skill-agent-coded-chat-react-agent`: `create_react_agent` + tools.
- `skill-agent-coded-edit-add-node`: `lifecycle:edit` test; modifies an existing LangGraph by adding a second node.
- `skill-agent-coded-eval-classification`: `uipath-binary-classification` + `uipath-multiclass-classification` evaluators.
- `skill-agent-coded-eval-output-suite`: `uipath-contains` + `uipath-json-similarity` + `uipath-llm-judge-output-strict-json-similarity` + `uipath-llm-judge-trajectory-simulation`.
- `skill-agent-coded-eval-tool-call-suite`: `uipath-tool-call-{order,args,count,output}`).
- `skill-agent-coded-langgraph-supervisor`
- `skill-agent-coded-login-status-checkpoint`: `uip login status --output json`; no redundant logins.
- `skill-agent-coded-sdk-services-suite`: `UiPath()` SDK client surface across multiple services (assets + llm + attachments).
Existing tests touched
- `bindings_sync`: `python -c` → `python3 -c` (sandbox only has Python 3).
- `eval_exact_match`, `eval_llm_judges`: natural-language prompt cleanup; eval `turn_timeout` 1200 → 1800s.
- `openai_agents_handoff`: natural-language prompt cleanup; check no longer pins the typed-context class name (was hardcoded `CustomerInput`).
Test plan
- `check_*.py` dry-runs green against synthetic well-formed projects.
- tests/README.md.
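For readers unfamiliar with the `Annotated[list, operator.add]` pattern named in the output-aggregation test, the sketch below illustrates the reducer idea with the standard library only. It does not import LangGraph and does not reproduce the `--output-file` behavior; `State`, `merge_update`, and the field names are hypothetical. It shows a reducer-annotated field being combined across updates alongside an explicit `summary` field that a node writes itself, which is the kind of field the test expects the agent to carry.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class State(TypedDict):
    # Reducer-annotated field: successive updates are combined with operator.add
    findings: Annotated[list, operator.add]
    # Plain field: each update overwrites the previous value
    summary: str


def merge_update(state: dict, update: dict) -> dict:
    """Apply one node's update, using the annotated reducer where one exists."""
    hints = get_type_hints(State, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints.get(key), "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)  # reducer(old, new)
        else:
            merged[key] = value  # no reducer: last write wins
    return merged


state = {"findings": ["a"], "summary": ""}
# The node appends a finding and also writes the aggregate explicitly.
state = merge_update(state, {"findings": ["b"], "summary": "2 findings: a, b"})
print(state)  # prints {'findings': ['a', 'b'], 'summary': '2 findings: a, b'}
```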