
Commit 67af92e

feat: add Claude Code skills for agents

1 parent 9433e49 commit 67af92e

3 files changed

Lines changed: 408 additions & 14 deletions


src/uipath/_cli/cli_init.py

Lines changed: 18 additions & 14 deletions
```diff
@@ -109,34 +109,38 @@ def generate_agent_md_file(
 
 
 def generate_agent_md_files(target_directory: str, no_agents_md_override: bool) -> None:
-    """Generate an agent-specific file from the packaged resource.
+    """Generate AGENTS.md related files and Claude Code skills.
 
     Args:
         target_directory: The directory where the files should be created.
         no_agents_md_override: Whether to override existing files.
     """
     agent_dir = os.path.join(target_directory, ".agent")
     os.makedirs(agent_dir, exist_ok=True)
+    claude_commands_dir = os.path.join(target_directory, ".claude", "commands")
+    os.makedirs(claude_commands_dir, exist_ok=True)
 
-    root_files = ["AGENTS.md", "CLAUDE.md"]
-
-    agent_files = ["CLI_REFERENCE.md", "REQUIRED_STRUCTURE.md", "SDK_REFERENCE.md"]
+    files_to_create = {
+        target_directory: ["AGENTS.md", "CLAUDE.md"],
+        agent_dir: ["CLI_REFERENCE.md", "REQUIRED_STRUCTURE.md", "SDK_REFERENCE.md"],
+        claude_commands_dir: ["new-agent.md", "eval.md"],
+    }
 
     any_overridden = False
-
-    for file_name in root_files:
-        if generate_agent_md_file(target_directory, file_name, no_agents_md_override):
-            any_overridden = True
-
-    for file_name in agent_files:
-        if generate_agent_md_file(agent_dir, file_name, no_agents_md_override):
-            any_overridden = True
+    for directory, filenames in files_to_create.items():
+        for filename in filenames:
+            if generate_agent_md_file(directory, filename, no_agents_md_override):
+                any_overridden = True
 
     if any_overridden:
-        console.success(f"Updated {click.style('AGENTS.md', fg='cyan')} related files.")
+        console.success(
+            f"Updated {click.style('AGENTS.md', fg='cyan')} files and Claude Code skills."
+        )
         return
 
-    console.success(f"Created {click.style('AGENTS.md', fg='cyan')} related files.")
+    console.success(
+        f"Created {click.style('AGENTS.md', fg='cyan')} files and Claude Code skills."
+    )
 
 
 def write_bindings_file(bindings: Bindings) -> Path:
```
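A minimal standalone sketch of the dict-driven layout this change introduces. The `write_resource` stub below is a hypothetical stand-in for `generate_agent_md_file` (which actually copies packaged resources); everything else mirrors the new loop:

```python
import os
import tempfile

def write_resource(directory: str, filename: str) -> bool:
    """Hypothetical stand-in for generate_agent_md_file: create the file,
    returning True if an existing file was overwritten."""
    path = os.path.join(directory, filename)
    existed = os.path.exists(path)
    with open(path, "w") as f:
        f.write(f"# {filename}\n")
    return existed

target_directory = tempfile.mkdtemp()
agent_dir = os.path.join(target_directory, ".agent")
claude_commands_dir = os.path.join(target_directory, ".claude", "commands")
os.makedirs(agent_dir, exist_ok=True)
os.makedirs(claude_commands_dir, exist_ok=True)

# Same dict-driven structure as the new generate_agent_md_files
files_to_create = {
    target_directory: ["AGENTS.md", "CLAUDE.md"],
    agent_dir: ["CLI_REFERENCE.md", "REQUIRED_STRUCTURE.md", "SDK_REFERENCE.md"],
    claude_commands_dir: ["new-agent.md", "eval.md"],
}

any_overridden = False
for directory, filenames in files_to_create.items():
    for filename in filenames:
        if write_resource(directory, filename):
            any_overridden = True
```

Collapsing the two parallel loops into one dict keeps the "which file goes where" mapping in a single place, so adding the `.claude/commands` skills is a one-line change.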

src/uipath/_resources/eval.md

Lines changed: 287 additions & 0 deletions
---
allowed-tools: Bash, Read, Write, Edit, Glob
description: Create and run agent evaluations
---

I'll help you create and run evaluations for your UiPath agent.

## Step 1: Check project setup

Let me check your project structure:

!ls -la evaluations/ entry-points.json 2>/dev/null || echo "NEEDS_SETUP"

# Check if schemas might be stale (main.py newer than entry-points.json)
!if [ -f main.py ] && [ -f entry-points.json ] && [ main.py -nt entry-points.json ]; then echo "SCHEMAS_MAY_BE_STALE"; fi

### If NEEDS_SETUP

If `entry-points.json` doesn't exist, initialize the project first:

!uv run uipath init

Then re-run this skill.

### If SCHEMAS_MAY_BE_STALE

Your `main.py` is newer than `entry-points.json`. Refresh the schemas:

!uv run uipath init --no-agents-md-override

## Step 2: What would you like to do?

1. **Create new eval set** - Set up evaluations from scratch
2. **Add test case** - Add a test to an existing eval set
3. **Run evaluations** - Execute tests and see results
4. **Analyze failures** - Debug failing tests

---

## Creating an Eval Set

First, create the directory structure:

!mkdir -p evaluations/eval-sets evaluations/evaluators

Read the agent's Input/Output schema from `entry-points.json` to understand the data types.

### Evaluator Selection Guide

| If your output is... | Use this evaluator | `evaluatorTypeId` |
|----------------------|--------------------|-------------------|
| Exact string/number | `ExactMatchEvaluator` | `uipath-exact-match` |
| Contains key phrases | `ContainsEvaluator` | `uipath-contains` |
| Semantically correct | `LLMJudgeOutputEvaluator` | `uipath-llm-judge-output-semantic-similarity` |
| JSON with numbers | `JsonSimilarityEvaluator` | `uipath-json-similarity` |
### Step 1: Create Evaluator Config Files

**Each evaluator needs a JSON config file** in `evaluations/evaluators/`.

**ExactMatchEvaluator** (`evaluations/evaluators/exact-match.json`):
```json
{
  "version": "1.0",
  "id": "ExactMatchEvaluator",
  "name": "ExactMatchEvaluator",
  "description": "Checks for exact output match",
  "evaluatorTypeId": "uipath-exact-match",
  "evaluatorConfig": {
    "name": "ExactMatchEvaluator",
    "targetOutputKey": "*"
  }
}
```

**LLMJudgeOutputEvaluator** (`evaluations/evaluators/llm-judge-output.json`):
```json
{
  "version": "1.0",
  "id": "LLMJudgeOutputEvaluator",
  "name": "LLMJudgeOutputEvaluator",
  "description": "Uses LLM to judge semantic similarity",
  "evaluatorTypeId": "uipath-llm-judge-output-semantic-similarity",
  "evaluatorConfig": {
    "name": "LLMJudgeOutputEvaluator",
    "model": "gpt-4o-mini-2024-07-18"
  }
}
```

**JsonSimilarityEvaluator** (`evaluations/evaluators/json-similarity.json`):
```json
{
  "version": "1.0",
  "id": "JsonSimilarityEvaluator",
  "name": "JsonSimilarityEvaluator",
  "description": "Compares JSON structures",
  "evaluatorTypeId": "uipath-json-similarity",
  "evaluatorConfig": {
    "name": "JsonSimilarityEvaluator",
    "targetOutputKey": "*"
  }
}
```

**ContainsEvaluator** (`evaluations/evaluators/contains.json`):
```json
{
  "version": "1.0",
  "id": "ContainsEvaluator",
  "name": "ContainsEvaluator",
  "description": "Checks if output contains text",
  "evaluatorTypeId": "uipath-contains",
  "evaluatorConfig": {
    "name": "ContainsEvaluator"
  }
}
```
### Step 2: Create Eval Set

**Eval Set Template** (`evaluations/eval-sets/default.json`):
```json
{
  "version": "1.0",
  "id": "default-eval-set",
  "name": "Default Evaluation Set",
  "evaluatorRefs": ["ExactMatchEvaluator"],
  "evaluations": [
    {
      "id": "test-1",
      "name": "Test description",
      "inputs": {
        "field": "value"
      },
      "evaluationCriterias": {
        "ExactMatchEvaluator": {
          "expectedOutput": {
            "result": "expected value"
          }
        }
      }
    }
  ]
}
```

**Important notes:**
- `evaluatorRefs` must list ALL evaluators used in any test case
- Each evaluator in `evaluatorRefs` needs a matching JSON config in `evaluations/evaluators/`
- `evaluationCriterias` keys must match entries in `evaluatorRefs`
- Use `expectedOutput` for most evaluators
- LLM evaluators need `model` in their config. Available models are defined in the SDK's `ChatModels` class (`uipath.platform.chat.ChatModels`):
  - `gpt-4o-mini-2024-07-18` (recommended for cost-efficiency)
  - `gpt-4o-2024-08-06` (higher quality, higher cost)
  - `o3-mini-2025-01-31` (latest reasoning model)
- Model availability varies by region and tenant configuration
- Check your UiPath Automation Cloud portal under AI Trust Layer for available models in your region
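The consistency rules above can be checked mechanically before running anything. A minimal sketch (a hypothetical helper, not part of the UiPath CLI; it assumes only the eval-set and evaluator-config layout shown above):

```python
import json
import os

def validate_eval_set(eval_set_path: str, evaluators_dir: str) -> list:
    """Check the eval-set consistency rules: every evaluatorRef has a
    config file, and every evaluationCriterias key appears in evaluatorRefs."""
    problems = []
    with open(eval_set_path) as f:
        eval_set = json.load(f)
    refs = set(eval_set.get("evaluatorRefs", []))

    # Collect evaluator ids that have a config file in evaluations/evaluators/
    configured = set()
    for name in os.listdir(evaluators_dir):
        if name.endswith(".json"):
            with open(os.path.join(evaluators_dir, name)) as f:
                configured.add(json.load(f)["id"])

    for ref in sorted(refs - configured):
        problems.append(f"evaluator '{ref}' has no config file in {evaluators_dir}")
    for test in eval_set.get("evaluations", []):
        for criteria_key in test.get("evaluationCriterias", {}):
            if criteria_key not in refs:
                problems.append(
                    f"test '{test['id']}' uses '{criteria_key}', "
                    "which is missing from evaluatorRefs"
                )
    return problems
```

An empty return value means the set satisfies the rules; each string describes one violation.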
---

## Adding a Test Case

When adding a test to an existing eval set:

1. Read the existing eval set
2. Check which evaluators are in `evaluatorRefs`
3. Add the new test to the `evaluations` array
4. If using a new evaluator, add it to `evaluatorRefs`

### Test Case Template

```json
{
  "id": "test-{n}",
  "name": "Description of what this tests",
  "inputs": { },
  "evaluationCriterias": {
    "EvaluatorName": {
      "expectedOutput": { }
    }
  }
}
```
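Steps 1-4 above can be sketched as a small helper (a hypothetical utility, not part of the UiPath CLI; it assumes the eval-set layout shown earlier):

```python
import json

def add_test_case(eval_set_path: str, test_case: dict) -> None:
    """Append a test case to an eval set, extending evaluatorRefs
    with any evaluators the new case uses."""
    with open(eval_set_path) as f:
        eval_set = json.load(f)                           # 1. read existing set
    refs = eval_set.setdefault("evaluatorRefs", [])       # 2. current evaluators
    eval_set.setdefault("evaluations", []).append(test_case)  # 3. add the test
    for evaluator in test_case.get("evaluationCriterias", {}):
        if evaluator not in refs:                         # 4. register new evaluator
            refs.append(evaluator)
    with open(eval_set_path, "w") as f:
        json.dump(eval_set, f, indent=2)
```

Remember that step 4 only updates `evaluatorRefs`; a newly referenced evaluator still needs its own config file in `evaluations/evaluators/`.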
---

## Running Evaluations

First, read `entry-points.json` to get the entrypoint name (e.g., `main`):

!uv run uipath eval main evaluations/eval-sets/default.json --output-file eval-results.json

**Note:** Replace `main` with your actual entrypoint from `entry-points.json`.

### Analyze Results

After running, read `eval-results.json` and show:
- A pass/fail summary table
- For failures: expected vs. actual output
- Suggestions for fixing tests or switching evaluators

### Results Format

```json
{
  "evaluationSetResults": [{
    "evaluationRunResults": [
      {
        "evaluationId": "test-1",
        "evaluatorId": "ExactMatchEvaluator",
        "result": { "score": 1.0 },
        "errorMessage": null
      }
    ]
  }]
}
```

- Score 1.0 = PASS
- Score < 1.0 = FAIL (show expected vs. actual)
- `errorMessage` present = ERROR (show message)
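A minimal sketch of the summary step, assuming only the results shape and scoring rules shown above:

```python
import json

def summarize(results_path: str) -> dict:
    """Bucket each evaluator run into PASS / FAIL / ERROR
    using the scoring rules above."""
    with open(results_path) as f:
        results = json.load(f)
    summary = {"PASS": [], "FAIL": [], "ERROR": []}
    for set_result in results["evaluationSetResults"]:
        for run in set_result["evaluationRunResults"]:
            key = f"{run['evaluationId']}/{run['evaluatorId']}"
            if run.get("errorMessage"):          # errorMessage present -> ERROR
                summary["ERROR"].append(key)
            elif run["result"]["score"] >= 1.0:  # score 1.0 -> PASS
                summary["PASS"].append(key)
            else:                                # score < 1.0 -> FAIL
                summary["FAIL"].append(key)
    return summary
```

The FAIL and ERROR buckets are the ones worth drilling into with the expected-vs-actual comparison described above.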
---

## Evaluator Reference

### Deterministic Evaluators

**ExactMatchEvaluator** - Exact output matching
```json
"ExactMatchEvaluator": {
  "expectedOutput": { "result": "exact value" }
}
```

**ContainsEvaluator** - Output contains a substring
```json
"ContainsEvaluator": {
  "searchText": "must contain this"
}
```

**JsonSimilarityEvaluator** - JSON comparison with tolerance
```json
"JsonSimilarityEvaluator": {
  "expectedOutput": { "value": 10.0 }
}
```

### LLM-Based Evaluators

**LLMJudgeOutputEvaluator** - Semantic correctness
```json
"LLMJudgeOutputEvaluator": {
  "expectedOutput": { "summary": "Expected semantic meaning" }
}
```

**LLMJudgeTrajectoryEvaluator** - Validate agent reasoning
```json
"LLMJudgeTrajectoryEvaluator": {
  "expectedAgentBehavior": "The agent should first fetch data, then process it"
}
```

---

## Common Issues

### "No evaluations found"
- Check that the `evaluations/eval-sets/` directory exists
- Verify the JSON file is valid

### Evaluator not found
- Each evaluator needs a JSON config file in `evaluations/evaluators/`
- The config file must have the correct `evaluatorTypeId` (see templates above)
- The config file must have a `name` field at the root level
- LLM evaluators need `model` in `evaluatorConfig`

### Evaluator skipped
- Ensure the evaluator is listed in the root `evaluatorRefs` array
- Check that the evaluator config file exists in `evaluations/evaluators/`

### Schema mismatch
- Run `uv run uipath init --no-agents-md-override` to refresh schemas
- Check that `entry-points.json` matches your Input/Output models
