Adding README for skills-eval #2055
Greptile Summary
This PR adds a 299-line
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/skill_eval/README.md | New README documenting the skill-eval benchmarking harness; contains a documentation defect where the example YAML config uses ~/ paths that PyYAML won't expand, contradicting the inline comment that calls them "absolute paths." |
Sequence Diagram
sequenceDiagram
participant User
participant CLI as retriever skill-eval run
participant Runner as runner.py
participant Claude as claude subprocess
participant Judge as LLM Judge (NVIDIA NIM)
participant Artifacts as artifacts/skilleval_ts/
User->>CLI: skill-eval run --config skill_eval.yaml
CLI->>Runner: load config + manifest
Runner->>Runner: validate pdf_dirs keys vs manifest domains
loop for each (condition, domain)
Runner->>Runner: build scratch workdir in /tmp/skill_eval/
Runner->>Runner: symlink PDFs into workdir/pdfs/
Runner->>Claude: setup turn (--permission-mode bypassPermissions)
Claude-->>Runner: session-id + token usage
loop for each manifest entry (query turn)
Runner->>Claude: --resume session-id + paraphrased prompt
Claude-->>Runner: final_answer + ranked_retrieved
opt NVIDIA_API_KEY set
Runner->>Judge: score final_answer vs ground-truth
Judge-->>Runner: 0-5 score
end
Runner->>Artifacts: write TrialResult JSON
end
Runner->>Runner: "compute recall@1/5/10"
Runner->>Runner: delete /tmp/skill_eval/cond/domain/
end
Runner->>Artifacts: write session_summary.json + session_summary.md
Runner-->>User: print per-(condition,domain) recall + judge table
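The diagram's "compute recall@1/5/10" step can be illustrated with a minimal sketch. It assumes recall@k is the fraction of ground-truth items that appear in the first k entries of `ranked_retrieved`; the harness's actual definition is not shown in this PR, and the function name and page identifiers below are hypothetical.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_retrieved: Sequence[str], relevant: Iterable[str], k: int) -> float:
    """Fraction of relevant items found in the top-k retrieved results."""
    relevant_set = set(relevant)
    if not relevant_set:
        # Assumption: a query without ground truth contributes zero.
        return 0.0
    top_k = set(ranked_retrieved[:k])
    return len(relevant_set & top_k) / len(relevant_set)


# Hypothetical query turn; the identifiers are made up for illustration.
ranked = ["doc_a_p3", "doc_b_p1", "doc_c_p7", "doc_a_p4"]
truth = ["doc_c_p7", "doc_a_p4"]

for k in (1, 5, 10):
    print(f"recall@{k} = {recall_at_k(ranked, truth, k):.2f}")
```

Slicing with `[:k]` is safe when fewer than k results were retrieved, so the same function covers all three cutoffs.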
Reviews (2): Last reviewed commit: "Merge pull request #1 from NVIDIA/dev/wa..."
    ## 1. Make the PDF tree reachable

    ViDoRe v3 is split per-domain. The seven domains the harness recognises are:
The list immediately below this sentence contains eight domain entries, but the prose says "seven." This mismatch will confuse readers trying to reconcile the domain count. The same section later (line 204) correctly states "three conditions × eight domains = 24 sessions," confirming eight is the right number.
Suggested change:

    - ViDoRe v3 is split per-domain. The seven domains the harness recognises are:
    + ViDoRe v3 is split per-domain. The eight domains the harness recognises are:
    **`config 'pdf_dirs' is missing an entry for domain '<X>'`** — your manifest contains a `domain` value that has no key in `pdf_dirs`. Either add the key, or use `--domains` to skip that subset.

    **`PDF directory '…' for domain '…' does not exist or is not a directory`** — the value under `pdf_dirs.<domain>` was unset (`~` expansion failed, typo, etc.). Resolve the path manually with `ls "$PATH"` and update the config.
`$PATH` is the shell variable for executable search directories, not for the PDF directory path. Running `ls "$PATH"` would expand to something like `ls "/usr/local/bin:/usr/bin:/bin"` and error or list unrelated directories. The intent is to inspect the configured PDF directory from the YAML config.
Suggested change:

    - **`PDF directory '…' for domain '…' does not exist or is not a directory`** — the value under `pdf_dirs.<domain>` was unset (`~` expansion failed, typo, etc.). Resolve the path manually with `ls "$PATH"` and update the config.
    + **`PDF directory '…' for domain '…' does not exist or is not a directory`** — the value under `pdf_dirs.<domain>` was unset (`~` expansion failed, typo, etc.). Resolve the path manually with `ls "/your/configured/pdf_dirs/path"` and update the config.
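Both troubleshooting entries quoted above come down to checking the `pdf_dirs` mapping against the domains in the manifest. The sketch below shows one way to run that check by hand before starting a session. `check_pdf_dirs`, the domain names, and the message wording are hypothetical and are not taken from `runner.py`.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List


def check_pdf_dirs(manifest_domains: Iterable[str], pdf_dirs: Dict[str, str]) -> List[str]:
    """Return one human-readable problem per domain whose PDF directory is unusable."""
    problems = []
    for domain in sorted(set(manifest_domains)):
        raw = pdf_dirs.get(domain)
        if raw is None:
            problems.append(f"{domain}: no entry in pdf_dirs")
        elif raw.startswith("~"):
            # A YAML loader hands '~/...' over as a literal string.
            problems.append(f"{domain}: {raw} starts with '~' and is not an absolute path")
        elif not Path(raw).is_dir():
            problems.append(f"{domain}: {raw} is not a directory")
    return problems


# Hypothetical config: one valid directory, one '~' path, one missing key.
with tempfile.TemporaryDirectory() as real_dir:
    pdf_dirs = {"domain_a": real_dir, "domain_b": "~/datasets/domain_b/pdfs"}
    for problem in check_pdf_dirs(["domain_a", "domain_b", "domain_c"], pdf_dirs):
        print(problem)
```

The check deliberately does not expand `~` itself, so a config that passes it contains only literal paths that exist as written.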
…utorun-prereq Update skill-eval README prerequisites
    # Absolute path to your agent-eval manifest (JSON list).
    eval_manifest_path: ~/datasets/vidore_v3/agent_eval_manifest.json
**`~/` paths in YAML example won't expand**
The comment on line 136 says "Absolute path to your agent-eval manifest," but `~/datasets/...` is not an absolute path — `~` is a shell shorthand that standard YAML loaders (PyYAML's `safe_load`) pass through verbatim. Unless `runner.py` explicitly calls `os.path.expanduser()` on every path value after loading, the runner will look for a literal directory named `~` and fail. The troubleshooting section (line 291) already hints at this: "the value under `pdf_dirs.<domain>` was unset (`~` expansion failed …)" — but users who hit this error after following the example config will find it confusing. The example should use a real absolute path (e.g. `/home/user/datasets/vidore_v3/agent_eval_manifest.json`) or add a note that `~` must be pre-expanded before writing it into the YAML.
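If the runner were to expand `~` itself rather than rely on the example being rewritten, the post-load step could look roughly like the sketch below. It is an illustration only: `expand_path_values` is a hypothetical helper, the dict stands in for what a YAML loader would return, `example_domain` is a made-up key, and nothing here reflects what `runner.py` actually does.

```python
import os


def expand_path_values(config):
    """Recursively apply os.path.expanduser to every string value in a loaded config."""
    if isinstance(config, dict):
        return {key: expand_path_values(value) for key, value in config.items()}
    if isinstance(config, list):
        return [expand_path_values(item) for item in config]
    if isinstance(config, str):
        # Strings that do not start with '~' are returned unchanged.
        return os.path.expanduser(config)
    return config


# Stand-in for the dict a YAML loader would return for the example config.
loaded = {
    "eval_manifest_path": "~/datasets/vidore_v3/agent_eval_manifest.json",
    "pdf_dirs": {"example_domain": "~/datasets/vidore_v3/example_domain/pdfs"},
}

print(os.path.isabs(loaded["eval_manifest_path"]))  # False: '~/...' is not absolute
expanded = expand_path_values(loaded)
print(expanded["eval_manifest_path"])
```

Expanding once, directly after loading, keeps every later path check working on real filesystem paths. The simpler fix the review proposes, writing a real absolute path into the example, needs no code at all.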