add per-stage references for skill #2034

Open
edknv wants to merge 1 commit into NVIDIA:main from edknv:edwardk/skill-per-stage

edknv (Collaborator) commented May 13, 2026

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables, have you ensured those are mimicked in the Helm values.yaml file?

@edknv edknv requested review from a team as code owners May 13, 2026 20:31
@edknv edknv requested a review from ChrisJar May 13, 2026 20:31
greptile-apps Bot (Contributor) commented May 13, 2026

Greptile Summary

This PR adds 16 new per-stage reference files under .claude/skills/nemo-retriever/references/ and reorganises SKILL.md to list them in logical groups (end-to-end, per-input-type extractors, storage/eval, cross-cutting). A small improvement to ingest.md adds actionable guidance around the lack of an --overwrite flag on retriever ingest.

  • New reference files cover every major CLI surface (pdf, chart, audio, txt, html, image, pipeline, pipeline-stages, local, service, vector-store, recall, eval, benchmark, harness, compare), each with consistent sections: when to use, canonical invocations, flags table, failure modes, and [[related]] cross-links.
  • SKILL.md drops the "will be added as those stages stabilize" placeholder and replaces it with the full, categorised index.
  • ingest.md gains a note that subsequent ingests always append and documents the rm -rf / vector-store workaround for starting fresh.
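
The `[[related]]` cross-links mentioned above lend themselves to a mechanical check. The following is a minimal sketch, not part of this PR: it assumes each `[[name]]` link is meant to resolve to a sibling `name.md` file in the references directory, and `find_broken_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Matches wiki-style links such as [[vector-store]].
LINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


def find_broken_links(refs_dir):
    """Return (source file, target) pairs whose target .md file is missing.

    Assumption: a link [[name]] should resolve to <refs_dir>/name.md.
    """
    refs_dir = Path(refs_dir)
    # Collect the stems of all reference files, e.g. "local" for local.md.
    known = {p.stem for p in refs_dir.glob("*.md")}
    broken = []
    for path in sorted(refs_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            target = target.strip()
            if target not in known:
                broken.append((path.name, target))
    return broken
```

Calling `find_broken_links(".claude/skills/nemo-retriever/references")` from the repository root would list every link whose target file is missing. A placeholder such as `[[related]]` would be reported unless a `related.md` exists.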

Confidence Score: 4/5

All changes are confined to Claude skill reference documentation; no production code, tests, or configuration is modified.

The new reference files are accurate, internally consistent, and cross-referenced correctly. Two minor documentation clarity issues were found: an unnecessary --overwrite flag in the ingest.md example (where overwrite is already the default), and a vague 'pass the stage-appropriate flag' in local.md that omits the actual flag name.

.claude/skills/nemo-retriever/references/ingest.md and local.md have small accuracy/clarity gaps worth tidying before merge.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .claude/skills/nemo-retriever/SKILL.md | Expanded subcommand references index, organized into logical categories; old "will be added" placeholder removed now that all refs exist. |
| .claude/skills/nemo-retriever/references/ingest.md | Added actionable "no --overwrite" guidance; the --overwrite example points to vector-store which overwrites by default, making the flag redundant in the snippet. |
| .claude/skills/nemo-retriever/references/local.md | New per-stage local runner reference; failure mode for stage6 overwrite mentions "pass the stage-appropriate flag" without naming it. |
| .claude/skills/nemo-retriever/references/pipeline.md | Comprehensive pipeline reference with flag groups, stage ordering, and cross-references; correctly calls out --use-table-structure/--use-graphic-elements off-by-default footgun. |
| .claude/skills/nemo-retriever/references/eval.md | Thorough eval reference; correctly documents the --lancedb-table vs --table-name naming inconsistency between eval export and the rest of the CLI. |
| .claude/skills/nemo-retriever/references/pipeline-stages.md | Useful cross-reference map linking each internal stage to its tuning flags, benchmark subcommand, and standalone CLI; consistent with other reference files. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    SKILL[SKILL.md\nentry point]

    subgraph E2E["End-to-end / search"]
        ingest[ingest.md]
        query[query.md]
        pipeline[pipeline.md]
        service[service.md]
        local[local.md]
    end

    subgraph EXTRACT["Per-input-type extractors"]
        pdf[pdf.md]
        chart[chart.md]
        audio[audio.md]
        txt[txt.md]
        html[html.md]
        image[image.md]
    end

    subgraph STORE["Storage and evaluation"]
        vs[vector-store.md]
        recall[recall.md]
        eval[eval.md]
        benchmark[benchmark.md]
        harness[harness.md]
        compare[compare.md]
    end

    subgraph CROSS["Cross-cutting"]
        stages[pipeline-stages.md]
    end

    SKILL --> E2E
    SKILL --> EXTRACT
    SKILL --> STORE
    SKILL --> CROSS

    pipeline -->|"wraps"| pdf
    pipeline -->|"wraps"| chart
    pipeline -->|"wraps"| audio
    ingest -->|"defaults wrapper"| pipeline
    local -->|"per-stage debug"| pdf
    local -->|"per-stage debug"| chart
    local -->|"per-stage debug"| vs
    pdf -->|"primitives feed"| chart
    vs -->|"feeds"| recall
    vs -->|"feeds"| eval
    harness -->|"orchestrates"| recall
    harness -->|"orchestrates"| eval
    harness -->|"orchestrates"| benchmark
    compare -->|"diffs output of"| recall
    compare -->|"diffs output of"| eval
    stages -->|"maps stages to"| pipeline
    stages -->|"maps stages to"| benchmark
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
.claude/skills/nemo-retriever/references/ingest.md:99-100
Since `--overwrite` is the default for `vector-store stage run`, explicitly passing it has no effect and may suggest to readers that it's a non-default option. Dropping the flag makes the example cleaner and avoids a misleading implication.

```suggestion
  use [[vector-store]] (`vector-store stage run`) on the
  embeddings stage of the [[local]] flow.
```

### Issue 2 of 2
.claude/skills/nemo-retriever/references/local.md:79-81
The failure mode says "pass the stage-appropriate flag" but never names it, leaving Claude (and human readers) without actionable guidance. If stage6 exposes something like `--append` or `--overwrite`, naming it here prevents a fruitless `--help` hunt.

```suggestion
- **`stage6` overwrites a table I wanted to append to** — use
  [[vector-store]] (`vector-store stage run --append`) which exposes an
  explicit `--append` flag, or inspect `retriever local stage6 run --help`
  for the stage-level equivalent.
```

Reviews (1): Last reviewed commit: "add per-stage references for skill"

Comment on lines +99 to +100
use [[vector-store]] (`vector-store stage run --overwrite`) on the
embeddings stage of the [[local]] flow.

P2 Since --overwrite is the default for vector-store stage run, explicitly passing it has no effect and may suggest to readers that it's a non-default option. Dropping the flag makes the example cleaner and avoids a misleading implication.

Suggested change

Before:
use [[vector-store]] (`vector-store stage run --overwrite`) on the
embeddings stage of the [[local]] flow.

After:
use [[vector-store]] (`vector-store stage run`) on the
embeddings stage of the [[local]] flow.

Comment on lines +79 to +81
- **`stage6` overwrites a table I wanted to append to** — pass the
stage-appropriate flag, or use [[vector-store]] which has explicit
`--append`.

P2 The failure mode says "pass the stage-appropriate flag" but never names it, leaving Claude (and human readers) without actionable guidance. If stage6 exposes something like --append or --overwrite, naming it here prevents a fruitless --help hunt.

Suggested change

Before:
- **`stage6` overwrites a table I wanted to append to** — pass the
stage-appropriate flag, or use [[vector-store]] which has explicit
`--append`.

After:
- **`stage6` overwrites a table I wanted to append to** — use
[[vector-store]] (`vector-store stage run --append`) which exposes an
explicit `--append` flag, or inspect `retriever local stage6 run --help`
for the stage-level equivalent.
