
init: rewrite discovery prompt + add LLM-driven metadata enrichment #6

Open
jakejimenez wants to merge 1 commit into main from feat/init-llm-enrichment

Summary

The init flow today uses a single LLM call to suggest more command paths to probe, then fills description / system_prompt / safety / synonyms from hardcoded heuristics — every tool ends up with description: <tool> CLI, the same generic system prompt, and empty safety lists. The model has the help text in hand and can do better.

This PR implements both halves of the approved plan: an improved discovery prompt (§A) and a new metadata-enrichment call (§B).

§A — Discovery prompt rewrite

buildInitInferencePrompt and initInferenceSystemPrompt:

  • XML-tagged input (<tool>, <root_help>, <verified>) per Anthropic's prompting guidance
  • Long help text placed at the top of the user prompt (and trimmed to 2KB to fit Apple Intelligence's 4096-token ceiling)
  • One concrete few-shot example showing the expected output shape
  • Positive instructions ("Look for: …") instead of "Never invent…"

Output format unchanged — still newline-separated paths that DiscoverFromSeeds verifies.

§B — New metadata enrichment call

After the verified command tree is finalised, a second LLM call (enrichInitMetadata) asks for description, tool-specific system_prompt, safety patterns, and synonyms in a single YAML block wrapped in <yaml>...</yaml> sentinel tags.

The prompt includes:

  • Two few-shot examples — docker (command_tree) and curl (flag_driven) — so the model generalises to either shape
  • A leading <evidence> block where the model quotes help-text lines that justify each destructive pattern (grounds the safety list in reality; the parser ignores it)
  • Mode-aware rules so synonym values map to subcommand paths or capability names depending on the tool's shape

parseEnrichmentYAML extracts the block via regex, validates with yaml.Unmarshal, and rejects placeholder leakage like <your-tool> in the description.
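
A minimal sketch of the extraction and placeholder check, assuming a reply shaped like the one described above. The names extractYAMLBlock, yamlBlockRE, and placeholderRE are hypothetical, and the placeholder pattern is a guess at what "placeholder leakage" looks like. The yaml.Unmarshal validation step is left out so the example needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// yamlBlockRE captures everything between the sentinel tags; (?s) lets "."
// match newlines. Text before the tag, such as <evidence>, is ignored.
var yamlBlockRE = regexp.MustCompile(`(?s)<yaml>(.*?)</yaml>`)

// placeholderRE matches template leftovers such as <your-tool>.
var placeholderRE = regexp.MustCompile(`<[a-z][a-z0-9_-]*>`)

// extractYAMLBlock (hypothetical name) returns the trimmed YAML body or an
// error describing why the reply is unusable.
func extractYAMLBlock(reply string) (string, error) {
	m := yamlBlockRE.FindStringSubmatch(reply)
	if m == nil {
		return "", errors.New("no <yaml> block in reply")
	}
	body := strings.TrimSpace(m[1])
	if body == "" {
		return "", errors.New("empty <yaml> block")
	}
	// Only the description line is checked, matching the PR description.
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "description:") {
			continue
		}
		if ph := placeholderRE.FindString(line); ph != "" {
			return "", fmt.Errorf("placeholder %s leaked into description", ph)
		}
	}
	return body, nil
}

func main() {
	reply := "<evidence>\n\"rm  Remove one or more containers\"\n</evidence>\n" +
		"<yaml>\ndescription: Manage containers and images\n</yaml>"
	body, err := extractYAMLBlock(reply)
	fmt.Println(body, err)
}
```

In the real parser the returned body would then go through yaml.Unmarshal; any error from either step can be treated as "no enrichment" so the scaffold keeps its hardcoded defaults.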

writeScaffold and writeFlagDrivenScaffold accept an optional *EnrichmentMetadata. When present, LLM-produced description/system_prompt replace the hardcoded defaults; LLM safety patterns populate the previously-empty arrays; LLM synonyms union with the heuristic ones via mergeSynonyms.
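
The union behaviour can be illustrated with a short sketch. It assumes synonyms are stored as a map from a phrase to a list of targets; the PR does not show the real type or the signature of mergeSynonyms, so mergeSynonymsSketch and containsString are hypothetical stand-ins.

```go
package main

import "fmt"

// containsString reports whether s is already present in list.
func containsString(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// mergeSynonymsSketch (hypothetical) unions LLM-produced synonyms into the
// heuristic ones. Heuristic entries come first and duplicates are dropped.
// A nil LLM map passes the heuristic map through unchanged.
func mergeSynonymsSketch(heuristic, llm map[string][]string) map[string][]string {
	if llm == nil {
		return heuristic
	}
	out := make(map[string][]string, len(heuristic)+len(llm))
	for k, vs := range heuristic {
		// Copy so the caller's slices are never mutated.
		out[k] = append([]string(nil), vs...)
	}
	for k, vs := range llm {
		for _, v := range vs {
			if !containsString(out[k], v) {
				out[k] = append(out[k], v)
			}
		}
	}
	return out
}

func main() {
	heuristic := map[string][]string{"delete": {"rm"}}
	llm := map[string][]string{"delete": {"rm", "rmi"}, "list": {"ps"}}
	fmt.Println(mergeSynonymsSketch(heuristic, llm))
}
```

Keeping the heuristic entries first means an unhelpful LLM reply can add noise but never removes a mapping that worked before.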

Apple Intelligence handling

nlci-apple's Swift bridge constrains output to a Generable CommandResult struct (apple/Sources/NLCIApple/App.swift:6-13), which cannot return free-form YAML. runMetadataEnrichment detects this via Router.ActiveName() == "apple", skips §B, and prints a message to stderr:

Metadata enrichment skipped: Apple backend constrains output to single commands.
Use --backend ollama|llamacpp|lmstudio for richer metadata.

§A still runs on Apple — it produces newline-separated paths which fit within the bridge's command-shaped output.
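
The backend check can be sketched as below. Only the ActiveName() == "apple" comparison and the message text come from this description; the activeNamer interface, the fixedRouter test double, and enrichmentSkipNotice are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// activeNamer is a hypothetical stand-in for the part of Router used here.
type activeNamer interface {
	ActiveName() string
}

// fixedRouter is a test double that always reports the same backend name.
type fixedRouter string

func (r fixedRouter) ActiveName() string { return string(r) }

// appleSkipNotice is the two-line message quoted in the PR description.
const appleSkipNotice = "Metadata enrichment skipped: Apple backend constrains output to single commands.\n" +
	"Use --backend ollama|llamacpp|lmstudio for richer metadata."

// enrichmentSkipNotice (hypothetical) reports whether §B should be skipped
// for the active backend and, if so, which message to show.
func enrichmentSkipNotice(r activeNamer) (string, bool) {
	if r != nil && r.ActiveName() == "apple" {
		return appleSkipNotice, true
	}
	return "", false
}

func main() {
	if notice, skip := enrichmentSkipNotice(fixedRouter("apple")); skip {
		// The caller would continue with the hardcoded scaffold values.
		fmt.Fprintln(os.Stderr, notice)
	}
}
```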

Test plan

  • go test ./... passes
  • go vet ./... clean
  • parseEnrichmentYAML covered: happy path, missing tag, empty block, junk-before-tag, placeholder-leak rejection
  • mergeSynonyms covered: union semantics, nil-LLM passthrough
  • nlci init docker on Apple: skips §B with the documented message; falls back to hardcoded values
  • Reviewer with Ollama: verify nlci --backend ollama init docker produces a real description and populated safety list
  • Reviewer: verify nlci --backend ollama init curl produces a flag-driven-aware system_prompt

Files changed

  • cmd/nlci/main.go — §A prompt rewrite, §B implementation, parser, merge helper, integration
  • cmd/nlci/main_test.go — parser + merge unit tests; signature update for the existing writeFlagDrivenScaffold test

🤖 Generated with Claude Code

The previous init flow used a single LLM call whose only job was to
suggest more command paths to probe; everything else (description,
system_prompt, safety patterns, synonyms) came from hardcoded heuristics
that produced uniformly generic output (e.g. description="<tool> CLI",
empty safety lists). The model had the help text in hand and could do
better.

This change:

§A — Discovery prompt rewrite (cmd/nlci/main.go::buildInitInferencePrompt
and initInferenceSystemPrompt). Switches to XML-tagged input, embeds the
verified-commands context inline, and includes one concrete few-shot
example showing the desired output shape. Long help text is placed at
the top of the user prompt per Anthropic's prompting guidance. Help text
is trimmed to 2KB to keep the prompt under Apple Intelligence's 4096-
token ceiling. Output remains newline-separated paths so DiscoverFromSeeds
verifies them as before.

§B — New metadata enrichment call (enrichInitMetadata + parseEnrichmentYAML).
After the verified command tree is finalised, a second LLM call asks for
description, tool-specific system_prompt, safety patterns, and synonyms
in a single YAML block wrapped in <yaml>...</yaml> sentinel tags. The
prompt includes two few-shot examples (docker for command_tree, curl
for flag_driven) so the model generalises to either shape. A leading
<evidence> block grounds destructive-pattern picks in real help-text
quotes — the parser ignores it. Results merge into writeScaffold and
writeFlagDrivenScaffold; LLM-produced synonyms union with the heuristic
ones rather than replacing them. On any failure (no router, malformed
YAML, placeholder leakage, timeout) the scaffold falls back to today's
hardcoded values, so init never blocks on §B.

Apple Intelligence is detected and §B is skipped there: the nlci-apple
Swift bridge constrains output to a Generable CommandResult struct,
which cannot return free-form YAML. Users get the §B benefit on Ollama
/ llama.cpp / LM Studio; on Apple they get the §A discovery improvements
plus a stderr message explaining why §B was skipped.

Tests: parseEnrichmentYAML covers happy path, missing tag, empty block,
junk-before-tag, and placeholder-leak rejection. mergeSynonyms covers
union semantics and nil-LLM passthrough.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>