init: rewrite discovery prompt + add LLM-driven metadata enrichment #6
Open
jakejimenez wants to merge 1 commit into
The previous init flow used a single LLM call whose only job was to suggest more command paths to probe; everything else (description, system_prompt, safety patterns, synonyms) came from hardcoded heuristics that produced uniformly generic output (e.g. description="<tool> CLI", empty safety lists). The model had the help text in hand and could do better. This change:

§A: Discovery prompt rewrite (cmd/nlci/main.go::buildInitInferencePrompt and initInferenceSystemPrompt). Switches to XML-tagged input, embeds the verified-commands context inline, and includes one concrete few-shot example showing the desired output shape. Long help text is placed at the top of the user prompt per Anthropic's prompting guidance. Help text is trimmed to 2KB to keep the prompt under Apple Intelligence's 4096-token ceiling. Output remains newline-separated paths so DiscoverFromSeeds verifies them as before.

§B: New metadata enrichment call (enrichInitMetadata + parseEnrichmentYAML). After the verified command tree is finalised, a second LLM call asks for description, tool-specific system_prompt, safety patterns, and synonyms in a single YAML block wrapped in <yaml>...</yaml> sentinel tags. The prompt includes two few-shot examples (docker for command_tree, curl for flag_driven) so the model generalises to either shape. A leading <evidence> block grounds destructive-pattern picks in real help-text quotes; the parser ignores it.

Results merge into writeScaffold and writeFlagDrivenScaffold; LLM-produced synonyms union with the heuristic ones rather than replacing them. On any failure (no router, malformed YAML, placeholder leakage, timeout) the scaffold falls back to today's hardcoded values, so init never blocks on §B.

Apple Intelligence is detected and §B is skipped without failing init: the nlci-apple Swift bridge constrains output to a Generable CommandResult struct, which cannot return free-form YAML. Users get the §B benefit on Ollama / llama.cpp / LM Studio; on Apple they get §A discovery improvements plus a clear stderr message explaining why §B was skipped.

Tests: parseEnrichmentYAML covers happy path, missing tag, empty block, junk-before-tag, and placeholder-leak rejection. mergeSynonyms covers union semantics and nil-LLM passthrough.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Summary
The init flow today uses a single LLM call to suggest more command paths to probe, then fills description/system_prompt/safety/synonyms from hardcoded heuristics. Every tool ends up with description: <tool> CLI, the same generic system prompt, and empty safety lists. The model has the help text in hand and can do better.

This PR implements both halves of the approved plan: an improved discovery prompt (§A) and a new metadata-enrichment call (§B).
§A: Discovery prompt rewrite
buildInitInferencePrompt and initInferenceSystemPrompt now use XML-tagged input (<tool>, <root_help>, <verified>) per Anthropic's prompting guidance.

Output format unchanged: still newline-separated paths that DiscoverFromSeeds verifies.

§B: New metadata enrichment call
After the verified command tree is finalised, a second LLM call (enrichInitMetadata) asks for description, tool-specific system_prompt, safety patterns, and synonyms in a single YAML block wrapped in <yaml>...</yaml> sentinel tags.

The prompt includes:
- Two few-shot examples (docker for command_tree, curl for flag_driven) so the model generalises to either shape
- A leading <evidence> block where the model quotes help-text lines that justify each destructive pattern (grounds the safety list in reality; the parser ignores it)

parseEnrichmentYAML extracts the block via regex, validates with yaml.Unmarshal, and rejects placeholder leakage like <your-tool> in the description.

writeScaffold and writeFlagDrivenScaffold accept an optional *EnrichmentMetadata. When present, LLM-produced description/system_prompt replace the hardcoded defaults; LLM safety patterns populate the previously empty arrays; LLM synonyms union with the heuristic ones via mergeSynonyms.

Apple Intelligence handling
nlci-apple's Swift bridge constrains output to a Generable CommandResult struct (apple/Sources/NLCIApple/App.swift:6-13), which cannot return free-form YAML. runMetadataEnrichment detects this via Router.ActiveName() == "apple" and skips §B without failing init, printing a clear stderr message that explains why.

§A still runs on Apple: it produces newline-separated paths, which fit within the bridge's command-shaped output.
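For reviewers who want a feel for the §B post-processing, here is a minimal, stdlib-only sketch of the two steps described above: pulling the <yaml> block out of a model reply and unioning synonyms. It is not the code in this PR. The names extractYAMLBlock, yamlBlockRe, and placeholderRe are hypothetical, the synonyms shape (map[string][]string) is an assumption, the placeholder check is applied to the whole block rather than only the description, and the yaml.Unmarshal validation step is omitted to avoid a third-party import.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// yamlBlockRe captures the text between the <yaml> sentinel tags.
// (?s) lets '.' span newlines; the lazy match stops at the first </yaml>.
var yamlBlockRe = regexp.MustCompile(`(?s)<yaml>(.*?)</yaml>`)

// placeholderRe flags template leftovers such as <your-tool>.
// Hypothetical pattern: any lowercase name wrapped in angle brackets.
var placeholderRe = regexp.MustCompile(`<[a-z][a-z0-9_-]*>`)

// extractYAMLBlock returns the trimmed text inside the sentinel tags.
// Anything outside the tags (junk, or a leading <evidence> block) is ignored.
func extractYAMLBlock(reply string) (string, error) {
	m := yamlBlockRe.FindStringSubmatch(reply)
	if m == nil {
		return "", errors.New("no <yaml> block in reply")
	}
	block := strings.TrimSpace(m[1])
	if block == "" {
		return "", errors.New("empty <yaml> block")
	}
	if placeholderRe.MatchString(block) {
		return "", errors.New("placeholder leaked into output")
	}
	return block, nil
}

// mergeSynonyms unions LLM-produced synonyms into the heuristic ones.
// Assumed shape: canonical command -> list of aliases. A nil llm map
// yields a copy of the heuristic map (ranging over a nil map is a no-op).
func mergeSynonyms(heuristic, llm map[string][]string) map[string][]string {
	out := make(map[string][]string, len(heuristic))
	for k, vs := range heuristic {
		out[k] = append([]string(nil), vs...)
	}
	for k, vs := range llm {
		for _, v := range vs {
			if !contains(out[k], v) {
				out[k] = append(out[k], v)
			}
		}
	}
	return out
}

// contains reports whether s is already present in list.
func contains(list []string, s string) bool {
	for _, x := range list {
		if x == s {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative reply; the evidence quote and YAML content are made up.
	reply := "<evidence>\"rm: Remove one or more containers\"</evidence>\n" +
		"<yaml>\ndescription: Manage containers and images\n</yaml>"

	block, err := extractYAMLBlock(reply)
	if err != nil {
		// Mirrors the fallback described above: keep the hardcoded defaults.
		fmt.Println("falling back to hardcoded defaults:", err)
		return
	}
	fmt.Println(block)

	merged := mergeSynonyms(
		map[string][]string{"ps": {"list"}},
		map[string][]string{"ps": {"list", "running"}, "rm": {"delete"}},
	)
	fmt.Println(merged)
}
```

Returning an error for every failure mode keeps the caller's decision simple: any non-nil error means the scaffold is written with the existing hardcoded values.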
Test plan
- go test ./... passes
- go vet ./... clean
- parseEnrichmentYAML covered: happy path, missing tag, empty block, junk-before-tag, placeholder-leak rejection
- mergeSynonyms covered: union semantics, nil-LLM passthrough
- nlci init docker on Apple: skips §B with the documented message; falls back to hardcoded values
- nlci --backend ollama init docker produces a real description and populated safety list
- nlci --backend ollama init curl produces a flag-driven-aware system_prompt

Files changed
- cmd/nlci/main.go: §A prompt rewrite, §B implementation, parser, merge helper, integration
- cmd/nlci/main_test.go: parser + merge unit tests; signature update for the existing writeFlagDrivenScaffold test

🤖 Generated with Claude Code