docs(skill): A/B-tuned ecp skill for agent usability (net -26 lines) #532

Merged
coseto6125 merged 1 commit into main from chore/ecp-skill-coverage on Jun 3, 2026

@coseto6125
Owner

Tunes the embedded ecp skill source (docs/skills/ecp/{SKILL.md,ECP.md}) for agent usability, especially for Haiku. The change is prompt-only and the token count strictly decreases (SKILL.md 6235→6106 bytes, net −26 lines). The files are distributed via the existing include_dir! + ecp admin claude install skills path (verified byte-identical after a rebuild).
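For reviewers who want to repeat the byte-identical check, a minimal sketch follows. It assumes the installed copy sits at a path you already know; `byte_identical` is a hypothetical helper written for this note, not something in the ecp codebase, and it is not necessarily how the check was done for this PR.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns Ok(true) when both files hold exactly
/// the same bytes, Ok(false) when they differ, and Err on an I/O failure.
pub fn byte_identical(source: &Path, installed: &Path) -> io::Result<bool> {
    // Read both files fully; skill files are a few KB, so this is cheap.
    let source_bytes = fs::read(source)?;
    let installed_bytes = fs::read(installed)?;
    Ok(source_bytes == installed_bytes)
}
```

Call it once per file, with `docs/skills/ecp/SKILL.md` as `source` and the copy emitted by `ecp admin claude install skills` as `installed`. The install location is not stated in this PR, so supply it yourself.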

Method

Two rounds of empirical A/B testing: Haiku agents read the skill and executed real navigation tasks end-to-end on this repo, and each run was scored on which verb/tool the agent reached (vs a no-prompt baseline, per the validate-prompt-rules discipline).
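The scoring rule can be sketched as below. This illustrates the idea only and is not the harness used for this PR; `Arm`, `Trial` and `score` are hypothetical names, and a "hit" is assumed to mean the agent reached exactly the expected verb.

```rust
/// Which prompt variant a trial ran under (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Arm {
    /// No-prompt baseline.
    Baseline,
    /// Agent read the tuned skill before the task.
    Tuned,
}

/// One executed task: the verb we hoped for and the verb actually reached.
#[derive(Debug, Clone)]
pub struct Trial {
    pub arm: Arm,
    pub expected_verb: &'static str,
    pub reached_verb: &'static str,
}

/// Returns (hits, total) for one arm; a hit is a trial whose reached verb
/// equals the expected verb.
pub fn score(trials: &[Trial], arm: Arm) -> (usize, usize) {
    let mut hits = 0;
    let mut total = 0;
    for trial in trials.iter().filter(|t| t.arm == arm) {
        total += 1;
        if trial.reached_verb == trial.expected_verb {
            hits += 1;
        }
    }
    (hits, total)
}
```

Reporting `(hits, total)` per arm rather than a percentage keeps small sample sizes visible, which matters when a validation set has only two or three tasks per verb.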

Round 1 — correctness

  • 'who calls X' → impact (list), not find (count) — added to Directive 1.
  • Never fabricate: on a true found:false, report 'doesn't exist'; never synthesize a caller list / blast radius for a missing symbol (both arms previously invented one). Directive 3 + ECP.md.
  • Ambiguous-name error → retry --file/--kind, don't fall back to Read.
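The three Round 1 rules amount to a small decision table for what the agent does after a lookup. A sketch under stated assumptions: `Lookup`, `NextStep` and `next_step` are hypothetical types that model the behavior the skill text asks for, and they do not mirror ecp's actual output format.

```rust
/// Possible outcomes of looking up a symbol (hypothetical model).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Lookup {
    /// The symbol exists; callers come from the tool, never from guesswork.
    Found { callers: Vec<String> },
    /// A true found:false miss.
    NotFound,
    /// Ambiguous-name error: several symbols share the name.
    Ambiguous,
}

/// What the agent should do next.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum NextStep {
    /// Report this text to the user.
    Report(String),
    /// Retry the same verb with --file / --kind instead of falling back to Read.
    RetryWithFilter,
}

pub fn next_step(symbol: &str, outcome: Lookup) -> NextStep {
    match outcome {
        // 'who calls X' is answered with the caller list, not just a count.
        Lookup::Found { callers } => NextStep::Report(format!(
            "{} has {} caller(s): {}",
            symbol,
            callers.len(),
            callers.join(", ")
        )),
        // Never synthesize callers or a blast radius for a missing symbol.
        Lookup::NotFound => NextStep::Report(format!("{} doesn't exist", symbol)),
        Lookup::Ambiguous => NextStep::RetryWithFilter,
    }
}
```

The `NotFound` arm has no access to any caller data, so in this model a fabricated blast radius cannot be produced at all.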

Round 2 — coverage (expand ecp's reach)

  • cypher for orphans / all-impls (was looping impact) — fixed.
  • impact --literal for filename read/write (was falling back to grep) — fixed by hoisting the anchor into the description verb-map. Buried triggers fail regardless of wording; the fix is altitude.
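Taken together, the two rounds define a question→verb map. A minimal sketch of that map as code: the `Question` enum and `verb_for` are hypothetical and only restate the pairings named in this PR, and the real skill expresses them as prose in its description.

```rust
/// Kinds of navigation question the A/B rounds covered (hypothetical enum).
pub enum Question {
    /// "Who calls X?"
    WhoCalls,
    /// "Where is this filename read vs written?"
    FilenameReadOrWrite,
    /// "Which symbols are orphans?"
    Orphans,
    /// "List all implementations."
    AllImpls,
}

/// Returns the ecp verb the tuned skill steers the agent toward.
pub fn verb_for(question: Question) -> &'static str {
    match question {
        // impact returns the caller list; find only gives a count.
        Question::WhoCalls => "impact",
        // Replaces the grep fallback seen before tuning.
        Question::FilenameReadOrWrite => "impact --literal",
        // Graph questions with no dedicated verb go to cypher
        // rather than looping impact.
        Question::Orphans | Question::AllImpls => "cypher",
    }
}
```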

Validation (held-out, executed)

impact --literal 3/3, cypher-for-orphans 2/2, zero over-trigger on a string-literal guard task, fabricated:false on planted fake symbols.

Notes

  • No test asserts skill text, so the rewording is safe; cargo build -p egent-code-plexus --bin ecp green (embeds the new files).
  • Followed by a patch release (0.6.3 → 0.6.4) so the binary ships the tuned skill.

…26 lines)

Two rounds of empirical A/B (Haiku agents executing real navigation tasks
end-to-end, scored on which verb they reached) drove these prompt-only edits
to the embedded ecp skill source under docs/skills/ecp/:

- description reframed from a capability list into a reflex trigger + a
  question→verb map, with anchors for the two under-used verbs A/B surfaced:
  where-a-filename-is-read-vs-written→impact --literal, and
  graph-question-with-no-verb (orphans/all-impls)→cypher.
- Directive 1: name the two weak-model traps — 'who calls X' is impact (list)
  not find (count); an ambiguous-name error means retry with --file/--kind,
  not fall back to Read.
- Directive 3 + ECP.md: a real found:false miss means report 'doesn't exist',
  never synthesize a caller list / blast radius for a symbol ecp couldn't find.
- Collapsed verbose tables (literal/group/schema/architecture) into tight prose.

Net effect: SKILL.md 6235→6106 bytes (smaller), and held-out Haiku validation
went from fabricating blast radii / falling back to grep → using impact --literal
3/3, cypher for orphans 2/2, with zero over-triggering on string-literal tasks.

Embedded via include_dir!; verified `ecp admin claude install skills` emits
the tuned files byte-identical.
@coseto6125 coseto6125 enabled auto-merge (squash) June 3, 2026 06:06
@coseto6125 coseto6125 added the merge-queue Opt-in to Mergify merge queue label Jun 3, 2026
@github-actions
Contributor

github-actions Bot commented Jun 3, 2026

ecp impact cache (0 symbols) — internal, used by ecp dev pr-analyze

[]

@github-actions github-actions Bot added the ecp:risk-low ecp signal label Jun 3, 2026
@coseto6125 coseto6125 merged commit 150e5ed into main Jun 3, 2026
18 of 19 checks passed
@coseto6125 coseto6125 deleted the chore/ecp-skill-coverage branch June 3, 2026 06:40

Labels

ecp:risk-low ecp signal merge-queue Opt-in to Mergify merge queue
