fix(skill): customize-opencode off by default; promote browser-execute guide to a registered skill #72

Merged
Alezander9 merged 2 commits into main from fix/customize-opencode-default-and-browser-skill
May 16, 2026

Conversation

@Alezander9 Alezander9 commented May 16, 2026

Summary

Eval-agent's deep-dive on v0.1.6 regression traces (gemini-3-flash-preview pooled n=296, -7.5pp vs v0.1.2 baseline) surfaced two skill-registration problems on browser sessions:

  1. customize-opencode is force-registered in every session. The skill ships from upstream (opencode-meta → customize-opencode, #26617 / #26899) and teaches the agent how to author opencode.json, opencode plugins, and opencode agent files. For BrowserCode browser-driving workflows it is pure system-prompt pollution: ~480 chars the model has to evaluate-and-discard every turn, with a negatively-correlated name (opencode ≠ what the binary is called now).
  2. The actually-useful browser-execute-guide.md is not a registered skill. It sits at <dataDir>/skills/browser-execute-guide.md as a plain markdown file. The only reason agents find it is the browser_execute tool description's hard requirement: "you MUST use the Read tool first to read …/browser-execute-guide.md". Eval data shows 77% of tasks do eventually read it — but it has no presence in <available_skills> at planning time, so the agent only discovers it after committing to browser_execute.

Changes

Gate customize-opencode registration on BCODE_ENABLE_CUSTOMIZE_OPENCODE=1 (default off). The built-in registration in packages/opencode/src/skill/index.ts is now wrapped in an env check. When unset / falsy, the skill is not registered at all — it doesn't appear in <available_skills> and the skill tool can't load it. The gate runs before disk discovery, so a user who places their own customize-opencode/SKILL.md on disk still gets it. Opt-in via BCODE_ENABLE_CUSTOMIZE_OPENCODE=1 for sessions that are genuinely editing bcode.json or agent configs.
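The ordering described above (env gate first, disk discovery afterwards) can be sketched roughly as follows. This is a minimal illustration, not the code in packages/opencode/src/skill/index.ts: the `Skill` type, `isEnabled`, and `collectSkills` are hypothetical names, and treating only the exact value `"1"` as enabled is an assumption.

```typescript
// Hypothetical shape of a registered skill; the real type is not shown in this PR.
interface Skill {
  name: string;
  source: "builtin" | "disk";
}

// Assumption: only the exact string "1" opts in; unset or any other value is off.
function isEnabled(value: string | undefined): boolean {
  return value === "1";
}

// Hypothetical stand-in for the registration flow described above.
function collectSkills(
  env: Record<string, string | undefined>,
  diskSkills: Skill[],
): Map<string, Skill> {
  const registry = new Map<string, Skill>();

  // The gate applies to the built-in registration only.
  if (isEnabled(env["BCODE_ENABLE_CUSTOMIZE_OPENCODE"])) {
    registry.set("customize-opencode", { name: "customize-opencode", source: "builtin" });
  }

  // Disk discovery runs after the gate and is unaffected by it, so a
  // user-provided customize-opencode/SKILL.md is still registered.
  for (const skill of diskSkills) {
    registry.set(skill.name, skill);
  }
  return registry;
}

// Default session: the built-in skill is absent.
const defaultSession = collectSkills({}, []);
console.log(defaultSession.has("customize-opencode"));
```

With the variable unset and no copy on disk, the skill never enters the registry, which is what keeps it out of <available_skills>.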

Promote browser-execute-guide.md to a real registered skill.

  • File renamed: packages/bcode-browser/skills/browser-execute-guide.md → packages/bcode-browser/skills/browser-execute/SKILL.md (matches opencode's **/SKILL.md discovery glob).
  • Frontmatter added: name: browser-execute, description front-loads "Use ONLY when calling the browser_execute tool or driving a real browser via the Chrome DevTools Protocol. Required reading before the first browser_execute call in a session."
  • discoverSkills extended to scan the bcode-shipped skills materialization dir (<dataDir>/skills/) so the skill auto-registers without users needing a bcode.json entry.
  • browser-execute.txt tool description updated: "you MUST use the skill tool first to load the browser-execute skill" — keeps the strong MUST wording verbatim (the eval-agent confirmed it materially improves scores).
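To illustrate why the rename and the frontmatter matter, here is a small sketch of a **/SKILL.md path check and a frontmatter reader. Both `isSkillFile` and `parseSkillFrontmatter` are hypothetical helpers written for this example; the real discoverSkills implementation may match paths and parse metadata differently (for instance with a full YAML parser), and the guide body in `sample` is a placeholder.

```typescript
// Hypothetical metadata shape: only the two fields named in this PR.
interface SkillMeta {
  name: string;
  description: string;
}

// Simplified stand-in for a **/SKILL.md glob: the file must be named exactly SKILL.md.
function isSkillFile(path: string): boolean {
  return path === "SKILL.md" || path.endsWith("/SKILL.md");
}

// Reads flat `key: value` pairs from a leading `---` block.
// Assumption: no nested YAML, quoting, or multi-line values.
function parseSkillFrontmatter(markdown: string): SkillMeta | undefined {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/.exec(markdown);
  if (!match) return undefined;

  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    fields.set(line.slice(0, sep).trim(), line.slice(sep + 1).trim());
  }

  const name = fields.get("name");
  const description = fields.get("description");
  if (!name || !description) return undefined;
  return { name, description };
}

// Frontmatter values taken from this PR; the body line is a placeholder.
const sample = [
  "---",
  "name: browser-execute",
  "description: Use ONLY when calling the browser_execute tool or driving a real browser via the Chrome DevTools Protocol. Required reading before the first browser_execute call in a session.",
  "---",
  "",
  "(guide body)",
].join("\n");

console.log(isSkillFile("packages/bcode-browser/skills/browser-execute/SKILL.md"));
console.log(parseSkillFrontmatter(sample)?.name);
```

Under this sketch the old path, browser-execute-guide.md, fails the file-name check, so it could never be discovered as a skill regardless of its contents.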

Why this should help eval scores

  • Planning-time visibility: agents see browser-execute in <available_skills> before they ever pick the tool, instead of discovering the guide reactively after the first browser_execute call.
  • Cleaner system prompt on browser sessions: one relevant skill (browser-execute) instead of one irrelevant one (customize-opencode).
  • Same strong MUST read first prompting on browser_execute, now routed through the canonical skill-loading path.

Diff size

8 files, +46 / -17. Yellow-zone modifications in packages/opencode/src/skill/index.ts (gate + new scan, ~20 lines net) and packages/opencode/src/tool/browser-execute.txt (1 line). Everything else is in packages/bcode-browser/ (Green zone). Logged in maintainer-side EXCEPTIONS.md.

Test plan

  • bun typecheck from packages/opencode/
  • bun typecheck from packages/bcode-browser/
  • bun typecheck from repo root (filtered) ✓
  • bun test in packages/bcode-browser/ ✓ (13 pass, 8 skip — the 8 skips are smoke tests gated on BCODE_SMOKE_CHROME=1)
  • bun test test/skill in packages/opencode/: 14 pass, 5 fail — the 5 failures are pre-existing on main (verified by stashing my changes); unrelated to this PR.

Suggested next eval: re-run glm-5.1 and gemini-3-flash-preview baselines on a build of this branch and compare to v0.1.6.


Summary by cubic

Removed the built-in customize-opencode skill and promoted the browser execution guide to a registered browser-execute skill. This cleans up prompt noise and makes the guide visible at planning time; browser_execute now instructs loading the skill first.

  • New Features

    • Promoted the guide to a real skill: skills/browser-execute/SKILL.md with frontmatter (name: browser-execute). It auto-registers, appears in available skills, and is loaded via the skill tool. Tool text now says “MUST use the skill tool to load browser-execute.”
    • Skill discovery scans <dataDir>/skills/ so first‑party skills shipped by @browser-use/bcode-browser are picked up automatically.
  • Bug Fixes

    • Removed forced registration of customize-opencode (no env gate; it no longer ships). A user skill with the same name on disk still loads via normal discovery.

Written for commit a6ac76a. Summary will update on new commits.

…e guide to a registered skill

Eval-agent deep-dive on v0.1.6 regressed traces: the only skill registered in browser sessions was upstream's customize-opencode (opencode.json schema authoring) — pure pollution for browser-driving workflows. Meanwhile the genuinely useful browser-execute-guide.md was not a registered skill, only surfaced because the tool description said 'you MUST Read this file first.'

Two changes:

1) Gate customize-opencode built-in registration on BCODE_ENABLE_CUSTOMIZE_OPENCODE=1 (default off). A user-disk skill of the same name still loads, since the gate runs before disk discovery.

2) Rename packages/bcode-browser/skills/browser-execute-guide.md → browser-execute/SKILL.md, add frontmatter (name: browser-execute, description front-loads 'Use ONLY when calling browser_execute'), and extend discoverSkills to scan <dataDir>/skills/ where the bcode-browser package already materializes first-party skills. The skill now appears in <available_skills> at planning time and is loaded via the skill tool.

Updated browser-execute.txt to instruct 'you MUST use the skill tool first to load the browser-execute skill' — keeps the strong MUST language verbatim per user confirmation that the wording materially improves eval scores.

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 8 files

Per user: env-var gate added unnecessary surface area. The skill teaches opencode.json schema authoring; for BrowserCode that's the wrong product surface, so don't ship it at all. Removes the const + import + registration block (about 17 lines), and the now-orphaned 377-line prompt body. Net diff: -388 lines.
@Alezander9 Alezander9 merged commit c5b4f26 into main May 16, 2026
3 checks passed