feat(skills): improve agent-tty skill score 85% to 99% by rohan-tessl · Pull Request #121 · coder/agent-tty

rohan-tessl · 2026-06-04T12:41:01Z

Hey @ThomasK33 👋 I really like how the bootstrap skill delegates to the CLI as the source of truth while still being discoverable by agents. The "Playwright for terminals" framing in the README nails the positioning, and the Codex/Claude demo videos showing real agents using the tool are a nice touch.

I ran your skill through `tessl skill review` at work and found some targeted improvements. Here's the before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| agent-tty | 85% | 99% | +14% |

The description was already scoring perfectly at 100%. The content score had the most headroom, so I focused there.

Changes made

**Content improvements (biggest impact):**

- **Quick start workflow**: added a complete create, run, wait, capture, destroy example so agents can start using `agent-tty` immediately without needing to run `agent-tty skills get` first
- **TUI automation example**: added a second example showing the `--no-wait` plus `wait --screen-stable-ms` plus `send-keys` pattern for interactive TUIs
- **Command surface reference**: added a compact listing of all command groups (lifecycle, input, observe, capture, environment) so agents know what's available at a glance
- **Preserved bootstrap design**: kept the delegation to `agent-tty skills get agent-tty` for the full canonical skill with extended patterns and anti-patterns, just moved it to a clear "Full Skill and Extensions" section

**What stayed the same:**

- Description (100%, untouched)
- Frontmatter structure and advertise flag
- All three `agent-tty skills get/list` references

I also stress-tested your agent-tty skill against a few real-world task evals and it held up really well on driving an interactive nvim session with keystrokes and capturing proof artifacts. Kudos for that.

Quick honest disclosure: I work at https://github.com/tesslio, where we build tooling around skills like these. Not a pitch, just saw room for improvement and wanted to contribute.

If you want to self-improve your skills, or define your own scenarios to pressure test, just ask your agent (Claude Code, Codex, etc.) to evaluate and optimize your skill with Tessl. Ping me (@rohan-tessl) if you hit any snags.

Thanks in advance 🙏


ThomasK33 · 2026-06-05T09:21:53Z

Hey @rohan-tessl, appreciate the PR.

Adding the quick start guide and removing the statement that the CLI is the source of truth for skills defeat the purpose of a bootstrap skill, because both changes soften the requirement to load the skill via the CLI.
So I'm not sure if I want to merge this.

Also, the real-world task evals you linked are not related to the agent-tty skill, but rather to Matt Pocock's to-issue skill, which is imported and vendored in this project via the skills cli.
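
For readers unfamiliar with the bootstrap design discussed in this comment, the following is a minimal sketch of what "load the skill via the CLI" could look like from an agent harness. Only the `agent-tty skills get agent-tty` invocation comes from this thread; the helper names `skills_get_command` and `load_canonical_skill` are hypothetical, and the sketch assumes the CLI prints the skill text to stdout and exits with status 0 on success.

```python
import shutil
import subprocess
from typing import List, Optional

CLI = "agent-tty"  # binary name as used in this thread


def skills_get_command(skill: str = "agent-tty") -> List[str]:
    """Build the argv for `agent-tty skills get <skill>` (hypothetical helper)."""
    return [CLI, "skills", "get", skill]


def load_canonical_skill(skill: str = "agent-tty") -> Optional[str]:
    """Ask the CLI for the full skill text instead of trusting a vendored copy.

    Returns None when the binary is missing or the command fails.
    Assumption: the skill body is written to stdout.
    """
    if shutil.which(CLI) is None:
        return None
    result = subprocess.run(
        skills_get_command(skill),
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout
```

The point of the sketch is that the bootstrap skill carries no workflow of its own: everything an agent acts on comes back from the CLI call, so the installed CLI version and the skill text cannot drift apart.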

Hey @ThomasK33 👋 I really like how the bootstrap skill delegates to the CLI as the source of truth while still being discoverable by agents. The "Playwright for terminals" framing in the README nails the positioning, and the Codex/Claude demo videos showing real agents using the tool are a nice touch.

I ran your skill through `tessl skill review` at work and found a small, targeted improvement. Here's the before/after:

| Skill | Before | After | Change |
|-------|--------|-------|--------|
| agent-tty | 85% | 88% | +3% |

The description was already scoring perfectly at 100%. The content score had some headroom, specifically on progressive disclosure, so I focused there while keeping the bootstrap minimal.

Changes made

**Content improvements (small, additive):**

- **Core loop hint**: added a one-liner naming the create, run, wait, capture, destroy loop so agents understand the workflow shape before loading the full skill
- **Command scope summary**: added a brief listing of command categories (lifecycle, input, observation, capture) and the `--json` / `--home` flags so agents know what's available at a glance
- **Preserved bootstrap design**: kept the "source of truth" statement, kept the skill minimal, kept all three `agent-tty skills get/list` references unchanged
- Used fenced code blocks for the CLI commands (slightly easier to copy)

**What stayed the same:**

- Description (100%, untouched)
- Frontmatter structure and advertise flag
- The intentional bootstrap philosophy

Quick honest disclosure: I work at https://github.com/tesslio, where we build tooling around skills like these. Not a pitch, just saw room for improvement and wanted to contribute.

If you want to self-improve your skills, or define your own scenarios to pressure test, just ask your agent (Claude Code, Codex, etc.) to evaluate and optimize your skill with Tessl. Ping me (@rohan-tessl) if you hit any snags.

Thanks in advance 🙏

rohan-tessl wants to merge 2 commits into coder:main from rohan-tessl:improve/skill-review-optimization
