Overall success rate is 79.7%, a slight uptick from the 79% seen over Jun 6–8.
Prompt Categories and Success Rates
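| Category | Total | Merged | Success Rate |
| --- | ---: | ---: | ---: |
| 📝 Docs | 146 | 121 | 83% |
| 🧪 Tests | 181 | 151 | 83% |
| 🔧 Chore/Deps | 295 | 240 | 81% |
| ✨ Feature | 371 | 295 | 80% |
| 🔒 Security | 137 | 106 | 77% |
| 🐛 Bug Fix | 400 | 296 | 74% |
| ♻️ Refactor | 126 | 91 | 72% |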
Note: PRs may match multiple categories; counts reflect overlapping matches.
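As a rough check of the arithmetic, the success rates in the category table can be reproduced from the Total and Merged counts. The sketch below assumes the rate is simply merged divided by total, rounded to a whole percent; the names `CATEGORY_COUNTS`, `success_rate`, and `RATES` are illustrative and not part of the analysis workflow.

```python
# Counts copied from the category table: (total PRs, merged PRs).
CATEGORY_COUNTS = {
    "Docs": (146, 121),
    "Tests": (181, 151),
    "Chore/Deps": (295, 240),
    "Feature": (371, 295),
    "Security": (137, 106),
    "Bug Fix": (400, 296),
    "Refactor": (126, 91),
}


def success_rate(total: int, merged: int) -> int:
    """Return merged/total as a whole-number percentage (assumed formula)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * merged / total)


# Whole-percent success rate per category.
RATES = {name: success_rate(t, m) for name, (t, m) in CATEGORY_COUNTS.items()}

if __name__ == "__main__":
    for name, rate in RATES.items():
        print(f"{name}: {rate}%")
```

Because PRs may match several categories, these per-category counts should not be summed to recover the overall PR total.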
Prompt Analysis
✅ Successful Prompt Patterns
Common characteristics in merged PRs:
Average prompt length: 179 words (sweet spot: 100–199 words → 82% success)
Include code blocks: 86% of merged PRs have fenced code examples vs 76% of closed
Reference specific files/functions: 83% of merged PRs
WIP labels: only 1.8% of merged PRs are marked WIP
Top action verbs in merged PRs: change, add, update, test, remove
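The characteristics listed above could be extracted from a prompt roughly as follows. This is a minimal sketch, not the method used to produce this report: the whitespace-based word count, the fence detection, the WIP regex, and the function names are all assumptions, and only the two length buckets named in this report (100–199 and 400+) are modeled.

```python
import re

# A Markdown code fence, built this way so the example does not contain a literal fence.
FENCE = "`" * 3
WIP_PATTERN = re.compile(r"\bWIP\b", re.IGNORECASE)


def length_bucket(word_count: int) -> str:
    """Map a word count to the buckets named in the report; others are lumped together."""
    if 100 <= word_count <= 199:
        return "100-199"
    if word_count >= 400:
        return "400+"
    return "other"


def prompt_features(text: str) -> dict:
    """Collect the simple signals discussed above for one prompt (hypothetical helper)."""
    word_count = len(text.split())  # whitespace-separated tokens, an assumed definition
    return {
        "word_count": word_count,
        "length_bucket": length_bucket(word_count),
        # At least one opening and one closing fence.
        "has_code_block": text.count(FENCE) >= 2,
        "is_wip": bool(WIP_PATTERN.search(text)),
    }
```

Detecting references to specific files or functions is left out because it depends on repository-specific naming.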
View Example Successful Prompts
PR #38188 — docs: fix mcp list-tools example to use --server flag → Merged
The CLI docs showed <mcp-server> as a positional argument for mcp list-tools, but --server is a required flag and the only positional argument is an optional workflow name. Following the docs literally silently passes the server name as the workflow filter.
PR #38189 — docs(quick-start): clarify githubnext/agentics origin and simplify .lock.yml paragraph → Merged
Two beginner stumbling blocks on the Quick Start page identified via the Documentation Noob Test: no context on where githubnext/agentics comes from, and a mid-setup deep-dive into the compilation model that interrupts onboarding flow.
PR #38186 — docs: add default runaway-cost guardrails to Cost Management reference → Merged
The Cost Management page now explicitly documents the built-in guardrails that prevent runaway agent spend. It also clarifies how those defaults are overridden via frontmatter and enterprise-level environment variables.
❌ Unsuccessful Prompt Patterns
Common characteristics in closed PRs:
Average prompt length: 226 words — 26% longer than merged PRs
Very long prompts (400+ words): 45% success rate — lowest of any length bucket
Over-use of vague verbs: resolve (13% closed vs 2% merged), make (18% vs 6%)
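The "26% longer" figure follows from the two average prompt lengths reported here (226 words for closed PRs, 179 for merged). A small sketch of that calculation, with an illustrative helper name:

```python
def percent_longer(value: float, baseline: float) -> int:
    """Return how much larger `value` is than `baseline`, as a whole percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return round(100 * (value - baseline) / baseline)


# Average prompt lengths reported above, in words.
CLOSED_AVG_WORDS = 226
MERGED_AVG_WORDS = 179

LONGER_BY = percent_longer(CLOSED_AVG_WORDS, MERGED_AVG_WORDS)
```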
View Example Unsuccessful Prompts
PR #38190 — feat(linters): add execcommand analyzer — flag exec.Command() in context-receiving functions → Closed
109 bare exec.Command calls exist in the codebase vs. 24 exec.CommandContext, with no automated enforcement preventing new ones from landing. This adds an execcommand analyzer (the 24th custom linter)... (Complex new analyzer, high implementation scope)
PR #38184 — Validate default AI credit guardrails in compiler output, runtime env wiring, and compile logs → Closed
The compiler defaulted max-ai-credits and max-daily-ai-credits through different paths, but the generated runtime wiring did not consistently apply the same fallback order for both metrics... (Multiple systems touched simultaneously)
PR #38168 — Close 2026-06-09 SPDD spec gaps across ET, AIC, SDK driver, parser, and experiments → Closed
Updates five reviewed specs to resolve daily SPDD findings across ET, AIC, SDK, parser, and experiments... (Batch operation with too wide a scope)
Key Insights
Prompt length matters: 100–199 word prompts achieve 82% success. Prompts with 400+ words fall to 45%. Concise, focused descriptions outperform verbose ones.
Code blocks are a strong positive signal: 86% of merged PRs include fenced code examples. Showing the before/after or exact code context significantly improves clarity.
WIP is a strong negative signal: PRs marked WIP have a 19× higher rate of being closed without merging. If a PR isn't ready, it shouldn't be opened.
Docs and tests have the highest success rates (both 83%). These tasks are well-defined, bounded, and verifiable — qualities that transfer well to Copilot prompts.
Refactoring underperforms (72%): Large-scope structural changes require more context and have higher review friction. Breaking refactors into smaller targeted PRs improves outcomes.
Recommendations
Based on today's analysis:
DO keep prompts between 100 and 199 words; this range has the highest merge rate (82%)
DO include code blocks with before/after examples or error output to ground the change
DO frame bug-fix prompts with reproduction steps or error messages (not just "fix X")
AVOID opening PRs with "WIP" in the title or body — these close 19× more often
AVOID prompts over 400 words or spanning many unrelated files/systems simultaneously
AVOID vague verbs like "resolve", "make", "check" without specific targets — prefer "add", "update", "remove" with precise scopes
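The recommendations above could be turned into a simple pre-submission check. The sketch below is hypothetical: the function name, the warning wording, and the exact-word verb matching are assumptions, and it covers only the checks that can be read off the prompt text (length, code blocks, WIP, vague verbs).

```python
import re

# A Markdown code fence, built this way so the example does not contain a literal fence.
FENCE = "`" * 3
VAGUE_VERBS = ("resolve", "make", "check")
VAGUE_PATTERN = re.compile(r"\b(" + "|".join(VAGUE_VERBS) + r")\b", re.IGNORECASE)


def review_prompt(text: str) -> list[str]:
    """Return warnings for a prompt based on the DO/AVOID list (illustrative only)."""
    warnings = []
    word_count = len(text.split())

    if word_count >= 400:
        warnings.append(f"{word_count} words: prompts of 400+ words merged least often")
    elif not 100 <= word_count <= 199:
        warnings.append(f"{word_count} words: the 100-199 word range merged most often")

    if text.count(FENCE) < 2:
        warnings.append("no fenced code block with before/after code or error output")

    if re.search(r"\bWIP\b", text, re.IGNORECASE):
        warnings.append("marked WIP")

    # Exact-word matches only; inflected forms such as "makes" are not caught.
    found = sorted({match.lower() for match in VAGUE_PATTERN.findall(text)})
    if found:
        warnings.append("vague verbs without a specific target: " + ", ".join(found))

    return warnings
```

An empty list means none of these heuristics fired; it does not predict whether a PR will be merged.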
Historical Trends
| Date | PRs | Merged | Success Rate |
| --- | ---: | ---: | ---: |
| 2026-06-09 (today) | 1,000 | 787 | 79.7% |
| 2026-06-08 | 1,000 | 794 | 79.0% |
| 2026-06-07 | 1,000 | 791 | 79.0% |
| 2026-06-06 | 1,000 | 788 | 79.0% |
| 2026-06-02 | 1,000 | 809 | 80.0% |
| 2026-06-01 | 1,000 | 802 | 80.0% |
| 2026-05-30 | 1,000 | 802 | 80.0% |
Trend: After a dip to 79% during Jun 6–8, today's rate ticked back up to 79.7%. The peak of 80.5% was observed in mid-May. The data suggests a mild regression in prompt quality over the past two weeks.
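To make the trend easier to see, the change between consecutive runs can be computed from the reported Success Rate column. This is a minimal sketch with illustrative names; it uses the reported rates as given and does not recompute them from the PR counts. The listed runs are not on consecutive days, so each change is relative to the previous listed run.

```python
# Reported success rates from the Historical Trends table, oldest first.
REPORTED_RATES = [
    ("2026-05-30", 80.0),
    ("2026-06-01", 80.0),
    ("2026-06-02", 80.0),
    ("2026-06-06", 79.0),
    ("2026-06-07", 79.0),
    ("2026-06-08", 79.0),
    ("2026-06-09", 79.7),
]


def rate_changes(series):
    """Return (date, change in percentage points vs. the previous listed run)."""
    return [
        (date, round(rate - prev_rate, 1))
        for (_, prev_rate), (date, rate) in zip(series, series[1:])
    ]


CHANGES = rate_changes(REPORTED_RATES)

if __name__ == "__main__":
    for date, change in CHANGES:
        print(f"{date}: {change:+.1f} pp")
```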
Summary
Analysis Period: Last 30 days | Run: §27236307914