It’s been almost a month since we started collecting feedback on each individual AI review finding. Since then, almost 500 findings have been reported across 81 PRs, and roughly half of them received feedback: 140 likes, 138 dislikes, and 58 findings with comments from contributors, which are especially valuable.
The collected feedback was analyzed by GPT-5 Pro. TL;DR is below.
Full report
Summary
Overall signal on AI reviews
Across 490 AI findings, 239 received reactions (140 likes, 138 dislikes), so feedback is almost perfectly split. The split is not random, though: clear bug fixes and correctness issues are widely accepted, while rigid or heavy-handed style and safety enforcement draws most of the dislikes. Contributors often silently accept fixes by applying them without comment; when they do comment, they tend to praise precise technical catches and push back on suggestions that feel verbose, nitpicky, or out of touch with local conventions.
Where AI is clearly valuable
Contributors repeatedly appreciate AI catching hard-to-spot correctness problems: invalid code, wrong variable names, broken constants, misleading explanations, and broken links or navigation. These findings often draw likes plus replies like “Nice catch”, “good one”, or “AI is right”, and sometimes prompt maintainers to say they will re‑check the whole page for compilation or correctness. AI also adds value by consistently enforcing a few high-safety rules, such as avoiding inline secrets and hard-coded mnemonics, highlighting funds-moving actions, and flagging incomplete or misleading safety warnings, even if people sometimes trim the suggested wording.
Main tension lines
Most negative feedback clusters around AI enforcing a global style template too literally: verbose safety callouts, aggressive “no marketing” policing, overstrict placeholder rules, and assumptions that every snippet must be runnable or every image/link must follow a single pattern. Contributors often consider these suggestions noisy, context-insensitive, or even wrong when they conflict with project-specific exceptions, existing external specs, or deliberate design choices. The net result: AI is welcome as a correctness and consistency checker, but contributors want it to be more conservative, less verbose, and more aware of local conventions and exception cases.
Positive
Catching concrete code bugs and non-compiling examples
Contributors strongly appreciate AI spotting real code errors: wrong function names or identifiers, invalid syntax (end_parse instead of end_cell, incorrect Tact send mode constants, use of sender instead of sender()), inconsistent variable names (keyPair vs v4KeyPair), broken lambda examples, and bad TL‑B mappings. These findings frequently get likes and comments like “good one” or “Nice catch!”, and sometimes lead authors to say they will recompile the entire page, which shows that AI is trusted as an extra compiler and proofreader for documentation code.
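The keyPair vs v4KeyPair class of bug can be sketched in a few lines. The interface and names below are illustrative, not the actual snippet from the docs:

```typescript
// Hypothetical sketch of an inconsistent-identifier bug: the original
// snippet declared one name and referenced another, so it could not
// compile. The fixed version sticks to a single name throughout.
interface KeyPair {
  publicKey: Uint8Array;
  secretKey: Uint8Array;
}

function describeKeyPair(keyPair: KeyPair): string {
  // Before the fix this line referenced `v4KeyPair`, an undeclared
  // name, which a compile check (human or AI) immediately surfaces.
  return `public key is ${keyPair.publicKey.length} bytes`;
}

const keyPair: KeyPair = {
  publicKey: new Uint8Array(32),
  secretKey: new Uint8Array(64),
};
console.log(describeKeyPair(keyPair));
```

Compiling every snippet, as the maintainers promise to do for whole pages, catches this class of error mechanically.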
Improving copy‑paste safety and runnability of snippets
AI is praised when it makes examples more copy‑safe: replacing literal ... ellipses with comments, insisting that runnable examples use valid syntax, fixing invalid TypeScript interface declarations, and flagging partial snippets that should be labeled “Not runnable”. At least one maintainer explicitly states “The AI was on-point here. These should be variables. Code must be runnable.”, and several such findings receive both likes and “Same” replies, indicating that the “examples should either run or be marked non‑runnable” rule is considered valuable in many parts of the docs.
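The “no literal ellipses” rule turns a snippet that cannot parse into one that runs. A hedged illustration (the endpoint name is invented for this sketch, not taken from the docs):

```typescript
// Before (not copy-safe): `const endpoint = ...;` is a syntax error
// the moment a reader pastes it into a file.
//
// After (copy-safe): valid TypeScript that still clearly signals
// "replace me" via an angle-case placeholder.
const endpoint = "<YOUR_API_ENDPOINT>"; // e.g. https://example.com/api

function buildUrl(path: string): string {
  // Works even with the placeholder value, so the example compiles
  // and runs as-is while remaining obviously incomplete.
  return endpoint + path;
}

console.log(buildUrl("/v1/status"));
```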
Enforcing placeholder clarity in dangerous or confusing spots
Many accepted findings focus on placeholders: converting free‑form strings like “put receiver address” into <ANGLE_CASE> placeholders, insisting that placeholders be defined on first use, and pointing out inconsistent or misleading placeholder names. While there is some pushback in marginal cases, maintainers generally agree that in runnable examples and funds-moving code, explicit, well-defined placeholders reduce copy/paste mistakes, and they apply those suggestions with little or no argument.
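One reason explicit placeholders matter in funds-moving code is that a defined ANGLE_CASE token can be checked mechanically before anything is sent. The guard below is a sketch of that idea, not a documented helper:

```typescript
// <RECEIVER_ADDRESS> — the wallet address that will receive the
// transfer (defined here, on first use, per the convention).
const receiverAddress = "<RECEIVER_ADDRESS>";

// A placeholder left unsubstituted in funds-moving code should fail
// loudly instead of producing a transfer to a junk address.
function requireSubstituted(value: string, name: string): string {
  if (value.startsWith("<") && value.endsWith(">")) {
    throw new Error(`Placeholder ${name} was not replaced with a real value`);
  }
  return value;
}

// Throws until the reader replaces the placeholder above:
try {
  requireSubstituted(receiverAddress, "RECEIVER_ADDRESS");
} catch (e) {
  console.log((e as Error).message);
}
```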
Strengthening safety around funds and secrets
AI’s insistence on avoiding inline mnemonics and API keys, on flagging examples that send real funds, and on tightening mnemonic/private key warnings is often adopted even when the exact wording is softened. Several authors accept “Funds at risk” type suggestions, adapt them, and acknowledge that adding or improving such callouts is a “good suggestion” even if they trim the verbosity, which shows that the underlying risk detection is valued.
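The “no inline secrets” rule usually reduces to a pattern like the following, where MNEMONIC is an assumed environment variable name for this sketch, not a convention from the docs:

```typescript
// Read the mnemonic from the environment instead of hard-coding it in
// the snippet. This keeps the example copyable without ever putting a
// real secret on the page; never commit a real mnemonic.
function loadMnemonic(): string[] {
  const raw = process.env.MNEMONIC;
  if (!raw) {
    throw new Error("Set the MNEMONIC environment variable before running this example");
  }
  const words = raw.trim().split(/\s+/);
  if (words.length !== 24) {
    throw new Error(`Expected 24 mnemonic words, got ${words.length}`);
  }
  return words;
}
```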
Fixing broken links, anchors, and navigation paths
Findings that point to broken internal links, nonexistent pages, wrong anchors, mis-cased routes, or navigation entries that would 404 tend to get likes and little to no pushback. Examples include broken /patterns/reserve links, wrong language/ vs languages/ prefixes, incorrect instruction anchors, and casing mismatches in navigation paths. Contributors treat these as straightforward quality wins and occasionally extend them into broader cleanups like reconsidering uppercase filenames.
Cleaning up subtle semantic and mathematical mistakes
AI gets positive feedback for spotting semantic errors in explanations: incorrect reasoning in address derivation, wrong bounds for VarUInteger types, mismatched account state descriptions, or misleading comments about what a function does. These are the kind of issues that are easy for humans to miss but can deeply confuse readers, so contributors are grateful when AI surfaces them and generally accept these corrections without argument.
Highlighting unfinished content, TODOs, and stubs
AI reliably flags leftover TODO comments, stub pages like <Stub issue="…"/>, and inline “Needs a note” markers. Even when some stubs are intentional, maintainers still respond with “good catch” and acknowledge that such placeholders should usually not ship to readers. This positions AI as a useful guardrail against accidentally publishing incomplete work.
Nudging toward more neutral, data-backed language
Some marketing-style phrasing and unbacked claims are removed explicitly because AI called them out, with maintainers saying “AI is right. Let data speak for itself.” or upvoting findings that strip subjective language like “definitely beautiful” from technical sections. Used sparingly, this style policing is welcomed as a way to keep docs neutral and evidence-based.
Negative
Overly verbose and template-heavy safety callouts
The most consistent complaint is that AI’s safety-callout suggestions are too long and procedural, with repeated comments like “Too verbose” and “Overly verbose callout suggestion, skipping”, plus multiple dislikes on findings that demand explicit Risk/Scope/Mitigation/Environment blocks everywhere. Contributors feel that multi-paragraph danger/caution templates overwhelm pages, especially when the underlying step is small or already explained, and they prefer shorter, context-aware warnings rather than a rigid, repeated checklist.
Rigid rule enforcement that ignores local conventions and exceptions
AI often applies generic style rules where the project deliberately uses exceptions: insisting that images live under /resources/images/ when a special /resources/logo/ or landing-page setup is intentional, demanding root-absolute links in frontmatter that is designed to use relative paths, or requiring sentence case headings for what are effectively product names. Maintainers explicitly push back in these cases, saying the AI is enforcing rules “too literally” or on special-case pages that intentionally diverge, which makes those findings feel noisy or wrong rather than helpful.
Hallucinated or context-wrong findings and anchors
Several comments directly call out hallucinations or misread context: a “Hallucinated link” where the target anchor never existed, a suggestion labeled “nonsense” when it misinterprets shard address examples, and at least one case where a maintainer says “There was nothing like this in the text at that moment”. This undermines trust in style- and anchor-related findings and makes contributors wary that some AI complaints target a phantom version of the file or rest on incorrect assumptions about headings and anchors.
Suggestions that degrade clarity or conflict with product naming
When AI rewrites intros or marketing-ish sentences strictly for style compliance, maintainers sometimes say the new wording “makes the explanation unclear” or argue that terms like “seamless” are acceptable in context when describing protocol advantages. Similarly, enforcing sentence case on headings that include official names like “Testgiver TON Bot” is seen as fighting against real product naming. In these cases, contributors prioritize clarity, correctness, and brand fidelity over abstract style rules, and view AI’s rephrasing as a net regression.
Unpatchable, scuffed, or too-broad suggestions
Some suggestions bundle multiple unrelated changes into a single patch, contain nested fenced code blocks that break suggestion formatting, or introduce placeholders like <PINNED_COMMIT_OR_TAG> that cannot be applied as-is. Maintainers describe these as “bad suggestion — it's not narrow enough”, “unappliable as is”, or note that the formatting “died and suggestion got scuffed”. This raises review friction, because a human has to manually decompose or rewrite the fix, reducing the utility of the AI review.
Overzealous placeholder and link rules in already-clear contexts
AI sometimes insists that obvious placeholders like <YOUR_APP_URL> be defined inline, or that high-level schemas with detailed examples still need local definitions for abstract tokens like <TON_CONNECT_LINK_BODY>. Contributors respond that such placeholders are “pretty self-evident” or that the code block is “just an overall scheme” with detailed examples elsewhere, so extra definition text is seen as redundant clutter rather than added clarity, especially when it breaks the flow of concise reference material.
Disagreement over “all snippets must be runnable or labeled” doctrine
While many developers like runnable examples, others explicitly oppose forcing every tiny snippet to be copy‑pasteable, especially in language reference pages where short fragments are not meant to be complete programs. One maintainer notes that even var x = 0; is technically invalid at top-level in their language, and that the page intentionally uses small, illustrative non-runnable snippets. In these areas, AI’s insistence on labels and no literal ellipses is perceived as misaligned with the pedagogical style of the page and the language’s realities.
Style policing that conflicts with external specs or legacy interfaces
In some places AI tries to “fix” terms like “whitelist” or gendered pronouns inside external API descriptions or reference strings that mirror third-party interfaces, and maintainers explicitly note that these come from upstream references or contracts and cannot be easily changed. Similarly, suggestions to rewrite explorer placeholders, external URLs, or API examples sometimes clash with external guidelines and are deferred “to sort out later”. This creates friction where AI applies internal style rules to strings that must remain faithful to external systems.
Heavy enforcement of “no marketing language” where mild promotion is intended
AI regularly flags words like “seamless”, “simple”, or “easy to audit” even in sections specifically explaining protocol benefits, leading to disagreements where maintainers argue that a bit of promotional framing is acceptable and “aligns with the requirements” in those contexts. Repeated dislikes on these findings show that overly strict anti-marketing enforcement can be perceived as out of place when the documentation intentionally includes value propositions, not just dry reference content.