Problem
The formatter, diagnostics, and parser all enforce implicit rules about what
valid vimdoc looks like — but those rules are nowhere written down
prescriptively. This blocks spec-enforcement diagnostics and makes it
impossible to confidently answer "is this correct vimdoc?"
Empirical grounding
The corpus work on 135 Neovim 0.12 runtime help files provides ground truth:
- Line width: tw=78 is universal; 96.2% of prose lines already fit
- Tab stops: ts=8; tab characters are structural (command-reference
  column alignment), not prose whitespace
- Sentence spacing: double-space after ./?/! is a live convention
  (confirmed in api.txt and others)
- Separator threshold: = or - repeated ≥10 times on its own line; the
  threshold avoids false positives on things like -- comment or option
  strings
- Code fences: both > (trailing on a prose line) and >language (alone
  on its own line) start code blocks; a blank line or < ends them; an
  unindented non-< line also ends the block
- List items: - , * , • (unordered) and N. (ordered); * tag * without
  adjacent space is a tag definition, not a list item
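
The block-level findings above can be sketched as a small line classifier. This is a minimal illustration, not the project's parser: the function name `classify`, the regex names, and the precedence order are assumptions made for the example, and the fence patterns only approximate the rules as stated in the list.

```python
import re

# Assumption: a separator is one character ('=' or '-') repeated at least
# 10 times, alone on its line (trailing whitespace tolerated).
SEPARATOR_RE = re.compile(r"^(=|-)\1{9,}\s*$")

# Assumption: ">" or ">language" alone on a line opens a code block ...
FENCE_ALONE_RE = re.compile(r"^>[A-Za-z0-9_]*$")
# ... and so does a ">" that trails a prose line after whitespace.
# Requiring whitespace before ">" keeps "->" from being read as a fence.
FENCE_TRAILING_RE = re.compile(r"\s>$")

# Unordered markers "- ", "* ", "• " and ordered markers "N. ".
LIST_ITEM_RE = re.compile(r"^\s*(?:[-*•]\s|\d+\.\s)")

# A tag definition has no space next to either asterisk.
TAG_DEF_RE = re.compile(r"\*[^*\s|]+\*")


def classify(line: str) -> str:
    """Return a coarse label for one vimdoc line (hypothetical helper)."""
    text = line.rstrip("\n")
    if SEPARATOR_RE.match(text):
        return "separator"
    if FENCE_ALONE_RE.match(text) or FENCE_TRAILING_RE.search(text):
        return "fence_start"
    if LIST_ITEM_RE.match(text):
        return "list_item"
    if TAG_DEF_RE.search(text):
        return "tag_definition"
    return "prose"
```

With these patterns, `-- comment` and a nine-character run of dashes both fall through to `prose`, which is the false-positive case the ≥10 threshold is meant to avoid.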
Scope
A prescriptive spec document (docs/spec.md) covering:
- Block structure: separators, headings (text + right-justified *tag*),
  code blocks, list items, blank lines
- Inline spans: tag definitions (*tag*), tag references (|taglink|),
  code spans (`code`), tab-aligned columns
- Whitespace conventions: tw=78, ts=8, sentence spacing, trailing
  whitespace
- Disambiguation rules: * tag * vs list item, -> not a fence, ordered
  list vs TOC heading annotation, valid taglink syntax (no spaces, no pipe
  inside backtick span)
- Edge cases documented by corpus work: blank line terminates code block
(and its diagnostic implications), >language code fence, pipe characters
in prose that are not taglinks
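
As a rough illustration of how the inline and whitespace rules in this scope could be checked, here is a hedged sketch. The helper names (`display_width`, `find_taglinks`, `whitespace_issues`) and the issue labels are hypothetical, and `len()` is used as a stand-in for display width, which is only accurate for single-width characters.

```python
import re

TEXT_WIDTH = 78  # tw=78
TAB_STOP = 8     # ts=8

CODE_SPAN_RE = re.compile(r"`[^`]*`")
# Assumption: a taglink is |name| where name has no spaces and no pipes.
TAGLINK_RE = re.compile(r"\|([^|\s]+)\|")


def display_width(line: str) -> int:
    """Width of the line once tabs are expanded to 8-column tab stops."""
    return len(line.expandtabs(TAB_STOP))


def find_taglinks(line: str) -> list[str]:
    """Return taglink targets, ignoring pipes inside backtick code spans."""
    # Blank out code spans so any pipe inside them cannot form a taglink.
    masked = CODE_SPAN_RE.sub(lambda m: " " * len(m.group()), line)
    return TAGLINK_RE.findall(masked)


def whitespace_issues(line: str) -> list[str]:
    """Report width and trailing-whitespace problems for one line."""
    issues = []
    if display_width(line) > TEXT_WIDTH:
        issues.append("too_wide")
    if line != line.rstrip():
        issues.append("trailing_whitespace")
    return issues
```

Under these assumptions, `a | b | c` yields no taglinks because each candidate contains a space, which matches the "pipe characters in prose that are not taglinks" edge case.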
Prerequisite for
invalid code fence syntax)
definition of what constitutes a valid taglink