Problem Statement
A common pattern when developers are told "add docstrings" is to write:
def get_user(user_id):
    """Get user."""

def process_data(data):
    """Process data."""

class UserManager:
    """User manager."""
These docstrings technically satisfy presence checks (D100–D107, missing-docstring) but have zero information value. They restate what the name already communicates. ruff and pydocstyle don't catch this — they only check presence and formatting, not whether the content is meaningful.
This is a quality-of-documentation issue that fits squarely in docvet's "enrichment" layer.
Current Behavior
Docstrings that trivially restate the symbol name produce no finding. Presence checks are satisfied.
Proposed Solution
Detection strategy
Compare the docstring summary line against the function/class name. If the summary is a trivial restatement, emit a finding.
Algorithm:
- Extract the first sentence of the docstring (summary line)
- Normalize both symbol name and summary: strip punctuation, lowercase, split snake_case and CamelCase into word sets
- Filter stop words (a, an, the, of, for, to, in, etc.)
- If the summary word set is a subset of the name word set, it's trivial
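The steps above can be sketched end to end as follows. This is a minimal illustration, not docvet's implementation: the function names `name_to_words`, `summary_to_words`, and `is_trivial_docstring` are hypothetical, the stop-word list and name-splitting regex are taken from the word-extraction snippet in this proposal, and treating the first sentence of the first paragraph as the summary is an assumption.

```python
import re

# Stop words copied from the proposal's word-extraction snippet.
STOP_WORDS = frozenset({
    "a", "an", "the", "of", "for", "to", "in", "is", "it",
    "and", "or", "this", "that", "with", "from", "by", "on",
})


def name_to_words(name: str) -> set[str]:
    """Split a snake_case or CamelCase identifier into lowercase words."""
    words: set[str] = set()
    for part in name.split("_"):
        tokens = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?=[A-Z]|$)", part)
        words.update(t.lower() for t in tokens)
    return words - STOP_WORDS


def summary_to_words(docstring: str) -> set[str]:
    """Return the content words of the docstring's first sentence."""
    # Assumption: the summary is the first sentence of the first paragraph.
    paragraph = docstring.strip().split("\n\n", 1)[0]
    first_sentence = re.split(r"\.(?:\s|$)", paragraph, maxsplit=1)[0]
    # Lowercasing and keeping only letter runs also strips punctuation.
    words = set(re.findall(r"[a-z]+", first_sentence.lower()))
    return words - STOP_WORDS


def is_trivial_docstring(name: str, docstring: str) -> bool:
    """True when the summary adds no word beyond the symbol name."""
    summary = summary_to_words(docstring)
    # Assumption: an empty summary is left to presence/format checks.
    return bool(summary) and summary <= name_to_words(name)


print(is_trivial_docstring("get_user", "Get user."))
print(is_trivial_docstring("get_user", "Fetch a user from the database by their ID."))
```

The empty-summary guard matters because the empty set is a subset of every set; without it, a docstring made up only of stop words would always be flagged.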
Examples that trigger
| Symbol | Docstring | Why trivial |
|---|---|---|
| get_user | """Get user.""" | {get, user} == {get, user} |
| process_data | """Process the data.""" | {process, data} ⊆ {process, data} (ignoring "the") |
| UserManager | """User manager.""" | {user, manager} == {user, manager} |
| calculate_total | """Calculate total.""" | {calculate, total} == {calculate, total} |
Examples that do NOT trigger
| Symbol | Docstring | Why not trivial |
|---|---|---|
| get_user | """Fetch a user from the database by their ID.""" | Adds: "database", "ID", "fetch" |
| process_data | """Apply normalization and deduplication to raw input data.""" | Describes what "process" means |
| UserManager | """Manages user lifecycle including creation, auth, and deletion.""" | Adds substantial context |
Word extraction
import re

# Common function words ignored on both sides of the comparison.
_STOP_WORDS = frozenset({
    "a", "an", "the", "of", "for", "to", "in", "is", "it",
    "and", "or", "this", "that", "with", "from", "by", "on",
})


def _name_to_words(name: str) -> set[str]:
    """Split snake_case and CamelCase into lowercase word sets."""
    parts = name.split("_")
    words: set[str] = set()
    for part in parts:
        # Match capitalized or lowercase words, or acronym runs such as "HTTP".
        tokens = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?=[A-Z]|$)", part)
        words.update(t.lower() for t in tokens)
    return words - _STOP_WORDS
Configuration
[tool.docvet.enrichment]
detect-trivial-docstrings = true # default
Acceptance Criteria
Technical Notes
Files changed: enrichment.py (~50 lines), config.py (~3 lines), tests (~60 lines)
Category: recommended — quality signal, not a hard requirement. Some trivial docstrings are acceptable in simple utility code.
False positive mitigation: The subset check is conservative: adding even one meaningful word beyond the name avoids the finding. If the false-positive rate proves too high, the subset check could be replaced with a threshold on an overlap ratio.
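One possible reading of "overlap ratio" is the fraction of summary words that already appear in the symbol name. The sketch below assumes that definition; the names `overlap_ratio` and `is_trivial` and the `threshold` parameter are hypothetical and not part of the proposal.

```python
def overlap_ratio(summary_words: set[str], name_words: set[str]) -> float:
    """Fraction of summary words that already appear in the symbol name."""
    if not summary_words:
        return 0.0  # nothing to compare; leave empty summaries to other checks
    return len(summary_words & name_words) / len(summary_words)


def is_trivial(summary_words: set[str], name_words: set[str],
               threshold: float = 1.0) -> bool:
    """Flag the summary when its overlap with the name reaches the threshold."""
    return overlap_ratio(summary_words, name_words) >= threshold


name = {"get", "user"}
print(is_trivial({"get", "user"}, name))                          # subset case
print(is_trivial({"get", "user", "record"}, name))                # one new word
print(is_trivial({"get", "user", "record"}, name, threshold=0.6)) # looser threshold
```

Under this definition a threshold of 1.0 reproduces the subset check, and lowering it flags more docstrings rather than fewer. Reducing false positives would therefore need a different definition of overlap or an additional exemption, which this proposal leaves open.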
BMAD Workflow
When ready to implement:
/bmad-bmm-quick-spec -> /bmad-bmm-quick-dev