Feature: AI Stylistic Tell Detection — Tricolon, Stacked Sentences, and AI Vocabulary Fingerprints #69

@craigtrim


Inspiration

From a widely-shared LinkedIn post by Cassidy Johnston:

"It's not the emdash, it's the word 'quiet' in some form in your post along with things in threes ('It's not this, it's that, or that.' 'Profound short statement. Other profound short statement. Mic drop.')

And sentences
Structured
Like this

That tell me that you're using AI to write your posts."

This practitioner-level observation maps directly onto measurable stylometric features. The patterns signal AI-generated text not because humans cannot write them, but because they appear in LLM output with statistically anomalous frequency and co-occurrence.

Identified AI Stylistic Tells

1. Tricolon / Rule of Three

LLMs have a strong bias toward three-part structures:

  • "It's not this, it's that, or that."
  • "First thing. Second thing. Third thing."

Detectable as: sentence-level syntactic tricolon; comma-separated list-of-three patterns within a sentence.
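
One possible way to approximate both tricolon signals is sketched below. The function names, the naive sentence splitter, the requirement that the third list item start with "and" or "or", and the `max_words=4` cutoff are all illustrative assumptions, not part of any existing module.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def count_comma_tricolons(text):
    """Count sentences with exactly three comma-separated parts where the
    third part starts with 'and' or 'or' (e.g. 'A, B, or C')."""
    count = 0
    for sentence in split_sentences(text):
        parts = [p.strip() for p in sentence.split(",")]
        if len(parts) == 3 and re.match(r"(?:and|or)\b", parts[2], re.IGNORECASE):
            count += 1
    return count


def count_sentence_tricolons(text, max_words=4):
    """Count non-overlapping runs of three consecutive sentences that each
    have at most max_words words (hypothetical threshold)."""
    lengths = [len(s.split()) for s in split_sentences(text)]
    count, i = 0, 0
    while i + 3 <= len(lengths):
        if all(n <= max_words for n in lengths[i:i + 3]):
            count += 1
            i += 3  # skip past the matched run so runs do not overlap
        else:
            i += 1
    return count
```

This sketch misses lists of three without a conjunction ("A, B, C") and would need a real sentence tokenizer to handle abbreviations and quotations.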

2. Stacked Single-Sentence Paragraphs

And sentences
Structured
Like this

Each "sentence" is a single word or short phrase on its own line, used for dramatic effect. LLMs overuse this pattern as a rhetorical device.

Detectable as: paragraph length distribution; ratio of 1-word and 2-word paragraphs; stacked ultra-short-sentence sequences.
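
A minimal sketch of the paragraph-length signals follows. It assumes every non-empty line counts as its own "paragraph", which matches the stacked example above but may not suit prose with hard-wrapped lines. The function names and the `max_words` defaults are hypothetical.

```python
def paragraph_lengths(text):
    """Word count of each non-empty line, treating each line as a paragraph
    (assumption: the input is not hard-wrapped)."""
    return [len(line.split()) for line in text.splitlines() if line.strip()]


def short_paragraph_ratio(text, max_words=2):
    """Share of paragraphs with at most max_words words."""
    lengths = paragraph_lengths(text)
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n <= max_words) / len(lengths)


def longest_short_run(text, max_words=3):
    """Length of the longest run of consecutive ultra-short paragraphs."""
    longest = current = 0
    for n in paragraph_lengths(text):
        current = current + 1 if n <= max_words else 0
        longest = max(longest, current)
    return longest
```

Applied to the quoted post fragment ("And sentences / Structured / Like this" followed by one full sentence), the ratio of paragraphs with two or fewer words is 0.75 and the longest short run is 3.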

3. AI Vocabulary Fingerprints

Certain words appear with statistically elevated frequency in AI-generated text:

  • quiet / quietly
  • delve, tapestry, nuanced, robust, pivotal, transformative
  • the "it's not X, it's Y" framing (a phrasal pattern rather than a single word)

Detectable as: word-level frequency anomaly against human baseline corpora; presence of known AI-preferred lexical items.
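
The vocabulary signal could be sketched roughly as below. The word list is taken from the bullets above. The tokenizer, the per-1000-token rate, the regex for the "it's not X, it's Y" framing, and the use of a caller-supplied baseline text are assumptions made for illustration. No real baseline corpus or reference frequencies are implied.

```python
import re
from collections import Counter

# Lexical items listed in this issue; exact-match only (no stemming).
AI_WORDS = frozenset({
    "quiet", "quietly", "delve", "tapestry",
    "nuanced", "robust", "pivotal", "transformative",
})

# Rough pattern for "it's not X, it's Y" within one sentence (assumption).
NOT_X_ITS_Y = re.compile(r"\bit['’]?s not\b[^.!?]*?,\s*it['’]?s\b", re.IGNORECASE)


def tokenize(text):
    """Lowercase alphabetic tokens; apostrophes are kept inside words."""
    return re.findall(r"[a-z']+", text.lower())


def ai_word_hits(text):
    """Counter of listed words found in the text."""
    return Counter(t for t in tokenize(text) if t in AI_WORDS)


def ai_word_rate(text, per=1000):
    """Listed-word occurrences per `per` tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return per * sum(ai_word_hits(text).values()) / len(tokens)


def excess_over_baseline(text, baseline_text):
    """Difference in rate against a caller-supplied human baseline text."""
    return ai_word_rate(text) - ai_word_rate(baseline_text)


def count_not_x_its_y(text):
    """Number of 'it's not X, it's Y' constructions."""
    return len(NOT_X_ITS_Y.findall(text))
```

Inflected forms such as "delves" or "delving" are not matched here. A lemmatizer or a wider word list would be needed to catch them.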

4. Mic Drop Sentence Structure

Pattern: [Profound claim]. [Restatement]. [Short punchy close].

Detectable as: sentence length sequence — long, medium, very short — appearing in final paragraph position.
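
As a rough illustration of the shape check, the sketch below looks at the last three sentences of the final paragraph and tests for strictly descending word counts ending in a very short close. The blank-line paragraph split, the naive sentence splitter, and the `max_close_words=3` cutoff are assumptions. The function names are hypothetical.

```python
import re


def final_sentence_lengths(text, n=3):
    """Word counts of the last n sentences in the final paragraph.
    Paragraphs are assumed to be separated by blank lines."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    if not paragraphs:
        return []
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraphs[-1].strip()) if s]
    return [len(s.split()) for s in sentences[-n:]]


def has_mic_drop(text, max_close_words=3):
    """True if the final three sentences run long > medium > short and the
    closing sentence has at most max_close_words words."""
    lengths = final_sentence_lengths(text, 3)
    if len(lengths) < 3:
        return False
    long_len, medium_len, short_len = lengths
    return long_len > medium_len > short_len and short_len <= max_close_words
```

Word count is only a proxy for "long, medium, very short". Character or syllable counts might work equally well and would need to be compared on real data.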

Relationship to Existing Work

This directly extends:

Proposed Implementation

  • Tricolon detector: identify comma-list-of-three and three-sentence sequences
  • Stacked short paragraph detector: sequences of 1–3 word paragraphs
  • AI vocabulary frequency scorer: flag elevated use of known LLM-preferred words against a baseline
  • Mic drop sentence shape detector: length-descending terminal sentence sequences
  • Aggregate "AI tell score" combining the above signals
  • CLI output and integration into the mega-meta pipeline
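
One way the aggregate score might be combined is a weighted average of per-detector signals, sketched below. It assumes each detector has already been normalized to the range 0 to 1. The `TellSignals` container, the equal default weights, and the clamping behavior are all hypothetical choices, not a settled design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TellSignals:
    """Per-detector signals, each assumed to be pre-normalized to [0, 1]."""
    tricolon: float
    stacked_paragraphs: float
    ai_vocabulary: float
    mic_drop: float


# Equal weights are a placeholder; real weights would need calibration.
DEFAULT_WEIGHTS = {
    "tricolon": 0.25,
    "stacked_paragraphs": 0.25,
    "ai_vocabulary": 0.25,
    "mic_drop": 0.25,
}


def _clamp(value):
    """Restrict a signal to the [0, 1] range."""
    return max(0.0, min(1.0, value))


def ai_tell_score(signals, weights=None):
    """Weighted average of clamped signals; returns a value in [0, 1]."""
    weights = weights or DEFAULT_WEIGHTS
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(w * _clamp(getattr(signals, name)) for name, w in weights.items())
    return weighted / total
```

For example, `ai_tell_score(TellSignals(1.0, 0.5, 0.0, 1.0))` returns 0.625 with the default weights. A linear average ignores the co-occurrence effect described earlier in this issue, so an interaction term or a learned model may be a better fit once labeled data exists.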

Why This Matters

These features are useful not only for AI detection but also for style coaching. A human writer who wants to sound less AI-generated can use these scores as a mirror. Conversely, when validating LLM output against a human author's tonality (#68), high AI-tell scores are a direct conformance penalty.

Labels: enhancement