Context
Hello! I'm a WordPress plugin developer who has recently implemented llms.txt generation in our SEO Repair Kit plugin. I'd like to share our implementation approach and seek feedback from the community on standards compliance and best practices.
Our Implementation
We've built a comprehensive llms.txt generator that:
- Generates markdown-formatted content with section headers (##)
- Uses absolute URLs (one per line)
- Includes sitemap references
- Groups content by post type and taxonomy
- Only includes published, publicly accessible content
- Supports custom post types and taxonomies
- Allows manual editing and customization
Sample Output Format
Generated by SEO Repair Kit, this is an llms.txt file designed to help LLMs better understand and index this website.
# Site Name
## Sitemaps
[XML Sitemap](https://example.com/sitemap_index.xml): Includes all crawlable and indexable pages.
## Posts
- [Post Title](https://example.com/post/): Post description
## Pages
- [Page Title](https://example.com/page/): Page description
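To make the format above easier to discuss, here is a minimal Python sketch of how grouped entries could be rendered into this layout. It is an illustration only, not the plugin's actual PHP code: `Entry` and `render_llms_txt` are hypothetical names, and the blank lines between sections are an assumption of this sketch rather than something our output currently includes.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One link line. Hypothetical structure, not the plugin's API."""
    title: str
    url: str
    description: str = ""


def render_llms_txt(site_name: str, sections: dict[str, list[Entry]]) -> str:
    """Render an H1 site name followed by one H2 section per group."""
    lines = [f"# {site_name}", ""]
    for heading, entries in sections.items():
        if not entries:
            continue  # skip empty groups so no bare headings are emitted
        lines.append(f"## {heading}")
        for entry in entries:
            line = f"- [{entry.title}]({entry.url})"
            if entry.description:
                line += f": {entry.description}"
            lines.append(line)
        lines.append("")  # blank line between sections (assumption)
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    example = {
        "Posts": [Entry("Post Title", "https://example.com/post/", "Post description")],
        "Pages": [Entry("Page Title", "https://example.com/page/", "Page description")],
    }
    print(render_llms_txt("Site Name", example))
```

Entries without a description are emitted as a bare link, which matches our choice not to auto-generate descriptions.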
Questions
- Markdown Format: We're using the markdown link format [Title](URL): Description. Is this the preferred format, or should we use plain URLs?
- Section Organization: We're grouping by post type (Posts, Pages, Products, etc.) and taxonomy (Categories, Tags, etc.). Is this organizational approach aligned with the standard?
- Sitemap Inclusion: Should sitemap references be included in llms.txt, or is this redundant since they're already in robots.txt?
- Content Filtering: We only include published content for security. Should we also filter by other criteria (e.g., password-protected posts, private posts)?
- Descriptions: We only use existing excerpts/descriptions (no auto-generation). Is this the recommended approach?
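To make the filtering and description questions concrete, here is a small Python sketch of the rule we have in mind. It is a simplified model, not the plugin's PHP implementation: the `Post` class is a hypothetical stand-in whose `post_status` and `post_password` fields mirror the WordPress post fields of the same names, and `is_listable` and `to_line` are illustrative helpers.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical stand-in for a WordPress post record."""
    title: str
    url: str
    post_status: str          # e.g. "publish", "private", "draft"
    post_password: str = ""   # non-empty means password-protected
    excerpt: str = ""         # existing excerpt only; never generated


def is_listable(post: Post) -> bool:
    """Keep only published posts that are not password-protected."""
    return post.post_status == "publish" and not post.post_password


def to_line(post: Post) -> str:
    """Format one entry, adding the description only if an excerpt exists."""
    line = f"- [{post.title}]({post.url})"
    excerpt = post.excerpt.strip()
    return f"{line}: {excerpt}" if excerpt else line


if __name__ == "__main__":
    posts = [
        Post("Public", "https://example.com/public/", "publish", excerpt="A public post"),
        Post("Locked", "https://example.com/locked/", "publish", post_password="secret"),
        Post("Hidden", "https://example.com/hidden/", "private"),
    ]
    for post in filter(is_listable, posts):
        print(to_line(post))
```

Under this rule, a post with status `publish` that has a password set is excluded, which is the case the Content Filtering question asks about.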
Implementation Details
- Plugin: SEO Repair Kit (WordPress.org)
- Standards Reference: Following guidelines from llmstxt.org
GitHub Profile: @cswaqaas
Location: TorontoDigits, Pakistan