Noindex generated classifier URLs and SSR classifier landing pages #9

Draft
DmitryMatv wants to merge 2 commits into main from t3code/ssr-landing-pages-noindex-search-urls

@DmitryMatv
Owner

Summary

  • Server-render classifier base landing pages and keep them indexable (index, follow) with canonical headers/meta.
  • Mark generated search/variant classifier URLs as noindex, follow via X-Robots-Tag and page-level robots meta.
  • Refactor classifier page state/SSR context building in app/web.py (canonical generation, query decoding, initial SSR fallback behavior).
  • Update SEO content to use config-driven version labels (remove hardcoded UNSPSC version/date text).
  • Tighten crawl controls: disallow fragment and noisy query variants in robots.txt, and remove generated search URLs from sitemap.xml.
  • Adjust semantic search candidate sizing when reranking is enabled to improve reranker input quality.
  • Expand test coverage for SEO/indexability headers, SSR/fallback behavior, sitemap/robots assertions, and related route contracts.
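
The indexing rules above can be sketched roughly as follows. This is a minimal illustration, not the code in app/web.py: the function names `canonical_url` and `seo_signals` are hypothetical, and it assumes that any URL carrying a query string counts as a generated search/variant URL and that the canonical is sent as a `Link` header in addition to page meta.

```python
from urllib.parse import urlsplit, urlunsplit

INDEXABLE = "index, follow"
NOINDEX = "noindex, follow"


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so every variant points at the base page."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def seo_signals(url: str) -> dict:
    """Return the robots directive, canonical URL, and matching headers/meta for a URL.

    Assumption: a non-empty query string marks a generated search/variant URL.
    """
    robots = NOINDEX if urlsplit(url).query else INDEXABLE
    canonical = canonical_url(url)
    return {
        "canonical": canonical,
        "headers": {
            "X-Robots-Tag": robots,
            "Link": f'<{canonical}>; rel="canonical"',
        },
        "meta": f'<meta name="robots" content="{robots}">',
    }


if __name__ == "__main__":
    # example.com and the /classifier path are placeholders, not the real routes.
    print(seo_signals("https://example.com/classifier"))
    print(seo_signals("https://example.com/classifier?q=office+chairs"))
```

Keeping the header and the meta tag derived from one value avoids the two signals drifting apart, which is the property the header/metadata tests are described as checking.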

Testing

  • Added tests/test_classifier_page_seo.py covering base-page indexability, generated-page noindex behavior, canonical behavior, SSR fallback, and usage-tracking expectations.
  • Updated header/metadata tests (tests/test_homepage_headers.py, tests/test_request_validation_and_metadata.py) to verify robots directives and SSR-compatible page rendering.
  • Added static SEO guardrails in tests/test_static_headers.py for robots.txt disallow rules and sitemap exclusions.
  • Updated route/helper and classifier tests to match new top_k defaults and reranking semantics.
  • Not run: full test suite execution output was not provided in this patch context.
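
For the reranking change, one plausible shape of the candidate sizing and its tests is sketched below. The helper name `candidate_pool_size`, the multiplier of 4, and the cap of 100 are all assumptions made for illustration; the PR text does not state the actual values or function names.

```python
def candidate_pool_size(top_k: int, rerank: bool, multiplier: int = 4, cap: int = 100) -> int:
    """Number of candidates to fetch from semantic search before optional reranking.

    Without reranking, fetch exactly top_k. With reranking, over-fetch so the
    reranker has more candidates to choose from, bounded by `cap` but never
    fewer than top_k. The multiplier and cap are hypothetical defaults.
    """
    if top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if not rerank:
        return top_k
    return max(top_k, min(top_k * multiplier, cap))


def test_candidate_pool_size() -> None:
    assert candidate_pool_size(10, rerank=False) == 10
    assert candidate_pool_size(10, rerank=True) == 40
    assert candidate_pool_size(50, rerank=True) == 100   # capped
    assert candidate_pool_size(200, rerank=True) == 200  # never below top_k


if __name__ == "__main__":
    test_candidate_pool_size()
    print("ok")
```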

- Add SEO routing logic to index only base classifier pages and noindex search/variant URLs
- Server-render initial results on base landing pages with HTMX fallback and matching robots headers/meta
- Update robots.txt and sitemap.xml to exclude fragment/query/generated search URLs
- Source UNSPSC “current version” text from config and add regression tests for SEO behavior

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86b03f2e37

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- Disable server-side initial results rendering when any query params are present
- Keep tracking/checkout-param landing URLs indexable with canonical base URL
- Continue noindex behavior for `version`/`top_k` variant URLs
- Add SEO tests for query-param GET/HEAD behavior and SSR skip checks
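
The follow-up commit's rules can be summarized in a small decision function. This is a sketch under stated assumptions: `PagePolicy` and `classify` are hypothetical names, and only `version` and `top_k` are taken from the commit message; the real variant set presumably also covers the search parameter, whose name is not shown here.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# Parameters named in the commit message as variant markers.
VARIANT_PARAMS = frozenset({"version", "top_k"})


@dataclass(frozen=True)
class PagePolicy:
    robots: str                   # value for X-Robots-Tag and the robots meta tag
    canonical: str                # always the base URL without query or fragment
    render_initial_results: bool  # whether to server-render initial results


def classify(url: str) -> PagePolicy:
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    is_variant = any(name in VARIANT_PARAMS for name in params)
    return PagePolicy(
        robots="noindex, follow" if is_variant else "index, follow",
        canonical=canonical,
        # Any query parameter, including tracking ones, disables SSR results.
        render_initial_results=not params,
    )


if __name__ == "__main__":
    # example.com and the /classifier path are placeholders.
    for u in (
        "https://example.com/classifier",
        "https://example.com/classifier?utm_source=newsletter",
        "https://example.com/classifier?top_k=5",
    ):
        print(u, classify(u))
```

Under these assumptions a tracking-only URL stays indexable and canonicalizes to the base page, while still skipping the server-side initial results.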
@DmitryMatv
Owner Author

@codex review

DmitryMatv marked this pull request as draft on March 17, 2026, 15:55
@chatgpt-codex-connector

Codex Review: Didn't find any major issues. What shall we delve into next?
