From b99e18c069d8cc4f7f074390de0448ef1115104f Mon Sep 17 00:00:00 2001 From: Serhii Zhabskyi Date: Tue, 31 Mar 2026 13:42:45 +0200 Subject: [PATCH] Fix website SEO deployment and link handling --- .github/workflows/deploy-website.yml | 9 ++- tasks/todo.md | 63 +++++++++++++++++++ website/astro.config.mjs | 9 +-- website/integrations/seo-robots.mjs | 34 +++++++--- website/integrations/seo-robots.test.mjs | 43 +++++++++++++ website/package.json | 1 + website/site-url.mjs | 63 +++++++++++++++---- website/site-url.test.mjs | 45 +++++++++++++ .../content/docs/canonical-config/agents.mdx | 2 +- .../docs/canonical-config/commands.mdx | 2 +- .../content/docs/canonical-config/hooks.mdx | 2 +- .../docs/canonical-config/ignore-patterns.mdx | 2 +- .../content/docs/canonical-config/index.mdx | 16 ++--- .../docs/canonical-config/mcp-servers.mdx | 2 +- .../docs/canonical-config/permissions.mdx | 2 +- .../content/docs/canonical-config/rules.mdx | 4 +- .../content/docs/canonical-config/skills.mdx | 2 +- website/src/content/docs/cli/index.mdx | 20 +++--- .../docs/getting-started/installation.mdx | 2 +- .../docs/getting-started/quick-start.mdx | 8 +-- .../content/docs/guides/multi-tool-teams.mdx | 2 +- website/src/content/docs/index.mdx | 16 +++-- 22 files changed, 283 insertions(+), 66 deletions(-) create mode 100644 website/integrations/seo-robots.test.mjs create mode 100644 website/site-url.test.mjs diff --git a/.github/workflows/deploy-website.yml b/.github/workflows/deploy-website.yml index b333ef03..5ebbdafe 100644 --- a/.github/workflows/deploy-website.yml +++ b/.github/workflows/deploy-website.yml @@ -7,8 +7,9 @@ # # Live site: https://samplexbro.github.io/agentsmesh/ (see website/astro.config.mjs `site` + `base`). # -# SEO: set repository variable DEPLOY_SITE_URL to the exact origin you want indexed -# (HTTPS, no trailing slash), e.g. https://samplexbro.github.io or your custom domain. 
+# SEO: set repository variable DEPLOY_SITE_URL to the exact public site URL +# you want indexed, e.g. https://samplexbro.github.io/agentsmesh/ or +# https://docs.agentsmesh.dev/. Custom-domain builds emit CNAME automatically. # Configure DNS or CDN to 301 the non-canonical hostname (www ↔ apex) to that URL. name: Deploy Website @@ -55,6 +56,10 @@ jobs: working-directory: website run: pnpm install --frozen-lockfile + - name: Run website tests + working-directory: website + run: pnpm test + - name: Build website working-directory: website env: diff --git a/tasks/todo.md b/tasks/todo.md index b24f57a6..e05d0218 100644 --- a/tasks/todo.md +++ b/tasks/todo.md @@ -1,3 +1,59 @@ +# Website SEO hardening pass + +- [x] Reproduce the current website SEO artifact/output state and confirm root causes for the reported issues +- [x] Add failing tests first for deploy URL parsing and generated SEO artifacts (`robots.txt`, canonical URL/base handling, custom-domain `CNAME`) +- [x] Implement the minimal website config/build changes to fix the reproducible issues +- [x] Improve crawlable homepage copy to address the low text-to-code warning without changing product meaning +- [x] Run website verification plus post-feature QA and append review notes + +## Review (Website SEO hardening pass) + +- Changes implemented: + - replaced the website’s hardcoded GitHub Pages path assumptions with a single resolved `DEPLOY_SITE_URL` contract that derives origin, base path, public URLs, and optional custom-domain `CNAME` output from one source of truth + - expanded the SEO build integration so it always writes `robots.txt` and writes `CNAME` automatically for custom-domain deployments + - converted hardcoded `/agentsmesh/...` doc links to base-agnostic relative links so the docs build works both on the current GitHub Pages project URL and on a root custom domain + - added more crawlable homepage copy to improve the low text-to-code ratio without changing the product positioning + - added website 
unit tests and wired them into `website/package.json` plus the deploy workflow so the SEO behavior is enforced in automation +- Tests added: + - `website/site-url.test.mjs` + - `website/integrations/seo-robots.test.mjs` +- Verification: + - `node --test website/site-url.test.mjs website/integrations/seo-robots.test.mjs` + - `node ./node_modules/astro/astro.js build` (in `website/`) + - `DEPLOY_SITE_URL=https://docs.agentsmesh.dev/ node ./node_modules/astro/astro.js build` (in `website/`) + - `rg -n 'href="/agentsmesh/|src="/agentsmesh/|https://docs.agentsmesh.dev/agentsmesh' website/dist -g '*.html'` + - inspected generated `website/dist/robots.txt` and custom-domain `website/dist/CNAME` +- QA Report — Website SEO hardening pass + +### Acceptance Criteria + +| Criterion | Covered by test? | Status | +| --- | --- | --- | +| Deploy URL handling supports both GitHub Pages project paths and root custom domains | `website/site-url.test.mjs`, default/custom Astro builds | OK | +| SEO artifact generation emits the correct `robots.txt` and optional `CNAME` from the same deploy URL | `website/integrations/seo-robots.test.mjs`, custom-domain Astro build | OK | +| Internal docs links remain valid when the website base path changes | default/custom Astro builds plus no `/agentsmesh/` matches in custom-domain HTML output | OK | +| Homepage ships more crawlable explanatory copy to help the text-to-code warning | updated `website/src/content/docs/index.mdx`, verified in Astro build output | OK | +| Website deploy automation exercises the new SEO tests before building | `website/package.json`, `.github/workflows/deploy-website.yml` | OK | + +### Edge Cases + +| Scenario | Covered? 
| Test location | +| --- | --- | --- | +| Project-site deployment under `/agentsmesh` keeps the base path in canonical URLs and sitemap links | ✓ | `website/site-url.test.mjs` | +| Root custom-domain deployment drops the `/agentsmesh` path entirely | ✓ | `website/site-url.test.mjs`, custom `astro build` | +| GitHub Pages host does not emit an unnecessary `CNAME` file | ✓ | `website/site-url.test.mjs`, default `astro build` | +| Custom-domain host emits `CNAME` and root sitemap URL | ✓ | `website/integrations/seo-robots.test.mjs`, custom `astro build` | +| Built HTML for a root custom domain contains no stale `/agentsmesh` href/src references | ✓ | `rg -n 'href="/agentsmesh/|src="/agentsmesh/|https://docs.agentsmesh.dev/agentsmesh' website/dist -g '*.html'` | + +### Gaps Identified + +- none in the implemented website changes; the live site still needs `DEPLOY_SITE_URL` pointed at a root custom domain plus DNS hostname redirection for the SEO report to stop seeing the GitHub Pages project-path redirect + +### Actions Taken + +- proved the current GitHub Pages project-path build already generated `robots.txt`, then fixed the underlying deploy contract so the site can also ship from a root custom domain where root-level SEO files and non-redirected status codes are possible +- protected the new behavior with unit tests and deploy-workflow coverage instead of leaving it as a one-off config tweak + # TypeScript error repair pass - [x] Run `pnpm typecheck` and capture the full current TypeScript error set @@ -218,6 +274,13 @@ | Criterion | Covered by test? 
| Status | | --- | --- | --- | + +# Website SEO issue repair pass + +- [ ] Reproduce the current website SEO output and confirm which issues are fixable in-repo vs blocked by hosting/domain constraints +- [ ] Add failing tests first for the website SEO artifacts and URL behavior we can enforce from the repo +- [ ] Implement the minimal website/deploy changes to fix the reproducible SEO issues +- [ ] Run verification plus post-feature QA and append review notes | Recover the global branch threshold without lowering config | `pnpm test:coverage -- --coverage.reporter=json-summary --coverage.reporter=text-summary` | OK | | Add targeted branch tests instead of broad fixture churn | `tests/unit/config/git-remote.test.ts`, `tests/unit/install/install-manifest.test.ts`, `tests/unit/install/git-pin.test.ts`, `tests/unit/install/install-conflicts.branches.test.ts` | OK | | Keep full suite stable under coverage load | `tests/unit/cli/commands/watch.test.ts`, full `pnpm test:coverage` run | OK | diff --git a/website/astro.config.mjs b/website/astro.config.mjs index a865c3ca..8670cbba 100644 --- a/website/astro.config.mjs +++ b/website/astro.config.mjs @@ -2,15 +2,16 @@ import { defineConfig } from 'astro/config'; import starlight from '@astrojs/starlight'; import seoRobotsIntegration from './integrations/seo-robots.mjs'; -import { absoluteFromBase, getSiteOrigin } from './site-url.mjs'; +import { absoluteFromBase, fromBase, getSiteBase, getSiteOrigin, resolveDeploySite } from './site-url.mjs'; +const deploySite = resolveDeploySite(); const site = getSiteOrigin(); const ogImage = absoluteFromBase('/og-image.png'); export default defineConfig({ site, trailingSlash: 'always', - base: '/agentsmesh', + base: getSiteBase(), integrations: [ starlight({ title: 'AgentsMesh', @@ -52,7 +53,7 @@ export default defineConfig({ }, { tag: 'link', - attrs: { rel: 'icon', href: '/agentsmesh/favicon.svg', type: 'image/svg+xml' }, + attrs: { rel: 'icon', href: fromBase('/favicon.svg'), type: 
'image/svg+xml' }, }, ], sidebar: [ @@ -123,6 +124,6 @@ export default defineConfig({ }, ], }), - seoRobotsIntegration(() => getSiteOrigin()), + seoRobotsIntegration(() => deploySite.publicUrl), ], }); diff --git a/website/integrations/seo-robots.mjs b/website/integrations/seo-robots.mjs index 33d7cf22..1d5bff7d 100644 --- a/website/integrations/seo-robots.mjs +++ b/website/integrations/seo-robots.mjs @@ -1,20 +1,36 @@ import { writeFileSync } from 'node:fs'; +import { getCnameValue, resolveDeploySite } from '../site-url.mjs'; + +export function buildRobotsTxt(raw = null) { + const { publicUrl } = resolveDeploySite(raw); + return `User-agent: * +Allow: / + +Sitemap: ${publicUrl}/sitemap-index.xml +`; +} + +export function buildSeoArtifacts(raw = null) { + const artifacts = [{ fileName: 'robots.txt', content: buildRobotsTxt(raw) }]; + const cname = getCnameValue(raw); + if (cname) { + artifacts.push({ fileName: 'CNAME', content: cname }); + } + return artifacts; +} + /** - * @param {() => string} getOrigin Host-only HTTPS URL, no trailing slash + * @param {() => string | null | undefined} getDeploySiteUrl Full public site URL */ -export default function seoRobotsIntegration(getOrigin) { +export default function seoRobotsIntegration(getDeploySiteUrl) { return { name: 'seo-robots', hooks: { 'astro:build:done': ({ dir }) => { - const origin = getOrigin().replace(/\/$/, ''); - const body = `User-agent: * -Allow: / - -Sitemap: ${origin}/agentsmesh/sitemap-index.xml -`; - writeFileSync(new URL('robots.txt', dir), body, 'utf8'); + for (const artifact of buildSeoArtifacts(getDeploySiteUrl())) { + writeFileSync(new URL(artifact.fileName, dir), artifact.content, 'utf8'); + } }, }, }; diff --git a/website/integrations/seo-robots.test.mjs b/website/integrations/seo-robots.test.mjs new file mode 100644 index 00000000..3f09e6b3 --- /dev/null +++ b/website/integrations/seo-robots.test.mjs @@ -0,0 +1,43 @@ +import test from 'node:test'; +import assert from 'node:assert/strict'; + 
+import { buildSeoArtifacts, buildRobotsTxt } from './seo-robots.mjs'; + +test('buildRobotsTxt points crawlers at the public sitemap URL', () => { + assert.equal( + buildRobotsTxt('https://samplexbro.github.io/agentsmesh/'), + `User-agent: * +Allow: / + +Sitemap: https://samplexbro.github.io/agentsmesh/sitemap-index.xml +`, + ); +}); + +test('buildSeoArtifacts adds CNAME for custom domains and skips it for github.io', () => { + assert.deepEqual(buildSeoArtifacts('https://samplexbro.github.io/agentsmesh/'), [ + { + fileName: 'robots.txt', + content: `User-agent: * +Allow: / + +Sitemap: https://samplexbro.github.io/agentsmesh/sitemap-index.xml +`, + }, + ]); + + assert.deepEqual(buildSeoArtifacts('https://docs.agentsmesh.dev/'), [ + { + fileName: 'robots.txt', + content: `User-agent: * +Allow: / + +Sitemap: https://docs.agentsmesh.dev/sitemap-index.xml +`, + }, + { + fileName: 'CNAME', + content: 'docs.agentsmesh.dev\n', + }, + ]); +}); diff --git a/website/package.json b/website/package.json index 693c8813..dcf9cc16 100644 --- a/website/package.json +++ b/website/package.json @@ -5,6 +5,7 @@ "private": true, "scripts": { "dev": "astro dev", + "test": "node --test", "build": "astro build", "preview": "astro preview", "astro": "astro" diff --git a/website/site-url.mjs b/website/site-url.mjs index 15cf0bbd..fb8c9d2d 100644 --- a/website/site-url.mjs +++ b/website/site-url.mjs @@ -1,23 +1,62 @@ +const DEFAULT_DEPLOY_SITE_URL = 'https://samplexbro.github.io/agentsmesh/'; + +function normalizeBasePath(pathname) { + const trimmed = pathname.replace(/\/+$/, ''); + return trimmed === '' ? '/' : trimmed; +} + /** - * Single source of truth for the docs site origin. Set DEPLOY_SITE_URL in CI - * (GitHub repository variable) to your indexed hostname — e.g. apex HTTPS URL - * with no trailing slash. Configure DNS/CDN to 301 the non-canonical host (www vs apex). + * Single source of truth for the docs site's public URL. 
+ * Set DEPLOY_SITE_URL in CI to the exact indexed website URL: + * - GitHub Pages project site: https://samplexbro.github.io/agentsmesh/ + * - Custom domain at root: https://docs.agentsmesh.dev/ */ - -/** @returns {string} e.g. https://samplexbro.github.io */ -export function getSiteOrigin() { - const raw = +export function resolveDeploySite(raw = null) { + const value = + raw?.trim() || process.env.DEPLOY_SITE_URL?.trim() || process.env.SITE_URL?.trim() || - 'https://samplexbro.github.io'; - return raw.replace(/\/$/, ''); + DEFAULT_DEPLOY_SITE_URL; + const url = new URL(value); + const basePath = normalizeBasePath(url.pathname); + const publicUrl = `${url.origin}${basePath === '/' ? '' : basePath}`; + + return { + origin: url.origin, + basePath, + publicUrl, + hostname: url.hostname, + }; +} + +/** @returns {string} e.g. https://samplexbro.github.io */ +export function getSiteOrigin(raw = null) { + return resolveDeploySite(raw).origin; +} + +/** @returns {string} e.g. /agentsmesh or / */ +export function getSiteBase(raw = null) { + return resolveDeploySite(raw).basePath; } /** @param {string} pathWithLeadingSlash path after base, e.g. /og-image.png */ -export function absoluteFromBase(pathWithLeadingSlash) { - const origin = getSiteOrigin(); +export function fromBase(pathWithLeadingSlash, raw = null) { const suffix = pathWithLeadingSlash.startsWith('/') ? pathWithLeadingSlash : `/${pathWithLeadingSlash}`; - return `${origin}/agentsmesh${suffix}`; + const basePath = getSiteBase(raw); + return basePath === '/' ? suffix : `${basePath}${suffix}`; +} + +/** @param {string} pathWithLeadingSlash path after base, e.g. /og-image.png */ +export function absoluteFromBase(pathWithLeadingSlash, raw = null) { + const suffix = pathWithLeadingSlash.startsWith('/') + ? 
pathWithLeadingSlash + : `/${pathWithLeadingSlash}`; + return `${resolveDeploySite(raw).publicUrl}${suffix}`; +} + +export function getCnameValue(raw = null) { + const { hostname } = resolveDeploySite(raw); + return hostname.endsWith('.github.io') ? null : `${hostname}\n`; } diff --git a/website/site-url.test.mjs b/website/site-url.test.mjs new file mode 100644 index 00000000..a342e269 --- /dev/null +++ b/website/site-url.test.mjs @@ -0,0 +1,45 @@ +import test from 'node:test'; +import assert from 'node:assert/strict'; + +import { + absoluteFromBase, + fromBase, + getCnameValue, + resolveDeploySite, +} from './site-url.mjs'; + +test('resolveDeploySite keeps the GitHub Pages project path as base', () => { + const deploySite = resolveDeploySite('https://samplexbro.github.io/agentsmesh/'); + + assert.deepEqual(deploySite, { + origin: 'https://samplexbro.github.io', + basePath: '/agentsmesh', + publicUrl: 'https://samplexbro.github.io/agentsmesh', + hostname: 'samplexbro.github.io', + }); +}); + +test('resolveDeploySite supports a custom domain at the site root', () => { + const deploySite = resolveDeploySite('https://docs.agentsmesh.dev/'); + + assert.deepEqual(deploySite, { + origin: 'https://docs.agentsmesh.dev', + basePath: '/', + publicUrl: 'https://docs.agentsmesh.dev', + hostname: 'docs.agentsmesh.dev', + }); +}); + +test('fromBase and absoluteFromBase honor the resolved base path', () => { + assert.equal(fromBase('/favicon.svg', 'https://samplexbro.github.io/agentsmesh/'), '/agentsmesh/favicon.svg'); + assert.equal(fromBase('/favicon.svg', 'https://docs.agentsmesh.dev/'), '/favicon.svg'); + assert.equal( + absoluteFromBase('/og-image.png', 'https://docs.agentsmesh.dev/'), + 'https://docs.agentsmesh.dev/og-image.png', + ); +}); + +test('getCnameValue only emits a CNAME record for custom domains', () => { + assert.equal(getCnameValue('https://samplexbro.github.io/agentsmesh/'), null); + assert.equal(getCnameValue('https://docs.agentsmesh.dev/'), 
'docs.agentsmesh.dev\n'); +}); diff --git a/website/src/content/docs/canonical-config/agents.mdx b/website/src/content/docs/canonical-config/agents.mdx index b4258e1f..bea18397 100644 --- a/website/src/content/docs/canonical-config/agents.mdx +++ b/website/src/content/docs/canonical-config/agents.mdx @@ -60,7 +60,7 @@ Suggest concrete fixes, not vague recommendations. ## Tool-specific behavior -See the **Agents** row in the [supported tools matrix](/agentsmesh/reference/supported-tools/) for per-target support levels (native, embedded, or unsupported). +See the **Agents** row in the [supported tools matrix](../reference/supported-tools/) for per-target support levels (native, embedded, or unsupported).
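Review note: the `DEPLOY_SITE_URL` contract introduced in `website/site-url.mjs` can be illustrated with a condensed sketch. This is not the patch's code. `describeDeployUrl` is a hypothetical helper named only for this example, and it assumes nothing beyond the standard `URL` global. It shows how a single deploy URL could yield the base path, sitemap URL, favicon href, and optional CNAME host for the two deployments the tests cover.

```javascript
// Hypothetical helper for illustration only; the real logic lives in website/site-url.mjs.
function describeDeployUrl(deployUrl) {
  const url = new URL(deployUrl);
  // Strip trailing slashes; an empty result means the site is served from the root.
  const trimmed = url.pathname.replace(/\/+$/, '');
  const basePath = trimmed === '' ? '/' : trimmed;
  const publicUrl = basePath === '/' ? url.origin : `${url.origin}${basePath}`;
  return {
    basePath,
    sitemap: `${publicUrl}/sitemap-index.xml`,
    favicon: basePath === '/' ? '/favicon.svg' : `${basePath}/favicon.svg`,
    // Mirrors the patch's rule: *.github.io hosts get no CNAME, any other host does.
    cname: url.hostname.endsWith('.github.io') ? null : url.hostname,
  };
}

const projectSite = describeDeployUrl('https://samplexbro.github.io/agentsmesh/');
const customDomain = describeDeployUrl('https://docs.agentsmesh.dev/');
console.log(projectSite);
console.log(customDomain);
```

Deriving every value from one parsed URL is what lets the same build serve both a project path and a root custom domain without hardcoded `/agentsmesh` prefixes.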