A minimal Next.js app that proxies a Webflow sitemap, applies custom rules, and serves either a single sitemap or, when the number of URLs exceeds a configurable limit, a sitemap index.

- Proxies from your Webflow site’s auto-generated `sitemap.xml`
- Removes entries that match glob patterns (supports `*` and `**`)
- Optionally adds extra URLs
- Optionally rewrites the domain of all URLs
- Splits the final set into multiple sub-sitemaps when the total exceeds the limit (default 45,000)
- Exposed under a basePath so you can host it at `/config` alongside your Webflow site

- Fetches the source sitemap from your Webflow site (`ORIGIN_DOMAIN`), parses it, and materializes entries.
- Applies modifications in this order:
  - Remove URLs matching configured patterns
  - Add custom URLs
  - Replace origin domain with a new domain (optional)
- If the final URL count is greater than the configured limit, returns a sitemap index listing chunked sub-sitemaps.
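
The order of operations above can be sketched roughly as follows. This is an illustration only, not the app's actual code: `buildFinalUrls`, `chunk`, and the `PipelineOptions` shape are hypothetical names, and the sketch assumes that added URLs are not passed through the removal step and that no de-duplication happens.

```typescript
// Hypothetical sketch of the remove -> add -> replace-domain pipeline.
// None of these names come from the app itself.
type PipelineOptions = {
  originDomain: string; // e.g. "https://example.com" (no trailing slash)
  shouldRemove: (url: string) => boolean; // e.g. a glob matcher
  urlsToAdd: string[]; // absolute URLs, or paths starting with "/"
  domainToReplace: string; // "" disables the rewrite
};

function buildFinalUrls(sourceUrls: string[], opts: PipelineOptions): string[] {
  // 1. Remove URLs matching the configured patterns.
  const kept = sourceUrls.filter((url) => !opts.shouldRemove(url));

  // 2. Add custom URLs; path-based ones are prefixed with the origin domain.
  const added = opts.urlsToAdd.map((url) =>
    url.startsWith("/") ? opts.originDomain + url : url
  );
  const merged = [...kept, ...added];

  // 3. Optionally rewrite URLs that start with the origin domain.
  if (opts.domainToReplace === "") return merged;
  return merged.map((url) =>
    url.startsWith(opts.originDomain)
      ? opts.domainToReplace + url.slice(opts.originDomain.length)
      : url
  );
}

// Split the final list into sub-sitemaps of at most `size` entries.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

With a helper like this, a sitemap index would be returned only when the length of the final list is greater than the limit; otherwise the whole list fits in a single sitemap.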

- Sitemap index or single sitemap: `/config/sitemap.xml`
- Sub-sitemaps (when needed): `/config/sitemap/[n].xml` (e.g. `/config/sitemap/1.xml`)

Note: The app’s basePath is `/config` (see `next.config.ts`). Adjust links accordingly if you change it.
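
To make the relationship between the two routes concrete, here is a minimal sketch of how a sitemap index pointing at numbered sub-sitemaps could be rendered. `buildSitemapIndex` and its `baseUrl` parameter are hypothetical, and the sketch assumes 1-based chunk numbers (matching `/config/sitemap/1.xml`) and a `baseUrl` that needs no XML escaping; the app's own output may differ.

```typescript
// Hypothetical helper: renders a sitemap index that lists `chunkCount`
// sub-sitemaps under `${baseUrl}/sitemap/[n].xml`, numbered from 1.
function buildSitemapIndex(baseUrl: string, chunkCount: number): string {
  const entries = Array.from(
    { length: chunkCount },
    (_, i) => `  <sitemap><loc>${baseUrl}/sitemap/${i + 1}.xml</loc></sitemap>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</sitemapindex>",
  ].join("\n");
}
```

For example, `buildSitemapIndex("https://yourdomain.com/config", 2)` would list `https://yourdomain.com/config/sitemap/1.xml` and `https://yourdomain.com/config/sitemap/2.xml`.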

- `ORIGIN_DOMAIN` (required): The fully-qualified origin domain to proxy from. Example: `https://www.yourdomain.com`
- `SITEMAP_LIMIT` (optional): Integer; maximum URLs per sitemap file. Default: `45000`
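
As an illustration of how these two variables might be read and validated, consider the sketch below. The function names `readOriginDomain` and `readSitemapLimit` are hypothetical (the app's own helpers live in its config file), and throwing on invalid values and stripping a trailing slash are assumptions rather than documented behavior.

```typescript
// Hypothetical env readers. `env` would typically be `process.env`.
type Env = Record<string, string | undefined>;

const DEFAULT_SITEMAP_LIMIT = 45000;

function readOriginDomain(env: Env): string {
  const value = env.ORIGIN_DOMAIN?.trim();
  if (!value) {
    throw new Error("ORIGIN_DOMAIN is required, e.g. https://www.yourdomain.com");
  }
  // Assumption: drop trailing slashes so paths can be appended directly.
  return value.replace(/\/+$/, "");
}

function readSitemapLimit(env: Env): number {
  const raw = env.SITEMAP_LIMIT;
  if (raw === undefined || raw.trim() === "") return DEFAULT_SITEMAP_LIMIT;
  const limit = Number(raw);
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error(`SITEMAP_LIMIT must be a positive integer, got "${raw}"`);
  }
  return limit;
}
```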
Edit `app/sitemap.xml/config.ts`:

- `getUrlsToRemove()`
  - Returns glob patterns to exclude from the sitemap
  - Patterns are combined with `ORIGIN_DOMAIN`
  - Glob syntax:
    - `*` matches any single path segment (no `/`)
    - `**` matches across segments (including `/`)
  - Examples (assuming `ORIGIN_DOMAIN=https://example.com`):
    - `"/work/*"` → matches `https://example.com/work/anything` (one segment)
    - `"/**/blog"` → matches any URL ending in `/blog` at any depth
- `getUrlsToAdd()`
  - Returns absolute or path-based URLs to add
  - Path-based URLs will be prefixed with `ORIGIN_DOMAIN`
- `getDomainToReplace()`
  - Return a fully-qualified domain (e.g., `https://www.newdomain.com`) to rewrite all URLs that start with `ORIGIN_DOMAIN`
  - Return `""` to disable
- `getSourceSitemapUrl()` / `getOriginDomain()` / `getSitemapLimit()`
  - Internal helpers that read from env and provide defaults
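
The glob rules described for `getUrlsToRemove()` can be approximated by translating each pattern into a regular expression. The sketch below is one possible interpretation, not the app's matcher: `globToRegExp` and `matchesPattern` are hypothetical names, and the exact edge-case behavior of the real implementation is not documented here.

```typescript
// Hypothetical glob matcher: "*" stays within one path segment,
// "**" may span segments. Everything else is matched literally.
function globToRegExp(pattern: string): RegExp {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "*") {
      if (pattern.charAt(i + 1) === "*") {
        source += ".*"; // "**": any characters, including "/"
        i++; // skip the second "*"
      } else {
        source += "[^/]*"; // "*": any characters except "/"
      }
    } else {
      // Escape regex metacharacters so they match literally.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${source}$`);
}

// Patterns are combined with the origin domain before matching.
function matchesPattern(url: string, originDomain: string, pattern: string): boolean {
  return globToRegExp(originDomain + pattern).test(url);
}
```

Under this interpretation, `matchesPattern("https://example.com/work/anything", "https://example.com", "/work/*")` is true, while a deeper URL such as `https://example.com/work/a/b` is not matched. Note that in this sketch `"/**/blog"` requires at least one segment before `/blog`; the app's matcher may treat a top-level `/blog` differently.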

    npm install
    npm run dev

Visit:

- `http://localhost:3000/config/sitemap.xml` → sitemap or sitemap index
- `http://localhost:3000/config/sitemap/1.xml` → first chunk (only if index is returned)

The basePath is set to `/config` in `next.config.ts`. If you change it, the routes and index links will change accordingly.
You can deploy wherever you host Next.js apps. To deploy alongside your Webflow site on Webflow Cloud, see the docs:
- Webflow Cloud overview: https://developers.webflow.com/webflow-cloud/intro

- Keep Webflow’s auto-generated sitemap enabled (so the source `sitemap.xml` remains available at `ORIGIN_DOMAIN`).
- Disable the setting that auto-inserts the Webflow sitemap into `robots.txt`.
- Manually add a `robots.txt` line that points to this app’s sitemap (usually `https://yourdomain.com/config/sitemap.xml`). For example: `Sitemap: https://yourdomain.com/config/sitemap.xml`

MIT — see LICENSE.md.