Publish a sitemap
Expose an auto-built /sitemap.xml that enumerates every canonical URL, skips drafts and redirects, and uses front-matter dates for lastmod.
/sitemap.xml is registered and served automatically on every AddPennington-based host; a working site already emits one. The options below tune what crawlers see — absolute <loc> values, draft/redirect exclusion, or turning the sitemap off on BlogSite. For a first site, start with Create your first Pennington site.
Before you begin
- A working Pennington site (see Create your first Pennington site if not)
- Pages using an IFrontMatter implementation (DocFrontMatter, BlogFrontMatter, or a custom one) so IsDraft and (optionally) Date flow through to the sitemap builder
- A known publishing target: either a fully-qualified URL (set CanonicalBaseUrl) or a sub-path via dotnet run -- build --base-url /sub/ (the sitemap falls back to OutputOptions.BaseUrl)
Options
Set CanonicalBaseUrl so <loc> values resolve
When CanonicalBaseUrl is set on PenningtonOptions, DocSiteOptions, or BlogSiteOptions, the sitemap builder prefixes every URL with it — typically https://your-domain.com/ — producing the absolute <loc> entries crawlers require. Without it, entries fall back to the build's --base-url value or to /.
new BlogSiteOptions
{
    CanonicalBaseUrl = "https://example.com",
    // ...
}
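The fallback order described above can be sketched as a small helper. This is an illustration only, not Pennington's implementation: ResolveLoc and its parameters are hypothetical names, and the trailing-slash handling and the behavior when both values are set are assumptions.

```csharp
using System;

// Hypothetical helper: picks the <loc> prefix in the order the text describes
// (CanonicalBaseUrl first, then the build's --base-url value, then "/").
static string ResolveLoc(string path, string? canonicalBaseUrl, string? baseUrl)
{
    var prefix = !string.IsNullOrEmpty(canonicalBaseUrl) ? canonicalBaseUrl
               : !string.IsNullOrEmpty(baseUrl) ? baseUrl
               : "/";

    // Assumption: join with exactly one slash between prefix and path.
    return prefix.TrimEnd('/') + "/" + path.TrimStart('/');
}

Console.WriteLine(ResolveLoc("/how-to/feeds/sitemap/", "https://example.com", null));
// prints "https://example.com/how-to/feeds/sitemap/"
Console.WriteLine(ResolveLoc("/how-to/feeds/sitemap/", null, "/sub/"));
// prints "/sub/how-to/feeds/sitemap/"
Console.WriteLine(ResolveLoc("/", null, null));
// prints "/"
```

Only the first call yields an absolute URL, which is why setting CanonicalBaseUrl matters for crawlers.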
Exclude drafts and redirects with front matter
The sitemap drops any page whose front matter has isDraft: true or sets redirectUrl:. search: false and llms: false are not honored — those are client-side UX preferences, not SEO directives, so opting a page out of search does not remove it from the sitemap.
Every other discovered HTML route is included, regardless of how it is sourced: markdown, Razor pages, routes emitted by custom content services (see Source content from outside the markdown pipeline and Source content from a remote API), and AddTaxonomy term pages all appear. The only routes left out are those with no canonical HTML page: redirects (which serve a 30x) and llms.txt-only sidecars. Non-HTML outputs such as JSON feeds and generated data files are skipped because their output file is not .html.
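The inclusion rules above can be read as a single predicate. The sketch below is a hedged illustration, not Pennington's code: PageInfo and IsInSitemap are hypothetical names, and the case-insensitive .html check is an assumption. llms.txt-only sidecars are not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var pages = new List<PageInfo>
{
    new PageInfo("/", "index.html"),
    new PageInfo("/drafts/wip/", "drafts/wip/index.html", IsDraft: true),
    new PageInfo("/old/", "old/index.html", RedirectUrl: "/new/"),
    // search: false does not affect the sitemap, so this page stays in.
    new PageInfo("/hidden-from-search/", "hidden-from-search/index.html", Search: false),
    new PageInfo("/feed.json", "feed.json"),
};

var included = pages.Where(IsInSitemap).Select(p => p.Url).ToList();
Console.WriteLine(string.Join(", ", included));
// prints "/, /hidden-from-search/"

// Drafts, redirects, and non-HTML outputs are dropped; everything else is kept.
static bool IsInSitemap(PageInfo page) =>
    !page.IsDraft
    && page.RedirectUrl is null
    && page.OutputFile.EndsWith(".html", StringComparison.OrdinalIgnoreCase);

// Hypothetical page model for illustration only; not a Pennington type.
record PageInfo(
    string Url,
    string OutputFile,
    bool IsDraft = false,
    string? RedirectUrl = null,
    bool Search = true);
```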
(BlogSite only) Suppress the endpoint with EnableSitemap = false
On an AddBlogSite host, set BlogSiteOptions.EnableSitemap = false to skip the /sitemap.xml MapGet entirely — useful when the host environment owns its own sitemap. The flag forwards into PenningtonOptions.MapSitemap; on bare AddPennington or AddDocSite, set that property directly to opt out.
new BlogSiteOptions
{
    EnableSitemap = false,
    // ...
}
Result
/sitemap.xml returns a <urlset> with one <url> per non-draft, non-redirect page, with absolute <loc> values when CanonicalBaseUrl is set:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/how-to/feeds/sitemap/</loc>
    <lastmod>2024-02-03</lastmod>
  </url>
</urlset>
Verify
- Run dotnet run and fetch /sitemap.xml. Expect a <urlset> document with one <url><loc>…</loc></url> per non-draft, non-redirect page
- Mark a page isDraft: true or set redirectUrl: on it and refetch. That URL is absent from the <urlset>
- Publish with CanonicalBaseUrl = "https://example.com" and confirm every <loc> starts with https://example.com/. Omit it and run dotnet run -- build --base-url /sub/ to see <loc> values start with /sub/
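The <loc> prefix check can also be scripted. The following is a minimal sketch that uses only System.Xml.Linq; LocsOutsidePrefix is a hypothetical helper name, and the embedded XML is the sample from the Result section standing in for a real response from /sitemap.xml.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Sample sitemap copied from the Result section; in practice, fetch
// /sitemap.xml from the running host and pass the response body instead.
const string sitemapXml = @"<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/how-to/feeds/sitemap/</loc>
    <lastmod>2024-02-03</lastmod>
  </url>
</urlset>";

var offenders = LocsOutsidePrefix(sitemapXml, "https://example.com/");
Console.WriteLine(offenders.Count == 0
    ? "every <loc> starts with the expected prefix"
    : string.Join(Environment.NewLine, offenders));
// prints "every <loc> starts with the expected prefix"

// Hypothetical helper: returns every <loc> value that does not start with the prefix.
static List<string> LocsOutsidePrefix(string xml, string prefix)
{
    // <loc> elements live in the sitemap namespace, so the name must be qualified.
    XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
    return XDocument.Parse(xml)
        .Descendants(ns + "loc")
        .Select(loc => loc.Value.Trim())
        .Where(value => !value.StartsWith(prefix, StringComparison.Ordinal))
        .ToList();
}
```

Passing "/sub/" as the prefix covers the sub-path build in the same way.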
Related
- Reference: SitemapService
- How-to: Publish an RSS feed
- How-to: Configure redirects