Make the site discoverable to LLM crawlers
Expose a stripped-markdown /llms.txt index plus per-page sidecars so LLM crawlers and agents can ingest your site without scraping HTML.
Expose a /llms.txt index plus per-page stripped-markdown sidecars so LLM crawlers and agents can read the site without scraping HTML. On AddDocSite hosts the wiring is automatic; on bare AddPennington hosts a single AddLlmsTxt(...) call enables it.
Before you begin
- A working Pennington site (see Serve markdown through Blazor Pages if not).
- An
AddDocSitehost (LLM wiring is automatic) or a bareAddPenningtonhost (needs an explicitAddLlmsTxt(...)call — see the "Bare Pennington" section below). The choice rationale is covered in What the DocSite and BlogSite templates wire for you.
Options
(Bare Pennington) Enable LlmsTxtOptions with AddLlmsTxt
On a bare host nothing is wired until you ask for it: call penn.AddLlmsTxt(...) once to turn on the index and sidecars. AddDocSite hosts skip this — the wiring is automatic. The options surface (OutputDirectory, GenerateFullFile) is documented at Pennington.LlmsTxt.LlmsTxtOptions. The chrome-stripping selector lives one layer up at penn.SiteProjection.ContentSelector — it is shared with the search index so both channels see the same body element.
opts.OutputDirectory = "_llms";
opts.GenerateFullFile = false;
Set GenerateFullFile = true to also emit /llms-full.txt — every sidecar concatenated into one file, useful for one-shot ingest by agents that cannot follow per-page links. Off by default because the file can be large.
(DocSite) Opt a page out with llms: false
Every non-draft page is included in the index by default (Llms = true). Setting llms: false in a page's front matter causes LlmsTxtService to skip it when assembling /llms.txt and its sidecar markdown. The page still renders, appears in the sidebar, and participates in search unless search: false is also set. Content/main/llms-hidden.md in examples/DocSiteKitchenSinkExample is a working opted-out page:
---
title: Not in llms.txt
description: This page is intentionally excluded from llms.txt.
sectionLabel: authoring
order: 230
llms: false
uid: kitchen-sink.main.llms-hidden
---
This page carries `llms: false` in its front matter. It still appears
in the sidebar and in search results, but `/llms.txt` does **not**
list it. The content-stripping llms generator skips pages whose
`Llms` flag is `false` when assembling its index of documents.
public bool Llms { get; init; } = true;
Point the chrome-stripping selector at a custom wrapper
The selector that picks the body element out of the rendered page — for a different article wrapper or a non-DocSite layout — is DocSiteOptions.ContentSelector. It defaults to #main-content and is overridable without leaving DocSite. The same selector drives the search index, llms.txt sidecars, and the build-time link audit, so chrome is stripped once. (On a bare host the equivalent knob is penn.SiteProjection.ContentSelector, shown above.) See What the DocSite and BlogSite templates wire for you for cases that do require bare AddPennington.
Split content per-fragment with humans-only / robots-only
For finer control than page-level opt-out, two paired classes mark a fragment as intended for one audience. Both ship in the MonorailCSS base styles — no registration needed.
humans-only— visible in the browser, stripped from the llms.txt extraction.robots-only— hidden in the browser viadisplay: none, kept in the llms.txt extraction.
<div class="humans-only">
<InteractiveTour />
</div>
<div class="robots-only">
<p>Full method signature: <code>Task<Result> ProcessAsync(Options options, CancellationToken ct = default)</code>.</p>
</div>
The classes work anywhere in the rendered page — markdown bodies, Razor components, auto-generated reference pages.
Result
/llms.txt lists each indexed page as a markdown link grouped by section, and each page gets a stripped-markdown sidecar at /_llms/<page>.md. Links are fully qualified when PenningtonOptions.CanonicalBaseUrl is set (or build --base-url https://… is passed); otherwise they fall back to root-relative /_llms/... so an agent that fetched /llms.txt can still resolve them against the origin.
A typical front door — a metadata block, a ## Map of any subtrees split into their own {prefix}llms.txt, then the nav-grouped links:
# Pennington Docs
> Content engine library for .NET.
site: https://docs.example.com/
canonical: https://docs.example.com/llms.txt
generated: 2026-06-09 14:02 UTC
penningtonVersion: 0.1.0
## Map
- [Reference](https://docs.example.com/reference/llms.txt) (96 entries, ~120k tokens) — API surface, host extensions, front-matter keys, Markdig extensions, UI components, diagnostics codes.
## Tutorials
- [Your first Pennington site](https://docs.example.com/tutorials/getting-started/first-site): Build a static site from a single markdown file.
- [Add a second locale](https://docs.example.com/tutorials/beyond-basics/add-a-locale): Ship the same content in a second language.
## How-to
- [Switch the body and heading typeface](https://docs.example.com/how-to/configuration/fonts): Self-host woff2, declare preloads, point the family options at the new faces.
The ## Map block appears only when a subtree is declared (a folder's _meta.yml carries an llms: block, or a content service registers one) — those leaves move to /reference/llms.txt and the front door keeps just the see-also line above. A site with no subtrees jumps straight from the metadata block to the nav-grouped links.
Verify
- Run
dotnet runand fetch/llms.txt. Expect a metadata block, a## Mapsection listing per-subtree splits with token estimates, then nav-grouped links — and no line for any page markedllms: false - Fetch one per-page sidecar (for example,
/_llms/<page>.md). Expect a YAML header withcanonical_url,content_hash,tokens, and the body stripped to clean markdown - With
GenerateFullFile = true, fetch/llms-full.txt. Expect every sidecar concatenated in one response
Related
- Reference:
LlmsTxtOptions - How-to: Tune what the search box returns
- Background: What the DocSite and BlogSite templates wire for you