HtmlToMarkdownConverter
Pennington.Infrastructure
Tiny, hand-rolled HTML → Markdown converter used for llms.txt output.
Scope is deliberately minimal: the output is consumed by LLMs and doesn't need to round-trip to the original HTML. It covers headings, paragraphs, lists, links, inline/fenced code, blockquotes, images, horizontal rules, and basic inline formatting. Everything else recurses into text content.
If this grows past ~250 lines, switch to a real library (e.g. ReverseMarkdown on NuGet) rather than expanding it further.
Methods
Convert
public static string Convert(IElement root, Func<string, string> rewriteHref = null)
Converts root and its descendants to Markdown.
Parameters
root IElement - Root element whose descendants are converted.
rewriteHref Func<string, string> - Optional callback invoked for every <a href> value before it's emitted. Return the href unchanged to keep the link as-is, or a new string to replace it. Anchor-only links (#section) and empty hrefs are never passed to the callback; they're emitted as plain text regardless.
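As an illustration of the rewriteHref contract described above, here is a minimal sketch of a callback that turns relative hrefs into absolute URLs. The HrefRewriting class, the ToAbsolute method, and the https://example.com/docs/ base URL are hypothetical names chosen for this example; they are not part of Pennington.Infrastructure.

```csharp
using System;

public static class HrefRewriting
{
    // Hypothetical site root, used only for illustration.
    private static readonly Uri SiteBase = new Uri("https://example.com/docs/");

    // Matches the Func<string, string> shape expected by rewriteHref.
    // Returns an absolute URL when href resolves against SiteBase;
    // otherwise returns href unchanged so the link is kept as-is.
    public static string ToAbsolute(string href) =>
        Uri.TryCreate(SiteBase, href, out var resolved)
            ? resolved.AbsoluteUri
            : href;
}

// Usage, assuming `root` is an IElement obtained from your HTML parser:
// string markdown = HtmlToMarkdownConverter.Convert(root, HrefRewriting.ToAbsolute);
```

Because anchor-only and empty hrefs never reach the callback, a rewriter like this one does not need to special-case them.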
Returns
string
Pennington.Infrastructure.HtmlToMarkdownConverter
namespace Pennington.Infrastructure;
/// Tiny, hand-rolled HTML → Markdown converter used for llms.txt output.
/// Scope is deliberately minimal: the output is consumed by LLMs and doesn't need to round-trip to the original HTML. It covers headings, paragraphs, lists, links, inline/fenced code, blockquotes, images, horizontal rules, and basic inline formatting. Everything else recurses into text content.
/// If this grows past ~250 lines, switch to a real library (e.g. ReverseMarkdown on NuGet) rather than expanding it further.
public class HtmlToMarkdownConverter
{
    /// Converts root to markdown.
    public static string Convert(IElement root, Func<string, string> rewriteHref = null);
}