robots.txt tells crawlers where they can and can't go. llms.txt is a different idea: a curated index of what's worth reading.
Marketers keep describing llms.txt as "robots.txt for AI." That comparison is unhelpful and slightly wrong. robots.txt is about exclusion. llms.txt is about curation: a single Markdown file that tells AI systems, "here are the most important pages on my site, and here's how they fit together."
This piece explains what it is, why it matters now, and how to ship one this week.
What llms.txt actually is
llms.txt is a Markdown file at the root of your domain (yoursite.com/llms.txt) that summarises and links to the most important pages on your site, organised by topic.
The proposal originated with Jeremy Howard and has been adopted by a growing number of AI platforms. OpenAI, Anthropic, and Perplexity have all signalled support. It's still emerging, but the direction is clear and the cost of adopting it is essentially zero.
Why it's not robots.txt
robots.txt and llms.txt do fundamentally different jobs:
| | robots.txt | llms.txt |
|---|---|---|
| Purpose | Tell crawlers what to skip | Tell AI what's worth reading |
| Default behaviour | Allow everything | No default; purely advisory |
| Format | Plain directives | Structured Markdown |
| Audience | Search engine crawlers | Language models + AI ingestion pipelines |
| Enforcement | Bots that respect it stop | Models that respect it prioritise it |
Treat them as complementary, not interchangeable. You ship robots.txt to control crawl. You ship llms.txt to influence interpretation.
The format, line by line
A valid llms.txt file has a short, opinionated structure:
- H1: your site's name.
- Blockquote: a one-sentence summary of what you do.
- H2 sections: topical groupings of content.
- Bulleted links: under each H2, links to the most important pages with brief one-line descriptions.
That's the whole spec. The point is that an AI model can read the file top to bottom and get a fast, accurate map of your site without crawling everything.
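To make the structure above concrete, here is a minimal sketch that renders those four elements from plain data. The site name, URLs, and descriptions are hypothetical, and the `[title](url): description` layout for each bullet is an assumption based on the published proposal rather than something this article specifies.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    description: str


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render the H1 / blockquote / H2 / bulleted-link structure as Markdown."""
    lines = [f"# {site_name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for link in links:
            # Assumed bullet layout: "- [title](url): one-line description"
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
    return "\n".join(lines) + "\n"


# Hypothetical site and pages, for illustration only.
EXAMPLE = render_llms_txt(
    "Example Co",
    "Example Co sells invoicing software for freelancers.",
    {
        "Product": [
            Link("Overview", "https://example.com/product", "What the product does and who it is for."),
            Link("Pricing", "https://example.com/pricing", "Plans, limits, and billing terms."),
        ],
        "Docs": [
            Link("Quickstart", "https://example.com/docs/quickstart", "Send a first invoice in five minutes."),
        ],
    },
)

if __name__ == "__main__":
    print(EXAMPLE)
```

Keeping the page list in data rather than hand-editing the file makes it easier to regenerate when pages are added or retired.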
Don't over-curate. List the 10 to 30 pages you'd actually want an AI to know about. Listing everything defeats the purpose; listing nothing is worse than not shipping a file.
What to actually include
Prioritise pages that meet at least one of these criteria:
- Pillar pages that cover a topic comprehensively.
- Original research: survey results, benchmarks, case studies.
- Product or service overviews: the canonical descriptions of what you sell.
- Pricing pages: AI is increasingly asked about pricing; make yours easy to find.
- Documentation: especially if you're a SaaS or technical product.
- Definitive opinion pieces: your point of view on the industry.
Skip:
- Blog posts that are time-sensitive or tactical.
- Marketing pages with no substantive content ("About us" pages with five sentences).
- Internal-feeling pages users wouldn't intentionally visit.
How to ship one this week
The five-step process:
- Inventory your 30 most important pages.
- Group them into 3 to 6 topical sections.
- Write one-line descriptions for each.
- Format the file in Markdown following the spec.
- Deploy it at yoursite.com/llms.txt (no auth, no redirects, content-type text/markdown or text/plain).
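The deployment requirements in the last step (no auth, no redirects, a Markdown or plain-text content type) can be checked with a short script. This is a sketch using only the Python standard library; the function names `check_deployment` and `fetch_and_check` are made up for illustration.

```python
import urllib.request

ACCEPTED_TYPES = {"text/markdown", "text/plain"}


def check_deployment(requested_url: str, final_url: str, status: int, content_type: str) -> list[str]:
    """Return a list of problems; an empty list means the response looks fine."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    if final_url != requested_url:
        problems.append(f"redirected to {final_url}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected content type: {media_type or 'missing'}")
    return problems


def fetch_and_check(url: str) -> list[str]:
    """Fetch the URL without credentials and run the checks above.

    urlopen follows redirects, so comparing geturl() with the requested URL
    reveals them. It raises urllib.error.HTTPError for 4xx/5xx responses,
    which is how an auth wall (401/403) would surface.
    """
    with urllib.request.urlopen(url, timeout=10) as response:
        return check_deployment(
            url,
            response.geturl(),
            response.status,
            response.headers.get("Content-Type", ""),
        )
```

Calling `fetch_and_check("https://yoursite.com/llms.txt")` with your own domain should return an empty list once the file is deployed as described.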
If an AI platform offers a way to submit the URL, use it. Note that GPTBot (OpenAI), ClaudeBot (Anthropic), and PerplexityBot (Perplexity) are crawler user agents rather than submission dashboards, so in most cases the practical step is to make sure robots.txt does not block them from /llms.txt. The file can then be picked up on a later crawl.
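However the file gets discovered, a crawler can only read it if robots.txt lets it through. The sketch below uses Python's standard `urllib.robotparser` to test whether the three crawlers named above may fetch the file. The sample robots.txt is hypothetical, and the helper name `blocked_crawlers` is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_crawlers(robots_txt: str, llms_url: str, agents: tuple[str, ...] = AI_CRAWLERS) -> list[str]:
    """Return the user agents that robots.txt forbids from fetching llms_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, llms_url)]


# Hypothetical robots.txt that blocks one crawler site-wide.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    # GPTBot is disallowed everywhere; the others fall under "*" and may fetch /llms.txt.
    print(blocked_crawlers(SAMPLE_ROBOTS, "https://yoursite.com/llms.txt"))  # ['GPTBot']
```

Running the same check against your live robots.txt shows whether the two files are pulling in opposite directions.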
Treat llms.txt as a living document. Re-audit it quarterly: add new pillar pages as they ship, remove stale ones, and refine the descriptions. It's one of the highest-leverage 30-minute SEO tasks you can do.
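Part of that quarterly re-audit can be automated. The following is a rough sketch of a linter that checks the structure described earlier and the 10 to 30 link guideline. It assumes bullets are written as `- [title](url): description`, and the function name `audit_llms_txt` is hypothetical. It does not replace reading the descriptions yourself.

```python
import re

# Matches "- [title](url)" with an optional ": description" tail.
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$")


def audit_llms_txt(text: str, min_links: int = 10, max_links: int = 30) -> list[str]:
    """Return human-readable warnings for an llms.txt document."""
    non_empty = [line.rstrip() for line in text.splitlines() if line.strip()]
    warnings = []

    if not non_empty or not non_empty[0].startswith("# "):
        warnings.append("first line should be an H1 with the site name")
    if not any(line.startswith("> ") for line in non_empty):
        warnings.append("missing blockquote summary")
    if not any(line.startswith("## ") for line in non_empty):
        warnings.append("no H2 sections found")

    links = [m for m in map(LINK_RE.match, non_empty) if m]
    if not min_links <= len(links) <= max_links:
        warnings.append(f"{len(links)} links listed; the guideline is {min_links}-{max_links}")
    for m in links:
        if not (m.group("desc") or "").strip():
            warnings.append(f"link without a description: {m.group('url')}")
    return warnings
```

An empty result means the file passes these mechanical checks; whether the listed pages are still the right ones remains an editorial call.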