llms.txt: the proposed standard for AI crawlers
llms.txt
llms.txt is a proposed standard file, placed at the root of a website, that gives LLM-driven crawlers a clean, machine-readable summary of the site: what it is, which pages matter most, and how to consume its content. It is the AI-era cousin of robots.txt.
The proposal (originated by Jeremy Howard / Answer.AI) is simple: a plain Markdown file at /llms.txt that lists your most important pages with short descriptions, plus an optional /llms-full.txt for the full text of every important page in one place.
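To make the format concrete, here is a minimal sketch of what such a file might look like and how a consumer could read it. The layout (a title heading, a blockquote summary, then sections of Markdown links with short notes) follows the public proposal as commonly described; the site name, URLs, and descriptions in `SAMPLE`, and the `parse_llms_txt` helper itself, are hypothetical and only for illustration.

```python
import re

# Hypothetical sample file: the site name, URLs, and descriptions are invented.
SAMPLE = """\
# Example Docs

> Reference documentation for the Example API, aimed at backend developers.

## Guides

- [Quickstart](https://example.com/docs/quickstart.md): Install the client and make a first request
- [Authentication](https://example.com/docs/auth.md): API keys and token rotation

## Optional

- [Changelog](https://example.com/changelog.md): Release history
"""

# Matches "- [Title](url)" with an optional ": note" suffix.
LINK = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Return (title, summary, sections); sections maps heading -> list of links."""
    title, summary, sections, current = None, None, {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK.match(line)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return title, summary, sections


title, summary, sections = parse_llms_txt(SAMPLE)
for heading, links in sections.items():
    print(heading, [link_title for link_title, _, _ in links])
```

This parser is deliberately strict (one link per bullet, single-line summary); a real crawler would likely need to tolerate looser Markdown.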
Why llms.txt exists
- LLM crawlers are noisy and expensive. They re-crawl entire sites to find what’s worth reading
- A clean, curated index helps them find the right pages faster
- It lets the publisher signal which pages an LLM should read first (it is advisory; restricting crawler access remains the job of robots.txt)
- It may improve your citation rate by making the right pages easy to retrieve
How to write a good llms.txt
- Start with a one-paragraph summary of your site and who it serves
- List the top 20–50 URLs you most want an LLM to read, grouped by section, with a 1–2 sentence description each
- Skip tags, archives, and low-value pages
- Update it when you publish something important
- Optionally publish /llms-full.txt with the full text of your best pages in one document
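The checklist above can be sketched as a small generator that you run whenever you publish something important. This is one possible approach, not a reference implementation: the function names, the input shapes, and the `Source:` line used in the full-text file are assumptions made for this example.

```python
def render_llms_txt(site_name, summary, sections):
    """Render an llms.txt document.

    sections: dict mapping a section heading to a list of
    (title, url, description) tuples, in the order they should appear.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in pages:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def render_llms_full(pages):
    """Concatenate (title, url, body) tuples into one llms-full.txt document.

    The "Source:" line is an illustrative convention, not part of any spec.
    """
    parts = [
        f"# {title}\n\nSource: {url}\n\n{body.strip()}\n"
        for title, url, body in pages
    ]
    return "\n".join(parts)


# Hypothetical site data; replace with your own curated pages.
index = render_llms_txt(
    "Example Docs",
    "Reference documentation for the Example API, aimed at backend developers.",
    {
        "Guides": [
            ("Quickstart", "https://example.com/docs/quickstart.md",
             "Install the client and make a first request."),
        ],
    },
)
print(index)
```

Keeping the page list in one data structure makes it easy to regenerate both files together and to leave out tags, archives, and other low-value pages by simply not listing them.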
Should you adopt it now?
Yes, if you publish research, documentation, or long-form reference content. The cost is low (one Markdown file), the potential upside is real, and you will be early if LLM providers formalize the standard.