How to write an llms.txt file
Think of llms.txt as a sitemap aimed at language models: a curated, annotated list of your most important pages. It is one of the cheapest, highest-leverage generative engine optimization (GEO) investments available right now.
What llms.txt actually is
llms.txt is a plain-text file, written in Markdown, that you place at the root of your domain (https://yourdomain.com/llms.txt). It contains a curated list of URLs that you want language models to prioritize. Unlike sitemap.xml, llms.txt is opinionated: you choose what belongs.
It is not an access control file. robots.txt controls what crawlers can fetch. llms.txt tells crawlers which fetched content matters most.
Why it matters
LLMs ingest enormous amounts of content. Without a signal, your most authoritative pages compete with low-value tag archives, dated blog posts, and category pagination. llms.txt lets you raise the signal on what should define your brand in AI responses.
Minimum viable llms.txt
# Your Brand Name
> One-paragraph description of what you do, written in clear, model-friendly prose. This becomes the de facto "what is this site" answer.
## Core
- [Product overview](https://yourdomain.com/product): What the product is and who it serves
- [Pricing](https://yourdomain.com/pricing): Current pricing tiers and what each includes
- [Documentation](https://yourdomain.com/docs): Technical reference
## Guides
- [Getting started](https://yourdomain.com/guides/getting-started): Onboarding walkthrough
- [Best practices](https://yourdomain.com/guides/best-practices): Recommended patterns
## Comparisons
- [Versus competitor A](https://yourdomain.com/compare/competitor-a): Differentiation analysis
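If your page list already lives in structured data, a file in the shape shown above can be generated rather than hand-edited. The following is a minimal Python sketch under stated assumptions, not a canonical tool: the `Link` class and `render_llms_txt` function are hypothetical names, and the blank lines it places between blocks are an assumption about readable Markdown spacing.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One curated entry: link text, absolute URL, and a short annotation."""
    title: str
    url: str
    note: str


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render a title, a blockquote summary, and H2 sections of annotated links."""
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        # Each section becomes an H2 heading followed by a bullet list.
        lines += ["", f"## {heading}", ""]
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.note}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sections = {
        "Core": [
            Link(
                "Pricing",
                "https://yourdomain.com/pricing",
                "Current pricing tiers and what each includes",
            ),
        ],
    }
    print(render_llms_txt("Your Brand Name", "What you do, in one paragraph.", sections))
```

Generating the file from one source of truth keeps the section order and annotation format consistent as the page list changes.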
llms.txt vs llms-full.txt
Many sites publish both:
- llms.txt — curated, short, hand-picked
- llms-full.txt — comprehensive, includes full content of priority pages concatenated together
llms-full.txt is heavier but useful for technical sites where models benefit from full reference material in a single fetch. Start with llms.txt and add llms-full.txt only if you have meaningful technical documentation.
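One way to produce llms-full.txt is to concatenate the Markdown bodies of your priority pages in a fixed order. The sketch below assumes you can already export each page as Markdown; `build_llms_full` is a hypothetical helper, and the `Source:` line under each title is an illustrative convention, not something any specification requires.

```python
def build_llms_full(pages: list[tuple[str, str, str]]) -> str:
    """Concatenate (title, url, markdown_body) triples into one document.

    Pages appear in the order given, each introduced by an H1 title and a
    line naming the URL the content came from.
    """
    parts = []
    for title, url, body in pages:
        parts.append(f"# {title}\n\nSource: {url}\n\n{body.strip()}\n")
    # A blank line separates consecutive pages.
    return "\n".join(parts)


if __name__ == "__main__":
    pages = [
        ("Documentation", "https://yourdomain.com/docs", "Technical reference text."),
        ("Getting started", "https://yourdomain.com/guides/getting-started", "Onboarding steps."),
    ]
    print(build_llms_full(pages))
```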
Step-by-step
1. Pick your top 20-50 canonical pages
Walk your site and identify the pages that should *define* your brand to a language model: product overview, pricing, key feature pages, top guides, comparison pages, security/trust pages, About. Avoid blog posts unless they are evergreen pillar pages. Avoid tag archives and pagination.
2. Write one-sentence annotations per page
After each URL, add a short description (5-15 words) of what the page contains. These annotations are read by the model and act as the page's elevator pitch. Avoid keyword stuffing — write naturally.
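The 5-15 word guideline is easy to check mechanically. The following sketch assumes entries use the `- [Title](URL): annotation` shape from the minimal example; `annotation_problems` and the `LINK_LINE` pattern are hypothetical names, and the pattern is deliberately simple (it does not handle URLs containing parentheses).

```python
import re

# Matches "- [Title](URL)" with an optional ": annotation" suffix.
LINK_LINE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?:: (?P<note>.*))?$"
)


def annotation_problems(llms_text: str, min_words: int = 5, max_words: int = 15) -> list[str]:
    """Return one message per link whose annotation falls outside the word range."""
    problems = []
    for line in llms_text.splitlines():
        match = LINK_LINE.match(line.strip())
        if not match:
            continue  # Headings, summaries, and blank lines are ignored.
        note = match.group("note") or ""
        count = len(note.split())
        if not min_words <= count <= max_words:
            problems.append(f"{match.group('url')}: annotation has {count} words")
    return problems
```

Run against the minimal example earlier in this guide, a check like this would flag the two-word annotations such as "Technical reference", which is a useful reminder to expand them.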
3. Group pages into clear sections
Use H2 headings (## Section) to group pages by purpose, for example: Core, Product, Pricing, Documentation, Guides, Comparisons, Company, Trust. Sections give the model an outline of how to navigate your domain.
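To see the outline a reader of the file would get, you can count the entries under each H2 heading. This is a small sketch with a hypothetical `outline` function; it assumes sections start with `## ` and entries start with `- `, as in the minimal example.

```python
def outline(llms_text: str) -> dict[str, int]:
    """Map each H2 section heading to the number of list items beneath it."""
    counts: dict[str, int] = {}
    current = None
    for line in llms_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:].strip()
            counts[current] = 0
        elif stripped.startswith("- ") and current is not None:
            # List items before the first H2 belong to no section and are skipped.
            counts[current] += 1
    return counts
```

An empty section, or one section holding most of the links, suggests the grouping needs another pass.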
4. Publish at the root with the correct content type
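Save the file as /llms.txt and serve it as text/plain. Make sure it returns a 200 from your CDN and is not behind authentication. Confirm with curl that the file is reachable, for example with curl -I https://yourdomain.com/llms.txt, and check the status line and the Content-Type header.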
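The same check can be scripted so it runs after every deploy. Below is a minimal sketch using only the Python standard library; `response_problems` and `check_llms_txt` are hypothetical helper names, and the check covers only the status code and media type described in this step (it does not detect a redirect to a login page).

```python
import urllib.error
import urllib.request


def response_problems(status: int, content_type: str) -> list[str]:
    """Compare a status code and Content-Type header against the step's advice."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got {media_type or 'no Content-Type'}")
    return problems


def check_llms_txt(url: str, timeout: float = 10.0) -> list[str]:
    """Fetch the URL and report any problems; an empty list means it looks fine."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response_problems(
                response.status, response.headers.get("Content-Type", "")
            )
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; report them the same way.
        headers = err.headers or {}
        return response_problems(err.code, headers.get("Content-Type", ""))
```

Calling `check_llms_txt("https://yourdomain.com/llms.txt")` with your real domain returns a list of problems; network failures such as DNS errors are left to propagate as exceptions.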
5. Link to llms.txt from your sitemap and llms-full.txt
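If you maintain a sitemap, add the llms.txt URL to it. A sitemap index lists only other sitemap files, so the entry belongs in one of the sitemaps it references. If you publish llms-full.txt, include a header comment pointing back to llms.txt for the curated short version. This creates discoverability between the artifacts.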
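If your sitemap is a static XML file, the entry can be added with a few lines of Python. This is a sketch under assumptions: `add_url` is a hypothetical helper, it expects a standard `urlset` sitemap using the sitemaps.org 0.9 namespace, and it adds only a `loc` element.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def add_url(sitemap_xml: str, url: str) -> str:
    """Return the sitemap XML with a <url><loc> entry for url, added at most once."""
    # Serialize with the sitemap namespace as the default (no "ns0:" prefixes).
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    existing = {loc.text for loc in root.iter(f"{{{SITEMAP_NS}}}loc")}
    if url not in existing:
        entry = ET.SubElement(root, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(root, encoding="unicode")
```

If your sitemap is generated by a CMS or framework, adding the URL through that tool's own configuration is usually the better route.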
6. Keep it current
When you launch a major new page, redesign your pricing, or rebrand, update llms.txt within the same week. Stale llms.txt is worse than no llms.txt — it points models at content that no longer reflects you.
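A lightweight guard against staleness is to compare the URLs in llms.txt with the URLs you know are currently live. The sketch below assumes you can supply that set yourself (for example, from your sitemap or CMS export); `stale_links` and the `LINK_URL` pattern are hypothetical names.

```python
import re

# Captures the URL part of Markdown links such as "[Title](https://...)".
LINK_URL = re.compile(r"\]\((https?://[^)\s]+)\)")


def stale_links(llms_text: str, live_urls: set[str]) -> list[str]:
    """Return URLs listed in llms.txt that are missing from the set of live URLs."""
    listed = LINK_URL.findall(llms_text)
    return [url for url in listed if url not in live_urls]
```

Running a check like this in the same pipeline that publishes page changes makes it harder for a removed or renamed page to linger in the file.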
Frequently asked questions
Do major AI engines actually read llms.txt today?
Adoption is partial and growing. Some engines and crawlers explicitly look for the file; others ignore it. Even where it is not formally parsed, a curated llms.txt can still help search-augmented retrieval, because a model that follows links from the file lands on your canonical pages first. Publishing one costs almost nothing, so there is little reason not to.
What is the difference between llms.txt and sitemap.xml?
sitemap.xml is for search crawlers and aims for completeness: every URL on your site. llms.txt is for language models and aims for curation: only the pages you want to define you. Both can coexist; they serve different audiences.
Should I include marketing landing pages in llms.txt?
Only the canonical ones. Include your primary product page, pricing page, and About page. Skip campaign-specific landing pages, A/B variants, and gated content. If you list 100 thin pages, the file loses its curation value.
Does llms.txt help with traditional Google SEO?
No direct effect. Google Search uses Googlebot and the standard sitemap. llms.txt is specifically aimed at language model retrieval and AI search surfaces. Some indirect benefit accrues because the pages you prioritize in llms.txt often deserve internal linking and prominence anyway.
What encoding and format should llms.txt use?
Use plain UTF-8 Markdown with standard headings, lists, and inline links. Keep the curated version under a few hundred lines. llms-full.txt can be much longer; there is no strict size limit, though files of several MB are unusual.
