llms.txt
Emerging convention telling LLM crawlers how to use your content.
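llms.txt is a proposed standard for a markdown file served at the root of a website (/llms.txt) that gives large language models and their crawlers (OpenAI, Anthropic, Perplexity, Google AI Overviews and others) a curated guide to the site: which pages matter most, what each one covers, and how you would like the source attributed. It is often compared to robots.txt for traditional search crawlers, but the analogy is loose: robots.txt sets crawl permissions, whereas llms.txt points to preferred content and does not block anything.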
Why llms.txt matters
LLM crawlers have rapidly become a meaningful traffic and citation source for UK financial sites. Unlike Googlebot, LLM crawlers do not consistently honour robots.txt or canonical tags, and some scrape content for grounding without attribution. llms.txt - and its richer cousin llms-full.txt - lets you publish a curated, structured representation of your site optimised for ingestion, with explicit citation guidance.
What llms.txt and llms-full.txt should contain
A practical implementation:
- llms.txt: a short markdown index of your most important canonical pages, with one-line summaries.
- llms-full.txt: the actual content of those pages in clean markdown, ready to ingest with no navigation chrome.
Both files should declare the canonical author, organisation, jurisdiction (e.g. UK, FCA-regulated) and citation format you prefer.
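As a rough illustration of the index file described above, the following Python sketch assembles an llms.txt from a list of pages. The overall layout (a top-level heading, a blockquote summary, then sections of links with one-line descriptions) follows the published llms.txt proposal. The metadata lines for author, organisation, jurisdiction and citation format reflect this entry's recommendation rather than anything the proposal requires, and every name, URL and value in the example is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str      # canonical URL of the page
    summary: str  # one-line description shown in the index


def build_llms_txt(site_name, description, metadata, sections):
    """Return the text of an llms.txt index as a markdown string.

    metadata: dict of label -> value (author, jurisdiction, ...).
              These lines are an assumption of this sketch, not a
              field defined by the llms.txt proposal.
    sections: dict of section heading -> list of Page entries.
    """
    lines = [f"# {site_name}", "", f"> {description}", ""]
    if metadata:
        for label, value in metadata.items():
            lines.append(f"- {label}: {value}")
        lines.append("")
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            lines.append(f"- [{page.title}]({page.url}): {page.summary}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site and pages, for illustration only.
    text = build_llms_txt(
        site_name="Example Finance",
        description="Plain-English guides to UK savings and investing.",
        metadata={
            "Author": "Example Finance editorial team",
            "Organisation": "Example Finance Ltd",
            "Jurisdiction": "UK, FCA-regulated",
            "Preferred citation": "Example Finance, page title, URL",
        },
        sections={
            "Guides": [
                Page("ISA guide", "https://example.co.uk/isa-guide",
                     "How ISAs work and who can open one."),
            ],
        },
    )
    print(text)
```

The same page list could feed llms-full.txt by writing each page's cleaned markdown body under its heading instead of a one-line summary; how that body is extracted depends on your CMS and is left out here.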
llms.txt and AI-search SEO
For UK financial services, llms.txt is one of the few structural levers you control directly that can improve how often, and how accurately, ChatGPT, Perplexity and Google AI Overviews cite your content. It is cheap to build, hard to spam and useful to any crawler that reads it, which makes it a low-risk win.
Related terms
- robots.txt
- Canonical Tag
- AI Search Optimisation
- Citation