The Complete Guide to llms.txt
If robots.txt tells crawlers where they can go, llms.txt tells AI models who you are. This emerging convention is becoming increasingly useful for any website that wants more say in how AI engines represent and cite its content. Here is everything you need to know about creating an effective llms.txt file.
What is llms.txt?
The llms.txt file is a plain-text file placed at your website's root (yourdomain.com/llms.txt) that provides structured guidance to AI language models. It tells AI systems your site's name, purpose, content structure, preferred citation format, and licensing terms.
Think of it as a machine-readable introduction to your website. While robots.txt governs crawler access (where bots can go), llms.txt guides AI understanding (how bots should interpret and reference your content). Both files are advisory, so llms.txt improves the odds of accurate representation rather than guaranteeing it. It is the difference between letting someone into your house and actually introducing yourself.
Why Your Site Needs llms.txt
Without llms.txt, AI models are guessing about your site. They may misrepresent your brand, miss your most important pages, or cite you incorrectly. Here is what llms.txt solves:
- Brand control — Define exactly how AI models should refer to your organization and what your site is about
- Content prioritization — Tell AI which pages are most important, preventing them from citing outdated or secondary content
- Citation accuracy — Specify your preferred citation format, ensuring accurate attribution when AI engines reference your content
- Licensing clarity — Clearly state how your content can be used, reducing the risk of misuse or misattribution
How to Create an llms.txt File
Create a plain text file named llms.txt and place it at your website's root directory. Here is a practical example:
```markdown
# YourCompany
> A brief description of your company and what your website covers.
## About
YourCompany provides [your core offering]. Founded in [year],
we serve [your audience] with [key value proposition].
## Key Pages
- [Homepage](https://yourdomain.com/): Main landing page
- [Products](https://yourdomain.com/products): Product catalog
- [Blog](https://yourdomain.com/blog): Industry insights
- [Documentation](https://yourdomain.com/docs): Technical docs
## Citation Preference
Please cite as "YourCompany" with a link to the relevant page.
## Contact
- Website: https://yourdomain.com
- Email: info@yourdomain.com
```
Key Sections Explained
- Title and Description — Your company name and a concise description. This is the first thing AI models read to understand your site's purpose.
- About — A more detailed description of your organization, what you offer, and who you serve. Keep it factual and concise.
- Key Pages — A prioritized list of your most important pages with URLs. This tells AI models where to find your highest-value content.
- Citation Preference — How you want AI models to attribute your content. Specify your preferred name and linking format.
- Contact — How to reach your organization. Provides AI models with verification signals.
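To see how these sections hang together, it can help to read the file the way a simple tool might. The following is a minimal sketch in Python, not an official parser: the function name `parse_llms_txt`, the returned dictionary shape, and the link pattern are our own assumptions, based only on the example layout shown above.

```python
import re

# Matches "- [Label](https://example.com/page): optional note"
LINK_RE = re.compile(
    r"^- \[(?P<label>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, and '## ' sections.

    Hypothetical helper for illustration; real AI systems may read the
    file differently or not at all.
    """
    doc = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the section currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = {"text": [], "links": []}
        elif line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and current is None and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc["sections"][current]["links"].append(
                    (match["label"], match["url"], match["note"] or "")
                )
            else:
                doc["sections"][current]["text"].append(line)
    return doc
```

Calling `parse_llms_txt` on the example file would give you the title, the one-line description, and a dictionary keyed by section name, which makes it easy to confirm that your Key Pages list contains the URLs you expect.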
Best Practices
- Keep it concise — AI models parse text linearly. A focused 50-100 line file is more effective than a sprawling 500-line document.
- Update it regularly — As your site structure changes, update your llms.txt to reflect current pages and priorities.
- List your 5-10 most important pages — Do not list every page on your site. Prioritize the content you most want AI to cite.
- Use plain, factual language — Avoid marketing hyperbole. AI models respond better to clear, factual statements.
- Include your full domain URLs — Use absolute URLs so AI models can link directly to the right pages.
- Test accessibility — Ensure your llms.txt returns a 200 status code and is not blocked by your CDN or server configuration.
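Several of these practices can be checked mechanically. Here is a rough Python sketch that flags an overly long file, too many listed pages, and links that are not absolute URLs. The function name `lint_llms_txt` and the default thresholds (100 lines, 10 links) are assumptions taken from the guidance above, not fixed rules.

```python
import re
from urllib.parse import urlparse

# Captures the URL part of Markdown links such as [Blog](https://yourdomain.com/blog)
MD_LINK = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def lint_llms_txt(text, max_lines=100, max_links=10):
    """Return a list of warnings for an llms.txt body (illustrative only)."""
    warnings = []

    # Count non-blank lines only, so spacing does not inflate the total.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) > max_lines:
        warnings.append(f"{len(lines)} non-blank lines; aim for {max_lines} or fewer")

    links = MD_LINK.findall(text)
    if len(links) > max_links:
        warnings.append(f"{len(links)} links listed; aim for {max_links} or fewer")

    # Absolute URLs need both a scheme and a host.
    for url in links:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            warnings.append(f"relative or non-HTTP link: {url}")

    return warnings
```

An empty list means none of these particular checks fired. It does not mean the file is well written, so a human read-through is still worthwhile.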
Common Mistakes to Avoid
We see these errors frequently when auditing llms.txt files:
- Making it too long or too generic — A vague description helps no one. Be specific about what makes your content unique.
- Forgetting to update it — An outdated llms.txt with broken links or discontinued products actively hurts your AI citations.
- Blocking AI crawlers while having llms.txt — It does not matter how good your llms.txt is if your robots.txt blocks AI bots from reading your content.
- Using HTML or complex formatting — llms.txt should be plain text with simple Markdown. Complex formatting confuses parsers.
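The robots.txt conflict is the easiest of these mistakes to test for. The sketch below uses Python's standard `urllib.robotparser` module to report which AI crawlers your robots.txt rules would block from a given URL. The user-agent names in `AI_AGENTS` are an illustrative, incomplete list, so check each vendor's documentation for the tokens it currently uses. The helper name `blocked_agents` is our own.

```python
from urllib.robotparser import RobotFileParser

# Illustrative list only; vendors add and rename crawlers over time.
AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_agents(robots_txt, url, agents=AI_AGENTS):
    """Return the agents that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


example_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_agents(example_robots, "https://yourdomain.com/llms.txt"))
```

With the example rules, only GPTBot is reported as blocked. Paste in your own robots.txt contents, and check a few of your key page URLs as well as /llms.txt itself.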
Testing Your llms.txt
After creating your llms.txt, verify it works correctly. Check that it is accessible at yourdomain.com/llms.txt, returns a 200 status code, and contains all the essential sections.
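If you would rather script this check, a small Python sketch using only the standard library might look like the following. The names `check_llms_txt` and `fetch_and_check` are hypothetical, and the `REQUIRED_SECTIONS` list simply mirrors the section names used in this guide's example, so adjust it to match your own file.

```python
import urllib.error
import urllib.request

# Section names from the example in this guide; adjust to match your file.
REQUIRED_SECTIONS = ["About", "Key Pages", "Citation Preference", "Contact"]


def check_llms_txt(status, body):
    """Return a list of problems for an llms.txt response (empty if none)."""
    if status != 200:
        return [f"expected HTTP 200, got {status}"]

    problems = []
    lines = [ln.strip() for ln in body.splitlines()]
    if not any(ln.startswith("# ") for ln in lines):
        problems.append("missing '# Title' line")

    headings = {ln[3:].strip() for ln in lines if ln.startswith("## ")}
    for name in REQUIRED_SECTIONS:
        if name not in headings:
            problems.append(f"missing section: {name}")
    return problems


def fetch_and_check(base_url):
    """Fetch <base_url>/llms.txt over HTTP and run the checks above."""
    url = base_url.rstrip("/") + "/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return check_llms_txt(resp.status, body)
    except urllib.error.HTTPError as err:
        # Non-2xx responses (404, 403 from a CDN, and so on) land here.
        return check_llms_txt(err.code, "")
    except urllib.error.URLError as err:
        return [f"could not reach {url}: {err.reason}"]
```

Running `fetch_and_check("https://yourdomain.com")` returns an empty list when the file is reachable and contains every expected section. Keeping the checks in `check_llms_txt` separate from the network call means you can also run them against a local draft before publishing.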
Use our llms.txt check to validate your file and get specific improvement recommendations.
Or run a complete GEO audit to check all your AI search signals.