
The llms.txt File: Why Every SaaS Website Needs One

Most SaaS sites tell Google how to crawl them but give AI engines nothing to work with. Here’s the file that fixes that in 30 minutes.

Author: Shanal Govender
Date: March 16, 2026

Your website has a robots.txt file. It has had one since the mid-90s. That file tells search engine crawlers which parts of your site to crawl and which to leave alone. Every SaaS company has one because every SaaS company understands that if you want search engines to index your site correctly, you need to give them instructions.

Now ask yourself: what instructions does your website give ChatGPT?

For most SaaS companies, the answer is nothing. Your site gives AI engines the digital equivalent of a 500-page company wiki with no table of contents. Thousands of HTML pages, nested navigation, gated content, JavaScript-rendered pricing calculators, and interactive elements that large language models were never designed to parse. Good luck, ChatGPT. Figure it out.

That is the problem llms.txt solves. It is a markdown file that sits at your domain root and gives AI engines a structured summary of what your company does, what your product is, and where to find the content that matters most. The specification was proposed by Jeremy Howard in September 2024. Adoption is still early, which means the SaaS companies implementing it right now are quietly building an advantage that competitors will spend months trying to replicate.

The Problem llms.txt Solves

Google’s crawler is sophisticated. It renders JavaScript, follows internal links, processes structured data, and indexes pages based on three decades of refinement. It understands your website because it was purpose-built to understand websites.

AI engines are not crawlers. When ChatGPT, Perplexity, or Google’s AI Mode needs to understand your product, it does not methodically index every page the way Googlebot does. It works with whatever content it can extract within a limited context window. And complex SaaS websites create real problems for that process.

“Most SaaS websites were built to convert humans, not to be understood by AI models. Your homepage has animations, your docs have sidebar navigation, your pricing page has interactive toggles. None of that translates to how a language model processes information. You’re essentially making the AI guess what your product does.”
— Shanal Govender, Senior GTM Consultant @ Empact Partners

The result is predictable. AI engines either misrepresent your product, recommend a competitor that was easier to parse, or leave you out of the conversation entirely. As AI summaries increasingly answer queries directly without sending users to click through, invisibility in AI search means invisibility to a growing share of your market.

What trips models up on a typical SaaS site:

JavaScript-heavy pages that require rendering before content is accessible
Gated content behind login walls, email capture forms, or demo request flows
Nested documentation with sidebar navigation and multi-level page hierarchies
Dynamic pricing with interactive calculators, toggles, and conditional logic
Duplicate content spread across landing pages, feature pages, and campaign URLs

This is not a theoretical problem on a product roadmap somewhere. It is happening right now, across every B2B SaaS vertical. And the fix takes about 30 minutes.

What llms.txt Actually Is

An llms.txt file is a markdown document that lives at yourdomain.com/llms.txt. It follows a simple, open specification designed to give large language models a clean, structured overview of your website’s most important content.

Think of llms.txt as the executive summary your website never had. Instead of making AI models parse thousands of HTML pages, you hand them a two-page brief with links to what actually matters.

The format is intentionally minimal. The llms.txt specification defines four components:

H1 heading: Your company or project name. This is the only required element.
Blockquote summary: A two-to-three sentence description of what you do, written in plain language.
Body sections: Additional context about your product, positioning, or key differentiators.
H2 file lists: Categorized markdown links to your most important pages, each with a one-line description.
“The beauty of llms.txt is that it does not require engineering resources or a CMS overhaul. It is a markdown file. If you can write a README, you can write an llms.txt. The barrier to entry is almost embarrassingly low, which is exactly why so few companies have done it.”
— Shanal Govender, Senior GTM Consultant @ Empact Partners

That is the entire format. No schema markup. No JSON-LD. No complex syntax. Just a clean markdown file that tells AI models what your site is about, what your product does, and where to find the content that defines your brand.

The spec also recommends providing markdown versions of your key HTML pages. If your pricing lives at /pricing, a markdown equivalent at /pricing.md gives AI models a clean alternative to parsing your frontend code. This is optional but compounds the benefit significantly.
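Generating those markdown twins does not need tooling beyond the standard library. A minimal sketch, assuming you have the rendered HTML on hand (a real pipeline would export from your CMS or use a dedicated HTML-to-markdown converter):

```python
# Sketch: produce a plain-text twin of an HTML page so AI models can read
# it without rendering your frontend. Stdlib only; tag list is illustrative.
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/nav/footer blocks."""

    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.depth = 0  # nesting level inside skipped tags

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML fragment, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)


print(html_to_text("<h1>Pricing</h1><script>track()</script><p>Pro: $49/mo</p>"))
```

Run this against your pricing or docs pages and publish the output alongside the HTML versions.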

How To Build Yours in 30 Minutes

This is not a multi-sprint initiative that requires a product manager, two engineers, and a quarterly planning cycle. You can create and deploy a functional llms.txt file in a single sitting.

Write Your Brand Summary

Start with a single H1 heading (your company name) and a blockquote that summarizes what you do in two to three sentences. Write it the way you would explain your product to someone at a conference who just asked what your company does. Not the elevator pitch. Not the investor deck. The honest, plain-language version of what problem you solve and how you solve it.

“The summary is the single most important part of the file. If an AI model reads nothing else, this is what it uses to decide whether your product is relevant to a query. Skip the marketing language. Say what you actually do, who you do it for, and what makes your approach different.”
— Shanal Govender, Senior GTM Consultant @ Empact Partners

List Your Key URLs

Under H2 sections, list the 10 to 20 most important pages on your site. Group them by category (Docs, Product, Resources) and add a one-line description after each link. Prioritize pages that contain original information an AI model cannot find anywhere else: product documentation, feature comparisons, case studies with real results, and pricing details.

Deploy and Reference

Save the file as llms.txt at your domain root (yourdomain.com/llms.txt). Then add a reference to it in your robots.txt file so crawlers and models know it exists. That is the entire deployment.
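Note that the Robots Exclusion Protocol has no official directive for llms.txt, so the reference is an informal comment rather than a standard field. One common pattern:

```txt
# robots.txt
User-agent: *
Allow: /

# Informal pointer for AI models (not part of the robots standard):
# llms.txt: https://yourdomain.com/llms.txt
```

Crawlers that do not understand the comment simply ignore it, so there is no downside to adding it.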

The hardest part of creating llms.txt is not the formatting. It is choosing which 10 to 20 pages actually define your product.

Here is a template you can adapt for your own site:

# YourCompany

> YourCompany is a [category] platform that helps
> [audience] [achieve specific outcome]. Founded in
> [year], the product [key differentiator].

YourCompany serves [target market] by providing
[core value proposition]. Key use cases include
[use case 1], [use case 2], and [use case 3].

## Docs

- [Product Documentation](https://yourco.com/docs):
  Complete technical docs and API reference
- [Getting Started](https://yourco.com/docs/quickstart):
  Setup guide for new users

## Product

- [Features](https://yourco.com/features):
  Core capabilities and use cases
- [Pricing](https://yourco.com/pricing):
  Plans, pricing tiers, and feature comparison
- [Integrations](https://yourco.com/integrations):
  Supported third-party integrations

## Resources

- [Case Studies](https://yourco.com/customers):
  Results with measurable outcomes
- [YourCo vs Competitor](https://yourco.com/compare):
  Feature-by-feature comparison
- [Blog](https://yourco.com/blog):
  Industry insights and product updates

“We have rolled this out across multiple partner accounts over the past few weeks. The implementation genuinely takes 30 minutes. The only hard part is deciding which 15 to 20 pages actually deserve to be there, because that forces you to confront which content on your site is genuinely differentiated and which is filler.”
— Shanal Govender, Senior GTM Consultant @ Empact Partners

What To Include (And What To Skip)

The goal of llms.txt is signal density. Every link and every description should give the AI model something genuinely useful for understanding and recommending your product. This is not a sitemap. It is a curated brief.

If a page would not help an AI model accurately recommend your product to someone asking for a solution in your category, it does not belong in your llms.txt file.

Pages that belong in your llms.txt:

Product and feature pages that explain what your product actually does
Pricing page with plan details and tier comparisons
Documentation and API references that demonstrate technical depth
Case studies with real numbers, real partners, and real timelines
Comparison pages where you control the narrative against competitors
Blog posts with original data or proprietary research that cannot be found elsewhere

Now for the other side. These pages add noise without adding signal:

Generic blog content that restates widely available information
Duplicate landing pages from ad campaigns or A/B tests
Login and account pages with no public-facing value
Category archive pages that just list other content without adding context
Internal tooling or admin pages that should not be public-facing
“The biggest mistake is treating llms.txt like a sitemap. A sitemap is comprehensive. llms.txt is selective. You are not listing every page on your site. You are curating the 10 to 20 URLs that, if an AI model only ever saw those pages, would give it an accurate picture of your product, your positioning, and your value.”
— Shanal Govender, Senior GTM Consultant @ Empact Partners

If you find yourself including more than 20 URLs, you are probably including pages that dilute the signal rather than strengthen it. Edit ruthlessly.
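The curation rules above are easy to check mechanically. A minimal sketch, assuming the simple structure described in this article (H1, blockquote summary, markdown links) and a 20-link ceiling you can adjust:

```python
# Sketch: sanity-check an llms.txt file against the curation rules above.
import re


def check_llms_txt(text: str, max_links: int = 20) -> list[str]:
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    # The H1 heading is the only required element of the spec.
    if not re.search(r"^# \S", text, re.MULTILINE):
        problems.append("missing H1 heading (the only required element)")
    # The blockquote summary is what a model falls back on first.
    if not re.search(r"^> \S", text, re.MULTILINE):
        problems.append("missing blockquote summary")
    # Count markdown links: [label](url)
    links = re.findall(r"\[([^\]]+)\]\(([^)]+)\)", text)
    if len(links) > max_links:
        problems.append(
            f"{len(links)} links; consider trimming to {max_links} or fewer"
        )
    return problems


sample = (
    "# YourCompany\n\n"
    "> A platform.\n\n"
    "## Docs\n\n"
    "- [Docs](https://yourco.com/docs): API reference\n"
)
print(check_llms_txt(sample))  # → []
```

Wire this into CI and the file stays a curated brief instead of drifting back toward a sitemap.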

llms.txt Is One Layer of a Bigger Strategy

An llms.txt file will not single-handedly make your brand the top recommendation in ChatGPT. Anyone who tells you otherwise is selling something (probably a course). But it removes a real layer of friction from the AI’s ability to understand and reference your brand accurately.

The companies seeing measurable results in AI engine optimization are stacking three layers. Each one reinforces the others:

llms.txt tells AI models what content on your site matters most and how your product is positioned
Schema markup tells AI models what your content means at a structural level
A mention and citation strategy tells AI models that trusted third-party sources vouch for your brand

Remove any one layer and the other two have to work harder for less impact. llms.txt handles the on-site comprehension layer. Schema handles the semantic meaning layer. Mentions handle the trust and authority layer. Together, they give AI engines everything they need to understand, validate, and recommend your product with confidence.
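For the schema layer, a minimal JSON-LD sketch using schema.org's SoftwareApplication type (all values here are placeholders to replace with your own):

```json
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "YourCompany",
  "description": "A [category] platform that helps [audience] [achieve outcome].",
  "applicationCategory": "BusinessApplication",
  "offers": {
    "@type": "Offer",
    "price": "49.00",
    "priceCurrency": "USD"
  }
}
```

Embedded in a script tag of type application/ld+json on the relevant page, this gives crawlers and models the same facts your llms.txt summarizes, in machine-readable form.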

“No single optimization makes you visible in AI search overnight. But when you combine llms.txt with schema markup and a real mention strategy, you remove friction at every level. The AI can find your content, understand its meaning, and verify it against third-party sources. That compound effect is where the results come from.”
— Shanal Govender, Senior GTM Consultant @ Empact Partners

This is still early. The vast majority of SaaS companies have not even heard of llms.txt, let alone implemented it. The companies doing it now are building a structural advantage in how AI engines understand and recommend their product. Unlike most technical optimizations, this one takes 30 minutes, costs nothing, and has zero downside.

Add the file. Tell AI engines what your product is. Stop making them guess.

If you want help implementing llms.txt as part of a broader AI engine optimization strategy for your SaaS, book a call with our team. We have been rolling this out for SaaS partners since before most companies knew they needed to.
