Free SEO Tool

Robots.txt generator

Pick a preset or build your own. Block AI training crawlers, allow search engines, add your sitemap — download a ready-to-deploy robots.txt in seconds.

Adding your sitemap URL helps search engines discover your pages faster.

Save this as robots.txt at the root of your site — e.g. yoursite.com/robots.txt.

Need an SEO strategy, not just config?

pseo pro generates up to 300 ranked page specs from your URL. $19 one-time.

What robots.txt actually controls

robots.txt is the first file most crawlers check when they visit your site. It's a plain text file at yoursite.com/robots.txt that tells bots which URLs they can fetch. It's a hint, not an enforcement mechanism — well-behaved crawlers like Googlebot respect it, but anyone can choose to ignore it.

Two things robots.txt is great for: (1) keeping crawlers out of areas that waste crawl budget (admin, cart, checkout), and (2) signaling which bots you allow. It's not a good tool for blocking indexing — use a noindex meta tag for that. Blocking a page in robots.txt can actually cause it to be indexed without content, because Google sees the link but can't read the page.
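A minimal file covering the crawl-budget case looks like this (the paths and sitemap URL are placeholders for your own):

```
User-agent: *
Disallow: /admin/
Disallow: /cart
Disallow: /checkout

Sitemap: https://yoursite.com/sitemap.xml
```

Each `Disallow` line blocks one URL path prefix for the crawlers named in the `User-agent` line above it.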

When to use each preset

Allow all

Your default if you want maximum discoverability. Every crawler (search engines, AI training bots, scrapers) can access every URL. Best for new sites that need indexing as fast as possible.
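An allow-all robots.txt is as simple as:

```
User-agent: *
Allow: /
```

An empty `Disallow:` line under `User-agent: *` is equivalent — both mean no URL is off-limits.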

Standard SEO

Allows all crawlers but blocks common private paths: /admin/, /api/, /cart, /checkout, /account/, /_next/. Safe default for most SaaS and ecommerce sites.
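In robots.txt form, that policy looks like:

```
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /cart
Disallow: /checkout
Disallow: /account/
Disallow: /_next/
```

Everything not listed stays crawlable, so your marketing and content pages are unaffected.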

Block AI crawlers

Allows search engines but blocks LLM training bots (GPTBot, ClaudeBot, Google-Extended, CCBot, PerplexityBot, etc.). Use if you don't want your content used for AI training — but note this also reduces your visibility in AI search answers.
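A sketch of this policy (listing a few of the bots the preset covers) looks like:

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```

Crawlers follow the most specific group that matches their name, so GPTBot obeys its own `Disallow: /` block while Googlebot falls through to the permissive `User-agent: *` group.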

Block all

Asks every crawler to stay out of your entire site. Useful for staging environments, private beta sites, or admin-only domains. Never use this on production if you want traffic.
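The entire file for this preset is two lines:

```
User-agent: *
Disallow: /
```

A bare `/` as the disallowed path matches every URL on the domain.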

Frequently asked questions

What is a robots.txt file?

A plain text file at the root of your domain (yoursite.com/robots.txt) that tells search engine crawlers which pages they can and cannot access. It's the first file most crawlers request when visiting your site.

Where do I put robots.txt on my site?

robots.txt must live at the root of your domain, accessible at yoursite.com/robots.txt. It must be served with Content-Type text/plain. On Next.js, place it in the public folder or generate it dynamically with app/robots.ts.
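For the dynamic option, Next.js (App Router) lets you export a robots object from app/robots.ts and serves the generated file at /robots.txt for you. A minimal sketch, with yoursite.com as a placeholder domain:

```typescript
import type { MetadataRoute } from 'next'

// Next.js serves the returned object as /robots.txt (Content-Type: text/plain)
export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/admin/',
    },
    sitemap: 'https://yoursite.com/sitemap.xml',
  }
}
```

The static public/robots.txt approach works just as well; the dynamic file is mainly useful when the rules depend on the environment (e.g. disallow everything on staging).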

Does robots.txt block Google from indexing a page?

No — robots.txt blocks crawling, not indexing. If other sites link to a blocked page, Google can still index it without content. To truly prevent indexing, use a noindex meta tag or HTTP header on the page itself (and don't block it in robots.txt, or Google can't see the noindex).
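To keep a page out of the index, add the robots meta tag to the page itself:

```html
<!-- In the page's <head> -->
<meta name="robots" content="noindex">
```

For non-HTML resources like PDFs, the equivalent is the `X-Robots-Tag: noindex` HTTP response header. Either way, the page must remain crawlable so Google can see the directive.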

Should I block AI crawlers like GPTBot and ClaudeBot?

It depends on your goal. If you don't want your content used to train LLMs, block them. But note: blocking AI crawlers also means your content is less likely to be cited in AI search results like ChatGPT, Perplexity, or Claude. Many sites choose to allow AI crawlers specifically to be discoverable in AI answers. The 'Block AI crawlers' preset here covers the major training bots (GPTBot, ClaudeBot, Google-Extended, CCBot, etc.).

What does 'User-agent: *' mean?

The asterisk (*) is a wildcard meaning 'all crawlers.' Rules under 'User-agent: *' apply to every bot unless that bot has a more specific set of rules elsewhere in the file. You can target a specific crawler by name (e.g., 'User-agent: Googlebot').
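For example (the /drafts/ path is illustrative):

```
# Applies to every crawler without a more specific group
User-agent: *
Disallow: /drafts/

# Googlebot matches this group instead, so the rule above
# does not apply to it — the empty Disallow allows everything
User-agent: Googlebot
Disallow:
```

Note that groups don't combine: a bot follows only the single most specific group that matches its name.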

Do I need to include Sitemap: in robots.txt?

It's not required but highly recommended. Adding a Sitemap: line helps search engines discover your sitemap without needing to find it via Google Search Console. You can include multiple Sitemap: lines if you have more than one.
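For example, a site with a main sitemap and a blog sitemap would list both:

```
Sitemap: https://yoursite.com/sitemap.xml
Sitemap: https://yoursite.com/blog/sitemap.xml
```

Sitemap URLs must be absolute (including the https:// scheme), and the lines can appear anywhere in the file — they aren't tied to any User-agent group.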

Ready to rank your site at scale?

Generate up to 300 SEO page specs tailored to your product for $19. One-time payment, no subscription.