robots.txt Generator

Generate a robots.txt with allow/disallow rules and sitemap.

robots.txt is a plain-text file placed at your site root (/robots.txt) that tells search-engine crawlers which paths they may crawl. This generator turns a User-agent, Allow/Disallow rules, an optional Crawl-delay and one or more Sitemap URLs into a standard, ready-to-paste robots.txt.

Start from a preset — allow all, block all, or the WordPress default — then add or remove paths to fine-tune it. Everything is processed in your browser and nothing is sent to a server.

Presets

User-agent

The crawler the rules apply to. Use * for all bots. Examples: Googlebot, Bingbot.

Allow / Disallow rules

Paths start with a slash (/). An empty Disallow value means 'allow everything'.

Crawl-delay (optional)

Seconds to wait between requests. Google ignores it; only some crawlers (Bing/Yandex) honor it. Leave blank to omit.

Sitemap URL (multiple allowed)

Enter one absolute URL per line. Example: https://example.com/sitemap.xml

Generated robots.txt

User-agent: *
Disallow:
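As a rough illustration of how the form fields above could map onto the output, here is a minimal Python sketch. The function name build_robots_txt and its parameters are hypothetical and are not taken from this tool's actual code; the sketch only mirrors the inputs described on this page.

```python
def build_robots_txt(user_agent="*", rules=(), crawl_delay=None, sitemaps=()):
    """Assemble robots.txt text from form-style inputs (illustrative only).

    rules is a sequence of (directive, path) pairs,
    for example ("Disallow", "/admin/").
    """
    lines = [f"User-agent: {user_agent}"]
    if not rules:
        # An empty Disallow value means "allow everything".
        lines.append("Disallow:")
    for directive, path in rules:
        if directive not in ("Allow", "Disallow"):
            raise ValueError(f"unknown directive: {directive!r}")
        if path and not path.startswith("/"):
            raise ValueError(f"path must start with '/': {path!r}")
        lines.append(f"{directive}: {path}")
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemaps:
        lines.append("")  # blank line before the sitemap list
        lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines) + "\n"
```

Called with no arguments, this sketch returns the same two-line "allow all" file shown above. Passing rules such as [("Disallow", "/admin/")] and a sitemap URL adds the matching lines in the order given.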

robots.txt controls crawling, not indexing

The most common misconception is that Disallow removes a page from search results. It does not. robots.txt only stops a crawler from fetching a URL; it does not prevent indexing. If many external links point to that URL, Google can still index the URL without reading its content and show it in results as "No information is available for this page."

To reliably keep a page out of search results, use noindex on the page itself, not robots.txt, via the meta tag <meta name="robots" content="noindex"> or the HTTP header X-Robots-Tag: noindex. Because a crawler must be able to fetch the page to see the noindex, blocking that same URL in robots.txt while also setting noindex is contradictory.
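To check which pages actually carry a noindex signal, something like the following Python sketch could be used. The helper name has_noindex is hypothetical, and the check is deliberately simplified: it looks only for a plain noindex token in the robots meta tag or the X-Robots-Tag header and ignores bot-specific variants.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            tokens = [t.strip().lower() for t in attr.get("content", "").split(",")]
            if "noindex" in tokens:
                self.noindex = True


def has_noindex(html, headers=None):
    """Return True if the header or the meta tag asks for noindex."""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            if "noindex" in [t.strip().lower() for t in value.split(",")]:
                return True
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.noindex
```

A page for which has_noindex returns True should stay crawlable in robots.txt; otherwise the crawler never gets to read the signal.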

Directives at a glance

User-agent: the crawler the rules apply to. * means all bots; Googlebot targets only Google's crawler.
Disallow: a path to block from crawling. It matches by prefix (e.g. /admin/). An empty value allows everything.
Allow: an exception inside a disallowed area. The more specific rule wins.
Crawl-delay: seconds between requests. Google ignores it; only some crawlers (Bing/Yandex) honor it.
Sitemap: the absolute URL of a sitemap. You can list several on separate lines.

Common mistakes

Accidentally blocking the whole site with Disallow: /, which stops all crawling and usually causes pages to drop out of search results. Always verify after deploy.
Blocking CSS/JS so Google can't render the page properly. Don't disallow your static assets.
Missing leading slashes or wrong casing. Paths are case-sensitive and must always start with /.

After deploying, use the robots.txt checker to confirm the live file behaves as intended, and the sitemap validator to verify the sitemaps you declared.
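The mistakes listed above can also be caught before deploying with a small check. The Python sketch below is one possible approach, not the checker tool mentioned on this page; the function name lint_robots_txt and the warning messages are made up for illustration, and the CSS/JS test is a rough heuristic based on the file extension.

```python
def lint_robots_txt(text):
    """Return a list of (line_number, message) warnings."""
    warnings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in ("allow", "disallow"):
            continue  # User-agent, Sitemap, Crawl-delay are not checked here
        if field == "disallow" and value == "/":
            warnings.append((number, "blocks the whole site"))
        elif value and not value.startswith("/"):
            warnings.append((number, "path does not start with '/'"))
        elif field == "disallow" and value.rstrip("$").lower().endswith((".css", ".js")):
            warnings.append((number, "blocks CSS/JS needed for rendering"))
    return warnings
```

Running it on the generated text before upload gives a quick list of line numbers to review. It does not detect casing mistakes, since those depend on the real paths on your site.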

Wildcard `*` and `$` pattern table

Googlebot and Bingbot support two patterns that aren't in the original standard: * matches any run of characters (zero or more) and $ anchors to the end of the URL. Combine them to target file extensions or query strings precisely. Remember every match is anchored to the start of the path.

Rule	Meaning	Blocked / not blocked
`Disallow: /*.pdf$`	Any URL whose path ends in `.pdf`	Blocks `/docs/a.pdf` · not `/a.pdf?v=2`
`Disallow: /*?`	Any URL containing a query string	Blocks `/search?q=x` · not `/search`
`Disallow: /private`	Prefix match (no trailing `$`)	Blocks both `/private/` and `/privately`
`Disallow: /*/print`	A path with any folder in between	Blocks `/blog/print` · not `/print`
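The rows in the table can be reproduced with a short Python sketch that translates a pattern into a regular expression. This is an illustrative approximation of the matching described here, not any crawler's actual implementation; the names pattern_to_regex and matches are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def matches(pattern, url_path):
    """True if the pattern matches the URL path (including any query string)."""
    # re.match anchors at the start, mirroring the start-of-path rule.
    return pattern_to_regex(pattern).match(url_path) is not None
```

For example, matches("/*.pdf$", "/docs/a.pdf") is True, while matches("/*.pdf$", "/a.pdf?v=2") is False because the query string follows the extension.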

How Allow vs Disallow conflicts resolve (walkthrough)

When both an Allow and a Disallow match the same URL, Google obeys the rule with the longer (more specific) path pattern. If the lengths tie, the least restrictive rule (Allow) wins. Consider this block:

User-agent: *
Disallow: /folder/             # 8 characters
Allow: /folder/public.html     # 19 characters

For /folder/public.html, both rules match but the Allow pattern is longer, so the URL is crawlable. Meanwhile /folder/secret.html matches only Disallow and is blocked. The key takeaway: it is not "the first line wins" but pattern length that decides — reordering the lines changes nothing.
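The longest-match rule from this walkthrough can be sketched in Python as follows. It is a simplified model under stated assumptions: pattern length is measured in characters, only one user-agent group is considered, and the names is_allowed and _matches are hypothetical.

```python
import re


def _matches(pattern, path):
    """Prefix match with '*' wildcards and an optional trailing '$'."""
    anchored_end = pattern.endswith("$")
    core = pattern[:-1] if anchored_end else pattern
    regex = ".*".join(re.escape(part) for part in core.split("*"))
    return re.match(regex + ("$" if anchored_end else ""), path) is not None


def is_allowed(rules, path):
    """rules: list of (directive, pattern) pairs for one user-agent group."""
    best_length, allowed = -1, True  # no matching rule means crawlable
    for directive, pattern in rules:
        if not pattern or not _matches(pattern, path):
            continue  # an empty pattern matches nothing
        is_allow = directive.lower() == "allow"
        # Longer pattern wins; on a tie, Allow (least restrictive) wins.
        if len(pattern) > best_length or (len(pattern) == best_length and is_allow):
            best_length, allowed = len(pattern), is_allow
    return allowed
```

With the two rules from the block above, is_allowed returns True for /folder/public.html and False for /folder/secret.html, and swapping the order of the rules gives the same results.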

Common pitfall

Treating Disallow: /folder (no trailing slash) as equivalent to Disallow: /folder/. The former is a prefix match, so it also blocks /folder-archive and /folderX.html, quietly dropping pages you never meant to hide. To block only what's inside a specific folder, always add the trailing /.

Frequently asked questions

If I block a page in robots.txt, will it disappear from Google?

No. robots.txt only blocks crawling (fetching the page). If external links point to it, the URL can still be indexed without its content. To reliably remove it from search, use a noindex meta tag or X-Robots-Tag header on the page.

What's the difference between Disallow and noindex?

Disallow is a crawl directive meaning 'don't fetch this URL,' while noindex is an indexing directive meaning 'don't index this page.' For noindex to work the crawler must be able to read the page, so if the same URL is also disallowed, the crawler never fetches the page and never sees the noindex.

Where do I put the robots.txt file?

It must live at the domain root (https://example.com/robots.txt). Crawlers won't read it from a subfolder, and each subdomain needs its own robots.txt.
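To find which robots.txt governs a given page, the path and query can be replaced with /robots.txt while the scheme and host stay as they are. A minimal Python sketch, with the hypothetical helper name robots_txt_url:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that applies to an absolute page URL."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL: {page_url!r}")
    # Keep scheme and host (including any subdomain or port), drop the rest.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))
```

Because the host is kept as-is, a page on a subdomain resolves to that subdomain's own robots.txt rather than the one on the main domain.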

Do I need a Crawl-delay?

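It's optional. Googlebot ignores Crawl-delay and instead adjusts its crawl rate automatically based on how your server responds; the crawl-rate limiter that Search Console once offered has been retired. Some other crawlers, such as Bing or Yandex, honor Crawl-delay, which makes it useful when server load is a concern.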

Are the paths I enter sent to a server?

No. Everything is generated entirely in your browser and your input is never transmitted.

Related guides

How to Write robots.txt: Syntax, Examples and Common Mistakes
robots.txt rule syntax, ready-made examples by site type, and the common mistakes that break indexing.

Creating and Submitting sitemap.xml to Speed Up Indexing
Sitemap format, using lastmod correctly, how to submit to each search engine, and common errors.