User Agents

Disallow Paths

Examples: /admin/, /private/, *.pdf, /search?*

Allow Paths

Examples: /public/, /images/, /css/

Crawl Settings

Delay between requests (0-86400 seconds). Leave empty for no delay.

Sitemap

Full URL to your XML sitemap
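
For illustration, filling in the fields above with the example values shown (plus a hypothetical 10-second delay and a hypothetical sitemap URL of https://example.com/sitemap.xml) might produce a file along these lines:

  # Rules for all crawlers
  User-agent: *
  Disallow: /admin/
  Disallow: /private/
  Disallow: /*.pdf
  Disallow: /search?*
  Allow: /public/
  Allow: /images/
  Allow: /css/
  Crawl-delay: 10

  Sitemap: https://example.com/sitemap.xml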

About Robots.txt

Robots.txt is a plain-text file that tells crawlers how a website should be crawled. It implements the Robots Exclusion Protocol, the standard that sites use to tell bots which parts of the site they may crawl and index and which areas they should leave alone, such as duplicate content or sections that are still under development.
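
At its simplest, the file is just a user-agent line followed by path rules. The minimal sketch below lets every crawler visit everything except a hypothetical /drafts/ directory standing in for content that is still under development:

  # All crawlers: crawl everything except work in progress
  User-agent: *
  Disallow: /drafts/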

Search Engine Crawlers

Major search engines such as Google, Bing, and Yahoo respect robots.txt directives. This gives you a degree of control over what gets crawled and helps make effective use of your crawl budget.
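
Rules can also be scoped to individual crawlers by name. In the sketch below (the paths are hypothetical), Googlebot and Bingbot may crawl everything, while all other crawlers are asked to skip a /beta/ section; each crawler follows the group that most specifically matches its user-agent string:

  # Googlebot and Bingbot: no restrictions
  User-agent: Googlebot
  User-agent: Bingbot
  Disallow:

  # Everyone else: skip the beta area
  User-agent: *
  Disallow: /beta/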

Security Considerations

While well-behaved crawlers honor robots.txt, malicious bots may simply ignore it. Don't rely on robots.txt alone for security; protect sensitive areas with proper authentication and access controls.
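
Remember that robots.txt is publicly readable and purely advisory. A file like the hypothetical one below does not restrict access to /internal-admin/ in any way; it only asks crawlers to stay out, and it also advertises the path to anyone who reads the file:

  User-agent: *
  Disallow: /internal-admin/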

Common Directives

  • User-agent: Specifies which crawler the following rules apply to
  • Disallow: Blocks crawling of a path prefix or pattern
  • Allow: Permits crawling of a path (a more specific Allow overrides a broader Disallow)
  • Crawl-delay: Asks a crawler to wait between requests (not honored by all crawlers; Google ignores it)
  • Sitemap: Points crawlers to your XML sitemap (see the example after this list)
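
The sketch below (paths and values are hypothetical) shows each directive in context; note how the more specific Allow rule re-opens one subdirectory inside an otherwise disallowed area:

  User-agent: *
  Disallow: /private/
  # The more specific Allow wins over the broader Disallow above
  Allow: /private/press-kit/
  Crawl-delay: 5

  Sitemap: https://example.com/sitemap.xml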

Best Practices

  • Place robots.txt in your website's root directory
  • Use specific paths rather than broad wildcards when possible (see the example after this list)
  • Test your robots.txt with Google Search Console
  • Include your sitemap URL for better indexing
  • Regularly review and update your robots.txt file
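
As an illustration of the wildcard advice above, both of the hypothetical rules below target PDF files, but the first is scoped to a single directory while the second would keep every PDF on the site from being crawled:

  User-agent: *
  # Specific: only PDFs under /reports/ are blocked
  Disallow: /reports/*.pdf$
  # Broad: this would block every PDF on the site
  # Disallow: /*.pdf$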