User Agents

Disallow Paths

Examples: /admin/, /private/, *.pdf, /search?*

Allow Paths

Examples: /public/, /images/, /css/

Crawl Settings

Delay between requests (0-86400 seconds). Leave empty for no delay.

Sitemap

Full URL to your XML sitemap
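
For illustration, filling in the fields above with the example values shown (plus a hypothetical 10-second delay and a hypothetical sitemap URL of https://example.com/sitemap.xml) might produce a file along these lines:

  # Rules for all crawlers
  User-agent: *
  Disallow: /admin/
  Disallow: /private/
  Disallow: /*.pdf
  Disallow: /search?*
  Allow: /public/
  Allow: /images/
  Allow: /css/
  Crawl-delay: 10

  Sitemap: https://example.com/sitemap.xml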

About Robots.txt

Robots.txt is a plain-text file that tells crawlers how a website should be crawled. It implements the Robots Exclusion Protocol, the standard that sites use to tell bots which parts of the site they may crawl and index and which areas they should leave alone, such as duplicate content or sections that are still under development.
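
At its simplest, the file is just a user-agent line followed by path rules. The minimal sketch below lets every crawler visit everything except a hypothetical /drafts/ directory standing in for content that is still under development:

  # All crawlers: crawl everything except work in progress
  User-agent: *
  Disallow: /drafts/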

Search Engine Crawlers

Major search engines such as Google, Bing, and Yahoo respect robots.txt directives. This gives you a degree of control over what gets crawled and helps make effective use of your crawl budget.
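
Rules can also be scoped to individual crawlers by name. In the sketch below (the paths are hypothetical), Googlebot and Bingbot may crawl everything, while all other crawlers are asked to skip a /beta/ section; each crawler follows the group that most specifically matches its user-agent string:

  # Googlebot and Bingbot: no restrictions
  User-agent: Googlebot
  User-agent: Bingbot
  Disallow:

  # Everyone else: skip the beta area
  User-agent: *
  Disallow: /beta/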

Security Considerations

While well-behaved crawlers honor robots.txt, malicious bots may simply ignore it. Don't rely on robots.txt alone for security; protect sensitive areas with proper authentication and access controls.
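
Remember that robots.txt is publicly readable and purely advisory. A file like the hypothetical one below does not restrict access to /internal-admin/ in any way; it only asks crawlers to stay out, and it also advertises the path to anyone who reads the file:

  User-agent: *
  Disallow: /internal-admin/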

Common Directives

  • User-agent: Specifies which crawler the following rules apply to
  • Disallow: Blocks crawling of a path prefix or pattern
  • Allow: Permits crawling of a path (a more specific Allow overrides a broader Disallow)
  • Crawl-delay: Asks a crawler to wait between requests (not honored by all crawlers; Google ignores it)
  • Sitemap: Points crawlers to your XML sitemap (see the example after this list)
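
The sketch below (paths and values are hypothetical) shows each directive in context; note how the more specific Allow rule re-opens one subdirectory inside an otherwise disallowed area:

  User-agent: *
  Disallow: /private/
  # The more specific Allow wins over the broader Disallow above
  Allow: /private/press-kit/
  Crawl-delay: 5

  Sitemap: https://example.com/sitemap.xml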

Best Practices

  • Place robots.txt in your website's root directory
  • Use specific paths rather than broad wildcards when possible (see the example after this list)
  • Test your robots.txt with Google Search Console
  • Include your sitemap URL for better indexing
  • Regularly review and update your robots.txt file
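
As an illustration of the wildcard advice above, both of the hypothetical rules below target PDF files, but the first is scoped to a single directory while the second would keep every PDF on the site from being crawled:

  User-agent: *
  # Specific: only PDFs under /reports/ are blocked
  Disallow: /reports/*.pdf$
  # Broad: this would block every PDF on the site
  # Disallow: /*.pdf$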