Free online robots.txt generator
Twaino’s robots.txt generator lets you visually build a complete, valid robots.txt file for your website. This file is essential for controlling how search engine crawlers access your site. With the intuitive interface, you can add User-Agent rules, Allow and Disallow directives, and your sitemap URL without writing the file by hand.
The robots.txt file is placed at the root of your website and is the first thing search engine bots check before exploring your pages. A misconfigured robots.txt can block the crawling of important pages or, conversely, let bots crawl sections you would rather keep them out of.
How to use the robots.txt generator?
The tool starts with a default rule for the User-Agent “*” (all bots). You can add Allow or Disallow directives for each User-Agent by clicking “+ Directive”. To target a specific bot such as Googlebot or Bingbot, click “+ Add User-Agent” and enter its name. Enter your sitemap URL in the provided field; the preview on the right updates in real time. Copy the result with one click and paste it into your robots.txt file.
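The output the tool produces can be approximated in a few lines of code. The sketch below is not Twaino's implementation: `build_robots_txt`, its parameters, and the example.com URL are hypothetical names chosen for illustration. It writes one group per User-Agent, followed by an optional Sitemap line.

```python
from typing import Optional


def build_robots_txt(
    groups: dict[str, list[tuple[str, str]]],
    sitemap_url: Optional[str] = None,
) -> str:
    """Assemble robots.txt text from {user_agent: [(directive, path), ...]}.

    Hypothetical helper for illustration; not the tool's actual code.
    """
    lines: list[str] = []
    for user_agent, directives in groups.items():
        lines.append(f"User-agent: {user_agent}")
        for directive, path in directives:
            if directive not in ("Allow", "Disallow"):
                raise ValueError(f"unsupported directive: {directive}")
            lines.append(f"{directive}: {path}")
        lines.append("")  # a blank line separates groups
    if sitemap_url:
        lines.append(f"Sitemap: {sitemap_url}")
    # Normalize the ending to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


example = build_robots_txt(
    {
        "*": [("Disallow", "/search/")],
        "Googlebot": [("Disallow", "/drafts/")],
    },
    sitemap_url="https://www.example.com/sitemap.xml",
)
print(example)
```

Keeping the rules in a plain dictionary mirrors what the interface does: each User-Agent owns its own list of directives, and the sitemap is written once for the whole file.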
What is the robots.txt file?
The robots.txt file is a text file that follows the Robots Exclusion Protocol. It tells search engine bots which parts of your site they can or cannot crawl. Each set of rules begins with a User-agent directive that identifies the bot in question, followed by Allow and Disallow directives that specify which paths are accessible or blocked.
The Sitemap directive tells search engines where to find your XML sitemap, which helps them discover all your pages. It is independent of the User-agent groups and can appear anywhere in the file, although it is conventionally placed at the end.
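To make this structure concrete, here is a minimal sketch that reads a small robots.txt with `urllib.robotparser` from Python's standard library. The file content, the bot name `AnyBot`, and the example.com URLs are placeholders, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: one group for all bots plus a Sitemap line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A path under /private/ is blocked for every bot.
blocked = parser.can_fetch("AnyBot", "https://www.example.com/private/report.html")
# A path matched by no Disallow rule stays crawlable.
allowed = parser.can_fetch("AnyBot", "https://www.example.com/blog/")
# The Sitemap line is collected separately from the groups.
sitemaps = parser.site_maps()

print(blocked, allowed, sitemaps)
```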
Best practices for robots.txt
Here are some essential recommendations for an effective robots.txt. Never block your CSS and JavaScript files, because Google needs them to render your pages properly. Use robots.txt to block pages with little SEO value, such as internal search results pages, sorting pages, or irrelevant tag archives. Always specify your XML sitemap URL to facilitate crawling.
Do not use robots.txt to hide sensitive pages: the file is public and readable by everyone, so listing a path there advertises it. To protect confidential content, use authentication. To keep a page out of search results, use the meta noindex tag, and leave that page crawlable so that bots can actually read the tag.
Common directive examples
For a WordPress site, it is common to block access to the wp-admin folder while allowing wp-admin/admin-ajax.php, which is necessary for the site to function. Internal search pages are also typically blocked with Disallow: /?s=, and duplicate tag pages are often blocked as well. For an e-commerce site, you can block filter and sorting pages that create duplicate content.
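The WordPress rules described above might look like the file embedded in the sketch below, which then checks a few URLs with Python's standard `urllib.robotparser`. The `/tag/` path assumes WordPress's default tag base, and the example.com URLs are placeholders. Note one difference between parsers: Python's applies the first matching rule in file order, which is why the Allow line is listed first, whereas Google applies the most specific (longest) matching rule regardless of order.

```python
from urllib.robotparser import RobotFileParser

# Sketch of a WordPress-style robots.txt; /tag/ is an assumed tag base.
WORDPRESS_ROBOTS_TXT = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /?s=
Disallow: /tag/
"""

wp_parser = RobotFileParser()
wp_parser.parse(WORDPRESS_ROBOTS_TXT.splitlines())

urls_to_check = [
    "https://www.example.com/wp-admin/admin-ajax.php",
    "https://www.example.com/wp-admin/options.php",
    "https://www.example.com/?s=shoes",
    "https://www.example.com/tag/news/",
    "https://www.example.com/blog/hello-world/",
]

# Map each URL to True (crawlable) or False (blocked) for a generic bot.
results = {url: wp_parser.can_fetch("AnyBot", url) for url in urls_to_check}
for url, is_allowed in results.items():
    print("allowed" if is_allowed else "blocked", url)
```

With these rules, admin-ajax.php and the blog post should come back as allowed, and the other three URLs as blocked.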
FAQ
Does robots.txt prevent my pages from being indexed?
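Robots.txt prevents crawling but not necessarily indexing. If other sites link to a blocked page, Google can still index its URL without ever visiting it. To prevent indexing, use the meta noindex tag and make sure the page is not blocked in robots.txt; otherwise the crawler never sees the tag.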
What does User-agent: * mean?
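The asterisk means “all bots”. The rules under this User-agent apply to every crawler that does not have its own, more specific group in the file. The position of the groups does not matter: a bot follows the most specific group that matches its name, wherever that group appears.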
Can I block only Google without affecting Bing?
Yes, create a specific group with User-agent: Googlebot and add your Disallow directives. Other engines will follow the rules of User-agent: * which do not contain these restrictions.
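As a minimal sketch of that setup, the file below gives Googlebot its own group and leaves everything open for other bots; the `/no-google/` path and the example.com URL are placeholders. Python's standard `urllib.robotparser` is used here only to confirm which group each bot ends up following.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow means "nothing is blocked" for the * group.
GOOGLE_ONLY_ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: Googlebot
Disallow: /no-google/
"""

group_parser = RobotFileParser()
group_parser.parse(GOOGLE_ONLY_ROBOTS_TXT.splitlines())

page = "https://www.example.com/no-google/page.html"

# Googlebot matches its own group, so the page is blocked for it.
googlebot_allowed = group_parser.can_fetch("Googlebot", page)
# Bingbot has no dedicated group and falls back to the * group.
bingbot_allowed = group_parser.can_fetch("Bingbot", page)

print("Googlebot:", googlebot_allowed)
print("Bingbot:", bingbot_allowed)
```

The Googlebot group is deliberately placed after the `*` group: the specific group still takes precedence for Googlebot.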
How do I test my robots.txt?
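Use Google Search Console. The robots.txt report shows which robots.txt files Google found for your site and any errors it encountered in them, and the URL Inspection tool tells you whether a specific URL is blocked by your rules. The older standalone robots.txt Tester has been retired.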
How often do bots check robots.txt?
Google generally caches robots.txt for up to 24 hours, so Googlebot refreshes it roughly once per day. Changes are therefore not applied instantly.

