Robots.txt Checker

Instantly fetch and analyze any website's robots.txt file. See parsed user-agent rules, Allow and Disallow directives, declared sitemaps, and the raw file content with syntax highlighting.

How to Check a robots.txt File

1

Enter the Website URL

Paste any website URL into the input field.

2

Click Check robots.txt

Click Check to fetch and parse the robots.txt file.

3

Review Parsed Rules and Raw Content

Read every Allow and Disallow rule, along with the declared sitemaps and the raw file content.


Read Robots.txt the Way Crawlers Do

A single robots.txt mistake, such as a stray Disallow: / under User-agent: *, can block crawlers from an entire site and push its pages out of search results. The checker fetches the file, parses every Allow and Disallow directive, lists declared sitemaps, and shows the raw content with syntax highlighting.

The check matters most after migrations and redesigns, when robots.txt is often among the most frequently edited, and most frequently broken, files on a fresh deployment.

Why Use Our Robots.txt Checker

Parsed Rules View

Rules are organized by user-agent group so you can instantly see which crawlers are allowed or blocked from which paths.
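
As a rough illustration of what grouping by user-agent involves, here is a minimal Python sketch. The function name parse_groups and the sample rules are hypothetical, and this is not the checker's actual implementation. It handles only User-agent, Allow, and Disallow lines and ignores every other directive.

```python
def parse_groups(robots_txt: str) -> list[dict]:
    """Split robots.txt text into user-agent groups with their rules."""
    groups = []
    current = None
    last_was_agent = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank or malformed line
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            # Consecutive User-agent lines share one group.
            if not last_was_agent:
                current = {"agents": [], "rules": []}
                groups.append(current)
            current["agents"].append(value)
            last_was_agent = True
        elif key in ("allow", "disallow") and current is not None:
            current["rules"].append((key, value))
            last_was_agent = False
    return groups


# Hypothetical sample file used only for this illustration.
sample = """\
User-agent: *
Disallow: /admin/
Allow: /admin/public/

User-agent: examplebot
User-agent: otherbot
Disallow: /
"""

for group in parse_groups(sample):
    print(group["agents"], group["rules"])
```

Keeping the rules as (directive, path) pairs in file order preserves the original sequence, which is useful when displaying them next to the raw file.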

Syntax-Highlighted Raw View

View the raw robots.txt file with color-coded syntax highlighting for User-agent, Disallow, Allow, Sitemap, and Crawl-delay directives.

Sitemap Detection

Automatically extracts and lists all Sitemap URLs declared in the robots.txt file with direct links to open them.
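
Sitemap lines apply to the whole file rather than to a single user-agent group, so a simple line scan is enough to collect them. The sketch below is one possible approach, with a hypothetical function name and sample content, not the tool's own code.

```python
def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return every non-empty Sitemap value, in file order."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        # Split at the first colon only, so the URL's own "://" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical sample file used only for this illustration.
sample = """\
User-agent: *
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/news-sitemap.xml
"""

print(extract_sitemaps(sample))
```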

Works on Any Website

Enter any domain or URL and the tool automatically fetches the robots.txt from the root of that domain, trying HTTPS first with HTTP fallback.
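
The URL handling described above can be sketched as follows. This is a minimal illustration that assumes the input is either a bare domain or a full URL. The function name robots_url_candidates is hypothetical, and the actual fetch (and fallback on failure) is left out.

```python
from urllib.parse import urlsplit


def robots_url_candidates(user_input: str) -> list[str]:
    """Return the robots.txt URLs to try, HTTPS first, then HTTP."""
    text = user_input.strip()
    # Bare domains such as "example.com" have no scheme; add one so that
    # urlsplit places the host in netloc rather than in path.
    if "://" not in text:
        text = "https://" + text
    host = urlsplit(text).netloc
    if not host:
        raise ValueError(f"no host found in {user_input!r}")
    # Path and query are discarded: robots.txt always lives at the root.
    return [f"https://{host}/robots.txt", f"http://{host}/robots.txt"]


print(robots_url_candidates("example.com/blog/post?id=1"))
# ['https://example.com/robots.txt', 'http://example.com/robots.txt']
```

A caller would request the first URL and move on to the second only if the first attempt fails.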

Frequently Asked Questions

Everything you need to know about robots.txt files and how to use this checker.

What is a robots.txt file?
A robots.txt file is a plain text file placed at the root of a website that tells search engine crawlers which pages or sections they may or may not crawl. It follows the Robots Exclusion Protocol. Compliant crawlers check it before fetching pages on a site, but the file is advisory and does not enforce access control.
What does Disallow mean in robots.txt?
A Disallow directive tells the specified user-agent (crawler) not to crawl the listed path. For example, Disallow: /admin/ prevents crawlers from accessing any URLs under /admin/. A blank Disallow means the crawler is allowed to access everything.
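To see these two cases in practice, the short Python sketch below uses the standard library's urllib.robotparser on a hypothetical rule set. The bot names and URLs are made up for illustration. Note that Python's parser applies the first matching rule in a group, so its results can differ from crawlers that use longest-match precedence when Allow and Disallow rules overlap.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: everyone is kept out of /admin/, except "examplebot",
# whose blank Disallow permits everything.
rules = """\
User-agent: *
Disallow: /admin/

User-agent: examplebot
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("otherbot", "https://example.com/admin/users"))    # False
print(parser.can_fetch("otherbot", "https://example.com/blog/"))          # True
print(parser.can_fetch("examplebot", "https://example.com/admin/users"))  # True
```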
Does robots.txt block pages from Google's index?
Not directly. Disallowing a URL in robots.txt prevents crawling but does not guarantee the page will be removed from Google's index. If other sites link to a disallowed page, Google may still index the URL without crawling it. To prevent indexing, use the noindex meta tag and leave the page crawlable: a crawler that is blocked by robots.txt never fetches the page, so it never sees the tag.
Why does the tool say no robots.txt was found?
The site may not have a robots.txt file, the server may return a non-200 status code for that URL, or the server may block external requests. Some servers return a 404 page with a 200 status code, which can also cause unexpected results.
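One way to guard against the 404-page-with-200-status case is to inspect the response before parsing it. The heuristic below is only an assumption about how such a check could look, not how this tool decides. The function name and the checks themselves are hypothetical.

```python
def looks_like_robots_txt(status: int, content_type: str, body: str) -> bool:
    """Heuristic: does this HTTP response plausibly contain a robots.txt file?"""
    if status != 200:
        return False
    # A real robots.txt is plain text; an HTML content type suggests an error page.
    if "html" in content_type.lower():
        return False
    # Some servers mislabel the type, so also look at how the body starts.
    head = body.lstrip()[:200].lower()
    if head.startswith(("<!doctype", "<html")):
        return False
    # An empty body is accepted: an empty robots.txt simply allows everything.
    return True


print(looks_like_robots_txt(200, "text/plain", "User-agent: *\nDisallow:\n"))
print(looks_like_robots_txt(200, "text/html; charset=utf-8", "<!DOCTYPE html><html>"))
```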