Question 1

What is Robots.txt?

Accepted Answer

Robots.txt is a plain-text file placed in the root directory of a website that tells search engine crawlers which pages or sections they may and may not crawl. It follows the Robots Exclusion Protocol, and compliant crawlers typically fetch it before requesting any other URL on the site. Robots.txt controls crawling, not indexing: a disallowed page can still be indexed if other crawled pages link to it.
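
To make the allow/disallow behavior concrete, here is a minimal sketch using Python's standard-library urllib.robotparser. The robots.txt content, the "ExampleBot" user agent, and the example.com URLs are all made up for illustration; real crawlers may resolve rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent crawl url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is an assumed crawler name, not a real bot.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/admin/login"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/blog/post"))
```

The first URL falls under the disallowed /admin/ prefix, while the second matches no rule and is therefore crawlable by default.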

Question 2

Why does Robots.txt matter?

Accepted Answer

A misconfigured robots.txt can accidentally block important pages from being crawled, which can cause them to drop out of search results. It can also waste crawl budget by letting bots spend requests on low-value pages.
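
One way to catch an accidental block before it ships is to test a list of important URLs against the rules. The sketch below assumes Python's standard-library urllib.robotparser; the function name find_blocked_urls, the sample rules, and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def find_blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "*") -> list[str]:
    """Return the subset of urls that the given rules forbid user_agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical pages that should stay crawlable.
IMPORTANT_URLS = [
    "https://example.com/",
    "https://example.com/products",
]

# A common misconfiguration: a blanket rule that blocks the whole site.
MISCONFIGURED = "User-agent: *\nDisallow: /\n"

# A narrower rule that only blocks a low-value section.
CORRECTED = "User-agent: *\nDisallow: /cart/\n"

print(find_blocked_urls(MISCONFIGURED, IMPORTANT_URLS))
print(find_blocked_urls(CORRECTED, IMPORTANT_URLS))
```

With the blanket rule, both important URLs come back as blocked; with the narrower rule, the list is empty.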
