What is the X-Robots-Tag?
The X-Robots-Tag is an HTTP response header that lets you give search engines instructions such as “don’t index this page” or “don’t follow the links on this page,” without needing to put a <meta name="robots"> tag on the page itself.

It’s called “X-Robots-Tag” because it was introduced as a kind of extension (the “X-” prefix, a convention once used for non-standard headers) and has become a de facto standard for crawler directives: it isn’t defined in any official web standard, but all major search engines honor it.
Why use X-Robots-Tag instead of (or in addition to) meta robots tags?
One big reason is that it works on non-HTML content and supports site-wide rules. For example, you can’t put a <meta> tag inside a PDF or an image file, but you can send an X-Robots-Tag header in the HTTP response for that PDF or image to tell Google not to index it. That makes it super useful for controlling the indexing of PDFs, images, videos, or other file types.
Also, you might prefer to set rules at the server level. For instance, via your web server configuration, you could say “any URL containing /private/ should have an X-Robots-Tag: noindex.” This covers all content under that path without editing each file or page.
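As a rough sketch of that idea in Apache (placed in the server or virtual host configuration, using the mod_headers module; the /private/ path simply mirrors the example above):
<LocationMatch "/private/">
Header set X-Robots-Tag "noindex"
</LocationMatch>
Every response whose URL path matches /private/ then carries the noindex header, with no per-page edits needed.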
Essentially, X-Robots-Tag gives you more flexibility: you can apply directives in HTTP responses for any content type, and set them in server configuration or application code.
Google states that “any directive that can be used in a meta robots tag can also be specified as an X-Robots-Tag.” The available directives therefore include noindex, nofollow, nosnippet, noarchive, noimageindex, notranslate, and unavailable_after (which sets an expiration date for indexing), as well as the positive directives index and follow (though those are the defaults). For example, an HTTP header might look like:
X-Robots-Tag: noindex, nofollow
…which instructs crawlers not to index the page and not to follow any links on it. Or you could do:
X-Robots-Tag: googlebot: noindex
…to tell Google’s crawler specifically not to index the page, while leaving other bots unaffected if you wish (yes, you can target specific user agents with it).
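And if you need different rules for different crawlers, you can send several X-Robots-Tag headers, each prefixed with a user agent token (the bot names below are just illustrative):
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow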
When should you use the X-Robots-Tag?
Here are a few common scenarios:
- Blocking non-HTML files: Suppose you have some PDFs that you don’t want showing up in search (maybe they’re old brochures or confidential docs accidentally accessible). You can configure your server to send X-Robots-Tag: noindex for any PDF response. This will prevent Google from indexing those files.
- Site-wide or bulk rules: Let’s say your site is undergoing maintenance, or a staging version got crawled. You could temporarily send X-Robots-Tag: noindex on all pages via your server config instead of adding meta tags to every page, then remove the header once things are back to normal.
- Dynamic content where adding meta tags is hard: If you have some content generated by a system where injecting a meta tag is not straightforward, but you can tweak server headers, X-Robots-Tag is a handy alternative.
- Media and resources: If you run a stock photo site, you might want your images indexed in Google Images (for the traffic), or conversely you might want only the pages indexed and not the image files themselves. In the latter case, you could send X-Robots-Tag: noimageindex on the pages that embed the images (this directive tells Google not to index the images on a page), or send X-Robots-Tag: noindex on the image responses themselves to keep the files out of Google Images.
- Combining multiple directives: For example, you might want to noindex a page but still let Google follow its links (perhaps a “sitemap” page you don’t want indexed but want Google to use to discover linked pages). You could do X-Robots-Tag: noindex, follow. Or use the unavailable_after directive to tell Google not to show a page after a certain date (handy for limited-time offers or outdated content); see the header examples just after this list.
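For instance, the last scenario could be covered with headers like these (the date is only a placeholder):
X-Robots-Tag: noindex, follow
X-Robots-Tag: unavailable_after: 25 Jun 2025 15:00:00 PST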
How to implement the X-Robots-Tag
Typically, you implement the X-Robots-Tag via server configuration (like Apache’s .htaccess or an Nginx config) or in application code.
For instance, in Apache you could do:
<FilesMatch "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>
This adds the header to every PDF file served (the Header directive requires Apache’s mod_headers module). Some CMSs and frameworks also make it easy to add custom headers.
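And if you’re on Nginx rather than Apache, a roughly equivalent rule (a sketch to adapt to your own config) would be:
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow";
}
As with the Apache rule, reload the server and re-check the served headers afterward.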
Use it carefully, though. If you inadvertently send noindex for all pages (say, through a misconfigured server rule), you could deindex your whole site from Google! Always test.
Frequently Asked Questions
How can I check if a page has an X-Robots-Tag?
You can check using curl -I https://example.com/page, which returns the HTTP headers. Look for a line like X-Robots-Tag: noindex. Alternatively, inspect headers in your browser's DevTools under the Network tab. Google Search Console also reports if a page was excluded using X-Robots-Tag.
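For example, a quick check on a PDF might look like this (the URL and headers shown are illustrative):
curl -I https://example.com/old-brochure.pdf

HTTP/2 200
content-type: application/pdf
x-robots-tag: noindex, nofollow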
What happens if I block a URL in robots.txt and use X-Robots-Tag?
Search engines won’t see the X-Robots-Tag if the URL is disallowed in robots.txt, because they can’t crawl it. To apply noindex properly, allow crawling and send the header. Blocking crawling while expecting the indexing directives to take effect won’t work, since they are never read.
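To make the conflict concrete: with a robots.txt like the one below (path illustrative), Google never fetches the /private/ URLs, so any X-Robots-Tag: noindex header you send on them is never seen.
User-agent: *
Disallow: /private/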
What if a page has both meta robots and an X-Robots-Tag?
If both are present, Google usually follows the more restrictive directive. So if the meta tag says index but the header says noindex, the page won’t be indexed. Avoid conflicts where possible. In practice, most sites use one or the other (not both) unless needed.
Can I use X-Robots-Tag for specific content types like PDFs or images?
Yes. That’s one of its biggest strengths. You can set X-Robots-Tag: noindex or noimageindex on PDFs, images, or JSON API responses directly in server headers (something meta tags can’t do). It’s ideal for bulk rules or file types outside standard HTML.
Is it safe to use X-Robots-Tag sitewide?
It’s powerful but risky. A misconfiguration like noindex on all pages can deindex your site. Always test with a few URLs first and verify with curl or DevTools. And keep in mind that robots directives only control search visibility; for genuinely sensitive content, use proper access controls such as authentication rather than relying on noindex alone.