There are several reasons you might want to noindex a web page or URL: password-protected pages that you don’t want to appear in search results, pages that only your web admin should be able to access, or affiliate URLs that redirect to another webpage.
First, what do we mean by noindex?
noindex simply means that you don’t want a web page or a link to show up on search engine results pages.
Most of the time, this can be accomplished by setting a meta tag on the web page. WordPress users have this covered when they use plugins like Yoast SEO or All-In-One-SEO, which let you do so on a per-page or per-post basis.
Here is the syntax, using the meta robots tag, that goes before the closing </head> tag of a web page:
<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">
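To confirm that a page really carries this tag, a short script can read the directives back out of the HTML. What follows is only a minimal sketch using Python's standard html.parser module; the class name RobotsMetaParser and the function robots_meta_directives are made up for illustration and are not part of any SEO plugin.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches too.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            content = attributes.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_meta_directives(html):
    """Return the robots directives found in the given HTML, lowercased."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

Calling robots_meta_directives on a page that contains the tag above should return a list that includes "noindex"; an empty list means no robots meta tag was found.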
How to use the .htaccess file to noindex a URL?
On one of my websites, I run an affiliate program that others can join. The issue was that the affiliate URLs were getting indexed in Google search results. The affiliate URL in my case had the structure –
As can be seen above, there is no way to add a meta tag directly to the page for this kind of URL. Fortunately, Google supports another, less widely known method to noindex a page: the X-Robots-Tag directive in the HTTP response header.
This is normally set in the Apache config files, but most webmasters host their websites on shared web hosts. Shared hosts rarely provide access to the Apache config files; however, you can still create or edit a .htaccess file in the relevant directories.
Rules in .htaccess files are applied recursively, which means that in the current example the .htaccess file in the /a/ folder supersedes the .htaccess file in the /aff/ folder, which in turn supersedes the .htaccess file in the domain root directory. In short, if a similar rule resides in both the /a/ folder and the /aff/ folder, the rule in the /a/ folder takes precedence.
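The precedence described above can be modelled in a few lines of Python. This is a simplified sketch of the idea, not how Apache is implemented: it assumes the /a/ folder sits inside /aff/ (so the directory is /aff/a), and the function names and the rules dictionary are hypothetical.

```python
from pathlib import PurePosixPath


def htaccess_lookup_order(url_dir):
    """Return the directories consulted for .htaccess files, from the site root down to url_dir."""
    path = PurePosixPath("/") / url_dir.strip("/")
    chain = [path, *path.parents]  # deepest directory first
    return [str(p) for p in reversed(chain)]


def effective_headers(rules_by_dir, url_dir):
    """Merge per-directory header rules; a deeper directory overrides its parents."""
    merged = {}
    for directory in htaccess_lookup_order(url_dir):
        merged.update(rules_by_dir.get(directory, {}))
    return merged


# Hypothetical rules: the root allows indexing, the /aff/a folder does not.
rules = {
    "/": {"X-Robots-Tag": "index, follow"},
    "/aff/a": {"X-Robots-Tag": "noindex, follow"},
}
```

With these rules, effective_headers(rules, "/aff/a") picks the value set in the deepest folder, while effective_headers(rules, "/aff") still falls back to the rule from the root.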
To noindex any URL containing the structure shown above, I go to the /a/ folder, download the .htaccess file (or create a new one if it doesn’t exist), and add this at the top of the file:
Header set X-Robots-Tag "noindex, follow"
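This asks search engines not to index the URL but still follow its links. The Header directive is provided by Apache’s mod_headers module, so that module must be enabled on the host.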
We can use an HTTP header checker tool to verify the implementation. This is the response we get.
As the rectangle marked above shows, the response includes this header:
X-Robots-Tag => noindex, follow
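If you would rather not depend on an online checker, the same check can be scripted. The following is a minimal sketch that uses only Python's standard library; the function names are made up for illustration, and the URL you pass in is whatever affiliate URL you want to test. It uses http.client rather than a higher-level client so that redirects are not followed and the header is read from the affiliate URL's own response. It also assumes the header value is a plain comma-separated list without a bot-name prefix.

```python
import http.client
from urllib.parse import urlsplit


def parse_robots_directives(header_value):
    """Split a value such as "noindex, follow" into lowercase directives."""
    return [part.strip().lower() for part in header_value.split(",") if part.strip()]


def is_noindexed(header_value):
    """True when the header value is present and contains the noindex directive."""
    return header_value is not None and "noindex" in parse_robots_directives(header_value)


def fetch_x_robots_tag(url):
    """Send a HEAD request (without following redirects) and return the X-Robots-Tag value or None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        return connection.getresponse().getheader("X-Robots-Tag")
    finally:
        connection.close()
```

To use it, pass the affiliate URL to fetch_x_robots_tag and feed the result to is_noindexed; a result of True means the header is being served for that URL.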
Over To You?
Have you faced such a situation before? How did you resolve the issue?