NoIndex URLs using htaccess file

There are several reasons you may want to noindex a web page or URL: password-protected pages that you don’t want to appear in search results, pages that only your web admin should be able to access, or affiliate URLs that redirect to another web page.

First, what do we mean by noindex?

noindex simply means that you don’t want a web page or a link to show up on search engine results pages.

Most of the time this can be accomplished by setting a meta tag on the web page. WordPress users have this covered when they use plugins like Yoast SEO or All-In-One-SEO, which let you do so on a per-page or per-post basis.

Here is the meta robots tag, which goes before the closing </head> tag of a web page.

<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">

How to use htaccess file to noindex a URL?

For one of my websites, I run affiliate programs that others can join. The issue was that the affiliate URLs were getting indexed in Google search results. The affiliate URL in my case had this structure:

http://www.example.com/aff/a/?a=786&p=mysite.com/product1/

As can be seen above, there is no way to put a meta tag directly in the page for this kind of URL. Fortunately, Google supports another, less widely known method to noindex a page: the X-Robots-Tag directive in the HTTP response header.

This is normally done in the Apache config files, but most webmasters host their websites on shared web hosts. Shared web hosts rarely provide access to the Apache config files; however, you can still create or edit an htaccess file in the relevant directories.
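For readers who do control the server configuration, the header can be set there directly. The following is only a minimal sketch: it assumes mod_headers is enabled, and the filesystem path is a hypothetical placeholder that you would replace with your own document root.

```apache
# Main server or virtual host configuration (not an htaccess file).
# The path below is a placeholder, not taken from the article.
<Directory "/var/www/example.com/aff/a">
    # Ask search engines not to index anything served from this directory,
    # while still following the links they find there.
    Header set X-Robots-Tag "noindex, follow"
</Directory>
```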

htaccess rules apply to the directory that holds the file and to every directory beneath it, and a file deeper in the tree takes precedence. In the current example, the htaccess file in the /a/ folder supersedes the one in the /aff/ folder, which in turn supersedes the one in the domain root directory. In short, if a similar rule exists in both the /a/ folder and the /aff/ folder, the rule in the /a/ folder takes precedence for URLs under /a/.
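To illustrate this precedence, here is a sketch of two htaccess files that set the same response header. The values are examples only, and the sketch assumes that mod_headers is enabled and that the host allows header directives in htaccess files (AllowOverride FileInfo).

```apache
# File 1: /aff/.htaccess
# Applies to /aff/ and everything beneath it unless a deeper file overrides it.
Header set X-Robots-Tag "noindex, nofollow"

# File 2: /aff/a/.htaccess
# For URLs under /aff/a/ this directive is applied after the one above,
# so the response carries "noindex, follow".
Header set X-Robots-Tag "noindex, follow"
```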

To noindex any URL with the structure shown above, I go to the /a/ folder, download the htaccess file (or create a new one if it doesn’t exist) and add this line at the top of the file.

Header set X-Robots-Tag "noindex, follow"

We are asking search engines to not index the URL but follow the links.
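If you are not sure whether your host has mod_headers enabled, a guarded variant may be safer, because a bare Header line without the module makes Apache answer with a 500 error for the whole directory. A minimal sketch, assuming the file lives in the /a/ folder from the example:

```apache
# /aff/a/.htaccess
# Only apply the directive when mod_headers is loaded, so a missing module
# is skipped silently instead of causing a 500 Internal Server Error.
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noindex, follow"
</IfModule>
```

The trade-off is that when the module is missing the header is simply not sent, so checking the actual response is still worthwhile.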

We will use an HTTP header checker tool to verify the implementation.

In the response we get, the header is reported as X-Robots-Tag => noindex, follow

Over To You?

Have you faced such a situation before? How did you resolve the issue?

5 Comments

Kim on October 12, 2022 at 3:29 AM

I have been looking for videos or resources about X-Robots-Tag with no luck; nobody seems to have made videos about it (and I haven’t seen any good resources that a non-techy would understand).

I found this from google where they show parts of what I want to see: https://developers.google.com/search/docs/crawling-indexing/robots-meta-tag
So I want to see how I can noindex/nofollow non-HTML pages, such as PDF files in this case.

Header set X-Robots-Tag "noindex, nofollow"

But with the example above, where do I place the directive in the htaccess file ??

But I also want to know how I can use these directives to nofollow/noindex towards the spammy “search urls in my GSC”

EX: domain.com/search/online.vi***a.domain

Which is coming up heavily in my search console.

And I would want to show a directive to not crawl those (or noindex/nofollow them).
Also, in this case with regards to the spam medic search urls such as the one above.

Where does the directives go in the htaccess file ??

Thanks,

johan on December 4, 2019 at 12:26 PM

nice.

Jordi on October 16, 2017 at 9:53 PM

Hi friend,

How can I block a specific URL from being indexed?

This URL makes an Ajax request and is appearing in Google search. I want to prevent that.

Thanks for your time

Ankur Jain on October 26, 2017 at 2:13 PM

@Jordi: Just add this before the closing head tag.
<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">
- ammar on August 1, 2019 at 5:25 PM
  
  where to add url which I want to stop from indexing
