robots.txt noindex


I have two websites: a test site at "test-www.xxxxxxx.net" and a live site at "www.xxxxx.net". I want a dynamic noindex meta tag: Google's robots may index my live site, but I don't want them to index my test site. Normally I would add the tag conditionally in _Layout.cshtml, as below.

@if (!Helper.IsLiveSite())
{
    <meta name="robots" content="noindex">
}

Is there a way to do this in the robots.txt file instead? Some articles say: "I strongly recommend you use Noindex instead, whenever possible."

So please help! How can I do this in robots.txt? I don't know whether the file below will cause an error.

User-agent: *
Disallow: /styles/
Sitemap: http://xxxxxx/sitemap/sitemap.xml
Noindex: test-www.xxxxxxx.net/*
Noindex: http://test-www.xxxxxxx.net/*
Noindex: https://test-www.xxxxxxx.net/*

Thanks.


Solution

  • You can’t disallow indexing with robots.txt¹, only crawling (with Disallow).

    If you want to disallow crawling of all documents from your test site, you have to upload a robots.txt that is accessible from test-www.xxxxxxx.net/robots.txt:

    User-agent: *
    Disallow: /
    

    (This robots.txt file must not also be served from your live site, or it would block crawling there too.)

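    Search engines that support robots.txt will no longer visit (crawl) URLs on this host. If they find URLs to documents on this host some other way (e.g., if another page links to them), they may still list (index) them. Note that a crawler blocked this way never fetches the pages, so it also never sees a noindex meta tag on them.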

    ¹ Google supported a Noindex directive in robots.txt experimentally, but never documented it and stopped honoring it on September 1, 2019.
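
If both sites are deployed from the same codebase, one way to keep the two robots.txt files apart is to generate the body per host instead of uploading a static file. The following is only a minimal sketch under assumptions: `RobotsTxt`, `BuildFor`, and the `example.net` host names are hypothetical placeholders, not part of the question's project, and the live rules are simply copied from the question (the Sitemap line is left out because its URL is elided there).

```csharp
using System;

public static class RobotsTxt
{
    // Hypothetical helper: returns the robots.txt body for the requesting host.
    // Any host other than the live one gets "Disallow: /", which blocks crawling
    // (not indexing) of every URL on that host.
    public static string BuildFor(string host, string liveHost)
    {
        bool isLive = string.Equals(host, liveHost, StringComparison.OrdinalIgnoreCase);

        if (!isLive)
        {
            return "User-agent: *\nDisallow: /\n";
        }

        // Live site: the rules from the question, without the Noindex lines.
        return "User-agent: *\nDisallow: /styles/\n";
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder hosts; substitute the real test and live host names.
        Console.Write(RobotsTxt.BuildFor("test-www.example.net", "www.example.net"));
    }
}
```

In an ASP.NET MVC application, a controller action mapped to /robots.txt could presumably return this string with `Content(body, "text/plain")`, passing in the host of the current request; the exact routing depends on the project and is not shown here.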