robots.txt noindex

Noindex in a robots.txt


I've always stopped Google from indexing my website using a robots.txt file. Recently I read an article by a Google employee who stated that you should do this using meta tags. Does this mean robots.txt won't work? Since I'm working with a CMS, my options are very limited and it's a lot easier to just use a robots.txt file. My question is: what's the worst that could happen if I keep using a robots.txt file instead of meta tags?


Solution

  • Here's the difference in simple terms:

    Use the robots.txt file when you want control at the directory level or across your site. However, keep in mind that robots are not required to follow these directives. Most will, such as Googlebot, but it is safer to keep any highly sensitive information out of publicly-accessible areas of the site.
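The directory-level control described above can be sketched with Python's standard `urllib.robotparser` module. This is only an illustration: the rules, the `/private/` directory, the `example.com` URLs, and the `is_allowed` helper are all made up for the example, and it models a well-behaved crawler, not Googlebot itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ask every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/private/report.html", "/blog/post.html"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

The check is voluntary: a crawler that ignores robots.txt never runs anything like it, which is why sensitive content should not rely on it.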

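    Like a robots.txt rule, a noindex tag keeps a page out of search results, but it works differently: the page is still crawled, it just isn’t indexed. Use these tags when you want control at the individual page level. Two caveats: a crawler has to be able to fetch a page to see its noindex tag, so the tag has no effect on a page that robots.txt blocks; and a URL blocked only by robots.txt can still show up in results if other pages link to it.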

    An aside on the difference between crawling and indexing: crawling is how a search engine’s spiders discover and read the pages of your website; the results of the crawl go into the search engine’s index. Storing this information in an index speeds up the return of relevant search results. Instead of scanning every page related to a search, the engine queries the index (a smaller database).

    If there were no index, the search engine would have to look at every bit of data related to the search term, and we’d all have time to make and eat a couple of sandwiches while waiting for search results to display. The spiders keep the index up to date.
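To make the crawling/indexing split concrete, here is a toy sketch of an inverted index. It is not how any real search engine is implemented; the `PAGES` data and the `build_index` and `search` helpers are invented for illustration. The "crawl" is the dictionary of page text, and the index maps each word to the pages that contain it, so a query is a single lookup instead of a scan of every page.

```python
from collections import defaultdict

# Hypothetical crawl results: URL -> text the spider read from the page.
PAGES = {
    "https://example.com/a": "robots txt controls crawling",
    "https://example.com/b": "meta tags control indexing",
    "https://example.com/c": "crawling feeds the index",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, word):
    """Answer a one-word query with one dictionary lookup."""
    return sorted(index.get(word.lower(), set()))


if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "crawling"))
```

In this picture, a noindexed page is one the spider reads but never passes to `build_index`.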

    Here is an example of the tag:

    <meta name="robots" content="noindex,follow"/>
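
If you want to check whether your CMS actually emits the tag, a minimal sketch using Python's standard `html.parser` module could look like this. The `RobotsMetaParser` class and `has_noindex` function are hypothetical names, and the sketch only looks at `<meta name="robots">` tags, not at HTTP headers or crawler-specific tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,follow"/></head></html>'
    print(has_noindex(page))
```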
    

    Now that you have read and understood the above, I think you can answer your question on your own ;)