Tags: ruby-on-rails, heroku, gitignore

Different robots.txt for staging server on Heroku


I have staging and production apps on Heroku.

For crawlers, I set up a robots.txt file.

After that, I got this message from Google:

Dear Webmaster, The host name of your site, https://www.myapp.com/, does not match any of the "Subject Names" in your SSL certificate, which were:
*.herokuapp.com
herokuapp.com

Googlebot read the robots.txt on my staging app and sent this message, because I hadn't done anything to prevent crawlers from reading the file.

So what I'm thinking is to use a different .gitignore file for staging and production, but I can't figure out how to do this.

What are the best practices for implementing this?

EDIT

I googled about this and found this article https://olemortenamundsen.wordpress.com/2011/04/05/ruby-secure-staging-environment-of-your-public-app-from-users-and-bots/

The article suggests setting up Rack basic authentication, after which you won't need to care about robots.txt at all.

I didn't know that basic auth could keep Googlebot out. This solution seems better than manipulating the .gitignore file.
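A minimal sketch of that approach, assuming a custom `staging` Rails environment (the environment name and ENV variable names are placeholders): wrapping the whole app in HTTP Basic auth means crawlers get a 401 before they ever see robots.txt or any page behind it.

```ruby
# config/environments/staging.rb — hypothetical staging environment file.
# Rack::Auth::Basic challenges every request with HTTP Basic auth, so
# Googlebot never reads robots.txt or indexes the staging site.
Rails.application.configure do
  config.middleware.use Rack::Auth::Basic, "Staging" do |username, password|
    # Credentials come from config vars, e.g.
    #   heroku config:set STAGING_USER=... STAGING_PASS=...
    username == ENV["STAGING_USER"] && password == ENV["STAGING_PASS"]
  end
end
```

Because this lives in an environment-specific config file, production is untouched and no per-environment .gitignore tricks are needed.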


Solution

  • What about serving /robots.txt dynamically from a controller action instead of a static file? Depending on the environment, you allow or disallow search engines to index your application.
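The environment check such a controller action could use can be sketched like this (the constant and method names are hypothetical, not from the original answer):

```ruby
# robots.txt body that lets crawlers index everything.
ALLOW_ALL    = "User-agent: *\nDisallow:\n"
# robots.txt body that blocks all crawlers.
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"

# Return the robots.txt body for a given environment name:
# production gets the permissive file, everything else is blocked.
def robots_body(environment)
  environment == "production" ? ALLOW_ALL : DISALLOW_ALL
end
```

In a Rails app, you would delete the static public/robots.txt (files in public/ are served before routing), add a route such as `get "/robots.txt" => "robots#show"`, and in the action call `render plain: robots_body(Rails.env), content_type: "text/plain"`.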