ruby-on-rails | ruby-on-rails-4 | cors | csrf | ruby-on-rails-4.1

Googlebot causes an invalid Cross Origin Request (COR) on Rails 4.1


How do I prevent Googlebot from triggering this error while crawling the site? I am not interested in turning off `protect_from_forgery` unless it is safe to do so.

[fyi] method=GET path=/users format=*/* controller=users action=show status=200 duration=690.32 view=428.25 db=253.06 time=  host= user= user_agent=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) session= params={""} ()
[hmm] Security warning: an embedded <script> tag on another site requested protected JavaScript. If you know what you're doing, go ahead and disable forgery protection on this action to permit cross-origin JavaScript embedding. (pid:)
[fyi] method=GET path=/users/123/flag format=*/* controller=users action=flag status=500 error='ActionController::InvalidCrossOriginRequest:Security warning: an embedded <script> tag on another site requested protected JavaScript. If you know what you're doing, go ahead and disable forgery protection on this action to permit cross-origin JavaScript embedding.' duration=26.50 time= host= user= user_agent= session= params= (pid)
[omg] ActionController::InvalidCrossOriginRequest (Security warning: an embedded <script> tag on another site requested protected JavaScript. If you know what you're doing, go ahead and disable forgery protection on this action to permit cross-origin JavaScript embedding.):
actionpack (4.1.4) lib/action_controller/metal/request_forgery_protection.rb:217:in `verify_same_origin_request'

The controller responds with this:

respond_to do |format|
  format.js { render template: 'users/flag', layout: "some_layout" }
end

I am unable to recreate the bug, and the action works fine when I hit it through my browser.

So far I've looked at the following resources, but most seem to suggest blindly turning off CSRF protection, or are unanswered.


To clarify: the action should remain protected from CSRF, but I want to prevent Google from generating an error when it crawls the page. I.e., I want the false-positive security warnings to go away without actually compromising my security features.


Solution

  • Googlebot requests the format "*/*" (http://apidock.com/rails/Mime), and the application renders the JS response since it is the only one available. Since the request is remote, Rails correctly raises an invalid cross-origin request error.

    This was reproducible using:

    curl -H "Accept: */*" https://www.example.com/users/123/flag
    

    The fix is to add an HTML fallback resource for the spider to crawl:

    respond_to do |format|
      format.html { render template: 'users/flag' }
      format.js { render template: 'users/flag', layout: "some_layout" }
    end
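
    To see why the fallback works, here is a minimal plain-Ruby sketch of the negotiation (this is an illustration, not Rails internals): when the Accept header is the wildcard `*/*`, `respond_to` falls back to the first declared block, so with only `format.js` present the crawler is served JavaScript, which trips the cross-origin protection. The `negotiated_format` method and the hard-coded MIME mapping are hypothetical names for this sketch.

    ```ruby
    # Simplified model of respond_to's format selection.
    # declared_formats is the order of format.* blocks in the controller.
    def negotiated_format(accept_header, declared_formats)
      if accept_header == "*/*"
        # Wildcard Accept: the first declared block wins.
        declared_formats.first
      else
        # Toy mapping of a couple of MIME types to format symbols.
        mapping = { "text/javascript" => :js, "text/html" => :html }
        declared_formats.find { |f| f == mapping[accept_header] }
      end
    end

    # Googlebot with only a js responder:
    negotiated_format("*/*", [:js])          # => :js  (triggers the warning)
    # After adding the html fallback first:
    negotiated_format("*/*", [:html, :js])   # => :html (safe to crawl)
    # A browser's Ajax request still gets JavaScript:
    negotiated_format("text/javascript", [:html, :js])  # => :js
    ```

    The ordering matters: putting `format.html` first makes it the wildcard default, while explicit `Accept: text/javascript` requests (same-origin Ajax) still reach the js block, so CSRF protection stays enabled.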