In Apache's mod_rewrite, how can I write a RewriteCond to match the HTTP Accept header for all HTML types?


I have the following rule in my .htaccess: if the request is for a specific host and asks for HTML, it serves a specific HTML file:

RewriteCond %{HTTP_HOST} ^example\.domain\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^example\.domain\.eu$ [NC]
RewriteCond %{HTTP_ACCEPT} text/html [NC]
RewriteRule ^ /subdomains/example/index.html [END]

# Otherwise go to php api

This seemed to work, but I was getting a lot of warning logs from the PHP API for unmatched routes. Those routes should have gone to that index.html. I set up a breakpoint and found that I was getting requests with Accept: */*. I could reproduce this reliably by opening the dev console after visiting the page in Chrome (I'm not sure why Chrome would send that request; loading the Sources panel, maybe? It results in a "Failed to load resource" error in the console).

Obviously the simple substring search that RewriteCond does for text/html isn't sufficient. I believe I need to at least handle */* and text/*, but are there other, less obvious Accept headers? I have seen RewriteCond examples that match application/xhtml+xml. How can I write a RewriteCond to match the HTTP Accept header for all HTML types?


Solution

  • Normally the frontend makes fetch requests with Accept: application/json to subdomain.example.com/folder/1234 to load the data without a full page refresh (using PHP).

    If the API accepts specific requests (mime-types and/or URL-paths) then maybe reverse the logic and route the request to your PHP API first. Other requests then default to an HTML response.

    For example:

    RewriteCond %{HTTP_HOST} ^subdomain\.example\.com$
    RewriteCond %{HTTP_ACCEPT} application/json
    RewriteRule ^folder/\d+$ /api.php [END]
    
    # Otherwise default to HTML response
    RewriteRule ^ /subdomains/example/index.html [END]
    

    If the API only accepts URL-paths that follow a specific pattern then this should also be included in the rule (as above).
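
    Applied to the two hosts from the question, the same reversed logic might look like the sketch below. It assumes, as in the example above, that the API entry point is /api.php and that API URLs look like folder/<digits>; both are placeholders to adjust to the real API. Note that the final rule also catches requests for static assets, so add exclusions for those if they are served from the same hosts.

    ```apache
    # Sketch only: /api.php and the folder/<digits> pattern are placeholders.
    RewriteEngine On

    # JSON requests for API-shaped URLs on either host go to the PHP API
    RewriteCond %{HTTP_HOST} ^example\.domain\.(com|eu)$ [NC]
    RewriteCond %{HTTP_ACCEPT} application/json [NC]
    RewriteRule ^folder/\d+$ /api.php [END]

    # Everything else on those hosts gets the static HTML page
    RewriteCond %{HTTP_HOST} ^example\.domain\.(com|eu)$ [NC]
    RewriteRule ^ /subdomains/example/index.html [END]
    ```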

    Other notes...

    However, if you were to go to your browser and navigate to subdomain.example.com/folder/1234, it should return the HTML page to do HTML5 routing (using a static page).

    If you simply navigate to this URL in the browser then the browser should be including text/html in the Accept header.

    Have you examined the Referer and User-Agent fields of the logged request that seemingly omitted the text/html mime-type from the request? Are you sure this is a user's direct request from the browser? And not bots/search engine traffic (although Googlebot typically includes text/html when crawling)?
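
    One way to check this is to log the Accept, Referer and User-Agent request headers side by side. The following is a minimal sketch using mod_log_config; the format nickname accept_debug and the log file path are made-up names, and the directives have to go in the server or virtual-host config because CustomLog is not permitted in .htaccess.

    ```apache
    # Server or virtual-host config only; CustomLog is not allowed in .htaccess.
    # "accept_debug" and the log path are hypothetical names.
    LogFormat "%h %t \"%r\" %>s \"%{Accept}i\" \"%{Referer}i\" \"%{User-Agent}i\"" accept_debug
    CustomLog "logs/accept_debug.log" accept_debug
    ```

    Each logged line then shows the request line and final status next to the three headers, which should make it easier to tell browser navigations apart from sub-resource requests and bots.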

    Unless text/html is sent in the Accept header then you cannot be sure that the UA making the request will be able to handle an HTML response.

    If a UA only accepts application/xhtml+xml (XML) then strictly speaking it's not going to accept a text/html response. (Typically browsers send both of these mime-types in the Accept header when making an arbitrary request.)

    If a UA accepts anything (i.e. */*) and nothing more specific, that doesn't necessarily mean a text/html response is appropriate. Typically, I see Accept: */* on its own for requests to embedded JS files. The same applies to requests that include only Accept: text/*.
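
    If you do want to keep an Accept-based condition, the sketch below matches text/html as a complete media type in the comma-separated list rather than as a bare substring, and deliberately does not match */* or text/*, for the reasons given above. It is an approximation only: a regex cannot evaluate quality values, so an entry such as text/html;q=0 would still match.

    ```apache
    # Sketch: serve the static page only when text/html is explicitly listed.
    RewriteCond %{HTTP_HOST} ^example\.domain\.(com|eu)$ [NC]
    # text/html must be a whole list entry: preceded by start-of-header or a comma,
    # followed by a parameter (;), the next entry (,) or end-of-header.
    RewriteCond %{HTTP_ACCEPT} (^|,)\s*text/html\s*(;|,|$) [NC]
    RewriteRule ^ /subdomains/example/index.html [END]
    ```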