
htaccess: Prevent 301 redirect appending / to the end of the URL for subfolders


I defined this .htaccess file:

RewriteEngine On

# Disable http
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

# Disable www.
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]

# Rewrite /random to random.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^([^\.]+)$ $1.php [NC,L]

# Redirect /index.html in URL to /
RewriteRule (.*)index\.html$ /$1 [R=301,L]

# Redirect xxx.html in URL to xxx
RewriteCond %{THE_REQUEST} /([^.]+)\.html [NC]
RewriteRule ^ /%1 [NC,L,R]

RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html [NC,L]

I have this folder structure:

It works fine for:

But this also happens, and I want to prevent it:

Do you have an idea what I have to change in my .htaccess file?


Solution

  • x.com/example redirects to x.com/example/

    Since /example is a physical directory, mod_dir "fixes" the URL and appends the trailing slash with a 301 (permanent) redirect. The trailing slash is required in order to serve the DirectoryIndex (i.e. index.html in this case) from that directory.

    However, we can prevent mod_dir from appending the trailing slash with the DirectorySlash Off directive. BUT, we must then issue an internal rewrite to append the trailing slash otherwise the DirectoryIndex document will not be served (as mentioned above). (Or, we could rewrite directly to the index document.)

    When setting DirectorySlash Off we must also ensure that directory-listings (mod_autoindex) are disabled, since the presence of an index document in that directory will no longer prevent the directory listing.

    To resolve potential canonicalisation issues (the same content being reachable at both /example and /example/), you now need to "redirect" in the other direction to remove any trailing slash on the requested URL, e.g. requests to /example/ now need to be redirected back to /example.

    In addition, the following rule that rewrites to the corresponding .html file is not strictly correct:

    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [NC,L]
    

    The condition (that uses REQUEST_FILENAME) is not necessarily testing the same URL that you are ultimately rewriting to in the substitution (that uses REQUEST_URI). So, in certain scenarios (e.g. when requesting a non-existent file in a directory that also happens to map to a .html file) you can get a rewrite loop (500 Internal Server Error). See the following question/answer on ServerFault that goes into more detail on this: https://serverfault.com/questions/989333/using-apache-rewrite-rules-in-htaccess-to-remove-html-causing-a-500-error

    The same applies to the earlier rule that appends the .php extension.
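
    As a rough illustration of that mismatch, here is a trace for a hypothetical layout (assumed for the example, not taken from the question) in which /foo.html exists and /foo is not a directory. The paths in the comments reflect my understanding of how Apache maps such a request, so treat them as a sketch rather than exact server output. The rule at the end tests the same path it rewrites to.

    ```apache
    # Hypothetical request: /foo/bar
    #   REQUEST_FILENAME       -> <docroot>/foo      ("/bar" is treated as path-info)
    #   REQUEST_FILENAME.html  -> <docroot>/foo.html (exists, so the old condition passes)
    #   REQUEST_URI.html       -> /foo/bar.html      (no such file)
    # On the next pass the old condition passes again, giving /foo/bar.html.html,
    # and so on until the internal redirect limit is hit (500 error).

    # Checking DOCUMENT_ROOT + REQUEST_URI keeps the test and the target aligned
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [L]
    ```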

    Bringing the above points together, we get the following:

    # [NEW] Directory listing (mod_autoindex) must be disabled
    Options -Indexes
    
    # [NEW] Prevent mod_dir appending the trailing slash
    DirectorySlash Off
    
    RewriteEngine On
    
    # Disable http
    RewriteCond %{HTTPS} off
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
    
    # Disable www.
    RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
    RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
    
    # Redirect /index.html in URL to /
    RewriteRule (.*)index\.html$ /$1 [R=301,L]
    
    # Redirect xxx.html in URL to just xxx
    RewriteCond %{THE_REQUEST} /([^.?]+)\.html [NC]
    RewriteRule ^ /%1 [R=301,L]
    
    # [NEW] Redirect to remove trailing slash on direct requests
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)/$ /$1 [R=301,L]
    
    # [NEW] Internal rewrite to append trailing slash to directories
    RewriteCond %{DOCUMENT_ROOT}/$1 -d
    RewriteRule ^(.*[^/])$ $1/ [L]
    
    # Rewrite /random to random.php
    RewriteCond %{DOCUMENT_ROOT}/$1.php -f
    RewriteRule ^([^\.]+)$ $1.php [L]
    
    # Append ".html" if corresponding file exists
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [L]
    

    The new rules related to your immediate issue are indicated with [NEW].

    You will need to clear your browser cache before testing since the 301 (permanent) redirect that appended the trailing slash will have been cached. Test with 302 (temporary) redirects to avoid potential caching issues.
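
    For example, while testing, the trailing-slash redirect could temporarily look like the following. This is only a sketch of the same rule with the status code swapped; change it back to 301 once you are happy with the behaviour.

    ```apache
    # Testing variant: 302 (temporary) so the browser does not cache the redirect
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)/$ /$1 [R=302,L]
    ```

    The same swap applies to the other R=301 rules in the file if you are testing those as well.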

    Additional notes:

    # Redirect xxx.html in URL to xxx
    RewriteCond %{THE_REQUEST} /([^.]+)\.html [NC]
    RewriteRule ^ /%1 [NC,L,R]
    

    Note that I also modified the regex in the condition of this rule (from [^.]+ to [^.?]+) to avoid matching a potential query string. Otherwise, a request of the form /foo.html?bar.html would result in a double redirect and the query string would be corrupted.

    NB: You don't currently have a corresponding rule for .php requests. (You could handle both in the same rule.)
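
    A minimal sketch of what a combined rule might look like, assuming you want /xxx.php requests redirected in the same way as /xxx.html (untested against your setup, so try it with a 302 first):

    ```apache
    # Redirect xxx.html or xxx.php in URL to just xxx
    # THE_REQUEST holds the original request line, so the later internal
    # rewrites that append .php/.html do not trigger this rule again.
    RewriteCond %{THE_REQUEST} /([^.?]+)\.(?:html|php) [NC]
    RewriteRule ^ /%1 [R=301,L]
    ```

    Bear in mind that any forms or scripts that POST directly to a .php URL would also be redirected by this, and the request body is generally not preserved across a 301, so those would need to be updated to use the extensionless URL.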