On my website, both
https://www.datanumen.com/support-doc/access-repair/ (URL1)
and
https://www.datanumen.com/support-doc/access-repair/index.htm (URL2)
load the same page, because Apache looks for index.htm in the folder and serves it.
Now, to prevent both URLs from being indexed by Google, I set the canonical URL to URL2 and added the following redirect to .htaccess so that URL1 redirects to URL2:
# 2021-12-15: Redirect help document URL to the canonical version
<IfModule mod_rewrite.c>
RewriteRule ^support-doc/([a-z\-]+)/ /support-doc/$1/index.htm [R=301,L]
</IfModule>
But this causes a "Too many redirections" error. Why?
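That causes a redirect loop because /support-doc/something/index.htm is itself matched by the regular expression ^support-doc/([a-z\-]+)/. The pattern is anchored only at the start, so it matches any path that begins with support-doc/something/, causing /support-doc/something/index.htm to redirect to itself.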
Perhaps you meant to limit the rewrite rule to the exact URL ending with the trailing slash? If so, the rule should be:
RewriteRule ^support-doc/([a-z\-]+)/$ /support-doc/$1/index.htm [R=301,L]
I added a $, meaning "ends with", to the rule so that a URL with anything after the trailing slash doesn't match.
I would recommend removing index.htm from your canonical URL. In general, your canonical URL should be the simplest URL. index.htm is not friendly for users, and it should never be shown in a URL. The index document exists to give the web server a document containing the content to show for the directory URL.
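If you go that route, the redirect would point the other way, from the index.htm URL to the directory URL. The following is only a sketch under some assumptions: it assumes the earlier directory-to-index.htm rule has been removed, that the canonical tag now points at the directory URL, and that the same [a-z\-]+ folder naming applies. The RewriteCond on %{THE_REQUEST} is my addition, not something from your setup. It checks the request line the browser originally sent, so the rule should not fire when Apache internally maps the directory URL to index.htm, which would otherwise risk another loop.

```apache
# Sketch: redirect .../index.htm to the directory URL (assumes the old rule is removed)
<IfModule mod_rewrite.c>
# Harmless if the engine is already enabled elsewhere in the file
RewriteEngine On
# Only act when the client explicitly requested index.htm,
# not when Apache internally serves it for the directory URL
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/support-doc/[a-z\-]+/index\.htm[?\s]
RewriteRule ^support-doc/([a-z\-]+)/index\.htm$ /support-doc/$1/ [R=301,L]
</IfModule>
```

It would be worth testing this with a temporary redirect (R=302) first, since browsers cache 301 responses aggressively.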