I'm using AWS S3 for static website hosting and running it through Cloudflare services (including DNS). It is SEO best practice to drop the .html extension from URLs while avoiding duplicate content. I was achieving this with nginx, and am wondering whether it is even possible using either S3 or Cloudflare. My gut tells me no.
The basic requirement is: example.com/about.html should rewrite (not redirect) to example.com/about. The file name stored on S3 should remain, obviously, *.html.
The one hack I've stumbled across is:
1. Upload the file as about (without the file extension).
2. Change the content-type back to text/html.
I view this as a horrible "solution": visiting *.html results in a 404, unless, of course, you create a duplicate file with the .html extension and then possibly create a URL forwarding rule in Cloudflare. Not only is it very messy, it just plain doesn't scale.
Is there a better way?
"My gut tells me no."
Your gut is both correct and incorrect.
You can't quite have it both ways with S3; implied extensions aren't supported... however, there is a way to do it while remaining (arguably) SEO-sane.
Instead of about → about.html, you can make about → about/ → about/index.html.
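As a rough sketch of that layout, the helper below maps a local file path to the S3 key it would be uploaded under. The function name clean_url_key and the sample paths are hypothetical, and it assumes only .html pages should be moved into their own "directory".

```python
from pathlib import PurePosixPath


def clean_url_key(relative_path, index_name="index.html"):
    """Map a local path such as 'about.html' to an S3 key such as 'about/index.html'.

    Hypothetical helper: index documents and non-HTML assets keep their path.
    """
    path = PurePosixPath(relative_path)
    if path.suffix != ".html" or path.name == index_name:
        return str(path)
    # Drop the extension and place the page at <name>/index.html.
    return str(path.with_suffix("") / index_name)


if __name__ == "__main__":
    for name in ["about.html", "index.html", "blog/post.html", "css/site.css"]:
        print(name, "->", clean_url_key(name))
```

The local files keep their .html names; only the key chosen at upload time changes.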
Enable index documents on the bucket. If the browser requests /about and that's not an object, but about/index.html exists, it will see a response of 302 Found with Location: /about/.
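Index documents can be enabled in the console; if you prefer to script it, a minimal sketch using boto3's put_bucket_website might look like the following. The function names are hypothetical, the bucket name is whatever you pass in, and boto3 plus AWS credentials are assumed to be available.

```python
def website_configuration(index_suffix="index.html", error_key=None):
    """Build the WebsiteConfiguration dict that put_bucket_website expects."""
    config = {"IndexDocument": {"Suffix": index_suffix}}
    if error_key is not None:
        config["ErrorDocument"] = {"Key": error_key}
    return config


def enable_index_documents(bucket, index_suffix="index.html"):
    """Turn on static website hosting with an index document for `bucket`."""
    # Imported lazily so website_configuration() works without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_website(
        Bucket=bucket,
        WebsiteConfiguration=website_configuration(index_suffix),
    )
```

Note that index documents only apply to the bucket's website endpoint, so that is the endpoint Cloudflare would need to point at.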
When S3 sees a request for /about/, it will return the contents of /about/index.html without issuing a redirect.
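To make the request flow concrete, here is a small model of how a website endpoint with index documents enabled resolves a path. It is a simplified simulation, not S3's actual implementation; the function name resolve and the sample keys are hypothetical, and the 302 status follows AWS's documentation of this redirect.

```python
def resolve(request_path, keys, index_suffix="index.html"):
    """Return (status, target) for a request against a set of object keys."""
    key = request_path.lstrip("/")
    if key == "" or key.endswith("/"):
        # Directory-style request: serve the index document if it exists.
        index_key = key + index_suffix
        return (200, index_key) if index_key in keys else (404, None)
    if key in keys:
        # An object with exactly this key exists, so it is served as-is.
        return (200, key)
    if f"{key}/{index_suffix}" in keys:
        # No such object, but an index document exists one level down.
        return (302, f"/{key}/")
    return (404, None)


if __name__ == "__main__":
    keys = {"index.html", "about/index.html"}
    for path in ["/about", "/about/", "/about.html", "/"]:
        print(path, resolve(path, keys))
```

In this model /about.html is a 404 because no object has that key, which is what keeps the page from being served under two URLs.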
Of course, your original workaround of changing the Content-Type in the console after the fact can be avoided if you set the content type manually when the document is uploaded. There are many content types the console does not set automatically on upload, so I am in the habit of setting them manually anyway.
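If you upload with a script rather than the console, the same idea can be sketched with boto3's upload_file and its ExtraArgs parameter. The helper names are hypothetical, and the content type is guessed from the local file name, since an extensionless key such as about gives nothing to guess from.

```python
import mimetypes


def upload_extra_args(local_path, default="application/octet-stream"):
    """Pick a Content-Type from the local file name, falling back to `default`."""
    content_type, _encoding = mimetypes.guess_type(local_path)
    return {"ContentType": content_type or default}


def upload_with_content_type(local_path, bucket, key):
    """Upload `local_path` to `bucket` under `key` with an explicit Content-Type."""
    # Imported lazily so upload_extra_args() works without boto3 installed.
    import boto3

    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket, key, ExtraArgs=upload_extra_args(local_path))
```

For example, upload_with_content_type("about.html", bucket, "about/index.html") would store the page under the directory-style key while keeping text/html as its type.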