I'm hosting a website on AWS EC2 and have been fine-tuning my configuration and SEO. All of the pages work, and my domain routing works properly. The issue I keep having is that when I look up my site on Google, I see results for both mydomain.com and my.public.ip.amazonaws.com. I am trying to prevent the latter from showing up and confusing users.
I have looked into Google's crawling procedures and "robots.txt" rules for preventing crawling of certain areas of a site (documentation). But if I create a noindex rule in there, I remove my whole site from Google. How do I prevent search engines from displaying content accessed via my server's IP, and only display content that routes through my domain?
Your web server should be redirecting users to your canonical domain.
So, a request to my.public.ip.amazonaws.com should be answered with a 301 redirect with:
Location: https://example.com/whatever
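As a rough illustration of that rule, here is a minimal Python sketch of the decision the server has to make for each request. The function name canonical_redirect and the CANONICAL_HOST constant are hypothetical, and example.com stands in for your real domain; in practice you would normally express the same logic in your web server's own configuration rather than in application code.

```python
from typing import Optional, Tuple

# Assumption: the one hostname you want search engines to index.
CANONICAL_HOST = "example.com"


def canonical_redirect(host_header: str, path: str) -> Optional[Tuple[int, str]]:
    """Return (301, location) when the request did not arrive on the
    canonical host, or None when it can be served normally."""
    # Drop an optional ":port" suffix and ignore case differences.
    host = host_header.split(":", 1)[0].strip().lower()
    if host == CANONICAL_HOST:
        return None
    # Keep the requested path so each page maps to its canonical twin.
    if not path.startswith("/"):
        path = "/" + path
    return 301, f"https://{CANONICAL_HOST}{path}"


if __name__ == "__main__":
    print(canonical_redirect("my.public.ip.amazonaws.com", "/whatever"))
    print(canonical_redirect("example.com", "/whatever"))
```

Because the redirect keeps the path, every page reached through the EC2 hostname points to the same page on the domain, which is what lets a crawler consolidate the two sets of results.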