I've been working on a new project that uses an API Gateway endpoint mapped to a Lambda function. The Lambda function hosts a Kestrel .NET web server that receives requests proxied through API Gateway. I have also mapped API Gateway to a custom subdomain for branding consistency. Everything is working fine; however, I recently implemented Elmah.IO in order to get a better understanding of what errors come up in this unusual context.
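For reference, the Lambda is wired up along the lines of the standard Amazon.Lambda.AspNetCoreServer entry point, roughly sketched below (class names here are illustrative, not copied from the project):

using Amazon.Lambda.AspNetCoreServer;
using Microsoft.AspNetCore.Hosting;

// Translates API Gateway proxy requests into ASP.NET Core requests
// so the normal MVC pipeline handles them inside Lambda.
public class LambdaEntryPoint : APIGatewayProxyFunction
{
    protected override void Init(IWebHostBuilder builder)
    {
        builder.UseStartup<Startup>();
    }
}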
Now, one to five times per day, the API Gateway URL receives a request for a robots.txt file that it is unable to complete. I wouldn't expect the API to be able to complete this request, because the API is not meant to serve static content. My question is: how can I prevent these requests from being made?
What is causing the API Gateway URL to be requested in the first place? Is the crawler finding it through links on my host site? The host site uses CORS to access the API, so the robot could be detecting the API as a completely separate domain and attempting to crawl it. If so, is there some configuration I could add to my Web API to return a plain-text response of my own design for the robots.txt request?
After researching for a little while, I eventually just generated the robots.txt response dynamically. I was reading the article at: http://rehansaeed.com/dynamically-generating-robots-txt-using-asp-net-mvc/
This gave me the idea to generate the response on the fly, so I set up the following:
[Route("/")]
public class ServerController : Controller
{
[HttpGet("robots.txt")]
public ContentResult GetRobotsFile()
{
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.AppendLine("user-agent: *");
stringBuilder.AppendLine("disallow: /"); // this will disallow all routes
return this.Content(stringBuilder.ToString(), "text/plain", Encoding.UTF8);
}
}
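I went with the controller approach, but if you would rather short-circuit the request before it ever reaches MVC, something along these lines in Startup.Configure should work too (sketch only; I have not used this in the project):

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

public class Startup
{
    // Only the relevant part of Configure is shown; register this branch
    // before UseMvc so robots.txt is answered without hitting a controller.
    public void Configure(IApplicationBuilder app)
    {
        app.Map("/robots.txt", branch => branch.Run(async context =>
        {
            context.Response.ContentType = "text/plain";
            await context.Response.WriteAsync("user-agent: *\ndisallow: /");
        }));
    }
}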