node.js reactjs amazon-ec2 amazon-rds http-status-code-503

The entire Node.js service crashes with 503 errors when the smallest error occurs


This is the first Node.js application I have designed and developed. When I run the app in staging or production, it crashes whenever a fatal error occurs, for instance when the code accesses error[0] while error is empty. This takes the entire service down, and the client receives 503 errors. I am used to PHP and C#, where this doesn't happen: the end user gets an error, but the server itself doesn't go down. With Node.js, the entire web service becomes unavailable. I am working through the code to catch everything that could be a fatal error, but I still don't have confidence in the app, knowing that a single mistake can leave my clients unable to work. To restart the service, I created a health check system that expects a 200 status code.

Here is my environment:

Here is what I want to know:


Solution

  • When you use Node.js, your application is the server. When you use PHP, on the other hand, the server is Apache, and your code is just a script executed by Apache's mod_php module. (These are the typical configurations for Node.js and PHP, though not the only ones.)

    So when your Node.js application has an uncaught error, it's equivalent to your HTTP server having an uncaught error: it will crash. With PHP, by contrast, Apache's mod_php catches the error and handles it in its own way.

    But that doesn't mean it's acceptable for a run-of-the-mill error to cause a Node.js HTTP server to crash. If that happens, it just means your error handling needs improvement. A well-coded Node.js server will catch an error, log it, respond with an error response, and keep chugging. You have to write that code yourself though, so it's not as forgiving as PHP in that regard.
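
    As a rough illustration of "catch, log, respond, keep going", here is a minimal sketch using only Node's built-in http module. The names handleRequest and withErrorHandling, the /boom route, and the port are all made up for this example, and a framework such as Express would provide its own error-handling hook instead.

    ```javascript
    const http = require('node:http');

    // Hypothetical route handler. The /boom branch mimics the bug from the
    // question: reading a property of errors[0] when the array is empty.
    function handleRequest(req, res) {
      if (req.url === '/boom') {
        const errors = [];
        res.end(errors[0].message); // throws TypeError: errors[0] is undefined
        return;
      }
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('ok');
    }

    // Top-level wrapper: catches synchronous throws and rejected promises
    // coming out of the handler, logs them, and answers with a 500.
    function withErrorHandling(handler) {
      return async (req, res) => {
        try {
          await handler(req, res);
        } catch (err) {
          console.error('Request failed:', err);
          if (!res.headersSent) {
            res.writeHead(500, { 'Content-Type': 'text/plain' });
          }
          res.end('Internal Server Error');
        }
      };
    }

    const server = http.createServer(withErrorHandling(handleRequest));
    // server.listen(3000); // the port is an arbitrary choice for this sketch
    ```

    With this in place, a request to /boom should get a 500 response while the process stays up and keeps serving other requests. It only covers errors that surface while the handler runs or that reject the promise it returns.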

    As for what to do about it, that depends on exactly what kind of errors you're facing, but the general idea is that there should be a top-level error handler that catches any error thrown during a request, logs it, does whatever else should be done, and returns a 500 response. There are at least a couple of gotchas in addition:

    1. That kind of error handler cannot catch an "unhandled promise rejection". This sort of error happens when you neither await your promises nor call catch() on them. You should always do one or the other, but if you want to stop your server from crashing while you track down the offending promises, you can subscribe to the process's unhandledRejection event. This will prevent them from crashing the process. Make sure to log them and fix them the right way.
    2. If you are using something that inherits from EventEmitter, and it fires an error event which you have not subscribed to, this will be re-thrown as an unhandled error and crash the application. If you're using anything that fires events (not super common for an HTTP server), make sure you subscribe to its error event.
    3. If you are using callback APIs (there's rarely a good reason to anymore), be careful about throwing an error from within a callback. The throw happens after the original call has returned, so no surrounding try/catch will see it, and it can crash the application.
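
    The three gotchas above can be sketched roughly as follows. This is an illustration under assumptions, not a drop-in fix: the emitter, the rejections and emitterErrors arrays, and the readFirstLine helper are hypothetical names invented for the example.

    ```javascript
    const { EventEmitter } = require('node:events');
    const fs = require('node:fs');

    // Gotcha 1: a stopgap listener that logs unhandled rejections instead of
    // letting them end the process. The real fix is to await or catch() them.
    const rejections = [];
    process.on('unhandledRejection', (reason) => {
      console.error('Unhandled rejection:', reason);
      rejections.push(reason);
    });

    // Gotcha 2: once an 'error' listener is attached, emitting 'error' calls
    // the listener instead of throwing.
    const emitter = new EventEmitter();
    const emitterErrors = [];
    emitter.on('error', (err) => {
      console.error('Emitter error:', err.message);
      emitterErrors.push(err);
    });
    emitter.emit('error', new Error('connection lost'));

    // Gotcha 3: inside a callback, hand the error to the caller rather than
    // throwing it, because no try/catch is on the stack at that point.
    function readFirstLine(path, done) {
      fs.readFile(path, 'utf8', (err, text) => {
        if (err) return done(err); // pass it along; do not throw here
        done(null, text.split('\n')[0]);
      });
    }
    ```

    The unhandledRejection listener is best treated as a temporary safety net while the missing await or catch() calls are tracked down, as the answer suggests.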