ruby-on-rails, ruby-on-rails-3, webserver, thin, evented-io

Thin server underperforming / How do evented web servers work?


I had a Rails 3 app on Nginx/Passenger that I just moved to Nginx/Thin (1.3.1). However, the app is now clearly slower than it was on Passenger, and a lot of requests time out.

Thin is an evented web server. From what I have read, evented web servers don't have a concept of workers: one "worker" handles everything, so if one request is waiting on IO, Thin just moves on to the next request, and so on. One explanation I read claimed that evented servers should perform as well as or better than worker-based servers because they are bound only by system resources.

However, my CPU usage is very low, my memory usage is low too, and there isn't much IO happening either. My app just makes a few MySQL queries.

What is the bottleneck here? Shouldn't my Thin server handle requests until the CPU is at 100%? Do I have to do anything differently in my app for it to perform better with an evented server?


Solution

  • Sergio is correct. Your app, at this point, is probably better off on the traditional Apache/Passenger model. If you take the evented route, especially on single-threaded platforms like Ruby, you can NEVER block on anything, whether it is the DB, cache servers, or other HTTP requests you might make - nothing.

    This is what makes asynchronous (evented) programming harder - it is easy to block on something, usually in the form of synchronous disk I/O or DNS resolution. Non-blocking (evented) frameworks such as nodejs are careful to (almost) never give you a framework call that blocks; instead, everything is handled using callbacks (including DB queries).

    This might be easier to visualize if you look at the heart of a single-threaded non-blocking server:

    while( wait_on_sockets( /* list<socket> */ &$sockets, /* event */ &$what, $timeout ) ) {
        foreach( $socketsThatHaveActivity as $fd ) {
            if( $what == READ ) {   // There is data available to read from this socket
                $data = readFromSocket( $fd );
                processDataQuicklyWithoutBlocking( $data );
            }
            elseif( $what == WRITE && ($data = dataToWrite( $fd )) ) { // This socket is ready to be written to (if we have any data)
                writeToSocket( $fd, $data );
            }
        }
    }
    

    What you see above is called the event loop. wait_on_sockets is usually provided by the OS in the form of a system call such as select, poll, epoll, or kqueue. If processDataQuicklyWithoutBlocking takes too long, the network buffers the OS maintains for your application (new requests, incoming data, etc.) will eventually fill up, causing it to reject new connections and time out existing ones, because $socketsThatHaveActivity isn't being drained fast enough. This is different from a threaded server (e.g. a typical Apache install), where each connection is served by a separate thread/process, so incoming data is read into the app as soon as it arrives and outgoing data is sent without delay.
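
    For a concrete version of the same loop, here is a toy single-threaded echo server in Ruby built on IO.select (the select(2) family mentioned above). This is only a sketch - the port is arbitrary, and the write call can still block if a client stops reading:

    require 'socket'

    # One process, one loop: every connection is multiplexed over IO.select.
    server  = TCPServer.new(9000)
    sockets = [server]

    loop do
      # Block only here, waiting for any watched socket to become readable.
      readable, = IO.select(sockets)
      readable.each do |sock|
        if sock == server
          # New connection: watch it instead of spawning a thread for it.
          sockets << server.accept
        else
          begin
            data = sock.read_nonblock(4096)
            sock.write(data)              # "process quickly without blocking"
          rescue EOFError, Errno::ECONNRESET
            sockets.delete(sock)          # client went away; stop watching it
            sock.close
          end
        end
      end
    end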

    What non-blocking frameworks like nodejs do when you make (for example) a DB query is add the DB server's socket connection to the list of sockets being monitored ($sockets), so even if your query takes a while, your (only) thread isn't blocked on that one socket. Rather, they provide a callback:

    $db.query( "...sql...", function( $result ) { /* handle result */ } );
    

    As you can see above, db.query returns immediately, with absolutely no blocking on the DB server whatsoever. This also means you frequently have to write code like the following, unless the programming language itself supports async functions (like C#'s new async/await):

    $db.query( "...sql...", function( $result ) { $httpResponse.write( $result ); $connection.close(); } );
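
    In Ruby the same pattern is available through EventMachine, the reactor Thin itself runs on. A minimal sketch, assuming the mysql2 gem's EventMachine-aware client (Mysql2::EM::Client) and placeholder connection details - query returns a deferrable immediately, and the result arrives in a callback:

    require 'eventmachine'
    require 'mysql2/em'   # EventMachine-aware client from the mysql2 gem

    EM.run do
      # Connection details are placeholders for this sketch.
      client = Mysql2::EM::Client.new(:host => 'localhost', :username => 'app')

      # Returns an EM::Deferrable right away; the reactor keeps spinning
      # while MySQL executes the query.
      defer = client.query("SELECT SLEEP(2), 42 AS answer")

      defer.callback do |result|
        result.each { |row| puts row['answer'] }   # fires ~2s later
        EM.stop
      end
      defer.errback do |err|
        warn "query failed: #{err}"
        EM.stop
      end

      # Evidence the loop never blocked: this fires while the query runs.
      EM.add_timer(0.5) { puts "event loop still responsive" }
    end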
    

    The never-ever-block rule can be somewhat relaxed if you have many processes that each run an event loop (typically the way to run a node cluster), or if you use a thread pool to maintain the event loop (Java's Jetty, Netty, etc.; you can write your own in C/C++). While one thread is blocked on something, the other threads can still run the event loop. But under heavy enough load, even these will fail to perform. So NEVER EVER BLOCK in an evented server.
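
    The multi-process variant is straightforward to sketch in Ruby: open the listening socket once, fork, and give each child its own copy of the select loop from earlier. If one child blocks, the others keep serving (port and worker count are placeholders):

    require 'socket'

    server = TCPServer.new(9000)   # opened once, inherited by every child

    4.times do
      fork do
        sockets = [server]
        loop do                    # each child runs its own event loop
          readable, = IO.select(sockets)
          readable.each do |sock|
            if sock == server
              begin
                sockets << server.accept_nonblock  # a sibling may have won the race
              rescue IO::WaitReadable
              end
            else
              begin
                sock.write(sock.read_nonblock(4096))  # same toy echo logic as above
              rescue EOFError, Errno::ECONNRESET
                sockets.delete(sock)
                sock.close
              end
            end
          end
        end
      end
    end

    Process.waitall   # parent just supervises; children do the work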

    So as you can see, evented servers generally try to solve a different problem: they can sustain a very large number of open connections. Where they excel is in pushing bytes around with light computation (e.g. Comet servers, caches like memcached, Varnish, proxies like nginx, squid, etc.). It is worth noting that even though they scale better, response times generally tend to increase (nothing beats reserving an entire thread for a connection). Of course, it might not be economically/computationally feasible to run as many threads as there are concurrent connections.

    Now back to your issue - I would recommend you still keep Nginx around, as it is excellent at connection management (which is event-based); that generally means handling HTTP keep-alives, SSL, etc. You should then connect it to your Rails app using FastCGI, where you still need to run workers but don't have to rewrite your app to be fully evented. You should also let Nginx serve static content - there is no point tying up your Rails workers with something Nginx can usually do better. This approach generally scales much better than Apache/Passenger, especially if you run a high-traffic website.
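
    As an Nginx config sketch of that layout - addresses, paths and worker count are placeholders, not values from the question:

    # Sketch only - sockets, paths and worker count are placeholders.
    upstream rails_workers {
        server 127.0.0.1:9001;   # one entry per FastCGI worker process
        server 127.0.0.1:9002;
    }

    server {
        listen 80;
        root /var/www/app/public;

        # Serve static files directly; only misses reach the workers.
        location / {
            try_files $uri @rails;
        }

        location @rails {
            include fastcgi_params;
            fastcgi_pass rails_workers;
        }
    }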

    If you can write your entire app to be evented, then great, but I have no idea how easy or difficult that is in Ruby.