
Increase connect(2) timeout in RestClient / Net::HTTP on AWS Linux


I'm using rest-client to POST to a very slow web service. I'm setting timeout to 600 seconds, and I've confirmed that it's being passed down to Net::HTTP's @read_timeout and @open_timeout.
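For reference, this is a minimal sketch of what that configuration amounts to at the Net::HTTP level. The host and port are placeholders, not the real service, and the rest-client call that produces these settings is not shown.

```ruby
require 'net/http'

# Placeholder host and port; the real service details are not shown here.
http = Net::HTTP.new('slow-service.example.com', 8080)

# Both values are in seconds. No connection is made until the session starts.
http.open_timeout = 600 # time allowed for opening the connection
http.read_timeout = 600 # time allowed for each read from the socket
```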

However, after about two minutes, I get a low-level timeout error: Errno::ETIMEDOUT: Connection timed out - connect(2).

The relevant part of the backtrace is:

Operation timed out - connect(2) for [myhost] port [myport]
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/net/http.rb:879:in `initialize'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/net/http.rb:879:in `open'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/net/http.rb:879:in `block in connect'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/timeout.rb:98:in `call'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/timeout.rb:98:in `timeout'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/net/http.rb:878:in `connect'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/net/http.rb:863:in `do_start'
/Users/dmoles/.rvm/rubies/ruby-2.2.5/lib/ruby/2.2.0/net/http.rb:852:in `start'
/Users/dmoles/.rvm/gems/ruby-2.2.5/gems/rest-client-2.0.0/lib/restclient/request.rb:766:in `transmit'
/Users/dmoles/.rvm/gems/ruby-2.2.5/gems/rest-client-2.0.0/lib/restclient/request.rb:215:in `execute'
/Users/dmoles/.rvm/gems/ruby-2.2.5/gems/rest-client-2.0.0/lib/restclient/request.rb:52:in `execute'

It looks like the line of code throwing the error is:

TCPSocket.open(conn_address, conn_port, @local_host, @local_port)

It seems as though the underlying connect(2) system call has a timeout of about two minutes, and the timeout parameters passed to Net::HTTP can only shorten that, not lengthen it. Is there a way to modify the socket parameters to set a longer timeout?

Edited to add: This only appears to be a problem on our AWS Linux servers. On my macOS development machine, the ten-minute timeout works. I assume the default connect() timeout is longer on macOS/BSD, but I don't really know.


Solution

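  • First, you could raise the tcp_syn_retries setting, for example by writing a larger value to the /proc/sys/net/ipv4/tcp_syn_retries file. With the usual Linux default of 6 retries and an initial retransmission timeout of 1 second that doubles on each retry, connect(2) gives up after about 127 seconds, which matches the two-minute limit you are seeing. Reference here.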

    If that doesn't work, I think you will need to enable the SO_KEEPALIVE or TCP_USER_TIMEOUT socket options, but rest-client probably has no interface for that.

    So you may need to fork it, or create the Socket and set the Socket::Option values yourself.

    Mike Perham wrote about it in his blog.
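
As a rough illustration of creating the socket yourself, the sketch below sets SO_KEEPALIVE and, where Ruby exposes it, TCP_USER_TIMEOUT before calling connect(2). The helper name open_tuned_socket and the user_timeout_ms parameter are made up for this example and are not part of rest-client or Net::HTTP. Whether these options actually lengthen the connect phase depends on the kernel, so treat this as a starting point under those assumptions rather than a confirmed fix.

```ruby
require 'socket'

# Hypothetical helper (not part of rest-client or Net::HTTP): opens a TCP
# connection with socket options applied before connect(2) is called.
def open_tuned_socket(host, port, user_timeout_ms: 600_000)
  addr = Addrinfo.tcp(host, port)
  sock = Socket.new(addr.afamily, Socket::SOCK_STREAM, 0)

  # Enable TCP keepalive probes on this socket.
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)

  # TCP_USER_TIMEOUT is Linux-specific, so only set it when the constant
  # is defined on this platform. The value is in milliseconds.
  if Socket.const_defined?(:TCP_USER_TIMEOUT)
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_USER_TIMEOUT, user_timeout_ms)
  end

  # Blocking connect; still subject to the kernel's SYN retry limit.
  sock.connect(addr)
  sock
rescue StandardError
  # Do not leak the file descriptor if any step above fails.
  sock.close if sock
  raise
end
```

Calling open_tuned_socket('127.0.0.1', some_port) against a local listener returns a connected Socket. Getting Net::HTTP or rest-client to use such a socket would still require a patch or fork, as noted above.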