[SOLVED] Unable to accept connections on socket, when creating sockets on remote node via RPC in Erlang

I am struggling to identify the reason for gen_tcp:accept always returning an {error, closed} response.

Essentially, I have a supervisor that creates a listening socket:

gen_tcp:listen(8081, [binary, {packet, 0}, {active, false}, {reuseaddr, true}]),

This socket is then passed to a child, which is an implementation of the gen_server behaviour. The child then accepts connections on the socket.

accept(ListeningSocket, {ok, Socket}) ->
    spawn(fun() -> loop(Socket) end),
    accept(ListeningSocket);
accept(_ListeningSocket, {error, Error}) ->
    io:format("Unable to listen on socket: ~p.~n", [Error]),
    gen_server:call(self(), stop).

accept(ListeningSocket) ->
    accept(ListeningSocket, gen_tcp:accept(ListeningSocket)).

loop(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} ->
            io:format("~p~n", [Data]),
            process_request(Data),
            gen_tcp:send(Socket, Data),
            loop(Socket);
        {error, closed} -> ok
    end.

I load the supervisor and gen_server BEAM binaries locally, and load them on another node (which runs on the same machine) via an RPC call to code:load_binary. Next, I start the supervisor via an RPC call, which in turn starts the server. In this scenario, gen_tcp:accept always returns {error, closed}.

If I instead start the supervisor and server while logged in to a node's shell, the server accepts connections without issue. This also holds when I 'remsh' to the same remote node on which accept had failed after I started the server via RPC.

I seem to be able to replicate the issue by using the shell alone:

[Terminal 1]: erl -sname node -setcookie abc -distributed -noshell

[Terminal 2]: erl -sname rpc -setcookie abc, then in its shell:

              net_adm:ping('node@verne').
              {ok, ListeningSocket} = rpc:call('node@verne', gen_tcp, listen, [8081, [binary, {packet, 0}, {active, true}, {reuseaddr, true}]]).
              rpc:call('node@verne', gen_tcp, accept, [ListeningSocket]).

The response to the final RPC is {error, closed}.

Could this be something to do with socket/port ownership?

In case it is of help, there are no clients waiting to connect, and I don't set timeouts anywhere.

Solution

Each rpc:call starts a new process on the target node to handle the request. In your final example, your first call creates a listen socket within such a process, which makes that process the socket's owner (its controlling process). A socket is closed when its owner exits, so when that process dies at the end of the rpc call, the listen socket is closed with it. Your second rpc call then attempts an accept on the already-closed listen socket and fails with {error, closed}.
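
You can see the same effect on a single node, without rpc. The following is only a minimal sketch: the module name owner_demo and the function run/0 are made up for illustration, and port 0 is used so the OS picks any free port. A short-lived process opens the listen socket and exits, and the caller then tries to accept on it:

```erlang
-module(owner_demo).
-export([run/0]).

%% Open a listen socket inside a short-lived process (much as each
%% rpc:call does), then try to accept on it after that process has exited.
run() ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
        %% Port 0 asks the OS for any free port (assumption for the demo).
        {ok, LSock} = gen_tcp:listen(0, [binary, {active, false}]),
        Parent ! {listening, LSock}
        %% The fun returns here, so the process that owns LSock exits.
    end),
    LSock = receive {listening, S} -> S end,
    %% Wait for the owner to be gone before using the socket.
    receive {'DOWN', Ref, process, Pid, _Reason} -> ok end,
    %% A 1 second timeout keeps the demo from hanging if the socket
    %% were somehow still open.
    gen_tcp:accept(LSock, 1000).
```

Calling owner_demo:run() should return an error tuple for the same reason as your final RPC: the only process that ever owned the listen socket is gone by the time accept is called.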

Your design seems unusual in several ways. For example, it's not normal for supervisors to open sockets. You also say the child is a gen_server, yet you show a manual recv loop which, if run within a gen_server, would block it. You might instead explain what you're trying to accomplish and ask for help coming up with a design that meets your goals.
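
Without knowing your goals, here is one possible arrangement, offered as a rough sketch rather than a recommended design. The module name echo_listener and its functions are hypothetical. The idea is that a long-lived process opens the listen socket itself, so the socket's lifetime no longer depends on the process that handled the rpc call, and each accepted connection is handed to its own process with gen_tcp:controlling_process/2:

```erlang
-module(echo_listener).
-export([start/1, init/1]).

%% Spawn an unlinked, long-lived process that opens and owns the
%% listen socket. Returns {ok, Pid, Port} once the socket is open.
start(Port) ->
    Parent = self(),
    Ref = make_ref(),
    _Pid = spawn(?MODULE, init, [{Parent, Ref, Port}]),
    receive
        {Ref, Result} -> Result
    after 5000 ->
        {error, timeout}
    end.

%% Runs in the long-lived process, which therefore owns LSock.
init({Parent, Ref, Port}) ->
    Opts = [binary, {packet, 0}, {active, false}, {reuseaddr, true}],
    case gen_tcp:listen(Port, Opts) of
        {ok, LSock} ->
            {ok, ActualPort} = inet:port(LSock),
            Parent ! {Ref, {ok, self(), ActualPort}},
            accept_loop(LSock);
        {error, _} = Error ->
            Parent ! {Ref, Error}
    end.

accept_loop(LSock) ->
    case gen_tcp:accept(LSock) of
        {ok, Sock} ->
            %% The handler waits for 'go' so it only touches the socket
            %% after ownership has been transferred to it.
            Handler = spawn(fun() -> receive go -> echo(Sock) end end),
            ok = gen_tcp:controlling_process(Sock, Handler),
            Handler ! go,
            accept_loop(LSock);
        {error, Reason} ->
            exit({accept_failed, Reason})
    end.

%% Echo received data back until the peer closes or an error occurs.
echo(Sock) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, Data} ->
            _ = gen_tcp:send(Sock, Data),
            echo(Sock);
        {error, _Reason} ->
            gen_tcp:close(Sock)
    end.
```

With something like this, rpc:call('node@verne', echo_listener, start, [8081]) would leave the listener running after the rpc process exits, because the listener is spawned without a link and owns the socket itself. In a real application you would more likely start such a process under a supervisor than spawn it bare.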