Its been a few days since I'm playing around Hole Punching in order to have some kind of reliable behaviour, but I'm now at a dead end.
UDP Hole punching works great: simply first send a packet to the remote, and get the remote to send a packet the otherway as it will land through the source NAT. Its rather reliable from what I tried.
But now comes TCP... I don't get it.
Right now, I can establish a connection through NATs but only with connecting sockets:
A.connect(B) -> Crash agains't B's NAT, but open a hole in A's NAT.
B.connect(A) -> Get in A's NAT hole, reach A's connecting socket.
But now, the two sockets that sended the SYN packets for connection are connected.
You would think that I would have done it, got a connection through 2 NATs, hooray.
But the problem is that this is not a normal behaviour, and given to this paper: http://www.brynosaurus.com/pub/net/p2pnat/, I should be able to have a listening socket in parallel to the connecting socket.
So I did bind a listening socket, which would accept inbound connections.
But inbound connections are always caught by the connecting socket and not the by the listening one...
e.g:
#!/usr/bin/env python3
from socket import *
from threading import Thread
Socket = socket
# The used endpoints:
LOCAL = '0.0.0.0', 7000
REMOTE = 'remote', 7000
# Create the listening socket, bind it and make it listen:
Listening = Socket(AF_INET, SOCK_STREAM)
Listening.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
Listening.bind(LOCAL)
Listening.listen(5)
# Just start in another thread some kind of debug:
# Print the addr of any connecting client:
def handle():
while not Listening._closed:
client, addr = Listening.accept()
print('ACCEPTED', addr)
Thread(target=handle).start()
# Now creating the connecting socket:
Connecting = Socket(AF_INET, SOCK_STREAM)
Connecting.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
Connecting.bind(LOCAL)
# Now we can attempt a connection:
try:
Connecting.connect(REMOTE)
print('CONNECTED', Connecting.getpeername())
except Exception as e:
print('TRIED', type(e), e)
Now with this script, just agree on a port with a friend or whatever, and execute it on one end's, the Connecting.connect(...)
should run for a bit (waiting for timeout, because SYN packet crashed into the distant NAT, but fortunately opened a hole in his own NAT), meanwhile execute the script on the other end, now the Connecting.connect(...)
will return because it will have connected.
The weirdest part being: The Listening
socket was never triggerred.
Why ? How to get the listening socket to catch inbound connections over the connecting socket ?
Note: Closing the connecting socket does send something on the network which immediately close the hole, at least it does on my network.
2nd-Note: I'm on windows.
Edit: The main problem is that in any circumstance, this script outputs CONNECTED [...]
rather than CLIENT [...]
, which given some lecture should not happen.
So, after more tests and readings, here's what I came to:
In fact, it is possible to bind a listen socket and a socket making outbound connections on the same address (ip, port).
But the behaviour of the sockets heavily depends on the system / TCP stack implementation, as mentionned in http://www.brynosaurus.com/pub/net/p2pnat/ at §4.3:
What the client applications observe to happen with their sockets during TCP hole punching depends on the timing and the TCP implementations involved. Suppose that A's first outbound SYN packet to B's public endpoint is dropped by NAT B, but B's first subsequent SYN packet to A's public endpoint gets through to A before A's TCP retransmits its SYN. Depending on the operating system involved, one of two things may happen:
A's TCP implementation notices that the session endpoints for the incoming SYN match those of an outbound session A was attempting to initiate. A's TCP stack therefore associates this new session with the socket that the local application on A was using to connect() to B's public endpoint. The application's asynchronous connect() call succeeds, and nothing happens with the application's listen socket.
Since the received SYN packet did not include an ACK for A's previous outbound SYN, A's TCP replies to B's public endpoint with a SYN-ACK packet, the SYN part being merely a replay of A's original outbound SYN, using the same sequence number. Once B's TCP receives A's SYN-ACK, it responds with its own ACK for A's SYN, and the TCP session enters the connected state on both ends.Alternatively, A's TCP implementation might instead notice that A has an active listen socket on that port waiting for incoming connection attempts. Since B's SYN looks like an incoming connection attempt, A's TCP creates a new stream socket with which to associate the new TCP session, and hands this new socket to the application via the application's next accept() call on its listen socket. A's TCP then responds to B with a SYN-ACK as above, and TCP connection setup proceeds as usual for client/server-style connections.
Since A's prior outbound connect() attempt to B used a combination of source and destination endpoints that is now in use by another socket, namely the one just returned to the application via accept(), A's asynchronous connect() attempt must fail at some point, typically with an “address in use” error. The application nevertheless has the working peer-to-peer stream socket it needs to communicate with B, so it ignores this failure.The first behavior above appears to be usual for BSD-based operating systems, whereas the second behavior appears more common under Linux and Windows.
So I'm actually in the first case. On my Windows 10.
This implies that in order to make a reliable method for TCP Hole Punching, I need to bind a listening socket at the same time as a connecting socket, but I later need to detect which one triggered (listening or connecting) and pass it down the flow of the application.