java python sockets recv

Python socket recv is splitting received message


I'm setting up socket communication between a Python process and a Java process. I'm trying to send an int from Java and receive it in Python.

Java side (sender):

        ServerSocket ss = new ServerSocket(6666);
        Socket s = ss.accept();

        DataOutputStream dos = new DataOutputStream(s.getOutputStream());
        
        final int numberToSend = 512;

        dos.writeInt(numberToSend);  //In big endian        
        dos.flush();

Python side (receiver):

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(('127.0.0.1', 6666))

        while True:
            data_received = s.recv(4)
            int_received = int.from_bytes(data_received, byteorder='big')
            print("Received:", data_received, "Number:", int_received)
            
            do_other_stuff(int_received)

I'm pretty sure the Java side is working correctly. dos.size() gives 4 bytes.

However, on the Python side, data_received seems to arrive split into two parts. On two successive loop iterations, the print gives:

Received: b'\x00' Number: 0
Received: b'\x00\x02\x00' Number: 512

The expected output would be Received: b'\x00\x00\x02\x00' Number: 512. Because the message is split, it interferes with do_other_stuff. sys.getsizeof(data_received) gives 18 and 20 for the first and second parts, respectively.

I've also tried int_received = struct.unpack("!i", data_received)[0] but it gives struct.error: unpack requires a buffer of 4 bytes, as the messages have 1 and 3 bytes.

What am I doing wrong?


Solution

  • You are not doing anything logically wrong; this is how TCP works. TCP delivers a stream of bytes, not discrete messages, so s.recv(4) returns up to 4 bytes, not exactly 4. That is why the examples in the documentation on using sockets call recv in a loop until the expected amount of data has arrived.

    Try doing the same, for example:

        import socket

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(('127.0.0.1', 6666))

        data_received = b""
        while True:
            # recv(n) may return fewer than n bytes, so read until all 4 have arrived
            while len(data_received) < 4:
                chunk = s.recv(4 - len(data_received))
                if not chunk:
                    # An empty result means the peer closed the connection
                    raise ConnectionError("socket closed before 4 bytes arrived")
                data_received += chunk
            int_received = int.from_bytes(data_received, byteorder='big')
            print("Received:", data_received, "Number:", int_received)
            do_other_stuff(int_received)
            data_received = b""
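
The same loop can be factored into a small helper so that the calling code only ever sees complete 4-byte values. This is a minimal sketch, not a definitive implementation: the names recv_exact and recv_java_int are hypothetical (they are not part of the socket module), and it decodes with struct's "!i" format on the assumption that negative values should round-trip, since Java's writeInt writes a signed 32-bit int in big-endian order.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))  # may return fewer bytes than requested
        if not chunk:  # b"" means the connection was closed
            raise ConnectionError(f"expected {n} bytes, got {len(buf)}")
        buf += chunk
    return bytes(buf)


def recv_java_int(sock: socket.socket) -> int:
    """Decode one value written by Java's DataOutputStream.writeInt."""
    # "!i" = network (big-endian) byte order, signed 32-bit integer
    return struct.unpack("!i", recv_exact(sock, 4))[0]


if __name__ == "__main__":
    # Local demo: a socket pair stands in for the Java sender.
    a, b = socket.socketpair()
    a.sendall(b"\x00")           # deliver the int in two pieces, as in the question
    a.sendall(b"\x00\x02\x00")
    print(recv_java_int(b))      # 512
    a.close()
    b.close()
```

With a helper like this, the receive loop from the question reduces to calling recv_java_int(s) and passing the result to do_other_stuff. The struct.unpack call that failed in the question works here because it always receives a full 4-byte buffer.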
    

    EDIT: Added annotation from Remy Lebeau. Thank you.

    EDIT 2: As user user207421 points out, this is TCP's behavior, not Python's. Thank you, that was absolutely new to me. :D
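
Because the splitting comes from TCP's stream semantics rather than from Python, another option is to let a buffered file object do the accumulating. The sketch below is an alternative technique, not the approach from the answer above: it wraps the socket with socket.makefile("rb"), whose read(4) keeps reading until 4 bytes are available or the stream ends. The generator name read_java_ints is hypothetical, and the sketch assumes a blocking socket with no timeout set.

```python
import socket
import struct


def read_java_ints(sock: socket.socket):
    """Yield each 4-byte big-endian signed int until the peer closes."""
    with sock.makefile("rb") as stream:
        while True:
            data = stream.read(4)  # buffered read: blocks until 4 bytes or EOF
            if len(data) < 4:      # EOF (possibly in the middle of a value): stop
                return
            yield struct.unpack("!i", data)[0]
```

It could be used as `for int_received in read_java_ints(s): do_other_stuff(int_received)`, with the loop ending once the Java side closes the connection.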