Why is the byte stream returned by Python's socket.recvfrom different from the one captured by Wireshark?


I used a Python socket to send a DNS query packet and listen for the response. As expected, I received a DNS response packet from socket.recvfrom(2048). Strangely, when I compared that packet with the one captured by Wireshark, I found many differences.

The differences show up as 3f bytes in the second picture.

The DNS response packet (the highlighted part) captured by Wireshark

The DNS response packet returned by socket.recvfrom(2048)

The code that creates the socket:

    ipv = check_ip(dst)
    udp = socket.getprotobyname(Proto.UDP)
    if ipv == IPV.ERROR:
        return None
    elif ipv == IPV.IPV4:
        return socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
    elif ipv == IPV.IPV6:
        return socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, udp)
    else:
        return None

The code that receives the DNS response packet:

    remained_time = 0
    while True:
        remained_time = self.timeout - timeit.default_timer() + sent_time
        readable = select.select([sock], [], [], remained_time)[0]
        if len(readable) == 0:
            return (-1, None)

        packet, addr = sock.recvfrom(4096)

Solution

  • Byte 0x3F is the ASCII '?' character. That commonly means the data is being treated as text and is passing through a charset conversion that doesn't support the bytes being converted.

    Notice that 0x3F is replacing only the bytes that are > 0x7F (the last byte supported by ASCII). Non-ASCII bytes in the range of 0x80-0xFF are subject to charset interpretation.

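    That would make sense if the received data is handled as a string at some point, because the bytes then have to pass through a string encoding. Note that in Python 3 recvfrom() itself returns a bytes object, so the lossy conversion is more likely happening wherever the packet is later decoded, printed, or written out as text.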

    To make sure you are working with raw bytes, you can use recvfrom_into() to fill a pre-allocated bytearray, e.g.:

    packet = bytearray(4096)
    remained_time = 0
    while True:
        remained_time = self.timeout - timeit.default_timer() + sent_time
        readable = select.select([sock], [], [], remained_time)[0]
        if len(readable) == 0:
            return (-1, None)
        nbytes, addr = sock.recvfrom_into(packet)
    

    Then you can use the first nbytes bytes of packet as needed.
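To make the explanation above concrete, here is a minimal sketch, not taken from the original code. It assumes Python 3 and a UDP socket bound to the loopback interface. The helper names (lossy_ascii_round_trip, receive_raw, demo) and the sample payload are hypothetical and exist only for illustration. The sketch shows that a datagram read with recvfrom_into() arrives byte-for-byte intact, and that pushing the same bytes through an ASCII text conversion turns every byte above 0x7F into 0x3F.

```python
import socket


def lossy_ascii_round_trip(data):
    """Mimic a text conversion that cannot represent bytes above 0x7F."""
    # Decoding with errors="replace" maps each non-ASCII byte to U+FFFD;
    # encoding back to ASCII with errors="replace" turns each U+FFFD into b"?".
    text = data.decode("ascii", errors="replace")
    return text.encode("ascii", errors="replace")


def receive_raw(sock, bufsize=4096):
    """Read one datagram into a pre-allocated buffer and return only the bytes received."""
    buf = bytearray(bufsize)
    nbytes, addr = sock.recvfrom_into(buf)
    return bytes(buf[:nbytes]), addr


def demo():
    # Hypothetical payload with a few bytes above 0x7F, loosely resembling
    # the start of a DNS response. It is not a complete DNS message.
    payload = bytes([0x12, 0x34, 0x81, 0x80, 0x00, 0x01, 0xC0, 0x0C])
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # let the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())
        raw, _addr = receive_raw(receiver)
    finally:
        sender.close()
        receiver.close()
    return payload, raw, lossy_ascii_round_trip(raw)


if __name__ == "__main__":
    sent, raw, lossy = demo()
    print("sent :", sent.hex())
    print("raw  :", raw.hex())
    print("lossy:", lossy.hex())
```

If the raw line matches what Wireshark shows while your own dump looks like the lossy line, the socket is delivering the right bytes and the damage is happening in whatever step displays or stores the packet. Dumping the data with packet[:nbytes].hex() avoids any text conversion.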