python tcp packets

Creating application level packets in Python


I have a basic multi-threaded client/server running on Python 3.6.

Now, once a connection is established, I want to create application-level packets which will be sent over TCP/IP. Their purpose is to perform a three-way handshake to identify multiple clients and then authenticate them. The packets will also be used to send certain payloads to the server.

Since Python does not have a data type like C structures, I am having a hard time creating these packets. I can't use tuples because they are immutable. I have tried using recordclass and using Structure from ctypes.

Data is not sent properly while using recordclass. Because I don't know the exact size of each packet, I set the recv() argument to a maximum limit, but this puts the client in a blocking state if the packet is shorter than that limit. With ctypes structures I can send the data, but it is received in a format like this: \xbct\x00\x106\xe0\x02ff\xc8B, and I can't convert it back to the original form.

Any sort of help will be highly appreciated.

EDIT: This is what I have done so far. Below are the code snippets I am using; the structure fields are just arbitrary and I will change them later on.

Server Side:

    ...
    ...
    class app_packet(Structure):
        _fields_ = [('packet_type',c_wchar_p),
                    ('sensor_name',c_wchar_p),
                    ('value',c_float)]

    syn=app_packet('syn','temperature',100.2)
    connectionSocket.sendall(syn)
    ...
    ...

Client side:

    ...
    ...
    class app_packet(Structure):
        _fields_ = [('packet_type',c_wchar_p),
                    ('sensor_name',c_wchar_p),
                    ('value',c_float)]

    data=clientSocket.recv(1024)
    syn=unpack('3s4sf',data)
    a=str(syn)
    print("unpacked="+a)
    ...
    ...

But the problem still remains: even after I unpack the received packet, the string data stays in byte format, whereas the float data is converted properly. This is what I get as the output of the print statement:

unpacked=  (b'(\xbcY', b'\x00\xa0\xa6\xc2', 100.19999694824219)

I have tried different encoding/decoding schemes, but nothing has worked so far, and I cannot convert the strings back.


Solution

  • The main problem with your code is that you're not actually sending the strings.

    You've defined a structure that includes two c_wchar_p members (in C terms, wchar_t * pointers). You send those pointers over, but never send the data they point to. That can't possibly work.
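    To see this concretely, here is a minimal sketch that rebuilds the question's structure (renamed AppPacket here, with explicit ctypes imports, both my own choices) and looks at the raw bytes that would go out on the socket. The size never changes with the string length, because only the pointer values are stored in the structure.

```python
import ctypes

class AppPacket(ctypes.Structure):
    # Same field layout as the question's app_packet.
    _fields_ = [('packet_type', ctypes.c_wchar_p),
                ('sensor_name', ctypes.c_wchar_p),
                ('value', ctypes.c_float)]

short = AppPacket('syn', 'temperature', 100.2)
long_ = AppPacket('syn', 'temperature' * 100, 100.2)

# bytes() copies the raw memory of the structure: two pointers and a float
# (plus any alignment padding the platform adds).
raw_short = bytes(short)
raw_long = bytes(long_)

# Both have the same length, however long the strings are.
print(len(raw_short), len(raw_long), ctypes.sizeof(AppPacket))
```

    The exact size printed depends on the platform's pointer width and alignment, so no specific number is assumed here.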

    You can't just send C structs containing pointers in any language. You have to define a higher-level protocol that carries the actual strings instead of the pointers, and then write the code that serializes to and deserializes from that format. That is much easier to do with functions like struct.pack than with a ctypes.Structure. It is easier still on top of a framing scheme like netstrings, and easiest of all if you use a text-based protocol with human-readable framing, such as JSON texts (in which newlines are always escaped) with newlines as delimiters.
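    As one possible shape for such a protocol, here is a rough sketch using struct.pack with length prefixes. The wire layout, and the names pack_packet, unpack_packet, recv_exact and recv_packet, are all assumptions made up for illustration, not anything from your code. Because every message starts with its own length, the receiver reads exactly that many bytes instead of guessing a recv() size and blocking.

```python
import struct

# Hypothetical wire layout (an assumption for this sketch):
#   2 bytes  big-endian length of the body that follows
#   1 byte   length of packet_type, then its UTF-8 bytes
#   1 byte   length of sensor_name, then its UTF-8 bytes
#   4 bytes  big-endian float
def pack_packet(packet_type, sensor_name, value):
    t = packet_type.encode('utf-8')
    n = sensor_name.encode('utf-8')
    body = (struct.pack('!B', len(t)) + t +
            struct.pack('!B', len(n)) + n +
            struct.pack('!f', value))
    return struct.pack('!H', len(body)) + body

def unpack_packet(body):
    tlen = body[0]
    packet_type = body[1:1 + tlen].decode('utf-8')
    pos = 1 + tlen
    nlen = body[pos]
    sensor_name = body[pos + 1:pos + 1 + nlen].decode('utf-8')
    pos += 1 + nlen
    (value,) = struct.unpack('!f', body[pos:pos + 4])
    return packet_type, sensor_name, value

def recv_exact(sock, n):
    # recv() may return fewer bytes than requested, so loop until we have n.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('socket closed mid-message')
        buf.extend(chunk)
    return bytes(buf)

def recv_packet(sock):
    (length,) = struct.unpack('!H', recv_exact(sock, 2))
    return unpack_packet(recv_exact(sock, length))
```

    The sender would call something like connectionSocket.sendall(pack_packet('syn', 'temperature', 100.2)) and the receiver recv_packet(clientSocket). The strings come back as str; the float comes back as the nearest 32-bit value, as in your output.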


    If you really want a binary protocol based on just dumping fixed-sized structures to the network, your structs have to be fixed-sized and self-contained. For example, if your packet_type were always at most 4 characters, and your sensor_name at most 30 characters, and it were acceptable to waste space for shorter names, you could do this:

    class app_packet(Structure):
        _fields_ = [('packet_type',c_wchar*4),
                    ('sensor_name',c_wchar*30),
                    ('value',c_float)]
    

    Now the characters are embedded directly in the structure, so it will work.


    Except that it won't really work, because your data types aren't network-portable. A wchar_t can be either 2 bytes or 4 bytes, not just between different platforms, but even between binaries built with different compilers or flags on the same platform. (Plus, they're of course native-endian.) If you really want embedded 2-byte or 4-byte strings, you have to be explicit about it: use arrays of c_uint16 or c_uint32, encode with s.encode('utf-16') or s.encode('utf-32') (or better, an explicit-endian variant such as 'utf-16-le', since plain 'utf-16' prepends a byte-order mark), then either memcpy or cast and slice-copy. But then of course they're not strings within your code until you pull them out, cast them back, and decode them, at which point you might as well be using a proper protocol in the first place.
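    If a fixed-size record is still what you want, a simpler route than the c_uint16 arrays described above is a struct format with explicit byte order and fixed-width byte fields. Note that this is a different technique: the sketch below assumes UTF-8 in null-padded 4-byte and 30-byte fields (so the limits are in bytes, not characters), and the names PACKET, pack_fixed and unpack_fixed are invented for illustration.

```python
import struct

# '!' selects network byte order with no alignment padding,
# so every packet is exactly 4 + 30 + 4 = 38 bytes.
PACKET = struct.Struct('!4s30sf')

def pack_fixed(packet_type, sensor_name, value):
    t = packet_type.encode('utf-8')
    n = sensor_name.encode('utf-8')
    if len(t) > 4 or len(n) > 30:
        raise ValueError('field too long for the fixed-size layout')
    # The 's' format pads short values with null bytes.
    return PACKET.pack(t, n, value)

def unpack_fixed(data):
    t, n, value = PACKET.unpack(data)
    # Strip the null padding before decoding back to str.
    return (t.rstrip(b'\0').decode('utf-8'),
            n.rstrip(b'\0').decode('utf-8'),
            value)
```

    Since the size is known up front (PACKET.size), the receiver can keep calling recv() until it has collected exactly that many bytes.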


    Also, it's still not clear why you'd ever want handshake data to be stored in a structure like this in the first place. Why not just pass it around as a tuple (or namedtuple or normal class) with two strings and a float, and serialize/deserialize right as it goes out to or comes in from the network?

    You mentioned in a comment that you need them to be mutable, but that doesn't explain it; it makes even less sense that you'd want mutable handshake data. And besides, you can trivially just make a new tuple with different strings instead of having string-like members that you can mutate in place.
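    Putting the last two points together, here is a minimal sketch of the namedtuple approach with newline-delimited JSON as the wire format. The names AppPacket, to_wire and from_wire are my own, chosen for illustration.

```python
import json
from collections import namedtuple

AppPacket = namedtuple('AppPacket', ['packet_type', 'sensor_name', 'value'])

def to_wire(packet):
    # json.dumps escapes any newline inside a string, so a bare '\n'
    # can safely mark the end of each message.
    return (json.dumps(packet._asdict()) + '\n').encode('utf-8')

def from_wire(line):
    return AppPacket(**json.loads(line.decode('utf-8')))

syn = AppPacket('syn', 'temperature', 100.2)

# "Mutating" a field just means building a new tuple from the old one.
ack = syn._replace(packet_type='ack')
```

    On the receiving side, one option is to wrap the socket with clientSocket.makefile('rb') and call readline() on it, which returns one complete message at a time, so the recv() size guessing goes away.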