cmallocmemcpy

Malloc and memcpy struct plus array


I am trying to find a better title so I apologize if its confusing.

I have a binary message that I am reading over a serial port. The message contains a constant 8 byte header, and a dynamic message length (depends on the message type which is defined in the 8 byte header).

I am trying to create a function that returns this message with the array allocated using malloc.

I changed some of the variable/member names just to make it clearer Example:

#include <stdlib.h>
#include <stdio.h>

typedef struct MyMessageHeader {
  uint8_t byte1, byte2, byte3, byte4, byte5, byte6, byte7, 
} MyMessageHeader;

// I could have up to 100 different message types, but they all contain 
// the same header, so trying to make a message type with a header and array pointer
typedef struct MyMessage {
  MyMessageHeader header;
  uint8_t* data;
} MyMessage;

// Function to copy a raw byte array that is read from the serial port into
// a MyMessage object
MyMessage* FunctionThatIsNotWorking(uint8_t* rawBytes) {
  // Somehow we have determined in rawBytes that this message contains 12 bytes of data
  // plus the 8 bytes which is the header
  MyMessage* ret_msg = malloc(20);
  // This is where things go wrong and I get segmentation faults. Likely due to
  // my misunderstanding of malloc.
  // Copy the rawBytes into MyMessage. Assuming the first 8 bytes of rawBytes goes
  // into MyMessageHeader, and the last 12 bytes go into data
  memcpy(ret_msg, rawBytes, 20);
}

int main() {
uint8_t raw_bytes[20] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20};
MyMessage* my_msg = FunctionThatIsNotWorking(raw_bytes);
// Expecting now but this doesnt work
my_msg->header.byte1 == 1;
my_msg->header.byte2 == 2;
my_msg->header.byte3 == 3;
// ...
// first byte of data should be the 9th byte in raw_bytes 
my_msg->data[0] == 9;
my_msg->data[1] == 10;
}

What am I missing here, or what is my mis-understanding?


Solution

  • I suspect what's byteing you (pun intended) is structure padding

    SUGGESTION:

    1. Read the data directly from your device into a local buffer. Make the buffer at least as long as the largest anticipated message.

    2. Read the header

    3. Perform your "malloc", and finally

    4. Unpack the data from your buffer into you


    OK: Here's what I'd suggest:

    /*
     * SAMPLE OUTPUT:
     * sizeof(*my_msg)= 16
     * header: 01 02 03 04 05 06 07 08
     * data: 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14
     */
    #include <stdio.h>
    #include <stdint.h>   // unit8_t
    #include <stdlib.h>   // malloc()
    #include <string.h>   // memcpy ()
    
    typedef struct MyMessage {
      uint8_t header[8];  // 8 bytes (you said) for header; message length unknown.
      uint8_t* data;      // Here, you're only allocating space for a pointer (e.g. 4 bytes)
    } MyMessage_t;
    
    MyMessage_t* FunctionThatIsNotWorking(uint8_t* rawBytes) {
      // Let's assume rawBytes contains header + data
      // Parse the header, determine message type, and determine message length
      // ... TBD ...
    
      // Allocate your struct
      MyMessage_t* ret_msg = malloc(sizeof (struct MyMessage));
    
      // Now allocate space for your *data* (let's assume this particular message has 12 bytes)
      ret_msg->data = malloc(12);
    
      // Copy your header (from rawBytes)
      memcpy(&ret_msg->header, rawBytes, 8);
    
      // Copy the data (starting on the ninth byte; assume the data is contiguous)
      memcpy(ret_msg->data, &rawBytes[8], 12);
    
      // Return the completed record
      return ret_msg;
    }
    
    int main() {
      int i;
      // Create some dummy data
      uint8_t raw_bytes[20] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20};
    
      MyMessage_t* my_msg = FunctionThatIsNotWorking(raw_bytes);
    
      printf ("sizeof(*my_msg)= %ld\n", sizeof(*my_msg));
      printf ("header: ");
      for (i=0; i<sizeof(my_msg->header); i++) {
        printf("%02x ", my_msg->header[i]);
      }
      printf ("\ndata: ");
      for (i=0; i<12; i++) {
        printf("%02x ", my_msg->data[i]);
      }
      printf ("\n");
    
      return 0;
    }
    

    The point I was trying to make is that the byte layout the compiler gives your struct is NOT necessarily what you might expect. There's often "extra spacing" in fields, to ensure alignment.

    If you're using structs on the sending end (when constructing your message), the data layout in the receiver won't necessarily match the layout in the sender.

    Consequently, you usually need to explicitly "pack" and "unpack" messages you send back and forth between different hosts/different platforms.

    I think the point was emphasizing is that you never allocated space for your data. This is true :)

    Finally, in most cases, the length of the message will be unknown until you actually read the header. My example handles this case, too.

    In case you only want to pass the data back to the caller (my example copies both header and data), you'll probably want to add a "message length" field in your struct.

    Always try to use the "sizeof" operator, instead of hard-coding "magic numbers". The next best thing is to declare constant (e.g. #define DATA_LENGTH 12).


    One more example.

    Let's say you know your message data is always going to be exactly 12 bytes. and let's still assume you want the header field to be 8 bytes. You can do this:

    ...
    typedef struct MyMessage {
      uint8_t header[8]; // 8 bytes (you said) for header
      uint8_t data[12];  // This allocates 12 bytes
    } MyMessage_t;
    ...
    MyMessage_t* FunctionThatIsNotWorking(uint8_t* rawBytes) {
      // Let's assume rawBytes contains header + data
      // Parse the header, determine message type, and determine message length
      // ... TBD ...
    
      // Allocate your struct
      MyMessage_t* ret_msg = malloc(sizeof (*ret_msg));
    
      // Copy directly from rawBytes
      memcpy(ret_msg, rawBytes, sizeof (*ret_msg));
    
      // Return the completed record
      return ret_msg;
    }