I need a method in C that is able to process a data packet that can be of various formats/types. What would be your recommended approach to do so (note that I want to be MISRA compliant as much as possible). Currently, I am considering the following options (but open to learn additional ones), as per the code oultined below
Option 1: the method explicitly states the format of the data packet (and there is a switch case inside the method) Option 2: the data includes the data type and a union with the content
Which of these options would you prefer?
Many thanks for your insights
int process_type1(const struct type1* data) {
/* do stuff with data */
}
int process_type2(const struct type2* data) {
/* do stuff with data */
}
/* Option 1: explicit data type in method to process data */
int process_1(const enum data_type data_type, const uint8_t* data, const size_t* n_bytes) {
int res = -1;
switch (data_type) {
case TYPE_1: res = process_type1((const struct type1*)data); break;
case TYPE_2: res = process_type2((const struct type2*)data); break;
default:
break;
}
return res;
}
/* Option 2: implicit data type in method to process data and union to include the data */
struct packet {
enum data_type {
TYPE_1,
TYPE_2,
/* Add more types as needed */
} data_type;
union {
struct type1 type1;
struct type2 type2;
/* Add more types as needed */
} data;
};
int process_2(const struct packet* packet) {
int res = -1;
switch (packet->data_type) {
case TYPE_1:
res = process_type1(&packet->data.type1);
break;
case TYPE_2:
res = process_type2(&packet->data.type2);
break;
default:
break;
}
return res;
}
Overall, structs and unions are problematic for portable modelling of raw byte streams, such as data protocols:
union
type punning between different representations of the same data, which wouldn't be the case here.As for (const struct type1*)data
followed by a de-refencing of type1
members, it is a strict aliasing violation and undefined behavior, regardless of MISRA. It doesn't matter if uint8_t
is a character type because the character type exception to the strict aliasing rules is only valid when going from a larger type to a character type, not the other way around. (Unless you use another trick which I will show further down.)
The only truly portable, well-defined, MISRA-compliant way is to write serialization/deserialization routines that copy the data member by member. Which you will have to do anyway in case facing a different network endianness compared to your CPU endianness. The down side to this is that you have to copy the data.
Otherwise, this is a possible work-around:
Ditch the struct type1
style so that we may change type1
to a union type without changing our whole code. Are there technical reasons to pick one struct coding style over the other?
Make type1
a union like this:
typedef union
{
struct
{
// specific members
};
uint8_t raw_data [expected_size];
} type1;
This assuming uint8_t
is a character type (pretty much dead certain in practice). After determining the type of the packet, have the function take a uint8_t*
parameter. We may now cast that parameter into a type1*
because we now have "a union type with a member compatible with the effective type of the object" (uint8_t
). That's another exception to the strict aliasing rule.
Furthermore we no longer type pun between two unrelated structs, but rather between the incoming raw data and one of the possible struct types.
The anonymous struct style above is optional, but it allows us to type type.member
rather than type.some_struct.member
. MISRA C does not object against anonymous struct/union.
We may now cook up a generic API like:
typedef enum {
TYPE_1,
TYPE_2,
/* ... */
TYPE_N
} data_t;
typedef result_t process_func_t (const uint8_t* raw_data);
result_t process_type1 (const uint8_t* raw_data); // type-specific protocol parser
...
process_func_t* const process[] = // branch table
{
[TYPE1] = process_type1,
[TYPE2] = process_type2,
...
};
// ensure integrity between enum and branch table:
_Static_assert(sizeof(process)/sizeof(*process) == TYPE_N,
"process look-up table inconsistent");
Usage:
data_t data_type;
...
if(data_type < TYPE_1 || data_type >= TYPE_N)
{
/* defensive programming, error handling here */
}
// assuming raw_data is aligned as per the requirements of the various struct types, then:
process[data_type](raw_data);
This is pretty standard code for safety-related software. And as far as I can tell, MISRA compliant, apart from the cast from uint8_t*
to type1*
etc that we put inside process_type1
. A MISRA deviation from rule 11.3 might be necessary. We don't want a permanent deviation against that rule, because it's a very sound one.