Tags: c, pointers, data-structures, memory-management, static-memory-allocation

Contiguous type-agnostic pointer-data-structure implementation in C?


I am implementing search trees and hash tables in pure C (C89). Although the asymptotic time and space complexity is good, the constant factors and cache efficiency… not so much.

The keys and values are void* and the data structures are close to optimal (in the average case).

Is there a better way of:

  1. typing data (is void* a bad choice? alignment, etc.);
  2. making the data structure contiguous (arrays are really cache-efficient; my data structures are not. Is this where I should be using arenas? I often know how much space I need up front [what I would pass to ::reserve in C++, or put inside the [] of a VLA in C], so that should simplify things).

Explicitly, I am looking for code that enables the creation of contiguous, pointer-based data structures over built-in or user-defined types, without resorting to plain arrays or losing dynamism.

EDIT: As requested by a commenter, here is my current tree https://github.com/SamuelMarks/general-balanced-tree-c, usable like so:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GBT_KEY_TYPE
typedef const char *gbt_ky_type;

#define GBT_DATA_TYPE
typedef const char *gbt_data_type;

#include <general_balanced_tree_c.h>

void str_key_print(const gbt_ky_type key) { puts(key); }

void str_assign(gbt_ky_type *dst, const gbt_ky_type src) {
  if (*dst)
    free((void *)*dst);
  *dst = strdup(src);
}

int str_key_less(const gbt_ky_type a, const gbt_ky_type b) {
  return strcmp(a, b) < 0;
}

int str_key_equal(const gbt_ky_type a, const gbt_ky_type b) {
  return strcmp(a, b) == 0;
}

void str_key_destroy(gbt_ky_type key) { free((void *)key); }

With entrypoint:

int main(void) {
  /* const struct gbt_dict * dict = gbt_construct_dict();
     // this default dict is int->int ^
  */

  struct gbt_dict *const dict = gbt_construct_dict_full(
      /*gbt_ky_assign_func*/ str_assign,
      /*gbt_ky_less_func*/ str_key_less,
      /*gbt_ky_equal_func*/ str_key_equal,
      /*gbt_assign_func*/ str_assign,
      /*gbt_key_destroy_func*/ str_key_destroy,
      /*gbt_key_print_func*/ str_key_print);

  /* insert key `"foo"` with value `"bar"` */
  gbt_insert(dict, "foo", "bar");
  {
    struct gbt_node *const needle =
        gbt_lookup(dict, "foo"); /* find value of key "foo" */
    assert(needle != NULL);
    assert(strcmp(gbt_keyval(dict, needle), "bar") == 0);
  }
  gbt_delete(dict, "foo"); /* delete key `"foo"` */

  return EXIT_SUCCESS;
}

It's not obvious to me how users should best specify key and value types, nor how to allocate ahead of time in a cache-efficient manner (e.g., when I know at constructor time how many key/value pairs are going to be inserted).


Solution

  • Basically there are 3 different ways to do type-generic programming in C.

    1. typedef a single data type per ADT and then change that typedef depending on project needs. This means the code is easily portable: just change the typedef. The disadvantage is that only one variant of the ADT, for one data type, can exist in the code base at a time, so it doesn't scale well. (A quick sketch follows after this list.)

    2. Use void* pointing at data allocated elsewhere (caller-side or through malloc) and keep a separate enum to track which type the void* points to. This is flexible and scales well, but it is type-unsafe and may also lead to needlessly slow or complex code.

    3. Use _Generic/typeof (introduced to C in C11 and C23 respectively) to perform different actions or call different functions depending on the type used. This is both efficient and type-safe, but like the typedef version it requires some manual editing of the supported types. Unlike that version, though, you can support several types at once, as long as all types to be supported are known in advance.
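
    As a quick illustration of approach 1) from item 1 above (a minimal sketch with names of my own, not from any particular library): the element type is a single project-wide typedef that the ADT code uses everywhere:

      #include <stdio.h>

      /* Change this one typedef to retarget the whole ADT to another type. */
      typedef int value_type;

      #define STACK_MAX 16

      struct stack {
        value_type items[STACK_MAX];
        size_t count;
      };

      static void stack_push(struct stack *s, value_type v)
      {
        s->items[s->count++] = v; /* no bounds check; just a sketch */
      }

      static value_type stack_pop(struct stack *s)
      {
        return s->items[--s->count];
      }

      int main(void)
      {
        struct stack s = { {0}, 0 };
        stack_push(&s, 42);
        /* note the drawback: if value_type changes, this format string must too */
        printf("%d\n", stack_pop(&s));
        return 0;
      }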

    The former two typically rely heavily on callback functions to achieve type-specific behavior. See the standard C bsearch/qsort, which are examples of 2), using void pointers and callbacks.
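
    For instance, this is roughly how approach 2) looks with qsort (a standard function; only the comparator and the data are mine): the library only ever sees void pointers, and the callback restores the real type:

      #include <stdio.h>
      #include <stdlib.h>

      /* The callback gets type-erased pointers; only the caller knows the real type. */
      static int cmp_int(const void *a, const void *b)
      {
        const int lhs = *(const int *)a;
        const int rhs = *(const int *)b;
        return (lhs > rhs) - (lhs < rhs); /* avoids overflow, unlike lhs - rhs */
      }

      int main(void)
      {
        int values[] = { 42, 7, 13, 99, 1 };
        size_t i;

        qsort(values, sizeof values / sizeof values[0], sizeof values[0], cmp_int);

        for (i = 0; i < sizeof values / sizeof values[0]; i++)
          printf("%d ", values[i]);
        putchar('\n');

        return 0;
      }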


    Also, please keep in mind the reasons why we can't simply take an unsigned char memory pool and cast chunks of it to a specific type: that can run into both alignment and strict aliasing problems, two forms of undefined behavior.

    We could implement something like an unsigned char memory pool by using memcpy and making hard copies all the time, but that has all the disadvantages of 1) and 2) plus additional execution overhead. The only advantage of that model is that it is the most portable one. It is, for example, the only way to make structs truly portable during serialization/deserialization, by performing an individual memcpy of each member.
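
    A tiny sketch of that memcpy model (the pool layout and names are made up for this example): bytes in the pool are never dereferenced as a struct, only copied in and out, which sidesteps both the alignment and the aliasing problems at the cost of copies:

      #include <stdio.h>
      #include <string.h>

      struct point { int x; int y; };

      /* Copy a struct into the byte pool at an arbitrary (even misaligned) offset. */
      static void pool_store(unsigned char *pool, size_t offset, const struct point *p)
      {
        memcpy(pool + offset, p, sizeof *p);
      }

      /* Copy it back out; no pointer into the pool is ever cast to struct point*. */
      static struct point pool_load(const unsigned char *pool, size_t offset)
      {
        struct point p;
        memcpy(&p, pool + offset, sizeof p);
        return p;
      }

      int main(void)
      {
        unsigned char pool[64];
        struct point in;
        struct point out;

        in.x = 3;
        in.y = 4;
        pool_store(pool, 5, &in);        /* offset 5 would be misaligned for a cast */
        out = pool_load(pool, 5);
        printf("%d %d\n", out.x, out.y); /* prints: 3 4 */

        return 0;
      }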


    It sounds like you are after rewriting your code from 1) to 3). One way to do that while maintaining a list of supported types is, as you mentioned, to use X-macros.

    Some code examples of that can be found in my off-site post here: What are X macros and when to use them? The example at the bottom of that post shows how you can make a list of supported types (which may or may not include further type-specific stuff if needed), then use _Generic to generate the code, similar to C++ templates.

    All the advantages and disadvantages of doing this with X-macros are mentioned in that post too. You get very efficient, very maintainable code that scales well, but it will not be very readable and may be hard to troubleshoot.
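
    To give a flavor of that combination (a minimal C11 sketch with my own macro names, not copied from the linked post), one X-macro lists the supported types and is expanded once to generate per-type functions and once inside _Generic to dispatch on the argument type:

      #include <stdio.h>

      /* Single list of supported types; extend it in one place only. */
      #define SUPPORTED_TYPES(X) \
        X(int,    "%d")          \
        X(double, "%f")

      /* Generate one print function per listed type. */
      #define DEFINE_PRINT(type, fmt)        \
        static void print_##type(type value) \
        {                                    \
          printf(fmt "\n", value);           \
        }
      SUPPORTED_TYPES(DEFINE_PRINT)
      #undef DEFINE_PRINT

      /* Dispatch at compile time, similar to a C++ template; the default
         association merely absorbs the trailing comma from the expansion. */
      #define GENERIC_ASSOC(type, fmt) type: print_##type,
      #define print_value(x) \
        _Generic((x), SUPPORTED_TYPES(GENERIC_ASSOC) default: print_int)(x)

      int main(void)
      {
        print_value(42);   /* calls print_int    */
        print_value(3.14); /* calls print_double */
        return 0;
      }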


    Regarding the caching concerns, there are miscellaneous tips and tricks.

    You could perhaps consider placing the data at the end of the struct as a flexible array member and then allocating it all in one go. That is (pseudo-code for something assumed to be wrapped in macros that call the appropriate constructor):

      #define MYSTRUCT_TYPEDEF(type) \
      typedef struct                 \
      {                              \
        /* all the stuff needed internally here */\
                                     \
        size_t data_size;            \
        type data[];                 \
      } my_struct_##type;

      ...

      #define MYSTRUCT_CONSTRUCTOR(type)                                    \
      my_struct_##type* mystruct_construct_##type (size_t size,             \
                                            const type init_list[size])     \
      {                                                                     \
        my_struct_##type* ms = malloc(sizeof(*ms) + size * sizeof(type));   \
        /* ... */                                                           \
        ms->data_size = size;                                               \
        memcpy(ms->data, init_list, size * sizeof(type));                   \
                                                                            \
        return ms;                                                          \
      }
    

    This keeps the metadata in the same allocation as the actual data, so both end up in the same or neighboring cache lines.
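
    For illustration, usage of the above might look something like this (hedged: it assumes the two macros from the sketch compile as-is, plus the usual headers):

      #include <stdlib.h>  /* malloc, free */
      #include <string.h>  /* memcpy */

      /* ... MYSTRUCT_TYPEDEF and MYSTRUCT_CONSTRUCTOR as defined above ... */

      MYSTRUCT_TYPEDEF(int)      /* emits the my_struct_int typedef      */
      MYSTRUCT_CONSTRUCTOR(int)  /* emits mystruct_construct_int()       */

      int main(void)
      {
        const int init[] = { 1, 2, 3, 4 };
        my_struct_int *ms = mystruct_construct_int(4, init);

        /* the metadata and the 4 ints now live in one contiguous allocation */
        free(ms);
        return EXIT_SUCCESS;
      }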

    However, if you are allocating a tree ADT as in the linked GitHub project, then that in itself will give subpar performance if each node is allocated individually. Allocating individual nodes of a tree/linked list with malloc() is simply bad practice and has been since somewhere in the 1980s. The main reason is indeed the advent of data caches. (Other reasons it is bad include malloc execution-time overhead, heap fragmentation, etc.)

    For such designs, consider pre-allocating an array of nodes. If there is a known maximum size, you can use a statically sized array; otherwise, use malloc to pre-allocate a "large enough for now" array. Then, upon creating a new node, grab one from the pre-allocated pool. In the malloc version, only when the pre-allocated pool is full do you realloc it, usually to 2x the size.

    For inspiration you can search the web for implementations of linked lists using static sized arrays instead of allocating individual nodes. Another strength of such solutions is that you don't need to make the variables next/left/right etc pointers but can instead use plain integer indices. Instead of NULL you'd use another sentinel value like ~0u. They are real easy to debug compared to traditional linked-lists/BSTs too.