I need to write a custom malloc for GPU programming. Will this work correctly?
void* malloc(int size, int* bytesUsed, uchar* memory){
int startIdx = (*bytesUsed);
(*bytesUsed) += size;
return (void*)(memory+startIdx);
}
I'm new to C programming, so I might have made pointer-arithmetic errors or something. The idea is that bytesUsed gives you the index into memory of the first free address, so you increment it by size and then return the incremented index as a pointer.
[Edit 2023] sizeof(max_align_t) corrected to alignof(max_align_t).
There are some issues:
The largest problem is alignment. The returned pointer needs to be suitably aligned. Since this malloc() is not given the pointer type needed, use max_align_t, "which is an object type whose alignment is as great as is supported by the implementation in all contexts" (C11dr §7.19 2). Note: *bytesUsed needs this alignment too, so apply similar rounding if other code modifies it.
if (size%alignof(max_align_t)) {
size += alignof(max_align_t) - size%alignof(max_align_t);
}
// or
size = (size + alignof(max_align_t) - 1)/alignof(max_align_t)*alignof(max_align_t);
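The rounding above can wrap around when size is close to SIZE_MAX. A minimal sketch of the same round-up with that case detected, assuming a hosted C11 compiler that provides max_align_t and alignof (a GPU dialect may not); the helper name rg_round_up is hypothetical and not part of the original code:

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round size up to the next multiple of
   alignof(max_align_t). Stores the result in *rounded and returns true,
   or returns false if the rounded value would not fit in a size_t. */
static bool rg_round_up(size_t size, size_t *rounded) {
    const size_t a = alignof(max_align_t);
    assert(a != 0 && (a & (a - 1)) == 0); /* alignments are powers of two */

    size_t rem = size % a;
    size_t pad = (rem == 0) ? 0 : a - rem; /* bytes needed to reach a multiple */
    if (size > SIZE_MAX - pad) {
        return false; /* size + pad would wrap around */
    }
    *rounded = size + pad;
    return true;
}
```

With this, a request of 1 byte rounds up to alignof(max_align_t), and a request that is already a multiple is left unchanged.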
No detection for out-of-memory.
Avoid reusing standard library names. Code can map them in later with a macro, if needed.
// void* malloc(int size, int* bytesUsed, uchar* memory);
void* RG_malloc(int size, int* bytesUsed, uchar* memory);
// if needed
#define malloc RG_malloc
malloc() expects a different type for allocations: size_t, not int.
// void* malloc(int size, int* bytesUsed, uchar* memory);
void* malloc(size_t size, size_t* bytesUsed, uchar* memory);
Cast is not needed.
// return (void*)(memory+startIdx);
return memory + startIdx;
It is clearer to use unsigned char than uchar, which is hopefully not defined as something else.
Putting this all together
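void* RG_malloc(size_t size, size_t* bytesUsed, unsigned char* memory){
  // Space left in the pool; assumes *bytesUsed <= RG_ALLOC_SIZE.
  size_t remaining = RG_ALLOC_SIZE - *bytesUsed;
  if (size > remaining) {
    return NULL; // Out of memory. Checked before rounding so the round-up cannot wrap.
  }
  // Round size up to a multiple of the strictest alignment.
  size = (size + alignof(max_align_t) - 1)/alignof(max_align_t)*alignof(max_align_t);
  if (size > remaining) {
    return NULL; // The padding does not fit.
  }
  size_t startIdx = *bytesUsed; // See note above concerning alignment.
  *bytesUsed += size;
  return memory + startIdx;
}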
Additionally, RG_free() is not coded. If that is needed, this simple allocation scheme would need significant additions.
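For illustration, here is a self-contained sketch of how the pieces might be wired together on a host C11 compiler. The pool size of 1024, the names rg_pool and rg_used, and RG_reset() are assumptions made for this example, not part of the original code. Note that the pool itself is declared with alignas(max_align_t), since aligned offsets only help if the base address is aligned too. RG_reset() is not a real free(): it releases every allocation at once, which is the only kind of release a bump allocator like this supports without extra bookkeeping.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define RG_ALLOC_SIZE 1024u /* assumed pool size, a multiple of the alignment */

/* Assumed backing storage; the base must be aligned as well as the offsets. */
static alignas(max_align_t) unsigned char rg_pool[RG_ALLOC_SIZE];
static size_t rg_used = 0;

static void *RG_malloc(size_t size, size_t *bytesUsed, unsigned char *memory) {
    const size_t a = alignof(max_align_t);
    /* Precondition: the offset is in range and already aligned. */
    assert(*bytesUsed <= RG_ALLOC_SIZE && *bytesUsed % a == 0);

    size_t remaining = RG_ALLOC_SIZE - *bytesUsed;
    if (size > remaining) {
        return NULL; /* request alone does not fit */
    }
    size_t pad = (a - size % a) % a; /* bytes needed to keep the next offset aligned */
    if (pad > remaining - size) {
        return NULL; /* request fits, but its padding does not */
    }
    size_t startIdx = *bytesUsed;
    *bytesUsed += size + pad;
    return memory + startIdx;
}

/* Hypothetical helper: release every allocation at once. */
static void RG_reset(size_t *bytesUsed) {
    *bytesUsed = 0;
}
```

A caller would write something like `float *v = RG_malloc(16 * sizeof *v, &rg_used, rg_pool);` and check the result against NULL before using it.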