In plain C, pointers to void are useful as arguments to generic functions, such as a generic quicksort, or a generic swap, etc., as seen here:
Implementation of generic quicksort, etc.
However, as the example of the generic partition function at that link shows:
size_t generic_partition(void *v, size_t nelems, size_t size, compare_function comp)
{
// for brevity we'll be using the right-most 'element' for the
// location of the pivot storage. a better choice would use
// a median-of-three or median-of-five for pivot selection,
// then swap it to that location before the partitioning loop.
void *ploc = (unsigned char*)v + (nelems-1) * size;
size_t pvt = 0;
for (size_t i=0; i<nelems; ++i)
{
if (comp((unsigned char*)v + (i*size), ploc) < 0)
generic_swap((unsigned char*)v + (pvt++*size), (unsigned char*)v + (i*size), size);
}
generic_swap((unsigned char*)v + (pvt*size), ploc, size);
return pvt;
}
it can be quite heavy on the pointer arithmetic and type casting, inside a loop. The reason seems to be that we must frequently cast between void* and char*, because char* supports pointer arithmetic, while void* does not (because the element size is unknown).
So why don't we use char* instead of void*, to implement generic arguments to generic functions ? It seems to have the advantages of void* (generality, since doesn't every type occupy some integer multiple of one byte ?) with the additional advantage of supporting pointer arithmetic.
Using pointers to char only, without pointers to void, wouldn't the loop in the above code require fewer additions, no multiplications, and no type casts ? If so, then wouldn't that be more efficient ?
Perhaps I am missing something to do with functions like "comp" automatically converting the arguments passed to them into the expected type ?
Perhaps I am missing the point generally.
EDIT (To include code for generic_swap):
void generic_swap(void *a, void *b, size_t size)
{
if (a == b)
return;
unsigned char *p1 = a, *p2 = b;
while (size-- > 0)
{
unsigned char c = *p1;
*p1++ = *p2;
*p2++ = c;
}
}
EDIT - This is what I meant by avoiding multiplications (for what it's worth). I've written it as a modification of dbush's nice answer:
size_t generic_partition(void *v, size_t nelems, size_t size, compare_function comp)
{
// for brevity we'll be using the right-most 'element' for the
// location of the pivot storage. a better choice would use
// a median-of-three or median-of-five for pivot selection,
// then swap it to that location before the partitioning loop.
unsigned char *pboundary = v;
unsigned char *ploc = pboundary + (nelems-1) * size;
size_t pvt = 0;
unsigned char *pcurrent = pboundary + size;
while (pcurrent < ploc)
{
if (comp(pcurrent, ploc) < 0)
{
generic_swap(pboundary, pcurrent, size);
pboundary = pboundary + size;
pvt++;
}
pcurrent = pcurrent + size;
}
generic_swap(pboundary, ploc, size);
return pvt;
}
EDIT: Yet more clarification re. arithmetic:
In case the comparison pcurrent < ploc in the while condition is not defined, maybe it could be replaced by pcurrent != ploc.
Also, I know that there are edge cases that may fail, for example if the array has only one element, but I assume that I would be careful about those in real life, and here my question is more about void*, char* and pointer arithmetic.
EDIT: (The fourth !) I'd happily revert the "while" loop to a "for" loop as in the original, at the cost of an extra variable (the iterator) in return for avoiding possibly tricky behaviour when comparing the pointers via < or via !=.
EDIT: (The fifth) Hold up, wait a minute - something ain't right ! I've just gone back to the book "The C programming language" by Kernighan and Ritchie, where I've found (a variant of) this generic partitioning code included in a generic implementation of quicksort (which is what I am actually trying to implement). To my surprise, although it seems to be perfectly generic, yet it seems to make no use of pointer arithmetic, and no use of casting to chars or unsigned chars ! Am I missing something, or have the many authors of courses and websites that I've consulted been greatly overcomplicating things ? Here is the code from Kernighan and Ritchie's book:
void qsort(void *v[], int left, int right, int (*comp)(void *, void *))
{
int i, last;
void swap(void *v[], int, int);
if (left >= right)
return;
swap(v, left, (left + right)/2);
last = left;
for (i = left + 1; i <= right; i++)
if ((*comp)(v[i], v[left]) < 0)
swap(v, ++last, i);
swap(v, left, last);
qsort(v, left, last-1, comp);
qsort(v, last+1, right, comp);
}
EDIT (The sixth !!!) My apologies all, I am evidently even more of a noob than I had realised: My source reliably informs me that this Kernighan-Ritchie "generic" quicksort operates on an array of pointers always, and that the generic-ness must therefore be aided by the programmer making sure to feed it an array of pointers into the actual array they are trying to sort. I can see that the first code I posted, with its casting to char* etc., achieves generic-ness in a different way.
Conversions to and from void *
can occur without a cast and are guaranteed to work for any object type.
You can actually eliminate the casting by assigning the void *
argument to a unsigned char *
once, then using that in the rest of the function:
size_t generic_partition(void *v, size_t nelems, size_t size, compare_function comp)
{
// for brevity we'll be using the right-most 'element' for the
// location of the pivot storage. a better choice would use
// a median-of-three or median-of-five for pivot selection,
// then swap it to that location before the partitioning loop.
unsigned char *pstart = v;
unsigned char *ploc = pstart + (nelems-1) * size;
size_t pvt = 0;
for (size_t i=0; i<nelems; ++i)
{
if (comp(pstart + (i*size), ploc) < 0)
generic_swap(pstart + (pvt++*size), pstart + (i*size), size);
}
generic_swap(pstart + (pvt*size), ploc, size);
return pvt;
}
Note that you still have to multiply by the element size when constructing the pointers.