I read this code, which seems to use the lower 2 bits of addresses to store a binary flag; see below (obfuscated due to sensitivity):
void *setFlag4Addr( void *addr, BOOLEAN flag )
{
    // long for 32-bit machine.
    long long ret;
    if ( flag ) {
        ret = (long long)addr | 0x2;
    } else {
        ret = (long long)addr | 0x1;
    }
    return (void *)ret;
}
I wonder what the justification behind it is. Does it mean that every address has 0b00 as its 2 least significant bits?
Is it because of alignment? For instance, IIRC, an address obtained from, say, malloc has an alignment of 16 bytes on 64-bit machines. But what if alignas is used somewhere?
This function stores the value 1 or 2 in the least significant bits of the address passed as an argument and returns the updated address.
This method is used to implement tagged pointers. It relies on the following assumptions:
Pointers must be aligned on a known power of 2, in this case at least 4. This assumption is fine if said pointers point to objects with this alignment, which is indeed the case for pointers returned by malloc and friends. The pointers returned by malloc are guaranteed to be suitably aligned for any object with a fundamental alignment, which is usually 16 on 64-bit machines and at least 8 on all standard architectures.
Modified pointers must not be used directly to access the objects. The tag must be removed first with an appropriate masking operation.
The above code assumes that pointers are not larger than long long, which is the case on all modern systems, but it would be safer to use an unsigned type for this operation, e.g. uintptr_t if available.
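The first two assumptions can be checked with a few lines of C. This is only a sketch: the names TAG_MASK, low_bits_are_free and tag_roundtrip_demo are made up for illustration and do not come from the code in the question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TAG_MASK ((uintptr_t)3)   /* hypothetical name: the two low bits */

/* Returns 1 if the two low bits of p are clear, i.e. free to hold a tag. */
int low_bits_are_free(const void *p) {
    return ((uintptr_t)p & TAG_MASK) == 0;
}

/* Tags a freshly allocated int, strips the tag again and reads through
   the clean pointer. Returns the value read, or -1 if malloc fails. */
int tag_roundtrip_demo(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = 42;
    assert(low_bits_are_free(p));          /* alignment assumption */

    uintptr_t tagged = (uintptr_t)p | 2;   /* store tag 2 in the low bits */
    /* (int *)tagged must not be dereferenced: it no longer points
       at the start of the object. */
    int *clean = (int *)(tagged & ~TAG_MASK);   /* mask the tag off */
    int value = *clean;
    free(clean);
    return value;
}
```

A pointer into the middle of an object, for example one byte past the start of a 4-byte aligned array, fails the low_bits_are_free check, so it cannot be tagged this way.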
The code posted is not optimal as it performs a test. A simpler and more general solution would take the tag as an argument and use a single expression.
Here is a simple example:
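#include <stdint.h>
// Tags occupy the 2 low bits, so pointers must be aligned on at least 4 bytes.
typedef enum {
    ptr_NONE,
    ptr_INT,
    ptr_STR,
    ptr_OBJ,
    ptr_MASK = 3,
} ptr_TAG;
// Add a tag to an untagged pointer.
void *pointer_set_tag(void *addr, ptr_TAG tag) {
    return (void *)((uintptr_t)addr + tag);
}
// Extract the tag stored in the low bits.
ptr_TAG pointer_get_tag(void *addr) {
    return (ptr_TAG)((uintptr_t)addr & ptr_MASK);
}
// Remove a tag whose value is already known.
void *pointer_remove_tag(void *addr, ptr_TAG tag) {
    return (void *)((uintptr_t)addr - tag);
}
// Remove whatever tag is present by masking off the low bits.
void *pointer_untag(void *addr) {
    return (void *)((uintptr_t)addr & ~(uintptr_t)ptr_MASK);
}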
To help distinguish tagged pointers from actual object pointers, it is easy to add an incomplete type:
#include <stdint.h>
typedef enum {
    ptr_NONE,
    ptr_INT,
    ptr_STR,
    ptr_OBJ,
    ptr_MASK = 3,
} ptr_TAG;
typedef struct tag_stuff *tagged_ptr;
tagged_ptr pointer_set_tag(void *addr, ptr_TAG tag) {
    return (tagged_ptr)((uintptr_t)addr + tag);
}
ptr_TAG pointer_get_tag(tagged_ptr addr) {
    return (ptr_TAG)((uintptr_t)addr & ptr_MASK);
}
void *pointer_remove_tag(tagged_ptr addr, ptr_TAG tag) {
    return (void *)((uintptr_t)addr - tag);
}
void *pointer_untag(tagged_ptr addr) {
    return (void *)((uintptr_t)addr & ~(uintptr_t)ptr_MASK);
}
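As a usage sketch, the tag can drive a dispatch on the kind of object behind the pointer. The helpers describe and aligned_strdup below are hypothetical, as is the assert in pointer_set_tag, which only documents the alignment assumption. The other definitions repeat the tagged_ptr ones so that the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { ptr_NONE, ptr_INT, ptr_STR, ptr_OBJ, ptr_MASK = 3 } ptr_TAG;
typedef struct tag_stuff *tagged_ptr;

tagged_ptr pointer_set_tag(void *addr, ptr_TAG tag) {
    /* Added check (not in the code above): the low bits must be clear. */
    assert(((uintptr_t)addr & ptr_MASK) == 0);
    return (tagged_ptr)((uintptr_t)addr + tag);
}
ptr_TAG pointer_get_tag(tagged_ptr addr) {
    return (ptr_TAG)((uintptr_t)addr & ptr_MASK);
}
void *pointer_untag(tagged_ptr addr) {
    return (void *)((uintptr_t)addr & ~(uintptr_t)ptr_MASK);
}

/* Hypothetical helper: copy a string into malloc'ed storage, whose
   alignment guarantees that the two low bits of the address are clear. */
char *aligned_strdup(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: format the tagged value into buf according to its tag. */
int describe(tagged_ptr p, char *buf, size_t size) {
    void *obj = pointer_untag(p);   /* always strip the tag before access */
    switch (pointer_get_tag(p)) {
    case ptr_INT:
        return snprintf(buf, size, "int %d", *(int *)obj);
    case ptr_STR:
        return snprintf(buf, size, "str %s", (char *)obj);
    default:
        return snprintf(buf, size, "untagged or object");
    }
}
```

Strings are copied with aligned_strdup before tagging because a char array or string literal has no guaranteed alignment beyond 1, so its address may already have low bits set.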