The Objective-C runtime's isa pointer is defined as follows:
    union isa_t {
        isa_t() { }
        isa_t(uintptr_t value) : bits(value) { }

        uintptr_t bits;

    private:
        // Accessing the class requires custom ptrauth operations, so
        // force clients to go through setClass/getClass by making this
        // private.
        Class cls;

    public:
    #if defined(ISA_BITFIELD)
        struct {
            ISA_BITFIELD;  // defined in isa.h
        };

        bool isDeallocating() {
            return extra_rc == 0 && has_sidetable_rc == 0;
        }
        void setDeallocating() {
            extra_rc = 0;
            has_sidetable_rc = 0;
        }
    #endif

        void setClass(Class cls, objc_object *obj);
        Class getClass(bool authenticated);
        Class getDecodedClass(bool authenticated);
    };
The individual bit fields can be read using the ISA_BITFIELD definitions in isa.h.
When I read a Mach-O file from disk, go to the __objc_classlist section, and follow a pointer to an objc_class, I find a structure which is defined as such:
    struct objc_class : objc_object {
        objc_class(const objc_class&) = delete;
        objc_class(objc_class&&) = delete;
        void operator=(const objc_class&) = delete;
        void operator=(objc_class&&) = delete;
        // Class ISA;
        Class superclass;
        cache_t cache;             // formerly cache pointer and vtable
        class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
        ...
and objc_object is defined as such:
    struct objc_object {
    private:
        isa_t isa;
    public:
        ...
This means that I should be able to interpret the first 8 bytes of an objc_class as the bits field of an isa_t. But when I do this and decode the bit fields, I get seemingly random and incorrect information. If, on the other hand, I interpret the first 8 bytes as a pointer, it leads me to another objc_class instance on disk, which is usually the metaclass of the class. So what is the purpose of the isa_t union and its bits field in the Objective-C runtime? Is it only correct to interpret the isa as a union with bits once an object of some kind has been instantiated, while on disk it is just a pointer to a metaclass definition?
EDIT:
The way I read the objc_class struct from the file is with Python:
    from __future__ import annotations  # allow forward references in annotations

    import ctypes
    import struct
    from dataclasses import dataclass

    ISA_MASK = 0x0000000ffffffff8

    @dataclass
    class Isa:
        bits: ctypes.c_size_t
        _cls: ctypes.c_size_t

        def __init__(self, fp, addr):
            # Read the raw 8-byte little-endian value stored at addr.
            fp.seek(addr)
            self.bits = struct.unpack("<Q", fp.read(8))[0]
            self._cls = self.bits

        def nonpointer(self):
            return self.bits & 1

        def has_assoc(self):
            return (self.bits >> 1) & 1

        def has_cxx_dtor(self):
            return (self.bits >> 2) & 1

        def shiftcls(self):
            # shiftcls is 33 bits wide (bits 3..35), consistent with ISA_MASK.
            return (self.bits >> 3) & 0x1ffffffff

        def magic(self):
            return (self.bits >> 36) & 0x3f

        def weakly_referenced(self):
            return (self.bits >> 42) & 1

        def unused(self):
            return (self.bits >> 43) & 1

        def has_sidetable_rc(self):
            return (self.bits >> 44) & 1

        def extra_rc(self):
            return (self.bits >> 45) & 0x7ffff

        def get_class(self):
            clsbits = self.bits
            clsbits &= ISA_MASK
            return clsbits

    @dataclass
    class ObjcObject:
        isa: Isa
        _addr: ctypes.c_size_t

        def __init__(self, fp, addr, isa_class, external_block_addr):
            self.isa = None
            self._addr = addr
            fp.seek(addr)
            isa_addr = struct.unpack("<Q", fp.read(8))[0]
            if isa_addr != 0 and isa_addr < external_block_addr:
                # Isa.__init__ takes only (fp, addr).
                self.isa = Isa(fp, isa_addr)

    @dataclass
    class ObjcClass(ObjcObject):
        super_class: ObjcClass
        cache: Cache
        class_ro: ClassRo

        def __init__(self, fp, addr, external_block_addr):
            super().__init__(fp, addr, ObjcClass, external_block_addr)
            ...
    ...
For example, I have a class, let's call it A. After processing the chained fixups, at address 0x0025eed0 I have its symbol _OBJC_CLASS_$_A and the objc_class defined at that address.
The first 8 bytes of the structure are the isa, as we've established by looking at the sources of the runtime. Following it as a pointer, rather than treating it as the isa_t union, I get to another objc_class struct for the symbol _OBJC_METACLASS_$_A, which is the metaclass of this class.
Now, if instead of treating the first 8 bytes of the objc_class struct as a pointer to the metaclass I interpret them as the bits of the isa_t union, as in the code I provided, then for example the has_cxx_dtor method returns False. That is incorrect, because I can clearly find this method in the method_list_t structure of the class_ro. So the bits don't match what I parse, and hence the isa_t union seems unrelated to the actual data of the class on disk.
Note that I extract the data from the bits of isa_t according to the source of isa.h, assuming I am reading an ARM64 Mach-O without pointer authentication and not built for the simulator.
After digging a bit through the runtime, it appears that non-pointer isas are a runtime-only concept, and that all on-disk isas will always be regular pointers.
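One quick way to see this from a parser is to look at the low bits of the stored value. The following is only a minimal sketch: it assumes the chained fixups have already been resolved to a plain 64-bit address and that class structures are at least 8-byte aligned. `read_u64`, `looks_like_plain_pointer`, and the sample address are hypothetical names and values made up for illustration.

```python
import io
import struct

def read_u64(fp, offset):
    """Read a little-endian 64-bit value at the given file offset."""
    fp.seek(offset)
    return struct.unpack("<Q", fp.read(8))[0]

def looks_like_plain_pointer(value):
    """An 8-byte-aligned pointer has its three low bits clear, so bit 0
    (the nonpointer flag of the packed layout) is 0."""
    return value != 0 and (value & 0x7) == 0

# Fake "file": a single isa slot holding a made-up, already fixed-up
# metaclass address (assumption: fixups were resolved beforehand).
fake_image = io.BytesIO(struct.pack("<Q", 0x0025EEA8))
isa_value = read_u64(fake_image, 0)
is_plain = looks_like_plain_pointer(isa_value)
```

For an aligned pointer like this, bits 0 to 2 are always zero, so decoding them as nonpointer, has_assoc, and has_cxx_dtor says nothing about the class itself.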
The loading process for Obj-C classes in an object file:
1. dyld calls _objc_map_images (objc-internal.h/objc-runtime-new.mm), passing in the object headers to read and load classes from
2. _objc_map_images does a bit of setup before calling map_images (objc-private.h/objc-runtime-new.mm)
3. map_images takes the runtime lock, then calls map_images_nolock (objc-private.h/objc-os.mm)
4. map_images_nolock iterates over the Mach headers, searching for Obj-C info and performing some validation. It passes all of the headers which contain Obj-C classes to _read_images (objc-private.h/objc-runtime-new.mm)
5. _read_images is where we actually get to the interesting parts. It first sets up support for non-pointer isas as relevant for the runtime target, and sets up some tables for storing class information. After reading and fixing up selectors, it starts reading class info (OBJC_RUNTIME_DISCOVER_CLASSES_START()):
   - It reads the classlist stored in the header, receiving direct pointers to each of the classes in the image
   - Each class is passed to readClass (objc-runtime-new.mm), which resolves mangled class names, Swift classes, and more; but at the end of the day, the read classref_t (raw pointer to dyld class) is either cast to Class (the class object), or replaced by an allocated Class instance

So, where do non-pointer isas come into play? Only when setting an object's class at runtime:
1. When you create an object via objc_constructInstance or class_createInstance (runtime.h), or set an object's class via object_setClass, the object has either objc_object::initInstanceIsa or objc_object::initIsa (objc-object.h) called on it (and initInstanceIsa just calls through to initIsa anyway)
2. objc_object::initIsa has two implementations (one for SUPPORT_NONPOINTER_ISA and the other for when it is not supported), but both call down to isa_t::setClass (objc-private.h/objc-object.h)
3. isa_t::setClass also has two implementations: when SUPPORT_NONPOINTER_ISA is true, the implementation sets the appropriate bits in the isa value itself, setting shiftcls as necessary; when SUPPORT_NONPOINTER_ISA is false, it just sets the class directly

(Or in reverse, if you prefer: isa_t::setClass is only called from objc_object::initIsa/objc_object::changeIsa, which themselves are only called from objc_constructInstance/class_createInstance/object_setClass.)
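To make the runtime side concrete, here is a deliberately simplified Python model of the packing step. It is not the runtime's code: it assumes the arm64, non-ptrauth layout from the question (same ISA_MASK), models only the nonpointer, has_cxx_dtor, and shiftcls fields, and ignores everything else the real initIsa/setClass do. `pack_nonpointer_isa` and `unpack_class` are hypothetical helper names.

```python
ISA_MASK = 0x0000000FFFFFFFF8  # arm64, no ptrauth; same mask as in the question

def pack_nonpointer_isa(cls_addr, has_cxx_dtor=False):
    """Simplified model: build a packed isa value from a class address.

    Assumption: only nonpointer (bit 0), has_cxx_dtor (bit 2) and
    shiftcls (bits 3..35) are modelled; other fields stay zero.
    """
    if cls_addr & ~ISA_MASK:
        raise ValueError("class address must be 8-byte aligned and fit in shiftcls")
    bits = 1                        # nonpointer = 1
    bits |= int(has_cxx_dtor) << 2  # flag supplied by the caller at runtime
    # shiftcls holds cls_addr >> 3 starting at bit 3, so OR-ing the
    # aligned address in directly produces the same bit pattern.
    bits |= cls_addr
    return bits

def unpack_class(bits):
    """Recover the class address from a packed isa value."""
    return bits & ISA_MASK

packed = pack_nonpointer_isa(0x0025EED0, has_cxx_dtor=True)
```

In this model the flag bits only exist after packing; the value stored in the file is the bare class address, which is why decoding it as a bit field yields nothing meaningful.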
So, when you read these object files on disk, you will only ever encounter pointer isas for objects and classes; the bits inside isas are set exclusively at runtime. If there are details you're hoping to read from those bits, you'll need to construct that info yourself from the surrounding Mach-O data.
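As one example of constructing such information yourself, whether a class has a C++ destructor can be inferred from its parsed method list, since the question already notes that the method is visible there. This is a minimal sketch, assuming your parser yields the instance-method selector names as strings; `class_has_cxx_dtor` and `parsed_methods` are hypothetical names, and the selector checked is the compiler-generated `.cxx_destruct`.

```python
CXX_DESTRUCT_SELECTOR = ".cxx_destruct"

def class_has_cxx_dtor(method_names):
    """Return True if the parsed instance-method names include the
    compiler-generated destructor selector."""
    return CXX_DESTRUCT_SELECTOR in set(method_names)

# Hypothetical parser output for a class read from class_ro's method list.
parsed_methods = ["init", "doWork:", ".cxx_destruct"]
has_dtor = class_has_cxx_dtor(parsed_methods)
```

This derives the answer from data that is actually present in the file rather than from isa bits that are only populated once the runtime initialises an object.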