I'm most interested in extracting the architecture version, i.e. v5, v5T, etc. I've been referencing ELF for the ARM Architecture, Section 4.3.6 "Build Attributes", which has been helpful in getting me to this point. I can find the start of the .ARM.attributes section and can parse the first few fields without a problem: Format-version, Section-length, and vendor-name plus its null byte. I get a little lost after that. Below is a snapshot I produced using hexdump -vC on an ELF compiled with arm-linux-gnueabi-gcc -march=armv5t -O myprog.c -o myprog for an ARMv5T architecture. The start of the section is 77f0b.
We can see: Format-version: A
Section-length: 0x29
Vendor-name: "aeabi"
Obviously, 5T is available in ASCII form at 77f1C, but I'm not sure how to interpret the tag I need to parse to get that value.
Note: Yes, I understand there are tools that I can use to do this, but I need to extract the information in the application I am writing. It already parses the necessary information to make it this far.
Bonus question: Does PowerPC have similar tags? I couldn't find any supporting documentation.
These tags are documented in the Addenda to, and Errata in, the ABI for the ARM Architecture. For example, under "The target-related-attributes" (section 3.3.5.2), we learn that Tag_CPU_arch has value 6; in your dump it immediately follows Tag_CPU_name (5, which precedes the 5T string). Its argument is 3, which again corresponds to ARM v5T, according to the table in the document. The next tag is Tag_ARM_ISA_use (8) with an argument of 1, meaning "The user intended that this entity could use ARM instructions" (that is, the 32-bit ARM instruction set rather than Thumb), and so on.
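To turn the numeric Tag_CPU_arch argument into a readable name, a small lookup table is enough. The sketch below is an illustration only: the names `CPU_ARCH_NAMES` and `cpu_arch_name` are made up for this example, and the value list is my transcription of the table in the ABI addenda, so check it against the revision of the document you are targeting (newer revisions define values beyond the ones listed here).

```python
# Tag numbers mentioned in this answer (public "aeabi" tags).
TAG_CPU_NAME = 5      # argument: null-terminated string, e.g. "5T"
TAG_CPU_ARCH = 6      # argument: integer, see CPU_ARCH_NAMES
TAG_ARM_ISA_USE = 8   # argument: integer, 1 = ARM instructions permitted

# Tag_CPU_arch argument -> architecture name, transcribed from the ABI
# addenda table. Assumption: later revisions add further values.
CPU_ARCH_NAMES = {
    0: "Pre-v4",
    1: "v4",
    2: "v4T",
    3: "v5T",
    4: "v5TE",
    5: "v5TEJ",
    6: "v6",
    7: "v6KZ",
    8: "v6T2",
    9: "v6K",
    10: "v7",
}

def cpu_arch_name(value):
    """Return a readable name for a Tag_CPU_arch argument."""
    return CPU_ARCH_NAMES.get(value, f"unknown ({value})")

print(cpu_arch_name(3))  # v5T
```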
Note that the integers are encoded in ULEB128 format. This encoding is described in the DWARF standard (in section 7.6 of DWARF 3). Basically, it's base-128, little endian: each byte contributes its low 7 bits, least significant group first, and you need to keep reading while the MSB of the current byte is set.
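Putting the pieces together, here is a rough sketch of a ULEB128 reader and a walk over the attribute bytes that stops at Tag_CPU_arch. Several things here are assumptions on my part rather than anything taken from your dump: the function names (`read_uleb128`, `read_ntbs`, `skip_value`, `find_cpu_arch`) are invented for the example; the layout assumes that the vendor name is followed by a one-byte Tag_File (1) and a 4-byte size before the first attribute, and that both length fields count themselves, which is how I read the ABI document; the rule used to skip unrelated tags (4, 5 and odd tags above 32 carry strings, 32 carries an integer plus a string, everything else an integer) is likewise my reading of the spec. The sample bytes are hand-built to mirror the values described in the question, not copied from it.

```python
import struct

TAG_FILE = 1       # sub-subsection tag: attributes apply to the whole file
TAG_CPU_ARCH = 6

def read_uleb128(buf, pos):
    """Decode one ULEB128 integer at buf[pos]; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift   # low 7 bits, least significant first
        if (byte & 0x80) == 0:            # MSB clear: this was the last byte
            return value, pos
        shift += 7

def read_ntbs(buf, pos):
    """Read a null-terminated ASCII string; return (text, next_pos)."""
    end = buf.index(0, pos)
    return buf[pos:end].decode("ascii"), end + 1

def skip_value(buf, pos, tag):
    """Skip the argument of a tag we are not interested in (assumed rule)."""
    if tag == 32:                                   # integer, then string
        _, pos = read_uleb128(buf, pos)
        _, pos = read_ntbs(buf, pos)
    elif tag in (4, 5) or (tag > 32 and tag % 2 == 1):
        _, pos = read_ntbs(buf, pos)                # string argument
    else:
        _, pos = read_uleb128(buf, pos)             # integer argument
    return pos

def find_cpu_arch(section, little_endian=True):
    """Return the raw Tag_CPU_arch value from .ARM.attributes bytes, or None."""
    if section[:1] != b"A":
        raise ValueError("unexpected format-version")
    u32 = "<I" if little_endian else ">I"
    pos = 1
    while pos < len(section):
        (sec_len,) = struct.unpack_from(u32, section, pos)
        if sec_len < 5:
            raise ValueError("implausible section-length")
        sec_end = pos + sec_len
        vendor, p = read_ntbs(section, pos + 4)
        while p < sec_end:
            sub_tag = section[p]
            (sub_len,) = struct.unpack_from(u32, section, p + 1)
            if sub_len < 5:
                raise ValueError("implausible sub-subsection size")
            sub_end = p + sub_len
            if vendor == "aeabi" and sub_tag == TAG_FILE:
                q = p + 5
                while q < sub_end:
                    tag, q = read_uleb128(section, q)
                    if tag == TAG_CPU_ARCH:
                        return read_uleb128(section, q)[0]
                    q = skip_value(section, q, tag)
            p = sub_end
        pos = sec_end
    return None

# Hand-built sample: Tag_CPU_name "5T", Tag_CPU_arch 3, Tag_ARM_ISA_use 1.
attrs = b"\x05" + b"5T\x00" + b"\x06\x03" + b"\x08\x01"
sub = bytes([TAG_FILE]) + struct.pack("<I", 5 + len(attrs)) + attrs
vendor = b"aeabi\x00"
sample = b"A" + struct.pack("<I", 4 + len(vendor) + len(sub)) + vendor + sub

print(find_cpu_arch(sample))  # 3
```

The 4-byte length fields use the byte order of the ELF file itself, hence the `little_endian` parameter; for a big-endian target you would pass `False`.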