I've created my DLL using the -pg option (compilation and link), and managed to run its code and to call exit at some point.
A gmon.out file is created, so far so good.
But when I do
gprof mydll.dll gmon.out
no useful information is printed.
When I do the same thing with an executable it works properly and I get the timing & count information all right.
Why is this happening with my DLL?
(This question has been asked several times over the years but remained unanswered.)
Actually, gprof can do that. The issue is that the addresses in the DLL are different from the ones recorded in the gmon.out file.
In an executable, the (virtual) addresses are fixed, but in a DLL they are not. Don't ask me if it's because of ASLR or something else, but it complicates post-mortem debugging a lot.
Add to that the fact that the gmon.out file format isn't documented (or rather, there is a documented format, but it doesn't match what we were getting).
But we kind of figured it out: there's a header, then a lot of zeroes, then the data at the end. I don't know what all the data means, but the knowledge I got is enough to convert the gmon.out file into a usable one.
First, you have to print the address of your DLL entrypoint symbol when starting the program, and compare it to the static value given by nm.
Let's say your entrypoint symbol is _entry (the leading underscore is the usual C symbol decoration on 32-bit Windows, so the C identifier is entry). In your program (for example in C) just do:
extern void entry(void);  /* adjust to your real entrypoint signature */
printf("entry: %p\n", (void *)&entry);
Then use nm on the DLL (which must have symbols) to get the static value. On Windows:
nm mydll.dll | find "_entry"
Let's say you get 0x1F000000 for the static value and 0x6F000000 for the run-time (printed) value. Then you have a 0x50000000 offset that you must apply to the addresses stored in your gmon.out binary file (the script below adds the shift to each address).
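As a quick sanity check, the shift is just the run-time address minus the static one. A small Python sketch using the hypothetical values above:

```python
# hypothetical values from the example above
runtime_addr = 0x6F000000  # printed by the program at run time
static_addr = 0x1F000000   # reported by nm on the DLL

# difference between where the DLL was loaded and where it was linked
shift = runtime_addr - static_addr
print(hex(shift))  # 0x50000000
```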
So basically the format is pretty simple. In the following real-life example, the first 3 longwords (stored little-endian) are:
- 0x10121450: start address
- 0x1357E590: end address
- 0x01A2E8C0: offset of the data in the file
Between the header and the data offset there are only zeroes (probably some room for more data); the data then starts exactly at the offset given in the header, so the offset matches.
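As a sketch of how those three longwords decode, here the example header is rebuilt in memory with Python's struct module rather than read from a real file:

```python
import struct

# rebuild the 12-byte header from the real-life example values above
header = struct.pack("<III", 0x10121450, 0x1357E590, 0x01A2E8C0)

# the three little-endian longwords: start, end, data offset
start_address, end_address, data_offset = struct.unpack("<III", header)
print(hex(start_address), hex(end_address), hex(data_offset))
# 0x10121450 0x1357e590 0x1a2e8c0
```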
Now we have to apply the offset computed/printed when running the code, so that the addresses from gmon.out match the addresses from the DLL.
How to do that? It's pretty easy with a small Python script. The concept:
The aim is to add the address shift to the header addresses and all the addresses of the chunks, leaving the rest of the data unchanged.
The script (everything is hardcoded, but you get the idea):
import struct

with open("gmon.out","rb") as f:  # the file produced by the run
    contents = f.read()

# header: start address, end address, offset of the data (little-endian)
start_address,end_address,data_offset = struct.unpack("<III",contents[:12])
profile_data = contents[data_offset:]
nb_records = len(profile_data)//12  # each record is 3 longwords
records = []
for i in range(nb_records):
    offset = i*12
    extract = profile_data[offset:offset+12]
    s,e,data = struct.unpack("<III",extract)
    records.append((s,e,data))

# let's say 0x65000000 is the address that the program printed
# and 0x10120000 is the address that "nm" reports
shift = 0x65000000-0x10120000

with open("gmon2.out","wb") as f:  # the file that will work with that run
    f.write(struct.pack("<II",start_address+shift,end_address+shift))
    f.write(contents[8:data_offset])  # data offset field and padding, unchanged
    for s,e,d in records:
        f.write(struct.pack("<III",s+shift,e+shift,d))
Now:
gprof mydll.dll gmon2.out
Doing that allows gprof to decode the gmon.out file against the DLL, since the addresses have been corrected to match the addresses of the symbols contained in the DLL.
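To illustrate the whole transformation without a real profile, here is a self-contained sketch: it builds a tiny fake gmon.out-style blob in memory (made-up addresses, with the layout described above), applies a shift the same way the script does, and checks that the addresses move while the counts and the data offset stay unchanged:

```python
import struct

SHIFT = 0x50000000  # made-up run-time minus static offset

# fake file: 12-byte header, 4 bytes of zero padding, then two 12-byte records
data_offset = 16
header = struct.pack("<III", 0x10001000, 0x10002000, data_offset)
padding = b"\x00" * (data_offset - 12)
records = (struct.pack("<III", 0x10001010, 0x10001020, 42)
           + struct.pack("<III", 0x10001100, 0x10001200, 7))
original = header + padding + records

def shift_gmon(contents, shift):
    """Apply `shift` to the header and record addresses, keep everything else."""
    start, end, off = struct.unpack("<III", contents[:12])
    out = struct.pack("<II", start + shift, end + shift)
    out += contents[8:off]  # data offset field and padding, unchanged
    for i in range(off, len(contents), 12):
        s, e, count = struct.unpack("<III", contents[i:i+12])
        out += struct.pack("<III", s + shift, e + shift, count)
    return out

shifted = shift_gmon(original, SHIFT)
start, end, off = struct.unpack("<III", shifted[:12])
print(hex(start), hex(end), off)  # 0x60001000 0x60002000 16
```

The same idea scales to a real gmon.out: only the address fields are rewritten, so the profile counts survive intact.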