profilingmingwgprof

How to profile an EXE and the DLLs it loaded at the same time?


I'm using the MinGW GCC compiler on Windows. If I add -pg switch to the compiler I can generate both the EXE and the DLL with profile data.

The gmon.out is generated. But the problem is that when I use

gprof myprogram.exe gmon.out

I get no profile output (other than the table headings and other text).

gprof mydll.dll gmon.out

I get the output for that particular DLL only but not for the main exe.

Maybe both the exe and the dll wanted to generate the same file and the dll won.

The goal is getting stats for functions in both the EXE and DLL in one output.

Can gprof do that? If not, is there any tool that can do this on Windows?


Solution

  • It seems gprof cannot do that. sprof is not available in MinGw. So rolled my own profiler.

    I used -finstrument-functions So two functions __cyg_profile_func_enter and __cyg_profile_func_exit will be called before and after each functions respectively.

    The actual profiling functions are exported into a DLL and called in these functions, and the DLL is linked to all EXE and DLL in question. So it can profile both of them

    The code in the library is like this (removed clutter: asserts, error checking, simplified function calls for clarity).

    static void proflib_init()
    {
        atexit(proflib_finalize);
    
        empty(callstack);
        empty(trackingData);
        proflibIntialized = 1;
    }
    
    static void proflib_finalize()
    {
        /* Make a log. */
        FILE *f = fopen("proflib_log.txt", "wt");
        int i;
    
        sortBySelftime(trackingData);
    
        fprintf(f, "%10s%15s%15s%15s\n", "Func name", "Cumulative", "Self time", "Count");
        for (i = 0; i < getlength(trackingData); i++)
        {
            FunctionTimeInfo *fri = trackingData[i];
    
            fprintf(f, "%10p%15"PRIu64"%15"PRIu64"%20d\n", fri->addr, fri->cumulative, fri->selfTime, fri->count);
        }
    
        fclose(f);
    }
    
    void proflib_func_enter(void *func, void *caller)
    {
        FunctionTimeInfo elem;
        long long pc;
    
        pc = rdtsc(); /* Read timestamp counter from CPU. */
    
        if (!is_prolib_initialized())
        {
            proflib_init();
        }
    
        /* Register self time as control moves to the child function. */
        if (!isEmpty(callstack))
        {
            FunctionTimeInfo *top = gettop(callstack);
    
            top->selfTime += pc - top->selfSample;
        }
    
        elem.addr = func; /* Address of function. */
        elem.cumulative = pc; /* Time spent in function and functions called by this. (so far store the reference point only.)*/
        elem.selfSample = pc; /* Reference point for self time counting. */
        elem.count = 1; /* Number of this the function is counted. */
        elem.selfTime = 0; /* Time spent in the function not including called functions. */
    
        push(callstack, elem);
    }
    
    void proflib_func_exit(void *func, void *caller)
    {
        FunctionTimeInfo *fti;
        FunctionTimeInfo *storedStat;
        long long pc;
    
        pc = rdtsc();
    
        fti = gettop(callstack);
    
        fti->cumulative = pc - fti->cumulative; /* Finalize the time. */
        fti->selfTime += pc - fti->selfSample;
        pop(callstack);
    
        {
            FunctionTimeInfo *top = gettop(callstack);
    
            top->selfSample = pc; /* Set new self reference for the parent. */
        }
    
        storedStat = find(trackingData, func);
    
        if (storedStat)
        {
            /* Already have an entry. */
            storedStat->cumulative += fti->cumulative;
            storedStat->selfTime += fti->selfTime;
            storedStat->count++;
        }
        else
        {
            /* Add it as new entry. */
            add(trackingData, fti);
        }
    }
    

    And it produces logs like this:

     Func name     Cumulative      Self time          Count
      691C83B9  1138235861408  1138235861408             1137730
      00416396   539018507364   539018507364            16657216
      0040A0DC   259288775768   199827541522             1914832
      0041067D   876519599063   163253984165            92203200
      691C9E0E   785372027387   150744125859              190020
      004390F9  3608742795672   149177603708                   1
      0042E6A4   141485929006   116938396343               37753
      00428CB8   456357355541   112610168088              193304
      0041C2A4   340078363426    84539535634           114437798
      691CB980   402228058455    82958191728               29675
      00408A0A    79628811602    77769403982              512220
      0040D8CD    93610151071    63396331438            87773597
      0040D91A    60276409516    60276409516           175547194
      00427C36    72489783460    58130405593                   1
      691C7C3D    56702394950    56702394950             3455819
      691C949F   101350487028    47913486509             2977100
      691CBBF3   241451044787    45153581905               29771
      0043702E   920148247934    41990658926               25612
      ...
    

    The function names can be found out from the MAP file. And the function at 0x691C83B9 in the DLL is indeed a suboptimal function with O(n³) complexity and is called lot of times, I must refactor that... I totally forgot that function even exist... The 0x004390F9 is the WinMain.