c, dwarf, gnu-coreutils

Two seemingly different functions with the same name exist in a C program (according to DWARF info). Why?


I have been going through the assembly code of the sort program in GNU Coreutils and found something I can't explain technically.

It all started with dumping the DWARF debug information of the sort program with the command:

~/coreutils/new_build/src (master*) » objdump --dwarf=info ./sort &> sort.objdwarf

This gives me useful DWARF information for learning about the program. However, looking through it, I found two numcompare functions that are very different (their source is shown later). The first one is:

 <1><5978>: Abbrev Number: 55 (DW_TAG_subprogram)
    <5979>   DW_AT_name        : (indirect string, offset: 0x718): numcompare
...
    <5986>   DW_AT_low_pc      : 0x6fb0
    <598e>   DW_AT_high_pc     : 0x703a
    <5996>   DW_AT_frame_base  : 0x1464 (location list)
    <599a>   DW_AT_GNU_all_tail_call_sites: 1
    <599b>   DW_AT_sibling     : <0x59bc>
 <2><599f>: Abbrev Number: 56 (DW_TAG_formal_parameter)
    <59a0>   DW_AT_name        : a
...
 <2><59ad>: Abbrev Number: 56 (DW_TAG_formal_parameter)
    <59ae>   DW_AT_name        : b
...
 <2><59bb>: Abbrev Number: 0

and the second one is:

<1><f882>: Abbrev Number: 10 (DW_TAG_subprogram)
    <f883>   DW_AT_name        : (indirect string, offset: 0x718): numcompare
...
    <f88f>   DW_AT_low_pc      : 0x19118
    <f897>   DW_AT_high_pc     : 0x1957e
    <f89f>   DW_AT_frame_base  : 0x5d34 (location list)
    <f8a3>   DW_AT_GNU_all_tail_call_sites: 1
    <f8a4>   DW_AT_sibling     : <0xf92e>
 <2><f8a8>: Abbrev Number: 6 (DW_TAG_formal_parameter)
    <f8a9>   DW_AT_name        : a
...
 <2><f8b5>: Abbrev Number: 6 (DW_TAG_formal_parameter)
    <f8b6>   DW_AT_name        : b
...
 <2><f8c2>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <f8c3>   DW_AT_name        : (indirect string, offset: 0x20af): decimal_point
 ...
 <2><f8d2>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <f8d3>   DW_AT_name        : (indirect string, offset: 0xa1e): thousands_sep
...

After grepping the source code, I found only one numcompare function:

~/coreutils/src (master*) » grep -Hnriw numcompare                                 
sort.c:1998:numcompare (char const *a, char const *b)
sort.c:2694:            diff = numcompare (ta, tb);

Looking further through the source code, I found something interesting in how it is defined:


/* Compare strings A and B as numbers without explicitly converting them to
   machine numbers.  Comparatively slow for short strings, but asymptotically
   hideously fast. */

ATTRIBUTE_PURE
static int
numcompare (char const *a, char const *b)
{
  while (blanks[to_uchar (*a)])
    a++;
  while (blanks[to_uchar (*b)])
    b++;

  return strnumcmp (a, b, decimal_point, thousands_sep);
}

From this, I noticed that the second numcompare entry in the DWARF information has the same parameter list as strnumcmp, so it looks like it actually describes strnumcmp. The entry for strnumcmp itself is:

<1><f810>: Abbrev Number: 5 (DW_TAG_subprogram)
    <f811>   DW_AT_external    : 1
    <f812>   DW_AT_name        : (indirect string, offset: 0x56b8): strnumcmp
...
    <f81e>   DW_AT_low_pc      : 0x1957e
    <f826>   DW_AT_high_pc     : 0x195ac
    <f82e>   DW_AT_frame_base  : 0x5cd4 (location list)
    <f832>   DW_AT_GNU_all_tail_call_sites: 1
    <f833>   DW_AT_sibling     : <0xf870>
 <2><f837>: Abbrev Number: 6 (DW_TAG_formal_parameter)
    <f838>   DW_AT_name        : a
...
 <2><f844>: Abbrev Number: 6 (DW_TAG_formal_parameter)
    <f845>   DW_AT_name        : b
...
 <2><f851>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <f852>   DW_AT_name        : (indirect string, offset: 0x20af): decimal_point
...
 <2><f860>: Abbrev Number: 7 (DW_TAG_formal_parameter)
    <f861>   DW_AT_name        : (indirect string, offset: 0xa1e): thousands_sep
...

I then looked into strnumcmp. It is a library function used by GNU Coreutils, defined outside sort.c as:

int
strnumcmp (char const *a, char const *b,
           int decimal_point, int thousands_sep)
{
  return numcompare (a, b, decimal_point, thousands_sep);
}

So now I'm very confused about what is going on here. This is what I found about ATTRIBUTE_PURE being used together with static:

"ATTRIBUTE_PURE" is a function attribute used in C code to indicate that a function has no side effects and that its result depends only on its arguments and on memory it reads (such as global variables), and "static" is a storage class specifier that indicates that the function or variable is only visible within the file it is declared in.

This doesn't quite help me explain what is going on here. My main question is: why and how are there two separate DWARF entries for the function numcompare? Thank you in advance.


Solution

  • why and how are there two separate DWARF entries for the function numcompare?

    Both functions are static, i.e. they have internal linkage. Each translation unit (and therefore each object file) can define its own static function with the same name, and the compiler emits a separate DWARF entry for each of them.

    One numcompare is defined at https://github.com/coreutils/coreutils/blob/master/src/sort.c#L1998 and the other at https://github.com/coreutils/coreutils/blob/master/gl/lib/strnumcmp-in.h#L114 . The first is compiled into sort.o and the second into strnumcmp.o, and the object files are then linked together.
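
    Because the name alone is ambiguous, the address range is what tells the two entries apart. As a rough illustration (not how objdump or a debugger actually works), the sketch below hand-copies the low_pc/high_pc values of the three DW_TAG_subprogram entries from the question into a small C table and looks functions up by address instead of by name. The struct and function names are made up for this example. It assumes DW_AT_high_pc is an exclusive end address, which matches the dump (the second numcompare ends at 0x1957e, exactly where strnumcmp begins); in DWARF 4 and later it can instead be encoded as an offset from low_pc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, hand-copied summary of three DW_TAG_subprogram entries
   from the dump in the question; this is not a real DWARF reader. */
struct subprogram {
    const char *name;  /* DW_AT_name */
    uint64_t low_pc;   /* DW_AT_low_pc: first address of the function */
    uint64_t high_pc;  /* DW_AT_high_pc: assumed exclusive end address */
};

static const struct subprogram subprograms[] = {
    { "numcompare", 0x6fb0,  0x703a  },  /* DIE <5978>: two parameters  */
    { "numcompare", 0x19118, 0x1957e },  /* DIE <f882>: four parameters */
    { "strnumcmp",  0x1957e, 0x195ac },  /* DIE <f810>                  */
};

#define SUBPROGRAM_COUNT (sizeof subprograms / sizeof subprograms[0])

/* Return the entry whose [low_pc, high_pc) range contains pc, or NULL. */
static const struct subprogram *find_by_pc(uint64_t pc)
{
    for (size_t i = 0; i < SUBPROGRAM_COUNT; i++) {
        if (pc >= subprograms[i].low_pc && pc < subprograms[i].high_pc)
            return &subprograms[i];
    }
    return NULL;
}

/* Count how many entries carry the given name. */
static size_t count_by_name(const char *name)
{
    assert(name != NULL);
    size_t matches = 0;
    for (size_t i = 0; i < SUBPROGRAM_COUNT; i++) {
        if (strcmp(subprograms[i].name, name) == 0)
            matches++;
    }
    return matches;
}
```

    With this table, count_by_name("numcompare") is 2, while find_by_pc(0x6fb0) and find_by_pc(0x19118) return different entries: the name is shared, the code is not.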