c++ assembly clang++

Why are there a large number of integer constants in the generated assembly code for std::print?


I was playing around with a "hello world!" program using the new std::print function on clang 18.1.0, and I noticed a couple thousand lines of integer constants, as shown here on godbolt. What purpose do they serve, and why are they not present in a classic std::cout-based "hello world!"?


Solution

  • You mean the blocks like this?

    .LJTI17_0:
        .long   .LBB17_1-.LJTI17_0
        .long   .LBB17_2-.LJTI17_0
        .long   .LBB17_24-.LJTI17_0
    

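    Those are tables of 32-bit offsets from the start of the table to other labels, probably for a switch in position-independent code. Storing offsets relative to the table, rather than absolute addresses, means the table needs no runtime relocations. The JTI part of the label name presumably stands for Jump Table Index, LLVM's internal term for a reference to a jump table.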

    In parser<char>, I see code that uses such a table, with the same pattern as in the Q&A "GCC Jump Table initialization code generating movsxd and add?":

        lea     rdx, [rip + .LJTI29_0]
        movsxd  rcx, dword ptr [rdx + 4*rcx]   # sign-extending load indexing into the table
        add     rcx, rdx                       # add it to the table address
        jmp     rcx                            # to get a jump target
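
    As a rough illustration of where such tables come from, here is a minimal sketch of a dense switch, the kind of source a compiler may lower to a jump table plus an indexed indirect jump like the one above. The function name apply and its opcodes are made up for this example and are not taken from libc++. Whether a given compiler actually emits a jump table for it depends on the target and the optimization settings.

```cpp
// Hypothetical example: a switch over a small, dense range of values.
// Each case does different work, so a compiler may choose to index a
// table of code addresses (or relative offsets) instead of emitting a
// chain of compare-and-branch instructions.
int apply(int op, int a, int b) {
    switch (op) {
        case 0: return a + b;
        case 1: return a - b;
        case 2: return a * b;
        case 3: return b != 0 ? a / b : 0;  // avoid division by zero
        case 4: return a & b;
        case 5: return a | b;
        case 6: return a ^ b;
        default: return 0;                  // out-of-range opcodes
    }
}
```

    If a table is used, the compiler would normally range-check op first and send out-of-range values to the default case before indexing the table.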
    

    Godbolt filters out code for library functions by default, so that code is hidden. But std::print is defined in the header, so it compiles into a significant amount of code in the compilation unit that uses it. That's unlike <iostream>, where the definition of operator<< is only in the library and not available for inlining, so the asm in the caller is just a constructor call for globals like std::cout plus a normal function call.

    I also see a few tables of literal integer constants like

    std::__1::__extended_grapheme_custer_property_boundary::__entries:
            .long   145
            .long   20485
            .long   22545
            .long   26624
            .long   28945
            .long   260609
            .long   346115
            .long   354305
            .long   356355
            .long   1574642
         ...
    
    std::__1::__width_estimation_table::__entries:
            .long   71303263
            .long   147226625
            .long   147472385
            .long   150618115
            .long   150732800
    

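    Their names seem fairly self-explanatory: they look like Unicode property tables for finding extended grapheme cluster boundaries and for estimating display width, which a formatter needs in order to pad and align text. At the very least, the names are long enough to search the source for the code that defines and uses them.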

    Also a table of powers of 10 that fit in 32 bits, with a name that indicates the purpose:

    std::__1::__itoa::__pow10_32:
            .long   0
            .long   10
            .long   100
            .long   1000
            .long   10000
            .long   100000
            .long   1000000
            .long   10000000
            .long   100000000
            .long   1000000000
    

    Binary or linear search of a small table can be faster than repeatedly multiplying x *= 10 to get values to compare against when finding the decimal-digit length of a number, especially if it's done often. Knowing the length up front is useful: for non-power-of-2 bases, digits are generated least-significant first, and if you know the length you can store each digit into the itoa output buffer as you generate it and still have the first digit at a known position.

    (Otherwise you could just start at the end of a buffer and copy afterwards, but that would cause a store-forwarding stall if you do a wide copy, and you might not know it's safe to write past the end of an odd length. See How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string))