c++ gcc libstdc++ c++23 stdformat

Unable to use std::format for integer formatting - gcc 15.1.0


I am using the gcc 15.1.0 Docker image. While running my C++ code I hit a very odd issue that I am unable to resolve. Below is a minimal reproducer, stripped down from my codebase (in the real code I cannot remove the import from main.cpp):

root@b92b45f832fe:~# g++ --version
g++ (GCC) 15.1.0
Copyright (C) 2025 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

root@b92b45f832fe:~# ls -l
total 8
-rw-r--r-- 1 root root 452 May 11 20:53 format.ixx
-rw-r--r-- 1 root root 109 May 11 20:53 main.cpp
root@b92b45f832fe:~# cat format.ixx 
module;
#include <format>
#include <optional>

export module utils.format;


export template <typename T>
    struct std::formatter<std::optional<T>> : std::formatter<std::string> {
    constexpr auto format(const auto& opt, auto& ctx) const {
        if (opt.has_value())
            return std::formatter<std::string>::format(std::format("{0}", opt.value()), ctx);

        return std::formatter<std::string>::format("Optional[null]", ctx);
    }
};
root@b92b45f832fe:~# cat main.cpp 
#include <format>

import utils.format;

int main() {
    const auto x = std::format("{:*>4}", 2);
}

root@b92b45f832fe:~# g++ format.ixx  main.cpp -fmodules -std=c++23
root@b92b45f832fe:~# ./a.out 
terminate called after throwing an instance of 'std::format_error'
  what():  format error: invalid width or precision in format-spec
Aborted
root@b92b45f832fe:~# 

Does anyone know how to resolve this? The format string is valid, so std::format should not throw here.

Debugging

I debugged this with CLion, stepped into the format header, and found the problematic line (libstdc++ format, line 435):

      // N.B. std::from_chars is not constexpr in C++20.
      if (__detail::__from_chars_alnum<true>(__first, __last, __val, 10) // <----- ISSUE
        && __first != __start) [[likely]]
        return {__val, __first};
    }

When I stepped into the library code, I found the following behavioral difference:

The function executed for __detail::__from_chars_alnum<true>(__first, __last, __val, 10) is charconv:521-548. Its execution is exactly the same in both runs (with or without import utils.format in main.cpp), and the data types of all variables are also the same in both runs.

With the import present in main.cpp, however, the pointer update made in charconv is not reflected in the if statement in the format header.


Solution

  • UPDATE:

    1. This issue happens with GCC 15.1 even if std::formatter is fully specialized using a custom type (e.g. struct mystruct{}), so it is not caused by the formatter using std::optional or any other builtin or std type.

    2. This issue does not happen on my development machine after upgrading the compiler to g++ (GCC) 15.2.1 20250808 (Red Hat 15.2.1-1), so it was most likely a GCC bug.

    TL;DR:

    When your sample program is compiled with g++ (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2) on x64, the compiled code for

    __parse_integer(const _CharT* __first, const _CharT* __last) 
    

    calls

    __detail::__from_chars_alnum<true>(const char*& __first, const char* __last, _Tp& __val, int __base)
    

    but it creates a temporary copy of __first and passes a reference to that temporary as the first argument of __from_chars_alnum, so any changes are applied to the temporary and thus lost.

    I don't really know why that temporary is created, but it seems like a GCC bug, because Clang compiles and runs your sample program just fine and `x` is assigned the correct formatted string.
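
To make the effect concrete, here is a minimal sketch in plain C++ that models both behaviours. The names parse_digits, parse_integer_expected and parse_integer_observed are hypothetical stand-ins that I made up for illustration; they are not the libstdc++ functions, and overflow handling is deliberately left out.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for __detail::__from_chars_alnum<true>:
// consumes decimal digits and advances `first` through the reference.
// Simplification: overflow is ignored, so this always reports success.
bool parse_digits(const char*& first, const char* last, unsigned short& val) {
    while (first != last && *first >= '0' && *first <= '9') {
        val = static_cast<unsigned short>(val * 10 + (*first - '0'));
        ++first;
    }
    return true;
}

// What the source code asks for: `first` itself is bound to the reference,
// so the caller sees the advanced pointer.
std::pair<unsigned short, const char*>
parse_integer_expected(const char* first, const char* last) {
    assert(first != last);
    const char* start = first;
    unsigned short val = 0;
    if (parse_digits(first, last, val) && first != start)
        return {val, first};
    return {0, nullptr};
}

// What the miscompiled code effectively does: a copy of `first` is passed,
// so `first` never moves and the `first != start` check always fails.
std::pair<unsigned short, const char*>
parse_integer_observed(const char* first, const char* last) {
    assert(first != last);
    const char* start = first;
    unsigned short val = 0;
    const char* temp_first = first;  // the unexpected temporary
    if (parse_digits(temp_first, last, val) && first != start)
        return {val, first};
    return {0, nullptr};
}
```

Calling both functions on the width part of the format spec ("4}") gives {4, pointer to '}'} for the expected version and {0, nullptr} for the observed one. A null pointer result is consistent with the "invalid width or precision" error you are seeing.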

    LONG ANSWER:

    As I already mentioned in the short answer, I don't know why the temporary is created. clang version 20.1.8 (Fedora 20.1.8-3.fc42) compiles and runs your program just fine and `x` is assigned the correctly formatted value, so it really looks like a GCC bug to me.

    Below is the disassembly of __parse_integer from your program compiled with g++ (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2). My comments are inline, prefixed with a semicolon (;). It clearly shows the creation of a temporary copy of __first at address [rbp-0x58].

    (gdb) disas /s
    Dump of assembler code for function _ZNSt8__format15__parse_integerIcEESt4pairItPKT_ES4_S4_:
    /usr/include/c++/15/format:
    432         __parse_integer(const _CharT* __first, const _CharT* __last)
       0x00000000004039a5 <+0>:     push   rbp
       0x00000000004039a6 <+1>:     mov    rbp,rsp
       0x00000000004039a9 <+4>:     sub    rsp,0x70
    
    ; __first is saved on the stack in [rbp-0x68]
       0x00000000004039ad <+8>:     mov    QWORD PTR [rbp-0x68],rdi
    
    ; __last is saved on the stack in [rbp-0x70]
       0x00000000004039b1 <+12>:    mov    QWORD PTR [rbp-0x70],rsi
    
    433         {
    434           if (__first == __last)
    => 0x00000000004039b5 <+16>:    mov    rax,QWORD PTR [rbp-0x68]
       0x00000000004039b9 <+20>:    cmp    rax,QWORD PTR [rbp-0x70]
    
    435             __builtin_unreachable();
    436
    437           if constexpr (is_same_v<_CharT, char>)
    438             {
    439               const auto __start = __first;
       0x00000000004039bd <+24>:    mov    rax,QWORD PTR [rbp-0x68]
    
    ; __start is saved on the stack in [rbp-0x8] and __first (rax) is assigned to it
       0x00000000004039c1 <+28>:    mov    QWORD PTR [rbp-0x8],rax
    
    440               unsigned short __val = 0;
       0x00000000004039c5 <+32>:    mov    WORD PTR [rbp-0x5a],0x0
    
    441               // N.B. std::from_chars is not constexpr in C++20.
    442               if (__detail::__from_chars_alnum<true>(__first, __last, __val, 10)
    
    ; Get the value of __first in rax and save it in the temporary variable temp_first at [rbp-0x58]
       0x00000000004039cb <+38>:    mov    rax,QWORD PTR [rbp-0x68]
       0x00000000004039cf <+42>:    mov    QWORD PTR [rbp-0x58],rax
    
       0x00000000004039d3 <+46>:    lea    rdx,[rbp-0x5a]
       0x00000000004039d7 <+50>:    mov    rsi,QWORD PTR [rbp-0x70]
    
    ; Get the address of temp_first in rax
       0x00000000004039db <+54>:    lea    rax,[rbp-0x58]
       0x00000000004039df <+58>:    mov    ecx,0xa
    
    ; Put the address of temp_first in rdi, which holds the first argument of __detail::__from_chars_alnum<true>()
    ; In compiled code references are represented as pointers, so effectively a reference to temp_first is passed
    ; as the first argument of __detail::__from_chars_alnum<true>()
       0x00000000004039e4 <+63>:    mov    rdi,rax
       0x00000000004039e7 <+66>:    call   0x403887 <_ZNSt8__detail18__from_chars_alnumILb1EtEEbRPKcS2_RT0_i>
    ; At this point the call to __detail::__from_chars_alnum<true>() has returned. Any changes to the first argument
    ; of __detail::__from_chars_alnum<true>() have been applied to temp_first, not __first
    
    443                     && __first != __start) [[likely]]
    
    ; If __detail::__from_chars_alnum<true>() returned false, go to 0x403a01, which marks the whole condition
    ; as false so that {0, nullptr} is returned
       0x00000000004039ec <+71>:    test   al,al
       0x00000000004039ee <+73>:    je     0x403a01 <_ZNSt8__format15__parse_integerIcEESt4pairItPKT_ES4_S4_+92>
    
    ; If __first == __start, go to 0x403a01, which marks the whole condition as false so that {0, nullptr} is returned.
    ; However, we passed a reference to a temporary variable to __detail::__from_chars_alnum<true>(), so changes
    ; (if any) have been applied to temp_first. Thus the comparison __first == __start is always true and we
    ; always jump to 0x403a01, which leads to returning {0, nullptr}
       0x00000000004039f0 <+75>:    mov    rax,QWORD PTR [rbp-0x68]
       0x00000000004039f4 <+79>:    cmp    rax,QWORD PTR [rbp-0x8]
       0x00000000004039f8 <+83>:    je     0x403a01 <_ZNSt8__format15__parse_integerIcEESt4pairItPKT_ES4_S4_+92>
    
       0x00000000004039fa <+85>:    mov    eax,0x1
       0x00000000004039ff <+90>:    jmp    0x403a06 <_ZNSt8__format15__parse_integerIcEESt4pairItPKT_ES4_S4_+97>
       0x0000000000403a01 <+92>:    mov    eax,0x0
    
    442               if (__detail::__from_chars_alnum<true>(__first, __last, __val, 10)
       0x0000000000403a06 <+97>:    test   al,al
       0x0000000000403a08 <+99>:    je     0x403a33 <_ZNSt8__format15__parse_integerIcEESt4pairItPKT_ES4_S4_+142>
    
    444                 return {__val, __first};
       0x0000000000403a0a <+101>:   mov    rax,QWORD PTR [rbp-0x68]
       0x0000000000403a0e <+105>:   mov    QWORD PTR [rbp-0x38],rax
       0x0000000000403a12 <+109>:   lea    rdx,[rbp-0x38]
       0x0000000000403a16 <+113>:   lea    rcx,[rbp-0x5a]
       0x0000000000403a1a <+117>:   lea    rax,[rbp-0x50]
       0x0000000000403a1e <+121>:   mov    rsi,rcx
       0x0000000000403a21 <+124>:   mov    rdi,rax
       0x0000000000403a24 <+127>:   call   0x403aa6 <_ZNSt4pairItPKcEC2IRtRS1_EEOT_OT0_>
       0x0000000000403a29 <+132>:   mov    rax,QWORD PTR [rbp-0x50]
       0x0000000000403a2d <+136>:   mov    rdx,QWORD PTR [rbp-0x48]
       0x0000000000403a31 <+140>:   jmp    0x403a61 <_ZNSt8__format15__parse_integerIcEESt4pairItPKT_ES4_S4_+188>
    
    445             }
    446           else
    447             {
    448               constexpr int __n = 32;
    449               char __buf[__n]{};
    450               for (int __i = 0; __i < __n && (__first + __i) != __last; ++__i)
    451                 __buf[__i] = __first[__i];
    452               auto [__v, __ptr] = __format::__parse_integer(__buf, __buf + __n);
    453               if (__ptr) [[likely]]
    454                 return {__v, __first + (__ptr - __buf)};
    455             }
    456           return {0, nullptr};
       0x0000000000403a33 <+142>:   mov    DWORD PTR [rbp-0x14],0x0
       0x0000000000403a3a <+149>:   mov    QWORD PTR [rbp-0x10],0x0
       0x0000000000403a42 <+157>:   lea    rdx,[rbp-0x10]
       0x0000000000403a46 <+161>:   lea    rcx,[rbp-0x14]
       0x0000000000403a4a <+165>:   lea    rax,[rbp-0x30]
       0x0000000000403a4e <+169>:   mov    rsi,rcx
       0x0000000000403a51 <+172>:   mov    rdi,rax
       0x0000000000403a54 <+175>:   call   0x403a68 <_ZNSt4pairItPKcEC2IiDnEEOT_OT0_>
       0x0000000000403a59 <+180>:   mov    rax,QWORD PTR [rbp-0x30]
       0x0000000000403a5d <+184>:   mov    rdx,QWORD PTR [rbp-0x28]
    
    457         }
       0x0000000000403a61 <+188>:   mov    ecx,eax
       0x0000000000403a63 <+190>:   mov    eax,ecx
       0x0000000000403a65 <+192>:   leave
       0x0000000000403a66 <+193>:   ret
    End of assembler dump.