I'm starting to play with {fmt} and wrote a little program to see how it processes large containers. It would seem that fmt::print() (which ultimately sends output to stdout) internally first composes the entire result as a string. The test program below, where I format a 10,000,000-element vector<char> using a format string that consumes 100 bytes per entry, amasses the full 100 * 10,000,000 = 1 GB of RAM before starting to dump the result to stdout. Although you can't tell from the output of my test program, almost all of the 1.7 seconds it took to format and output the result is spent in the formatting, not the outputting. (If you don't redirect to /dev/null, there's a long pause before anything starts printing to stdout.) This is not good behavior if you're trying to build pipelining tools.
Q1. I do see some references in the docs to fmt::format_to(). Can that somehow be used to start streaming and discarding the result before the formatting is complete and thereby avoid the in-core composition of the full result?
Q2. Continuing along this line of exploration, instead of passing a container, is there a way I can pass, say, two iterators (that perhaps point at the beginning and ending of a very large file) and pump that data through {fmt} for processing (and thereby avoid having to first read the entire file into memory)?
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
#include <time.h>
#include "fmt/format.h"
#include "fmt/ranges.h"

using namespace std;

// Current CLOCK_MONOTONIC_RAW time in nanoseconds.
inline long long
clock_monotonic_raw() {
    struct timespec ct;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ct);
    return ct.tv_sec * 1000000000LL + ct.tv_nsec;
}

// Seconds elapsed since the first call (the first call returns 0.0).
inline double
dt() {
    static long long t0 = 0;
    if (t0 == 0) {
        t0 = clock_monotonic_raw();
        return 0.0;
    }
    long long t1 = clock_monotonic_raw();
    return (t1 - t0) / 1.0e9;
}

int main(int argc, char** argv) {
    fprintf(stderr, "%10.6f: ENTRY\n", dt());

    vector<char> v;
    for (int i = 0; i < 10'000'000; ++i)
        v.push_back('A' + i % 26);

    // 98 spaces + 1 character + '\n' = 100 bytes per entry.
    string pad(98, ' ');
    fprintf(stderr, "%10.6f: INIT\n", dt());

    fmt::print(pad + "{}\n", fmt::join(v, "\n" + pad));
    fprintf(stderr, "%10.6f: DONE\n", dt());
    return 0;
}
matt@dworkin:fmt_test$ g++ -o mem_fmt -O3 -I ../fmt/include/ mem_fmt.cpp ../fmt/libfmt.a
matt@dworkin:fmt_test$ ./mem_fmt > /dev/null
0.000000: ENTRY
0.034582: INIT
1.769687: DONE
[from another window whilst it's running]
matt@dworkin:fmt_test$ ps -aux | egrep 'COMMAND|mem_fmt' | grep -v grep
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
matt 30292 2.8 6.2 1097864 999208 pts/0 S+ 17:40 0:01 ./mem_fmt
Note the VSZ of 1,097,864 KiB, i.e. roughly 1.1 GB.
First, let's address your example. The current version of {fmt} has an optimization that allows writing directly into a stream buffer. Right now it is only enabled for fundamental and string types. Once enabled for join_view in this commit, no additional dynamic memory will be allocated in your example; fmt::print will just use the C stream buffer.
Unlike the ostream_iterator approach, it will also be faster.
Before:
% time ./a.out > /dev/null
...
./a.out > /dev/null 0.23s user 0.38s system 71% cpu 0.857 total
After:
% time ./a.out > /dev/null
...
./a.out > /dev/null 0.12s user 0.01s system 96% cpu 0.135 total
This optimization is also proposed (and accepted) for std::print in P3107R5, "Permit an efficient implementation of std::print".
In older versions of {fmt} you can just replace fmt::join with writing lines individually; fmt::join provides no benefit in your case anyway.
Now to the questions:
Q1. I do see some references in the docs to fmt::format_to(). Can that somehow be used to start streaming and discarding the result before the formatting is complete and thereby avoid the in-core composition of the full result?
Yes. In general, formatting functions, including format_to, write into a fixed-size buffer (print was an exception, but that is being fixed as described above). They might still need to allocate for a single argument (but not the full output) if you use padding.
Q2. Continuing along this line of exploration, instead of passing a container, is there a way I can pass, say, two iterators (that perhaps point at the beginning and ending of a very large file) and pump that data through {fmt} for processing (and thereby avoid having to first read the entire file into memory)?
Yes. {fmt} iterates over a range element by element and supports single-pass input iterators. So you can read lazily and discard parts of the input after they have been consumed to save memory. Iterators can be passed as part of a range or via fmt::join.