Following up on a similar question from some months ago, "combination of monotonic buffer and unsynchronized memory pool" (https://stackoverflow.com/questions/77271609/c17-combination-of-monotonic-buffer-and-unsynchronized-memory-pool), I am trying to combine a monotonic buffer and an unsynchronized pool resource to allocate memory efficiently without going to the heap.
#include <iostream>
#include <memory_resource>
#include <vector>

class A {
    char array[64];
};

typename std::aligned_storage<32000>::type storage;
thread_local std::pmr::monotonic_buffer_resource bufferPool{&storage, sizeof(storage), std::pmr::null_memory_resource()};
std::pmr::unsynchronized_pool_resource pool{{}, &bufferPool};

int main()
{
    for (int i = 0; i < 10000; ++i) {
        std::pmr::vector<A> vec(&pool);
        vec.reserve(1000);
        for (int j = 0; j < 1000; ++j) {
            vec.emplace_back();
        }
    }
}
Based on the previous answer, I expected the memory block to be returned to the pool and reused for the next allocation. But this does not happen. Subsequent allocations go to the monotonic buffer again, which behaves monotonically, and after some iterations memory runs out. What am I missing? And why does the example below work and reuse the same memory in each iteration?
#include <iostream>
#include <memory_resource>

int main() {
    char buffer[1024];
    std::pmr::monotonic_buffer_resource monotonicResource(buffer, sizeof(buffer));
    // Use an unsynchronized_pool_resource on top of the monotonic_buffer_resource
    std::pmr::unsynchronized_pool_resource poolResource(&monotonicResource);
    // Allocate and deallocate memory using the unsynchronized_pool_resource
    for (int i = 0; i < 5; ++i) {
        std::cout << "Iteration " << i << ":\n";
        // Allocate memory
        void* ptr = poolResource.allocate(200); // request 200 bytes from the pool
        std::cout << "Allocated at: " << ptr << "\n";
        // Deallocate memory (returning it to the pool! Not the monotonic resource)
        poolResource.deallocate(ptr, 200);
        std::cout << "Deallocated.\n\n";
    }
    return 0;
}
What is the difference between them?
Pool resource implementations typically apply two optimizations that make them efficient for many small allocations but very poor for large ones.
The allocator works, but to allocate 64 KB of memory on MSVC it requests close to 720 KB from its upstream resource, which the small 32 KB buffer cannot satisfy. Because that buffer's own upstream is null_memory_resource, the request fails with std::bad_alloc instead of falling back to the heap.
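To see how much a pool actually pulls from its upstream on your own toolchain, you can put a counting resource between the pool and the heap. This is a minimal sketch, not part of the original answer: CountingResource and bytes_requested_for_reserve are hypothetical names introduced here, and the number reported differs between standard library implementations.

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical helper (not a standard class): forwards every request to an
// upstream resource and records how many bytes were asked for.
class CountingResource : public std::pmr::memory_resource {
public:
    explicit CountingResource(
        std::pmr::memory_resource* upstream = std::pmr::new_delete_resource())
        : upstream_(upstream) {}

    std::size_t total_bytes() const { return total_bytes_; }

private:
    void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        total_bytes_ += bytes;
        return upstream_->allocate(bytes, alignment);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override {
        upstream_->deallocate(p, bytes, alignment);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::pmr::memory_resource* upstream_;
    std::size_t total_bytes_ = 0;
};

struct A {
    char array[64];
};

// Returns the total number of bytes the pool requested from upstream in order
// to serve a single reserve(1000) on a vector of 64-byte elements.
std::size_t bytes_requested_for_reserve() {
    CountingResource counter;  // backed by the global heap
    std::pmr::unsynchronized_pool_resource pool(&counter);
    {
        std::pmr::vector<A> vec(&pool);
        vec.reserve(1000);  // asks the pool for 64000 bytes
    }
    return counter.total_bytes();
}
```

Printing bytes_requested_for_reserve() from main shows the real upstream demand for one 64000-byte request. Any fixed buffer placed under the pool would have to be at least that large.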
The solution is simple: don't use it for large vectors. You can reuse the vector's memory without a pmr allocator. Just declare the vector outside the for loop, or make it a member of the class.
std::vector<A> vec;
vec.reserve(1000);
for (int i = 0; i < 10000; ++i) {
    vec.clear();
    for (int j = 0; j < 1000; ++j) {
        vec.emplace_back();
    }
}
Those pmr allocators are more useful for smaller allocations, as in map, unordered_map, or list, where each insertion allocates a small node of constant size. They are also useful for small vectors, closer to 10 elements.
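As an illustration of the node-based case, here is a minimal sketch that repeatedly fills and destroys a small pmr::list through a pool sitting on a stack buffer. The function name churn_small_nodes and the 16 KB buffer size are assumptions made for this example. The arena keeps the default upstream, so if the pool's chunks outgrow the buffer on a given implementation it falls back to the heap instead of throwing.

```cpp
#include <cstddef>
#include <list>
#include <memory_resource>

// Hypothetical example function: builds and tears down a short list in each
// round. Freed nodes go back to the pool, so later rounds can reuse them.
int churn_small_nodes(int rounds) {
    std::byte buffer[16 * 1024];  // assumed size, chosen only for illustration
    // Upstream defaults to std::pmr::get_default_resource(), normally the heap.
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer));
    std::pmr::unsynchronized_pool_resource pool(&arena);

    int sum = 0;
    for (int i = 0; i < rounds; ++i) {
        std::pmr::list<int> values(&pool);
        for (int j = 0; j < 10; ++j) {
            values.push_back(j);  // one small, fixed-size node per element
        }
        for (int v : values) {
            sum += v;
        }
    }  // the list's nodes are returned to the pool at the end of each round
    return sum;
}
```

Every node has the same size, so all requests land in one pool and freed blocks are handed out again in the next round.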
If you still want to use it for a large vector, set max_blocks_per_chunk in pool_options to 1 and let the allocator draw memory from the unlimited global heap, because you cannot know in advance how much memory it is going to need.
Also, gcc and clang default largest_required_pool_block to 4 MB and 1 MB respectively, so raise it in pool_options if you need larger pooled allocations.
std::pmr::monotonic_buffer_resource bufferPool{};
std::pmr::unsynchronized_pool_resource pool{std::pmr::pool_options{1,1024*1024*1024},&bufferPool};
Even with only 1 block per chunk, it allocates 140 KB to satisfy the 64 KB allocation. Don't try to estimate in advance how much memory you will need. You can allocate a small buffer on the stack (around 10 KB) to handle the small allocations of small vectors, but you must allow it to fall back to the global heap for large vectors.
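The fall-back behaviour can be checked with a small sketch. It uses a monotonic_buffer_resource directly, without the pool, so the result does not depend on implementation-specific chunk sizes. The names inside and served_from_stack_buffer are hypothetical helpers written for this example.

```cpp
#include <cstddef>
#include <functional>
#include <memory_resource>
#include <vector>

// True if p points into [buf, buf + size). std::less gives a well-defined
// ordering even for pointers into unrelated objects.
bool inside(const void* p, const std::byte* buf, std::size_t size) {
    std::less<const void*> lt;
    return !lt(p, buf) && lt(p, buf + size);
}

// Creates a vector of n ints on a 10 KB stack buffer with heap fall-back and
// reports whether the vector's storage ended up inside the stack buffer.
bool served_from_stack_buffer(std::size_t n) {
    alignas(std::max_align_t) std::byte buffer[10 * 1024];
    std::pmr::monotonic_buffer_resource arena(
        buffer, sizeof(buffer), std::pmr::new_delete_resource());

    std::pmr::vector<int> vec(&arena);
    vec.resize(n);  // a single allocation of n * sizeof(int) bytes
    return inside(vec.data(), buffer, sizeof(buffer));
}
```

A 10-element vector fits in the stack buffer, while a 100000-element vector is too large for it and is served by the heap through new_delete_resource. Passing null_memory_resource as the upstream would turn that second case into a std::bad_alloc.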