I am trying to write some containers in C++ that are supposed to handle large amounts of data.
However, I've hit a bit of a block.
My vector implementation falls back to being a stable vector (leveraging virtual memory to avoid copies) once it reaches the 128KB capacity threshold; below that, it is backed by an allocator.
My problem is that there is a range (from the page size, which is normally 4KB, to 64KB) in which copies get exponentially more expensive.
I figured that since the page is already filled with data, I could just change the virtual address through which I access it, from an address covering a range of size x to one covering a range of size 2x (the addresses would be provided by the allocator, but that's beside the point; just know that they are passed in by the user). That way I'd avoid the copy.
Is there any way of achieving what I want to do?
I'd also love it to be a solution that works without admin privileges.
I've looked on the internet and found nothing. In theory it should be possible, since it would only involve changing some entries in the MMU, but I can't find any way of doing it.
Mind you, I'm talking about Windows; for Linux I've already seen some very useful answers.
As mentioned by Raymond Chen in the comments section, you can allocate memory the following way:
1. Call CreateFileMapping, specifying the value INVALID_HANDLE_VALUE as the hFile parameter.
2. Call MapViewOfFileEx to map the allocated memory into the calling process' virtual address space, optionally specifying the address to which it is to be mapped.
With memory allocated as described above, you can "move" the memory allocation to a different address without having to perform a copy, by performing the following steps:
1. Call UnmapViewOfFile on the old memory address.
2. Call MapViewOfFileEx again, specifying the new address.
Here is an example:
    #define WIN32_LEAN_AND_MEAN
    #include <Windows.h>
    #include <iostream>
    #include <stdexcept>
    #include <cstring>

    int main() try
    {
        SYSTEM_INFO si;
        char *buffer;

        // get allocation granularity
        GetSystemInfo( &si );

        // create file mapping backed by system paging file
        HANDLE hFileMapping = CreateFileMapping( INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE | SEC_COMMIT, 0, si.dwAllocationGranularity, NULL );
        if ( hFileMapping == NULL )
        {
            throw std::runtime_error( "CreateFileMapping failed!" );
        }

        // map first view
        buffer = static_cast<char*>( MapViewOfFileEx( hFileMapping, FILE_MAP_ALL_ACCESS, 0, 0, 0, NULL ) );
        if ( buffer == NULL )
        {
            throw std::runtime_error( "MapViewOfFileEx failed!" );
        }

        // write string into first view and print address of string
        std::cout <<
            "Writing string to address " << static_cast<void*>( buffer ) << ".\n" <<
            '\n';
        std::strcpy( buffer, "This is a string." );

        // read and print string from first view
        std::cout <<
            "Reading string from address " << static_cast<void*>( buffer ) << ":\n" <<
            buffer << '\n' <<
            '\n';

        // unmap first view
        UnmapViewOfFile( buffer );

        // map second view at a different address, which is hard-coded this time
        buffer = static_cast<char*>( MapViewOfFileEx( hFileMapping, FILE_MAP_ALL_ACCESS, 0, 0, 0, reinterpret_cast<LPVOID>( 0x3000000 ) ) );
        if ( buffer == NULL )
        {
            throw std::runtime_error( "MapViewOfFileEx failed!" );
        }

        // read and print string from second view
        std::cout <<
            "Reading string from address " << static_cast<void*>( buffer ) << ":\n" <<
            buffer << '\n' <<
            '\n';

        // cleanup
        UnmapViewOfFile( buffer );
        CloseHandle( hFileMapping );
    }
    catch ( const std::runtime_error &e )
    {
        std::cerr << "Runtime error: " << e.what() << '\n';
    }
This program has the following output:
    Writing string to address 00000000000D0000.

    Reading string from address 00000000000D0000:
    This is a string.

    Reading string from address 0000000003000000:
    This is a string.
As can be seen, the memory containing the string was successfully read from the new virtual memory address, after being written to the old virtual memory address. This is because both virtual memory addresses most likely represent the same physical memory address.
However, it is not guaranteed that both virtual memory addresses represent the same physical memory address. It is theoretically possible that the process was suspended due to preemptive multitasking after reading the string from the first virtual memory address, and resumed only immediately before reading the string from the second virtual memory address. If another process ran while the process was suspended, and if that other process used so much memory that the physical memory containing the string was swapped out to the system paging file, then the two virtual memory addresses mentioned above may never have represented the same physical memory address.
Even if this happens, the behavior of the program will still be correct. I mention it only because there is no guarantee that a copy did not take place: it is always possible for physical memory to be swapped to the system paging file and back (which are two very slow copy operations). But it is highly likely that this did not happen and that the same physical address was used by both virtual memory addresses, so that no copy took place.
This possibility of virtual memory being swapped from physical memory to the system paging file and back to physical memory is inherent to all virtual memory, including memory allocated by std::malloc and the default C++ memory allocator. This swapping can be prevented with VirtualLock, but the use of this function is generally not recommended, as it will likely degrade performance, because the operating system is generally best at managing physical memory.
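One practical detail about the hard-coded address in the example above: according to the MapViewOfFileEx documentation, a non-NULL base address must be a multiple of the system's allocation granularity (SYSTEM_INFO::dwAllocationGranularity), otherwise the call fails. Since your target addresses come from elsewhere, a small check along the following lines may be useful. This is only a sketch: the helper names is_aligned and align_down are made up for illustration, and the granularity is passed in as a parameter because the real value should be read with GetSystemInfo at run time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not part of the Windows API.
// 'granularity' must be a power of two, which is the case for
// SYSTEM_INFO::dwAllocationGranularity.

// Returns true if 'address' is a multiple of 'granularity'.
inline bool is_aligned( std::uintptr_t address, std::uintptr_t granularity )
{
    assert( granularity != 0 && ( granularity & ( granularity - 1 ) ) == 0 );
    return ( address & ( granularity - 1 ) ) == 0;
}

// Rounds 'address' down to the nearest multiple of 'granularity'.
inline std::uintptr_t align_down( std::uintptr_t address, std::uintptr_t granularity )
{
    assert( granularity != 0 && ( granularity & ( granularity - 1 ) ) == 0 );
    return address & ~( granularity - 1 );
}
```

With an assumed granularity of 64 KiB, is_aligned( 0x3000000, 64 * 1024 ) is true, so the address from the example is acceptable, whereas 0x3001000 is not and align_down would round it to 0x3000000.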
In the previous section, I answered your question on how to move virtual memory without performing a copy. However, I doubt whether this is the best solution to your problem, because you may not need to move memory at all.
A better solution may be to reserve a big chunk of contiguous memory addresses in the virtual address space of the program for future use by your container. This memory reservation does not cost any physical memory, as long as you are not using the reserved memory.
For example, you could reserve several megabytes or even gigabytes of memory for future use by your container, but only actually use a single memory page until your container has grown so much that it needs further memory pages.
Unfortunately, if you use this method with memory allocated with CreateFileMapping (which is the method used in the previous section of this answer), then growing the container is easy, but shrinking it is a bit of a problem. This is because, when using CreateFileMapping, you cannot put a memory page back into the reserved state once it is in the committed state. But you can call VirtualAlloc with MEM_RESET on pages whose data you no longer need. This tells the system that the contents are no longer of interest, so the pages need not be written to or read back from the paging file (although they still count as committed).
For this reason, it may be better not to use CreateFileMapping, but rather to use VirtualAlloc for both reserving and committing the memory. That way, you can also decommit the memory using VirtualFree with the MEM_DECOMMIT parameter, in order to return it to the reserved state.
For example, to reserve 100 MiB of memory but commit only the first 4096-byte memory page for use (assuming that is the size of a memory page), you could use the following function calls:
    LPVOID buffer = VirtualAlloc( NULL, 100 * 1024 * 1024, MEM_RESERVE, PAGE_NOACCESS );
    VirtualAlloc( buffer, 4096, MEM_COMMIT, PAGE_READWRITE );
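To make the reserve-then-commit idea a bit more concrete, here is a rough sketch of the bookkeeping a container might do when it grows or shrinks inside a reserved region. The names round_up_to_page, ByteRange, range_to_commit and range_to_decommit are hypothetical, and the page size is passed in as a parameter (on Windows it would come from SYSTEM_INFO::dwPageSize). The Windows calls appear only in comments, so the arithmetic can be tried on any platform.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; not part of the Windows API.

// Rounds 'bytes' up to the next multiple of 'page_size'
// ('page_size' must be a power of two).
inline std::size_t round_up_to_page( std::size_t bytes, std::size_t page_size )
{
    assert( page_size != 0 && ( page_size & ( page_size - 1 ) ) == 0 );
    return ( bytes + page_size - 1 ) & ~( page_size - 1 );
}

// A byte range relative to the start of the reserved region.
struct ByteRange
{
    std::size_t offset;
    std::size_t length; // 0 means "nothing to do"
};

// Range that must be committed so that 'needed' bytes are usable, given that
// the first 'committed' bytes (a multiple of the page size) already are.
inline ByteRange range_to_commit( std::size_t committed, std::size_t needed, std::size_t page_size )
{
    const std::size_t target = round_up_to_page( needed, page_size );
    if ( target <= committed )
        return { committed, 0 };
    // On Windows: VirtualAlloc( base + offset, length, MEM_COMMIT, PAGE_READWRITE );
    return { committed, target - committed };
}

// Range that can be decommitted when only 'needed' bytes remain in use.
inline ByteRange range_to_decommit( std::size_t committed, std::size_t needed, std::size_t page_size )
{
    const std::size_t target = round_up_to_page( needed, page_size );
    if ( target >= committed )
        return { committed, 0 };
    // On Windows: VirtualFree( base + offset, length, MEM_DECOMMIT );
    return { target, committed - target };
}
```

For instance, with one 4096-byte page committed, growing to 10000 bytes gives a commit range starting at offset 4096 with a length of 8192. Because the region was reserved up front, the start address never changes and no element has to be copied.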
Alternatively, you may want to use virtual memory placeholders, which is a new feature that was introduced in Windows 10 Version 1803. See this blog article (which was written by Raymond Chen) for further information.