I am aware that multiple non-volatile loads can be collapsed into a single load by the compiler. Does this mean that memory mapping functions can create undefined behavior?
For example:
int k = *p;
unmapFile(fm, fileHandle1);//the unmapped memory includes the address pointed to by p
//p is no longer valid, pointing to a virtual address that is unmapped
mapFile(fm, fileHandle2);//p now points to an address somewhere in a different file
int j = *p;
//use k and j here
Can the two dereferences of p be collapsed into one, causing the code to behave unpredictably?
As an aside, I know file mapping functions don't allow you to specify the starting address of the map (at least in POSIX and Windows), but this was the simplest hypothetical I could come up with. There are indeed use cases for this paradigm.
Does this mean that memory mapping functions can create undefined behavior?
C itself does not define any such functions or speak to how they could or should work if defined. Therefore, as far as the C language spec is concerned, anything and everything associated with memory mapping functions has undefined behavior. There is no other answer at the level of generality of the question. As @AtsushiYokoyama observes, there are practical issues that may dissuade the compiler from combining the two reads of *p, but that should not be taken as a guarantee that any given compiler indeed won't combine them.
In a programming and execution environment that makes memory mapping functions available, most details related to their behavior, to the extent that they are defined, are implementation specific. Considering a POSIX environment supporting file mappings and the POSIX mmap() and munmap() functions as an example, POSIX says:
The munmap() function shall remove any mappings for those entire pages containing any part of the address space of the process starting at addr and continuing for len bytes. Further references to these pages shall result in the generation of a SIGSEGV signal to the process.
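To make that munmap() guarantee concrete, here is a minimal sketch of what "further references" look like in practice. It is only an illustration under assumptions: the names read_after_munmap_faults, on_segv and fault_jmp are mine, MAP_ANONYMOUS is a widely available extension rather than something the quoted text relies on, and as far as ISO C alone is concerned the read after munmap() has no defined meaning at all. Only the POSIX text above gives it one.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_jmp;

/* Hypothetical handler: jump back to the sigsetjmp() call instead of retrying the load. */
static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(fault_jmp, 1);
}

/*
 * Returns 1 if reading a page after munmap() raised SIGSEGV,
 * 0 if the read went through, and -1 if the setup failed.
 */
int read_after_munmap_faults(void)
{
    struct sigaction sa, old;
    volatile int *p;
    int faulted;
    long page = sysconf(_SC_PAGESIZE);

    assert(page > 0);

    void *addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    p = addr;
    *p = 42;                      /* the page is usable while mapped */

    if (munmap(addr, (size_t)page) != 0)
        return -1;                /* p now refers to an unmapped page */

    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(fault_jmp, 1) == 0) {
        int value = *p;           /* volatile read, so it cannot be optimized away */
        (void)value;
        faulted = 0;
    } else {
        faulted = 1;              /* we arrived here via the SIGSEGV handler */
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous disposition */
    return faulted;
}
```

On a POSIX-conforming system with nothing else mapping that page in the meantime, I would expect the function to return 1, but treat that as an expectation derived from the quoted text rather than something C promises.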
And it says:
The mmap() function shall establish a mapping between the address space of the process at an address pa for len bytes to the memory object represented by the file descriptor fildes at offset off for len bytes [...]
and
The mapping established by mmap() shall replace any previous mappings for those whole pages containing any part of the address space of the process starting at pa and continuing for len bytes.
It is the responsibility of the compiler to ensure that any optimizations it performs preserve program semantics as defined by the combination of all relevant specifications. I don't see any way that a compiler for a POSIX environment where your unmapFile() corresponds to munmap() and your mapFile() corresponds to mmap() could justify combining the two reads of *p. That doesn't mean that no compiler would, but it does mean that a compiler that did would risk producing program behavior that did not conform to POSIX. For this case, then, I'd strengthen my remark above to: there are practical issues that should dissuade the compiler from combining the two reads of *p.
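For anyone who wants to try the scenario from the question, the following is a rough sketch of it in POSIX terms. It is not taken from the question: read_across_remap and make_file_with_int are hypothetical names, the temp-file path is arbitrary, and I use MAP_FIXED so that the second mmap() lands on the same address and replaces the first mapping, as described in the last quote. Reading through a volatile-qualified pointer is my own belt-and-braces choice for code that does not want to depend on the reasoning above. It forces each read to be performed, and nothing in POSIX requires it.

```c
#define _DEFAULT_SOURCE /* exposes mkstemp() under strict -std= modes on glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an already-unlinked temp file that holds one int. */
static int make_file_with_int(int value)
{
    char path[] = "/tmp/remap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* the file lives on until fd is closed */
    if (write(fd, &value, sizeof value) != (ssize_t)sizeof value) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Maps a file containing `first`, reads *p into *k_out, remaps the same
 * address onto a file containing `second`, and reads *p into *j_out.
 * Returns 0 on success, -1 on any failure.
 */
int read_across_remap(int first, int second, int *k_out, int *j_out)
{
    size_t len = sizeof(int);
    void *addr = MAP_FAILED;
    void *again = MAP_FAILED;
    const volatile int *p;
    int rc = -1;
    int fd1 = make_file_with_int(first);
    int fd2 = make_file_with_int(second);

    if (fd1 < 0 || fd2 < 0)
        goto done;

    addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd1, 0);
    if (addr == MAP_FAILED)
        goto done;

    p = addr;
    *k_out = *p;                  /* first load: backed by the first file */

    /* MAP_FIXED replaces whatever was mapped at addr with the second file. */
    again = mmap(addr, len, PROT_READ, MAP_SHARED | MAP_FIXED, fd2, 0);
    if (again == MAP_FAILED)
        goto unmap;
    assert(again == addr);        /* a successful MAP_FIXED call returns the requested address */

    *j_out = *p;                  /* second load: same address, second file */
    rc = 0;

unmap:
    munmap(addr, len);
done:
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return rc;
}
```

Called as read_across_remap(11, 22, &k, &j), the two outputs should differ even though p never changes, which is exactly why collapsing the two loads into one would be observable.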