I have code that works with large data blocks that can have different layouts. The layout determines which part of the data is fixed and which part is not. Once data is fixed in a block, it normally doesn't change anymore, so all code reading that data will always see the same values.
However, other services may make changes to these blocks as long as they are sure that no code will read that part of the block. To simplify the code, blocks that contain a change are sent from one service to the other, regardless of the layout of the block. The receiving service then overwrites the complete block, including the data that was not changed. Let me illustrate this with an example:
Suppose we have the following block of data:
| 57 | 23 | 98 | 17 | 25 | 00 | 00 | 00 | 00 | 00 |
|----|----|----|----|----|----|----|----|----|----|
And imagine that the first 5 values are 'fixed'. Code in our service will only read the first 5 values and will never read the next 5 values. We can guarantee this due to the design of our architecture. The next 5 values don't hold meaningful data yet, so I put zeroes in the table to illustrate this.
Now another service determines the next 5 values, sends the complete block to our service, and we simply overwrite the complete block with the new data. Since the first 5 values were 'fixed', they remain the same, but the code that transfers and overwrites the block doesn't know about its layout, so the only thing it can do is overwrite the complete block. This is the result:
| 57 | 23 | 98 | 17 | 25 | 08 | 33 | 42 | 71 | 85 |
|----|----|----|----|----|----|----|----|----|----|
As said before, the first 5 values did not change, although they were overwritten by the transfer logic.
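To make the scenario concrete, here is a minimal sketch of the pattern in code. All names (`block`, `sum_fixed_part`, `apply_update`) and the byte-sized values are made up for illustration; the real blocks and layouts are of course larger and more varied.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout: a 10-byte block whose first 5 bytes are "fixed".
constexpr std::size_t kBlockSize = 10;
constexpr std::size_t kFixedSize = 5;

uint8_t block[kBlockSize] = {57, 23, 98, 17, 25, 0, 0, 0, 0, 0};

// Reader threads in our service: only ever touch the fixed prefix.
uint32_t sum_fixed_part()
{
    uint32_t sum = 0;
    for (std::size_t i = 0; i < kFixedSize; ++i)
        sum += block[i];
    return sum;
}

// Transfer code: overwrites the whole block with the incoming copy,
// including the prefix bytes that happen to hold the same values.
void apply_update(const uint8_t (&incoming)[kBlockSize])
{
    std::memcpy(block, incoming, kBlockSize);   // plain, non-atomic stores
}
```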
The question is: is this a data race? Is it allowed to overwrite a memory address with exactly the same value while other threads can read the data at the same time?
> Is this a data race?
Yep.
> Is it allowed to overwrite a memory address with exactly the same value if other threads can read the data at the same time?
Not explicitly - and that's not the only issue, either.
If your compiler actually performs a single 8-byte load, you have a real (i.e. not just a theoretical) data race on the last 3 bytes of that load. Say you have a hypothetical machine where the `uint64_t` value `57 23 98 17 25 00 00 42` is a trap representation, your update thread used `memmove`, and the copy happens to run backwards: the reader's wide load could observe exactly that half-written, trapping value.
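As a sketch of how such a wide load can arise (the function name and the decision to widen are hypothetical; whether your compiler really does this depends on the target and the optimizer):

```cpp
#include <cstdint>

extern uint8_t block[10];   // shared with the transfer thread

// As written, this only reads the 5 "fixed" bytes. But the compiler can see
// that block[5..7] are valid storage, so on a race-tolerant target it may
// fuse the loop into one 8-byte load plus a mask -- a load that overlaps
// the bytes the transfer thread is concurrently storing to.
uint64_t fixed_prefix_as_int()
{
    uint64_t v = 0;
    for (int i = 0; i < 5; ++i)
        v |= static_cast<uint64_t>(block[i]) << (8 * i);
    return v;
}
```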
However, a data race means behaviour is undefined by the standard. It may be well-defined on a particular platform - such as any platform which doesn't have integer trap representations, any platform where you know the compiler will really use byte loads, or any platform with explicit semantics for idempotent stores.
See, for example, [intro.races] note 23:

> Transformations that introduce a speculative read of a potentially shared memory location might not preserve the semantics of the C++ program as defined in this document, since they potentially introduce a data race. However, they are typically valid in the context of an optimizing compiler that targets a specific machine with well-defined semantics for data races. They would be invalid for a hypothetical machine that is not tolerant of races or provides hardware race detection.
(there's no such note for non-speculative races and no specific exception for your idempotent stores, but it's reasonable to take the same approach there IMO).
Obviously, if you can write your code so it doesn't depend on these platform details, it will be more portable and less fragile in the face of compiler and/or platform updates. Just atomically swapping between two versions of a block (so you read from a copy that is guaranteed to be unchanging, and write to a copy that is guaranteed not to be shared) will always be correct, and may even be faster because it reduces cache/coherency traffic.
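A minimal sketch of that swap, assuming a single updating thread and purely illustrative names (`Block`, `g_current`, `publish_update`):

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Two copies of the block; readers only ever dereference the published one.
struct Block {
    uint8_t bytes[10];
};

Block g_buffers[2];
std::atomic<Block*> g_current{&g_buffers[0]};

// Reader: takes a snapshot pointer; the block behind it is not written
// while it is the published copy.
uint8_t read_fixed_byte(std::size_t i)
{
    const Block* b = g_current.load(std::memory_order_acquire);
    return b->bytes[i];   // i < 5 by the layout contract
}

// Single transfer thread: fills the spare copy, then publishes it with one
// atomic store, so no byte of a reader-visible block is ever overwritten.
void publish_update(const uint8_t (&incoming)[10])
{
    Block* current = g_current.load(std::memory_order_relaxed);
    Block* spare = (current == &g_buffers[0]) ? &g_buffers[1] : &g_buffers[0];
    std::memcpy(spare->bytes, incoming, sizeof spare->bytes);
    g_current.store(spare, std::memory_order_release);
}
```

The sketch glosses over reclamation: before the transfer thread reuses a buffer, it has to know that no reader still holds a pointer to it (for example by bounding how long readers keep a snapshot, or with an RCU-style grace period).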