x86 intel cpu-architecture cpu-cache cache-invalidation

Invalidation of a cache line in the L1 cache


Suppose that a cache line containing variable X is present simultaneously in CPU0's L1d and CPU1's L1d. After CPU0 changes the value of X, CPU1's copy of the cache line is invalidated. Is it then impossible for CPU1 to copy the variable X from CPU0's L1d cache while CPU0 still holds the line with X? And even if this is not the case, I want to know if there are cases where CPU0 brings in CPU1'


Solution

  • The case described is not allowed. When a processor core executes a store to an address, the data is written to a "store buffer" for transfer to the cache at a later time. Before transferring data from the store buffer, the cache requires Exclusive access to the line -- a state that can exist in only one cache at a time.

    Three easy cases:

    1. If the core's cache already has exclusive access (i.e., the line is in the Exclusive or Modified state), then the store buffer can write the data to the cache at any time.
    2. If the core's cache has a valid copy of the line without exclusive access (such as the "Shared" state), the presence of new data in the store buffer will cause the cache to generate an "upgrade" request for the line. The upgrade to E or M state will not be granted until all other caches (or directories) acknowledge that they have invalidated any copies of that address.
    3. If the core's cache does not have a valid copy of the line (either no address match or an address match in the Invalid state), the cache will issue a "Read With Intent To Modify" request. This will result in the transfer of the current data for the cache line (whether in memory or from a modified copy in another core's cache) to the requesting core's cache, AND the invalidation of the cache line in every other cache in the system.
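    The three cases above can be sketched as a small decision function. This is an illustrative model only, not a real coherence-protocol implementation; the names `State`, `Action`, and `actionForStore` are invented for this sketch.

    ```cpp
    #include <cassert>

    // MESI-style line states as described in the answer.
    enum class State { Modified, Exclusive, Shared, Invalid };

    // What the cache must do before a buffered store can be committed.
    enum class Action {
        WriteToCache,            // case 1: already exclusive, just write
        UpgradeRequest,          // case 2: valid but shared, upgrade to E/M
        ReadWithIntentToModify   // case 3: no valid copy, issue RWITM
    };

    Action actionForStore(State lineState) {
        switch (lineState) {
            case State::Modified:
            case State::Exclusive:
                return Action::WriteToCache;            // case 1
            case State::Shared:
                return Action::UpgradeRequest;          // case 2
            case State::Invalid:
                return Action::ReadWithIntentToModify;  // case 3
        }
        return Action::ReadWithIntentToModify;  // unreachable with valid input
    }

    int main() {
        assert(actionForStore(State::Modified)  == Action::WriteToCache);
        assert(actionForStore(State::Exclusive) == Action::WriteToCache);
        assert(actionForStore(State::Shared)    == Action::UpgradeRequest);
        assert(actionForStore(State::Invalid)   == Action::ReadWithIntentToModify);
        return 0;
    }
    ```

    In every case, the store cannot become visible until the writing core's cache is the only valid holder of the line, which is what rules out the scenario in the question.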

    If two cores execute store instructions "at the same time", the details of the implementation will result in one of the two cores obtaining exclusive access. The other core will have its request "rejected" (NACK'd), and it will retry the request until the first core+cache has completed its upgrade of the cache line state and update of the data. This mechanism forces all stores to a single address to be processed sequentially, even if they are issued concurrently.

    In general, it is not possible for a user to reliably make something happen "at the same time" on two cores (or to detect whether it happened at the same time), but implementations have to account for the possibility via the serialization process described above.
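    A software-visible consequence of this hardware serialization can be shown with a small C++ sketch (my own example, not from the answer): two threads hammering the same atomic counter never lose an increment, because each read-modify-write must first gain exclusive ownership of the line, so the updates are forced into some sequential order.

    ```cpp
    #include <atomic>
    #include <cassert>
    #include <thread>

    int main() {
        std::atomic<long> counter{0};
        constexpr int kIters = 100000;

        // Both threads store to the same cache line "at the same time";
        // the coherence protocol serializes the underlying RFO requests.
        auto work = [&counter] {
            for (int i = 0; i < kIters; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        };

        std::thread t0(work), t1(work);
        t0.join();
        t1.join();

        // Every increment survives: 2 * 100000.
        assert(counter.load() == 2 * kIters);
        return 0;
    }
    ```

    Note this holds only because `fetch_add` is a single atomic read-modify-write; a plain non-atomic `counter = counter + 1` from two threads would be a data race, since the line can change hands between the load and the store.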