Tags: performance, memory, latency, cpu-cache, low-latency

Approximate cost to access various caches and main memory?


Can anyone give me the approximate time (in nanoseconds) to access L1, L2 and L3 caches, as well as main memory on Intel i7 processors?

While this isn't specifically a programming question, knowing these kinds of speed details is necessary for some low-latency programming challenges.


Solution

  • Here is a Performance Analysis Guide for the i7 and Xeon range of processors. I should stress that it has what you need and more (for example, see page 22 for some timings and cycle counts).

    Additionally, this page has some details on clock cycles. The second link gives the following numbers:

    Core i7 Xeon 5500 Series Data Source Latency (approximate)               [Pg. 22]
    
    local  L1 CACHE hit,                              ~4 cycles (   2.1 -  1.2 ns )
    local  L2 CACHE hit,                             ~10 cycles (   5.3 -  3.0 ns )
    local  L3 CACHE hit, line unshared               ~40 cycles (  21.4 - 12.0 ns )
    local  L3 CACHE hit, shared line in another core ~65 cycles (  34.8 - 19.5 ns )
    local  L3 CACHE hit, modified in another core    ~75 cycles (  40.2 - 22.5 ns )
    
    remote L3 CACHE (Ref: Fig.1 [Pg. 5])        ~100-300 cycles ( 160.7 - 30.0 ns )
    
    local  DRAM                                                   ~60 ns
    remote DRAM                                                  ~100 ns
    

    EDIT2:
    The most important part is the notice under the cited table, which says:

    "NOTE: THESE VALUES ARE ROUGH APPROXIMATIONS. THEY DEPEND ON CORE AND UNCORE FREQUENCIES, MEMORY SPEEDS, BIOS SETTINGS, NUMBERS OF DIMMS, ETC,ETC..YOUR MILEAGE MAY VARY."

    EDIT: I should highlight that, besides timing and cycle information, the above Intel document covers many more (extremely useful) details of the i7 and Xeon range of processors from a performance point of view.
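
The nanosecond ranges in the table appear to be the cycle counts divided by the core clock at the slow and fast ends of the processor range. Below is a minimal sketch of that conversion. The two clock speeds (about 1.867 GHz and 3.333 GHz) are my assumption, back-calculated from the table's own figures rather than quoted from the Intel guide, and the function names are purely illustrative. With those assumed clocks the results agree with the table to within about 0.1 ns.

```python
# Convert cache-latency cycle counts to nanoseconds at a given core clock.
# ASSUMPTION: the two clock speeds below are inferred from the table's
# nanosecond ranges; they are not taken from the Intel guide itself.
SLOW_GHZ = 1.867  # assumed slowest core clock in the range
FAST_GHZ = 3.333  # assumed fastest core clock in the range

# Approximate cycle counts copied from the table above.
LATENCY_CYCLES = {
    "L1 hit": 4,
    "L2 hit": 10,
    "L3 hit, line unshared": 40,
    "L3 hit, shared line in another core": 65,
    "L3 hit, modified in another core": 75,
}


def cycles_to_ns(cycles, ghz):
    """A clock of `ghz` GHz completes `ghz` cycles per nanosecond."""
    return cycles / ghz


def latency_range_ns(cycles):
    """Return (ns at the slow clock, ns at the fast clock)."""
    return cycles_to_ns(cycles, SLOW_GHZ), cycles_to_ns(cycles, FAST_GHZ)


for name, cycles in LATENCY_CYCLES.items():
    slow_ns, fast_ns = latency_range_ns(cycles)
    print(f"{name:<38} ~{cycles:>2} cycles ({slow_ns:5.1f} - {fast_ns:4.1f} ns)")
```

To estimate the figures for your own CPU, replace the two constants with its actual core clock. The DRAM rows are given directly in nanoseconds in the table, so they are left out of the conversion.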