gpu nvidia tesla

Is there a relation between single and double precision in NVIDIA Tesla?


For the Tesla K20, the peak single-precision floating-point performance is about 3.52 TFlops, but the peak double-precision performance is 1.17 TFlops, so the ratio is 3. The Tesla K20X has 3.95 and 1.31 TFlops, and the Tesla K40 has 4.29 and 1.43 TFlops, so the ratio repeats. Is there a reason the ratio is 3 and not 2? A ratio of 2 would seem logical to me, given the difference in size between single and double precision (32 vs. 64 bits). I am learning about GPUs and GPGPU, so I don't know very much about this yet.

The second page of this PDF has a specs table: NVIDIA-Tesla-Kepler-Family-Datasheet.pdf


Solution

  • The models you listed are all based on the Kepler architecture (the GK110 chip), whose peak double-precision rate is 1/3 of its peak single-precision rate. This is simply how NVIDIA built this piece of hardware: the ratio follows from how many double-precision units the chip has relative to single-precision ones, not from the width of the data types. For comparison, Fermi, the previous hardware generation, had a ratio of 1/2 between peak double- and single-precision rates.

    You may refer to NVIDIA documentation for instruction throughput, by instruction type and hardware generation:

    http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-instruction-throughput

    You will notice that consumer-grade products (GeForce GTX) typically have a much lower double-to-single-precision ratio: 1/8, 1/12, 1/24, or even 1/32, depending on the hardware version.
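As a quick sanity check on the arithmetic, here is a minimal Python sketch that recomputes the ratio from the datasheet figures quoted in the question. The per-SMX unit counts at the end (192 single-precision cores and 64 double-precision units) are not stated in this thread; they are an assumption taken from NVIDIA's published GK110 description, and the names `peak_tflops`, `sp_to_dp_ratio`, `SP_UNITS_PER_SMX`, and `DP_UNITS_PER_SMX` are made up for illustration.

```python
# Peak throughput in TFlops as quoted in the question: (single, double).
peak_tflops = {
    "Tesla K20": (3.52, 1.17),
    "Tesla K20X": (3.95, 1.31),
    "Tesla K40": (4.29, 1.43),
}


def sp_to_dp_ratio(single, double):
    """Return the single-to-double peak throughput ratio."""
    return single / double


# Ratio per model, computed only from the rounded datasheet numbers.
ratios = {
    name: sp_to_dp_ratio(sp, dp) for name, (sp, dp) in peak_tflops.items()
}

for name, ratio in ratios.items():
    print(f"{name}: SP/DP = {ratio:.2f}")

# ASSUMPTION (not from this thread): execution units per SMX on GK110.
SP_UNITS_PER_SMX = 192
DP_UNITS_PER_SMX = 64

# If peak rate scales with the number of units, the ratio is fixed by design.
unit_ratio = SP_UNITS_PER_SMX / DP_UNITS_PER_SMX
print(f"SP/DP units per SMX = {unit_ratio:.1f}")
```

Each computed ratio lands very close to 3 (the small deviations come from the datasheet values being rounded to two decimals), which matches the unit ratio under the stated assumption. On a chip with a different mix of units, the same calculation would give a different ratio, which is why the consumer parts mentioned above end up at 1/8 to 1/32.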