Summary of some performance tests with floating-point numbers
on my standard desktop computer
Thu Feb 11 07:59:16 CET 2016

Executive summary: doubles are faster than floats, except when floats are faster.

Hardware:
Intel Core i7-4790K ("Haswell Refresh" from the summer of 2014)
ASUS Z97-DELUXE ATX
32 GB of memory (4 * 8 GB Corsair Vengeance 1600 MHz)
No overclocking

I ran one billion operations, alternating multiplication and division, in varying numbers of threads. On my processor, with four cores and hyperthreading, the best performance was usually found with eight parallel threads. That is what you would expect from four cores with two hardware threads each.

On this computer a float is 32 bits and a double is 64 bits.

With operations on a single variable (on the C source code level), where one might expect that everything is done in processor registers, the best performance measured was 1533.03 Mflops with float, and 6205.27 Mflops with double. Using doubles was four times faster than using floats, in spite of floats being 32 bits and doubles 64. Here I would guess that the main memory is not used at all, and that performance is limited by floating-point calculations in the processor.

If I have to guess the reason, based on what I think I remember having read, it is that floating-point calculations are done in fast hardware with double precision or more. Single-precision float calculations are done the same way, but then the values have to be converted from, and back to, float.

If, instead, we have a billion different floating-point numbers stored in memory, and each number must be loaded from, and stored back in, main memory, it is the other way around, with floats being faster than doubles (by about 45 percent). In this case, the best measured float performance was 2315.10 Mflops, and double 1592.77 Mflops.
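
The two kinds of test described above could look roughly like the sketch below. This is not the benchmark program that produced the numbers; it is a minimal single-threaded illustration, and the function names (churn_*, sweep_*, mflops) and the factor 3.0 are my own assumptions. It also does nothing to stop an optimizing compiler from simplifying the single-variable loop, which a real benchmark would have to handle.

```c
#include <stddef.h>

/* Assumed factor, chosen only so that the arithmetic is exact for simple inputs. */
#define FACTOR 3.0

/* Single-variable case: alternate multiplication and division on one value,
   which the compiler can be expected to keep in a register. */
float churn_float(float x, size_t pairs)
{
    for (size_t i = 0; i < pairs; i++) {
        x *= (float)FACTOR;
        x /= (float)FACTOR;
    }
    return x;
}

double churn_double(double x, size_t pairs)
{
    for (size_t i = 0; i < pairs; i++) {
        x *= FACTOR;
        x /= FACTOR;
    }
    return x;
}

/* Memory case: every element is loaded, multiplied or divided, and stored back. */
void sweep_float(float *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i % 2 == 0)
            a[i] *= (float)FACTOR;
        else
            a[i] /= (float)FACTOR;
    }
}

void sweep_double(double *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i % 2 == 0)
            a[i] *= FACTOR;
        else
            a[i] /= FACTOR;
    }
}

/* Millions of floating-point operations per second. */
double mflops(double n_ops, double seconds)
{
    return n_ops / seconds / 1e6;
}
```

Timing (for example with clock_gettime) and splitting the work across threads are left out. Each pair in the churn functions counts as two operations, and each array element in the sweep functions as one.
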
In that memory test I think performance is limited by memory bandwidth, and it is not unreasonable that floats (in this case 4 gigabytes of data) are faster than doubles (in this case 8 gigabytes of data). But note that floats were not twice as fast.

But, as always, it must be remembered that benchmarks depend on many things: hardware, benchmark software, compiler, compiler settings, the CPU cooler, etcetera. On other hardware, for example with different implementations of floating-point calculations, we might get completely different results.

-- 
Thomas Padron-McCarthy, tel +46(0)707347013, http://www.aass.oru.se/~tpy/