I noticed that gcc on cygwin64 emits a double-precision divide after every log2 call, so I wrote a small program to illustrate it. Both loops give the same results to the 20 decimal places that are printed:
#include <stdio.h>
#include <stdint.h>
#include <math.h>

int main() {
    uint32_t i;

    /* First loop: library log2 */
    for (i = 1 ; i <= 10 ; i++)
        printf("log %u: %.20f\n", i, log2((double)i));

    /* Second loop: natural log scaled by log2(e) */
    for (i = 1 ; i <= 10 ; i++)
        printf("log %u: %.20f\n", i, 1.442695040888963387 * log((double)i));
}
The relevant part of the generated assembly for the first loop shows that log2 is apparently computed by calling the natural log and then dividing (divsd) to convert to base 2:
.L2:
pxor %xmm0, %xmm0
cvtsi2sdl %ebx, %xmm0
call log
movl %ebx, %edx
movq %rsi, %rcx
addl $1, %ebx
divsd %xmm6, %xmm0
movq %xmm0, %r8
movapd %xmm0, %xmm2
call printf
cmpl $11, %ebx
jne .L2
And this is the second loop, where the conversion is a multiply (mulsd) instead:
.L3:
pxor %xmm0, %xmm0
cvtsi2sdl %ebx, %xmm0
call log
movl %ebx, %edx
movq %rsi, %rcx
addl $1, %ebx
mulsd %xmm6, %xmm0
movq %xmm0, %r8
movapd %xmm0, %xmm2
call printf
cmpl $11, %ebx
jne .L3
The question is: why does cygwin64 divide by ln(2) instead of multiplying by 1 / ln(2) = log2(e)? Aren't double-precision multiplies a lot faster than double-precision divides? Is the divide specific to cygwin64, or would all versions of gcc use a divide instead of a multiply?