So I decided to dig deeper. Basically, the standard C rand() is not that good. So instead I searched for the fastest Mersenne Twister implementation I could find, downloaded the latest code, and compiled it with the most aggressive optimization settings for my architecture.
And after all that trouble, the running time came down to 18 seconds. Still slower than Julia's 16 seconds.
$ time ./eulerfast
Euler : 2.71824
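For anyone who wants to try a similar generator swap, here is a minimal sketch in C++. It rests on two assumptions: that the benchmark estimates e as the average number of uniform draws needed for a running sum to reach 1 (a standard Monte Carlo estimator, which may differ from the code timed above), and that the standard library's `std::mt19937_64` is an acceptable stand-in for the downloaded Mersenne Twister. The name `estimate_euler` and its parameters are hypothetical.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper, not the code timed in this post.
// Estimates e as the mean number of U(0,1) draws needed for
// their running sum to reach 1.
double estimate_euler(std::uint64_t trials, std::uint64_t seed) {
    // 64-bit Mersenne Twister from the C++ standard library,
    // used here in place of C's rand().
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    std::uint64_t total_draws = 0;
    for (std::uint64_t i = 0; i < trials; ++i) {
        double sum = 0.0;
        while (sum < 1.0) {        // keep drawing until the sum reaches 1
            sum += uniform(rng);
            ++total_draws;
        }
    }
    // The average draw count converges to e as trials grows.
    return static_cast<double>(total_draws) / static_cast<double>(trials);
}
```

Printing `estimate_euler(1000000, 42)` from a `main` function should give a value close to 2.718. The exact digits depend on the seed and on the standard library in use, since the output of `std::uniform_real_distribution` is not specified bit for bit across implementations.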
We could probably do a bit better with more tweaks, and probably exceed Julia's performance with some effort. But at that point, I got tired of pushing this further. The thing I love about Julia is how well engineered and hassle-free it is. The performance you get out of it with so little effort is quite phenomenal. And for basic technical computing tasks, like random number generation, you don't have to dig hard for a better library. The "batteries included" choices in Julia's standard library are pretty good.
It is quite nice, though, that you can rely on the quality of Base for numerics.
$ time ./a.out
Euler : 2.71829
For the curious, I am using this version of Julia:
Julia Version 0.6.3-pre.0
Commit 93168a6 (2017-12-18 07:11 UTC)
OS: Linux (x86_64-linux-gnu)
CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
Now, one should not put too much emphasis on such micro-benchmarks. However, I found this a very curious example where a high-level language like Julia could be twice as fast as C. The Julia language authors must be doing some amazing mojo.