Comment 10 for bug 1928508

Heitor Alves de Siqueira (halves) wrote:

It's been some time since the original benchmarks, so I've repeated the test from the description. I didn't use hyperfine for the comparisons below, so they don't have the same statistical reliability, but they should still be sufficient for validation.

Binaries have been compiled as below:
$ gcc -mtune=generic -march=x86-64 -g -O3 test_memcpy.c -o test_memcpy64

---- AMD ----
$ grep -m1 "model name" /proc/cpuinfo
model name : AMD Ryzen 7 3700X 8-Core Processor

$ dpkg -l | grep -m1 libc6
ii libc6:amd64 2.31-0ubuntu9.2 amd64 GNU C Library: Shared libraries
$ ./test_memcpy64 32
32 MB = 2.506206 ms
-Compare match (should be zero): 0

$ dpkg -l | grep -m1 libc6
ii libc6:amd64 2.31-0ubuntu9.4~20210524ppa1 amd64 GNU C Library: Shared libraries
$ ./test_memcpy64 32
32 MB = 1.384115 ms
-Compare match (should be zero): 0

So, for AMD it's a very noticeable improvement (2.51 ms vs 1.38 ms, roughly 45% less time for the same copy).

---- Intel ----
$ grep -m1 "model name" /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz

$ dpkg -l | grep -m1 libc6
ii libc6:amd64 2.31-0ubuntu9.2 amd64 GNU C Library: Shared libraries
$ ./test_memcpy64 32
32 MB = 2.304554 ms
-Compare match (should be zero): 0

$ dpkg -l | grep -m1 libc6
ii libc6:amd64 2.31-0ubuntu9.4~20210524ppa1 amd64 GNU C Library: Shared libraries
$ ./test_memcpy64 32
32 MB = 2.209747 ms
-Compare match (should be zero): 0

For Intel the difference is small (2.30 ms vs 2.21 ms, about 4% faster) and likely within the noise of a single run, but there is no performance regression.