openssl is not LTO-safe

Bug #2058017 reported by Adrien Nader
This bug affects 1 person
Affects: openssl (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Adrien Nader

Bug Description

tl;dr: making openssl LTO-safe is too much work, upstream doesn't see it as a goal and doesn't test it, and LTO probably brings no performance gains for this package, so LTO should be disabled for it.

Openssl is an old project and the codebase wasn't written with strict aliasing rules in mind. There are several reports of issues related to LTO. The openssl technical committee says "currently we're not going to fix all the strict aliasing and other LTO problems" and "Fixes raised in pull requests will be considered."; in other words: if you find a violation, we'll merge your fixes but we're not going to dedicate time to fixing them ourselves.

We don't have specific reports on Launchpad at the moment, but at least one issue has been experienced on the FIPS side: the compiler decided a 0-filled array could be removed and proceeded to do so. In addition, compilers keep pushing such optimizations further and further. Issues are impossible to predict, and even security updates could trigger them.

Gentoo prevents usage of LTO for openssl and has some links related to this at https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-libs/openssl/openssl-3.2.1-r1.ebuild#n131 :
- https://github.com/llvm/llvm-project/issues/55255
- https://github.com/openssl/openssl/issues/12247
- https://github.com/openssl/openssl/issues/18225
- https://github.com/openssl/openssl/issues/18663
- https://github.com/openssl/openssl/issues/18663#issuecomment-1181478057

Gentoo also prevents usage of -fstrict-aliasing and always sets -fno-strict-aliasing. I don't plan to do the same, at least for now and for Noble, since I don't have time to investigate more changes.

Performance shouldn't be impacted much if at all:
- crypto algorithms are implemented in ASM (funnily, using C implementations can trigger issues because these got miscompiled)
- the rest of the openssl codebase probably doesn't benefit from LTO because source files match codepaths quite well
- at the moment, openssl performance for servers is bad due to algorithmic/architectural issues rather than missing micro-optimizations, so any gains from LTO wouldn't be noticed
- if LTO compliance were doable and thought to be useful by upstream, they would certainly have pushed it forward, especially in the wake of openssl 3.0's performance issues.

Code size increases by a few percent, except for libcrypto, which gets 17% larger. The corresponding .deb file grows by only 2.6%.

I ran "openssl speed" with a long benchmark time in order to get stable results (results vary by several percent with the default times). I then scripted a diff, whose output is shown below. Percentages are the change of the "speed-no-lto" run relative to the "speed-lto" run; "....." means the difference is within 2%, which is the case for the vast majority of entries; values that are identical in both runs are printed unchanged.

Also note that some important ciphers are not present due to how openssl speed works. Small aes-*-cbc inputs are negatively impacted, by up to -10%, but the difference would be -50% if you compared "software" and "hardware" implementations, the results would be reversed at anything but the smallest data sizes, and since you want to use hardware implementations as much as possible, you also want to avoid places where LTO could have an effect.

type              16 bytes  64 bytes  256 bytes  1024 bytes  8192 bytes  16384 bytes
md5               .....     .....     .....      .....       .....       .....
sha1              .....     .....     .....      .....       .....       .....
rmd160            .....     .....     .....      .....       .....       .....
sha256            +2.3%     .....     .....      .....       .....       .....
sha512            .....     .....     .....      .....       .....       .....
hmac(md5)         .....     .....     .....      .....       .....       .....
des-ede3          .....     .....     .....      .....       .....       .....
aes-128-cbc       -10.0%    .....     .....      .....       .....       .....
aes-192-cbc       -7.6%     .....     .....      .....       .....       .....
aes-256-cbc       -5.2%     .....     .....      .....       .....       .....
camellia-128-cbc  .....     .....     .....      .....       .....       .....
camellia-192-cbc  .....     .....     .....      .....       .....       .....
camellia-256-cbc  .....     .....     .....      .....       .....       .....
ghash             .....     .....     +21.2%     -27.3%      +30.5%      +39.3%
rand              -2.8%     -2.9%     -2.9%      -2.8%       .....       .....
                sign       verify     sign/s  verify/s
rsa 512 bits    0.000031s  0.000002s  -2.7%   .....
rsa 1024 bits   .....      0.000005s  .....   .....
rsa 2048 bits   +2.4%      0.000015s  -2.3%   .....
rsa 3072 bits   .....      0.000032s  .....   .....
rsa 4096 bits   .....      .....      .....   .....
rsa 7680 bits   .....      .....      30.2    .....
rsa 15360 bits  .....      .....      5.9     .....
                sign       verify     sign/s  verify/s
dsa 512 bits    +4.8%      0.000024s  -3.9%   .....
dsa 1024 bits   +2.5%      -3.3%      .....   +2.4%
dsa 2048 bits   .....      .....      .....   +2.0%
                                  sign     verify   sign/s  verify/s
160 bits ecdsa (secp160r1)        +100.0%  +100.0%  .....   -2.2%
192 bits ecdsa (nistp192)         0.0002s  0.0002s  -3.6%   -3.3%
224 bits ecdsa (nistp224)         0.0000s  0.0001s  .....   .....
256 bits ecdsa (nistp256)         0.0000s  0.0001s  .....   .....
384 bits ecdsa (nistp384)         +14.3%   0.0006s  -3.2%   .....
521 bits ecdsa (nistp521)         0.0002s  0.0005s  .....   .....
163 bits ecdsa (nistk163)         0.0002s  0.0003s  -3.2%   -3.0%
233 bits ecdsa (nistk233)         0.0002s  +25.0%   .....   -2.2%
283 bits ecdsa (nistk283)         0.0004s  0.0008s  .....   -3.5%
409 bits ecdsa (nistk409)         0.0007s  0.0013s  -2.1%   -2.0%
571 bits ecdsa (nistk571)         0.0015s  0.0029s  .....   .....
163 bits ecdsa (nistb163)         0.0002s  0.0003s  .....   .....
233 bits ecdsa (nistb233)         0.0002s  0.0005s  .....   .....
283 bits ecdsa (nistb283)         0.0004s  0.0008s  -2.4%   -2.7%
409 bits ecdsa (nistb409)         0.0007s  +7.7%    -2.5%   -3.5%
571 bits ecdsa (nistb571)         0.0016s  0.0031s  .....   .....
256 bits ecdsa (brainpoolP256r1)  0.0003s  0.0003s  -2.5%   .....
256 bits ecdsa (brainpoolP256t1)  0.0003s  0.0003s  -2.9%   -3.2%
384 bits ecdsa (brainpoolP384r1)  +14.3%   0.0007s  -2.9%   .....
384 bits ecdsa (brainpoolP384t1)  +14.3%   0.0006s  -2.9%   -2.0%
512 bits ecdsa (brainpoolP512r1)  0.0011s  0.0009s  -2.8%   -3.1%
512 bits ecdsa (brainpoolP512t1)  +10.0%   +12.5%   -3.4%   -4.5%
                                 op       op/s
160 bits ecdh (secp160r1)        0.0001s  -5.8%
192 bits ecdh (nistp192)         0.0002s  -7.4%
224 bits ecdh (nistp224)         0.0001s  .....
256 bits ecdh (nistp256)         0.0000s  .....
384 bits ecdh (nistp384)         0.0007s  -4.0%
521 bits ecdh (nistp521)         0.0003s  -4.1%
163 bits ecdh (nistk163)         0.0002s  -4.6%
233 bits ecdh (nistk233)         0.0002s  -4.7%
283 bits ecdh (nistk283)         0.0004s  -2.9%
409 bits ecdh (nistk409)         0.0006s  -3.6%
571 bits ecdh (nistk571)         0.0014s  .....
163 bits ecdh (nistb163)         0.0002s  .....
233 bits ecdh (nistb233)         0.0002s  .....
283 bits ecdh (nistb283)         0.0004s  -2.5%
409 bits ecdh (nistb409)         +16.7%   -3.2%
571 bits ecdh (nistb571)         0.0015s  .....
256 bits ecdh (brainpoolP256r1)  0.0003s  -3.9%
256 bits ecdh (brainpoolP256t1)  0.0003s  -4.9%
384 bits ecdh (brainpoolP384r1)  0.0007s  -3.7%
384 bits ecdh (brainpoolP384t1)  0.0007s  -3.9%
512 bits ecdh (brainpoolP512r1)  0.0010s  .....
512 bits ecdh (brainpoolP512t1)  0.0010s  -2.1%
253 bits ecdh (X25519)           0.0000s  .....
448 bits ecdh (X448)             0.0002s  .....
                          sign     verify   sign/s  verify/s
253 bits EdDSA (Ed25519)  0.0000s  0.0001s  .....   .....
456 bits EdDSA (Ed448)    0.0002s  0.0002s  .....   .....
                          sign     verify   sign/s  verify/s
256 bits SM2 (CurveSM2)   0.0003s  0.0003s  -2.9%   -3.2%
                          op       op/s
2048 bits ffdh            0.0002s  .....
3072 bits ffdh            0.0006s  -2.4%
4096 bits ffdh            0.0013s  .....
6144 bits ffdh            0.0029s  .....
8192 bits ffdh            .....    .....

PS: I used the zsh script below for that (bash cannot do floating-point arithmetic); it reads two files, "speed-lto" and "speed-no-lto":

# zsh: compare two "openssl speed" outputs token by token
a=speed-lto; b=speed-no-lto
l=$(wc -l $a | cut -f1 -d' ')
exec 3<$a; exec 4<$b
for i in $(seq 1 $l); do
  read -A -u 3 c; read -A -u 4 d
  for j in $(seq 1 ${#c}); do
    x="${c[$j]}"; y="${d[$j]}"
    if [[ "$x" == "$y" ]]; then
      printf '%s ' "$x"  # identical tokens are printed as-is
    else
      # keep only the digits, then compute the change of $b relative to $a in percent
      xm=$(echo "$x" | tr -dc '0-9'); ym=$(echo "$y" | tr -dc '0-9')
      p=$(((100. * (ym - xm)) / xm))
      if (( p > 2 || p < -2 )); then printf '%+0.1f%% ' "$p"; else printf '..... '; fi
    fi
  done
  printf '\n'
done | column -t
exec 3>&-; exec 4>&-
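
For readers without zsh, the same comparison can be sketched in Python. This is an illustrative port, not the script that produced the table above: the function names (digits, compare_token, compare_lines, speed_diff) are made up for this example, and the "n/a" fallback for a zero or non-numeric baseline is an assumption that the zsh version does not have.

```python
import re


def digits(token):
    """Keep only the decimal digits of a token (like `tr -dc '0-9'`); None if there are none."""
    kept = re.sub(r"[^0-9]", "", token)
    return int(kept) if kept else None


def compare_token(x, y, threshold=2.0):
    """Return the cell to print for token x (first file) versus token y (second file)."""
    if x == y:
        return x  # identical tokens (labels, equal values) are printed unchanged
    xm, ym = digits(x), digits(y)
    if xm is None or ym is None or xm == 0:
        return "n/a"  # assumption: the zsh script has no such guard
    p = 100.0 * (ym - xm) / xm  # change of the second file relative to the first, in percent
    if abs(p) > threshold:
        return "%+0.1f%%" % p
    return "....."


def compare_lines(line_a, line_b):
    """Compare two whitespace-separated lines token by token."""
    a, b = line_a.split(), line_b.split()
    b += [""] * (len(a) - len(b))  # pad so that every token of a has a partner
    return [compare_token(x, y) for x, y in zip(a, b)]


def speed_diff(path_a="speed-lto", path_b="speed-no-lto"):
    """Print one comparison line per pair of input lines."""
    with open(path_a) as fa, open(path_b) as fb:
        for line_a, line_b in zip(fa, fb):
            print(" ".join(compare_lines(line_a, line_b)))
```

As with the zsh version, the output of speed_diff() can be piped through "column -t" for alignment. Because only the digits of each token are compared, a timing of 0.0001s in the first file against 0.0002s in the second is reported as +100.0%.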

Adrien Nader (adrien)
summary: - openssl is not LTO-safe
+ [FFe] openssl is not LTO-safe
Adrien Nader (adrien)
description: updated
Adrien Nader (adrien)
summary: - [FFe] openssl is not LTO-safe
+ openssl is not LTO-safe
Adrien Nader (adrien)
Changed in openssl (Ubuntu):
milestone: none → ubuntu-24.04
assignee: nobody → Adrien Nader (adrien-n)
status: New → In Progress
Adrien Nader (adrien)
description: updated
description: updated
Adrien Nader (adrien)
description: updated
Adrien Nader (adrien)
Changed in openssl (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openssl - 3.0.13-0ubuntu2

---------------
openssl (3.0.13-0ubuntu2) noble; urgency=medium

  [ Tobias Heider ]
  * Add fips-mode detection and adjust defaults when running in fips mode
    (LP: #2056593):
    - d/p/fips/crypto-Add-kernel-FIPS-mode-detection.patch:
      Detect if kernel fips mode is enabled
    - d/p/fips/crypto-Automatically-use-the-FIPS-provider-when-the-kerne.patch:
      Load FIPS provider if running in FIPS mode
    - d/p/fips/apps-speed-Omit-unavailable-algorithms-in-FIPS-mode.patch:
      Limit openssl-speed to FIPS compliant algorithms when running in FIPS mode
    - d/p/fips/apps-pass-propquery-arg-to-the-libctx-DRBG-fetches.patch
      Make sure DRBG respects query properties
    - d/p/fips/test-Ensure-encoding-runs-with-the-correct-context-during.patch:
      Make sure encoding runs with correct library context and provider

  [ Adrien Nader ]
  * Re-enable intel/0002-AES-GCM-enabled-with-AVX512-vAES-and-vPCLMULQDQ.patch
    (LP: #2030784)
    Thanks Bun K Tan and Dan Zimmerman
  * Disable LTO with which the codebase is generally incompatible (LP: #2058017)

 -- Adrien Nader <email address hidden> Fri, 15 Mar 2024 09:46:33 +0100

Changed in openssl (Ubuntu):
status: Fix Committed → Fix Released