Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: David von Oheimb <david.von.oheimb@siemens.com>
(Merged from https://github.com/openssl/openssl/pull/18204)
Fixes an issue disassembling the functions because the symtab contains
an attribute indicating the presence of data within them.
CLA: trivial
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18086)
macOS 10.7 and 10.8 had a bit wired clang which is detected as
`__GNUC__` which has `__ATOMIC_ACQ_REL` but it excepts one option at
`__atomic_is_lock_free` instead of 2.
This prevents OpenSSL to be compiled on such systems.
Fixes: #18055
Signed-off-by: Kirill A. Korinsky <kirill@korins.ky>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18056)
gcc6.3 doesn't seem to support the register aliases fp and lr for x29 and x30,
so use the x names.
Fixes#18114
Change-Id: I077edda42af4c7cdb7b24f28ac82d1603f550108
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18127)
X509V3_add_value() will return 0 on malloc failure, which could lead to
err logic in X509V3_parse_list().
Fix this by adding return value check of X509V3_add_value().
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18077)
Allow to specify "z16" as machine generation in environment variable
OPENSSL_s390xcap. It is an alias for "z15".
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18054)
This patch optimizes SM4 for ARM processor using ASIMD instruction
It will improve performance if both of following conditions are met:
1) Input data equal to or more than 4 blocks
2) Cipher mode allows parallelism, including ECB,CTR,GCM or CBC decryption
This patch implements SM4 SBOX lookup in vector registers, with the
benefit of constant processing time over existing C implementation.
It is only enabled for micro-architecture N1/V1. In the ideal scenario,
performance can reach up to 2.7X
When either of above two conditions is not met, e.g. single block input
or CFB/OFB mode, CBC encryption, performance could drop about 50%.
The assembly code has been reviewed internally by ARM engineer
Fangming.Fang@arm.com
Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17951)
Check the return value of EVP_KDF_fetch to avoid a potential
null pointer dereference.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18062)
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18031)
Both are the same issue and both as false positives. Annotate the line so
that this is ignored.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/18012)
CLI changes: New parameter -digest to CLI command openssl cms, to
provide pre-computed digest for use with -sign.
API changes: New function CMS_final_digest(), like CMS_final() but
uses a pre-computed digest instead of computing it from the data.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: David von Oheimb <david.von.oheimb@siemens.com>
Reviewed-by: Todd Short <todd.short@me.com>
(Merged from https://github.com/openssl/openssl/pull/15348)
This refactors OSSL_LIB_CTX to avoid using CRYPTO_EX_DATA. The assorted
objects to be managed by OSSL_LIB_CTX are hardcoded and are initialized
eagerly rather than lazily, which avoids the need for locking on access
in most cases.
Fixes#17116.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17881)
d6e4287c97 introduced 5x interleaving as an
optimization for ThunderX2, and that leads to some performance degradation on
when encoding short buffers. We found this performance degradation by measuring
the performance of nginx on Ubuntu 20.04 that comes with OpenSSL 1.1.1f and
Ubuntu 22.04 with OpenSSL 3.0.1.
This patch limits the 5x interleave to buffers larger than 512 bytes.
On Graviton2 we see the following performance with this patch:
$ openssl speed -evp aes-128-gcm -bytes 128
AES-128-GCM 64 bytes 79 bytes 80 bytes 128 bytes 256 bytes 511 bytes 512 bytes 1024 bytes
master 1062564.71k 775113.11k 1069959.33k 1411716.28k 1653114.86k 1585981.16k 1973683.03k 2203214.08k
master+patch 1062729.28k 771915.11k 1103883.42k 1458665.43k 1708701.20k 1647060.84k 1975571.80k 2204038.42k
diff 0% 0% 3% 3% 3% 4% 0% 0%
revert d6e428 1055290.03k 773448.92k 1117411.97k 1441478.57k 1695698.52k 1634598.04k 1981851.65k 2196680.36k
CLA: trivial
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17984)
The sweep of the source tree in #17373 missed the BSAES assembly due its
PR #14592 having been temporarily backed out at the time.
This constitutes a partial fix for #17958 - covers cases except when
configured with -DOPENSSL_AES_CONST_TIME.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17988)
This becomes a performance improvement in the ossl_sa_doall_arg function which
has started appearing on profile output. The other ossl_sa_ functions don't
contribute significantly to profile output.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17973)
The sizes are rounded via the expression: (cmpl + 7) / 8 which overflows if
cmpl is near to the type's maximum. Instead we use the safe_math function to
computer this without any possibility of error.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17884)
OPENSSL_sk_num returns an integer which can theoretically be negative.
Assigning this to a size_t and using it as a loop bound isn't ideal.
Rather than adding checked for NULL or negative returns, changing the loop
index and end to int is simpler.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17954)
The symbol OPENSSL_s390xcap_P and the OPENSSL_cpuid_setup function are not
exported by the version script of OpenSSL. However, if someone uses the
static library without the version script, these symbols all of a sudden
become global symbols and their usage in assembler code does not correctly
reflect that for PIC. Since these symbols should never be used outside of
OpenSSL, hide them inside the binary.
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17946)
The assert added cannot ever fail because (current & 0xFFFF) != 0 from the
while loop and the trailing zero bit count therefore cannot be as large as 32.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/17892)
This refactors decoder functionality to reduce calls to
OSSL_DECODER_is_a / EVP_KEYMGMT_is_a, which are substantial bottlenecks
in the performance of repeated decode operations (see #15199).
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17921)
Fixes#17911.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17912)
The calculation in some cases does not finish for non-prime p.
This fixes CVE-2022-0778.
Based on patch by David Benjamin <davidben@google.com>.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Fixes#17869.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17870)
Fixes#17876
CLA: trivial
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17877)
The IV length cache value was being invalidated excessively, causing IV
length caching to be ineffective.
Related to #17064.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17862)
Decoration prefix for some assembler labels in aes-gcm-avx512.pl was
fixed for mingw64 build.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17868)
As the potential failure of the BIO_read(),
it should be better to add the check and return
error if fails.
Also, in order to decrease the same code, using
'out_free' will be better.
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17754)
Partial fix for #17064. Avoid excessive writes to the cache line
containing the refcount for an EVP_MD object to avoid extreme
cache contention when using a single EVP_MD at high frequency on
multiple threads. This changes performance in 3.0 from being double
that of 1.1 to only slightly higher than that of 1.1.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17857)