Commit Graph

11 Commits

Author SHA1 Message Date
Xu Yizhou c007203b94 SM4 AESE optimization for ARMv8
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19914)
2023-02-02 10:16:47 +11:00
Matt Caswell fecb3aae22 Update copyright year
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Release: yes
2022-05-03 13:34:51 +01:00
Daniel Hu 4908787f21 SM4 optimization for ARM by ASIMD
This patch optimizes SM4 for ARM processors using ASIMD instructions.

It improves performance if both of the following conditions are met:
1) The input data is equal to or more than 4 blocks
2) The cipher mode allows parallelism, i.e. ECB, CTR, GCM, or CBC decryption

This patch implements the SM4 SBOX lookup in vector registers, with the
benefit of constant processing time over the existing C implementation.

It is only enabled for the N1/V1 micro-architectures. In the ideal scenario,
performance can reach up to 2.7X.

When either of the above two conditions is not met, e.g. single-block input,
CFB/OFB mode, or CBC encryption, performance can drop by about 50%.

The assembly code has been reviewed internally by ARM engineer
Fangming.Fang@arm.com
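
For illustration only (not part of this commit, which is hand-written assembly), a hedged C intrinsics sketch of the core idea: a constant-time SM4 S-box substitution of 16 bytes at once using NEON TBL lookups over a plain 256-byte S-box table. The function name and table layout are assumptions for the sketch.

```c
#include <arm_neon.h>

/* Hedged sketch: the 256-byte S-box is split into four 64-byte quarters;
 * TBL returns 0 for out-of-range indices, so the four partial results can
 * simply be OR-ed together. Processing time is independent of the data. */
static uint8x16_t sm4_sbox_neon(uint8x16_t idx, const unsigned char sbox[256])
{
    uint8x16x4_t q0, q1, q2, q3;
    uint8x16_t d64 = vdupq_n_u8(64);
    uint8x16_t r;
    int i;

    for (i = 0; i < 4; i++) {
        q0.val[i] = vld1q_u8(sbox + i * 16);        /* bytes   0..63  */
        q1.val[i] = vld1q_u8(sbox + 64 + i * 16);   /* bytes  64..127 */
        q2.val[i] = vld1q_u8(sbox + 128 + i * 16);  /* bytes 128..191 */
        q3.val[i] = vld1q_u8(sbox + 192 + i * 16);  /* bytes 192..255 */
    }

    r = vqtbl4q_u8(q0, idx);                 /* indices   0..63  */
    idx = vsubq_u8(idx, d64);
    r = vorrq_u8(r, vqtbl4q_u8(q1, idx));    /* indices  64..127 */
    idx = vsubq_u8(idx, d64);
    r = vorrq_u8(r, vqtbl4q_u8(q2, idx));    /* indices 128..191 */
    idx = vsubq_u8(idx, d64);
    r = vorrq_u8(r, vqtbl4q_u8(q3, idx));    /* indices 192..255 */
    return r;
}
```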

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17951)
2022-04-12 10:37:42 +02:00
Daniel Hu 15b7175f55 SM4 optimization for ARM by HW instruction
This patch implements an SM4 optimization for ARM processors using
the SM4 hardware instructions, which are an optional feature of the
crypto extension for aarch64 (ARMv8).

Tested on some modern ARM micro-architectures with SM4 support, the
performance uplift is around 8X~40X over the existing
C implementation in OpenSSL. Modes that can be parallelized
(such as CTR, ECB, and CBC decryption) are at the higher end, while
modes like CBC encryption are at the lower end (due to inter-block dependency).

Perf data on Yitian-710 2.75GHz hardware, before and after optimization:

Before:
  type      16 bytes     64 bytes    256 bytes    1024 bytes   8192 bytes  16384 bytes
  SM4-CTR  105787.80k   107837.87k   108380.84k   108462.08k   108549.46k   108554.92k
  SM4-ECB  111924.58k   118173.76k   119776.00k   120093.70k   120264.02k   120274.94k
  SM4-CBC  106428.09k   109190.98k   109674.33k   109774.51k   109827.41k   109827.41k

After (7.4x - 36.6x faster):
  type      16 bytes     64 bytes    256 bytes    1024 bytes   8192 bytes  16384 bytes
  SM4-CTR  781979.02k  2432994.28k  3437753.86k  3834177.88k  3963715.58k  3974556.33k
  SM4-ECB  937590.69k  2941689.02k  3945751.81k  4328655.87k  4459181.40k  4468692.31k
  SM4-CBC  890639.88k  1027746.58k  1050621.78k  1056696.66k  1058613.93k  1058701.31k
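
For illustration only (not this commit, which is assembly and also covers the parallel modes), a hedged C intrinsics sketch of single-block encryption with the SM4E instruction. It assumes a CPU and compiler with the +sm4 feature and round keys rk[32] produced by the usual key schedule.

```c
#include <arm_neon.h>

/* Hedged sketch: each SM4E executes four SM4 rounds with four round keys,
 * so eight of them complete all 32 rounds. Requires __ARM_FEATURE_SM4. */
void sm4_encrypt_block_ce(const uint32_t rk[32],
                          const unsigned char in[16], unsigned char out[16])
{
    /* SM4 words are big-endian, so byte-swap each 32-bit word on load. */
    uint32x4_t b = vreinterpretq_u32_u8(vrev32q_u8(vld1q_u8(in)));
    uint8x16_t b8;
    int i;

    for (i = 0; i < 32; i += 4)
        b = vsm4eq_u32(b, vld1q_u32(rk + i));   /* 4 rounds per SM4E */

    /* Final reverse transform: reverse the order of the four state words,
     * then swap bytes back to big-endian before storing. */
    b  = vrev64q_u32(b);
    b8 = vextq_u8(vreinterpretq_u8_u32(b), vreinterpretq_u8_u32(b), 8);
    vst1q_u8(out, vrev32q_u8(b8));
}
```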

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17455)
2022-01-18 11:52:14 +01:00
Matt Caswell 3c2bdd7df9 Update copyright year
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14801)
2021-04-08 13:04:41 +01:00
Shane Lontis 8a6e912520 Add ossl_ symbols for sm3 and sm4
Partial fix for #12964

Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14473)
2021-03-18 17:52:37 +10:00
Matt Caswell eec0ad10b9 Update copyright year
Reviewed-by: Nicola Tuveri <nic.tuv@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/13144)
2020-10-15 14:10:06 +01:00
Pauli 592dcfd3df prov: prefix all exposed 'cipher' symbols with ossl_
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/13030)
2020-10-01 10:33:57 +10:00
Pauli 7d6766cb53 prov: prefix provider internal functions with ossl_
Also convert the names to lower case.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/13014)
2020-09-29 16:33:16 +10:00
Shane Lontis f75abcc0f0 Fix Use after free when copying cipher ctx
Fixes #10438
Issue found by ClusterFuzz/OSS-Fuzz.

The dest was getting a copy of the src structure, which contains a pointer that should point to an offset inside the structure itself; because of the shallow copy, the dest's pointer was still pointing into the original structure.

The setup for a ctx is mainly done by the initkey method in the PROV_CIPHER_HW structure. Because of this it makes sense for the structure to also contain a copyctx method that is used to fix up any pointers that need to be set up.

A dup_ctx test has been added to the cipher_enc tests in evp_test. It duplicates the ctx after setup and then frees the original ctx. This detects any dangling pointers in the duplicated context that still point back to the freed ctx.
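
A minimal hedged illustration of the bug and the copyctx-style fix described above; the structure and function names below are hypothetical, not the actual provider types.

```c
#include <string.h>

/* Toy ctx with a self-referential pointer: ks must always point at this
 * ctx's own key[] storage. */
typedef struct {
    unsigned char key[32];
    unsigned char *ks;
} TOY_CIPHER_CTX;

static void toy_initkey(TOY_CIPHER_CTX *ctx, const unsigned char *key)
{
    memcpy(ctx->key, key, sizeof(ctx->key));
    ctx->ks = ctx->key;                 /* points into the ctx itself */
}

static void toy_dupctx_broken(TOY_CIPHER_CTX *dst, const TOY_CIPHER_CTX *src)
{
    *dst = *src;                        /* dst->ks still points into src:
                                           use-after-free once src is freed */
}

static void toy_dupctx_fixed(TOY_CIPHER_CTX *dst, const TOY_CIPHER_CTX *src)
{
    *dst = *src;
    dst->ks = dst->key;                 /* copyctx-style pointer fixup */
}
```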

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10443)
2019-11-18 13:13:05 +10:00
Richard Levitte 604e884bb8 Providers: move all ciphers
From providers/{common,default}/ to providers/implementations/

Except for common code, which remains in providers/common/ciphers/.
However, we do move providers/common/include/internal/ciphers/*.h
to providers/common/include/prov/, and adjust all source files that
include any of those header files.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10088)
2019-10-10 14:12:15 +02:00