openssl/doc/designs/ML-KEM.md

ML-KEM Design
=============

This document covers OpenSSL-specific ML-KEM implementation details.
**ML-KEM** is specified in [FIPS 203], which includes comprehensive pseudo-code
for all its algorithms.

ML-KEM Parameters & Functions
-----------------------------

There are 3 different parameter sets in FIPS 203 (see Section 8).

To support these variants, OpenSSL has 3 associated key managers and 3
corresponding KEM function sets.
The key management and KEM algorithm names are **ML-KEM-512**, **ML-KEM-768**
and **ML-KEM-1024**.
At the TLS layer, the associated key exchange *groups* are, respectively,
**MLKEM512**, **MLKEM768** and **MLKEM1024**.

**ML-KEM** makes extensive use of four **SHA3** primitives: **SHA3-256**,
**SHA3-512**, **SHAKE128** and **SHAKE256**.
To improve **ML-KEM** execution performance, the EVP handles for these are
pre-fetched during **ML-KEM** key initialisation and stored with each key.
These are then used in key generation, encapsulation and decapsulation.
These are also duplicated by reference (**EVP_MD** handles uprefed) when an
**ML-KEM** key is duplicated.

ML-KEM keys
-----------

**ML-KEM** is an asymmetric algorithm, and has both public and private keys.
Since the public key is exchanged between the two parties as part of key
agreement, the encoding (*wire-form*) of the public key is clearly defined and
there are unambiguous choices for its encoding and decoding functions.

Note that the *wire-form* public key is "compressed".
Instead of the bulky "A" ("m" in the code) matrix, which represents the majority
of the storage required for ML-KEM public and private keys, the *wire-form* public
key holds a 32-byte seed from which the matrix is regenerated by the recipient
of the public key.
In the OpenSSL implementation, the matrix is *eagerly* evaluated as part of
decoding the public key, and stored in memory in the internal form needed for
subsequent computations (encapsulation).
Since the private key includes the public key as one of its components, the matrix
is also pre-computed and stored with the private key, and then need not be
regenerated during decapsulation.
During encapsulation (typically performed by servers), it is in principle
possible to save space and compute the matrix elements *just-in-time*, as each
matrix element is used exactly once.
This is not currently implemented, and the matrix is pre-computed in full.

However, the same matrix is used both during key generation and decapsulation
and computing it twice would have a noticeable performance impact (typically on
the client).
If we wanted to do *just-in-time* matrix computation for encapsulation, we'd
need to have a different memory layout for public keys when only the public key
is known, and to change the algorithm code to generate matrix elements on
demand during encapsulation.
This can be considered later, if it is determined that the space savings (9 *
512 bytes in memory for ML-KEM-768, for the full matrix, instead of 512 bytes
for a just-in-time element) are worth the added complexity.
Since servers will generally destroy the client public key soon after the
shared secret is computed, these don't stay in memory long, and briefly saving
~4KB may not be of much benefit.
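
The storage arithmetic above can be reproduced with a few lines of C.
This sketch assumes the 512 bytes per matrix element quoted in the text and the FIPS 203 matrix ranks (2, 3 and 4 for ML-KEM-512, ML-KEM-768 and ML-KEM-1024); the `demo_*` names are hypothetical.

```c
#include <stddef.h>

/* One matrix element, per the figure quoted in the text. */
#define DEMO_MATRIX_ELEMENT_BYTES 512

/* Full matrix: rank * rank elements held in memory at once. */
size_t demo_matrix_bytes(size_t rank)
{
    return rank * rank * DEMO_MATRIX_ELEMENT_BYTES;
}

/* Just-in-time evaluation would keep a single element at a time. */
size_t demo_jit_savings_bytes(size_t rank)
{
    return demo_matrix_bytes(rank) - DEMO_MATRIX_ELEMENT_BYTES;
}
```

For rank 3 (ML-KEM-768), `demo_matrix_bytes(3)` gives the 9 * 512 bytes mentioned above.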

The private key format is yet to be clearly standardised, though (to be able to
fully describe the algorithms) FIPS 203 documents a format that is commonly
referred to as the "extended" format.
This is the private key format supported by our key management provider
interface.
The IETF has expressed interest in using the "seed-based" format (the 64-byte
(*d*, *z*) seed pair from which the key is generated and can be recovered).
Recovery of the key from the seed (*d*, *z* pair) is supported by the [FIPS
203] internal deterministic key generation functions, which are used in the
*keygen* portion of the Known Answer Tests (KATs).

The design therefore caters to both options: The default key generation and KEM
encapsulation/decapsulation functions operate on "extended keys" in the
[FIPS 203] format, but it will be possible to use the "seed-based" private key
format by using the (currently test-only) deterministic *keygen* interface.
When keys are generated randomly, we don't presently provide a mechanism
to obtain and store the seed.
This can be added later if required.

Key generation API
------------------

Keys can be generated via the usual **EVP_PKEY_generate()** and
**EVP_PKEY_Q_keygen()** functions.

An explicit seed can be specified by setting the key generation
**OSSL_PKEY_PARAM_ML_KEM_SEED** parameter to a 64-byte octet-string: the
concatenation of the **d** and **z** values (32 bytes each), in that order.
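
The seed layout is simple enough to show directly.
The helper below is a hypothetical illustration of the byte order only (the `demo_*` names are not part of OpenSSL); it does not call into libcrypto.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MLKEM_HALF_SEED_BYTES 32
#define DEMO_MLKEM_SEED_BYTES (2 * DEMO_MLKEM_HALF_SEED_BYTES)

/* Build the 64-byte seed octet-string: d first, then z. */
void demo_mlkem_seed_join(unsigned char seed[DEMO_MLKEM_SEED_BYTES],
                          const unsigned char d[DEMO_MLKEM_HALF_SEED_BYTES],
                          const unsigned char z[DEMO_MLKEM_HALF_SEED_BYTES])
{
    memcpy(seed, d, DEMO_MLKEM_HALF_SEED_BYTES);
    memcpy(seed + DEMO_MLKEM_HALF_SEED_BYTES, z, DEMO_MLKEM_HALF_SEED_BYTES);
}
```

The resulting buffer is what would be wrapped as an octet-string parameter, for example with `OSSL_PARAM_construct_octet_string()`, before key generation.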

KEM API
-------

**ML-KEM** is meant to be a drop-in replacement for existing KEM algorithms.
It is accessed in the usual way via:

- **EVP_PKEY_encapsulate_init()**,
- **EVP_PKEY_encapsulate()**,
- **EVP_PKEY_decapsulate_init()**, and
- **EVP_PKEY_decapsulate()**.

For the encapsulation operation, a test-only option exists to bypass the random
number generator (secret random inputs are required for security) and pass in
a pre-determined 32-byte value in its place, by setting the
**OSSL_KEM_PARAM_IKME** parameter.

Buffers
-------

The **ML-KEM** key management and KEM providers interface with the underlying
libcrypto implementation via functions that validate the sizes of all provided
input/output buffers (encoded keys, ciphertext, shared secrets and seeds) against
the values expected for the provider's ML-KEM variant (a pointer to the variant
parameters is stored with each key).
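
The kind of size validation described above can be sketched as follows.
The structure and function names are hypothetical, not the internal libcrypto ones; the byte counts are the encapsulation key, decapsulation key and ciphertext sizes that FIPS 203 lists for each parameter set.

```c
#include <stddef.h>

#define DEMO_SHARED_SECRET_BYTES 32
#define DEMO_SEED_BYTES 64

/* Hypothetical per-variant parameter block. */
typedef struct {
    const char *name;
    size_t pubkey_bytes;
    size_t prvkey_bytes;
    size_t ctext_bytes;
} demo_vinfo;

static const demo_vinfo demo_variants[] = {
    { "ML-KEM-512", 800, 1632, 768 },
    { "ML-KEM-768", 1184, 2400, 1088 },
    { "ML-KEM-1024", 1568, 3168, 1568 },
};

/* Each key carries a pointer to its variant's parameters. */
typedef struct {
    const demo_vinfo *vinfo;
} demo_key;

/* Check an encoded public key length before attempting to parse it. */
int demo_check_pubkey_len(const demo_key *key, size_t len)
{
    return key != NULL && key->vinfo != NULL
        && len == key->vinfo->pubkey_bytes;
}

/* Check the ciphertext and shared secret buffers before (de)capsulation. */
int demo_check_kem_buffers(const demo_key *key, size_t ctext_len,
                           size_t secret_len)
{
    return key != NULL && key->vinfo != NULL
        && ctext_len == key->vinfo->ctext_bytes
        && secret_len == DEMO_SHARED_SECRET_BYTES;
}
```

Rejecting mismatched lengths up front means the arithmetic code never has to reason about short or oversized buffers.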

The underlying libcrypto **ML-KEM** APIs are not directly exposed to users;
only the abstracted key management and KEM **EVP** APIs are public interfaces.

Constant Time Considerations
----------------------------

The usual constant time methods are used in the implementation.
However, we avoid using a *value-barrier* to set the masks that perform
constant-time *select* between one of two values.
This avoids a 30-50% performance penalty and is expected to be robust even in
the face of plausible future compiler optimisations.
Remainders modulo the prime are computed via Barrett reduction, and the decoding
and decompression of the decrypted *message* have been tested to not be
vulnerable to the "clangover" attack in our implementation.
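
To make the two techniques concrete, here is a minimal sketch of a mask-based select and of Barrett reduction modulo the ML-KEM prime 3329.
The function names and the multiplier/shift constants (5039 is the integer part of 2^24 / 3329) are one common choice made for this illustration, not necessarily those used in the OpenSSL source. As in the text, no value barrier is applied to the mask.

```c
#include <stdint.h>

#define DEMO_PRIME 3329u
#define DEMO_BARRETT_MULTIPLIER 5039u /* floor(2^24 / 3329) */
#define DEMO_BARRETT_SHIFT 24

/* Return a when mask is 0xFFFF and b when mask is 0, without branching. */
uint16_t demo_ct_select_u16(uint16_t mask, uint16_t a, uint16_t b)
{
    return (uint16_t)((a & mask) | (b & (uint16_t)~mask));
}

/* Reduce x from the range [0, 2q) to [0, q) with one masked subtraction. */
uint16_t demo_reduce_once(uint16_t x)
{
    uint16_t subtracted = (uint16_t)(x - DEMO_PRIME);
    /* The top bit is set exactly when x < q, i.e. the subtraction wrapped. */
    uint16_t mask = (uint16_t)(0u - (uint32_t)(subtracted >> 15));

    return demo_ct_select_u16(mask, x, subtracted);
}

/*
 * Barrett reduction: estimate the quotient with a multiply and a shift,
 * then correct the remainder with a single conditional subtraction.
 * Intended for inputs up to roughly 2 * q * q, which covers products of
 * two reduced values.
 */
uint16_t demo_barrett_reduce(uint32_t x)
{
    uint64_t product = (uint64_t)x * DEMO_BARRETT_MULTIPLIER;
    uint32_t quotient = (uint32_t)(product >> DEMO_BARRETT_SHIFT);
    uint32_t remainder = x - quotient * DEMO_PRIME;

    return demo_reduce_once((uint16_t)remainder);
}
```

Because the quotient estimate is never too large and at most one too small for this input range, a single masked subtraction is enough, and no data-dependent division or branch is needed.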

All the libcrypto functions (other than **ML_KEM_KEY** allocation, which
returns **NULL** on error) return 1 for success or zero on error.
It should be noted that to avoid chosen-ciphertext attacks, the
**decapsulate** implementation **must** return success and a shared secret
(generated in constant time, whether synthetic or successfully decrypted)
whenever the input is a well-formed ciphertext.

The only exception to the above is when, unexpectedly, one of the **SHA3**
functions fails. In that case all hope of constant-time computation is
lost, but we don't expect such failures to be influenced by the content
of chosen ciphertexts, so this should not be an issue.

Nevertheless, even then we return an error indication, along with a fallback
shared secret from the RNG, only when the key derivation function
for the synthetic shared secret fails.
In all other conditions we return success and, as appropriate, either
the correct shared secret, or the synthetic alternative generated by the KDF.
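
The final selection step can be illustrated as follows.
In FIPS 203 decapsulation, the received ciphertext is compared with a re-encryption of the decrypted message, and the synthetic secret replaces the candidate secret when they differ. The sketch below shows one way to do that comparison and selection without secret-dependent branches; the `demo_*` names are hypothetical and this is not the OpenSSL code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_SECRET_BYTES 32

/* Return 0xFF if the two buffers are equal and 0x00 otherwise. */
uint8_t demo_ct_eq_mask(const uint8_t *a, const uint8_t *b, size_t len)
{
    uint8_t diff = 0;
    size_t i;

    /* Accumulate differences over the whole buffer, with no early exit. */
    for (i = 0; i < len; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    /* diff == 0 wraps to all-ones; any other value leaves bits 8+ clear. */
    return (uint8_t)(((uint32_t)diff - 1) >> 8);
}

/*
 * Write the candidate secret when the ciphertext matches its
 * re-encryption, and the synthetic secret otherwise. Both inputs are
 * read in full either way.
 */
void demo_select_secret(uint8_t out[DEMO_SECRET_BYTES],
                        const uint8_t *ctext, const uint8_t *reencrypted,
                        size_t ctext_len,
                        const uint8_t candidate[DEMO_SECRET_BYTES],
                        const uint8_t synthetic[DEMO_SECRET_BYTES])
{
    uint8_t mask = demo_ct_eq_mask(ctext, reencrypted, ctext_len);
    size_t i;

    for (i = 0; i < DEMO_SECRET_BYTES; i++)
        out[i] = (uint8_t)((candidate[i] & mask)
                           | (synthetic[i] & (uint8_t)~mask));
}
```

The caller reports success in both cases, so an attacker submitting modified ciphertexts learns nothing from the return value.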

<!-- Links  -->

[FIPS 203]:
    <https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf>