during memfail testing:
https://github.com/openssl/openssl/actions/runs/16794088536/job/47561223902
We get lots of test failures in ossl_rcu_read_lock. This occurs
because we have a few cases in the read lock path that attempt mallocs,
which, if they fail, trigger an assert or a silent failure, which isn't
really appropriate. We should instead fail gracefully, by informing the
caller that the lock failed, like we do for CRYPTO_THREAD_read_lock.
Fortunately, these are all internal apis, so we can convert
ossl_rcu_read_lock to return an int indicating success/failure, and fail
gracefully during the test, rather than hitting an assert abort.
Fixesopenssl/project#1315
Reviewed-by: Paul Yang <paulyang.inf@gmail.com>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/28195)
RCU stores a per-thread local structure per context-thread, making it
necessecary to move them to the new api to avoid exhausting our OS level
thread-local storage resources when creating lots of contexts
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/27794)
Only a minimum of 2 qp's are necessary: one for the readers,
and at least one that writers can wait on for retirement.
There is no need for one additional qp that is always unused.
Also only one ACQUIRE barrier is necessary in get_hold_current_qp,
so the ATOMIC_LOAD of the reader_idx can be changed to RELAXED.
And finally clarify some comments.
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/27012)
- Update allocate_new_qp_group to take unsigned int
- Move id_ctr in rcu_lock_st for better stack alignment
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Neil Horman <nhorman@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26972)
The current retirement code for rcu qp's has a race condition,
which can cause use-after-free errors, but only if more than
3 QPs are allocated, which is not the default configuration.
This fixes an oversight in commit 5949918f9a ("Rework and
simplify RCU code")
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26952)
Make CRYPTO_atomic_add consistent with
CRYPTO_atomic_load_int and set the
reader_idx under write_lock since there
is no CRYPTO_atomic_store_int.
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26963)
Older compilers don't always support __ATOMIC_ACQ_REL, use a lock where
they don't
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/26747)
Use __ATOMIC_RELAXED where possible.
Dont store additional values in the users field.
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26690)
An intermittent failure was noted on our new ppc64le CI runner, in which
what appeared to be a corrupted or invalid value getting returned from a
shared pointer under rcu protection
Investigation showed that the problem was with our small number of qp's
in a lock, and slightly incorrect accounting of the number of qp's
available we were prematurely recycling qp's, which led in turn to
premature completion of synchronization states, resulting in readers
reading memory that may have already been freed.
Fix it by:
a) Ensuring that we account for the fact that the first qp in an rcu
lock is allocated at the time the lock is created
and
b) Ensuring that we have a minimum number of 3 qp's:
1 that is free for write side allocation
1 that is in use by the write side currently
1 "next" qp that the read side can update while the prior qp is being
retired
With this change, the rcu threadstest runs indefinately in my testing
Fixes#26356
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26384)
InterlockedExchangeAdd expects arguments of type LONG *, LONG
but the int arguments were improperly cast to long *, long
Note:
- LONG is always 32 bit
- long is 32 bit on Win32 VC x86/x64 and MingW-W64
- long is 64 bit on cygwin64
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24941)
Adjust long lines and correct padding in preprocessor lines to
match the formatting rules
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24941)
Builds using 32 bit MinGW will fail, due to the same reasoning described in commit 2d46a44ff2.
CLA: trivial
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/25025)
mingw is complaining on builds about the use of InterlockedExchange on a
uint32_t type, as the input parameter here is expected to be LONG
(defined as signed 32 bit on all versions of windows).
the input value (reader_idx) will never grow larger than the group size
of the lock (nominally 2, but always a reasonably small value), so it
should be safe to just cast it to the appropriate type here.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
(Merged from https://github.com/openssl/openssl/pull/25015)
Improve code consistency between threads_pthread.c and threads_win.c
threads_pthread.c has good comments, let's copy them to threads_win.c
In many places uint64_t or LONG int was used, and assignments were
performed between variables with different sizes.
Unify the code to use uint32_t. In 32 bit architectures it is easier
to perform 32 bit atomic operations. The size is large enough to hold
the list of operations.
Fix result of atomic_or_uint_nv improperly casted to int *
instead of int.
Note:
In general size_t should be preferred for size and index, due to its
descriptive name, however it is more convenient to use uint32_t for
consistency between platforms and atomic calls.
READER_COUNT and ID_VAL return results that fit 32 bit. Cast them to
uint32_t to save a few CPU cycles, since they are used in 32 bit
operations anyway.
TODO:
In struct rcu_lock_st, qp_group can be moved before id_ctr
for better alignment, which would save 8 bytes.
allocate_new_qp_group has a parameter count of type int.
Signed values should be avoided as size or index.
It is better to use unsigned, e.g uint32_t, even though
internally this is assigned to a uint32_t variable.
READER_SIZE is 16 in threads_pthread.c, and 32 in threads_win.c
Using a common size for consistency should be prefered.
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24803)
This fixes a build error regression on mingw64 introduced by me in
16beec98d2
In get_hold_current_qp, uint32_t variables were improperly
used to hold the value of reader_idx, which is defined as long int.
So I used CRYPTO_atomic_load_int, where a comment states
On Windows, LONG is always the same size as int
There is a size confusion, because
Win32 VC x86/x64: LONG, long, long int are 32 bit
MingW-W64: LONG, long, long int are 32 bit
cygwin64: LONG is 32 bit, long, long int are 64 bit
Fix:
- define reader_idx as uint32_t
- edit misleading comment, to clarify:
On Windows, LONG (but not long) is always the same size as int.
Fixes the following build error, reported in [1].
crypto/threads_win.c: In function 'get_hold_current_qp':
crypto/threads_win.c:184:32: error: passing argument 1 of 'CRYPTO_atomic_load_int' from incompatible pointer type [-Wincompatible-pointer-types]
184 | CRYPTO_atomic_load_int(&lock->reader_idx, (int *)&qp_idx,
| ^~~~~~~~~~~~~~~~~
| |
| volatile long int *
[1] https://github.com/openssl/openssl/pull/24405#issuecomment-2211602282
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24803)
InterlockedAnd64 and InterlockedAdd64 are not available on VS2010 x86.
We already have implemented replacements for other functions, such as
InterlockedOr64. Apply the same approach to fix the errors.
A CRYPTO_RWLOCK rw_lock is added to rcu_lock_st.
Replace InterlockedOr64 and InterlockedOr with CRYPTO_atomic_load and
CRYPTO_atomic_load_int, using the existing design pattern.
Add documentation and tests for the new atomic functions
CRYPTO_atomic_add64, CRYPTO_atomic_and
Fixes:
libcrypto.lib(libcrypto-lib-threads_win.obj) : error LNK2019: unresolved external symbol _InterlockedAdd64 referenced in function _get_hold_current_qp
libcrypto.lib(libcrypto-lib-threads_win.obj) : error LNK2019: unresolved external symbol _InterlockedOr referenced in function _get_hold_current_qp
libcrypto.lib(libcrypto-lib-threads_win.obj) : error LNK2019: unresolved external symbol _InterlockedAnd64 referenced in function _update_qp
libcrypto.lib(libcrypto-lib-threads_win.obj) : error LNK2019: unresolved external symbol _InterlockedOr64 referenced in function _ossl_synchronize_rcu
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24405)
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24630)
(cherry picked from commit d38d264228)
VC 2010 or earlier compilers do not support static inline.
To work around this problem, we can use the ossl_inline macro.
Fixes:
crypto\threads_win.c(171) : error C2054: expected '(' to follow 'inline'
crypto\threads_win.c(172) : error C2085: 'get_hold_current_qp' : not in formal parameter list
crypto\threads_win.c(172) : error C2143: syntax error : missing ';' before '{'
crypto\threads_win.c(228) : warning C4013: 'get_hold_current_qp' undefined; assuming extern returning int
crypto\threads_win.c(228) : warning C4047: '=' : 'rcu_qp *' differs in levels of indirection from 'int'
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24370)
Generally we can get away with just using CRYPTO_atomic_load to do
stores by reversing the source and target variables, but doing so
creates a problem for the thread sanitizer as CRYPTO_atomic_load hard
codes an __ATOMIC_ACQUIRE constraint, which confuses tsan into thinking
that loads and stores aren't properly ordered, leading to RAW/WAR
hazzards getting reported. Instead create a CRYPTO_atomic_store api
that is identical to the load variant, save for the fact that the value
is a unit64_t rather than a pointer that gets stored using an
__ATOMIC_RELEASE constraint, satisfying tsan.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23671)
The ossl_rcu_call function for windows creates a linked list loop. fix
it to work like the pthread version properly
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23671)
Currently, rcu has a global bit of data, the CRYPTO_THREAD_LOCAL object
to store per thread data. This works in some cases, but fails in FIPS,
becuase it contains its own copy of the global key.
So
1) Make the rcu_thr_key a per-context variable, and force
ossl_rcu_lock_new to be context aware
2) Store a pointer to the context in the lock object
3) Use the context to get the global thread key on read/write lock
4) Use ossl_thread_start_init to properly register a cleanup on thread
exit
5) Fix up missed calls to OSSL_thread_stop() in our tests
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24162)
Creating an rcu lock does a double allocation of the underlying mutex.
Not sure how asan didn't catch this, but we clearly have a duplicate
line here
Fixes#24085
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24086)
Reviewed-by: Neil Horman <nhorman@openssl.org>
Release: yes
(cherry picked from commit 0ce7d1f355)
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24034)
Introduce an RCU lock implementation as an alternative locking mechanism
to openssl. The api is documented in the ossl_rcu.pod
file
Read side implementaiton is comparable to that of RWLOCKS:
ossl_rcu_read_lock(lock);
<
critical section in which data can be accessed via
ossl_derefrence
>
ossl_rcu_read_unlock(lock);
Write side implementation is:
ossl_rcu_write_lock(lock);
<
critical section in which data can be updated via
ossl_assign_pointer
and stale data can optionally be scheduled for removal
via ossl_rcu_call
>
ossl_rcu_write_unlock(lock);
...
ossl_synchronize_rcu(lock);
ossl_rcu_call fixup
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22729)
The changes from the following commit should also apply to
Visual Studio 2010
2d46a44ff2 (r104867505)
Fixes build errors: undefined symbol InterlockedOr64
on Windows 2003, Visual Studio 2010 for x86 target.
CLA: trivial
Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20557)
The header is already included by <windows.h> for WinSDK 8 or later.
Actually this causes problem for WinSDK 7.1 (defaults for VS2010) that
it does not have this header while SRW Locks do exist for Windows 7.
CLA: trivial
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16603)
Some functions that lock things are void, so we just return early.
Also make ossl_namemap_empty return 0 on error. Updated the docs, and added
some code to ossl_namemap_stored() to handle the failure, and updated the
tests to allow for failure.
Fixes: #14230
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14238)
Fixes#13914
The "SRWLock" synchronization primitive is available in Windows Vista
and later. CRYPTO_THREAD functions now use SRWLock functions when the
target operating system supports them.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14381)
We add an implementation for CRYPTO_atomic_or() and CRYPTO_atomic_load()
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/13733)
When the new OpenSSL CSPRNG was introduced in version 1.1.1,
it was announced in the release notes that it would be fork-safe,
which the old CSPRNG hadn't been.
The fork-safety was implemented using a fork count, which was
incremented by a pthread_atfork handler. Initially, this handler
was enabled by default. Unfortunately, the default behaviour
had to be changed for other reasons in commit b5319bdbd0, so
the new OpenSSL CSPRNG failed to keep its promise.
This commit restores the fork-safety using a different approach.
It replaces the fork count by a fork id, which coincides with
the process id on UNIX-like operating systems and is zero on other
operating systems. It is used to detect when an automatic reseed
after a fork is necessary.
To prevent a future regression, it also adds a test to verify that
the child reseeds after fork.
CVE-2019-1549
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9832)
Replace it with InitializeCriticalSection()
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8596)
CRYPTO_atomic_read was added with intention to read statistics counters,
but readings are effectively indistinguishable from regular load (even
in non-lock-free case). This is because you can get out-dated value in
both cases. CRYPTO_atomic_write was added for symmetry and was never used.
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/6883)
TlsGetValue clears the last error even on success, so that callers may
distinguish it successfully returning NULL or failing. This error-mangling
behavior interferes with the caller's use of GetLastError. In particular
SSL_get_error queries the error queue to determine whether the caller should
look at the OS's errors. To avoid destroying state, save and restore the
Windows error.
Fixes#6299.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6316)