https://github.com/openssl/openssl/actions/runs/15214054228/job/42795224720
the theory I have for the cause of this failure is:
1. qtest_create_quic_connection_ex is called for the client
2. The client is in blocking mode, so we fall into the conditional on line 512
3. We create the server thread on line 519, which is non-blocking
4. The scheduler in the failing case, lets the server run ahead of the client
5. Server thread enters qtest_create_quic_connection_ex and iterates steps
6-9 in the do_while loop starting on line 530
6. Server calls qtest_add_time
7. Server calls ossl_quic_tserver_tick
8. Server calls ossl_quic_tserver_is_term_any, received NULL return
9. Server calls qtest_wait_for_timeout
10. Eventually qtest_wait_for_timeout returns zero, adn the server jumps to
the error label, returning zero to globservret, and the thread exits
11. Client thread regains the cpu, and attempts to call SSL_connect, which
fails, as the server is no longer listening
12. We fall into the error case on line 556, and SSL_get_error returns
SSL_ERROR_SSL, which causes clienterr to get set to 1
13. We exit the do{} while loop on line 581, and do the TEST_true check on
line 593. The server having exited wait_for_thread returns true, but
globserverret is still zero from step 10 above, and so the test fails
I can't prove this is the case, as the test only appears to fail in CI,
and we can't dump verbose logging there, lest we affect the timing of
the tests, so this is just a theory, but it seems to fit the
observations we have.
Attempting to fix this, by creating a thread interlock with a condition
variable that blocks the server from ticking the quic reactor until such
time as the client is about to call SSL_connect to prevent the race
condition
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/27704)
Run-checker merge / run-checker (enable-weak-ssl-ciphers) (push) Waiting to runDetails
Run-checker merge / run-checker (enable-zlib) (push) Waiting to runDetails
Run-checker merge / run-checker (no-dso) (push) Waiting to runDetails
Run-checker merge / run-checker (no-dynamic-engine) (push) Waiting to runDetails
Run-checker merge / run-checker (no-ec2m enable-fips) (push) Waiting to runDetails
Run-checker merge / run-checker (no-engine no-shared) (push) Waiting to runDetails
Run-checker merge / run-checker (no-err) (push) Waiting to runDetails
Run-checker merge / run-checker (no-filenames) (push) Waiting to runDetails
Run-checker merge / run-checker (no-integrity-only-ciphers) (push) Waiting to runDetails
Run-checker merge / run-checker (no-module) (push) Waiting to runDetails
Run-checker merge / run-checker (no-ocsp) (push) Waiting to runDetails
Run-checker merge / run-checker (no-pinshared) (push) Waiting to runDetails
Run-checker merge / run-checker (no-srp) (push) Waiting to runDetails
Run-checker merge / run-checker (no-srtp) (push) Waiting to runDetails
Run-checker merge / run-checker (no-ts) (push) Waiting to runDetails
Run-checker merge / jitter (push) Waiting to runDetails
Run-checker merge / threads_sanitizer_atomic_fallback (push) Waiting to runDetails
Windows GitHub CI / shared (map[arch:win32 config:--strict-warnings no-fips os:windows-2022]) (push) Waiting to runDetails
Windows GitHub CI / shared (map[arch:win64 config:enable-fips no-thread-pool no-quic os:windows-2022]) (push) Waiting to runDetails
Windows GitHub CI / shared (map[arch:win64 config:enable-fips os:windows-2019]) (push) Waiting to runDetails
Windows GitHub CI / plain (windows-2022) (push) Waiting to runDetails
Windows GitHub CI / minimal (windows-2019) (push) Waiting to runDetails
Windows GitHub CI / cygwin (windows-2019, map[arch:win64 config:-DCMAKE_C_COMPILER=gcc --strict-warnings enable-demos no-fips]) (push) Waiting to runDetails
Windows Compression GitHub CI / zstd (push) Waiting to runDetails
Windows Compression GitHub CI / brotli (push) Waiting to runDetails
Initially tests that were written which make use of the noisy dgram BIO,
were done under the assumption that, despite any packet mangling done by
the noisy dgram bio, the connection would still be established. This
was initiall guaranteed by configuring the BIO to avoid
corrupting/dropping/duplicating/re-injecting the first packet received,
thus ensuring that the client and server hello frames would make it to
the peer successfully.
This implicitly made the assumption that the client and server hellos
were contained within a single datagram, which until recently was true.
However, with the introduction of ML-KEM keyshares, the above assumption
no longer holds. Large ML-KEM keyshares generally expand these TLS
messages accross multiple datagrams, and so it is now possible that
those initial records can become corrupted/lost etc, leading to
unexpected connection failures.
Lets fix it by restoring the guarantee that these tests were written
under by making the backoff time configurable to a number of frames, and
configuring the quic connection objects used in the test to not drop the
first two initial frames, once again guaranteeing that the client and
server hello arrive at the peer uncorrupted, so that we get a good
connection established.
Fixes#27103
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/27169)
Use the new short conn id internal api to record and use the connections
short conn id len when decoding packets in qtestlib
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26592)
Improving handling of packets in tserver doesn't currently make sense,
as we're planning on eliminating it soon. Move this TODO to QUIC FUTURE
Fixesopenssl/project#1070
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26593)
BIOs created from a BIO_dgram_pair don't normally have a local BIO_ADDR
associated with them. This allows us to set one.
Fixesopenssl/project#933
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/26066)
In preparation for supporting the handling of version negotiation, we
need to be able to detect why the decoding of quic header failed.
Specifically, ossl_quic_wire_decode_pkt_hdr fails if the version
provided in the header isn't QUIC_VERSION_1. We want to keep that, as
we don't support anything else, but the server code needs to
differentiate when we fail decode because of a version problem, vs some
other more fatal malforming issue.
So add a uint64_t *fail_cause pointer that gets filled out with a
failure cause. We only use VERSION failures right now, but we can
expand this later if needed
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/25968)
Centralise the storage of the override in the QUIC_ENGINE rather than in
the QUIC_CONNECTION. We can now set the override on any type of QUIC SSL
object as needed.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Viktor Dukhovni <viktor@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/25457)
Reviewed-by: Neil Horman <nhorman@openssl.org>
Release: yes
(cherry picked from commit 0ce7d1f355)
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24034)
The test server cannot really cope with modifications
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22267)
If either endpoint issues a PING frame while we are introducing noise
into the communication then there is a danger that the connection itself
will fail. We detect the PING and then back off on generating noise for a
short while. It should be sufficient to just ensure that the next datagram
does not get dropped for each endpoint.
Fixes#22199
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22243)
Ensure we use OPENSSL_NO_SSL_TRACE guards where appropriate.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22193)
Fixes: #22178
Signed-of-by: Randall S. Becker <randall.becker@nexbridge.ca>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22179)
We now have a public function for BIO_ADDR_copy() which can be used in
preference to the test code's private implementation.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22164)
We are actually passing two references to sbio: one as part of a BIO chain
and one stand alone. Therefore we need two references.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
So far we've only applied noise to the server to client datagrams. Do the
same thing the other way around.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
We add a new flag QTEST_FLAG_CLIENT_TRACE to get debug tracing output if
required.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
Where multiple packets are in a single datagram we split them so that all
packets can be affected by the noise
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
Provide a BIO filter that can split QUIC datagrams containing multiple
packets, such that each packet is in its own datagram.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
Now that we have a noisy datagram BIO we cannot rely on datagrams always
reliably being delivered in the test framework. We need to start taking
notice of timeouts and handling them appropriately.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
Create a noisy dgram test that can drop/duplicate/reorder UDP packets and
ensure that the QUIC connection is tolerant of this. At this stage we just
create the outline of the test. Adding in the noise will come in future
commits.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22157)
Test that errors such as SSL_ERROR_WANT_RETRY_VERIFY are properly
handled by QUIC connections.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21922)
And use QTEST_FLAG_FAKE_TIME with test_ssl_trace().
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21713)
We add an additional loop around test_quic_write_read() to repeat the
test but using a session obtained from the initial iteration to confirm
that we can successfully resume the session.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21591)
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21135)
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21135)
We create an SSL BIO using a QUIC based SSL_CTX and then use that BIO
to create a connection and read/write data from streams.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21367)
We then use it in test_corrupted_data() to remove an OSSL_sleep() which
may fail in some builds.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21332)
This QUIC server utility is intended for test purposes only and is expected
to be replaced in a future version of OpenSSL by s_server. At that point
it will be removed.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21204)
Typos in doc/man* will be fixed in a different commit.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20910)
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20879)
We create "real" sockets for blocking mode so that we can block on them.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20514)
Allows tests to check that a given transport error was received by the
server.
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20030)