Commit Graph

31294 Commits

Author SHA1 Message Date
Iliia Khaprov ed793f20af
Preserve order for missing deliveries (get_checked_out) 2025-08-14 15:03:46 +02:00
Jean-Sébastien Pédron 4f1c6f0bde
clustering_recovery_SUITE: Add `recover_after_partition_with_leader` testcase
[Why]
The testcase tries to replicate the steps described in issue #12934.

[How]
It uses intermediate Erlang nodes between the common_test control node
and the RabbitMQ nodes, using `peer` standard_io communication. The goal
is to make sure the common_test control node doesn't interfere with the
nodes the RabbitMQ nodes can see, despite the blocking of the Erlang
distribution connection.

So far, I couldn't reproduce the problem reported in #12934. @mkuratczyk
couldn't either, so it might have been fixed as a side effect of another
change...

References #12934.
2025-08-14 12:10:52 +02:00
Luke Bakken 3e235a6316
Do not "pre convert" `running_nodes` to strings
Follow-up to #14118
2025-08-13 12:01:18 -07:00
Luke Bakken a87445cda6 Use `YRL_ERLC_OPTS` instead of `ERL_COMPILER_OPTIONS`
This is a follow-up to commit 93db480bc4

`erlang.mk` supports the `YRL_ERLC_OPTS` variable to set `erlc`-specific
compiler options when processing `.yrl` and `.xrl` files. By using this
variable, it allows `make RMQ_ERLC_OPTS=` to disable the
`+deterministic` option. This allows using `c()` in the erl shell to
recompile modules on the fly when a cluster is running.

You can see that, when `make RMQ_ERLC_OPTS=` is run, these generated
files were produced with the `+deterministic` option, because their
`-file` directives use only basenames.

* `deps/rabbit/src/rabbit_amqp_sql_lexer.erl`
* `deps/rabbit/src/rabbit_amqp_sql_parser.erl`

```
-file("rabbit_amqp_sql_parser.yrl", 0).
-module(rabbit_amqp_sql_parser).
-file("rabbit_amqp_sql_parser.erl", 3).
-export([parse/1, parse_and_scan/1, format_error/1]).
-file("rabbit_amqp_sql_parser.yrl", 122).
```
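
As a rough illustration of the observation above, the following Python sketch checks whether every `-file` directive in a generated Erlang source names a bare basename. It is only a sketch: the helper names are made up for this example, and the directory-qualified path used in the counter-example is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Matches Erlang `-file("path", Line).` directives at the start of a line.
FILE_DIRECTIVE = re.compile(r'^-file\("([^"]+)",\s*\d+\)\.', re.MULTILINE)


def file_directive_paths(source):
    """Return the path named by each `-file` directive in an Erlang source."""
    return FILE_DIRECTIVE.findall(source)


def uses_only_basenames(source):
    """True when no `-file` directive carries a directory component."""
    return all(PurePosixPath(p).name == p for p in file_directive_paths(source))


# Excerpt quoted in the commit message above.
generated = '''-file("rabbit_amqp_sql_parser.yrl", 0).
-module(rabbit_amqp_sql_parser).
-file("rabbit_amqp_sql_parser.erl", 3).
-export([parse/1, parse_and_scan/1, format_error/1]).
-file("rabbit_amqp_sql_parser.yrl", 122).
'''

# Hypothetical directive with a directory component, for contrast.
with_directory = '-file("/build/deps/rabbit/src/rabbit_amqp_sql_parser.yrl", 0).\n'
```

Here `uses_only_basenames(generated)` holds for the quoted excerpt, while the hypothetical `with_directory` input fails the check.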

This commit also ignores those two files, as they will always be
auto-generated.
2025-08-12 12:00:54 -07:00
Michael Klishin 64e5a6a7e5
Merge pull request #14373 from rabbitmq/mk-reduce-rabbitmq-plugins-log-rate
Do not log every transient (or Erlang/OTP-provided) dependency when listing plugins
2025-08-12 15:00:42 -04:00
Jean-Sébastien Pédron 23588b665a
rabbit_access_control: Check configured auth backends are enabled at boot time
[Why]
If a user configures an auth backend module, but doesn't enable the
plugin that provides it, they will get a crash and a stacktrace when
authentication is performed. The error is not helpful to understand what
the problem is.

[How]
We add a boot step that goes through the configured auth backends and
queries the core of RabbitMQ and the plugins. If an auth backend is
provided by a plugin, the plugin must be enabled to consider the auth
backend to be valid.

In the end, at least one auth backend must be valid, otherwise the boot
is aborted.

If only some of the configured auth backends were filtered out, but
there are still some valid auth backends, we store the filtered list in
the application environment variable so that
authentication/authorization doesn't try to use them later.

We also report invalid auth backends in the logs:

* Info message for a single invalid auth backend:

    [info] <0.213.0> The `rabbit_auth_backend_ldap` auth backend module is configured. However, the `rabbitmq_auth_backend_ldap` plugin must be enabled in order to use this auth backend. Until then it will be skipped during authentication/authorization

* Warning message when some auth backends were filtered out:

    [warning] <0.213.0> Some configured backends were dropped because their corresponding plugins are disabled. Please look at the info messages above to learn which plugin(s) should be enabled. Here is the list of auth backends kept after filering:
    [warning] <0.213.0> [rabbit_auth_backend_internal]

* Error message when no auth backends are valid:

    [error] <0.213.0> None of the configured auth backends are usable because their corresponding plugins were not enabled. Please look at the info messages above to learn which plugin(s) should be enabled.

V2: In fact, `rabbit_plugins:is_enabled/1` indicates if a plugin is
    running, not if it is enabled... The new check runs as a boot step
    and thus is executed before plugins are started. Therefore we can't
    use this API. Instead, we use `rabbit_plugins:enabled_plugins/0`
    which lists explicitly enabled plugins. The drawback is that if the
    auth backend is enabled implicitly because it is a dependency of
    another explicitly enabled plugin, the check will still consider it
    is disabled and thus abort the boot.
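
The decision logic described in this message can be sketched as follows. This is a language-neutral illustration in Python, not the Erlang code of the commit: `check_auth_backends`, its arguments, and the flat list of backend modules are assumptions made for the example (real configurations may be more structured).

```python
def check_auth_backends(configured, provided_by, enabled_plugins):
    """Filter configured auth backends and refuse to boot if none is usable.

    configured      -- backend module names, in configured order
    provided_by     -- dict mapping a module to the plugin providing it;
                       modules absent from the dict are treated as core
    enabled_plugins -- set of explicitly enabled plugin names
    """
    kept, skipped = [], []
    for module in configured:
        plugin = provided_by.get(module)
        if plugin is None or plugin in enabled_plugins:
            kept.append(module)
        else:
            # Would be reported with an info message naming the plugin.
            skipped.append((module, plugin))
    if not kept:
        # Mirrors "the boot is aborted" when no backend is valid.
        raise RuntimeError("none of the configured auth backends are usable")
    return kept, skipped


provided_by = {"rabbit_auth_backend_ldap": "rabbitmq_auth_backend_ldap"}
kept, skipped = check_auth_backends(
    ["rabbit_auth_backend_ldap", "rabbit_auth_backend_internal"],
    provided_by,
    enabled_plugins=set(),
)
```

With no plugin enabled, `kept` contains only `rabbit_auth_backend_internal`, which matches the warning example above. Because the check only looks at explicitly enabled plugins, a backend pulled in as a dependency would also land in `skipped`, which is the drawback noted in V2.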

Fixes #13783.
2025-08-12 18:38:28 +02:00
Jean-Sébastien Pédron a8bef770a5
rabbit_plugins: Add `which_plugin/1` to query which plugin provides a module
[Why]
This will be used in a later commit to find the auth backend plugin that
provides a configured auth backend module.

[How]
We go through the list of available plugins, regardless of whether they are
enabled or not, then look up the given module in the list of modules
associated with each plugin's application.
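
A minimal Python sketch of that lookup, for illustration only. The data shape (a dict from plugin name to its module list) and the `None` return for an unknown module are assumptions of this example, not the return type of the Erlang function.

```python
def which_plugin(module, available_plugins):
    """Return the name of the plugin whose application lists `module`.

    available_plugins maps each available plugin (enabled or not) to the
    modules of its application. Returns None when no plugin provides it.
    """
    for plugin, modules in available_plugins.items():
        if module in modules:
            return plugin
    return None


# Hypothetical inventory: the module lists are shortened for the example.
available = {
    "rabbitmq_auth_backend_ldap": ["rabbit_auth_backend_ldap"],
    "rabbitmq_management": ["rabbit_mgmt_app"],
}
```

For example, `which_plugin("rabbit_auth_backend_ldap", available)` yields `"rabbitmq_auth_backend_ldap"`, and a module that belongs to no listed plugin yields `None`.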
2025-08-12 18:38:28 +02:00
Jean-Sébastien Pédron 6d3d297598
rabbit_plugins: Add `list/0` to get the list of plugins
... without having to pass a plugins path.

[Why]
It's painful to have to get the plugins path, then pass it to `list/1`
every time. It's also more difficult to discover how to use
`rabbit_plugins` to get that list of plugins.
2025-08-12 18:38:28 +02:00
Michael Klishin 30d78a490b
Do not log every transient dependency when listing plugins
The code path in question is executed every time
rabbit_plugins:list/2 (e.g. rabbit_plugins:is_enabled/1)
is used, which with some distributed plugins can
happen once or several times a minute.

Given the maturity of the plugins subsystem, we
arguably can drop those messages.
2025-08-12 12:05:48 -04:00
Michal Kuratczyk c34c803754
Remove flake in prometheus_http_SUITE (#14367)
Sometimes the metrics for streams created by `stream_pub_sub_metrics`
would be returned when the next test starts, breaking the assertions.
2025-08-12 15:03:20 +02:00
Michael Klishin 5f69116a64
Merge pull request #14364 from rabbitmq/mk-bump-cuttlefish
Cuttlefish 3.5.0
2025-08-11 14:15:13 -04:00
Michael Klishin 41c25cfa04
Merge pull request #14360 from rabbitmq/fix-amqp10
Shovel: AMQP1.0 use prefetch-count as credit on delete-after
2025-08-11 13:54:52 -04:00
Michael Klishin 7413511195
Cuttlefish 3.5.0
This version forces prefixed binaries
(such as encrypted:TkQbjiVWtUJw3Ed/hkJ5JIsFIyhruKII6uKPXogfvDyMXGH1qQK3hVqshFolLN0S)
to have alphanumeric prefixes ([a-zA-Z0-9_]+).

This allows us to tell a generated password value
with a colon from a tagged binary.

If a value of, say, default_pass or ssl_options.password
cannot be parsed as a tagged value, it will be
parsed as a regular binary, because rabbit.schema
specifies multiple types as supported.
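
The fallback described here can be sketched in Python. This illustrates the rule only and is not Cuttlefish's implementation: `parse_value` and its tuple results are invented for the example, and the sample password is hypothetical.

```python
import re

# A tagged value is "<prefix>:<payload>" where the prefix is limited to
# the character class quoted in the commit message: [a-zA-Z0-9_]+
TAGGED = re.compile(r"^([a-zA-Z0-9_]+):(.*)$", re.DOTALL)


def parse_value(raw):
    """Parse `raw` as a tagged value, falling back to a regular binary."""
    match = TAGGED.match(raw)
    if match:
        return ("tagged", match.group(1), match.group(2))
    return ("binary", raw)


tagged = parse_value(
    "encrypted:TkQbjiVWtUJw3Ed/hkJ5JIsFIyhruKII6uKPXogfvDyMXGH1qQK3hVqshFolLN0S"
)
# Hypothetical generated password containing a colon: the text before the
# colon is not purely alphanumeric, so it is kept as a regular binary.
plain = parse_value("p@ss:word")
```

In this sketch, a password whose text before the first colon happens to be purely alphanumeric would still be read as tagged, so the rule narrows the ambiguity rather than removing it.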

References #14233.
2025-08-11 13:25:36 -04:00
Michael Klishin 2024a4bc77
Merge pull request #14348 from udeeksha30-netizen/local_random_exchange
Add config option for enabling local_random_exchange
2025-08-11 13:23:56 -04:00
David Ansari 2f49f9da08
Permit amqp_filter_set_bug by default (#14361)
This partially reverts
https://github.com/rabbitmq/rabbitmq-server/pull/14245.
This makes 4.2 <-> 3.13 mixed version tests succeed.
We can set this flag to `denied_by_default` in 4.3.
2025-08-11 19:09:56 +02:00
udeeksha30-netizen 781c14035e Addressed requested changes 2025-08-11 09:51:20 -07:00
Diana Parra Corbacho fa66b4eb4c Shovel: AMQP1.0 use prefetch-count as credit on delete-after 2025-08-11 18:32:11 +02:00
udeeksha30-netizen 93cffe72e0 Removed extra comments 2025-08-11 08:12:33 -07:00
udeeksha30-netizen 2afbc5eeb6 Add config option for enabling local_random_exchange 2025-08-11 07:47:08 -07:00
Deeksha 8a323b1888 Add config option for enabling local_random_exchange 2025-08-11 07:41:09 -07:00
Michal Kuratczyk 87099e8eea
[skip ci] Add module name to the FF debug log (#14357)
Without this change, the logs looked confusing:
```
[debug] <0.217.0> Feature flags: application `rabbit` has 1 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 23 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 1 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 1 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 1 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 1 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 1 feature flags (including deprecated features)
[debug] <0.217.0> Feature flags: application `rabbit` has 2 feature flags (including deprecated features)
```

it wasn't clear why the same app was queried multiple times with different results.
2025-08-11 13:36:52 +02:00
Jean-Sébastien Pédron 18b8f9a02d
Move `rabbit_auth*` from rabbit_common to rabbit
[Why]
These modules are not used by amqp_client. Therefore, they shouldn't be
in rabbit_common.
2025-08-11 09:31:40 +02:00
dependabot[bot] 8f0ecb9059
[skip ci] Bump the dev-deps group across 4 directories with 1 update
Bumps the dev-deps group with 1 update in the /deps/rabbit/test/amqp_jms_SUITE_data directory: [org.assertj:assertj-core](https://github.com/assertj/assertj).
Bumps the dev-deps group with 1 update in the /deps/rabbitmq_mqtt/test/java_SUITE_data directory: [org.assertj:assertj-core](https://github.com/assertj/assertj).
Bumps the dev-deps group with 1 update in the /deps/rabbitmq_stream/test/rabbit_stream_SUITE_data directory: [org.assertj:assertj-core](https://github.com/assertj/assertj).
Bumps the dev-deps group with 1 update in the /deps/rabbitmq_stream_management/test/http_SUITE_data directory: [org.assertj:assertj-core](https://github.com/assertj/assertj).


Updates `org.assertj:assertj-core` from 3.27.3 to 3.27.4
- [Release notes](https://github.com/assertj/assertj/releases)
- [Commits](https://github.com/assertj/assertj/compare/assertj-build-3.27.3...assertj-build-3.27.4)

---
updated-dependencies:
- dependency-name: org.assertj:assertj-core
  dependency-version: 3.27.4
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: dev-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-09 18:28:24 +00:00
Deeksha febcdbb1b8 Add config option for enabling local_random_exchange 2025-08-08 10:33:11 -07:00
David Ansari edfd8ffede Assert confirm when responder publishes to reply queue 2025-08-08 16:43:09 +02:00
Arnaud Cogoluègnes 22a959331b
Use advertised TLS host setting in metadata frame
The rabbitmq_stream.advertised_tls_host setting is not used in the
metadata frame of the stream protocol, even if it is set. This commit
makes sure the setting is used if set.

References rabbitmq/rabbitmq-stream-java-client#803
2025-08-08 12:33:52 +00:00
Jean-Sébastien Pédron 0a5024b47e
python_SUITE: Add more debug messages 2025-08-08 10:12:59 +02:00
Jean-Sébastien Pédron 5bfb7bc26f
python_SUITE: Increase unittest verbosity
[Why]
I noticed the following error in a test case:

    error sending frame
    Traceback (most recent call last):
      File "/home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbitmq_stomp/test/python_SUITE_data/src/deps/stomp/transport.py", line 623, in send
        self.socket.sendall(encoded_frame)
    OSError: [Errno 9] Bad file descriptor

When the test suite succeeded, this error was not present. When it failed,
it was present. But I checked only one instance of each; that is not enough
to draw any conclusion about the relationship between this error and the
failing test case later.

I have no idea which test case hits this error, so increase the
verbosity, in the hope we see the name of the test case running at the
time of this error.
2025-08-08 10:12:59 +02:00
Jean-Sébastien Pédron 766ca19ad0
python_SUITE: Wait for the AMQP connection to close in `x_queue_name.py`
[Why]
I still don't know what causes the transient failures in this testsuite.
The AMQP connection is closed asynchronously, therefore the next test
case is running when it finishes closing. I have no idea if it causes
troubles, but it makes the broker logs more difficult to read.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 350bda1081
python_SUITE: Bump Python dependencies to their latest versions 2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 5f520b8820
python_SUITE: Increase a timeout in `test_exchange_dest` and `test_topic_dest`
[Why]
The `test_topic_dest` test case fails from time to time in CI. I don't
know why as there are no errors logged anywhere. Let's assume it's a
timeout a bit too short.

While here, apply the same change to `test_exchange_dest`.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 0e36184a61
cluster_SUITE: Handle error returned by rabbit_ct_broker_helpers
[Why]
It didn't handle them before and crashed later when it assumed the
return value was a list.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron eb8f631e22
proxy_protocol_SUITE: Wait for connection close
[Why]
`gen_tcp:close/1` simply closes the connection and doesn't wait for the
broker to handle it. This sometimes causes the next test to fail
because, in addition to that test's new connection, the previous
connection's process is still around, waiting for the broker to notice
the close.

[How]
We now wait for the connection to be closed at the end of a test case,
and wait for the connection list to have a single element when we want
to query the connection name.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 0601ef4f0f
jwks_SUITE: Wait for connection exit in `test_failed_token_refresh_case2`
[Why]
The connection is about to be killed at the end of the test case. It's
not necessary to close it explicitly.

Moreover, on a slow environment like CI, the connection process might
have already exited when the test case tries to close it. In this case,
it fails with a `noproc` exception.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 02b1561556
auth_SUITE: Wait for connection tracking to be up-to-date
... when testing user limits

[How]
This is the same fix as the one for the vhost limits test case made in
commit 5aab965db4.

While here, fix a compiler warning about an unused variable.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 5c1456b2d6
auth_SUITE: Handle error returned by rabbit_ct_broker_helpers
[Why]
It didn't handle them before and crashed later when it assumed the
return value was a list.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 0a643ef339
feature_flags_v2_SUITE: Catch and log return value of peer:stop/1
[Why]
It failed at least once in CI. It should help us understand what went
on.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron fda663d6d4
amqp10_inter_cluster_SUITE: Log messages and queues length
[Why]
This should also help debug the failures we get in CI.
2025-08-08 10:12:58 +02:00
Jean-Sébastien Pédron 5936b3bb95
amqp10_inter_cluster_SUITE: Use per-test shovel names
[Why]
There is a frequent failure in CI and the fact that all test cases use
the same resource names does not help with debugging.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron efdec84291
amqp10_inter_cluster_SUITE: Wait for queue length to reach expectations
[Why]
Relying on the return value of the queue deletion is fragile because the
policy is cleared asynchronously.

[How]
We now wait for the queues to reach the expected queue length, then we
delete them and ensure the length didn't change.
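
The wait-then-verify pattern described here can be sketched in Python. This is an illustration under assumptions, not the test suite's Erlang code: `wait_until`, `drain_check`, and the `FakeQueue` stand-in are invented for the example, and the queue deletion is assumed to report how many messages it held.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.2fs" % timeout)
        time.sleep(interval)


class FakeQueue:
    """Stand-in for a broker queue that fills up asynchronously."""

    def __init__(self, observed_lengths):
        self._observed = iter(observed_lengths)
        self._length = 0

    def length(self):
        # Each poll sees the next observed length, then the last one forever.
        self._length = next(self._observed, self._length)
        return self._length

    def delete(self):
        # Assumed to report how many messages the queue held when deleted.
        return self._length


def drain_check(queue, expected, timeout=5.0, interval=0.01):
    """Wait for the expected length first, then delete and re-check it."""
    wait_until(lambda: queue.length() == expected, timeout, interval)
    deleted = queue.delete()
    if deleted != expected:
        raise AssertionError("length changed: %d != %d" % (deleted, expected))
    return deleted
```

Polling first means the deletion result only confirms a state already observed, instead of racing with whatever is still settling asynchronously.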
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron bd1978ce9c
amqp_client_SUITE: Load test module on broker before using one of its anonymous functions
[Why]
Before this change, when the `idle_time_out_on_server/1` test case was run first in the
shuffled test group, the test module was not loaded on the remote broker.
When the anonymous function was passed to meck and was executed, we got
the following crash on the broker:

    crasher:
      initial call: rabbit_heartbeat:'-heartbeater/2-fun-0-'/0
      pid: <0.704.0>
      registered_name: []
      exception error: {undef,
                           [{#Fun<amqp_client_SUITE.14.116163631>,
                             [#Port<0.45>,[recv_oct]],
                             []},
                            {rabbit_heartbeat,get_sock_stats,3,
                                [{file,"rabbit_heartbeat.erl"},{line,175}]},
                            {rabbit_heartbeat,heartbeater,3,
                                [{file,"rabbit_heartbeat.erl"},{line,155}]},
                            {proc_lib,init_p,3,
                                [{file,"proc_lib.erl"},{line,317}]},
                            {rabbit_net,getstat,[#Port<0.45>,[recv_oct]],[]}]}

This led to a failure of the test case later, when it waited for a
message from the connection.

We do the same in two other test cases where this is likely to happen
too.

[How]
Loading the module first fixes the problem.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 56b59c3d3e
amqp_client_SUITE: Trim "list_connections" output in one more place
[Why]
The reason is the same as for commit
ffaf919846. It should have been part of it
in fact, so an oversight from my end.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 8bdbb0fc23
mqtt_shared_SUITE: Handle error returned by rabbit_ct_broker_helpers
[Why]
It didn't handle them before and crashed later when it assumed the
return value was a list.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 19ed2493a4
amqp_jms_SUITE: Increase time trap
[Why]
Maven took ages to fetch dependencies at least once in CI. The testsuite
failed because it reached the time trap limit.

[How]
Increase it from 2 to 5 minutes.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 73c663eb5d
rabbit_stream_partitions_SUITE: Fix incorrect log message 2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron ea2689f06a
rabbit_exchange_type_consistent_hash_SUITE: Don't enable a feature flag that never existed
[Why]
The `rabbit_consistent_hash_exchange_raft_based_metadata_store` does not
seem to be a feature flag that ever existed according to the git
history. This causes the test case to always be skipped.

[How]
Simply remove the statement that enables this ghost feature flag.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 832d701f1f
rabbit_exchange_type_consistent_hash_SUITE: Set timetrap to 5 minutes 2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 17feaa158c
rabbit_exchange_type_consistent_hash_SUITE: Open/close connection explicitly
[Why]
In CI, we observe that the channel hangs sometimes.
rabbitmq_ct_client_helpers implicit connection is quite fragile in the
sense that a test case can disturb the next one in some cases.

[How]
Let's use a dedicated connection and see if it fixes the problem.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 267445680f
rabbit_prometheus_http_SUITE: Run `stream_pub_sub_metrics` first
[Why]
I wonder if a previous test interferes with the metrics verified by this
test case. To be safer, execute it first and let's see what happens.
2025-08-08 10:12:57 +02:00
Jean-Sébastien Pédron 2bc8d117b6
rabbit_prometheus_http_SUITE: Log more details for a future failure in CI
[Why]
The `stream_pub_sub_metrics` test failed at least once in CI because the
`rabbitmq_stream_consumer_max_offset_lag` was 4 instead of the expected
3 on line 815.

I couldn't reproduce the problem so far.

[How]
The test case now logs the initial value of that metric at the beginning
of the test function. Hopefully this will give us some clue for the day
it fails again.
2025-08-08 10:12:56 +02:00