To take more frequent checkpoints for large message workloads
Lower the min_checkpoint_interval substantially to allow quorum queues
better control over when checkpoints are taken.
Track bytes enqueued in the aux state and suggest a checkpoint after
every 64MB enqueued (this value is scaled according to backlog just
like the indexes condition).
This should help with more timely checkpointing when very large
messages are used.
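A minimal sketch of the byte-based condition, written in Python for illustration rather than in the Erlang of the actual queue code. Only the 64MB base value comes from the text above; the class name, the `scale` parameter and the way backlog scaling is applied are assumptions.

```python
BASE_CHECKPOINT_BYTES = 64 * 1024 * 1024  # 64MB, from the commit message


class ByteCheckpointTracker:
    """Tracks bytes enqueued since the last checkpoint (hypothetical helper)."""

    def __init__(self, base_bytes=BASE_CHECKPOINT_BYTES):
        self.base_bytes = base_bytes
        self.bytes_since_checkpoint = 0

    def enqueue(self, msg_size, scale=1):
        """Record an enqueue and return True if a checkpoint should be suggested.

        `scale` stands in for the backlog-based scaling; how it is derived
        is an assumption and is not taken from the source.
        """
        self.bytes_since_checkpoint += msg_size
        if self.bytes_since_checkpoint >= self.base_bytes * scale:
            # Reset the counter once a checkpoint has been suggested.
            self.bytes_since_checkpoint = 0
            return True
        return False
```

With very large messages the byte counter crosses the threshold after only a few enqueues, which is the case a purely index-based condition would react to late.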
Try evaluating byte size independently of time window
also increase max size
This commit fixes a bug in the Erlang AMQP 1.0 client.
Prior to this commit, to repro this bug:
1. Send more than 2^16 messages to a queue.
2. Grant more than a total of 2^16 link credit initially (on a single link
or across multiple links) on a single session without any
auto or manual link credit renewal.
The expectation is that thanks to sufficiently granted initial link-credit,
the client will receive all messages.
However, consumption stops after exactly 2^16-1 messages.
That's because the client lib was never sending a flow frame to the server.
So, after the client received all 2^16-1 messages (the initial
incoming-window set by the client), the server's remote-incoming-window
reached 0 causing the server to stop delivering messages.
The expectation is that the client lib automatically handles session
flow control without any manual involvement of the client app.
This commit implements this fix:
* We keep the server's remote-incoming window always large by default as
explained in https://www.rabbitmq.com/blog/2024/09/02/amqp-flow-control#incoming-window
* Hence, the client lib sets its incoming-window to 100,000 initially.
* The client lib tracks its incoming-window decrementing it by 1 for
every transfer it received. (This wasn't done prior to this commit.)
* Whenever this window shrinks below 50,000, the client sends a flow
frame without any link information widening its incoming-window back to 100,000.
* For test cases (maybe later for apps as well), there is a new function
`amqp10_client_session:flow/3`, which allows for a test case to do manual
session flow control. Its API is designed very similar to
`amqp10_client_session:flow_link/4` in that the test can optionally request
the lib to auto widen the session window whenever it falls below a certain threshold.
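The window bookkeeping described in the list above can be sketched as follows. This is an illustrative Python model, not the Erlang client code: the class name and the `send_flow` callback are hypothetical, while the 100,000 and 50,000 values are the ones stated in this commit message.

```python
MAX_INCOMING_WINDOW = 100_000  # initial incoming-window, from the commit message
RENEW_THRESHOLD = 50_000       # widen once the window shrinks below this value


class SessionWindow:
    """Models the client's session incoming-window (hypothetical helper)."""

    def __init__(self, send_flow):
        self.incoming_window = MAX_INCOMING_WINDOW
        # Hypothetical callback that emits a flow frame without link information.
        self._send_flow = send_flow

    def on_transfer(self):
        """Called for every transfer frame received on the session."""
        self.incoming_window -= 1
        if self.incoming_window < RENEW_THRESHOLD:
            # Widen the window back to its maximum and tell the server.
            self.incoming_window = MAX_INCOMING_WINDOW
            self._send_flow(self.incoming_window)
```

Because the window is refreshed long before it reaches 0, the server's view of the remote-incoming-window stays large and delivery is not interrupted.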
This avoids using Mix while compiling, which simplifies
a number of things and lets us do further build improvements
later on.
Elixir is only enabled from within rabbitmq_cli currently.
Eunit is disabled since there are only Elixir tests.
Dialyzer will force-enable Elixir in order to process
Elixir-compiled beam files.
This commit also includes a few changes that are
related:
* The Erlang distribution will now be started for parallel-ct
* Many unnecessary PROJECT_MOD lines have been removed
* `eunit_formatters` has been removed, it provides little value
* The new `maybe_flock` Erlang.mk function is used where possible
* Build test deps when testing rabbitmq_cli (Mix won't do it anymore)
* rabbitmq_ct_helpers now uses the early plugins to have Dialyzer
properly set up
It also happens from time to time that HTTP clients use the wrong port
5672. Like for TLS clients connecting to 5672, RabbitMQ now prints a
more descriptive log message.
For example
```
curl http://localhost:5672
```
will log
```
[info] <0.946.0> accepting AMQP connection [::1]:57736 -> [::1]:5672
[error] <0.946.0> closing AMQP connection <0.946.0> ([::1]:57736 -> [::1]:5672, duration: '1ms'):
[error] <0.946.0> {detected_unexpected_http_header,<<"GET / HT">>}
```
We only check here for GET and not for all other HTTP methods, since
that's the most common case.
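A rough Python sketch of this kind of check on the first 8 bytes read from the socket. The function name and the exact prefix test are assumptions for illustration; the two error reasons are the ones that appear in the logs quoted in this log.

```python
def classify_first_bytes(header: bytes) -> str:
    """Classify the first bytes a client sent to the AMQP port.

    Illustrative only: the real broker matches on Erlang binaries.
    """
    if header.startswith(b"AMQP"):
        # Looks like a regular AMQP protocol header.
        return "amqp"
    if header.startswith(b"GET "):
        # Only GET is checked, since it is the most common case.
        return "detected_unexpected_http_header"
    return "bad_header"
```

For the `curl` example above, the broker sees `GET / HT` as the first 8 bytes, which this sketch maps to `detected_unexpected_http_header`.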
## What?
If a TLS client app is misconfigured and tries to connect to AMQP port 5672
instead of the AMQPS port 5671, this commit makes RabbitMQ log a more
descriptive error message.
```
openssl s_client -connect localhost:5672 -tls1_3
openssl s_client -connect localhost:5672 -tls1_2
```
RabbitMQ logs prior to this commit:
```
[info] <0.1073.0> accepting AMQP connection [::1]:53535 -> [::1]:5672
[error] <0.1073.0> closing AMQP connection <0.1073.0> ([::1]:53535 -> [::1]:5672, duration: '0ms'):
[error] <0.1073.0> {bad_header,<<22,3,1,0,192,1,0,0>>}
[info] <0.1080.0> accepting AMQP connection [::1]:53577 -> [::1]:5672
[error] <0.1080.0> closing AMQP connection <0.1080.0> ([::1]:53577 -> [::1]:5672, duration: '1ms'):
[error] <0.1080.0> {bad_header,<<22,3,1,0,224,1,0,0>>}
```
RabbitMQ logs after this commit:
```
[info] <0.969.0> accepting AMQP connection [::1]:53632 -> [::1]:5672
[error] <0.969.0> closing AMQP connection <0.969.0> ([::1]:53632 -> [::1]:5672, duration: '0ms'):
[error] <0.969.0> {detected_unexpected_tls_header,<<22,3,1,0,192,1,0,0>>}
[info] <0.975.0> accepting AMQP connection [::1]:53638 -> [::1]:5672
[error] <0.975.0> closing AMQP connection <0.975.0> ([::1]:53638 -> [::1]:5672, duration: '1ms'):
[error] <0.975.0> {detected_unexpected_tls_header,<<22,3,1,0,224,1,0,0>>}
```
## Why?
I've seen numerous occurrences in the past few years where misconfigured TLS apps
connected to the wrong port. Therefore, RabbitMQ trying to detect a TLS client
and providing a more descriptive log message seems appropriate to me.
## How?
The first few bytes of any TLS connection are:
* Record Type (1 byte):
  Always 0x16 (22 in decimal) for a Handshake message.
* Version (2 bytes):
  This represents the highest version of TLS that the client supports. Common values:
  * 0x0301 → TLS 1.0 (or SSL 3.1)
  * 0x0302 → TLS 1.1
  * 0x0303 → TLS 1.2
  * 0x0304 → TLS 1.3
* Record Length (2 bytes):
  Specifies the length of the following handshake message.
* Handshake Type (1 byte, usually the 6th byte overall):
  Always 0x01 for ClientHello.
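The byte layout above can be checked with a few comparisons. The following Python sketch is an illustration of the idea, not the broker's Erlang implementation; the function name is hypothetical.

```python
def looks_like_tls_client_hello(header: bytes) -> bool:
    """Return True if the first bytes look like a TLS ClientHello record."""
    if len(header) < 6:
        return False
    record_type = header[0]            # 0x16 for a Handshake message
    major, minor = header[1], header[2]  # 0x0301 .. 0x0304
    handshake_type = header[5]         # 0x01 for ClientHello
    return (
        record_type == 0x16
        and major == 0x03
        and minor in (0x01, 0x02, 0x03, 0x04)
        and handshake_type == 0x01
    )
```

Applied to the bytes from the logs above, `bytes([22, 3, 1, 0, 192, 1, 0, 0])` is recognised as a ClientHello, while an AMQP protocol header is not.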
[Why]
Khepri already manages retries if needed, so we can just use a timeout.
Note that the timeout was already bumped to a more appropriate 5
minutes, which also matches what we had with Mnesia. However, with 10
retries by default, it meant that this timeout at the end of `init/1`
would thus be 5 * 10 = 50 minutes.
* RMQ-1263: Check if queue is protected from deletion inside rabbit_amqqueue:with_delete
Delayed exchange automatically manages associated Delayed Queue. We don't want users to delete it accidentally.
If queue is indeed protected its removal can be forced by calling with
?INTERNAL_USER as ActingUser.
* RMQ-1263: Correct a type spec of amqqueue:internal_owner/1
* RMQ-1263: Add protected queues test
---------
Co-authored-by: Iliia Khaprov <iliia.khaprov@broadcom.net>
Co-authored-by: Michael Klishin <klishinm@vmware.com>
(cherry picked from commit 97f44adfad6d0d98feb1c3a47de76e72694c19e0)
... and cache it.
[Why]
It happens at least in CI that the computed start time varies by a few
seconds. I think this comes from the Erlang time offset which might be
adjusted over time.
This affects peer discovery's sorting of RabbitMQ nodes which uses that
start time to determine the oldest node. When the start time of a node
changes, it could be considered the seed node to join by some nodes but
ignored by the other nodes, leading to troubles with cluster formation.
[Why]
It happens in CI from time to time and it was crashing the channel
process. There is always a `channel.close` method pending in the
channel mailbox.
[How]
For now, log something and ignore the DOWN message. The channel will
exit after handling the pending `channel.close` method anyway.
* Implement rabbitmq-queues leader_health_check command for quorum queues
(cherry picked from commit c26edbef33)
* Tests for rabbitmq-queues leader_health_check command
(cherry picked from commit 6cc03b0009)
* Ensure calling ParentPID in leader health check execution and
reuse and extend formatting API, with amqqueue:to_printable/2
(cherry picked from commit 76d66a1fd7)
* Extend core leader health check tests and update badrpc error handling in cli tests
(cherry picked from commit 857e2a73ca)
* Refactor leader_health_check command validators and ignore vhost arg
(cherry picked from commit 6cf9339e49)
* Update leader_health_check_command description and banner
(cherry picked from commit 96b8bced2d)
* Improve output formatting for healthy leaders and support
silent mode in rabbitmq-queues leader_health_check command
(cherry picked from commit 239a69b404)
* Support global flag to run leader health check for
all queues in all vhosts on local node
(cherry picked from commit 48ba3e161f)
* Return immediately for leader health checks on empty vhosts
(cherry picked from commit 7873737b35)
* Rename leader health check timeout refs
(cherry picked from commit b7dec89b87)
* Update banner message for global leader health check
(cherry picked from commit c7da4d5b24)
* QQ leader-health-check: check_process_limit_safety before spawning leader checks
(cherry picked from commit 17368454c5)
* Log leader health check result in broker logs (if any leaderless queues)
(cherry picked from commit 1084179a2c)
* Ensure check_passed result for leader health internal calls
(cherry picked from commit 68739a6bd2)
* Extend CLI format output to process check_passed payload
(cherry picked from commit 5f5e9922bd)
* Format leader healthcheck result log and function exports
(cherry picked from commit ebffd7d8a4)
* Change leader_health_check command scope from queues to diagnostics
(cherry picked from commit 663fc9846e)
* Update (c) line year
(cherry picked from commit df82f12a70)
* Rename command to check_for_quorum_queues_without_an_elected_leader
and use across_all_vhosts option for global checks
(cherry picked from commit b2acbae28e)
* Use rabbit_db_queue for qq leader health check lookups
and introduce rabbit_db_queue:get_all_by_type_and_vhost/2.
Update leader health check timeout to 5s and process limit
threshold to 20% of node's process_limit.
(cherry picked from commit 7a8e166ff6)
* Update tests: quorum_queue_SUITE and rabbit_db_queue_SUITE
(cherry picked from commit 9bdb81fd79)
* Fix typo (cli test module)
(cherry picked from commit 615856853a)
* Small refactor - simpler final leader health check result return on function head match
(cherry picked from commit ea07938f3d)
* Clear dialyzer warning & fix type spec
(cherry picked from commit a45aa81bd2)
* Ignore result without strict match to avoid dialyzer warning
(cherry picked from commit bb43c0b929)
* 'rabbitmq-diagnostics check_for_quorum_queues_without_an_elected_leader' documentation edits
(cherry picked from commit 845230b0b380a5f5bad4e571a759c10f5cc93b91)
* 'rabbitmq-diagnostics check_for_quorum_queues_without_an_elected_leader' output copywriting
(cherry picked from commit 235f43bad58d3a286faa0377b8778fcbe6f8705d)
* diagnostics check_for_quorum_queues_without_an_elected_leader: behave like a health check w.r.t. error reporting
(cherry picked from commit db7376797581e4716e659fad85ef484cc6f0ea15)
* check_for_quorum_queues_without_an_elected_leader: handle --quiet and --silent
plus simplify function heads.
References #13433.
(cherry picked from commit 7b392315d5e597e5171a0c8196230d92b8ea8e92)
---------
Co-authored-by: Ayanda Dube <adube14@bloomberg.net>
[Why]
This testsuite is very unstable and it is difficult to debug while it is
part of a `parallel-ct` group. It also forced us to re-run the entire
`parallel-ct` group just to retry that one testsuite.
The `buffer` socket option will be changed dynamically
based on how much data is received.
This is restricted to AMQP protocols (old and 1.0).
The algorithm is a little different than Cowboy 2.13.
The moving average is less reactive (div 8 instead of 2)
and floats are used so that using smaller lower buffer
values is possible (otherwise the rounding prevents
increasing buffer sizes). The lower buffer size was
set to 128 as a result.
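A hedged Python sketch of such a moving-average resize step. The "div 8" weighting, the use of floats and the lower bound of 128 come from the text above; the upper bound, the 0.9 and 0.4 thresholds and the doubling/halving steps are assumptions made for illustration and may differ from the real code.

```python
LOW_BUFFER = 128       # lower bound, from the commit message
HIGH_BUFFER = 131072   # upper bound; an assumption for this sketch


def update_buffer(buffer_size, mov_avg, data_len,
                  low=LOW_BUFFER, high=HIGH_BUFFER):
    """Return the new (buffer_size, mov_avg) after receiving data_len bytes."""
    # Less reactive moving average: the new sample weighs 1/8 (float math).
    mov_avg = (mov_avg * 7 + data_len) / 8
    if buffer_size < high and mov_avg > buffer_size * 0.9:
        # Assumed growth rule: double when the average nearly fills the buffer.
        buffer_size = min(buffer_size * 2, high)
    elif buffer_size > low and mov_avg < buffer_size * 0.4:
        # Assumed shrink rule: halve when the buffer is mostly unused.
        buffer_size = max(buffer_size // 2, low)
    return buffer_size, mov_avg
```

Keeping the average as a float means small samples still move it, which is what allows starting from a buffer as small as 128 bytes.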
Compared to the previous behavior, which was effectively to set
`buffer` to `rcvbuf` (often 131072 on Linux, for example),
performance improves slightly in various scenarios for all
message sizes using AMQP-0.9.1, and memory usage is lower
as well. But the difference is small in the benchmarks we
have run (5% to 10%), whereas Cowboy saw a huge improvement
because its default was very small (1460).
For AMQP-1.0 this seems to be no worse but we didn't
detect a clear improvement. We saw scenarios where
small message sizes showed improvement, and large
message sizes showed a regression. But we are even
less confident with these results. David (AMQP-1.0
native developer) ran a few tests and didn't see a
regression.
The dynamic buffer code is currently identical for
old and 1.0 AMQP. But we might tweak them differently
in the future so they're left as duplicate for now.
This is because different protocols have different
behaviors and so the algorithm may need to be tweaked
differently for each protocol.
The `msg` record was used in 3.13. This commit makes 4.x understand
this record for backward compatibility, specifically for the rare case where:
1. a 3.13 node internally parsed a message from a stream via
```
Message = mc:init(mc_amqp, amqp10_framing:decode_bin(Bin), #{})
```
2. published this Message to a queue
3. RabbitMQ got upgraded to 4.x
(This commit can be reverted in some future RabbitMQ version once it's
safe to assume that these upgraded messages have been consumed.)
The changes were manually tested as described in Jira RMQ-1525.
We must consider whether the previous current file is empty
(has data written, but was already removed) when writing
large messages and opening a file specifically for the large
message. If we don't, then the file will never get deleted
as we only consider files for deletion when a message gets
removed (and there are none).
This is only an issue for large messages. Small messages
write a message then roll over to a new file, so there is
at least one valid message. Large messages close the current
file first, regardless of there being a valid message.
The `rabbit_registry` boot step starts up the `rabbit_registry` gen
server from `rabbit_common`. This is a registry somewhat similar to
the feature flag registry - it's meant to protect an ETS table used for
looking up implementers of behaviors. The registry and its ETS table
should be available as early as possible: the step should enable
external_infrastructure rather than require it.
The previous behaviour was to pass solely the message ID, which made
the callback hard to implement for some queue implementations, such as
the priority queue.
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
(cherry picked from commit 1f7a27c51d)
Since https://github.com/rabbitmq/rabbitmq-server/pull/13242 updated
Cowlib to v2.14.0, this commit deletes rabbit_uri as written in the
comments of rabbit_uri.erl:
```
This file is a partial copy of
https://github.com/ninenines/cowlib/blob/optimise-urldecode/src/cow_uri.erl
We use this copy because:
1. uri_string:unquote/1 is lax: It doesn't validate that characters that are
required to be percent encoded are indeed percent encoded. In RabbitMQ,
we want to enforce that proper percent encoding is done by AMQP clients.
2. uri_string:unquote/1 and cow_uri:urldecode/1 in cowlib v2.13.0 are both
slow because they allocate a new binary for the common case where no
character was percent encoded.
When a new cowlib version is released, we should make app rabbit depend on
app cowlib calling cow_uri:urldecode/1 and delete this file (rabbit_uri.erl).
```
Since 4.0.0 (commit d45fbc3d) the shared message store writes large
messages into their own rdq files. This information can be utilised
when scanning rdq files during recovery to avoid reading in the whole
message body into memory unnecessarily.
This commit addresses the same issue that was addressed in 3.13.x by
commit baeefbec (i.e. appending a large binary together from 4MB chunks
leaves a lot of garbage and memory fragmentation behind), but does so
even more efficiently.
Large messages which were written before 4.0.0, which don't fully fill
the rdq file, are still handled as before.
```
make -C deps/rabbit ct-rabbit_stream_queue t=cluster_size_3_parallel_1 RABBITMQ_METADATA_STORE=mnesia
```
flaked prior to this commit locally on Ubuntu with the following error after 11 runs:
```
rabbit_stream_queue_SUITE > cluster_size_3_parallel_1 > consume_from_replica
{error,
{{shutdown,
{server_initiated_close,406,
<<"PRECONDITION_FAILED - stream queue 'consume_from_replica' in vhost '/' does not have a running replica on the local node">>}},
{gen_server,call,
[<0.8365.0>,
{subscribe,
{'basic.consume',0,<<"consume_from_replica">>,
<<"ctag">>,false,false,false,false,
[{<<"x-stream-offset">>,long,0}]},
<0.8151.0>},
infinity]}}}
```
Initialising a message container from data stored in a
stream is a special case where we need to recover exchange
and routing key information from the following message annotations:
* x-exchange
* x-routing-keys
* x-cc
We do not want to do this when initialising a message container
from AMQP data just received from a publisher.
This commit introduces a new function `mc_amqp:init_from_stream/2`
that is to be used when needing a message container from a stream
message.
[Why]
During mixed-version testing, the old node might not be able to join or
rejoin a cluster if the other nodes run a newer Khepri machine version.
[How]
The old node is used as the cluster seed node and is never touched
otherwise. Other nodes are restarted or join the cluster later.
## What?
Support the `dynamic` field of sources and targets.
## Why?
1. This allows AMQP clients to dynamically create exclusive queues, which
can be useful for RPC workloads.
2. Support creation of JMS temporary queues over AMQP using the Qpid JMS
client. Exclusive queues map very nicely to JMS temporary queues
because:
> Although sessions are used to create temporary destinations, this is only
for convenience. Their scope is actually the entire connection. Their
lifetime is that of their connection and any of the connection’s sessions
are allowed to create a consumer for them.
https://jakarta.ee/specifications/messaging/3.1/jakarta-messaging-spec-3.1#creating-temporary-destinations
## How?
If the terminus contains the capability `temporary-queue` as defined in
[amqp-bindmap-jms-v1.0-wd10](https://groups.oasis-open.org/higherlogic/ws/public/document?document_id=67638)
[5.2] and as sent by Qpid JMS client,
RabbitMQ will create an exclusive queue.
(This allows a future commit to take other actions if capability
`temporary-topic` will be used, such as the additional creation of bindings.)
No matter what the desired node properties are, RabbitMQ will set the
lifetime policy delete-on-close deleting the exclusive queue when the
link which caused its creation ceases to exist. This means the exclusive
queue will be deleted if either:
* the link gets detached, or
* the session ends, or
* the connection closes
Although the AMQP JMS Mapping and Qpid JMS create only a **sending** link
with `dynamic=true`, this commit also supports **receiving** links with
`dynamic=true` for non-JMS AMQP clients.
RabbitMQ is free to choose the generated queue name. As suggested by the
AMQP spec, the generated queue name will contain the container-id and link
name unless they are very long.
Co-authored-by: Arnaud Cogoluègnes <acogoluegnes@gmail.com>
Make AMQP 1.0 connection shut down its sessions before sending the
close frame to the client similar to how the AMQP 0.9.1 connection
shuts down its channels before closing the connection.
This commit avoids concurrent deletion of exclusive queues by the session process
and the classic queue process.
This commit should also fix https://github.com/rabbitmq/rabbitmq-server/issues/2596
[Why]
Some testcases used to use node 1 as the clustering seed node. With
mixed-version testing, it could cause issues because node 1 would start
with a new version of Ra compared to node 2 and node 2 could fail to
join.
[How]
By using node 2 as the seed node, node 1 running a newer version of Ra
should be able to join because it supports talking to an older version.
[Why]
The `force_reset` command simply removes local files on disk for the
local node.
In the case of Ra, this can't work because the rest of the cluster does
not know about the forced-reset node. Therefore the leader will continue
to send `append_entry` commands to the reset node.
If that forced-reset node restarts and receives these messages, it will
either join the cluster again (because it's on an older Raft term) or it
will hit an assertion and exit (because it's on the same Raft term).
[How]
Given we can't really support this scenario and it has little value, the
command will now return an error if someone attempts a `force_reset` with
a node running Khepri.
This also deprecates the command: once Mnesia support is removed, the
command will be removed at the same time. This is noted in the
rabbitmqctl.8 manpage.
* Redesigned k8s peer discovery
Rather than querying the Kubernetes API, just check the local node name
and try to connect to the pod with `-0` suffix (or configured
`ordinal_start` value). Only the pod with the lowest ordinal can form
a new cluster - all other pods will wait forever.
This should prevent any race conditions and incorrectly formed clusters.
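To illustrate the idea, here is a small Python sketch that derives the seed node from the local node name. It assumes StatefulSet-style pod names of the form `<name>-<ordinal>` and a node name of the form `rabbit@<pod>.<domain>`; the function name and the example hostnames are hypothetical.

```python
import re


def seed_node(local_node: str, ordinal_start: int = 0) -> str:
    """Return the node name of the pod with the lowest ordinal."""
    name, host = local_node.split("@", 1)
    pod, sep, domain = host.partition(".")
    match = re.fullmatch(r"(.+)-(\d+)", pod)
    if match is None:
        raise ValueError(f"pod name without ordinal suffix: {pod}")
    # Replace the local ordinal with the configured starting ordinal.
    return f"{name}@{match.group(1)}-{ordinal_start}{sep}{domain}"


def is_seed(local_node: str, ordinal_start: int = 0) -> bool:
    """Only the seed pod may form a new cluster; all others must join it."""
    return seed_node(local_node, ordinal_start) == local_node
```

Every pod computes the same seed name without asking the Kubernetes API, so there is no window in which two pods can both decide to form a new cluster.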
This commit contains the following changes:
1. Simplify .NET suite
2. Simplify Java package naming
3. Extract JMS tests into separate suite. This way, it's easier to run,
debug, and add new tests compared to the previous suite which mixed
.NET tests with JMS tests.
4. Add tests for different JMS message types
for the backends that support it in the first place.
When forming a cluster, registration of the node
joining the cluster might be left to (container)
orchestration tools like Nomad or Kubernetes.
This PR adds a new configuration option,
'cluster_formation.registration.enable',
which defaults to true.
When set to false node registration will be skipped.
There is at least one important advantage using a
tool such as Nomad (plus Consul) over the application
(RabbitMQ) doing the registration.
When the application is not stopped gracefully for
any reason, e.g. it's OOM-killed,
it cannot deregister the service/node.
This leaves behind an unlinked service entry in the registry.
This problem is fundamentally avoided by allowing
Nomad (or similar tools) to register the
node's service.
See #11233 and #11045 for prior discussions.
Co-authored-by: Frederik Bosch <f.bosch@genkgo.nl>
As described in section 7.1 of filtex-v1.0-wd09:
> Impose a limit on the complexity of each filter expression.
Here, we hard code the maximum properties within a filter expression to 16.
There should never be a use case requiring to filter on more than 16
different properties.
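A minimal Python sketch of such a complexity limit. The limit of 16 is from the text above; the function name and the choice to reject with an exception are assumptions for illustration.

```python
MAX_FILTER_PROPERTIES = 16  # hard-coded limit, from the commit message


def validate_filter(properties: dict) -> dict:
    """Reject filter expressions that reference too many properties."""
    if len(properties) > MAX_FILTER_PROPERTIES:
        raise ValueError(
            f"filter has {len(properties)} properties, "
            f"maximum is {MAX_FILTER_PROPERTIES}"
        )
    return properties
```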
to match that used with Mnesia.
In the case of Mnesia, there are 10 retries
with a 30 second delay each.
For Khepri, a single timeout is used, so it
must be ten times as long.
from rabbit_fifo version 0.
The same was also implemented for the stream coordinator.
QQ: avoid deadlock in queue federation.
When processing the queue federation startup event, the process
may call back into the ra process, causing a deadlock. In this
case we spawn a temporary process to avoid this.
This offloads the work of reading messages from on-disk segments
to the interacting process rather than doing this blocking, performance
affecting work in the ra server process.
QQ: ensure opened segments are closed after some time of inactivity
Processes that have received messages that had to be read from disk
may keep a segment open indefinitely. This introduces a timer which
after some time of inactivity will close all opened segments to ensure
file descriptors are not kept open indefinitely.
[Why]
When running mixed-version tests, nodes 1/3/5/... are using the primary
umbrella, so usually the newest version. Nodes 2/4/6/... are using the
secondary umbrella, thus the old version.
When clustering, we used to use node 1 (running a new version) as the
seed node, meaning other nodes would join it.
This complicates things with feature flags because we have to make sure
that we start node 1 with new stable feature flags disabled to allow old
nodes to join.
This is also a problem with Khepri machine versions because the cluster
would start with the latest version, which old nodes might not have.
[How]
This patch changes the logic to use a node running the secondary
umbrella as the seed node instead. If there is no node running it, we
pick the first node as before.
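The selection rule can be sketched in a few lines of Python. The function name and the `(name, uses_secondary_umbrella)` tuple layout are hypothetical; the real logic lives in the Erlang test helpers.

```python
def pick_seed_node(nodes):
    """Pick the clustering seed node for a mixed-version test.

    `nodes` is a list of (name, uses_secondary_umbrella) tuples.
    """
    for name, uses_secondary in nodes:
        if uses_secondary:
            # Prefer a node running the old version so newer nodes join it.
            return name
    # No secondary umbrella node: fall back to the first node as before.
    return nodes[0][0]
```

With nodes 1/3/5 on the primary umbrella and 2/4/6 on the secondary one, this picks node 2 as the seed.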
V2: Revert part of "rabbitmq_ct_helpers: Fix how we set
`$RABBITMQ_FEATURE_FLAGS` in tests" (commit
57ed962ef6). These changes are no
longer needed with the new logic.
V3: The check that verifies that the correct metadata store is used has
a special case for nodes that use the secondary umbrella: if Khepri
is supposed to be used but it's not, the feature flag is enabled.
The reason is that the `v4.0.x` branch doesn't know about the `rel`
configuration of `forced_feature_flags_on_init`. The nodes will
have ignored this parameter and booted with the stable feature
flags only.
Many testsuites are adapted to the new clustering order. If they
manage which node joins which node, either the order is changed in
the testcases, or nodes are started with only required feature
flags. For testsuites that rely on peer discovery where the order is
unknown, nodes are started with only required feature flags.
[How]
1. Use feature flags correctly: the code shouldn't test if a feature
flag is enabled, assuming something else enabled it. It should enable
it and react to an error.
2. Use `close_connection_sync/1` instead of the asynchronous
`amqp10_client:close_connection/1` to make sure they are really
closed. The wait in `end_per_testcase/2` was not enough apparently.
3. For the two testcases that flake the most for me, enclose the code in
a try/after and make sure to close the connection at the end,
regardless of the result. This should be done for all testcases
because the testgroup uses a single set of RabbitMQ nodes for all
testcases, therefore testcases are supposed to clean up after them...
This commit is no change in functionality and mostly deletes dead code.
1. Code targeting Erlang 22 and below is deleted since the minimum
required Erlang version is higher nowadays.
"In OTP 23 distribution flag DFLAG_BIG_CREATION became mandatory. All
pids are now encoded using NEW_PID_EXT, even external pids received
as PID_EXT from older nodes."
https://www.erlang.org/doc/apps/erts/erl_ext_dist.html#new_pid_ext
2. All v1 encoding and decoding of the Pid is deleted since the lower
version RabbitMQ node supports the v2 encoding nowadays.
When a leader changes all enqueuer and consumer processes are notified
from the `state_enter(leader,` callback. However a new leader may not
yet have applied all commands that the old leader had. If any of those
commands is a checkout or a register_enqueuer command these processes
will not be notified of the new leader and thus may never resend their
pending commands.
The new leader will however send an applied notification when it does
apply these entries and these are always sent from the leader process
so can also be used to trigger pending resends. This commit implements
that.
## What?
This commit fixes #13040.
Prior to this commit, exchange federation crashed if the MQTT topic exchange
(`amq.topic` by default) got federated and MQTT 5.0 clients subscribed on the
downstream. That's because the federation plugin sends bindings from downstream
to upstream via AMQP 0.9.1. However, binding arguments containing Erlang record
`mqtt_subscription_opts` (henceforth binding args v1) cannot be encoded in AMQP 0.9.1.
## Why?
Federating the MQTT topic exchange could be useful for warm standby use cases.
## How?
This commit makes binding arguments a valid AMQP 0.9.1 table (henceforth
binding args v2).
Binding args v2 can only be used if all nodes support it. Hence binding
args v2 comes with feature flag `rabbitmq_4.1.0`. Note that the AMQP
over WebSocket
[PR](https://github.com/rabbitmq/rabbitmq-server/pull/13071) already
introduces this same feature flag. Although the feature flag subsystem
supports plugins to define their own feature flags, and the MQTT plugin
defined its own feature flags in the past, reusing feature flag
`rabbitmq_4.1.0` is simpler.
This commit also avoids database migrations for both Mnesia and Khepri
if feature flag `rabbitmq_4.1.0` gets enabled. Instead, it's simpler to
migrate binding args v1 to binding args v2 at MQTT connection establishment
time if the feature flag is enabled. (If the feature flag is disabled at
connection establishment time, but gets enabled during the connection
lifetime, the connection keeps using bindings args v1.)
This commit adds two new suites:
1. `federation_SUITE` which tests that federating the MQTT topic
exchange works, and
2. `feature_flag_SUITE` which tests the binding args migration from v1 to v2.
msg_store_io_batch_size is no longer used
msg_store_credit_disc_bound appears to be used in the code, but I don't
see any impact of that value on the performance. It should be properly
investigated and either removed completely or fixed, because there's
hardly any point in warning about the values configured
(plus, this setting is hopefully almost never used anyway)
According to the `rabbit_backing_queue` behaviour it must always
return `ok`, but it used to return a list of results one for each
priority. That caused the below crash further up the call chain.
```
> rabbit_classic_queue:delete_crashed(Q)
** exception error: no case clause matching [ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok]
in function rabbit_classic_queue:delete_crashed/2 (rabbit_classic_queue.erl, line 516)
```
Other backing_queue implementations (`rabbit_variable_queue`) just
exit with a badmatch upon error.
This (very minor) issue is present since 3.13.0 when
`rabbit_classic_queue:delete_crashed_in_backing_queue/1` was
introduced with Khepri in commit 5f0981c5. Before that the result of
`BQ:delete_crashed/1` was simply ignored.
Include monitored session pids in format_status/1 of rabbit_amqp_writer.
They could be useful when debugging.
The maximum number of sessions per connection is limited, hence the
output won't be too large.
[Why]
In order to make `khepri_db` the default in the future, the handling of
`$RABBITMQ_FEATURE_FLAGS` had to be adapted to be able to *disable*
Khepri instead.
Unfortunately I broke the behavior with stable feature flags that are
only available in the primary umbrella. In this case, they were
automatically enabled and thus, clustering with an old umbrella that did
not have these feature flags failed with `incompatible_feature_flags`.
[How]
The solution is to always use an absolute list of feature flags, not the
new relative list.
V2: Allow a testsuite to skip the configuration of the metadata store.
This is needed for the feature_flags_SUITE testsuite because it
tests the default behavior and the configuration of the metadata
store changes that behavior.
While here, fix a ct log message where variables were swapped
compared to the format string expectation.
V3: Enable `rabbitmq_4.0.0` feature flag in rabbit_mgmt_http_SUITE. This
testsuite apparently requires it and if it's not enabled, it fails.
Without this change, consumers using protocols other than the stream
protocol would display as inactive in the Management UI/API and CLI
commands, even though they were receiving messages.
Accidental "fat finger" virtual host deletions
would be easier to avoid if there was a protection mechanism
that would apply equally even to CLI tools and external
applications that do not use confirmations for deletion
operations.
This introduces the following changes:
* Virtual host metadata now supports a new key,
'protected_from_deletion', which, when set,
will be considered by key virtual host deletion function(s)
* DELETE /api/vhosts/{name} was adapted to handle
such blocked deletion attempts to respond with
a 412 Precondition Failed status
* 'rabbitmqctl list_vhosts' and 'rabbitmqctl delete_vhost'
were adapted accordingly
* DELETE /api/vhosts/{name}/deletion/protection
is a new endpoint that can be used to remove
the protective seal (the metadata key)
* POST /api/vhosts/{name}/deletion/protection
marks the virtual host as protected
In the case of the HTTP API, all operations on
virtual host metadata require administrative
privileges from the target user.
Other considerations:
* When a virtual host does not exist, the behavior
remains the same: the original, protection-unaware
code path is used to preserve backwards compatibility
References #12772.
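The protection check can be modelled with a small in-memory sketch. This is
Python for illustration only, not the actual Erlang implementation: the
`vhosts` dict, the function names and the returned labels are hypothetical;
only the `protected_from_deletion` key and the 412 status come from the
description above.

```python
# Hypothetical in-memory model of deletion protection (not the real code).
PROTECTION_KEY = "protected_from_deletion"

def protect(vhosts, name):
    # Models POST /api/vhosts/{name}/deletion/protection
    vhosts[name][PROTECTION_KEY] = True

def unprotect(vhosts, name):
    # Models DELETE /api/vhosts/{name}/deletion/protection
    vhosts[name].pop(PROTECTION_KEY, None)

def original_delete(vhosts, name):
    # Stand-in for the original, protection-unaware code path.
    return "deleted" if vhosts.pop(name, None) is not None else "not_found"

def delete_vhost(vhosts, name):
    # Models DELETE /api/vhosts/{name}
    metadata = vhosts.get(name)
    if metadata is not None and metadata.get(PROTECTION_KEY):
        return "precondition_failed"  # reported as 412 by the HTTP API
    # Unprotected or non-existent vhost: behavior is unchanged.
    return original_delete(vhosts, name)
```

A protected virtual host has to be unprotected first, and only then does a
deletion attempt go through.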
The following scenario led to a channel crash:
1. Publish to a non-existing stream: `perf-test -y 0 -p -e amq.default -t direct -k stream`
2. Declare the stream: `rabbitmqadmin declare queue name=stream queue_type=stream`
There is no pid yet, so we got a function_clause with `none`
```
{function_clause,
[{osiris_writer,write,
[none,<0.877.0>,<<"<0.877.0>_-65ZKFz18ll5lau0phi7CsQ">>,1,
[[0,"Sp",[192,6,5,"B@@AC"]],
[0,"Sr",
[193,38,4,
[[[163,10,<<"x-exchange">>],[161,0,<<>>]],
[[163,13,<<"x-routing-key">>],[161,6,<<"stream">>]]]]],
[0,"Su",[160,12,[<<0,19,252,1,0,0,98,171,20,16,108,167>>]]]]],
[{file,"src/osiris_writer.erl"},{line,158}]},
{rabbit_stream_queue,deliver0,4,
[{file,"rabbit_stream_queue.erl"},{line,540}]},
{rabbit_stream_queue,'-deliver/3-fun-0-',4,
[{file,"rabbit_stream_queue.erl"},{line,526}]},
{lists,foldl,3,[{file,"lists.erl"},{line,2146}]},
{rabbit_queue_type,'-deliver0/4-fun-5-',5,
[{file,"rabbit_queue_type.erl"},{line,707}]},
{maps,fold_1,4,[{file,"maps.erl"},{line,860}]},
{rabbit_queue_type,deliver0,4,
[{file,"rabbit_queue_type.erl"},{line,704}]},
{rabbit_queue_type,deliver,4,
[{file,"rabbit_queue_type.erl"},{line,662}]}]}
```
Co-authored-by: Karl Nilsson <kjnilsson@gmail.com>
[Why]
Up to RabbitMQ 3.13.x, there was a case where if:
1. you enabled a plugin
2. you enabled its feature flags
3. you disabled the plugin
4. you restarted a node (or upgraded it)
... the node could crash on startup because it had a feature flag marked
as enabled that it didn't know about:
error:{badmatch,#{feature_flags => ...
rabbit_ff_controller:-check_one_way_compatibility/2-fun-0-/3, line 514
lists:all_1/2, line 1520
rabbit_ff_controller:are_compatible/2, line 496
rabbit_ff_controller:check_node_compatibility_task1/4, line 437
rabbit_db_cluster:check_compatibility/1, line 376
This was "fixed" by the new way of keeping the registry in memory
(#10988) because it introduces a slight change of behavior. Indeed, the
old way walked through the `FeatureFlags` map and looked up the state in
the `FeatureStates` map to create the `is_enabled/1` function. The new
way just looks up the state in `FeatureStates`.
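The difference between the two lookups can be sketched as follows. This is a
Python model for illustration, not the Erlang registry code: the two dicts
stand in for the `FeatureFlags` and `FeatureStates` maps, and the function
and flag names are hypothetical.

```python
# Hypothetical model of the two registry strategies (not the real code).

def enabled_flags_old(feature_flags, feature_states):
    # Old way: walk the known feature flags and look up each state.
    # A flag recorded as enabled but unknown to this node is not reported.
    return {name for name in feature_flags if feature_states.get(name, False)}

def enabled_flags_new(feature_states):
    # New way: only consult the recorded states.
    return {name for name, state in feature_states.items() if state}

# The plugin that provided `flag_b` is disabled, so the flag is unknown,
# but its state is still recorded as enabled.
known = {"flag_a": {}}
states = {"flag_a": True, "flag_b": True}
```

With these inputs, the old lookup does not report `flag_b` while the new one
does.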
[How]
The new testcase succeeds on 4.0.x and `main`, but would fail on 3.13.x
with the aforementioned crash.
[Why]
The feature flag controller that is responsible for enabling a feature
flag may be on a node that doesn't know this feature flag. This is
supported, but there is a bug when it queries the callback definition for
that feature flag: it uses its own registry which does not have anything
about this feature flag.
This leads to a crash because the `run_callback/5` function tries to use
the `undefined` atom returned by the registry as a map:
crasher:
initial call: rabbit_ff_controller:init/1
pid: <0.374.0>
registered_name: rabbit_ff_controller
exception error: bad map: undefined
in function rabbit_ff_controller:run_callback/5
in call from rabbit_ff_controller:do_enable/3 (rabbit_ff_controller.erl, line 1244)
in call from rabbit_ff_controller:update_feature_state_and_enable/2 (rabbit_ff_controller.erl, line 1180)
in call from rabbit_ff_controller:enable_with_registry_locked/2 (rabbit_ff_controller.erl, line 1050)
in call from rabbit_ff_controller:enable_many_locked/2 (rabbit_ff_controller.erl, line 991)
in call from rabbit_ff_controller:enable_many/2 (rabbit_ff_controller.erl, line 979)
in call from rabbit_ff_controller:updating_feature_flag_states/3 (rabbit_ff_controller.erl, line 307)
in call from gen_statem:loop_state_callback/11 (gen_statem.erl, line 3735)
[How]
The callback definition is now queried from the first node in the list
given as argument. For the common use case where all nodes know about a
feature flag, the first node is the local one, so there should be no
latency caused by the RPC.
See #12963.
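The lookup change can be sketched as follows. This is a Python model for
illustration only: `query_callbacks`, `fake_rpc`, the node names and the
registry contents are all hypothetical and do not mirror the Erlang API.

```python
# Hypothetical model: fetch the callback definition from the first node in
# the list instead of from the controller's own registry.

def query_callbacks(nodes, feature_name, rpc):
    if not nodes:
        raise ValueError("at least one node is required")
    return rpc(nodes[0], feature_name)

# Fake per-node registries: the controller node does not know `my_flag`.
registries = {
    "controller@host": {},
    "node1@host": {"my_flag": {"enable": "my_mod:enable_cb"}},
}

def fake_rpc(node, feature_name):
    # Returns None when the node's registry has nothing about the flag,
    # which is the analogue of the `undefined` atom in the crash above.
    return registries[node].get(feature_name)
```

Querying `node1@host` first yields a usable definition, whereas the
controller's own registry would have returned nothing.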
[Why]
Once `khepri_db` is enabled by default, we need another way to disable it
to select Mnesia instead.
[How]
We use the new relative forced feature flags mechanism to indicate if we
want to explicitly enable or disable `khepri_db`. This way, we don't
touch other stable feature flags and only mess with Khepri.
However, this mechanism is not supported by RabbitMQ 4.0.x and older.
They will ignore the setting. Therefore, to make this work in
mixed-version testing, we set the `$RABBITMQ_FEATURE_FLAGS` variable for
the secondary umbrella. This part will go away once we test against
RabbitMQ 4.1.x as the secondary umbrella in the future.
At the end, we compare the effective metadata store to the expected one.
If they don't match, we skip the test.
While here, change `rjms_topic_selector_SUITE` to only choose Khepri
without specifying any feature flags.
If handle_tick is called before the machine has finished the upgrade
process, it could receive an old overview format (stats tuple vs map).
Let's ignore it and the next handle tick should be fine.
Unlikely to happen in production, detected on CI with a very low tick timeout
Fixes #12933
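The guard amounts to a type check on the overview. A minimal sketch in
Python, for illustration only: a dict stands in for the new map format, a
tuple for the old stats tuple, and the returned labels are hypothetical.

```python
# Hypothetical model of the tick handler's guard (not the real code).

def handle_tick(overview):
    # New format is a map (a dict here).
    if isinstance(overview, dict):
        return ("emit_stats", overview)
    # Old stats tuple seen before the upgrade finished: skip this tick,
    # the next one should carry the new format.
    return ("ignore", None)
```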
The assumption that `x-last-death-*` annotations must have been set
whenever the `deaths` annotation is set was wrong.
Reproduction steps, Option 1:
1. In v3.13.7, dead letter a message from Q1 to Q2 (both can be classic queues).
2. Re-publish the message including its x-death header from Q2 back to Q1.
(RabbitMQ 3.13.7 will interpret this x-death header and set the deaths annotation.)
3. Upgrade to v4.0.4
4. Dead lettering the message from Q1 to Q2 will cause the following crash:
```
crasher:
initial call: rabbit_amqqueue_process:init/1
pid: <0.577.0>
registered_name: []
exception exit: {{badkey,<<"x-last-death-exchange">>},
[{mc,record_death,4,[{file,"mc.erl"},{line,410}]},
{rabbit_dead_letter,publish,5,
[{file,"rabbit_dead_letter.erl"},{line,38}]},
{rabbit_amqqueue_process,'-dead_letter_msgs/4-fun-0-',
7,
[{file,"rabbit_amqqueue_process.erl"},{line,1060}]},
{rabbit_variable_queue,'-ackfold/4-fun-0-',3,
[{file,"rabbit_variable_queue.erl"},{line,655}]},
{lists,foldl,3,[{file,"lists.erl"},{line,2146}]},
{rabbit_variable_queue,ackfold,4,
[{file,"rabbit_variable_queue.erl"},{line,652}]},
{rabbit_priority_queue,ackfold,4,
[{file,"rabbit_priority_queue.erl"},{line,309}]},
{rabbit_amqqueue_process,
'-dead_letter_rejected_msgs/3-fun-0-',5,
[{file,"rabbit_amqqueue_process.erl"},
{line,1038}]}]}
```
Reproduction steps, Option 2:
1. Run a 4.0.4 / 3.13.7 mixed version cluster where both queues Q1 and Q2
are hosted on the 4.0.4 node.
2. Send a message to Q1 which dead letters to Q2.
3. Re-publish a message with the x-death AMQP 0.9.1 header from Q2 to
Q1. However, this time make sure to publish to the 3.13.7 node which
forwards this message to Q1 on the 4.0.4 node.
4. Subsequently dead lettering this message from Q1 to Q2 (happening on
the 4.0.4 node) will also cause the crash.
The modified test case in this commit was able to repro this crash via
Option 2 in the mixed version cluster tests on the `v4.0.x` branch.
As the de-duplication plugin is the only adopter of the `is_duplicate`
callback, we now use a simpler signature.
When a message is deemed duplicated, we discard it and re-route it to
dead letter exchange.
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
(cherry picked from commit f93baa35cb)
`is_duplicate` callback signature was changed in order to support both
the mirroring queues as well as the de-duplication ones.
As the mirroring queues are now deprecated and removed, we can fall
back to a simpler boolean as return value.
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
(cherry picked from commit c927446e17)
Prior to this commit, when the sending client overshot RabbitMQ's incoming-window
(which is allowed in the event of a cluster wide memory or disk alarm),
and RabbitMQ sent a FLOW frame to the client, RabbitMQ sent a negative
incoming-window field in the FLOW frame causing the following crash in
the writer proc:
```
crasher:
initial call: rabbit_amqp_writer:init/1
pid: <0.19353.0>
registered_name: []
exception error: bad argument
in function iolist_size/1
called as iolist_size([<<112,0,0,23,120>>,
[82,-15],
<<"pÿÿÿü">>,<<"pÿÿÿÿ">>,67,
<<112,0,0,23,120>>,
"Rª",64,64,64,64])
*** argument 1: not an iodata term
in call from amqp10_binary_generator:generate1/1 (amqp10_binary_generator.erl, line 141)
in call from amqp10_binary_generator:generate1/1 (amqp10_binary_generator.erl, line 88)
in call from amqp10_binary_generator:generate/1 (amqp10_binary_generator.erl, line 79)
in call from rabbit_amqp_writer:assemble_frame/3 (rabbit_amqp_writer.erl, line 206)
in call from rabbit_amqp_writer:internal_send_command_async/3 (rabbit_amqp_writer.erl, line 189)
in call from rabbit_amqp_writer:handle_cast/2 (rabbit_amqp_writer.erl, line 110)
in call from gen_server:try_handle_cast/3 (gen_server.erl, line 1121)
```
This commit fixes this crash by maintaining a floor of zero for
incoming-window in the FLOW frame.
Fixes #12816
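The `incoming-window` field of a FLOW frame is an unsigned integer in
AMQP 1.0, so a negative value cannot be encoded. The floor can be sketched as
follows, in Python for illustration only: the function and parameter names
are hypothetical and do not come from `rabbit_amqp_session`.

```python
# Hypothetical model of the window computation (not the real code).

def flow_incoming_window(max_incoming_window, transfers_received):
    # When the sender overshoots the window, the difference goes negative.
    window = max_incoming_window - transfers_received
    # Keep a floor of zero so the field stays a valid unsigned integer.
    return max(0, window)
```

For example, a window of 400 with 415 transfers received is advertised as 0
rather than -15.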
The credit_flow between publishing AMQP 0.9.1 channel (or MQTT
connection) and (non-mirrored) classic queue processes was
unintentionally removed in 4.0 together with anything else related to
CQ mirroring.
By default we restore the 3.x behaviour for non-mirrored classic
queues. It is possible to disable flow-control (the earlier 4.0.x
behaviour) with the new env `classic_queue_flow_control`. In 3.x this
was possible with the config `mirroring_flow_control`.
(cherry picked from commit d65bd7d07a)
[Why]
The code assumed that the transaction would always succeed. It was kind
of the case with Mnesia because it would throw an exception if it
failed.
Khepri returns an error instead. The code has to handle it. In
particular, we see timeouts in CI and before this patch, they caused a
crash because the list comprehension was asked to work on a tuple.
[How]
We now retry a few times for 10 seconds.
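The retry loop can be sketched as follows. This is Python for illustration
only, not the actual patch: the `("ok", ...)` / `("error", ...)` tuples model
the return values, and the 0.5 second delay between attempts is an
assumption; only the 10 second budget comes from this message.

```python
import time

# Hypothetical model of a time-bounded retry (not the real code).

def run_with_retries(transaction, timeout=10.0, delay=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    deadline = clock() + timeout
    while True:
        result = transaction()
        if result[0] == "ok":
            return result
        # The transaction returned an error (e.g. a timeout) instead of
        # throwing; give up once the time budget is spent.
        if clock() >= deadline:
            return result
        sleep(delay)
```

The caller still has to handle the final error tuple if every attempt fails
within the time budget.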
[Why]
We pin a version of Horus even if we don't use it directly (it is a
dependency of Khepri). But currently, we can't update Khepri while still
needing the fix in Horus 0.3.1.
Horus 0.3.1 works around a crash in `cover` that mostly affects CI for
now.
This pinning will have to go away with the next update of Khepri.
[Why]
The `ra:member_add/3` call returns before the change is committed. This
is ok for that addition but any follow-up changes to the cluster might
be rejected with the `cluster_change_not_permitted` error.
[How]
Instead of changing other places to wait or retry their cluster
membership change, this patch waits for the current add to be applied
before proceeding and returning.
This fixes some transient failures in CI where such follow-up changes
are rejected and not retried, leaving the cluster in an unexpected state
for the testcase.
An example is with
`quorum_queue_SUITE:force_shrink_member_to_current_member/1`
This check fails on a virgin node, because the metadata store
is not yet ready to handle the query. However, a virgin
node by definition can't have any queues, so let's just return
false without asking.
[Why]
In CI, we observe some timeouts in the Erlang distribution connections
between the temporary hidden node and the nodes it queries. This affects
peer discovery obviously.
[How]
We introduce some query retries to reduce the risk of an incomplete
query.
While here, we move the sorting of queried nodes from the
`query_node_props2/3` last clause (executed in the temporary hidden
node) to the function setting the temporary hidden node and asking for
these queries. This way the debug messages from that sorting are logged
by RabbitMQ out of the box.
[Why]
This impacts what is reported by the catch because it caught exceptions
emitted by code supposedly called later. An example is the assert
in `query_node_props2/3` last clause.
[Why]
This was the first solution put in place to prevent the temporary
hidden node from connecting to the node that started it to write any printed
messages. Because of this, the nodes that the temporary hidden node
queried found out about the parent node and they opened an Erlang
distribution connection to it. This polluted the known nodes list.
However later, the temporary hidden node was started with the
`standard_io` connection option. This prevented the temporary hidden
node from knowing about the node that started it, solving the problem in
a cleaner way.
[How]
This commit garbage-collects that piece of code that is now useless. It
makes the query code way simpler to understand.
[Why]
That timer was started during boot and continued regardless of whether
`rabbit` was running or stopped.
This caused the reconciliation to crash if the `rabbit` app was stopped
before it ended because it tried to access the database even though
it was stopped or even reset.
[How]
We just check if `rabbit` is running before running one reconciliation
and scheduling a new one.
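The guard can be sketched as follows. This is Python for illustration only:
the three callables are hypothetical stand-ins for the "is `rabbit` running"
check, the reconciliation itself and the timer re-arming.

```python
# Hypothetical model of the guarded reconciliation tick (not the real code).

def reconcile_tick(is_rabbit_running, reconcile, schedule_next):
    if not is_rabbit_running():
        # `rabbit` is stopped (or reset): do not touch the database and
        # do not schedule another run.
        return False
    reconcile()
    schedule_next()
    return True
```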
This fixes erlang_ls's header resolution. Previously it would confuse
the include_lib of the `khepri.hrl` from Khepri with this header in
the rabbit app.
This header is also specific to how rabbit uses Khepri so I think the
new name fits better.
Introduce a single place in the AMQP 1.0 Erlang client that infers the AMQP 1.0 type.
Erlang integers are inferred to be AMQP type `long` to avoid overflow surprises.