Commit Graph

3127 Commits

Author SHA1 Message Date
Karl Nilsson 0548ffa6fa raft info 2025-01-20 17:00:58 +00:00
Arnaud Cogoluègnes 1886caec0a
Merge pull request #13092 from rabbitmq/stream-consumer-cancel-event
Emit cancellation event only when stream consumer is cancelled
2025-01-17 14:13:08 +00:00
Arnaud Cogoluègnes 69d0382dd2
Emit cancellation event only when stream consumer is cancelled
Not when the channel or the connection is closed.

References #13085, #9356
2025-01-17 14:42:40 +01:00
Michal Kuratczyk 3ff90deb7c
Merge pull request #13069 from rabbitmq/update-startup-checks
Remove deprecated/unused/old startup checks
2025-01-17 14:28:50 +01:00
Michal Kuratczyk 14171fb035
Remove msg_store_io_batch_size and msg_store_credit_disc_bound checks
msg_store_io_batch_size is no longer used

msg_store_credit_disc_bound appears to be used in the code, but I don't
see any impact of that value on the performance. It should be properly
investigated and either removed completely or fixed, because there's
hardly any point in warning about the values configured
(plus, this setting is hopefully almost never used anyway)
2025-01-17 13:38:43 +01:00
Michal Kuratczyk 954b861db7
Don't warn about dirty I/O scheduler count 2025-01-16 17:42:13 +01:00
Péter Gömöri efd4e45ed8 Fix return value of `rabbit_priority_queue:delete_crashed/1`
According to the `rabbit_backing_queue` behaviour it must always
return `ok`, but it used to return a list of results one for each
priority. That caused the below crash further up the call chain.

```
> rabbit_classic_queue:delete_crashed(Q)
** exception error: no case clause matching [ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok]
     in function  rabbit_classic_queue:delete_crashed/2 (rabbit_classic_queue.erl, line 516)
```

Other backing_queue implementations (`rabbit_variable_queue`) just
exit with a badmatch upon error.

This (very minor) issue has been present since 3.13.0, when
`rabbit_classic_queue:delete_crashed_in_backing_queue/1` was
introduced with Khepri in commit 5f0981c5. Before that the result of
`BQ:delete_crashed/1` was simply ignored.
2025-01-16 17:34:19 +01:00
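Note on efd4e45ed8: the contract violation described in that commit can be sketched in a few lines. This is a minimal Python illustration, not the Erlang patch; `delete_one`, `delete_crashed_buggy`, `delete_crashed_fixed` and `caller` are hypothetical names, and the eleven priorities simply mirror the eleven `ok` atoms in the quoted crash.

```python
from typing import List


def delete_one(priority: int) -> str:
    # Hypothetical stand-in for one per-priority backing queue deletion.
    return "ok"


def delete_crashed_buggy(priorities: List[int]) -> List[str]:
    # Returns one result per priority, which breaks the "always ok" contract.
    return [delete_one(p) for p in priorities]


def delete_crashed_fixed(priorities: List[int]) -> str:
    # Run every per-priority deletion, fail loudly on an unexpected result,
    # then collapse everything into a single "ok".
    for p in priorities:
        result = delete_one(p)
        if result != "ok":
            raise RuntimeError(f"unexpected result for priority {p}: {result!r}")
    return "ok"


def caller(result) -> str:
    # Mirrors a caller further up the chain that only has a clause for "ok".
    if result == "ok":
        return "deleted"
    raise ValueError(f"no case clause matching {result!r}")
```

With the list-returning variant, `caller` fails in the same way as the `no case clause matching` error quoted in the commit; with the collapsed return value it succeeds.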
Arnaud Cogoluègnes 114a5c220f
Delete stream consumer metrics when AMQP 091 connection closes (#13085)
To avoid rogue consumer records.
2025-01-16 15:40:06 +01:00
Karl Nilsson e7c624dd46 QQ: improve fifo client log message on leader change
to capture the number of pending commands that will be resent
2025-01-16 09:36:23 +00:00
David Ansari 290889b936 Include sessions in format_status/1
Include monitored session pids in format_status/1 of rabbit_amqp_writer.
They could be useful when debugging.
The maximum number of sessions per connection is limited, hence the
output won't be too large.
2025-01-16 10:06:24 +01:00
Jean-Sébastien Pédron 57ed962ef6
rabbitmq_ct_helpers: Fix how we set `$RABBITMQ_FEATURE_FLAGS` in tests
[Why]
In order to make `khepri_db` the default in the future, the handling of
`$RABBITMQ_FEATURE_FLAGS` had to be adapted to be able to *disable*
Khepri instead.

Unfortunately I broke the behavior with stable feature flags that are
only available in the primary umbrella. In this case, they were
automatically enabled and thus, clustering with an old umbrella that did
not have these feature flags failed with `incompatible_feature_flags`.

[How]
The solution is to always use an absolute list of feature flags, not the
new relative list.

V2: Allow a testsuite to skip the configuration of the metadata store.
    This is needed for the feature_flags_SUITE testsuite because it
    tests the default behavior and the configuration of the metadata
    store changes that behavior.

    While here, fix a ct log message where variables were swapped
    compared to the format string expectation.

V3: Enable `rabbitmq_4.0.0` feature flag in rabbit_mgmt_http_SUITE. This
    testsuite apparently requires it and if it's not enabled, it fails.
2025-01-15 20:43:41 +01:00
Michal Kuratczyk a4634d3f70
Allow InitialCredit/MoreCreditAfter of zero (#13067)
https://github.com/rabbitmq/rabbitmq-server/pull/13046
introduced additional checks which prevent setting
`{credit_flow_default_credit,{0,0}}`.

Setting credits to zero allows disabling the credit flow mechanism
(we use it in our benchmarks and mention it, for example, in
https://www.rabbitmq.com/blog/2023/03/21/native-mqtt)
2025-01-14 16:12:05 +01:00
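Note on a4634d3f70 and the related `credit_flow_default_credit` commits (#13046, #13055, #13067): taken together, their subjects suggest the accepted values are a pair of non-negative integers in which `MoreCreditAfter` does not exceed `InitialCredit`, with `{0,0}` allowed. The Python sketch below encodes that reading only; the real check is Erlang code in the server, and the function name and error messages here are hypothetical.

```python
def validate_credit_flow_default_credit(value):
    """Return (initial_credit, more_credit_after) or raise ValueError."""
    if not (isinstance(value, tuple) and len(value) == 2):
        raise ValueError("expected a pair {InitialCredit, MoreCreditAfter}")
    initial, more_after = value
    for name, v in (("InitialCredit", initial), ("MoreCreditAfter", more_after)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(v, bool) or not isinstance(v, int) or v < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    if more_after > initial:
        raise ValueError("MoreCreditAfter must not exceed InitialCredit")
    return initial, more_after
```

Under these assumed rules, `(0, 0)` and `(200, 200)` pass, while `(100, 200)` is rejected.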
Michal Kuratczyk 1077a55194
Stream queue: consumers are active by default
Without this change, consumers using protocols other than the stream
protocol would display as inactive in the Management UI/API and CLI
commands, even though they were receiving messages.
2025-01-13 15:51:39 +01:00
Michael Klishin 208d3d6b59
Follow-up to #13046 #13055: accept MoreCreditAfter that's equal to InitialCredit 2025-01-12 16:48:53 -05:00
Michael Klishin 3dd3433722
Finish off #13046 2025-01-11 19:15:00 -05:00
Michael Klishin e5fe7247dc
rabbitmq.conf.example: a typo #8076 2025-01-11 17:36:51 -05:00
Michael Klishin 82c93ceb23
rabbitmq.conf.example: document quorum_queue.property_equivalence.relaxed_checks_on_redeclaration #8076 2025-01-11 17:36:50 -05:00
Michael Klishin 4603d3597e
rabbitmq.conf.example: suggest Discussions and Discord for questions 2025-01-11 17:36:50 -05:00
jimmy 0135f62528 Fix typo 2025-01-11 14:15:35 +08:00
jimmy 2394d427e3 Fix typo 2025-01-11 13:19:31 +08:00
jimmy e6954c8720 Change to throw an error when credit_flow_default_credit invalid 2025-01-11 13:13:12 +08:00
root e589c42476 Change to throw an error when credit_flow_default_credit invalid 2025-01-11 13:10:17 +08:00
JimmyWang6 81648759f8 Fix typo 2025-01-10 15:50:38 +08:00
jimmy wang 5b244e7504 Fix MoreCreditAfter could be larger than InitialCredit 2025-01-10 14:42:10 +08:00
Michael Klishin 1d88a9d28f
definition_import_SUITE: a new case
that features a deletion-protected virtual host.
2025-01-03 19:07:57 -05:00
Michael Klishin 315c247231
Make sure protected_from_deletion is included into virtual host definitions exported over the HTTP API 2025-01-03 18:49:11 -05:00
Michael Klishin 3f5b13d47f
Merge branch 'main' into mk-virtual-host-protection-from-accidental-deletion 2025-01-02 17:01:54 -05:00
Michael Klishin f62d46c286
Introduce a way to protect a virtual host from deletion
Accidental "fat finger" virtual host deletions
would be easier to avoid if there was a protection mechanism
that would apply equally even to CLI tools and external
applications that do not use confirmations for deletion
operations.

This introduces the following changes:

 * Virtual host metadata now supports a new key,
   'protected_from_deletion', which, when set,
   will be considered by key virtual host deletion function(s)
 * DELETE /api/vhosts/{name} was adapted to handle
   such blocked deletion attempts to respond with
   a 412 Precondition Failed status
 * 'rabbitmqctl list_vhosts' and 'rabbitmqctl delete_vhost'
   were adapted accordingly
 * DELETE /api/vhosts/{name}/deletion/protection
   is a new endpoint that can be used to remove
   the protective seal (the metadata key)
 * POST /api/vhosts/{name}/deletion/protection
   marks the virtual host as protected

In the case of the HTTP API, all operations on
virtual host metadata require administrative
privileges from the target user.

Other considerations:

 * When a virtual host does not exist, the behavior
   remains the same: the original, protection-unaware
   code path is used to preserve backwards compatibility

References #12772.
2025-01-02 16:50:51 -05:00
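Note on f62d46c286: the protection flow described in that commit can be modelled with a small in-memory store. This Python sketch is illustrative only. The `protected_from_deletion` key and the 412 status come from the commit message; the `VhostStore` class, its method names, and the 204 and 404 status codes for the other outcomes are assumptions.

```python
class VhostStore:
    """Toy model of virtual hosts and their metadata."""

    def __init__(self):
        self._vhosts = {}  # name -> metadata dict

    def add(self, name):
        self._vhosts[name] = {}

    def exists(self, name):
        return name in self._vhosts

    def protect(self, name):
        # Models POST /api/vhosts/{name}/deletion/protection
        self._vhosts[name]["protected_from_deletion"] = True

    def unprotect(self, name):
        # Models DELETE /api/vhosts/{name}/deletion/protection
        self._vhosts[name].pop("protected_from_deletion", None)

    def delete(self, name):
        # Models DELETE /api/vhosts/{name}; returns an HTTP-like status code.
        meta = self._vhosts.get(name)
        if meta is None:
            return 404  # assumption: unknown vhosts keep the pre-existing path
        if meta.get("protected_from_deletion"):
            return 412  # Precondition Failed, as stated in the commit
        del self._vhosts[name]
        return 204  # assumption: success status
```

A protected virtual host survives `delete` until `unprotect` removes the metadata key, after which the same call succeeds.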
Michael Klishin ac7dcc9abe
Merge pull request #13010 from johanrhodin/link-fix-patch-1
Update doc link
2025-01-02 12:19:04 -05:00
Johan Rhodin e35edf789d
Update doc link 2025-01-02 10:53:19 -06:00
Michael Klishin 2aed29709e
mirrored_supervisor_SUITE: don't search logs for exceptions #13008 2025-01-02 11:05:49 -05:00
Michael Klishin 968eefa1bb
Bump (c) line year
There are no functional changes to this massive diff.
2025-01-01 17:54:10 -05:00
David Ansari 42ede4a258 Speed up tests
Multiple test cases were recently slowed down by up to 30 seconds.
This commit reverts these changes.
2024-12-30 16:56:18 +00:00
Michal Kuratczyk 68de3fdb77
Fix channel crash when publishing to a new stream (#12969)
The following scenario led to a channel crash:
1. Publish to a non-existing stream: `perf-test -y 0 -p -e amq.default -t direct -k stream`
2. Declare the stream: `rabbitmqadmin declare queue name=stream queue_type=stream`

There is no pid yet, so we got a function_clause with `none`
```
{function_clause,
   [{osiris_writer,write,
        [none,<0.877.0>,<<"<0.877.0>_-65ZKFz18ll5lau0phi7CsQ">>,1,
         [[0,"Sp",[192,6,5,"B@@AC"]],
          [0,"Sr",
           [193,38,4,
            [[[163,10,<<"x-exchange">>],[161,0,<<>>]],
             [[163,13,<<"x-routing-key">>],[161,6,<<"stream">>]]]]],
          [0,"Su",[160,12,[<<0,19,252,1,0,0,98,171,20,16,108,167>>]]]]],
        [{file,"src/osiris_writer.erl"},{line,158}]},
    {rabbit_stream_queue,deliver0,4,
        [{file,"rabbit_stream_queue.erl"},{line,540}]},
    {rabbit_stream_queue,'-deliver/3-fun-0-',4,
        [{file,"rabbit_stream_queue.erl"},{line,526}]},
    {lists,foldl,3,[{file,"lists.erl"},{line,2146}]},
    {rabbit_queue_type,'-deliver0/4-fun-5-',5,
        [{file,"rabbit_queue_type.erl"},{line,707}]},
    {maps,fold_1,4,[{file,"maps.erl"},{line,860}]},
    {rabbit_queue_type,deliver0,4,
        [{file,"rabbit_queue_type.erl"},{line,704}]},
    {rabbit_queue_type,deliver,4,
        [{file,"rabbit_queue_type.erl"},{line,662}]}]}
```

Co-authored-by: Karl Nilsson <kjnilsson@gmail.com>
2024-12-20 08:56:25 +01:00
Jean-Sébastien Pédron ea2c8db2d1
rabbit_feature_flags: Add testcase after issue #12963
[Why]
Up to RabbitMQ 3.13.x, there was a case where if:
1. you enabled a plugin
2. you enabled its feature flags
3. you disabled the plugin
4. you restarted a node (or upgraded it)

... the node could crash on startup because it had a feature flag marked
as enabled that it didn't know about:

    error:{badmatch,#{feature_flags => ...

        rabbit_ff_controller:-check_one_way_compatibility/2-fun-0-/3, line 514
        lists:all_1/2, line 1520
        rabbit_ff_controller:are_compatible/2, line 496
        rabbit_ff_controller:check_node_compatibility_task1/4, line 437
        rabbit_db_cluster:check_compatibility/1, line 376

This was "fixed" by the new way of keeping the registry in memory
(#10988) because it introduces a slight change of behavior. Indeed, the
old way walked through the `FeatureFlags` map and looked up the state in
the `FeatureStates` map to create the `is_enabled/1` function. The new
way just looks up the state in `FeatureStates`.

[How]
The new testcase succeeds on 4.0.x and `main`, but would fail on 3.13.x
with the aforementioned crash.
2024-12-19 16:33:43 +01:00
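Note on ea2c8db2d1: the two lookup strategies mentioned in that commit differ for a flag that is recorded as enabled but is no longer known to the node. The Python sketch below contrasts them under stated assumptions: the dictionaries stand in for the `FeatureFlags` and `FeatureStates` maps, the flag names are made up, and treating an unknown flag as disabled in the old approach is an assumption, not something the commit states.

```python
# Flags the node knows about (the plugin providing "plugin_flag" is disabled).
feature_flags = {"stable_flag": {"stability": "stable"}}

# Persisted states still mention the plugin's flag as enabled.
feature_states = {"stable_flag": True, "plugin_flag": True}


def build_is_enabled_old(flags, states):
    # Old approach: walk the known flags and look up each one's state.
    table = {name: states.get(name, False) for name in flags}
    # Assumption: a name outside the table is reported as not enabled.
    return lambda name: table.get(name, False)


def build_is_enabled_new(states):
    # New approach: look the state up directly, whether the flag is known or not.
    return lambda name: states.get(name, False)


is_enabled_old = build_is_enabled_old(feature_flags, feature_states)
is_enabled_new = build_is_enabled_new(feature_states)
```

Both functions agree on `stable_flag`; they disagree only on `plugin_flag`, which is the situation the new testcase exercises.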
Jean-Sébastien Pédron 3325def8eb
rabbit_feature_flags: Take callback definition from correct node
[Why]
The feature flag controller that is responsible for enabling a feature
flag may be on a node that doesn't know this feature flag. This is
supported, but there is a bug when it queries the callback definition for
that feature flag: it uses its own registry which does not have anything
about this feature flag.

This leads to a crash because the `run_callback/5` function tries to use
the `undefined` atom returned by the registry as a map:

    crasher:
      initial call: rabbit_ff_controller:init/1
      pid: <0.374.0>
      registered_name: rabbit_ff_controller
      exception error: bad map: undefined
        in function  rabbit_ff_controller:run_callback/5
        in call from rabbit_ff_controller:do_enable/3 (rabbit_ff_controller.erl, line 1244)
        in call from rabbit_ff_controller:update_feature_state_and_enable/2 (rabbit_ff_controller.erl, line 1180)
        in call from rabbit_ff_controller:enable_with_registry_locked/2 (rabbit_ff_controller.erl, line 1050)
        in call from rabbit_ff_controller:enable_many_locked/2 (rabbit_ff_controller.erl, line 991)
        in call from rabbit_ff_controller:enable_many/2 (rabbit_ff_controller.erl, line 979)
        in call from rabbit_ff_controller:updating_feature_flag_states/3 (rabbit_ff_controller.erl, line 307)
        in call from gen_statem:loop_state_callback/11 (gen_statem.erl, line 3735)

[How]
The callback definition is now queried from the first node in the list
given as argument. For the common use case where all nodes know about a
feature flag, the first node is the local one, so there should be no
latency caused by the RPC.

See #12963.
2024-12-19 13:45:27 +01:00
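Note on 3325def8eb: the change from a local registry lookup to a lookup on the first node in the list can be sketched as follows. This is a Python illustration under assumptions: `registries`, the node names, the flag name and the callback value are hypothetical, the remote query is modelled as a plain dictionary access rather than an RPC, and the extra `None` guard is an addition for the sketch.

```python
# Per-node registries; node_a runs the controller but does not know "my_flag".
registries = {
    "node_a": {},
    "node_b": {"my_flag": {"callbacks": {"enable": "my_mod:enable_cb"}}},
}


def callbacks_buggy(local_node, flag):
    # Uses the controller's own registry; yields None for an unknown flag,
    # and subscripting None fails much like "bad map: undefined".
    props = registries[local_node].get(flag)
    return props["callbacks"]


def callbacks_fixed(nodes, flag):
    # Query the first node in the list given as argument.
    props = registries[nodes[0]].get(flag)
    if props is None:
        return {}  # guard added for this sketch only
    return props.get("callbacks", {})
```

When every node knows the flag, the first node in the list can be the local one, so the lookup stays local, as the commit message points out.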
Jean-Sébastien Pédron dbec429fba
rabbit_feature_flags: Fix function name in the controller
[Why]
`state_after_virtual_state()` meant nothing.

`state_after_virtual_reset()` was the name I had in mind.
2024-12-19 11:54:25 +01:00
Jean-Sébastien Pédron debe2a118c
rabbitmq_ct_helpers: Change how Mnesia/Khepri is selected
[Why]
Once `khepri_db` is enabled by default, we need another way to disable it
to select Mnesia instead.

[How]
We use the new relative forced feature flags mechanism to indicate if we
want to explicitly enable or disable `khepri_db`. This way, we don't
touch other stable feature flags and only mess with Khepri.

However, this mechanism is not supported by RabbitMQ 4.0.x and older.
They will ignore the setting. Therefore, to make this work in
mixed-version testing, we set the `$RABBITMQ_FEATURE_FLAGS` variable for
the secondary umbrella. This part will go away once we test against
RabbitMQ 4.1.x as the secondary umbrella in the future.

At the end, we compare the effective metadata store to the expected one.
If they don't match, we skip the test.

While here, change `rjms_topic_selector_SUITE` to only choose Khepri
without specifying any feature flags.
2024-12-17 09:56:54 +01:00
Michael Klishin 0db3d7b014
Merge pull request #12950 from rabbitmq/qq-handle-tick
Quorum queues: ignore handle_tick with an old overview format
2024-12-16 11:31:55 -05:00
Michael Klishin 62ce1c954a
Merge pull request #12948 from rabbitmq/fix-flakes
Test fixes for a few more CI flakes
2024-12-16 11:24:10 -05:00
Diana Parra Corbacho a97ec92785 Quorum queues: ignore handle_tick with an old overview format
If handle_tick is called before the machine has finished the upgrade
process, it could receive an old overview format (stats tuple vs map).
Let's ignore it and the next handle tick should be fine.

Unlikely to happen in production, detected on CI with a very low tick timeout
2024-12-16 15:39:39 +01:00
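Note on a97ec92785: ignoring a tick that carries the old overview shape amounts to a type check before any processing. The Python sketch below models the new format as a dict and the old one as a tuple; the `handle_tick` signature, the `num_messages` field and the `emit_metrics` effect are hypothetical.

```python
def handle_tick(overview):
    """Return a list of effects for this tick."""
    if not isinstance(overview, dict):
        # Old overview format (a stats tuple): skip it and wait for the
        # next tick, which should arrive in the new map format.
        return []
    return [("emit_metrics", overview.get("num_messages", 0))]
```

An old-format tick produces no effects and no crash; the next tick with a map is processed normally.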
Diana Parra Corbacho fe7a141331 Test: Increase receive timeout in all rabbit test suites 2024-12-16 11:58:05 +01:00
GitHub 0d750769f9 bazel run gazelle 2024-12-14 04:02:32 +00:00
David Ansari b6027ece28 Fix dead lettering crash
Fixes #12933

The assumption that `x-last-death-*` annotations must have been set
whenever the `deaths` annotation is set was wrong.

Reproduction steps, Option 1:
1. In v3.13.7, dead letter a message from Q1 to Q2 (both can be classic queues).
2. Re-publish the message including its x-death header from Q2 back to Q1.
(RabbitMQ 3.13.7 will interpret this x-death header and set the deaths annotation.)
3. Upgrade to v4.0.4
4. Dead lettering the message from Q1 to Q2 will cause the following crash:
```
crasher:
  initial call: rabbit_amqqueue_process:init/1
  pid: <0.577.0>
  registered_name: []
  exception exit: {{badkey,<<"x-last-death-exchange">>},
                   [{mc,record_death,4,[{file,"mc.erl"},{line,410}]},
                    {rabbit_dead_letter,publish,5,
                        [{file,"rabbit_dead_letter.erl"},{line,38}]},
                    {rabbit_amqqueue_process,'-dead_letter_msgs/4-fun-0-',
                        7,
                        [{file,"rabbit_amqqueue_process.erl"},{line,1060}]},
                    {rabbit_variable_queue,'-ackfold/4-fun-0-',3,
                        [{file,"rabbit_variable_queue.erl"},{line,655}]},
                    {lists,foldl,3,[{file,"lists.erl"},{line,2146}]},
                    {rabbit_variable_queue,ackfold,4,
                        [{file,"rabbit_variable_queue.erl"},{line,652}]},
                    {rabbit_priority_queue,ackfold,4,
                        [{file,"rabbit_priority_queue.erl"},{line,309}]},
                    {rabbit_amqqueue_process,
                        '-dead_letter_rejected_msgs/3-fun-0-',5,
                        [{file,"rabbit_amqqueue_process.erl"},
                         {line,1038}]}]}
```

Reproduction steps, Option 2:
1. Run a 4.0.4 / 3.13.7 mixed version cluster where both queues Q1 and Q2
   are hosted on the 4.0.4 node.
2. Send a message to Q1 which dead letters to Q2.
3. Re-publish a message with the x-death AMQP 0.9.1 header from Q2 to
   Q1. However, this time make sure to publish to the 3.13.7 node which
   forwards this message to Q1 on the 4.0.4 node.
4. Subsequently dead lettering this message from Q1 to Q2 (happening on
   the 4.0.4 node) will also cause the crash.

The modified test case in this commit was able to repro this crash via
Option 2 in the mixed version cluster tests on the `v4.0.x` branch.
2024-12-13 19:25:43 +01:00
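Note on b6027ece28: the wrong assumption described in that commit is that one annotation being present implies another is present too. The Python sketch below shows the failure and a tolerant lookup. It is not the actual patch: the function names and the dict-based annotations are hypothetical, and returning `None` for a missing key is an assumption about one possible remedy.

```python
def last_death_exchange_strict(annotations):
    # Assumes "x-last-death-exchange" always accompanies "deaths".
    if "deaths" in annotations:
        return annotations["x-last-death-exchange"]  # KeyError if absent
    return None


def last_death_exchange_tolerant(annotations):
    # Does not assume the x-last-death-* annotations were ever set.
    if "deaths" in annotations:
        return annotations.get("x-last-death-exchange")
    return None
```

A message carrying `deaths` but no `x-last-death-exchange`, as in the reproduction steps, raises `KeyError` in the strict version, which parallels the `badkey` exit in the crash report.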
Matteo Cafasso 8d7535e0b1
amqqueue_process: adopt new `is_duplicate` backing queue callback
As the de-duplication plugin is the only adopter of the `is_duplicate`
callback, we now use a simpler signature.

When a message is deemed duplicated, we discard it and re-route it to
dead letter exchange.

Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
(cherry picked from commit f93baa35cb)
2024-12-11 19:43:45 -05:00
Matteo Cafasso 6a6e760107
backing_queue: simplify `is_duplicate` callback signature
`is_duplicate` callback signature was changed in order to support both
the mirroring queues as well as the de-duplication ones.

As the mirroring queues are now deprecated and removed, we can fall
back to a simpler boolean as return value.

Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
(cherry picked from commit c927446e17)
2024-12-11 19:43:38 -05:00
David Ansari 9d8ae14e27 Use correct AMQP filter expression string modifier prefix
Section 4.1.1 of AMQP Filter Expressions Working Draft 09
defines `&` (ampersand) instead of `$` (dollar) as the string modifier prefix.
2024-12-11 16:48:56 +01:00
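Note on 9d8ae14e27: with `&` as the string modifier prefix, a filter value starting with `$` is matched literally. The Python sketch below illustrates that distinction. Only the `&` versus `$` choice comes from the commit; the `p:` (prefix) and `s:` (suffix) modifiers and the `&&` escape reflect one reading of the draft and should be treated as assumptions, as should the function names.

```python
def parse_string_filter(value):
    """Split a filter string into (kind, text)."""
    if value.startswith("&p:"):
        return ("prefix", value[3:])
    if value.startswith("&s:"):
        return ("suffix", value[3:])
    if value.startswith("&&"):
        # Assumed escape: "&&" stands for a literal leading "&".
        return ("exact", value[1:])
    if value.startswith("&"):
        raise ValueError(f"unsupported string modifier in {value!r}")
    return ("exact", value)


def matches(filter_value, actual):
    kind, text = parse_string_filter(filter_value)
    if kind == "prefix":
        return actual.startswith(text)
    if kind == "suffix":
        return actual.endswith(text)
    return actual == text
```

Here `&p:order.` matches `order.created` by prefix, whereas `$p:order.` is compared as an ordinary string.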
Michael Klishin b84483ab5c
Merge pull request #12907 from rabbitmq/rabbitmq-server-12906
By @gomoripeti: Restore credit_flow between AMQP 0.9.1 channel/MQTT connection -> CQ processes
2024-12-10 10:03:47 -05:00
David Ansari 0d34ef6047 Set a floor of zero for incoming-window
Prior to this commit, when the sending client overshot RabbitMQ's incoming-window
(which is allowed in the event of a cluster wide memory or disk alarm),
and RabbitMQ sent a FLOW frame to the client, RabbitMQ sent a negative
incoming-window field in the FLOW frame causing the following crash in
the writer proc:
```
crasher:
  initial call: rabbit_amqp_writer:init/1
  pid: <0.19353.0>
  registered_name: []
  exception error: bad argument
    in function  iolist_size/1
       called as iolist_size([<<112,0,0,23,120>>,
                              [82,-15],
                              <<"pÿÿÿü">>,<<"pÿÿÿÿ">>,67,
                              <<112,0,0,23,120>>,
                              "Rª",64,64,64,64])
       *** argument 1: not an iodata term
    in call from amqp10_binary_generator:generate1/1 (amqp10_binary_generator.erl, line 141)
    in call from amqp10_binary_generator:generate1/1 (amqp10_binary_generator.erl, line 88)
    in call from amqp10_binary_generator:generate/1 (amqp10_binary_generator.erl, line 79)
    in call from rabbit_amqp_writer:assemble_frame/3 (rabbit_amqp_writer.erl, line 206)
    in call from rabbit_amqp_writer:internal_send_command_async/3 (rabbit_amqp_writer.erl, line 189)
    in call from rabbit_amqp_writer:handle_cast/2 (rabbit_amqp_writer.erl, line 110)
    in call from gen_server:try_handle_cast/3 (gen_server.erl, line 1121)
```

This commit fixes this crash by maintaining a floor of zero for
incoming-window in the FLOW frame.

Fixes #12816
2024-12-10 09:39:21 +01:00
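Note on 0d34ef6047: a negative window cannot be encoded as an unsigned integer, which is why clamping at zero avoids the writer crash. The Python sketch below shows the idea. The subtraction used to derive the window and the function names are assumptions for illustration; `struct.pack(">I", ...)` stands in for encoding a 32-bit unsigned field.

```python
import struct


def incoming_window(max_window, in_flight):
    # Clamp at zero: the client may overshoot the window during an alarm.
    return max(0, max_window - in_flight)


def encode_uint(value):
    # Big-endian 32-bit unsigned integer; rejects negative values.
    return struct.pack(">I", value)
```

Encoding `incoming_window(100, 115)` succeeds because the result is clamped to zero, while passing the raw difference of -15 to `encode_uint` raises `struct.error`.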
Péter Gömöri 2c1f1a1387
Restore credit_flow between channel/MQTT connection -> CQ processes
The credit_flow between publishing AMQP 0.9.1 channel (or MQTT
connection) and (non-mirrored) classic queue processes was
unintentionally removed in 4.0 together with anything else related to
CQ mirroring.

By default we restore the 3.x behaviour for non-mirrored classic
queues. It is possible to disable flow-control (the earlier 4.0.x
behaviour) with the new env `classic_queue_flow_control`. In 3.x this
was possible with the config `mirroring_flow_control`.

(cherry picked from commit d65bd7d07a)
2024-12-09 22:33:47 -05:00