Commit Graph

3083 Commits

Author SHA1 Message Date
Michal Kuratczyk 16a2694f34
WIP 2024-12-12 18:40:45 +01:00
Michal Kuratczyk 7a227e8f43
debug logs 2024-12-12 15:24:20 +01:00
David Ansari 9d8ae14e27 Use correct AMQP filter expression string modifier prefix
Section 4.1.1 of AMQP Filter Expressions Working Draft 09
defines `&` (ampersand) instead of `$` (dollar) as the string modifier prefix.
2024-12-11 16:48:56 +01:00
Michael Klishin b84483ab5c
Merge pull request #12907 from rabbitmq/rabbitmq-server-12906
By @gomoripeti: Restore credit_flow between AMQP 0.9.1 channel/MQTT connection -> CQ processes
2024-12-10 10:03:47 -05:00
David Ansari 0d34ef6047 Set a floor of zero for incoming-window
Prior to this commit, when the sending client overshot RabbitMQ's incoming-window
(which is allowed in the event of a cluster wide memory or disk alarm),
and RabbitMQ sent a FLOW frame to the client, RabbitMQ sent a negative
incoming-window field in the FLOW frame causing the following crash in
the writer proc:
```
crasher:
  initial call: rabbit_amqp_writer:init/1
  pid: <0.19353.0>
  registered_name: []
  exception error: bad argument
    in function  iolist_size/1
       called as iolist_size([<<112,0,0,23,120>>,
                              [82,-15],
                              <<"pÿÿÿü">>,<<"pÿÿÿÿ">>,67,
                              <<112,0,0,23,120>>,
                              "Rª",64,64,64,64])
       *** argument 1: not an iodata term
    in call from amqp10_binary_generator:generate1/1 (amqp10_binary_generator.erl, line 141)
    in call from amqp10_binary_generator:generate1/1 (amqp10_binary_generator.erl, line 88)
    in call from amqp10_binary_generator:generate/1 (amqp10_binary_generator.erl, line 79)
    in call from rabbit_amqp_writer:assemble_frame/3 (rabbit_amqp_writer.erl, line 206)
    in call from rabbit_amqp_writer:internal_send_command_async/3 (rabbit_amqp_writer.erl, line 189)
    in call from rabbit_amqp_writer:handle_cast/2 (rabbit_amqp_writer.erl, line 110)
    in call from gen_server:try_handle_cast/3 (gen_server.erl, line 1121)
```

This commit fixes this crash by maintaining a floor of zero for
incoming-window in the FLOW frame.
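
The clamping idea can be sketched as follows. This is a Python illustration under stated assumptions, not the RabbitMQ Erlang code: the helper names `flow_incoming_window` and `encode_uint` and their parameters are hypothetical. It relies only on the fact that `incoming-window` is an unsigned 32-bit AMQP 1.0 field, so a negative value cannot be encoded.

```python
import struct

def flow_incoming_window(max_window: int, in_flight: int) -> int:
    """Window to advertise in a FLOW frame, never below zero (hypothetical helper)."""
    return max(0, max_window - in_flight)

def encode_uint(value: int) -> bytes:
    """Fixed-width AMQP 1.0 uint: format code 0x70 plus 4 big-endian bytes."""
    # struct.pack raises struct.error for a negative value, which is the
    # Python analogue of the encoder crash shown in the report above.
    return b"\x70" + struct.pack(">I", value)
```

With the floor in place, an overshoot such as `flow_incoming_window(400, 415)` is advertised as zero instead of reaching the encoder as a negative number.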

Fixes #12816
2024-12-10 09:39:21 +01:00
Péter Gömöri 2c1f1a1387
Restore credit_flow between channel/MQTT connection -> CQ processes
The credit_flow between publishing AMQP 0.9.1 channel (or MQTT
connection) and (non-mirrored) classic queue processes was
unintentionally removed in 4.0 together with anything else related to
CQ mirroring.

By default we restore the 3.x behaviour for non-mirrored classic
queues. It is possible to disable flow-control (the earlier 4.0.x
behaviour) with the new env `classic_queue_flow_control`. In 3.x this
was possible with the config `mirroring_flow_control`.

(cherry picked from commit d65bd7d07a)
2024-12-09 22:33:47 -05:00
Jean-Sébastien Pédron 56f90a51a9
rabbit_db: Return error from `force_boot_command_test/0` with Khepri 2024-12-02 13:33:08 +01:00
Jean-Sébastien Pédron df9882417c
rabbit_khepri: Report no partitions from `cli_cluster_status/0` 2024-11-29 16:53:55 +01:00
Jean-Sébastien Pédron 4621fe7730
mirrored_supervisor: Catch timeout from Khepri in `handle_info/2`
[Why]
The code assumed that the transaction would always succeed. It was kind
of the case with Mnesia because it would throw an exception if it
failed.

Khepri returns an error instead. The code has to handle it. In
particular, we see timeouts in CI and before this patch, they caused a
crash because the list comprehension was asked to work on a tuple.

[How]
We now retry a few times for 10 seconds.
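
A minimal Python sketch of retrying under a deadline, assuming the operation reports a timeout as an `("error", "timeout")` tuple. The function name, the one-second delay, and the injectable `clock` and `sleep` parameters are assumptions made for illustration; only the 10-second budget comes from the message above.

```python
import time

def retry_with_deadline(operation, timeout_s=10.0, delay_s=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Call operation() until it stops timing out or the time budget is spent."""
    deadline = clock() + timeout_s
    while True:
        result = operation()
        # Anything other than a timeout (success or another error) is returned as-is.
        timed_out = result == ("error", "timeout")
        if not timed_out or clock() >= deadline:
            return result
        sleep(delay_s)
```

The caller still has to handle the error tuple when the budget runs out; the sketch only avoids treating it as a successful result.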
2024-11-29 12:03:59 +01:00
Jean-Sébastien Pédron 913bd9fa42
rabbit_db: Fix `rabbit_db_msup:update_all/2` spec
[Why]
It can return an error.
2024-11-29 12:03:35 +01:00
Jean-Sébastien Pédron ae9fbb7bd5
Pin Horus to 0.3.1 temporarily
[Why]
We pin a version of Horus even if we don't use it directly (it is a
dependency of Khepri). But currently, we can't update Khepri while still
needing the fix in Horus 0.3.1.

Horus 0.3.1 works around a crash in `cover` that mostly affects CI for
now.

This pinning will have to go away with the next update of Khepri.
2024-11-29 09:50:08 +01:00
Jean-Sébastien Pédron 99d8e90df3
rabbit_quorum_queue: Wait for member add in `add_member/4`
[Why]
The `ra:member_add/3` call returns before the change is committed. This
is ok for that addition but any follow-up changes to the cluster might
be rejected with the `cluster_change_not_permitted` error.

[How]
Instead of changing other places to wait or retry their cluster
membership change, this patch waits for the current add to be applied
before proceeding and returning.

This fixes some transient failures in CI where such follow-up changes
are rejected and not retried, leaving the cluster in an unexpected state
for the testcase.

An example is with
`quorum_queue_SUITE:force_shrink_member_to_current_member/1`
2024-11-28 11:27:40 +01:00
Michal Kuratczyk 46259b5a48
Fix invalid warning about transient queues being used
This fixes the issue where RabbitMQ would warn about
transient queues being used in a cluster with no transient queues.

Fixes https://github.com/rabbitmq/rabbitmq-server/issues/12802
2024-11-27 22:04:01 +01:00
Michal Kuratczyk 1552f89dd7
Skip are_transient_nonexcl_used check on virgin node
This check fails on a virgin node, because the metadata store
is not yet ready to handle the query. However, a virgin
node by definition can't have any queues, so let's just return
false without asking.
2024-11-27 22:03:53 +01:00
Michael Klishin 1cae417dbf
Merge pull request #12821 from rabbitmq/rabbitmq-server-12776
Definition export: inject default queue type into virtual host metadata
2024-11-27 14:53:25 -05:00
Michael Klishin 8a5ea76fe4
Inject DQT into 'ctl export_definitions' 2024-11-27 12:29:48 -05:00
Diana Parra Corbacho d004d69200 Tests: feature_flags_v2_SUITE ignore peer:stop/1 return value 2024-11-27 15:45:58 +01:00
Michael Klishin 090d11818f
HTTP API tests for injected default queue type 2024-11-26 18:00:37 -05:00
Michael Klishin 51e6004840
Inject DQT into GET /api/definitions and /api/vhosts
References #12776
2024-11-26 02:04:30 -05:00
Jean-Sébastien Pédron f6314d06b3
rabbit_peer_discovery: Retry RPC calls
[Why]
In CI, we observe some timeouts in the Erlang distribution connections
between the temporary hidden node and the nodes it queries. This affects
peer discovery obviously.

[How]
We introduce some query retries to reduce the risk of an incomplete
query.

While here, we move the sorting of queried nodes from the
`query_node_props2/3` last clause (executed in the temporary hidden
node) to the function setting the temporary hidden node and asking for
these queries. This way the debug messages from that sorting are logged
by RabbitMQ out of the box.
2024-11-25 16:16:16 +01:00
Jean-Sébastien Pédron 4d4985f254
rabbit_peer_discovery: Fix non-tail-recursive `query_node_props2()`
[Why]
This impacts what is reported by the catch because it caught exceptions
emitted by code supposedly called later. An example is the assert
in `query_node_props2/3` last clause.
2024-11-25 16:16:15 +01:00
Jean-Sébastien Pédron 62f22a7655
rabbit_peer_discovery: Remove the use of group leader proxy
[Why]
This was the first solution put in place to prevent the temporary
hidden node from connecting to the node that started it to write any
printed messages. Because of this, the nodes that the temporary hidden node
queried found out about the parent node and they opened an Erlang
distribution connection to it. This polluted the known nodes list.

However later, the temporary hidden node was started with the
`standard_io` connection option. This prevented the temporary hidden
node from knowing about the node that started it, solving the problem in
a cleaner way.

[How]
This commit garbage-collects that piece of code that is now useless. It
makes the query code way simpler to understand.
2024-11-25 16:16:12 +01:00
D Corbacho 1fa4fe2735
Merge pull request #12775 from rabbitmq/fix-flakes
Fixes for test flakes
2024-11-25 16:12:29 +01:00
Jean-Sébastien Pédron fe2061b13b
quorum_queue_member_reconciliation_SUITE: Improve `reset_nodes/2`
[How]
The function now accepts that the node to reset is already out of the
cluster. This avoids a mismatch exception for a situation that is ok.
2024-11-25 12:55:26 +01:00
Jean-Sébastien Pédron 03f9d36988
rabbit_vhosts: Don't reconcile vhosts if `rabbit` is stopped
[Why]
That timer was started during boot and kept running regardless of
whether `rabbit` was running or stopped.

This caused the reconciliation to crash if the `rabbit` app was stopped
before it ended because it tried to access the database even though
it was stopped or even reset.

[How]
We just check if `rabbit` is running before running one reconciliation
and scheduling a new one.
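
The guard can be sketched in Python as below. All three callables are hypothetical stand-ins, and the sketch assumes that a stopped app skips both the reconciliation pass and the re-arming of the timer.

```python
def reconcile_tick(is_running, reconcile, schedule_next):
    """Run one reconciliation pass only while the app is running."""
    if not is_running():
        # App stopped (or reset): do not touch the database, do not re-arm.
        return False
    reconcile()
    schedule_next()
    return True
```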
2024-11-25 12:39:13 +01:00
Diana Parra Corbacho 73924ba08e Tests: amqp_client_SUITE delete all queues on end per testcase 2024-11-25 09:06:33 +01:00
Diana Parra Corbacho a35f56fdc2 Tests: amqp_filtex_SUITE wait for link attachment and longer timeouts 2024-11-25 09:06:32 +01:00
GitHub 4e8d0f3ac2 bazel run gazelle 2024-11-23 04:02:38 +00:00
Michael Davis c3c7675bda
rabbit_khepri: Add macros for path patterns 2024-11-22 11:21:11 -05:00
Michael Davis e8fb9b6889
rabbit: Move include/{khepri.hrl => rabbit_khepri.hrl}
This fixes erlang_ls's header resolution. Previously it would confuse
the include_lib of the `khepri.hrl` from Khepri with this header in
the rabbit app.

This header is also specific to how rabbit uses Khepri so I think the
new name fits better.
2024-11-22 11:21:11 -05:00
Michael Klishin ea58fb1b48
crashing_queues_SUITE: squash a compiler warning 2024-11-17 17:23:00 -05:00
Michael Klishin 9f026f7a4b
Merge pull request #12727 from rabbitmq/rabbitmq-server-12709
By @Ayanda-D: Ensure only alive leaders and followers when fetching QQ replica states
2024-11-15 13:53:41 -05:00
Ayanda Dube 53cc8f8f2b
Update unit_quorum_queue_SUITE to use temporary alive & registered
test queue processes (since we now check/return only alive members
when fetching replica states)

(cherry picked from commit ebc0387b81)
2024-11-15 12:49:55 -05:00
David Ansari 6e8b566323 Deduplicate AMQP type inference
Introduce a single place in the AMQP 1.0 Erlang client that infers the AMQP 1.0 type.

Erlang integers are inferred to be AMQP type `long` to avoid overflow surprises.
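
As an illustration of a single inference point, here is a Python sketch. Only the integer-to-`long` rule comes from the message above; the function name and the other mappings are assumptions chosen to make the example complete.

```python
def infer_amqp_type(value):
    """Map a native value to an (amqp_type, value) pair in one place."""
    # bool is a subclass of int in Python, so it must be tested first.
    if isinstance(value, bool):
        return ("boolean", value)
    if isinstance(value, int):
        # Always choose the widest signed integer type rather than the
        # smallest one that fits, so that later, larger values keep the same type.
        if not -2**63 <= value < 2**63:
            raise ValueError("integer does not fit an AMQP long")
        return ("long", value)
    if isinstance(value, float):
        return ("double", value)
    if isinstance(value, str):
        return ("string", value)
    if isinstance(value, bytes):
        return ("binary", value)
    raise TypeError(f"no AMQP type inferred for {type(value).__name__}")
```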
2024-11-15 17:40:36 +01:00
Jean-Sébastien Pédron 2938338182
rabbit_khepri: Do not hard-code `coordination`, use the constant instead 2024-11-15 16:41:16 +01:00
Jean-Sébastien Pédron 05717ccccf
rabbit_khepri: Remove serial file during reset 2024-11-15 16:40:50 +01:00
Jean-Sébastien Pédron e41d766b29
rabbit_khepri: Ensure RabbitMQ is stopped before resetting with Khepri 2024-11-15 16:40:45 +01:00
Jean-Sébastien Pédron 7e2e7b79f2
rabbit_feature_flags: Support relative setting in `forced_feature_flags_on_init`
[Why]
We already support that from the environment variable, it is easy to add
to the configuration setting.
2024-11-15 14:50:35 +01:00
Michael Klishin 3e509c9f30
Merge pull request #12714 from rabbitmq/amqp-event-exchange
Support publishing AMQP 1.0 to Event Exchange
2024-11-14 18:09:19 -05:00
Ayanda Dube 3ecb3b61d4
Use whereis/1 instead of rabbit_process helper, and lists:filtermap/2 in
rabbit_quorum_queue:all_replica_states/0

(cherry picked from commit 19cc2d0608)
2024-11-14 14:05:17 -05:00
Ayanda Dube 6bb4c89c71
Add test for rabbit_quorum_queue:all_replica_states/0
and ensure non-existent/inactive/noproc QQ members are
not reported.

(cherry picked from commit 4e2c62b6af)
2024-11-14 14:05:12 -05:00
Ayanda Dube 9070e394d3
Ensure only alive QQ replica states are reported
when checking replica states to help avoid missing
inactive replicas e.g. on QQ checks from cli tools

(cherry picked from commit 491485092c)
2024-11-14 14:05:08 -05:00
Michael Klishin 15d3d5a8a1
Merge pull request #12674 from rabbitmq/add-is_feature_used-callback-to-transient_nonexcl_queues-depr-feature
rabbit_amqqueue: Add `is_feature_used` callback to `transient_nonexcl_queues` depr. feature
2024-11-14 13:57:32 -05:00
Michael Klishin c888689cca
Merge pull request #12722 from rabbitmq/fix-flakes
Fix flakes
2024-11-14 13:36:17 -05:00
Loïc Hoguin db50739ad8
CQ: Fix flakes in the store file scan test
We don't expect random bytes to be there in the current
version of the message store as we overwrite empty spaces
with zeroes when moving messages around.

We also don't expect messages to be false flagged when
the broker is running because it checks for message
validity in the index. Therefore make sure message bodies
in the tests don't contain byte 255.
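
A small Python sketch of generating test bodies that never contain byte 255. The helper name and the `rng` parameter are hypothetical; the actual test suite is written in Erlang.

```python
import random

def random_body(size, rng=random):
    """Random message body drawn from bytes 0..254 only."""
    # randrange(255) yields 0..254, so byte 255 can never appear.
    return bytes(rng.randrange(255) for _ in range(size))
```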
2024-11-14 15:04:49 +01:00
Diana Parra Corbacho 6e7269994d Tests: per_node_limit_SUITE cleanup
Catch exceptions when closing connections during cleanup
2024-11-14 15:02:47 +01:00
Diana Parra Corbacho 5ef4fba851 tests: amqp_client_SUITE longer wait on receive for CI 2024-11-14 15:02:47 +01:00
Diana Parra Corbacho 2d025b579b Tests: amqpl_consumer_ack use unmanaged connection 2024-11-14 15:02:47 +01:00
David Ansari de804d1fa7 Support publishing AMQP 1.0 to Event Exchange
## What?

Prior to this commit, the `rabbitmq_event_exchange` plugin always internally
published AMQP 0.9.1 messages to the `amq.rabbitmq.event` topic exchange.
This commit allows users to configure the plugin to publish AMQP 1.0
messages instead.

## Why?

Prior to this commit, when an AMQP 1.0 client consumed events,
event properties that are lists were omitted. For example property
`client_properties` of event `connection.created` or property
`arguments` of event `queue.created` were omitted because of the following sequence:
1. The event exchange plugin listens for all kinds of internal events.
2. The event exchange plugin re-publishes all events as AMQP 0.9.1 messages to the event exchange.
3. Later, when an AMQP 1.0 client consumes this message, the broker must translate the message from AMQP 0.9.1 to AMQP 1.0.
4. This translation follows the rules outlined in https://www.rabbitmq.com/docs/conversions#amqpl-amqp
5. Specifically, in this table the row before the last one describes the rule we're hitting here. It says that if the AMQP 0.9.1
header value is not an `x-` prefixed header and its value is an array or table, then this header is not converted.
That's because AMQP 1.0 application-properties must be simple types as mandated in https://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-messaging-v1.0-os.html#type-application-properties
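
The rule in step 5 can be sketched in Python as follows. The function name is hypothetical, and headers with an `x-` prefix are simply left out here because a different conversion rule applies to them.

```python
def headers_to_application_properties(headers: dict) -> dict:
    """Keep only non-`x-` headers whose values are simple (not array or table)."""
    return {
        key: value
        for key, value in headers.items()
        if not key.startswith("x-") and not isinstance(value, (list, dict))
    }
```

Applied to an event with a list-valued `arguments` header, the header is silently absent from the result, which is the omission described above.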

## How?

The user can configure the plugin as follows to have the plugin
internally publish AMQP 1.0 messages:
```
event_exchange.protocol = amqp_1_0
```

To support complex types such as lists, the plugin sets all event
properties as AMQP 1.0 message-annotations. The plugin prefixes all message
annotation keys with `x-opt-` to comply with the AMQP 1.0 spec.
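
A minimal Python sketch of the key prefixing, assuming the event properties arrive as a dictionary. The function name is hypothetical, and any further key normalisation the plugin may perform is not shown.

```python
def event_to_message_annotations(props: dict) -> dict:
    """Prefix every event property key with `x-opt-`; values are kept as-is."""
    # Message annotations may hold complex values such as lists,
    # so nothing has to be dropped or flattened.
    return {"x-opt-" + key: value for key, value in props.items()}
```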

## Alternative Design

An alternative design would have been to format all event properties
e.g. as JSON within the message body. However, this breaks routing on
specific event property values via a headers exchange.

## Documentation
https://github.com/rabbitmq/rabbitmq-website/pull/2129
2024-11-14 12:52:09 +01:00
Karl Nilsson bfa293ab3b QQ: reduce memory use when dropping many messages at once.
As may happen when a max_length configuration change is made
when there are many messages on the queue.
2024-11-13 09:07:40 +00:00