Commit Graph

30515 Commits

Author SHA1 Message Date
Péter Gömöri 4eb5b82f9f
amqp10_common: Don't wrap export for test in test macro
The application is not always recompiled which causes tests to fail
because they cannot call `serial_number:usort/1`.

(cherry picked from commit 05a3733722)
2024-11-19 19:14:19 -05:00
Péter Gömöri bbc902ef23
Add test for stream consumer max offset lag prometheus metric
(cherry picked from commit 0c76054a0c)
2024-11-19 19:14:12 -05:00
markus812498 085ec75253
Expose max offset lag of stream consumers via Prometheus
Supports both per stream (detailed) and aggregated (metrics) values.

(cherry picked from commit e82058e872)
2024-11-19 19:14:06 -05:00
Péter Gömöri 9bb7530d04
Move client-side stream protocol test helpers to a separate module
So that they can be used from multiple test suites.

(cherry picked from commit cf8a00c5db)
2024-11-19 19:13:59 -05:00
dependabot[bot] 149036707b
Bump com.rabbitmq:amqp-client
Bumps [com.rabbitmq:amqp-client](https://github.com/rabbitmq/rabbitmq-java-client) from 5.22.0 to 5.23.0.
- [Release notes](https://github.com/rabbitmq/rabbitmq-java-client/releases)
- [Commits](https://github.com/rabbitmq/rabbitmq-java-client/compare/v5.22.0...v5.23.0)

---
updated-dependencies:
- dependency-name: com.rabbitmq:amqp-client
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-19 18:37:28 +00:00
Péter Gömöri 2a3b96d568 Minor update in README of recent history exchange
Future is here: it is possible to disable plugins
2024-11-19 11:44:58 +01:00
Hathoute c44c5150f2
Fix failing test
(cherry picked from commit 6459111f86)
2024-11-18 14:44:56 -05:00
Hathoute Hamza f1ee5b551a
Update rabbit_oauth2_schema.erl
(cherry picked from commit ed5f29cec8)
2024-11-18 12:46:46 -05:00
Hathoute 0d51ee9ec0
rabbitmq-auth-backend-oauth2: correctly map additional_scopes_key
(cherry picked from commit 0d799a50eb)
2024-11-18 12:46:40 -05:00
Marcial Rosales 5b845a6474 Extract table of sessions and links 2024-11-18 11:47:44 +01:00
Marcial Rosales 86bf3e108f Navigate from connections to connection page 2024-11-18 11:47:44 +01:00
Marcial Rosales dbc398b705 WIP Test amqp10 connection information in mangement ui 2024-11-18 11:47:44 +01:00
Michael Klishin ea58fb1b48
crashing_queues_SUITE: squash a compiler warning 2024-11-17 17:23:00 -05:00
Michael Klishin 9f026f7a4b
Merge pull request #12727 from rabbitmq/rabbitmq-server-12709
By @Ayanda-D: Ensure only alive leaders and followers when fetching QQ replica states
2024-11-15 13:53:41 -05:00
Ayanda Dube 53cc8f8f2b
Update unit_quorum_queue_SUITE to use temporary alive & registered
test queue processes (since we now check/return only alive members
when fetching replica states)

(cherry picked from commit ebc0387b81)
2024-11-15 12:49:55 -05:00
David Ansari 6e8b566323 Deduplicate AMQP type inference
Introduce a single place in the AMQP 1.0 Erlang client that infers the AMQP 1.0 type.

Erlang integers are inferred to be AMQP type `long` to avoid overflow surprises.
2024-11-15 17:40:36 +01:00
Jean-Sébastien Pédron 2938338182
rabbit_khepri: Do not hard-code `coordination`, use the constant instead 2024-11-15 16:41:16 +01:00
Jean-Sébastien Pédron 05717ccccf
rabbit_khepri: Remove serial file during reset 2024-11-15 16:40:50 +01:00
Jean-Sébastien Pédron e41d766b29
rabbit_khepri: Ensure RabbitMQ is stopped before resetting with Khepri 2024-11-15 16:40:45 +01:00
Jean-Sébastien Pédron 7e2e7b79f2
rabbit_feature_flags: Support relative setting in `forced_feature_flags_on_init`
[Why]
We already support that from the environment variable, it is easy to add
to the configuration setting.
2024-11-15 14:50:35 +01:00
Michael Klishin 4766d91dfe
bazel run gazelle 2024-11-15 02:29:56 -05:00
Michael Klishin 3e509c9f30
Merge pull request #12714 from rabbitmq/amqp-event-exchange
Support publishing AMQP 1.0 to Event Exchange
2024-11-14 18:09:19 -05:00
Ayanda Dube 3ecb3b61d4
Use whereis/1 instead of rabbit_process helper, and lists:filtermap/2 in
rabbit_quorum_queue:all_replica_states/0

(cherry picked from commit 19cc2d0608)
2024-11-14 14:05:17 -05:00
Ayanda Dube 6bb4c89c71
Add test for rabbit_quorum_queue:all_replica_states/0
and ensure non-existent/inactive/noproc QQ members are
not reported.

(cherry picked from commit 4e2c62b6af)
2024-11-14 14:05:12 -05:00
Ayanda Dube 9070e394d3
Ensure only alive QQ replica states are reported
when checking replica states to help avoid missing
inactive replicas e.g. on QQ checks from cli tools

(cherry picked from commit 491485092c)
2024-11-14 14:05:08 -05:00
Michael Klishin d5063c7076
Merge pull request #12675 from rabbitmq/fix-check_if_any_deprecated_features_are_used-cli-command
CLI: Finish `check_if_any_deprecated_features_are_used` implementation
2024-11-14 13:57:40 -05:00
Michael Klishin 15d3d5a8a1
Merge pull request #12674 from rabbitmq/add-is_feature_used-callback-to-transient_nonexcl_queues-depr-feature
rabbit_amqqueue: Add `is_feature_used` callback to `transient_nonexcl_queues` depr. feature
2024-11-14 13:57:32 -05:00
Michael Klishin c888689cca
Merge pull request #12722 from rabbitmq/fix-flakes
Fix flakes
2024-11-14 13:36:17 -05:00
Michael Klishin 79f712897d
Merge pull request #12720 from anhanhnguyen/update/grafana-erlang-dashboards
Grafana Dashboards: Fix Query Errors and Improve Instance Filtering in Erlang Distribution and BEAM Dashboards
2024-11-14 13:18:02 -05:00
Loïc Hoguin db50739ad8
CQ: Fix flakes in the store file scan test
We don't expect random bytes to be there in the current
version of the message store as we overwrite empty spaces
with zeroes when moving messages around.

We also don't expect messages to be false flagged when
the broker is running because it checks for message
validity in the index. Therefore make sure message bodies
in the tests don't contain byte 255.
2024-11-14 15:04:49 +01:00
Diana Parra Corbacho 6e7269994d Tests: per_node_limit_SUITE cleanup
Catch exceptions when closing connections during cleanup
2024-11-14 15:02:47 +01:00
Diana Parra Corbacho db78f9b812 Tests: mqtt_shared_SUITE match expected connection 2024-11-14 15:02:47 +01:00
Diana Parra Corbacho 5ef4fba851 tests: amqp_client_SUITE longer wait on receive for CI 2024-11-14 15:02:47 +01:00
Diana Parra Corbacho 9054b122fd tests: clustering_SUITE wait for stats 2024-11-14 15:02:47 +01:00
Diana Parra Corbacho 2d025b579b Tests: amqpl_consumer_ack use unmanaged connection 2024-11-14 15:02:47 +01:00
Diana Parra Corbacho e9a365b20e tests: clustering_prop_SUITE force stats on every wait 2024-11-14 15:02:47 +01:00
Diana Parra Corbacho 067a54aa40 tests: clustering_SUITE wait for metrics 2024-11-14 15:02:47 +01:00
David Ansari de804d1fa7 Support publishing AMQP 1.0 to Event Exchange
## What?

Prior to this commit, the `rabbitmq_event_exchange` internally published
always AMQP 0.9.1 messages to the `amq.rabbitmq.event` topic exchange.
This commit allows users to configure the plugin to publish AMQP 1.0
messages instead.

 ## Why?

Prior to this commit, when an AMQP 1.0 client consumed events,
event properties that are lists were omitted. For example property
`client_properties` of event `connection.created` or property
`arguments` of event `queue.created` were omitted because of the following sequence:
1. The event exchange plugins listens for all kind of internal events.
2. The event exchange plugin re-publishes all events as AMQP 0.9.1 message to the event exchange.
3. Later, when an AMQP 1.0 client consumes this message, the broker must translate the message from AMQP 0.9.1 to AMQP 1.0.
4. This translation follows the rules outlined in https://www.rabbitmq.com/docs/conversions#amqpl-amqp
5. Specifically, in this table the row before the last one describes the rule we're hitting here. It says that if the AMQP 0.9.1
header value is not an `x-` prefixed header and its value is an array or table, then this header is not converted.
That's because AMQP 1.0 application-properties must be simple types as mandated in https://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-messaging-v1.0-os.html#type-application-properties

 ## How?

The user can configure the plugin as follows to have the plugin
internally publish AMQP 1.0 messages:
```
event_exchange.protocol = amqp_1_0
```

To support complex types such as lists, the plugin sets all event
properties as AMQP 1.0 message-annotations. The plugin prefixes all message
annotation keys with `x-opt-` to comply with the AMQP 1.0 spec.

 ## Alternative Design

An alternative design would have been to format all event properties
e.g. as JSON within the message body. However, this breaks routing on
specific event property values via a headers exchange.

 ## Documentation
https://github.com/rabbitmq/rabbitmq-website/pull/2129
2024-11-14 12:52:09 +01:00
Anh Nguyen dc9311a561 Update Erlang Distribution dashboard panel and instance filtering
- Modified metric expression and legend format in State of distribution links
- Changed panel type from 'flant-statusmap-panel' to 'status-history' for Process state
2024-11-14 11:04:07 +07:00
Michael Klishin 27952938f8
Merge pull request #12712 from rabbitmq/gh_12608
QQ: reduce memory use when dropping many messages at once.
2024-11-13 11:42:20 -05:00
Anh Nguyen b9dc0ea3b4 Add instance filtering to Erlang BEAM Grafana dashboard metrics
- Updated metric expressions to include instance filtering with {instance=\"$node\"}
  for the following metrics:
  - erlang_vm_statistics_run_queues_length
  - erlang_vm_statistics_dirty_io_run_queue_length
  - erlang_vm_statistics_dirty_cpu_run_queue_length
- Added 'DS_PROMETHEUS' as a templated data source variable
2024-11-13 20:20:02 +07:00
Karl Nilsson bfa293ab3b QQ: reduce memory use when dropping many messages at once.
As may happen when a max_length configuration change is made
when there are many messages on the queue.
2024-11-13 09:07:40 +00:00
Michael Klishin c78bc8a9c3
4.1: Avoid an exception when an AMQP 0-9-1-originating message with expiration set is converted for an MQTT consumer (#12710)
* MQTT: avoid an exception

when an AMQP 0-9-1 publisher publishes a message
that has expiration set.

Stack trace was contributed in #12707 by @rdsilio.

* mc_mqtt_SUITE test for #12707 #12710

* MQTT protocol_interop_SUITE: new test for #12710 #12707

* Simplify tests

---------

Co-authored-by: David Ansari <david.ansari@gmx.de>
2024-11-13 09:20:43 +01:00
Michael Klishin aba62b9d12
Mention node_tags #12702 in rabbitmq.conf.example 2024-11-11 22:56:47 -05:00
Michael Klishin 94c8f01699
rabbitmq-diagnostics status: handle output of 3.13.x and previously released 4.0.x nodes
In a mixed cluster environment,
'rabbitmq-diagnostics status' can hit a node
that does not return any node tags.

Be more defensive and handle such cases
by simply displaying "(none)" for such
values.
2024-11-11 17:41:33 -05:00
Simon Unge 3d35416635 Node tags local to broker, add to /api/overview output and ctl status command 2024-11-11 20:49:21 +00:00
Jean-Sébastien Pédron 638e3a4b08
rabbit_amqqueue: Add `is_feature_used` callback to `transient_nonexcl_queues` depr. feature
[Why]
Without this callback, the deprecated features subsystem can't report if
the feature is used or not.

This reduces the usefulness of the HTTP API endpoint or the CLI command
that help verify if a cluster is using deprecated features.

[How]
The callback counts transient non-exclusive queues and return `true` if
there are one or more of them.

References #12619.
2024-11-11 15:57:52 +01:00
Jean-Sébastien Pédron ddaea6facb
CLI: Finish `check_if_any_deprecated_features_are_used` implementation
[Why]
The previous implementation bypassed the deprecated features subsystem.
It only cared about classic mirrored queues and called some
queue-related code directly to determine if this specific feature was
used.

[How]
The command code is simplified by calling the deprecated subsystem to
list used deprecated features instead.

References #12619.
2024-11-11 10:13:25 +01:00
Michael Klishin b43a7263f5
List cluster_tags in rabbitmq.conf.example #12552 #12659 #12699 2024-11-10 20:26:24 -05:00
Michael Klishin f5801be6db
Merge pull request #12659 from rabbitmq/su_aws/cluster_tag
Make it possible to set some cluster metadata besides the name using tags
2024-11-10 19:17:53 -05:00
Michael Klishin 7c66fba0c3
Make it possible to clear cluster_tags via rabbitmq.conf 2024-11-10 14:38:34 -05:00
Michael Klishin 9e649aefc0
We no longer use 'maybe' in this module 2024-11-10 14:35:14 -05:00
Michael Klishin e5d805ea6d
Cluster tags: set unconditionally
Otherwise once set, it would not be possible
to change them by updating rabbitmq.conf
2024-11-10 14:30:51 -05:00
Michael Klishin 6b614fc879
rabbitmq.conf.example: add management.http.max_body_size 2024-11-09 18:02:16 -05:00
Michael Klishin 961e5c5a21
Undo the Bazel-related change from #12696
(cherry picked from commit a66c926985)
2024-11-09 17:47:06 -05:00
Michael Klishin 673826425a
Merge pull request #12696 from rabbitmq/mk-http-api-lower-body-length-limit-for-binding-creation
HTTP API: reduce body size limit for the endpoint used to bind queues/streams/exchanges
2024-11-09 17:13:03 -05:00
Michael Klishin 3dc5c463a4
Pass Dialyzer 2024-11-09 16:53:45 -05:00
Michael Klishin b0abf88aa8
rabbit_mgmt_util: minor refactoring 2024-11-09 16:38:48 -05:00
Michael Klishin fb300d2a4b
HTTP API: limit default body size for binding creation
It does not need to use the "worst case scenario"
default HTTP request body size limit that
is primarily necessary because definition imports
can be large (MiBs in size, for example).

Since exchange, queue names and routing key
have limits of 255 bytes and optional arguments
can practically be expected to be short, we
can lower the limit to < 10 KiB.
2024-11-09 16:16:35 -05:00
Simon Unge eeea517da5 Store tags in global parameters 2024-11-08 21:41:49 +00:00
Simon Unge f5ef64ad06 Add cluster tag config that is exposed via HTTP /api/overview and CTL cluster_status 2024-11-08 21:05:02 +00:00
David Ansari ae423721ad Fix flake will_delay_session_takeover
Prior to this commit, the following flake occurred in CI for
```
make -C deps/rabbitmq_mqtt ct-v5 t=cluster_size_1:will_delay_session_takeover
```

```
=== Location: [{v5_SUITE,will_delay_session_takeover,1473},
              {test_server,ts_tc,1793},
              {test_server,run_test_case_eval1,1302},
              {test_server,run_test_case_eval,1234}]
=== === Reason: {test_case_failed,"Received unexpected PUBLISH payload. Expected: <<\"will-3a\">> Got: <<\"will-4a\">>"}
```

The RabbitMQ logs for this single node test show:
```
2024-11-04 14:43:35.039196+00:00 [debug] <0.1334.0> MQTT accepting TCP connection <0.1334.0> (127.0.0.1:42576 -> 127.0.0.1:27005)
2024-11-04 14:43:35.039336+00:00 [debug] <0.1334.0> Received a CONNECT, client ID: c3, username: undefined, clean start: true, protocol version: 5, keepalive: 60, property names: []
2024-11-04 14:43:35.039438+00:00 [debug] <0.1334.0> MQTT connection 127.0.0.1:42576 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.039537+00:00 [debug] <0.1334.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.039729+00:00 [info] <0.1334.0> Accepted MQTT connection 127.0.0.1:42576 -> 127.0.0.1:27005 for client ID c3
2024-11-04 14:43:35.040297+00:00 [debug] <0.1337.0> MQTT accepting TCP connection <0.1337.0> (127.0.0.1:42580 -> 127.0.0.1:27005)
2024-11-04 14:43:35.040442+00:00 [debug] <0.1337.0> Received a CONNECT, client ID: c4, username: undefined, clean start: true, protocol version: 5, keepalive: 60, property names: []
2024-11-04 14:43:35.040534+00:00 [debug] <0.1337.0> MQTT connection 127.0.0.1:42580 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.040597+00:00 [debug] <0.1337.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.040793+00:00 [info] <0.1337.0> Accepted MQTT connection 127.0.0.1:42580 -> 127.0.0.1:27005 for client ID c4
2024-11-04 14:43:35.041463+00:00 [debug] <0.1340.0> MQTT accepting TCP connection <0.1340.0> (127.0.0.1:42596 -> 127.0.0.1:27005)
2024-11-04 14:43:35.041715+00:00 [debug] <0.1340.0> Received a CONNECT, client ID: c1, username: undefined, clean start: false, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.041806+00:00 [debug] <0.1340.0> MQTT connection 127.0.0.1:42596 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.041881+00:00 [debug] <0.1340.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.041982+00:00 [warning] <0.1328.0> MQTT disconnecting client <<"127.0.0.1:42560 -> 127.0.0.1:27005">> with duplicate id 'c1'
2024-11-04 14:43:35.042062+00:00 [info] <0.1340.0> Accepted MQTT connection 127.0.0.1:42596 -> 127.0.0.1:27005 for client ID c1
2024-11-04 14:43:35.045624+00:00 [debug] <0.1345.0> MQTT accepting TCP connection <0.1345.0> (127.0.0.1:42602 -> 127.0.0.1:27005)
2024-11-04 14:43:35.045781+00:00 [debug] <0.1345.0> Received a CONNECT, client ID: c2, username: undefined, clean start: false, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.045874+00:00 [debug] <0.1345.0> MQTT connection 127.0.0.1:42602 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.045943+00:00 [debug] <0.1345.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.046032+00:00 [warning] <0.1331.0> MQTT disconnecting client <<"127.0.0.1:42566 -> 127.0.0.1:27005">> with duplicate id 'c2'
2024-11-04 14:43:35.046281+00:00 [info] <0.1345.0> Accepted MQTT connection 127.0.0.1:42602 -> 127.0.0.1:27005 for client ID c2
2024-11-04 14:43:35.047063+00:00 [debug] <0.1350.0> MQTT accepting TCP connection <0.1350.0> (127.0.0.1:42614 -> 127.0.0.1:27005)
2024-11-04 14:43:35.047702+00:00 [debug] <0.1350.0> Received a CONNECT, client ID: c3, username: undefined, clean start: true, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.047910+00:00 [debug] <0.1350.0> MQTT connection 127.0.0.1:42614 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.048467+00:00 [debug] <0.1350.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.049701+00:00 [info] <0.1350.0> Accepted MQTT connection 127.0.0.1:42614 -> 127.0.0.1:27005 for client ID c3
2024-11-04 14:43:35.050907+00:00 [warning] <0.1334.0> MQTT disconnecting client <<"127.0.0.1:42576 -> 127.0.0.1:27005">> with duplicate id 'c3'
2024-11-04 14:43:35.051248+00:00 [debug] <0.1353.0> MQTT accepting TCP connection <0.1353.0> (127.0.0.1:42626 -> 127.0.0.1:27005)
2024-11-04 14:43:35.051395+00:00 [debug] <0.1353.0> Received a CONNECT, client ID: c4, username: undefined, clean start: false, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.051519+00:00 [debug] <0.1353.0> MQTT connection 127.0.0.1:42626 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.051590+00:00 [debug] <0.1353.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.051871+00:00 [info] <0.1353.0> Accepted MQTT connection 127.0.0.1:42626 -> 127.0.0.1:27005 for client ID c4
2024-11-04 14:43:35.051960+00:00 [warning] <0.1337.0> MQTT disconnecting client <<"127.0.0.1:42580 -> 127.0.0.1:27005">> with duplicate id 'c4'
2024-11-04 14:43:35.052689+00:00 [debug] <0.1337.0> sent Will Message to topic my/topic for MQTT client ID c4
2024-11-04 14:43:35.054119+00:00 [debug] <0.1334.0> sent Will Message to topic my/topic for MQTT client ID c3
```

We see nicely how RabbitMQ sends the will message for both c3 and c4.
However, the order in which RabbitMQ sends is not guaranteed.
Hence, we adapt the test expectation to not depend on the order of Will
messages being received.
2024-11-08 16:12:52 +01:00
Arnaud Cogoluègnes ca70f20bca
Merge pull request #12686 from rabbitmq/stream-coordinator-ra-local-query-infinity-timeout
Use infinity timout for RA local query in stream coordinator
2024-11-08 15:08:16 +01:00
David Ansari c839409599 Fix test flake properties_section
This test flaked in CI with the following error:
```
=== === Reason: no match of right hand side value {error,half_attached}
  in function  amqp_utils:detach_link_sync/1 (amqp_utils.erl, line 100)
  in call from amqp_filtex_SUITE:properties_section/1 (amqp_filtex_SUITE.erl, line 187)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```
2024-11-08 11:46:19 +01:00
David Ansari fa61266979
Merge pull request #12689 from rabbitmq/mqtt_flake_management_plugin_connection
Fix MQTT test flake management_plugin_connection
2024-11-08 11:45:16 +01:00
David Ansari 9095f7d961 Fix test flake
Increase waiting for credit being applied as described in commit
aeedad7b51 since this test case still flakes rarely with:
```
=== === Reason: {assertEqual,[{module,amqp_client_SUITE},
                               {line,3030},
                               {expression,"amqp10_msg : body ( Msg1 )"},
                               {expected,[<<"1">>]},
                               {value,[<<"2">>]}]}
  in function  amqp_client_SUITE:detach_requeues_two_connections/2 (amqp_client_SUITE.erl, line 3030)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```
2024-11-08 11:35:04 +01:00
David Ansari 090384fe37 Fix MQTT test flake management_plugin_connection
Prior to this commit this test flaked in CI:
```
=== === Reason: {assertEqual,
                     [{module,mqtt_shared_SUITE},
                      {line,1222},
                      {expression,"http_get ( Config , \"/connections\" )"},
                      {expected,[]},
                      {value,
                          [#{timeout => 99,
                             name => <<"127.0.0.1:58712 -> 127.0.0.1:29005">>,
                             node =>
                                 <<"rmq-ct-mqtt-cluster_size_1-1-29000@localhost">>,
                             port => 29005,user => <<"guest">>,ssl => false,
                             protocol => <<"MQTT 5-0">>,
                             host => <<"127.0.0.1">>,
                             client_properties =>
                                 #{client_id =>
                                       <<"management_plugin_connection">>},
                             vhost => <<"/">>,peer_host => <<"127.0.0.1">>,
                             peer_port => 58712,frame_max => 0,
                             channel_max => 0,auth_mechanism => <<"none">>,
                             connected_at => 1730797370048,
                             ssl_protocol => null,ssl_key_exchange => null,
                             ssl_cipher => null,ssl_hash => null,
                             peer_cert_issuer => null,
                             peer_cert_subject => null,
                             peer_cert_validity => null,
                             user_who_performed_action => <<"guest">>}]}]}
  in function  mqtt_shared_SUITE:management_plugin_connection/1 (mqtt_shared_SUITE.erl, line 1222)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```
2024-11-08 11:12:04 +01:00
David Ansari 40bf778e89 Fix MQTT test flake
Prior to this commit, test
```
make -C deps/rabbitmq_mqtt ct-mqtt_shared t=[mqtt,cluster_size_1,v4]:non_clean_sess_reconnect_qos0_and_qos1
```

flaked in CI with error:
```
{mqtt_shared_SUITE,non_clean_sess_reconnect_qos0_and_qos1,972}
{badmatch,{publish_not_received,<<"msg-0">>}}
```

The problem was the following race condition:
* The MQTT v4 client sends an async DISCONNECT
* The global MQTT consumer metric got decremented. However, the classic
  queue still has the MQTT connection proc registered as consumer.
* The test case sends a message
* The classic queue checks out the message to the old connection instead
  of checking out the message to the new connection.

The solution in this commit is to check the consumer count of the
classic queue before proceeding to send the message after disconnection.
2024-11-08 10:35:36 +01:00
Arnaud Cogoluègnes 1634adbff3
Use infinity timout for RA local query in stream coordinator
The 5-second default timeout is too short.
2024-11-08 09:26:24 +01:00
Michael Klishin bcfca2be98
Merge pull request #12670 from rabbitmq/amqp-connection-sessions
Show session and link details for AMQP 1.0 connection
2024-11-07 14:39:47 -05:00
Arnaud Cogoluègnes 26f941b815
Squash dialyzer warning 2024-11-07 18:17:47 +01:00
Arnaud Cogoluègnes 1554b74fc7
Use 4-space indent in stream manager 2024-11-07 17:03:50 +01:00
Arnaud Cogoluègnes 5107fd48ba
Remove gen_server behaviour from stream manager
The stream manager does not need to be a gen_server (no cast, no state)
and the gen_server can create contention for large stream deployments
(some functions make cluster-wide calls that can take some time).
2024-11-07 16:50:31 +01:00
David Ansari 124ef694bc Fix crashes
This commit fixes two different bugs/crashes.

To repro, prior to this commit:
1. Create an AMQP 1.0 connection on node-1.
2. Open the Management UI on node-2 and open the connection page of this
   single AMQP 1.0 connection.

The first crash was the following:
```
[error] <0.1297.0>   crasher:
[error] <0.1297.0>     initial call: cowboy_stream_h:request_process/3
[error] <0.1297.0>     pid: <0.1297.0>
[error] <0.1297.0>     registered_name: []
[error] <0.1297.0>     exception error: no case clause matching
[error] <0.1297.0>                      {badrpc,
[error] <0.1297.0>                          {'EXIT',
[error] <0.1297.0>                              {undef,
[error] <0.1297.0>                                  [{rabbit_connection_tracking,lookup,
[error] <0.1297.0>                                       [<<"[::1]:51729 -> [::1]:5672">>,
[error] <0.1297.0>                                        ['rabbit-1@ABCDDDEEAA']],
[error] <0.1297.0>                                       []}]}}}
[error] <0.1297.0>       in function  rabbit_connection_tracking:lookup/2 (rabbit_connection_tracking.erl, line 235)
[error] <0.1297.0>       in call from rabbit_mgmt_wm_connection_sessions:conn/1 (rabbit_mgmt_wm_connection_sessions.erl, line 72)
[error] <0.1297.0>       in call from rabbit_mgmt_wm_connection_sessions:is_authorized/2 (rabbit_mgmt_wm_connection_sessions.erl, line 63)
[error] <0.1297.0>       in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
[error] <0.1297.0>       in call from cowboy_rest:is_authorized/2 (src/cowboy_rest.erl, line 368)
[error] <0.1297.0>       in call from cowboy_rest:upgrade/4 (src/cowboy_rest.erl, line 284)
[error] <0.1297.0>       in call from cowboy_stream_h:execute/3 (src/cowboy_stream_h.erl, line 306)
[error] <0.1297.0>       in call from cowboy_stream_h:request_process/3 (src/cowboy_stream_h.erl, line 295)
```

The second crash was the following:
```
[error] <0.1132.0>   crasher:
[error] <0.1132.0>     initial call: cowboy_stream_h:request_process/3
[error] <0.1132.0>     pid: <0.1132.0>
[error] <0.1132.0>     registered_name: []
[error] <0.1132.0>     exception error: no case clause matching
[error] <0.1132.0>                      {tracked_connection,
[error] <0.1132.0>                          {'rabbit-1@ABCDDDEEAA',
[error] <0.1132.0>                              <<"[::1]:65505 -> [::1]:5672">>},
[error] <0.1132.0>                          'rabbit-1@ABCDDDEEAA',<<"/">>,
[error] <0.1132.0>                          <<"[::1]:65505 -> [::1]:5672">>,<13661.1110.0>,
[error] <0.1132.0>                          {1,0},
[error] <0.1132.0>                          network,
[error] <0.1132.0>                          {0,0,0,0,0,0,0,1},
[error] <0.1132.0>                          65505,<<"guest">>,1730908606089}
[error] <0.1132.0>       in function  rabbit_connection_tracking:lookup/2 (rabbit_connection_tracking.erl, line 235)
[error] <0.1132.0>       in call from rabbit_mgmt_wm_connection_sessions:conn/1 (rabbit_mgmt_wm_connection_sessions.erl, line 72)
[error] <0.1132.0>       in call from rabbit_mgmt_wm_connection_sessions:is_authorized/2 (rabbit_mgmt_wm_connection_sessions.erl, line 63)
[error] <0.1132.0>       in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
[error] <0.1132.0>       in call from cowboy_rest:is_authorized/2 (src/cowboy_rest.erl, line 368)
[error] <0.1132.0>       in call from cowboy_rest:upgrade/4 (src/cowboy_rest.erl, line 284)
[error] <0.1132.0>       in call from cowboy_stream_h:execute/3 (src/cowboy_stream_h.erl, line 306)
[error] <0.1132.0>       in call from cowboy_stream_h:request_process/3 (src/cowboy_stream_h.erl, line 295)
2024-11-07 15:11:42 +01:00
David Ansari 9d0c851df2 Show session and link details for AMQP 1.0 connection
## What?

On the connection page in the Management UI, display detailed session and
link information including:
* Link names
* Link target and source addresses
* Link flow control state
* Session flow control state
* Number of unconfirmed and unacknowledged messages

 ## How?

A new HTTP API endpoint is added:
```
/connections/:connection_name/sessions
```

The HTTP handler first queries the Erlang connection process to find out about
all session Pids. The handler then queries each Erlang session process
of this connection.

(The table auto-refreshes by default every 5 seconds. The handler querying a single
connection with 60 idle sessions with each 250 links takes ~100 ms.)

For better user experience in the Management UI, this commit also makes the
session process store and expose link names as well as source/target addresses.
2024-11-07 15:11:42 +01:00
Jean-Sébastien Pédron 26a00e7969
rabbitmq_management: Link from the deprecated features panel to docs
... that were added to the website in rabbitmq/rabbitmq-website#2122.
2024-11-06 20:00:41 +01:00
markus812498 a961b5f418
cosmetic: arranged and reorganized vertical bars
(cherry picked from commit 98c2363a79)
2024-11-06 12:04:00 -05:00
Diana Parra Corbacho 3eb2bc4507 Tests: clustering_prop_SUITE set core_metrics_gc_interval to a very low value 2024-11-06 16:59:42 +01:00
Diana Parra Corbacho 8dc712543e Tests: set disk monitor as active in set_disk_free_limit_command_test 2024-11-06 16:59:19 +01:00
Jean-Sébastien Pédron f7a740cd8f
rabbit_feature_flags: Rework the management UI page
[Why]
The "Feature flags" admin section had several issues:
* It was not designed for experimental feature flags. What was done for
  RabbitMQ 4.0.0 was still unclear as to what a user should expect for
  experimental feature flags.
* The UI uses synchronous requests from the browser main thread. It
  means that for a feature flag that has a long running migration
  callback, the browser tab could freeze for a very long time.

[How]
The feature flags table is reworked and now displays:
* a series of icons to highlight the following:
    * a feature flag that has a migration function and thus that can
      take time to be enabled
    * a feature flag that is experimental
    * whether this experimental feature flag is supported or not
* a toggle to quickly show if a feature flag is enabled or not and let
  the user enable it at the same time.

For stable feature flags, when a user click on the toggle, the toggle
goes into an intermediate state while waiting for the response from the
broker. If the response is successful, the toggle is green. Otherwise it
goes back to red and the error is displayed in a popup as before.

For experimental feature flags, when a user click on the toggle, a popup
is displayed to let the user know of the possible constraints and
consequences, with one or two required checkboxes to tick so the user
confirms they understand the message. The feature flag is enabled only
after the user validates the popup. The displayed message and the
checkboxes depend on if the experimental feature flag is supported or
not (it is a new attribute of experimental feature flags).

The request to enable feature flags now uses the modern `fetch()` API.
Therefore it uses Javascript promises and does not block the main
thread: the UI remains responsive while a migration callback runs.

Finally, an "Enable all stable feature flags" button has been added to
the warning that tells the user some stable feature flags are still
disabled.

V2: Pause auto-refresh while a feature flag is being handled. This fixes
    some display inconsistencies.
2024-11-06 11:35:14 +01:00
Jean-Sébastien Pédron d2d608211a
rabbit_feature_flags: Use non-blocking call in `get_state/1`
[Why]
The previous implementation was using the blocking `is_enabled/1` API.
This meant that if a feature flag was being enabled and the enable
callback took time, the CLI's `list_feature_flag` command or any use of
the management UI would block until the feature flag was enabled.

[How]
`get_state/1` now uses the non-blocking API. However it returns a now
possible value: `state_changing`.
2024-11-06 11:35:14 +01:00
Jean-Sébastien Pédron f90cb869cc
rabbit_feature_flags: Expose more feature flag properties to the management API
[Why]
It allows to better communicate each feature flag specificities and make
a better more user-friendly management UI.
2024-11-06 11:35:14 +01:00
Jean-Sébastien Pédron 724705ca3b
rabbit_feature_flags: Declare if an experimental feature flag is supported or not
[Why]
Durint the development of Khepri, it was difficult to communicate that
it was unsupported in RabbitMQ 3.13.x but was then supported in 4.0.x
even though it was still experimental.

[How]
The feature flag definition now exposes that support level in a now
attribute called `experiment_level`. It can be `unsupported` or
`supported`.

We can use this now attribute in the CLI or the web UI to convey the
level of support to the end user.

In the future, we could imagine that an experimental feature flag
becomes abandoned, where upgraded from a node that has it enabled to a
version that marks the feature flag as abandoned is not possible.
2024-11-06 11:35:14 +01:00
Lois Soto Lopez 4819801a33 Exclude policy_repair QQ test on mixed versions 2024-11-05 16:38:18 +01:00
GitHub 71bdd1a78c bazel run gazelle 2024-11-05 04:02:22 +00:00
Michael Klishin 9ff83519f9
Merge pull request #12652 from rabbitmq/dependabot/maven/deps/rabbitmq_mqtt/test/java_SUITE_data/main/org.apache.maven.plugins-maven-surefire-plugin-3.5.2
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin from 3.5.1 to 3.5.2 in /deps/rabbitmq_mqtt/test/java_SUITE_data
2024-11-04 14:07:43 -05:00
dependabot[bot] fc3ef6dda7
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin
Bumps [org.apache.maven.plugins:maven-surefire-plugin](https://github.com/apache/maven-surefire) from 3.5.1 to 3.5.2.
- [Release notes](https://github.com/apache/maven-surefire/releases)
- [Commits](https://github.com/apache/maven-surefire/compare/surefire-3.5.1...surefire-3.5.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-surefire-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 18:48:45 +00:00
dependabot[bot] 969186f6fd
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin
Bumps [org.apache.maven.plugins:maven-surefire-plugin](https://github.com/apache/maven-surefire) from 3.5.1 to 3.5.2.
- [Release notes](https://github.com/apache/maven-surefire/releases)
- [Commits](https://github.com/apache/maven-surefire/compare/surefire-3.5.1...surefire-3.5.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-surefire-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 18:34:32 +00:00
dependabot[bot] 4860585c50
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin
Bumps [org.apache.maven.plugins:maven-surefire-plugin](https://github.com/apache/maven-surefire) from 3.5.1 to 3.5.2.
- [Release notes](https://github.com/apache/maven-surefire/releases)
- [Commits](https://github.com/apache/maven-surefire/compare/surefire-3.5.1...surefire-3.5.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-surefire-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 18:17:42 +00:00
Michael Klishin 508ec97c88
Merge pull request #12646 from rabbitmq/metrics-flakes
metrics_SUITE: wait for tables in proper test
2024-11-04 12:08:26 -05:00
Michael Klishin 6e54084db0
Merge pull request #12642 from rabbitmq/prometheus-federation-flake
test: wait for links and metrics in prometheus_rabbitmq_federation_collector_SUITE
2024-11-04 12:00:48 -05:00
David Ansari 6034f3c411
Merge pull request #12638 from rabbitmq/amqp-connection-metrics
Expose AMQP connection metrics
2024-11-04 17:57:29 +01:00
Diana Parra Corbacho 054fcd676c metrics_SUITE: wait for tables in proper test 2024-11-04 16:46:23 +01:00
Diana Parra Corbacho 2fba2419d3 test: wait for links and metrics in prometheus_rabbitmq_federation_collector_SUITE 2024-11-04 08:42:42 +01:00
Michael Klishin 734f6853bc
Merge branch 'main' into rabbitmq-server-12412 2024-11-04 00:39:51 -05:00
Michael Klishin 8ea7e65e34
QQ: handle case where a stale read request results in member crash.
It is possible for a slow running follower with local consumers
to crash after a snapshot installation as it tries to read an entry
from its log that is no longer there (as it has been consumed and
completed by another node but still refers to prior consumers on the
current node).

This commit makes the log effect callback function more defensive
to check that the number of commands returned by the log effect
isn't different from what was requested. if it is different we
consider this a stale read request and return no further effects.

Conflicts:
	deps/rabbit/test/quorum_queue_SUITE.erl
2024-11-04 00:36:48 -05:00
GitHub 654bd047f3
bazel run gazelle 2024-11-04 00:34:52 -05:00
David Ansari af876ed6d1
Use log macros for AMQP
Using a log macro has the benefit that location data is added as
explained in https://www.erlang.org/doc/apps/kernel/logger.html#t:metadata/0
2024-11-04 00:34:51 -05:00
David Ansari 6fde076707
Support AMQP 1.0 token renewal
Closes #9259.

 ## What?
Allow an AMQP 1.0 client to renew an OAuth 2.0 token before it expires.

 ## Why?
This allows clients to keep the AMQP connection open instead of having
to create a new connection whenever the token expires.

 ## How?
As explained in https://github.com/rabbitmq/rabbitmq-server/issues/9259#issuecomment-2437602040
the client can `PUT` a new token on HTTP API v2 path `/auth/tokens`.
RabbitMQ will then:
1. Store the new token on the given connection.
2. Recheck access to the connection's vhost.
3. Clear all permission caches in the AMQP sessions.
4. Recheck write permissions to exchanges for links publishing to
   RabbitMQ, and recheck read permissions from queues for links
   consuming from RabbitMQ. The latter complies with the user
   expectation in #11364.
2024-11-04 00:34:51 -05:00
Diana Parra Corbacho ff44f4d355
Test: metrics_SUITE queue_idemp wait for queue metrics 2024-11-04 00:34:51 -05:00
Diana Parra Corbacho 7ac5b17787
Test: wait for metrics 2024-11-04 00:34:51 -05:00
Diana Parra Corbacho ab9d225502
Tests: wait for connection closed in metrics_SUITE 2024-11-04 00:34:50 -05:00
Michal Kuratczyk df8f6d19aa
Abort restart-cluster if something goes wrong
For example, if the first restarted node doesn't start,
don't try to restart the other nodes. This mimics what
orchestrators such as Kubernetes or BOSH would do
(although they perform this check differently)
2024-11-04 00:34:50 -05:00
Marcial Rosales c8e1593679
Verify non-zero DNS and email SAN 2024-11-04 00:34:50 -05:00
Marcial Rosales c0ef442d6d
Use the correct variable name 2024-11-04 00:34:50 -05:00
Jean-Sébastien Pédron fe7beea4b8
rabbit_feature_flags: Log controller task on a single line 2024-11-04 00:34:50 -05:00
Jean-Sébastien Pédron 9802348683
rabbit_feature_flags: Report feature flags init error reason
[Why]
`failed_to_initialize_feature_flags_registry` was a little too vague.
2024-11-04 00:34:50 -05:00
Jean-Sébastien Pédron 937ca915c9
rabbit_feature_flags: Introduce hard vs. soft required feature flags
[Why]
Before this patch, required feature flags were basically checked during
boot: they must have been enabled when they were mere stable feature
flags. If they were not, the node refused to boot.

This was easy for the developer because making a feature flag required
allowed to remove the entire compatibility code. Very satisfying.

Unfortunately, this was a pain point to end users, especially those who
did not pay attention to RabbitMQ and the release notes and were just
asking their package manager to update everything. They could end up
with a node that refuse to boot. The only solution was to downgrade,
enable the disabled stable feature flags, upgrade again.

[How]
This patch introduces two levels of requirement to required feature
flags:
* `hard`: this corresponds to the existing behavior where a node will
  refuse to boot if a hard required feature flag is not enabled before
  the upgrade.
* `soft`: such a required feature flag will be automatically enabled
  during the upgrade to a version where it is marked as required.

The level of requirement is set in the feature flag definition:

    -rabbit_feature_flag(
       {my_feature_flag,
        #{stability     => required,
	  require_level => hard
         }}).

The default requirement level is `soft`. All existing required feature
flags have now a requirement level of `hard`.

The handling of soft required feature flag is done when the cluster
feature flags states are verified and synchronized. If a required
feature flag is not enabled yet, it is enabled at that time.

This means that as developers, we will have to keep compatibility code
forever for every soft required feature flag, like the feature flag
definition itself.
2024-11-04 00:34:49 -05:00
Jean-Sébastien Pédron b5b598ce25
rabbit_prometheus_http_SUITE: Start broker once in `special_chars` group
`init_per_group/3`, which starts the broker, was already called earlier
in the function.

This fixes a bug where the node can't be stopped in `end_per_group/2`,
attecting the next group ability to start one.
2024-11-04 00:34:49 -05:00
Diana Parra Corbacho 624b72bedb
queue_SUITE: use a different upstream for each queue on multi-federation tests 2024-11-04 00:34:49 -05:00
David Ansari ea7bc819fd
Add AMQP 1.0 event exchange test 2024-11-04 00:34:49 -05:00
Jean-Sébastien Pédron 2d61fac09c
rabbitmq-run.mk: Restart nodes in a cluster sequentially
... not in parallel.
2024-11-04 00:34:49 -05:00
Jean-Sébastien Pédron 7f1d1615f9
rabbitmq-run.mk: Use a 60 seconds timeout for `rabbitmqctl wait`
... not 60 milliseconds.
2024-11-04 00:34:49 -05:00
Michael Klishin 0a557f7d5e
Use fmt_string in this error message 2024-11-04 00:34:48 -05:00
Diana Parra Corbacho ef06f80bb8
Fix metrics_SUITE connection_metrics flake 2024-11-04 00:34:48 -05:00
Loïc Hoguin 2235492d28
Make CI: Add mixed version testing
This is enabled on main and for pull requests. Bazel remains
used in previous branches.
2024-11-04 00:34:47 -05:00
dependabot[bot] 7e05aac424
build(deps): bump org.springframework.boot:spring-boot-starter-parent
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 00:34:47 -05:00
dependabot[bot] 6ade708dab
build(deps): bump org.springframework.boot:spring-boot-starter-parent
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 00:34:47 -05:00
David Ansari 238ce77585
Delete test access_failure
This test flakes in CI as described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2419293869

The test case fails with
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```

However, RabbitMQ closes the session as expected due to the missing read
permissions to the queue as shown in the RabbitMQ logs:
```
[debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
[info] <0.1321.0> Created user 'access_failure'
[debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
[debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
[info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
[debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
[debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
[warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
[warning] <0.1338.0>                                             {symbol,
[warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
[warning] <0.1338.0>                                             {utf8,
[warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
[warning] <0.1338.0>                                             undefined}
[debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
[warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
[warning] <0.1333.0> client unexpectedly closed TCP connection
```

```
let receiver = ReceiverLink(ac.Session, "test-receiver", src)
```
uses a null constructur for the onAttached callback.
ReceiverLink doesn't seem to block.

Given that the exact same authorization error is already tested in test
case attach_source_queue of amqp_auth_SUITE, it's safe to delete this F#
test.
2024-11-04 00:34:47 -05:00
David Ansari 52b6419876
Remove test flake
Prior to this commit tests
* leader_transfer_quorum_queue_credit_single
* leader_transfer_quorum_queue_credit_batches
flaked in CI during 4.1 (main) and 4.0 mixed version testing.

The follwing error occurred on node 0:
```
[error] <0.1950.0> Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0
[warning] <0.1950.0> Closing session for connection <0.1945.0>: {'v1_0.error',
[warning] <0.1950.0>                                             {symbol,<<"amqp:internal-error">>},
[warning] <0.1950.0>                                             {utf8,
[warning] <0.1950.0>                                              <<"Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0">>},
[warning] <0.1950.0>                                             undefined}
```

Therefore we enable this feature flag for both tests.

This commit also simplifies some test setups that were necessary for
4.0/3.13 mixed version testing, but isn't necessary anymore for 4.1/4.0
mixed version testing.
2024-11-04 00:34:47 -05:00
David Ansari 70597737e4
Support x-cc message annotation (#12559)
Support x-cc message annotation

Support an `x-cc` message annotation in AMQP 1.0
similar to the [CC](https://www.rabbitmq.com/docs/sender-selected) header in AMQP 0.9.1.

The value of the `x-cc` message annotation must by a list of strings.
A message annotation is used since application properties allow only simple types.
2024-11-04 00:34:47 -05:00
David Ansari 9c2ee91a3c
Validate setting permissions works
in order to troubleshoot the flake described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2419293869
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
2024-11-04 00:34:46 -05:00
David Ansari 7a5277e1c4
Fix test flake
As described in https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2385379386
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
2024-11-04 00:34:45 -05:00
David Ansari 3db4a97cfb Expose AMQP connection metrics
Expose the same metrics for AMQP 1.0 connections as for AMQP 0.9.1 connections.

Display the following AMQP 1.0 metrics on the Management UI:
* Network bytes per second from/to client on connections page
* Number of sessions/channels on connections page
* Network bytes per second from/to client graph on connection page
* Reductions graph on connection page
* Garbage colletion info on connection page

Expose the following AMQP 1.0 per-object Prometheus metrics:
* rabbitmq_connection_incoming_bytes_total
* rabbitmq_connection_outgoing_bytes_total
* rabbitmq_connection_process_reductions_total
* rabbitmq_connection_incoming_packets_total
* rabbitmq_connection_outgoing_packets_total
* rabbitmq_connection_pending_packets
* rabbitmq_connection_channels

The rabbit_amqp_writer proc:
* notifies the rabbit_amqp_reader proc if it sent frames
* hibernates eventually if it doesn't send any frames

The rabbit_amqp_reader proc:
* does not emit stats (update ETS tables) if no frames are received
or sent to save resources when there are many idle connections.
2024-11-02 19:08:24 +01:00
Karl Nilsson 94e677987f QQ: handle case where a stale read request results in member crash.
It is possible for a slow running follower with local consumers
to crash after a snapshot installation as it tries to read an entry
from its log that is no longer there (as it has been consumed and
completed by another node but still refers to prior consumers on the
current node).

This commit makes the log effect callback function more defensive
to check that the number of commands returned by the log effect
isn't different from what was requested. if it is different we
consider this a stale read request and return no further effects.
2024-11-01 11:37:20 +00:00
GitHub fa0067c22d bazel run gazelle 2024-11-01 04:02:26 +00:00
Michael Klishin 67bc9504ed
Merge pull request #12617 from rabbitmq/amqp-log-macro
Use log macros for AMQP
2024-10-31 14:25:42 -04:00
Diana Parra Corbacho 0df71d54cb Test: metrics_SUITE queue_idemp wait for queue metrics 2024-10-31 09:39:44 +01:00
Diana Parra Corbacho e1c22a0d2a Test: wait for metrics 2024-10-31 09:05:40 +01:00
Diana Parra Corbacho 34c1fd13d9 Tests: wait for connection closed in metrics_SUITE 2024-10-31 09:05:30 +01:00
Michael Klishin 7b04b4f885
Merge pull request #12616 from rabbitmq/abort-restart-cluster-on-failure
Make (rabbitmq-run.mk): abort restart-cluster if something goes wrong
2024-10-30 13:11:01 -04:00
Michael Klishin 5f6b1ccfb5
Merge pull request #12604 from rabbitmq/mqtt-fix-ssl-variable-name
MQTT, x.509 certificate-based authentication: use the correct key name for the TLS SAN type configuration parameter
2024-10-30 11:16:15 -04:00
David Ansari dbd9ede67b Use log macros for AMQP
Using a log macro has the benefit that location data is added as
explained in https://www.erlang.org/doc/apps/kernel/logger.html#t:metadata/0
2024-10-30 14:50:05 +01:00
Michal Kuratczyk 2c0fc70135
Abort restart-cluster if something goes wrong
For example, if the first restarted node doesn't start,
don't try to restart the other nodes. This mimics what
orchestrators such as Kubernetes or BOSH would do
(although they perform this check differently)
2024-10-30 12:58:35 +01:00
Jean-Sébastien Pédron 3c15d7e3e6
rabbit_feature_flags: Log controller task on a single line 2024-10-30 11:12:40 +01:00
Jean-Sébastien Pédron 2abec68708
rabbit_feature_flags: Report feature flags init error reason
[Why]
`failed_to_initialize_feature_flags_registry` was a little too vague.
2024-10-30 11:12:40 +01:00
Jean-Sébastien Pédron ea899602b0
rabbit_feature_flags: Introduce hard vs. soft required feature flags
[Why]
Before this patch, required feature flags were basically checked during
boot: they must have been enabled when they were mere stable feature
flags. If they were not, the node refused to boot.

This was easy for the developer because making a feature flag required
allowed to remove the entire compatibility code. Very satisfying.

Unfortunately, this was a pain point to end users, especially those who
did not pay attention to RabbitMQ and the release notes and were just
asking their package manager to update everything. They could end up
with a node that refuse to boot. The only solution was to downgrade,
enable the disabled stable feature flags, upgrade again.

[How]
This patch introduces two levels of requirement to required feature
flags:
* `hard`: this corresponds to the existing behavior where a node will
  refuse to boot if a hard required feature flag is not enabled before
  the upgrade.
* `soft`: such a required feature flag will be automatically enabled
  during the upgrade to a version where it is marked as required.

The level of requirement is set in the feature flag definition:

    -rabbit_feature_flag(
       {my_feature_flag,
        #{stability     => required,
	  require_level => hard
         }}).

The default requirement level is `soft`. All existing required feature
flags have now a requirement level of `hard`.

The handling of soft required feature flag is done when the cluster
feature flags states are verified and synchronized. If a required
feature flag is not enabled yet, it is enabled at that time.

This means that as developers, we will have to keep compatibility code
forever for every soft required feature flag, like the feature flag
definition itself.
2024-10-30 11:12:18 +01:00
David Ansari 1778bc22aa Support AMQP 1.0 token renewal
Closes #9259.

 ## What?
Allow an AMQP 1.0 client to renew an OAuth 2.0 token before it expires.

 ## Why?
This allows clients to keep the AMQP connection open instead of having
to create a new connection whenever the token expires.

 ## How?
As explained in https://github.com/rabbitmq/rabbitmq-server/issues/9259#issuecomment-2437602040
the client can `PUT` a new token on HTTP API v2 path `/auth/tokens`.
RabbitMQ will then:
1. Store the new token on the given connection.
2. Recheck access to the connection's vhost.
3. Clear all permission caches in the AMQP sessions.
4. Recheck write permissions to exchanges for links publishing to
   RabbitMQ, and recheck read permissions from queues for links
   consuming from RabbitMQ. The latter complies with the user
   expectation in #11364.
2024-10-30 10:42:40 +01:00
Jean-Sébastien Pédron d6024e30f4
rabbit_prometheus_http_SUITE: Start broker once in `special_chars` group
`init_per_group/3`, which starts the broker, was already called earlier
in the function.

This fixes a bug where the node can't be stopped in `end_per_group/2`,
attecting the next group ability to start one.
2024-10-30 10:08:56 +01:00
Marcial Rosales e7cb2420a7 Verify non-zero DNS and email SAN 2024-10-29 16:41:20 +01:00
Marcial Rosales 4c1099950d Use the correct variable name 2024-10-29 16:41:20 +01:00
Diana Parra Corbacho 02bca637e1 queue_SUITE: use a different upstream for each queue on multi-federation tests 2024-10-29 16:01:07 +01:00
David Ansari 444df0029b
Merge pull request #12606 from rabbitmq/amqp-event-exchange
Add AMQP 1.0 event exchange test
2024-10-29 14:19:28 +01:00
David Ansari f55cd21e52 Add AMQP 1.0 event exchange test 2024-10-29 12:04:09 +01:00
Jean-Sébastien Pédron c0be3c0648
rabbitmq-run.mk: Restart nodes in a cluster sequentially
... not in parallel.
2024-10-29 11:41:20 +01:00
Jean-Sébastien Pédron 624d9bae0c
rabbitmq-run.mk: Use a 60 seconds timeout for `rabbitmqctl wait`
... not 60 milliseconds.
2024-10-29 11:37:50 +01:00
Lois Soto Lopez 2577b7e284 Remove extra keys from `gather_policy_config` out 2024-10-28 12:54:00 +01:00
Michael Klishin 69aed84c52
Merge pull request #12592 from rabbitmq/mk-management-ui-escpn
Use fmt_string in this error message
2024-10-25 22:49:15 -04:00
Michael Klishin 8ad8d3197e Use fmt_string in this error message 2024-10-25 22:14:41 -04:00
Diana Parra Corbacho 4e92841a9f Fix metrics_SUITE connection_metrics flake 2024-10-25 18:07:41 +02:00
Loïc Hoguin f68fc8bb94
Make CI: Add mixed version testing
This is enabled on main and for pull requests. Bazel remains
used in previous branches.
2024-10-25 13:50:05 +02:00
Michael Klishin 8b79ac7754
Merge pull request #12583 from rabbitmq/dependabot/maven/deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot_kotlin/main/org.springframework.boot-spring-boot-starter-parent-3.3.5
build(deps): bump org.springframework.boot:spring-boot-starter-parent from 3.3.4 to 3.3.5 in /deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot_kotlin
2024-10-24 15:29:14 -04:00
dependabot[bot] cca22ca577
build(deps): bump org.springframework.boot:spring-boot-starter-parent
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-24 18:07:36 +00:00
dependabot[bot] 55a0555508
build(deps): bump org.springframework.boot:spring-boot-starter-parent
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-24 18:07:04 +00:00
David Ansari b1169d06ba Delete test access_failure
This test flakes in CI as described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2419293869

The test case fails with
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```

However, RabbitMQ closes the session as expected due to the missing read
permissions to the queue as shown in the RabbitMQ logs:
```
[debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
[info] <0.1321.0> Created user 'access_failure'
[debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
[debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
[info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
[debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
[debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
[warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
[warning] <0.1338.0>                                             {symbol,
[warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
[warning] <0.1338.0>                                             {utf8,
[warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
[warning] <0.1338.0>                                             undefined}
[debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
[warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
[warning] <0.1333.0> client unexpectedly closed TCP connection
```

```
let receiver = ReceiverLink(ac.Session, "test-receiver", src)
```
uses a null constructur for the onAttached callback.
ReceiverLink doesn't seem to block.

Given that the exact same authorization error is already tested in test
case attach_source_queue of amqp_auth_SUITE, it's safe to delete this F#
test.
2024-10-24 18:34:25 +02:00
David Ansari c476540bbc Remove test flake
Prior to this commit tests
* leader_transfer_quorum_queue_credit_single
* leader_transfer_quorum_queue_credit_batches
flaked in CI during 4.1 (main) and 4.0 mixed version testing.

The follwing error occurred on node 0:
```
[error] <0.1950.0> Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0
[warning] <0.1950.0> Closing session for connection <0.1945.0>: {'v1_0.error',
[warning] <0.1950.0>                                             {symbol,<<"amqp:internal-error">>},
[warning] <0.1950.0>                                             {utf8,
[warning] <0.1950.0>                                              <<"Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0">>},
[warning] <0.1950.0>                                             undefined}
```

Therefore we enable this feature flag for both tests.

This commit also simplifies some test setups that were necessary for
4.0/3.13 mixed version testing, but isn't necessary anymore for 4.1/4.0
mixed version testing.
2024-10-24 18:16:11 +02:00
David Ansari 2c0cdee7d2
Support x-cc message annotation (#12559)
Support x-cc message annotation

Support an `x-cc` message annotation in AMQP 1.0
similar to the [CC](https://www.rabbitmq.com/docs/sender-selected) header in AMQP 0.9.1.

The value of the `x-cc` message annotation must by a list of strings.
A message annotation is used since application properties allow only simple types.
2024-10-24 13:03:05 +02:00
David Ansari 0c905f9b17 Validate setting permissions works
in order to troubleshoot the flake described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2419293869
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
2024-10-24 12:34:03 +02:00
David Ansari 8c046c71c8 Fix test flake
As described in https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2385379386
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
2024-10-24 09:25:49 +00:00
Lois Soto Lopez 9dc9f974b5 Remove ShouldLog & limit deliv. limit not set logg
Removes the usage of a ShouldLog parameter on several functions
and limits the logging of the message warning about the delivery_limit
not being set to the moment of queueDeclaration
2024-10-24 07:28:03 +02:00
Lois Soto Lopez 3b5069fdc5 Simplify publish_confirm_many 2024-10-24 07:28:03 +02:00
Lois Soto Lopez 42b58c7c01 Use wait_for_messages_ready 2024-10-24 07:28:03 +02:00
Lois Soto Lopez df14b4a9ac Use local function for ensuring qq proc dead 2024-10-24 07:28:03 +02:00
Lois Soto Lopez 51abb5c73f Consider QQs may let pass 1st overflowing msg 2024-10-24 07:28:02 +02:00
Lois Soto Lopez dc9ab1d8cf Move tests to main qq SUITE & refactor a bit 2024-10-24 07:27:47 +02:00
Péter Gömöri ccd854878b Refactoring suggestion
(some of this is just reverting to the original format to reduce the
diff against main)
2024-10-24 07:23:02 +02:00
Lois Soto Lopez ec87ef1ceb Use ra_machine_config but limit keys to check 2024-10-24 07:23:02 +02:00
Lois Soto Lopez fabe54db94 Use `ra_machine_config` to gen a comparable config
Instead of checking the values for current configuration, represented in
`rabbit_quorum_queue:handle_tick` by the `Overview` variable, against
the effective policy, just regenerate the configuration and compare with
the current configuration.
2024-10-24 07:23:02 +02:00
Lois Soto Lopez b408351d9e Add test for QQ policy repair feature 2024-10-24 07:23:02 +02:00
Lois Soto Lopez f9179d1090 Add QQ periodic policy repair 2024-10-24 07:23:02 +02:00
Michael Klishin 1150d51df1
Merge pull request #12577 from rabbitmq/su_aws/actually_handle_timeout_reconciliation
QQ periodic membership reconciliation: correctly return a `ra:members/2` error in case of a timeout
2024-10-23 22:21:27 -04:00
Simon Unge dacdeb024d Fix so that the code handles a timeout return 2024-10-23 23:12:36 +00:00
David Ansari 17df1b9343 Attempt to eliminate test flake
This commit attempts to eliminate the test flake described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2385449940

```
rabbitmq_mqtt > parallel-ct-set-1 > mqtt_shared_SUITE > cluster_size_3 > v4 rabbit_mqtt_qos0_queue_kill_node

=== Ended at 2024-10-01 09:59:52
=== Location: [{mqtt_shared_SUITE,rabbit_mqtt_qos0_queue_kill_node,[1165](https://github.com/rabbitmq/rabbitmq-server/issues/mqtt_shared_suite.src.html#1165)},
              {test_server,ts_tc,1793},
              {test_server,run_test_case_eval1,1302},
              {test_server,run_test_case_eval,1234}]
=== === Reason: no match of right hand side value {publish_not_received,
                                                    <<"m1">>}
  in function  mqtt_shared_SUITE:rabbit_mqtt_qos0_queue_kill_node/1 (mqtt_shared_SUITE.erl, line 1165)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```

This flake could not be reproduced locally.
This commit also assumes that this flake occurred under Khepri but not
under Mnesia.

The hypothesis is the following:
* Node 0 is down
* MQTT client creates binding on node 1
* Khepri commits since the binding is replicated and persisted on node 1
  and node 2. However the binding isn't reflected yet in node 2's
  routing projecting table.
* Publishing a message to node 2 routes to nowhere.
2024-10-23 17:36:58 +00:00
David Ansari 814d44dd82 Convert array from AMQP 1.0 to AMQP 0.9.1
Fix the following crash when an AMQP 0.9.1 client consumes an AMQP 1.0
encoded message that contains an array value in message annotations:
```
crasher:
  initial call: rabbit_channel:init/1
  pid: <0.685.0>
  registered_name: []
  exception exit: {function_clause,
                      [{mc_amqpl,to_091,
                           [<<"x-array">>,
                            {array,utf8,[{utf8,<<"e1">>},{utf8,<<"e2">>}]}],
                           [{file,"mc_amqpl.erl"},{line,737}]},
                       {mc_amqpl,'-convert_from/3-fun-3-',1,
                           [{file,"mc_amqpl.erl"},{line,168}]},
                       {lists,filtermap_1,2,
                           [{file,"lists.erl"},{line,2279}]},
                       {mc_amqpl,convert_from,3,
                           [{file,"mc_amqpl.erl"},{line,158}]},
                       {mc,convert,3,[{file,"mc.erl"},{line,332}]},
                       {rabbit_channel,handle_deliver0,4,
                           [{file,"rabbit_channel.erl"},{line,2619}]},
                       {lists,foldl_1,3,[{file,"lists.erl"},{line,2151}]},
                       {lists,foldl,3,[{file,"lists.erl"},{line,2146}]}]}
```
2024-10-22 12:16:19 +02:00
Michael Klishin d7c4e94331
Merge pull request #12550 from rabbitmq/lukebakken/shellcheck-init-slapd
Ensure init-slapd.sh passes `shellcheck`
2024-10-22 00:56:30 -04:00
GitHub 7d8d338bf0 bazel run gazelle 2024-10-22 04:02:23 +00:00
Lois Soto Lopez 3ff7e82c5c Provide specific f. to fix client ssl options
Provides a specific function to fix client ssl options, i.e.: apply all
fixes that are applied for TLS listeneres and clients on previous
versions but also sets `cacerts` option to CA certificates obtained by
`public_key:cacerts_get`, only when no `cacertfile` or `cacerts` are
provided.
2024-10-21 18:00:06 -04:00
Michael Klishin f4e689310f
Merge pull request #12563 from rabbitmq/dependabot/maven/deps/rabbitmq_stream_management/test/http_SUITE_data/main/junit.jupiter.version-5.11.3
build(deps-dev): bump junit.jupiter.version from 5.11.2 to 5.11.3 in /deps/rabbitmq_stream_management/test/http_SUITE_data
2024-10-21 15:04:12 -04:00
Michael Klishin 770c807b56
Merge pull request #12560 from rabbitmq/dependabot/maven/deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot/main/org.junit.jupiter-junit-jupiter-params-5.11.3
build(deps-dev): bump org.junit.jupiter:junit-jupiter-params from 5.11.2 to 5.11.3 in /deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot
2024-10-21 15:04:04 -04:00
Michael Klishin b72adebf0c
Merge pull request #12561 from rabbitmq/dependabot/maven/deps/rabbitmq_stream/test/rabbit_stream_SUITE_data/main/junit.jupiter.version-5.11.3
build(deps-dev): bump junit.jupiter.version from 5.11.2 to 5.11.3 in /deps/rabbitmq_stream/test/rabbit_stream_SUITE_data
2024-10-21 15:03:57 -04:00
dependabot[bot] 87ebc27c1c
build(deps-dev): bump junit.jupiter.version
Bumps `junit.jupiter.version` from 5.11.2 to 5.11.3.

Updates `org.junit.jupiter:junit-jupiter-engine` from 5.11.2 to 5.11.3
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

Updates `org.junit.jupiter:junit-jupiter-params` from 5.11.2 to 5.11.3
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter-engine
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: org.junit.jupiter:junit-jupiter-params
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-21 19:00:47 +00:00
dependabot[bot] 45d285b06d
build(deps-dev): bump org.junit.jupiter:junit-jupiter
Bumps [org.junit.jupiter:junit-jupiter](https://github.com/junit-team/junit5) from 5.11.2 to 5.11.3.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-21 19:00:19 +00:00
dependabot[bot] c9206ca2cd
build(deps-dev): bump junit.jupiter.version
Bumps `junit.jupiter.version` from 5.11.2 to 5.11.3.

Updates `org.junit.jupiter:junit-jupiter-engine` from 5.11.2 to 5.11.3
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

Updates `org.junit.jupiter:junit-jupiter-params` from 5.11.2 to 5.11.3
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter-engine
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: org.junit.jupiter:junit-jupiter-params
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-21 18:57:44 +00:00
dependabot[bot] 7016af6c53
build(deps-dev): bump org.junit.jupiter:junit-jupiter-params
Bumps [org.junit.jupiter:junit-jupiter-params](https://github.com/junit-team/junit5) from 5.11.2 to 5.11.3.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter-params
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-21 18:44:27 +00:00
David Ansari cbe5551cf1 bazel run gazelle 2024-10-20 12:41:23 +02:00
David Ansari dc9ebc5b81 Check topic permissions of CC and BCC headers 2024-10-20 11:41:29 +02:00
Luke Bakken c69aa911c4
Ensure init-slapd.sh passes `shellcheck` 2024-10-18 16:46:08 -07:00
David Ansari 97512b0a9e
Merge pull request #12549 from rabbitmq/amqp-filtex-errors
Prevent crash for invalid application-properties filter
2024-10-18 17:46:03 +02:00
David Ansari c5b6e7f297 bazel run gazelle 2024-10-18 16:09:00 +02:00
David Ansari 1827df811a Prevent crash for invalid application-properties filter
application-properties keys are restricted to be strings.

Prior to this commit, a function_clause error occurred if the client
requested an invalid filter:
```
  │ *Error{Condition: amqp:internal-error, Description: Session error: function_clause
  │ [{rabbit_amqp_filtex,'-validate0/2-fun-0-',
  │                      [{{symbol,<<"subject">>},{utf8,<<"var">>}}],
  │                      [{file,"rabbit_amqp_filtex.erl"},{line,119}]},
  │  {lists,map,2,[{file,"lists.erl"},{line,2077}]},
  │  {rabbit_amqp_filtex,validate0,2,[{file,"rabbit_amqp_filtex.erl"},{line,119}]},
  │  {rabbit_amqp_filtex,validate,1,[{file,"rabbit_amqp_filtex.erl"},{line,28}]},
  │  {rabbit_amqp_session,parse_filters,2,
  │                       [{file,"rabbit_amqp_session.erl"},{line,3068}]},
  │  {rabbit_amqp_session,parse_filter,1,
  │                       [{file,"rabbit_amqp_session.erl"},{line,3014}]},
  │  {rabbit_amqp_session,'-handle_attach/2-fun-0-',21,
  │                       [{file,"rabbit_amqp_session.erl"},{line,1371}]},
  │
  {rabbit_misc,with_exit_handler,2,[{file,"rabbit_misc.erl"},{line,465}]}],
  Info: map[]}
```

After this commit, the filter won't actually take effect without a crash occurring.

Supersedes #12520
2024-10-18 15:37:28 +02:00
Michael Klishin 17dd95a801
Merge pull request #12543 from rabbitmq/su_aws/fix_desctiption_qq_target_group_size_policy
Fix module mentioned in target group size description
2024-10-18 09:01:19 -04:00
David Ansari d1d7d7bad4 Optionally notify client app with AMQP 1.0 performative
This commit notifies the client app with the AMQP performative if
connection config `notify_with_performative` is set to `true`.

This allows the client app to learn about all fields including
properties and capabilities returned by the AMQP server.
2024-10-18 13:51:35 +02:00
Simon Unge 691a0368ba Fix module mentioned in target group size description 2024-10-17 22:52:00 +00:00
Luke Bakken 3d668fda46
Grafana: add a runtime/Erlang/BEAM dashboard (#12456)
* Add BEAM dashboard

Also update the other dashboards by opening in Grafana v11.2.2 and ensuring they work as expected.

* Update the Erlang-Distributions-Compare dashboard

* Update the RabbitMQ-Overview dashboard

* Update the RabbitMQ-Quorum-Queues-Raft dashboard

* Update the RabbitMQ-Stream dashboard

* Update distribution link status panel

---------

Co-authored-by: Michal Kuratczyk <mkuratczyk@vmware.com>
2024-10-17 07:10:54 -07:00
Loïc Hoguin 469c3a0791
Make CI: Check that CI knows about all CT_SUITES in CI
Instead of every time we run Make for these applications.

This means that during development we are free to modify
these values or create new test suites without having to
worry about the check. If we forget to then add the test
suites in PARALLEL_CT the workflow will tell us.
2024-10-17 10:52:28 +02:00
David Ansari ab8814ad7d Fix error message
Prior to this commit if dotnet or mvnw failed to fetch test
dependencies, for example because dotnet isn't installed, the test setup
crashed in an unexpected way:
```
amqp_system_SUITE > dotnet
    {'EXIT',
        {badarg,
            [{lists,keysearch,
                 [rmq_nodes,1,
                  {skip,
                      "Failed to fetch .NET Core test project dependencies"}],
                 [{error_info,#{module => erl_stdlib_errors}}]},
             {test_server,lookup_config,2,
                 [{file,"test_server.erl"},{line,1779}]},
             {rabbit_ct_broker_helpers,get_node_configs,2,
                 [{file,"rabbit_ct_broker_helpers.erl"},{line,1411}]},
             {rabbit_ct_broker_helpers,enable_feature_flag,2,
                 [{file,"rabbit_ct_broker_helpers.erl"},{line,1999}]},
             {amqp_system_SUITE,init_per_group,2,
                 [{file,"amqp_system_SUITE.erl"},{line,77}]},
             {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1794}]},
             {test_server,run_test_case_eval1,6,
                 [{file,"test_server.erl"},{line,1391}]},
             {test_server,run_test_case_eval,9,
                 [{file,"test_server.erl"},{line,1235}]}]}}
```

This commit improves the error message instead of failing with `badarg`.

This commit also decides to fail the test setup instead of skipping the
suite because we always want CI to execute this test and be notified
instead of silently skipping if the test can't be run.
2024-10-16 17:53:17 +02:00
David Ansari 8c0cd1b78c Bump dotnet
This commit fixes the CI error on `main` branch where
amqp_system_SUITE failed with the following error:
```
Process terminated. Couldn't find a valid ICU package installed on the system. Set the configuration flag System.Globalization.Invariant to true if you want to run with no globalization support.
   at System.Environment.FailFast(System.String)
   at System.Globalization.GlobalizationMode.GetGlobalizationInvariantMode()
   at System.Globalization.GlobalizationMode..cctor()
   at System.Globalization.CultureData.CreateCultureWithInvariantData()
   at System.Globalization.CultureData.get_Invariant()
   at System.Globalization.CultureInfo..cctor()
   at System.String.ToLowerInvariant()
   at Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetArch()
   at Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment..cctor()
   at Microsoft.DotNet.PlatformAbstractions.RuntimeEnvironment.GetRuntimeIdentifier()
   at Microsoft.DotNet.Cli.MulticoreJitProfilePathCalculator.CalculateProfileRootPath()
   at Microsoft.DotNet.Cli.MulticoreJitActivator.StartCliProfileOptimization()
   at Microsoft.DotNet.Cli.MulticoreJitActivator.TryActivateMulticoreJit()
   at Microsoft.DotNet.Cli.Program.Main(System.String[])

Exit code: 134 (pid <0.1533.0>)
```
2024-10-16 16:35:37 +02:00
David Ansari 358ff79611 Provide clear error message for reserved annotation keys
As described in https://docs.oasis-open.org/amqp/core/v1.0/os/amqp-core-messaging-v1.0-os.html#type-annotations
> The annotations type is a map where the keys are restricted to be of type symbol or of type ulong.
> All ulong keys, and all symbolic keys except those beginning with "x-" are reserved.

Prior to this commit, if an AMQP client used a reserved annotation key,
the entire AMQP connection terminated with a function_clause error
message that might be difficult to understand for client libs:
```
<<"Session error: function_clause\n[{amqp10_framing,'-decode_annotations/1-fun-0-',\n                 [{{symbol,<<\"aa\">>},{utf8,<<\"bbb\">>}}],\n                 [{file,\"amqp10_framing.erl\"},{line,158}]},\n {lists,map,2,[{file,\"lists.erl\"},{line,1559}]},\n {amqp10_framing,decode,1,[{file,\"amqp10_framing.erl\"},{line,127}]},\n {lists,map_1,2,[{file,\"lists.erl\"},{line,1564}]},\n {lists,map,2,[{file,\"lists.erl\"},{line,1559}]},\n {mc_amqp,init,1,[{file,\"mc_amqp.erl\"},{line,102}]},\n {mc,init,4,[{file,\"mc.erl\"},{line,150}]},\n {rabbit_amqp_session,incoming_link_transfer,4,\n                      [{file,\"rabbit_amqp_session.erl\"},{line,2341}]}]">>
```

This commit ends only the session and provides a clearer error message.
2024-10-16 14:14:04 +02:00
Loïc Hoguin e4d20bba51
Merge pull request #12502 from rabbitmq/loic-ct-master-patching
Make CI: Fix and enhance ct_master
2024-10-16 12:40:06 +02:00
Karl Nilsson 3b1ef8f529 QQ: fix the key_metrics_rpc function.
Currently this function always falls back to the compatibility code
and never gets the benefit of using ra:key_metrics/1 due to incorrect
use of the map update operatior ":=" instead of the insert operator
"=>".
2024-10-15 16:30:39 +01:00
Loïc Hoguin 4127f15676
Make CI: Bazel updates following ct_master work 2024-10-15 14:57:42 +02:00
Loïc Hoguin 8d411c7cda
Make CI: Print auto-skipped and failed test cases at the end
Of a ct_master run. This uses the builtin CT Master event
handler to gather the results.
2024-10-15 14:57:42 +02:00
Loïc Hoguin 655caf6d1a
Make CI: Have ct_master return the test results
Instead of having a CT hook just to know whether our tests failed.
2024-10-15 14:57:42 +02:00
Loïc Hoguin dddf917378
Make CI: Sort the results printout from ct_master
It makes more sense to sort by node name, than to have
the results in the order they finished.
2024-10-15 14:57:42 +02:00
Loïc Hoguin 6cdc32f558
Make CI: Make ct_master handle all testspec instructions 2024-10-15 14:57:42 +02:00
Loïc Hoguin 77ab5eddcb
Reduce the amount of printing to the terminal during tests 2024-10-15 14:57:42 +02:00
Loïc Hoguin 1897e02764
Make CI: Fix a small issue in master_runs.html 2024-10-15 14:57:42 +02:00
Loïc Hoguin ce7184598c
Make CI: Fix the master_runs.html css file paths
Needed to file:set_cwd like in normal CT.
2024-10-15 14:57:42 +02:00
Loïc Hoguin 37c2f9f675
Make CI: Don't refresh logs at the end of ct_master run
The ct_run:run_test function already takes care of the
node's logs. The ct_master_logs module takes care of
ct_master itself.
2024-10-15 14:57:41 +02:00
Loïc Hoguin 807c8f8a0b
Make CI: Add forks of ct_master_event and ct_master_logs 2024-10-15 14:57:41 +02:00
Arnaud Cogoluègnes 966e06f2f7
Use inner module function to increment stream protocol counter
Reduce duplication.
2024-10-14 09:53:48 +02:00
Arnaud Cogoluègnes affdeb3125
Use macro for stream publisher/consumer reference check guard
References #12499
2024-10-14 09:24:57 +02:00
Michael Klishin 2acc3298b3
Merge pull request #12503 from rabbitmq/rabbitmq-server-12499-check-stream-publisher-reference-length
Return error if stream publisher/consumer reference is longer than 255 characters
2024-10-11 13:33:07 -04:00
Michael Klishin 1726064579
Merge pull request #12208 from rabbitmq/qq-otp27-conf
Adjust vheap sizes for message handling processes in OTP 27
2024-10-11 09:59:40 -04:00
Arnaud Cogoluègnes 622dec011d
Return error if store offset reference is longer than 255 characters 2024-10-11 14:55:44 +02:00
David Ansari b1064fddba Support negative integers in modified annotations 2024-10-11 14:43:31 +02:00
David Ansari 2e90619a62 Add custom dead letter history test
Test the use case described in https://github.com/rabbitmq/rabbitmq-website/pull/2095:

> Rather than relying solely on RabbitMQ's built-in dead lettering tracking via x-opt-deaths,
consumers can customise dead lettering event tracking.
2024-10-11 13:00:25 +02:00
David Ansari 855a32ab28 Add alternate exchange test assertion
Test the use case described in
https://github.com/rabbitmq/rabbitmq-website/pull/2095
2024-10-11 12:23:00 +02:00
David Ansari e6818f0040 Track requeue history
Support tracking the requeue history as described in
https://github.com/rabbitmq/rabbitmq-website/pull/2095

This commit:
1. adds a test case tracing the requeue history via AMQP 1.0
   using the modified outcome and
2. fixes bugs in the broker which crashed if a modified message
   annotation value is an AMQP 1.0 list, map, or array.

Complex modified annotation values (list, map, array) are stored as tagged values from now on.
This means AMQP 0.9.1 consumers will not receive modified annotations of
type list, map, or array (which is okay).
2024-10-11 12:21:28 +02:00
Arnaud Cogoluègnes 0260862a27
Return error if stream consumer reference is longer than 255 characters 2024-10-11 11:29:09 +02:00
Arnaud Cogoluègnes 4e8fb46bbf
Return error if stream publisher reference is longer than 255 characters
Fixes #12499
2024-10-11 10:34:45 +02:00
Michael Klishin d9ff6a00d8
Merge pull request #12501 from rabbitmq/flake-clustering-queue-on-other-node
Tests: wait until stats are published, not just collected on the agent
2024-10-10 23:07:49 -04:00
dependabot[bot] c2c6748847
build(deps): bump kotlin.version
Bumps `kotlin.version` from 2.0.20 to 2.0.21.

Updates `org.jetbrains.kotlin:kotlin-test` from 2.0.20 to 2.0.21
- [Release notes](https://github.com/JetBrains/kotlin/releases)
- [Changelog](https://github.com/JetBrains/kotlin/blob/v2.0.21/ChangeLog.md)
- [Commits](https://github.com/JetBrains/kotlin/compare/v2.0.20...v2.0.21)

Updates `org.jetbrains.kotlin:kotlin-maven-allopen` from 2.0.20 to 2.0.21

---
updated-dependencies:
- dependency-name: org.jetbrains.kotlin:kotlin-test
  dependency-type: direct:development
  update-type: version-update:semver-patch
- dependency-name: org.jetbrains.kotlin:kotlin-maven-allopen
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-10 19:00:23 +00:00
Diana Parra Corbacho 7d45609b1a Tests: wait until stats are published, not just collected on the agent 2024-10-10 12:57:14 +02:00
Diana Parra Corbacho 45718fbcf6 Tests: wait until stats are published, not just collected on the agent 2024-10-10 12:34:59 +02:00
Michal Kuratczyk 6a7f8d0d1e Remove redundant copy of adjust_for_message_handling_proc/0 2024-10-09 20:08:34 -04:00
Karl Nilsson 465b19e8e8 Adjust vheap sizes for message handling processes in OTP 27
OTP 27 reset all assumptions on how the vm reacts to processes that
buffer and process a lot of large binaries.

Substantially increasing the vheap sizes for such process restores
most of the same performance by allowing processes to hold more binary
data before major garbage collections are triggered.

This introduces a new module to capture process flag configurations.

The new vheap sizes are only applied when running on OTP 27 or
above.
2024-10-09 20:08:34 -04:00
Michael Klishin 9893a2bd48
Merge pull request #12399 from rabbitmq/deprecate-oauth2-settings
Deprecate two OAuth2 settings: auth_oauth2.jwks_url and management.metadata_url
2024-10-09 11:46:58 -04:00
Marcial Rosales 0f1b8760a4 Fix issue 2024-10-09 11:01:09 -04:00
Marcial Rosales 0c8dadd662 Fix failing test cases 2024-10-09 11:01:09 -04:00
Marcial Rosales 0835c7ecf4 Resolve merge conflicts 2024-10-09 11:01:09 -04:00
Marcial Rosales c9d5ddf89f Deprecate oauth_metadata_url
If oauth_metadata_url is configured, RabbitMQ uses it.
Else it uses the discovery_endpoint url calculated from
issuer and discovery_endpoint_path
2024-10-09 11:01:09 -04:00
Marcial Rosales ee8d5f7fb0 Deprecate jwks_url but it is still supported
jwks_uri takes precedence when both are set
2024-10-09 11:01:09 -04:00
Marcial Rosales 322a9a9f9f Rename jkws_url to jwks_uri 2024-10-09 11:01:09 -04:00
Marcial Rosales b21a222abd Remove management.oauth_metadata_url 2024-10-09 11:01:09 -04:00
Marcial Rosales 423b591310 Fix failing test cases 2024-10-09 10:57:38 -04:00
Marcial Rosales ebc3dea971 Minor formatting improvement 2024-10-09 10:57:38 -04:00
Marcial Rosales b966ab7b72 Configure scope_aliases also per resource_server 2024-10-09 10:57:38 -04:00
Marcial Rosales 3e81cfa89d Handle wrong scope_aliases configuration 2024-10-09 10:57:38 -04:00
Marcial Rosales 48670a0ecf Support two modes of configuring
scope_aliases using cuttlefish
2024-10-09 10:57:38 -04:00
Marcial Rosales a30c829ec5 Test translation function of scope_aliases 2024-10-09 10:57:38 -04:00
Marcial Rosales dcb52638ab Minor refactoring 2024-10-09 10:57:38 -04:00
Marcial Rosales 5841e37804 Fix schema translation for
scope_aliases
2024-10-09 10:57:38 -04:00
Marcial Rosales cd46b406df Modify schema to include scope_aliases
WIP Add translation function
2024-10-09 10:57:38 -04:00
GitHub 5ae16631e9 bazel run gazelle 2024-10-09 04:02:38 +00:00
Michael Klishin c15f19fe83 OAuth 2: CLI is a build time dependency, not a runtime one 2024-10-08 07:11:43 -04:00
Michael Klishin e7f82a53ba OAuth 2: add a missing dependency on rabbitmq_cli 2024-10-08 07:09:08 -04:00
GitHub d63d70c36c bazel run gazelle 2024-10-08 07:09:08 -04:00
Loïc Hoguin 545abce10f CQ: Fix shared store scanner missing messages
It was still possible, although rare, to have message store
files lose message data, when the following conditions were
met:

 * the message data contains byte values 255
   (255 is used as an OK marker after a message)
 * the message is located after a 0-filled hole in the file
 * the length of the data is at least 4096 bytes and
   if we misread it (as detailed below) we encounter
   a 255 byte where we expect the OK marker

The trick for the code to previously misread the length can
be explained as follow:

A message is stored in the following format:

  <<Len:64, MsgIdAndMsg:Len/unit:8, 255>>

With MsgId always being 16 bytes in length. So Len is always
at least 16, if the message data Msg is empty. But technically
it never is.

Now if we have a zero filled hole just before this message,
we may end up with this:

  <<0, Len:64, MsgIdAndMsg:Len/unit:8, 255>>

When we are scanning we are testing bytes to see if there is
a message there or not. We look for a Len that gives us byte
255 after MsgIdAndMsg.

Len of value 4096 looks like this in binary:

  <<0:48, 16, 0>>

Problem is if we have leading zeroes, Len may look like this:

  <<0, 0:48, 16, 0>>

If we take the first 64 bits we get a potential length of 16.
We look at the byte after the next 16 bytes. If it is 255, we
think this is a message and skip by this amount of bytes, and
mistakenly miss the real message.

Solving this by changing the file format would be simple enough,
but we don't have the luxury to afford that. A different solution
was found, which is to combine file scanning with checking that
the message exists in the message store index (populated from
queues at startup, and kept up to date over the life time of
the store). Then we know for sure that the message above
doesn't exist, because the MsgId won't be found in the index.
If it is, then the file number and offset will not match,
and the check will fail.

There remains a small chance that we get it wrong during dirty
recovery. Only a better file format would improve that.
2024-10-08 07:09:08 -04:00
Marcial Rosales 743f663520 Fix bazel configuration 2024-10-08 08:17:48 +02:00