Commit Graph

1361 Commits

Author SHA1 Message Date
Péter Gömöri 2c1f1a1387
Restore credit_flow between channel/MQTT connection -> CQ processes
The credit_flow between a publishing AMQP 0.9.1 channel (or MQTT
connection) and a (non-mirrored) classic queue process was
unintentionally removed in 4.0, together with everything else related
to CQ mirroring.

By default we restore the 3.x behaviour for non-mirrored classic
queues. It is possible to disable flow control (the earlier 4.0.x
behaviour) with the new env `classic_queue_flow_control`. In 3.x this
was possible with the config `mirroring_flow_control`.

(cherry picked from commit d65bd7d07a)
2024-12-09 22:33:47 -05:00
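For reference, a minimal sketch of opting out via advanced.config, assuming the boolean env lives under the `rabbit` application (the commit names the key but not its owning application):

```erlang
%% advanced.config — hypothetical placement and value semantics:
%% the commit names `classic_queue_flow_control` but not its app.
[
 {rabbit, [
   %% false restores the 4.0.x behaviour (no publisher flow control
   %% towards classic queues); the default keeps the 3.x behaviour.
   {classic_queue_flow_control, false}
 ]}
].
```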
Diana Parra Corbacho 7fe2b0414c Tests: auth_SUITE close all connections in end_per_testcase 2024-11-25 09:06:32 +01:00
Diana Parra Corbacho aa28be0f27 tests: v5_SUITE wait longer to receive messages 2024-11-25 09:06:32 +01:00
Diana Parra Corbacho 964247afcc Tests: mqtt_shared_SUITE skip check for previous connections
The test performs the check later, based on the clientId
2024-11-25 09:06:32 +01:00
dependabot[bot] 149036707b
Bump com.rabbitmq:amqp-client
Bumps [com.rabbitmq:amqp-client](https://github.com/rabbitmq/rabbitmq-java-client) from 5.22.0 to 5.23.0.
- [Release notes](https://github.com/rabbitmq/rabbitmq-java-client/releases)
- [Commits](https://github.com/rabbitmq/rabbitmq-java-client/compare/v5.22.0...v5.23.0)

---
updated-dependencies:
- dependency-name: com.rabbitmq:amqp-client
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-19 18:37:28 +00:00
Michael Klishin c888689cca
Merge pull request #12722 from rabbitmq/fix-flakes
Fix flakes
2024-11-14 13:36:17 -05:00
Diana Parra Corbacho db78f9b812 Tests: mqtt_shared_SUITE match expected connection 2024-11-14 15:02:47 +01:00
Michael Klishin c78bc8a9c3
4.1: Avoid an exception when an AMQP 0-9-1-originating message with expiration set is converted for an MQTT consumer (#12710)
* MQTT: avoid an exception

when an AMQP 0-9-1 publisher publishes a message
that has expiration set.

Stack trace was contributed in #12707 by @rdsilio.

* mc_mqtt_SUITE test for #12707 #12710

* MQTT protocol_interop_SUITE: new test for #12710 #12707

* Simplify tests

---------

Co-authored-by: David Ansari <david.ansari@gmx.de>
2024-11-13 09:20:43 +01:00
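The mismatch behind the exception: AMQP 0.9.1 carries `expiration` as a string of milliseconds, while MQTT 5.0's Message Expiry Interval is a whole number of seconds. A minimal sketch of the conversion step, with a hypothetical function name (the actual fix lives in `mc_mqtt`):

```erlang
%% Hypothetical sketch: AMQP 0.9.1 'expiration' is a binary TTL in
%% milliseconds, e.g. <<"60000">>; MQTT 5.0's Message Expiry Interval
%% is expressed in seconds.
expiration_to_message_expiry(undefined) ->
    undefined;
expiration_to_message_expiry(Expiration) when is_binary(Expiration) ->
    binary_to_integer(Expiration) div 1000.
```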
David Ansari ae423721ad Fix flake will_delay_session_takeover
Prior to this commit, the following flake occurred in CI for
```
make -C deps/rabbitmq_mqtt ct-v5 t=cluster_size_1:will_delay_session_takeover
```

```
=== Location: [{v5_SUITE,will_delay_session_takeover,1473},
              {test_server,ts_tc,1793},
              {test_server,run_test_case_eval1,1302},
              {test_server,run_test_case_eval,1234}]
=== === Reason: {test_case_failed,"Received unexpected PUBLISH payload. Expected: <<\"will-3a\">> Got: <<\"will-4a\">>"}
```

The RabbitMQ logs for this single node test show:
```
2024-11-04 14:43:35.039196+00:00 [debug] <0.1334.0> MQTT accepting TCP connection <0.1334.0> (127.0.0.1:42576 -> 127.0.0.1:27005)
2024-11-04 14:43:35.039336+00:00 [debug] <0.1334.0> Received a CONNECT, client ID: c3, username: undefined, clean start: true, protocol version: 5, keepalive: 60, property names: []
2024-11-04 14:43:35.039438+00:00 [debug] <0.1334.0> MQTT connection 127.0.0.1:42576 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.039537+00:00 [debug] <0.1334.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.039729+00:00 [info] <0.1334.0> Accepted MQTT connection 127.0.0.1:42576 -> 127.0.0.1:27005 for client ID c3
2024-11-04 14:43:35.040297+00:00 [debug] <0.1337.0> MQTT accepting TCP connection <0.1337.0> (127.0.0.1:42580 -> 127.0.0.1:27005)
2024-11-04 14:43:35.040442+00:00 [debug] <0.1337.0> Received a CONNECT, client ID: c4, username: undefined, clean start: true, protocol version: 5, keepalive: 60, property names: []
2024-11-04 14:43:35.040534+00:00 [debug] <0.1337.0> MQTT connection 127.0.0.1:42580 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.040597+00:00 [debug] <0.1337.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.040793+00:00 [info] <0.1337.0> Accepted MQTT connection 127.0.0.1:42580 -> 127.0.0.1:27005 for client ID c4
2024-11-04 14:43:35.041463+00:00 [debug] <0.1340.0> MQTT accepting TCP connection <0.1340.0> (127.0.0.1:42596 -> 127.0.0.1:27005)
2024-11-04 14:43:35.041715+00:00 [debug] <0.1340.0> Received a CONNECT, client ID: c1, username: undefined, clean start: false, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.041806+00:00 [debug] <0.1340.0> MQTT connection 127.0.0.1:42596 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.041881+00:00 [debug] <0.1340.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.041982+00:00 [warning] <0.1328.0> MQTT disconnecting client <<"127.0.0.1:42560 -> 127.0.0.1:27005">> with duplicate id 'c1'
2024-11-04 14:43:35.042062+00:00 [info] <0.1340.0> Accepted MQTT connection 127.0.0.1:42596 -> 127.0.0.1:27005 for client ID c1
2024-11-04 14:43:35.045624+00:00 [debug] <0.1345.0> MQTT accepting TCP connection <0.1345.0> (127.0.0.1:42602 -> 127.0.0.1:27005)
2024-11-04 14:43:35.045781+00:00 [debug] <0.1345.0> Received a CONNECT, client ID: c2, username: undefined, clean start: false, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.045874+00:00 [debug] <0.1345.0> MQTT connection 127.0.0.1:42602 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.045943+00:00 [debug] <0.1345.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.046032+00:00 [warning] <0.1331.0> MQTT disconnecting client <<"127.0.0.1:42566 -> 127.0.0.1:27005">> with duplicate id 'c2'
2024-11-04 14:43:35.046281+00:00 [info] <0.1345.0> Accepted MQTT connection 127.0.0.1:42602 -> 127.0.0.1:27005 for client ID c2
2024-11-04 14:43:35.047063+00:00 [debug] <0.1350.0> MQTT accepting TCP connection <0.1350.0> (127.0.0.1:42614 -> 127.0.0.1:27005)
2024-11-04 14:43:35.047702+00:00 [debug] <0.1350.0> Received a CONNECT, client ID: c3, username: undefined, clean start: true, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.047910+00:00 [debug] <0.1350.0> MQTT connection 127.0.0.1:42614 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.048467+00:00 [debug] <0.1350.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.049701+00:00 [info] <0.1350.0> Accepted MQTT connection 127.0.0.1:42614 -> 127.0.0.1:27005 for client ID c3
2024-11-04 14:43:35.050907+00:00 [warning] <0.1334.0> MQTT disconnecting client <<"127.0.0.1:42576 -> 127.0.0.1:27005">> with duplicate id 'c3'
2024-11-04 14:43:35.051248+00:00 [debug] <0.1353.0> MQTT accepting TCP connection <0.1353.0> (127.0.0.1:42626 -> 127.0.0.1:27005)
2024-11-04 14:43:35.051395+00:00 [debug] <0.1353.0> Received a CONNECT, client ID: c4, username: undefined, clean start: false, protocol version: 5, keepalive: 60, property names: ['Session-Expiry-Interval']
2024-11-04 14:43:35.051519+00:00 [debug] <0.1353.0> MQTT connection 127.0.0.1:42626 -> 127.0.0.1:27005 picked vhost using plugin_configuration_or_default_vhost
2024-11-04 14:43:35.051590+00:00 [debug] <0.1353.0> User 'guest' authenticated successfully by backend rabbit_auth_backend_internal
2024-11-04 14:43:35.051871+00:00 [info] <0.1353.0> Accepted MQTT connection 127.0.0.1:42626 -> 127.0.0.1:27005 for client ID c4
2024-11-04 14:43:35.051960+00:00 [warning] <0.1337.0> MQTT disconnecting client <<"127.0.0.1:42580 -> 127.0.0.1:27005">> with duplicate id 'c4'
2024-11-04 14:43:35.052689+00:00 [debug] <0.1337.0> sent Will Message to topic my/topic for MQTT client ID c4
2024-11-04 14:43:35.054119+00:00 [debug] <0.1334.0> sent Will Message to topic my/topic for MQTT client ID c3
```

We can see how RabbitMQ sends the Will Message for both c3 and c4.
However, the order in which RabbitMQ sends them is not guaranteed.
Hence, we adapt the test expectation so that it does not depend on the
order in which the Will Messages are received.
2024-11-08 16:12:52 +01:00
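A sketch of such an order-independent expectation, with hypothetical helper names; the `{publish, ...}` receive pattern assumes an emqtt-style client and is only illustrative:

```erlang
%% Hypothetical sketch: collect N Will Message payloads in any order,
%% then compare as sorted lists instead of asserting arrival order.
receive_wills(0, Acc) ->
    Acc;
receive_wills(N, Acc) ->
    receive
        {publish, #{payload := Payload}} ->
            receive_wills(N - 1, [Payload | Acc])
    after 5000 ->
        ct:fail("missing Will Message, got only ~p", [Acc])
    end.

assert_wills_received(ExpectedPayloads) ->
    Received = receive_wills(length(ExpectedPayloads), []),
    case lists:sort(Received) =:= lists:sort(ExpectedPayloads) of
        true  -> ok;
        false -> ct:fail("unexpected Will payloads: ~p", [Received])
    end.
```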
David Ansari 090384fe37 Fix MQTT test flake management_plugin_connection
Prior to this commit this test flaked in CI:
```
=== === Reason: {assertEqual,
                     [{module,mqtt_shared_SUITE},
                      {line,1222},
                      {expression,"http_get ( Config , \"/connections\" )"},
                      {expected,[]},
                      {value,
                          [#{timeout => 99,
                             name => <<"127.0.0.1:58712 -> 127.0.0.1:29005">>,
                             node =>
                                 <<"rmq-ct-mqtt-cluster_size_1-1-29000@localhost">>,
                             port => 29005,user => <<"guest">>,ssl => false,
                             protocol => <<"MQTT 5-0">>,
                             host => <<"127.0.0.1">>,
                             client_properties =>
                                 #{client_id =>
                                       <<"management_plugin_connection">>},
                             vhost => <<"/">>,peer_host => <<"127.0.0.1">>,
                             peer_port => 58712,frame_max => 0,
                             channel_max => 0,auth_mechanism => <<"none">>,
                             connected_at => 1730797370048,
                             ssl_protocol => null,ssl_key_exchange => null,
                             ssl_cipher => null,ssl_hash => null,
                             peer_cert_issuer => null,
                             peer_cert_subject => null,
                             peer_cert_validity => null,
                             user_who_performed_action => <<"guest">>}]}]}
  in function  mqtt_shared_SUITE:management_plugin_connection/1 (mqtt_shared_SUITE.erl, line 1222)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```
2024-11-08 11:12:04 +01:00
David Ansari 40bf778e89 Fix MQTT test flake
Prior to this commit, test
```
make -C deps/rabbitmq_mqtt ct-mqtt_shared t=[mqtt,cluster_size_1,v4]:non_clean_sess_reconnect_qos0_and_qos1
```

flaked in CI with error:
```
{mqtt_shared_SUITE,non_clean_sess_reconnect_qos0_and_qos1,972}
{badmatch,{publish_not_received,<<"msg-0">>}}
```

The problem was the following race condition:
* The MQTT v4 client sends an async DISCONNECT
* The global MQTT consumer metric is decremented. However, the classic
  queue still has the MQTT connection process registered as a consumer.
* The test case sends a message
* The classic queue checks the message out to the old connection instead
  of checking it out to the new connection.

The solution in this commit is to check the consumer count of the
classic queue before proceeding to send the message after disconnection.
2024-11-08 10:35:36 +01:00
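A sketch of that kind of synchronisation point, with a hypothetical helper; the real test presumably queries the broker node for the queue's consumer count:

```erlang
%% Hypothetical sketch: poll until the queue reports the expected
%% number of consumers before publishing the next message.
wait_for_consumer_count(_GetCount, _Expected, 0) ->
    ct:fail("consumer count never converged");
wait_for_consumer_count(GetCount, Expected, Retries) ->
    case GetCount() of
        Expected -> ok;
        _Other ->
            timer:sleep(100),
            wait_for_consumer_count(GetCount, Expected, Retries - 1)
    end.
```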
dependabot[bot] 969186f6fd
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin
Bumps [org.apache.maven.plugins:maven-surefire-plugin](https://github.com/apache/maven-surefire) from 3.5.1 to 3.5.2.
- [Release notes](https://github.com/apache/maven-surefire/releases)
- [Commits](https://github.com/apache/maven-surefire/compare/surefire-3.5.1...surefire-3.5.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-surefire-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 18:34:32 +00:00
David Ansari 3db4a97cfb Expose AMQP connection metrics
Expose the same metrics for AMQP 1.0 connections as for AMQP 0.9.1 connections.

Display the following AMQP 1.0 metrics on the Management UI:
* Network bytes per second from/to client on connections page
* Number of sessions/channels on connections page
* Network bytes per second from/to client graph on connection page
* Reductions graph on connection page
* Garbage collection info on connection page

Expose the following AMQP 1.0 per-object Prometheus metrics:
* rabbitmq_connection_incoming_bytes_total
* rabbitmq_connection_outgoing_bytes_total
* rabbitmq_connection_process_reductions_total
* rabbitmq_connection_incoming_packets_total
* rabbitmq_connection_outgoing_packets_total
* rabbitmq_connection_pending_packets
* rabbitmq_connection_channels

The rabbit_amqp_writer proc:
* notifies the rabbit_amqp_reader proc when it has sent frames
* eventually hibernates if it doesn't send any frames

The rabbit_amqp_reader proc:
* does not emit stats (i.e. does not update ETS tables) if no frames are
  received or sent, to save resources when there are many idle connections.
2024-11-02 19:08:24 +01:00
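A sketch of the idle-skip idea in the reader; the module, record and field names below are assumptions, only the behaviour is taken from the commit:

```erlang
-module(idle_stats_sketch).
-export([maybe_emit_stats/1]).

%% Hypothetical state: a counter of frames handled since the last
%% stats emission.
-record(state, {frames_since_last_emit = 0 :: non_neg_integer()}).

maybe_emit_stats(#state{frames_since_last_emit = 0} = State) ->
    %% idle connection: skip the ETS updates entirely
    State;
maybe_emit_stats(State) ->
    emit_stats(State),
    State#state{frames_since_last_emit = 0}.

emit_stats(_State) ->
    ok. % stand-in for the actual ETS table updates
```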
Michael Klishin 5f6b1ccfb5
Merge pull request #12604 from rabbitmq/mqtt-fix-ssl-variable-name
MQTT, x.509 certificate-based authentication: use the correct key name for the TLS SAN type configuration parameter
2024-10-30 11:16:15 -04:00
Jean-Sébastien Pédron ea899602b0
rabbit_feature_flags: Introduce hard vs. soft required feature flags
[Why]
Before this patch, required feature flags were basically checked during
boot: they must have been enabled when they were mere stable feature
flags. If they were not, the node refused to boot.

This was easy for the developer because making a feature flag required
allowed us to remove the entire compatibility code. Very satisfying.

Unfortunately, this was a pain point for end users, especially those who
did not pay attention to RabbitMQ and the release notes and were just
asking their package manager to update everything. They could end up
with a node that refuses to boot. The only solution was to downgrade,
enable the disabled stable feature flags, then upgrade again.

[How]
This patch introduces two levels of requirement to required feature
flags:
* `hard`: this corresponds to the existing behavior where a node will
  refuse to boot if a hard required feature flag is not enabled before
  the upgrade.
* `soft`: such a required feature flag will be automatically enabled
  during the upgrade to a version where it is marked as required.

The level of requirement is set in the feature flag definition:

    -rabbit_feature_flag(
       {my_feature_flag,
        #{stability     => required,
          require_level => hard
         }}).

The default requirement level is `soft`. All existing required feature
flags now have a requirement level of `hard`.
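For contrast, a flag relying on the default level, using the same
attribute format (the flag name is a placeholder):

    -rabbit_feature_flag(
       {my_other_feature_flag,
        #{stability => required
          %% require_level defaults to `soft`: the flag is enabled
          %% automatically during the upgrade
         }}).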

The handling of soft required feature flags is done when the cluster
feature flags states are verified and synchronized. If a soft required
feature flag is not enabled yet, it is enabled at that time.

This means that as developers, we will have to keep compatibility code
forever for every soft required feature flag, like the feature flag
definition itself.
2024-10-30 11:12:18 +01:00
Marcial Rosales e7cb2420a7 Verify non-zero DNS and email SAN 2024-10-29 16:41:20 +01:00
Marcial Rosales 4c1099950d Use the correct variable name 2024-10-29 16:41:20 +01:00
David Ansari 2c0cdee7d2
Support x-cc message annotation (#12559)

Support an `x-cc` message annotation in AMQP 1.0
similar to the [CC](https://www.rabbitmq.com/docs/sender-selected) header in AMQP 0.9.1.

The value of the `x-cc` message annotation must be a list of strings.
A message annotation is used since application properties allow only simple types.
2024-10-24 13:03:05 +02:00
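A minimal sketch of setting the annotation from the Erlang `amqp10_client`, assuming an attached `Sender` link; the `{list, {utf8, ...}}` type tagging and the `/queues/...` address format are assumptions:

```erlang
%% Hypothetical sketch: carry extra target addresses in the x-cc
%% message annotation (application properties cannot hold lists).
Msg0 = amqp10_msg:new(<<"dtag-1">>, <<"payload">>, false),
Msg  = amqp10_msg:set_message_annotations(
         #{<<"x-cc">> => {list, [{utf8, <<"/queues/q1">>},
                                 {utf8, <<"/queues/q2">>}]}},
         Msg0),
ok = amqp10_client:send_msg(Sender, Msg).
```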
David Ansari 17df1b9343 Attempt to eliminate test flake
This commit attempts to eliminate the test flake described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2385449940

```
rabbitmq_mqtt > parallel-ct-set-1 > mqtt_shared_SUITE > cluster_size_3 > v4 rabbit_mqtt_qos0_queue_kill_node

=== Ended at 2024-10-01 09:59:52
=== Location: [{mqtt_shared_SUITE,rabbit_mqtt_qos0_queue_kill_node,1165},
              {test_server,ts_tc,1793},
              {test_server,run_test_case_eval1,1302},
              {test_server,run_test_case_eval,1234}]
=== === Reason: no match of right hand side value {publish_not_received,
                                                    <<"m1">>}
  in function  mqtt_shared_SUITE:rabbit_mqtt_qos0_queue_kill_node/1 (mqtt_shared_SUITE.erl, line 1165)
  in call from test_server:ts_tc/3 (test_server.erl, line 1793)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1302)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1234)
```

This flake could not be reproduced locally.
This commit also assumes that this flake occurred under Khepri but not
under Mnesia.

The hypothesis is the following:
* Node 0 is down
* MQTT client creates binding on node 1
* Khepri commits since the binding is replicated and persisted on node 1
  and node 2. However, the binding isn't reflected yet in node 2's
  routing projection table.
* Publishing a message to node 2 therefore routes to nowhere.
2024-10-23 17:36:58 +00:00
dependabot[bot] 45d285b06d
build(deps-dev): bump org.junit.jupiter:junit-jupiter
Bumps [org.junit.jupiter:junit-jupiter](https://github.com/junit-team/junit5) from 5.11.2 to 5.11.3.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.2...r5.11.3)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-21 19:00:19 +00:00
Loïc Hoguin 469c3a0791
Make CI: Check that CI knows about all CT_SUITES in CI
Instead of checking every time we run Make for these applications.

This means that during development we are free to modify
these values or create new test suites without having to
worry about the check. If we then forget to add the test
suites to PARALLEL_CT, the workflow will tell us.
2024-10-17 10:52:28 +02:00
Loïc Hoguin 655caf6d1a
Make CI: Have ct_master return the test results
Instead of having a CT hook just to know whether our tests failed.
2024-10-15 14:57:42 +02:00
Loïc Hoguin 6cdc32f558
Make CI: Make ct_master handle all testspec instructions 2024-10-15 14:57:42 +02:00
dependabot[bot] 7fc2fcd7a5
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin
Bumps [org.apache.maven.plugins:maven-surefire-plugin](https://github.com/apache/maven-surefire) from 3.5.0 to 3.5.1.
- [Release notes](https://github.com/apache/maven-surefire/releases)
- [Commits](https://github.com/apache/maven-surefire/compare/surefire-3.5.0...surefire-3.5.1)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-surefire-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-07 18:38:10 +00:00
Loïc Hoguin 9645fb1275
Make parallel-ct properly detect test failures
The problem comes from `ct_master`, which doesn't tell us
in the return value whether the tests succeeded. In order
to get that information a CT hook was created. But then
we run into another problem: despite its documentation
claiming otherwise, `ct_master` does not handle `ct_hooks`
instructions in the test spec.

So for the time being we fork `ct_master` into a new
`ct_master_fork` module and insert our hook directly
in the code. Later on we will submit patches to OTP.
2024-10-07 13:30:32 +02:00
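A minimal sketch of a result-recording hook of that kind; the module name, state shape and output file are hypothetical:

```erlang
-module(ct_result_hook_sketch).
%% Hypothetical Common Test hook: count failed test cases so the
%% caller can turn the outcome into an exit code.
-export([init/2, on_tc_fail/4, terminate/1]).

init(_Id, _Opts) ->
    {ok, #{failed => 0}}.

on_tc_fail(_Suite, _TestCase, _Reason, State = #{failed := N}) ->
    State#{failed := N + 1}.

terminate(#{failed := N}) ->
    %% persist the count somewhere the calling process can read back
    ok = file:write_file("ct_failures", integer_to_binary(N)).
```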
dependabot[bot] ea435ecdbd
build(deps-dev): bump org.junit.jupiter:junit-jupiter
Bumps [org.junit.jupiter:junit-jupiter](https://github.com/junit-team/junit5) from 5.11.1 to 5.11.2.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.1...r5.11.2)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-04 18:32:14 +00:00
David Ansari fa5b738cbc Fix shard counts
Half of these groups moved to the rabbitmq_web_mqtt plugin.
2024-10-01 18:02:00 +02:00
David Ansari db92f48dae bazel run gazelle 2024-10-01 18:02:00 +02:00
Loïc Hoguin a0ee6ddb69
Bazel fixes following renaming of test suites 2024-09-30 12:35:44 +02:00
Loïc Hoguin 4530fb5d97
make: Add new CT suites and clarify check on CT_SUITES 2024-09-30 12:35:43 +02:00
Loïc Hoguin aee0cd0079
make & make CI: Small cleanups 2024-09-30 12:35:43 +02:00
Loïc Hoguin ae984cc364
make: Set CT_LOGS_DIR to top-level logs/ directory
All CT logs will now be under <toplevel>/logs. An improved
test workflow would be to always keep the logs/all_runs.html
page open in the browser and refresh it whenever tests are
run in any of the rabbit applications.
2024-09-30 12:35:43 +02:00
Loïc Hoguin f4a24c7f16
make: Run rabbitmq_mqtt tests via parallel-ct 2024-09-30 12:35:41 +02:00
David Ansari f002029ebd
Do not open WebMQTT connection in MQTT plugin 2024-09-30 12:35:41 +02:00
Loïc Hoguin ddab3d523f
mqtt tests: Move v5 web_mqtt tests out of rabbitmq_mqtt 2024-09-30 12:35:41 +02:00
Loïc Hoguin 690b830e43
mqtt tests: Move web_mqtt tests out of rabbitmq_mqtt
The shared test suite was renamed only for clarity, but the
Web-MQTT test suites were renamed out of necessity: since
we are now adding the MQTT test directory to the code path
we need test suites to have different names to avoid
conflicts. We can't (easily) add the path only for this test suite
either, since CT hooks don't call functions in a predictable
enough manner; it would always be hacky.
2024-09-30 12:35:41 +02:00
Loïc Hoguin f4f375c6a9
Use Make in CI
This is a proof of concept that mostly works but is missing
some tests, such as rabbitmq_mqtt or rabbitmq_cli. It also
doesn't apply to mixed version testing yet.
2024-09-30 12:35:41 +02:00
dependabot[bot] 1db0d1f758
build(deps-dev): bump org.junit.jupiter:junit-jupiter
Bumps [org.junit.jupiter:junit-jupiter](https://github.com/junit-team/junit5) from 5.11.0 to 5.11.1.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.11.0...r5.11.1)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-25 18:34:34 +00:00
David Ansari 960808e6b2
Emit histogram metric for received message sizes per protocol (#12342)
* Add global histogram metrics for received message sizes per-protocol

fixup: add new files to bazel

fixup: expose message_size_bytes as prometheus classic histogram type

`rabbit_msg_size_metrics` does not use `seshat` any more; it uses
`counters` directly.

fixup: add msg_size_metrics unit test

* Improve message size histogram

1.
Avoid unnecessary time series emitted for stream protocol
The stream protocol cannot observe message sizes.
This commit ensures that the following time series are omitted:
```
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="64"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="256"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1024"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4096"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16384"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="65536"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="262144"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="1048576"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="4194304"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="16777216"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="67108864"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="268435456"} 0
rabbitmq_global_message_size_bytes_bucket{protocol="stream",le="+Inf"} 0
rabbitmq_global_message_size_bytes_count{protocol="stream"} 0
rabbitmq_global_message_size_bytes_sum{protocol="stream"} 0
```

This reduces the number of time series by 15.

2.
Further reduce the number of time series by reducing the number of
buckets. Instead of 13 buckets, emit only 9. Buckets are not
free: each one is an extra time series stored.

Prior to this commit:
```
curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l
      92
```

After this commit:
```
curl -s -u guest:guest localhost:15692/metrics | ag message_size | wc -l
      57
```

3.
The emitted metric should be called
`rabbitmq_message_size_bytes_bucket` instead of `rabbitmq_global_message_size_bytes_bucket`.
The latter is poor naming. There is no need to use `global` in
the metric name given that this metric doesn't exist in the old flawed
aggregated metrics.

4.
This commit simplifies module `rabbit_global_counters`.

5.
Avoid garbage collecting the 10-element list of buckets for every
message received.

---------

Co-authored-by: Péter Gömöri <peter@84codes.com>
2024-09-24 18:08:24 +02:00
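A sketch of the `counters`-based histogram idea: find the first bucket whose upper bound fits the observed size and bump that slot. The bucket boundaries here are placeholders, not the ones the module ships:

```erlang
-module(msg_size_hist_sketch).
-export([new/0, observe/2]).

%% Placeholder upper bounds in bytes; the real module uses its own
%% 9-bucket boundaries.
-define(BOUNDS, [100, 1000, 10000, 100000, 1000000]).

new() ->
    %% one slot per bound plus one overflow slot (the +Inf bucket)
    counters:new(length(?BOUNDS) + 1, [write_concurrency]).

observe(Ref, SizeBytes) ->
    counters:add(Ref, bucket_index(SizeBytes, ?BOUNDS, 1), 1).

bucket_index(_Size, [], Ix) -> Ix; % +Inf bucket
bucket_index(Size, [Bound | _], Ix) when Size =< Bound -> Ix;
bucket_index(Size, [_ | Rest], Ix) -> bucket_index(Size, Rest, Ix + 1).
```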
David Ansari bddc54613f Decrease default MQTT Maximum Packet Size
Given that the default max_message_size was decreased from 128 MiB to 16
MiB in RabbitMQ 4.0 in https://github.com/rabbitmq/rabbitmq-server/pull/11455,
it makes sense to also decrease the default MQTT Maximum Packet Size from 256 MiB to 16 MiB.
Since this change was missed in RabbitMQ 4.0, it is scheduled for RabbitMQ 4.1.
2024-09-20 16:11:21 +02:00
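A sketch of pinning the limit explicitly via advanced.config; the `max_packet_size_authenticated` key name is an assumption based on the MQTT plugin's settings:

```erlang
%% advanced.config — key name is an assumption; 16777216 = 16 MiB.
[
 {rabbitmq_mqtt, [
   {max_packet_size_authenticated, 16777216}
 ]}
].
```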
David Ansari b1eb354385 Strictly validate annotations 2024-09-18 12:42:27 +02:00
Michael Klishin 4805e31e37
Merge pull request #12317 from rabbitmq/md/khepri/mqtt-fixes
Handle database timeouts in MQTT queue deletion
2024-09-16 19:00:09 -04:00
Michael Davis a9c48ef951
rabbit_mqtt_processor: Handle failures to delete a queue 2024-09-16 14:43:27 -04:00
Michael Davis 9627903716
rabbit_queue_type: Add `{error,timeout}` to delete/4 callback spec
This return value was already possible since a classic queue will return
it during termination if `rabbit_amqqueue:internal_delete/2` fails with
that value.

`rabbit_amqqueue:delete/4` already handles this value and converts it
into a protocol error and channel exit. The other caller (MQTT
processor) will be updated in a child commit.

This commit also replaces eager conversions to protocol errors in
rabbit_classic_queue, rabbit_quorum_queue and rabbit_stream_coordinator:
we should return `{error, timeout}` consistently and not hide it in
protocol errors.
2024-09-16 14:43:24 -04:00
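A sketch of caller-side handling now that the spec includes the timeout case; the `{ok, Count}` success shape follows the existing API, the rest is illustrative:

```erlang
%% Hypothetical caller of rabbit_queue_type:delete/4, with Q and
%% Username bound elsewhere.
case rabbit_queue_type:delete(Q, false, false, Username) of
    {ok, _MsgCount} ->
        ok;
    {error, timeout} ->
        %% e.g. the metadata store did not answer in time: surface
        %% the error instead of wrapping it in a protocol error
        {error, timeout}
end.
```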
dependabot[bot] 0d6916e3c5
build(deps-dev): bump com.rabbitmq:amqp-client
Bumps [com.rabbitmq:amqp-client](https://github.com/rabbitmq/rabbitmq-java-client) from 5.21.0 to 5.22.0.
- [Release notes](https://github.com/rabbitmq/rabbitmq-java-client/releases)
- [Commits](https://github.com/rabbitmq/rabbitmq-java-client/compare/v5.21.0...v5.22.0)

---
updated-dependencies:
- dependency-name: com.rabbitmq:amqp-client
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-16 18:42:53 +00:00
Michal Kuratczyk f0f7500f6a
Revert "Log errors from `ranch:handshake`" (#12304)
This reverts commit 620fff22f1.

It introduced a regression in another area: a TCP health check,
such as the default (with cluster-operator) readinessProbe,
on a TLS-enabled instance would log a `rabbit_reader` crash
every few seconds:
```
tls-server-0 rabbitmq 2024-09-13 09:03:13.010115+00:00 [error] <0.999.0>   crasher:
tls-server-0 rabbitmq 2024-09-13 09:03:13.010115+00:00 [error] <0.999.0>     initial call: rabbit_reader:init/3
tls-server-0 rabbitmq 2024-09-13 09:03:13.010115+00:00 [error] <0.999.0>     pid: <0.999.0>
tls-server-0 rabbitmq 2024-09-13 09:03:13.010115+00:00 [error] <0.999.0>     registered_name: []
tls-server-0 rabbitmq 2024-09-13 09:03:13.010115+00:00 [error] <0.999.0>     exception error: no match of right hand side value {error, handshake_failed}
tls-server-0 rabbitmq 2024-09-13 09:03:13.010115+00:00 [error] <0.999.0>       in function  rabbit_reader:init/3 (rabbit_reader.erl, line 171)
```
2024-09-13 17:07:57 +02:00
Michael Davis c37b192beb
Handle Khepri timeouts when deleting MQTT QOS0 queues 2024-09-11 12:18:13 -04:00
Michael Klishin 606a65169f Style, wording 2024-09-03 01:31:44 -04:00
Marcial Rosales 1abc4ed02f Extract client_id from client cert 2024-08-30 11:39:48 +01:00
Michal Kuratczyk 8a03975ba7
Set the default vm_memory_high_watermark to 0.6 (#12161)
The default of 0.4 was very conservative even when it was
set years ago. Since then:
- we moved to CQv2, which has much more predictable memory usage than (non-lazy) CQv1 did
- we removed CQ mirroring, which caused large sudden memory spikes in some situations
- we removed the option to store message payloads in memory in quorum queues

For the past two years or so, we've been running all our internal tests and benchmarks
using the value of 0.8 with no OOM kills at all (note: we do this on
Kubernetes, where the Cluster Operator overrides the available memory,
leaving some additional headroom, but effectively we are still using more than
0.6 of the memory).
2024-08-29 12:10:49 +02:00
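The equivalent explicit setting in advanced.config (the `vm_memory_high_watermark` key predates this change; only its default moved to 0.6):

```erlang
%% advanced.config — pin the watermark explicitly rather than rely
%% on the new 0.6 default.
[
 {rabbit, [
   {vm_memory_high_watermark, 0.6}
 ]}
].
```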