Dependency horus broke coverage on `main` branch.
After this commit, both of the following commands, run from the
rabbitmq-server root directory on the `main` branch, show coverage:
1.
```
make -C deps/rabbitmq_mqtt ct-auth t=[v5,limit]:vhost_queue_limit FULL=1 COVER=1
open deps/rabbitmq_mqtt/logs/index.html
```
2.
```
bazel coverage //deps/rabbitmq_mqtt:auth_SUITE -t- --test_sharding_strategy=disabled --test_env FOCUS="-group [v5,limit] -case vhost_queue_limit"
genhtml --output genhtml "$(bazel info output_path)/_coverage/_coverage_report.dat"
open genhtml/index.html
```
where `genhtml` is
https://github.com/linux-test-project/lcov/blob/master/bin/genhtml
Prior to this commit, coverage was broken with both Bazel and Erlang.mk.
On main, the logs below are printed in different outputs.
First:
```
*** CT 2023-11-07 16:40:04.959 *** COVER INFO
Adding nodes to cover test: ['rmq-ct-reader_SUITE-1-21000@localhost']
```
followed by
```
Could not start cover on 'rmq-ct-reader_SUITE-1-21000@localhost': {error,
{already_started,
<20798.286.0>}}
```
followed by
```
*** CT 2023-11-07 16:40:04.960 *** COVER INFO
Successfully added nodes to cover test: []
```
followed by
```
Error in process <0.202.0> on node ct_rabbitmq_mqtt@nuc with exit value:
{{badmatch,{ok,[]}},
[{rabbit_ct_broker_helpers,'-cover_add_node/1-fun-0-',1,
[{file,"rabbit_ct_broker_helpers.erl"},
{line,2211}]},
{rabbit_ct_broker_helpers,query_node,2,
[{file,"rabbit_ct_broker_helpers.erl"},
{line,824}]},
{rabbit_ct_broker_helpers,run_node_steps,4,
[{file,"rabbit_ct_broker_helpers.erl"},
{line,447}]},
{rabbit_ct_broker_helpers,start_rabbitmq_node,4,
[{file,"rabbit_ct_broker_helpers.erl"},
```
It's also worth mentioning that after
`make run-broker`
on v3.12.x, `cover_server` is not running:
```
Starting broker... completed with 36 plugins.
1> whereis(cover_server).
undefined
```
but on main:
```
Starting broker... completed with 36 plugins.
1> whereis(cover_server).
<0.295.0>
```
So on main, the `cover_server` process runs even outside of test code.
Prior to this commit:
1. Start RabbitMQ with MQTT plugin enabled.
2.
```
rabbitmq-diagnostics consume_event_stream
^C
```
3. The logs will print the following warning:
```
[warning] <0.570.0> ** Undefined handle_info in rabbit_mqtt_internal_event_handler
[warning] <0.570.0> ** Unhandled message: {'DOWN',#Ref<0.2410135134.1846280193.145044>,process,
[warning] <0.570.0> <52723.100.0>,noconnection}
[warning] <0.570.0>
```
This is because `rabbit_event_consumer:init/1` monitors the CLI process.
Any `rabbit_event` handler should therefore implement `handle_info/2`.
It's similar to what's described in the gen_event docs about
add_sup_handler/3:
> Any event handler attached to an event manager which in turn has a
> supervised handler should expect callbacks of the shape
> Module:handle_info({'EXIT', Pid, Reason}, State).
Listing queues with the HTTP API when there are many (thousands of)
quorum queues could be excessively slow compared to the same scenario
with classic queues.
This optimises various aspects of HTTP API queue listings.
For QQs, it removes the expensive cluster-wide RPCs used to get the
"online" status of each quorum queue. This was previously done _before_
paging and thus would perform a cluster-wide query for _each_ quorum queue in
the vhost/system. This accounted for most of the slowness compared to
classic queues.
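To illustrate why a per-queue status lookup done before paging scales with the total number of queues, here is a toy Python model. It is not the code from this commit (which removes the RPCs altogether); `fetch_status`, the queue names, and the page size are hypothetical stand-ins for the cluster-wide query and the HTTP API paging parameters.

```python
from typing import Callable, Dict, List


def list_with_status_before_paging(
    queues: List[str], page: int, page_size: int,
    fetch_status: Callable[[str], bool],
) -> List[Dict[str, object]]:
    """Annotate every queue first, then slice out one page (the slow shape)."""
    annotated = [{"name": q, "online": fetch_status(q)} for q in queues]
    start = (page - 1) * page_size
    return annotated[start:start + page_size]


def list_with_status_after_paging(
    queues: List[str], page: int, page_size: int,
    fetch_status: Callable[[str], bool],
) -> List[Dict[str, object]]:
    """Slice out one page first, then annotate only the queues on that page."""
    start = (page - 1) * page_size
    return [{"name": q, "online": fetch_status(q)}
            for q in queues[start:start + page_size]]


class CountingStatus:
    """Hypothetical stand-in for the expensive cluster-wide query."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, queue_name: str) -> bool:
        self.calls += 1
        return True


queues = [f"qq-{i}" for i in range(5000)]

before = CountingStatus()
page_before = list_with_status_before_paging(queues, 1, 100, before)

after = CountingStatus()
page_after = list_with_status_after_paging(queues, 1, 100, after)
```

Both functions return the same page, but the first performs one lookup per queue in the listing while the second performs one per queue on the page.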
Secondly, the query to separate the running from the down queues
consisted of two separate queries that were later combined, when a single
query would have sufficed.
This commit also includes a variety of other improvements and minor
fixes discovered during testing and optimisation.
MINOR BREAKING CHANGE: quorum queues would previously only display one
of two states: running or down. Now there is a new state called minority
which is emitted when the queue has at least one member running but
cannot commit entries due to lack of quorum.
Also, the quorum queue may transiently enter the down state when a node
goes down and before it has elected a new leader.
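The three states can be sketched as a function of how many members are running. This is a simplified Python model under the assumption that a strict majority of members is needed to commit entries; the function name and its arguments are hypothetical, and the model ignores the transient down state during leader election described above.

```python
def quorum_queue_state(running_members: int, total_members: int) -> str:
    """Classify a quorum queue as "running", "minority" or "down".

    Simplified model: assumes a strict majority of members is required
    to commit entries. Not the server's actual implementation.
    """
    if total_members <= 0 or not 0 <= running_members <= total_members:
        raise ValueError("invalid member counts")
    quorum = total_members // 2 + 1
    if running_members >= quorum:
        return "running"
    if running_members >= 1:
        # At least one member is up, but entries cannot be committed.
        return "minority"
    return "down"
```

For a three-member queue, two running members still form a quorum, one running member yields `minority`, and none yields `down`.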
With this adjustment, more actions will use the sandbox, which may
help with an error relating to a missing `rabbit.hrl` that could occur
when building the CLI or running its tests.
Since this adjusts the user-template.bazelrc, everyone should likely
update their own user.bazelrc accordingly.
WHY:
Shovelling from RabbitMQ to Azure Service Bus and Azure Event Hub fails.
Reported in
https://discord.com/channels/1092487794984755311/1092487794984755314/1169894510743011430
Reproduction steps:
1. Follow https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-integrate-with-rabbitmq
2. Publish messages to RabbitMQ:
```
java -jar target/perf-test.jar -x 1 -y 0 -u azure -p -C 100000 -s 1 -c 100000
```
Prior to this commit, after a few seconds and after around 20k messages
arrived in Azure, RabbitMQ errored and logged:
```
{function_clause,
[{amqp10_client_connection,close_sent,
[info,
{'EXIT',<0.949.0>,
{{badmatch,{error,insufficient_credit}},
[{rabbit_amqp10_shovel,forward,4,
[{file,"rabbit_amqp10_shovel.erl"},
{line,334}]},
{rabbit_shovel_worker,handle_info,2,
[{file,"rabbit_shovel_worker.erl"},
{line,101}]},
{gen_server2,handle_msg,2,
[{file,"gen_server2.erl"},{line,1056}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,241}]}]}},
```
After this commit, all 100k messages get shovelled to Azure Service Bus.
HOW:
1. Fix link credit accounting in Erlang AMQP 1.0 client library. For each
message being published, link credit must be decreased by 1 instead of
being increased by 1.
2. If the shovel plugin runs out of credits, it must wait until the
receiver (Azure Service Bus) grants more credits to RabbitMQ.
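The two points above can be sketched with a toy sender-side model in Python. This is not the Erlang client's code: the class and method names are hypothetical, and the flow handling is simplified by assuming that a grant from the receiver replaces the sender's remaining credit.

```python
from collections import deque
from typing import Deque, List


class SenderLink:
    """Toy model of sender-side link credit (names are hypothetical)."""

    def __init__(self, credit: int = 0) -> None:
        self.credit = credit
        self.pending: Deque[str] = deque()
        self.sent: List[str] = []

    def send(self, message: str) -> bool:
        """Send if credit is available; otherwise hold the message back."""
        if self.credit == 0:
            # Out of credit: wait for the receiver instead of crashing.
            self.pending.append(message)
            return False
        # Each published message consumes one credit (decrease, not increase).
        self.credit -= 1
        self.sent.append(message)
        return True

    def on_flow(self, link_credit: int) -> None:
        """Receiver granted credit: record it and drain held-back messages."""
        self.credit = link_credit
        while self.pending and self.credit > 0:
            self.credit -= 1
            self.sent.append(self.pending.popleft())


link = SenderLink(credit=2)
results = [link.send("a"), link.send("b"), link.send("c")]
credit_when_blocked = link.credit
link.on_flow(5)
```

With an initial credit of 2, the third message is held back until the receiver grants more credit, after which it is sent and one of the new credits is consumed.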
Note that the solution in this commit is rather a naive quick fix for one
obvious bug. AMQP 1.0 integration between RabbitMQ and Azure Service Bus is
not tested and not guaranteed at this point in time.
More work will be needed in the future; some of this work is done as part of
https://github.com/rabbitmq/rabbitmq-server/pull/9022
Previously, test `pubsub` was flaky:
```
{shared_SUITE,pubsub,766}
{test_case_failed,missing m1}
```
because the binding wasn't present yet on node 0 when publishing to node
0.
This version detects major version mismatches in transitive
dependencies.
For instance, it will notice if ra and osiris ask for
different major versions of seshat.
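A check of this kind can be sketched in Python as follows. This is only an illustration of the idea, not the tool's implementation; the function names, the input shape, and the version numbers are hypothetical.

```python
from typing import Dict


def major(version: str) -> int:
    """Return the major component of a version such as "1.2.3" or "v1.2.3"."""
    return int(version.lstrip("v").split(".")[0])


def find_major_mismatches(
    requirements: Dict[str, Dict[str, str]],
) -> Dict[str, Dict[str, int]]:
    """Map each dependency requested with differing majors to who asked for what.

    `requirements` maps a requirer to the dependency versions it asks for.
    """
    by_dependency: Dict[str, Dict[str, int]] = {}
    for requirer, deps in requirements.items():
        for dep, version in deps.items():
            by_dependency.setdefault(dep, {})[requirer] = major(version)
    return {
        dep: majors
        for dep, majors in by_dependency.items()
        if len(set(majors.values())) > 1
    }


# Hypothetical version numbers, chosen only to trigger a mismatch.
requirements = {
    "ra": {"seshat": "0.6.1"},
    "osiris": {"seshat": "1.0.0", "gen_batch_server": "0.8.8"},
}
mismatches = find_major_mismatches(requirements)
```

Here only `seshat` is reported, because it is the only dependency requested with two different major versions.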
[Why]
`rabbit_khepri` relied on undocumented internals of Ra. This made this
code very fragile and not future-proof at all.
[How]
Ra exposes a new `ra:key_metrics/1` API which fulfills the need. This
patch uses it.
As a consequence, we can get rid of the Dialyzer directives to turn off
some warnings.