The `rabbit_mgmt_gc` gen_server performs garbage collections
periodically. When doing so it can create fairly large terms, for
example by building a set out of `rabbit_exchange:list_names/0`. With
many exchanges the process memory usage can climb steadily, especially
when the management agent is mostly idle, since `rabbit_mgmt_gc` won't
hit enough reductions to trigger a full-sweep GC on itself. Since the
process is only active periodically (once every 2 minutes by default)
we can hibernate it to GC the terms it created.
This can save a moderate amount of memory in situations where there are
very many pieces of metadata (exchanges, vhosts, queues, etc.). For
example, on an idle single-node broker with 50k exchanges,
`rabbit_mgmt_gc` can hover around 50MB before being naturally GC'd. With
this patch the process memory usage stays consistent between `start_gc`
timer messages, at around 1KB.
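In OTP, hibernation is requested by returning `hibernate` as the third
element of a callback's return tuple; the VM then performs a full-sweep
GC and shrinks the heap until the next message arrives. A minimal sketch
of the shape (not the actual `rabbit_mgmt_gc` code):

```erlang
-module(periodic_gc_sketch).
-behaviour(gen_server).

-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(INTERVAL, 120000). %% 2 minutes, matching the default

init([]) ->
    erlang:send_after(?INTERVAL, self(), start_gc),
    {ok, #{}}.

handle_info(start_gc, State) ->
    %% ... build large transient terms here, e.g. a set built from
    %% rabbit_exchange:list_names/0 ...
    erlang:send_after(?INTERVAL, self(), start_gc),
    %% Hibernating triggers a full-sweep GC, dropping the garbage above.
    {noreply, State, hibernate}.

handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
```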
(cherry picked from commit ce5d42a9d6)
Building on push to any branch is wasteful and unnecessary, because most
of the built images are never used. The workflow dispatch trigger covers
the use case of building an image from the latest commit in a branch.
The use case of validating/QA-ing a PR is now covered by the pull
request trigger. This trigger has a caveat: PRs from forks won't produce
a Docker image.
Why?
Because PRs from forks do not get the rabbitmq-server secrets injected.
This is a security mechanism of GitHub's to protect repository secrets.
With this trigger it is possible to QA/validate PRs from other Core team
members; technically, from anyone with 'write' access to our repo who
can push branches.
(cherry picked from commit 4efb3df39e)
# Conflicts:
# .github/workflows/oci-make.yaml
The `below_node_connection_limit_test` and `ready_to_serve_clients_test`
cases could flake because `is_quorum_critical_single_node_test` uses the
channel manager in `rabbit_ct_client_helpers` to open a connection. This
could cause the line

    true = lists:all(fun(E) -> is_pid(E) end, Connections),

to fail to match: the last connection could have been rejected while the
channel manager kept its own connection open, so instead of being a pid
the element would have been `{error, not_allowed}`.
With `rabbit_ct_client_helpers:close_channels_and_connection/2` we can
reset the channel manager and force it to close its connection.
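A hedged sketch of the fix, assuming the rabbitmq-ct-helpers API;
`OpenFun` is a hypothetical stand-in for each case's logic that opens
connections up to the limit:

```erlang
%% Not the actual suite code; OpenFun stands in for the per-case logic
%% that opens connections up to the node connection limit.
reset_and_check(Config, OpenFun) ->
    %% Force the channel manager to drop its cached connection so it no
    %% longer occupies a slot under the connection limit.
    ok = rabbit_ct_client_helpers:close_channels_and_connection(Config, 0),
    Connections = OpenFun(Config),
    true = lists:all(fun is_pid/1, Connections).
```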
This commit is backported from 314e4261fc
on main.
Returning the connection limit and active count is not really necessary
for these checks. Instead of returning them in the health check
response, we log a warning when the connection limit is exceeded.
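In sketch form (not the actual handler; only the standard `logger`
module is assumed):

```erlang
%% Sketch only: warn in the log instead of returning the numbers in the
%% health check response.
log_if_over_limit(Active, Limit) when is_integer(Limit), Active >= Limit ->
    logger:warning("connection limit exceeded: ~b active, limit ~b",
                   [Active, Limit]),
    false;
log_if_over_limit(_Active, _Limit) ->
    true.
```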
(cherry picked from commit 3f53e0172d)
[skip ci] Bump com.google.googlejavaformat:google-java-format from 1.26.0 to 1.27.0 in /deps/rabbit/test/amqp_jms_SUITE_data in the dev-deps group across 1 directory
This is useful, for example, for a load balancer to be able to avoid
sending new connections to a node which is running and has listeners
bound to TCP ports, but is being drained for maintenance.
(cherry picked from commit 07fe6307c6)
This updates the health check for protocol listeners to accept a
comma-separated set of protocols. The check only returns 200 OK when all
requested protocols have active listeners.
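Conceptually the check reduces to something like the following (the
function name is hypothetical; the real handler lives in the management
plugin):

```erlang
%% Does every requested protocol have an active listener?
%% ProtocolsParam is e.g. <<"amqp,web-mqtt">>; ActiveProtocols is the
%% list of protocols with active listeners on this node.
all_protocols_listening(ProtocolsParam, ActiveProtocols) ->
    Requested = binary:split(ProtocolsParam, <<",">>, [global]),
    lists:all(fun(P) ->
                  lists:member(binary_to_atom(P, utf8), ActiveProtocols)
              end,
              Requested).
```

For example, `all_protocols_listening(<<"amqp,clustering">>, [amqp,
clustering, http])` returns `true`, while requesting any protocol with
no active listener returns `false`.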
(cherry picked from commit 5d319be3f9)
This is a minor change that avoids a cluster-wide query for active
listeners. The old code called `rabbit_networking:active_listeners/0`,
which performs an RPC to collect and concatenate every cluster member's
listeners, and then immediately filtered the result down to the local
node. Equivalently, we can use `rabbit_networking:node_listeners(node())`,
which just reads a local ETS table.
This is not a very impactful change but it's nice to keep the latency of
the health-check handlers low and reduce some unnecessary cluster noise.
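Roughly, the difference (the `#listener{}` record comes from
rabbit_common's `rabbit.hrl`; the filtering detail is a sketch):

```erlang
-include_lib("rabbit_common/include/rabbit.hrl").

%% Before: RPC to every cluster member, then keep only local listeners.
local_listeners_via_rpc() ->
    [L || L = #listener{node = N} <- rabbit_networking:active_listeners(),
          N =:= node()].

%% After: read this node's listeners straight from the local ETS table.
local_listeners() ->
    rabbit_networking:node_listeners(node()).
```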
(cherry picked from commit 0d692fa161)
At CQ startup, `variable_queue` went through each seq id from 0 to
next_seq_id looking for the first message, even if there were no
messages in the queue (no segment files).
In case of a clean shutdown the next_seq_id value is stored in the
recovery terms. This value can be used by the queue index to provide
better seq id bounds in the absence of segment files.
Before this patch, starting an empty classic queue with next_seq_id =
100_000_000 used to take about 26 seconds. With this patch it takes
less than 1ms.
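A heavily hedged sketch of the idea; the function and the per-segment
entry count are illustrative, not the actual queue index code:

```erlang
-define(SEGMENT_ENTRY_COUNT, 16384). %% illustrative constant

%% With no segment files on disk, both bounds collapse to the recovered
%% next_seq_id, so recovery no longer scans seq ids 0..next_seq_id.
bounds(NextSeqId, [] = _SegmentNumbers) ->
    {NextSeqId, NextSeqId};
bounds(NextSeqId, SegmentNumbers) ->
    %% The lowest segment number bounds the first possibly-used seq id.
    {lists:min(SegmentNumbers) * ?SEGMENT_ENTRY_COUNT, NextSeqId}.
```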
(cherry picked from commit 150172f008)
[Why]
These modules were moved from `rabbit` to `rabbit_common` several years
ago to solve a dependency issue: `amqp_client` depended on the file
handle cache. This is not the case anymore.
[How]
The modules are moved back to `rabbit`.
`rabbit_common` doesn't need to depend on `os_mon` anymore. `rabbit`
already depends on it, so no changes needed here.
`include/rabbit_memory.hrl` and some test cases are moved as well to
follow the `vm_memory_monitor` module.
(cherry picked from commit e58eb1807a)
Consumers with the same name consuming from the same stream should have
the same partition index. This commit adds a check to enforce this rule
and makes the subscription fail if it does not comply.
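The rule in sketch form (hypothetical helper; the real check lives in
the stream plugin's subscription path):

```erlang
%% All consumers that share a name on a stream must agree on the
%% partition index; reject the subscription otherwise.
validate_partition_index(NewIndex, ExistingIndexes) ->
    case lists:all(fun(I) -> I =:= NewIndex end, ExistingIndexes) of
        true  -> ok;
        false -> {error, precondition_failed}
    end.
```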
Fixes #13835
(cherry picked from commit cad8b70ee8)
Vhosts that currently don't have their own default queue type now
inherit it from the node configuration and store it in their metadata
going forward.
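A minimal sketch of the inheritance, assuming the node-level value is
read from the `rabbit` application's `default_queue_type` environment
key (the helper name and the `classic` fallback are illustrative):

```erlang
%% If the vhost metadata has no default queue type, persist the
%% node-level one so the vhost keeps it even if the node config changes.
ensure_default_queue_type(Metadata) when is_map(Metadata) ->
    case maps:get(default_queue_type, Metadata, undefined) of
        undefined ->
            NodeDefault = application:get_env(rabbit, default_queue_type,
                                              classic),
            Metadata#{default_queue_type => NodeDefault};
        _AlreadySet ->
            Metadata
    end.
```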
(cherry picked from commit 9d0f01b45b)
Rather than injecting the node-level DQT when exporting definitions,
inject it into the vhost's metadata when the vhost is created.
(cherry picked from commit 3c95bf32e7)
The correct place for the `default_queue_type` property
is inside the `metadata` block. However, right now we'd
always export the value outside of `metadata`, and only
export it inside `metadata` if it was not `undefined`.
The value outside of `metadata` was just misleading:
if a user exported the definitions from a fresh node,
changed `classic` to `quorum` and imported the modified
values, the DQT would still be `classic`, because RMQ looks
for the value inside `metadata`. To make it more confusing,
if the DQT was changed successfully one way or another, the
value outside of `metadata` would reflect that
(it always shows the correct value, but is ignored on import).
(cherry picked from commit 73da2a3fbb)
It was necessary to add a queue first, before checking which
columns are available.
(cherry picked from commit 7653b6522a)
# Conflicts:
# selenium/test/queuesAndStreams/list.js