Per discussion in #10415, this introduces a new module,
rabbit_mgmt_nodes, which provides a couple of helpers
that can be used to implement Cowboy REST's
resource_exists/2 in the modules that return
information about cluster members.
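As a rough illustration of how such a helper might be wired into a Cowboy REST
handler (a sketch only: the helper name `member_exists/1` is hypothetical and
not necessarily the API this module exposes, while `resource_exists/2` and
`cowboy_req:binding/2` are standard Cowboy):
```
-module(example_node_handler).
-export([init/2, resource_exists/2]).

%% Switch the handler into Cowboy's REST flow.
init(Req, State) ->
    {cowboy_rest, Req, State}.

%% Return 404 unless the requested node is a known cluster member.
%% member_exists/1 is a hypothetical helper name.
resource_exists(ReqData, Context) ->
    NodeBin = cowboy_req:binding(node, ReqData),
    Exists = rabbit_mgmt_nodes:member_exists(NodeBin),
    {Exists, ReqData, Context}.
```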
(cherry picked from commit 0c0e2ca932)
The behaviour of this module is too fragile to risk a regression here, so we
explicitly ping_all/0 before filtering running nodes.
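A minimal sketch of the idea, with the member list passed in explicitly; the
function names are illustrative, and only `net_adm:ping/1`, `nodes/0` and
`lists:member/2` are standard Erlang:
```
%% Illustrative only: ping every known member first so that the
%% connected-node view is fresh before filtering for running nodes.
ping_all(Members) ->
    _ = [net_adm:ping(N) || N <- Members],
    ok.

running_members(Members) ->
    ok = ping_all(Members),
    [N || N <- Members, lists:member(N, [node() | nodes()])].
```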
(cherry picked from commit d74821581b)
Ram nodes are a deprecated feature, and the actual assertion is quite a
complicated one that isn't easy to reason about, as it asserts on the cluster
view of nodes that have their rabbit app stopped.
(cherry picked from commit 87664e9fcb)
# Conflicts:
# deps/rabbit/test/clustering_management_SUITE.erl
As this will force Erlang to attempt to set up a distribution connection to
the down node, which can take some time, especially in cloud environments.
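In other words, a sketch of the underlying Erlang behaviour (not of the actual
call site that was changed):
```
%% Cheap: only inspects distribution connections that already exist.
is_connected(Node) ->
    lists:member(Node, nodes()).

%% Potentially slow when Node is down: the ping must first attempt to set
%% up a new distribution connection before it can return pang.
is_reachable(Node) ->
    net_adm:ping(Node) =:= pong.
```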
(cherry picked from commit 8aa217613c)
Must be a leftover from the refactoring in commit 0a87aea
This should prevent the crash below, which was seen with an exchange
federation link:
```
{undef,
 [{supervisor2,try_again_restart,
   [<0.105395.0>,
    {upstream,
     [{encrypted, <<"...">>}],
     <some mirrored supervisor data>},
    {upstream,
     [{encrypted, <<"...">>}],
     <some mirrored supervisor data>}],
   []}]}
```
(cherry picked from commit b6f782fd0d)
[Why]
We need to do this for `terminate/3` to be called. Without this, the process
exits without calling it.
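As a general OTP reminder, and only an assumption about what "this" refers to:
a process such as a gen_statem only runs `terminate/3` on an exit signal from
its supervisor if it traps exits, e.g.:
```
-module(trap_exit_example).
-behaviour(gen_statem).
-export([start_link/0, init/1, callback_mode/0, ready/3, terminate/3]).

start_link() ->
    gen_statem:start_link(?MODULE, [], []).

%% Without trap_exit, the 'shutdown' exit signal from the supervisor
%% kills the process immediately and terminate/3 is never invoked.
init([]) ->
    process_flag(trap_exit, true),
    {ok, ready, #{}}.

callback_mode() ->
    state_functions.

ready(_EventType, _Event, Data) ->
    {keep_state, Data}.

terminate(_Reason, _State, _Data) ->
    %% cleanup that only happens because exits are trapped
    ok.
```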
(cherry picked from commit a472982d26)
In the stream topology function call. This would trigger an exception during
frame creation when the stream was not available, because the atom was
unexpected.
(cherry picked from commit 6a330563ce)
The solution in #10203 has the following issues:
1. Bindings can be left over in the Mnesia table rabbit_durable_queue.
One solution to 1. would be to first delete the old queue via
`rabbit_amqqueue:internal_delete(Q, User, missing_owner)`
and subsequently declare the new queue via
`rabbit_amqqueue:internal_declare(Q, false)`
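Roughly, that alternative would have amounted to the following sketch, using
the calls quoted above (function and variable names are illustrative, error
handling omitted; this is not what the commit does):
```
%% Delete the leftover queue record of the dead connection, then declare
%% the replacement queue owned by the new connection.
redeclare(OldQ, NewQ, User) ->
    rabbit_amqqueue:internal_delete(OldQ, User, missing_owner),
    rabbit_amqqueue:internal_declare(NewQ, false).
```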
However, even then, it suffers from:
2. Race conditions between `rabbit_amqqueue:on_node_down/1`
and `rabbit_mqtt_qos0_queue:declare/2`:
`rabbit_amqqueue:on_node_down/1` could first read the queue records that need
to be deleted; thereafter, `rabbit_mqtt_qos0_queue:declare/2` could re-create
the queue owned by the new connection PID, and `rabbit_amqqueue:on_node_down/1`
could subsequently delete the re-created queue.
Unfortunately, `rabbit_amqqueue:on_node_down/1` does not delete transient
queues in one isolated transaction. Instead, it first reads queues and
subsequently deletes them in batches, making it prone to race conditions.
Ideally, this commit would delete all rabbit_mqtt_qos0_queue queues of the
node that has crashed, including their bindings. However, doing so in one
transaction is risky, as there may be millions of such queues, and the current
code path applies the same logic on all live nodes, resulting in conflicting
transactions and therefore a long database operation.
Hence, this commit uses the simplest approach, which should still be safe:
do not remove rabbit_mqtt_qos0_queue queues if a node crashes.
Other live nodes will continue to route to these dead queues. That should be
okay, given that rabbit_mqtt_qos0_queue clients auto-confirm. Continuing to
route does, however, mean that these dead queues still count towards the
routing result for the AMQP 0.9.1 `mandatory` property.
If an MQTT client re-connects to a live node with the same client ID,
the new node will delete and then re-create the queue.
Once the crashed node comes back online, it will clean up its leftover
queues and bindings.
(cherry picked from commit 78b4fcc899)
# Conflicts:
# deps/rabbitmq_mqtt/src/rabbit_mqtt_qos0_queue.erl