[Why]
The downstream process was already handling a `{shutdown, Term}`
termination reason from upstream gracefully: it would log a message and
close the connection.
However, it didn't handle the more common `shutdown` reason, which
occurs on a regular stop of the upstream node. This caused a large,
alarming crash message to be logged.
[How]
We handle `shutdown` the same as `{shutdown, Term}`.
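As a rough illustration of the idea (not the actual patch), the two reasons can share one outcome in a clause head. The module and function names below are hypothetical:

```erlang
-module(shutdown_reason_sketch).
-export([classify/1]).

%% Hypothetical helper: decides whether an upstream termination reason
%% should be treated as a graceful stop or as a genuine crash.
classify(shutdown)          -> graceful;  % regular stop of the upstream node
classify({shutdown, _Term}) -> graceful;  % was already handled before
classify(_Other)            -> crash.
```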
* Remove unused aliases/imports
* Remove or underscore unused bindings
* Fix variables that should be atoms (`unavailable` -> `:unavailable`)
Also, `Logger.warn/1` has been deprecated in favor of `Logger.warning/1`.
It should be safe to just replace the call with `Logger.warning/1` since
it's been in the standard library since Elixir 1.11.
`rabbit_nodes_common:ensure_epmd/0` unconditionally starts EPMD by
spawning `erl` with nodename options. Spawning `erl` can be quite slow
though (around 250ms for me locally), so we should try to avoid it when
we detect that EPMD is already running.
We can relatively cheaply check whether EPMD is already running with
`net_adm:names/0`, a function that asks the daemon on localhost to list
any registered names, the same as `epmd -names` on the command line. If
we can successfully get the list of names from the daemon then we don't
need to start EPMD. `net_adm:names/0` is relatively cheap compared to
running `erl`, costing around 8ms when EPMD is running and less than a
millisecond when it is not.
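A minimal sketch of this check, assuming the slow path is passed in as a fun. `StartFun` and the module name are hypothetical and stand in for whatever spawns `erl`; this is not the actual `rabbit_nodes_common` code:

```erlang
-module(epmd_check_sketch).
-export([ensure_epmd/1]).

%% StartFun is a hypothetical zero-arity fun standing in for the slow
%% path that spawns `erl` to start EPMD.
ensure_epmd(StartFun) when is_function(StartFun, 0) ->
    case net_adm:names() of
        {ok, _Names} ->
            %% EPMD answered with its list of names: it is already running.
            ok;
        {error, _Reason} ->
            %% EPMD could not be reached: fall back to starting it.
            StartFun()
    end.
```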
This improves the CLI's total run time for commands that read the
`enabled_plugins_file` and `plugins_dir` config options. Those options
try to consult a running node and so they start distribution and ensure
that EPMD is running. `rabbitmqctl --help` saves nearly 500ms when EPMD
is already running as it reads both options, since previously it
spawned and blocked on `erl` when reading each option.
After an upgrade (multi-node cluster with rolling restart) from a
pre-3.13.0 version with already existing federation links, old child IDs
are preserved in the mirrored supervisor.
(cherry picked from commit 311cc925e3)
[Why]
Ra consistent queries are currently fragile in the sense that the query
function may run on a remote node and the function reference or MFA may
not be valid on that node. See previous commit for more details.
[How]
We perform local queries in `rabbit_db_maintenance:get_consistent/1`
when Khepri is enabled. This violates the expectation of this API, which
is why it is a temporary measure until a proper solution is found.
[Why]
Ra consistent queries are currently fragile in the sense that the query
function may run on a remote node and the function reference or MFA may
not be valid on that node:
* A different Erlang compiler may produce different function references
for the same module source code. We observed a difference between
Erlang/OTP 25.x and Erlang/OTP 26.x compilers for instance.
* There is no way to be sure that the remote function copy, whether it
is described by a function reference or an MFA tuple, is the same as
the copy local to the caller. Indeed, the remote node may run a
different version after an upgrade to one of the local or remote
nodes.
[How]
That's why we force local queries for now. This is acceptable,
especially since we use Khepri projections in many places and they are
local by design.
[Why]
Sometimes, `ra_leaderboard:lookup_leader/1` will return `undefined`
because it doesn't know the leader yet. This leads to a failure of the
testcase with a `badmatch` exception.
[How]
We wait for the function to return a valid leader ID, then try again and
return the result.
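The waiting step can be sketched as a bounded retry loop. The helper below is hypothetical: `LookupFun` would wrap the real lookup, for example `fun() -> ra_leaderboard:lookup_leader(ClusterName) end`, and the retry count and sleep interval are arbitrary parameters:

```erlang
-module(wait_for_leader_sketch).
-export([wait_for_leader/3]).

%% LookupFun is a hypothetical zero-arity fun wrapping the leader lookup.
%% Gives up with {error, no_leader} once the retries are exhausted.
wait_for_leader(_LookupFun, 0, _SleepMs) ->
    {error, no_leader};
wait_for_leader(LookupFun, Retries, SleepMs) when Retries > 0 ->
    case LookupFun() of
        undefined ->
            %% Leader not known yet: pause, then ask again.
            timer:sleep(SleepMs),
            wait_for_leader(LookupFun, Retries - 1, SleepMs);
        Leader ->
            {ok, Leader}
    end.
```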
Prior to this commit, the test `block_connack_timeout` flaked when 2 new
ports were created instead of only 1, failing on the line:
```
[NewPort] = Ports -- Ports0,
```
This commit filters for tcp_inet ports.
This will always return the port of the new MQTT connection.
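One way such a filter could look, using `erlang:ports/0` and `erlang:port_info/2`. The module and function names are hypothetical and this is not necessarily how the test implements it:

```erlang
-module(tcp_ports_sketch).
-export([tcp_ports/0, new_tcp_ports/1]).

%% All currently open ports whose driver name is "tcp_inet".
%% port_info/2 returns undefined for a port that closed in the meantime,
%% which simply fails the comparison and is skipped.
tcp_ports() ->
    [P || P <- erlang:ports(),
          erlang:port_info(P, name) =:= {name, "tcp_inet"}].

%% TCP ports that were opened since the Before snapshot was taken.
new_tcp_ports(Before) ->
    tcp_ports() -- Before.
```

Taking the snapshot with `tcp_ports/0` before connecting and calling `new_tcp_ports/1` afterwards ignores unrelated ports (files, spawned programs) that may appear in between.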
The `Err` variable was accidentally bound twice, resulting in the error
below when `erpc_call` and `get_sys_status` returned two different
errors.
An example when 1 follower of a quorum queue is not alive:
```
% rabbitmq-queues quorum_status qq1
Status of quorum queue qq1 on node rabbit-1@localhost ...
Error:
{{:case_clause, {:error, :noproc}}, [{:rabbit_quorum_queue, :"-status/2-lc$^0/1-0-", 2, [file: 'rabbit_quorum_queue.erl', line: 1094]}, {:rabbit_quorum_queue, :status, 2, []}]}
```
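The failure mode can be reproduced in isolation. The sketch below is a simplified, hypothetical reconstruction, not the actual `rabbit_quorum_queue` code: once `Err` is bound, reusing it in a later pattern turns that pattern into an equality check, so a different error matches no clause.

```erlang
-module(double_bind_sketch).
-export([buggy/2, fixed/2]).

%% Err is already bound when the inner case runs, so `{error, Err}` only
%% matches when the second error is identical to the first one.
buggy(First, Second) ->
    case First of
        {ok, V} -> {ok, V};
        {error, Err} ->
            case Second of
                {ok, V2} -> {ok, V2};
                {error, Err} -> {error, Err}
            end
    end.

%% Using a distinct variable name accepts any second error.
fixed(First, Second) ->
    case First of
        {ok, V} -> {ok, V};
        {error, _FirstErr} ->
            case Second of
                {ok, V2} -> {ok, V2};
                {error, SecondErr} -> {error, SecondErr}
            end
    end.
```

With two different errors, `buggy/2` raises a `case_clause` error carrying the second error, whereas `fixed/2` returns it.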