This ensures that quorum queues shut down _before_ the coordination
sub-tree that Khepri runs inside. Quorum queues depend on Khepri, so
they need to be shut down first.
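As a minimal OTP sketch of why the ordering works: a supervisor
terminates its children in reverse start order, so starting the
coordination child before the quorum-queue child makes quorum queues
stop first. The module and child names below are illustrative, not the
actual RabbitMQ supervision tree.
```erlang
-module(shutdown_order_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Children stop in reverse start order: quorum_queues starts last,
%% so it shuts down first, before coordination (and the Khepri store
%% running inside it). Module names are placeholders.
init([]) ->
    Children = [#{id => coordination,
                  start => {coordination_sup, start_link, []},
                  type => supervisor},
                #{id => quorum_queues,
                  start => {quorum_queues_sup, start_link, []},
                  type => supervisor}],
    {ok, {#{strategy => one_for_all}, Children}}.
```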
[Why]
The feature flag enable function is called during the initial migration
or when a node is later added to a cluster.
In the latter situation, the cluster is already formed and the Mnesia
tables were already migrated. Syncing the cluster at that point might
kick out another node that is currently unreachable.
[How]
If the node running the enable function is already clustered, we skip
the cluster sync.
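A sketch of that guard, assuming hypothetical `is_clustered/0` and
`sync_cluster/0` helpers; the real names in the feature flag callback
may differ:
```erlang
%% Hypothetical helper names: only sync the cluster on the initial
%% migration, when this node is not yet part of a formed cluster.
enable(_FeatureName) ->
    case is_clustered() of
        true ->
            %% Node joined an existing cluster: tables are already
            %% migrated, so skip the sync to avoid kicking out an
            %% unreachable node.
            ok;
        false ->
            sync_cluster()
    end.
```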
[Why]
All callers of `khepri_adv` and `khepri_tx_adv` need updates to handle
the now-uniform return type of `khepri:node_props_map()` in Khepri
0.17.0.
[How]
We don't need any compatibility code to handle "either the old return
type or the new return type" from the `khepri_adv` API because the
translation is done entirely in the "client-side" code in Khepri:
the return value from the Ra server is the same, but it is translated
differently by the functions in `khepri_adv`.
However, we need to adapt transaction functions because they may be
executed on different versions of Khepri, and the behaviour of
`khepri_tx_adv` can differ between them. To account for the possible
change in return value format, we use the new
`khepri_tx:does_api_comply_with/1` to know what to expect.
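As a sketch, a transaction function can branch on the API behaviour.
The `uniform_write_ret` tag below is an assumption about the behaviour
name; check the `khepri_tx:does_api_comply_with/1` documentation for
the exact value.
```erlang
%% Runs inside a Khepri transaction; normalises khepri_tx_adv:delete/1
%% results to the Khepri 0.17.0 shape (a khepri:node_props_map()).
%% The `uniform_write_ret' tag is an assumption.
delete_in_tx(Path) ->
    {ok, Ret} = khepri_tx_adv:delete(Path),
    case khepri_tx:does_api_comply_with(uniform_write_ret) of
        true ->
            %% Khepri 0.17.0+: already #{Path => NodeProps}.
            Ret;
        false ->
            %% Older Khepri: node props for the single deleted node;
            %% wrap them to match the new uniform shape.
            #{Path => Ret}
    end.
```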
[Why]
In Khepri 0.17.0, `khepri_cluster:locally_known_members/1` and
`khepri_cluster:locally_known_nodes/1` were replaced with
`khepri_cluster:members/2` and `khepri_cluster:nodes/2` respectively,
with `favor` set to `low_latency` - this matches the interface for
queries in Khepri.
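A call site then migrates roughly like this (sketch; the option map
follows the Khepri query API):
```erlang
get_members(StoreId) ->
    %% Replaces khepri_cluster:locally_known_members(StoreId);
    %% favor => low_latency queries the local copy, matching the
    %% query API.
    khepri_cluster:members(StoreId, #{favor => low_latency}).
```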
Move leader repair earlier in the tick function to ensure a more
timely update of the metadata store record after a leader change.
Also use the RPC_TIMEOUT macro for metric/stats multicalls to improve
liveness when a node is connected but partitioned / frozen.
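A minimal sketch of a timeout-bounded multicall; the macro value and
function names are illustrative, not the actual code paths:
```erlang
-define(RPC_TIMEOUT, 15000).

%% An unbounded multicall can hang until net ticks notice that a peer
%% is partitioned or frozen; a timeout keeps stats collection live.
collect_stats(Nodes, Mod, Fun, Args) ->
    rpc:multicall(Nodes, Mod, Fun, Args, ?RPC_TIMEOUT).
```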
This should address crashes like this one (found in a user's logs):
```
exception error: no case clause matching
[[{connection_details,[]},
{name,<<"10.0.13.41:50497 -> 10.2.230.128:5671 (1)">>},
{node,rabbit@foobar},
{number,1},
{user,<<"...">>},
{user_who_performed_action,<<"...">>},
{vhost,<<"/">>}],
[{connection_details,[]},
{name,<<"10.0.13.41:50142 -> 10.2.230.128:5671 (1)">>},
{node,rabbit@foobar},
{number,1},
{user,<<"...">>},
{user_who_performed_action,<<"...">>},
{vhost,<<"/">>}]]
in function rabbit_federation_mgmt:format/3 (rabbit_federation_mgmt.erl, line 100)
in call from rabbit_federation_mgmt:'-status/3-lc$^0/1-0-'/4 (rabbit_federation_mgmt.erl, line 89)
in call from rabbit_federation_mgmt:'-status/4-lc$^0/1-0-'/3 (rabbit_federation_mgmt.erl, line 82)
in call from rabbit_federation_mgmt:'-status/4-lc$^0/1-0-'/3 (rabbit_federation_mgmt.erl, line 82)
in call from rabbit_federation_mgmt:status/4 (rabbit_federation_mgmt.erl, line 82)
in call from rabbit_federation_mgmt:to_json/2 (rabbit_federation_mgmt.erl, line 57)
in call from cowboy_rest:call/3 (src/cowboy_rest.erl, line 1590)
in call from cowboy_rest:set_resp_body/2 (src/cowboy_rest.erl, line 1473)
```
## What?
This commit determines the queue topology without checking the queue type.
## Why?
This way, determining the leader and replicas works the same across
all queue types, without the need to introduce additional
rabbit_queue_type behaviour as suggested in other PRs.
## How?
The queue record's `pid` is the leader; the nodes in
`queue_type_states` are the members/replicas.
This commit results in an unknown stream leader during queue
declaration. However, the correct leader will be returned eventually
when calling GET on the stream.
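A sketch of the type-agnostic lookup; the `amqqueue` accessors are
real, but the exact shape of the type state used for the members is
an assumption here:
```erlang
%% Same logic for every queue type: the queue record's pid points at
%% the leader, and the type state lists the member nodes.
queue_topology(Q) ->
    Leader = case amqqueue:get_pid(Q) of
                 Pid when is_pid(Pid) -> node(Pid);
                 _ -> undefined % e.g. stream leader not yet known
             end,
    TypeState = amqqueue:get_type_state(Q),
    Members = maps:get(nodes, TypeState, [Leader]),
    {Leader, Members}.
```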
This allows restricting access to the /api/index.html and
/cli/index.html pages to authenticated users, should the user really
want to. This can be enabled via `advanced.config`.
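A sketch of what such an advanced.config entry could look like; the
key name below is a placeholder, not the actual setting:
```erlang
[
  {rabbitmq_management, [
      %% Placeholder key: require authentication for the /api and
      %% /cli documentation pages.
      {require_auth_for_docs, true}
  ]}
].
```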
A connection which terminated before it was fully established would
lead to a `function_clause` error, since the metadata needed to call
`notify_connection_closed` is not available. We can just ignore such
connections and not notify about them.
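A sketch of the guard, with illustrative function and argument names:
```erlang
%% Skip the notification when the connection never finished its
%% handshake and therefore has no metadata to report.
maybe_notify_connection_closed(undefined = _Metadata) ->
    ok;
maybe_notify_connection_closed(Metadata) ->
    rabbit_event:notify(connection_closed, Metadata).
```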
Resolves https://github.com/rabbitmq/rabbitmq-server/discussions/13670