Commit Graph

43 Commits

Author SHA1 Message Date
Jean-Sébastien Pédron 1f1a13521b
Skip peer discovery clustering tests if multiple Khepri machine versions
... are being used at the same time.

[Why]
Depending on which node clusters with which, a node running an older
version of the Khepri Ra machine may not be able to apply Ra commands
and could be stuck.

There is no real solution and this is clearly an unsupported scenario.
An old node won't always be able to join a newer cluster.

[How]
In the testsuites, we skip clustering tests if we detect that multiple
Khepri Ra machine versions are being used.
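
The skip decision can be sketched as follows. This is a minimal Python
illustration, not the testsuite's actual Erlang code; the function name
`should_skip_clustering_tests` and the node-to-version mapping are
hypothetical.

```python
def should_skip_clustering_tests(machine_versions):
    """Return a skip reason if nodes run different Khepri Ra machine versions.

    `machine_versions` maps a node name to the machine version it runs.
    This input shape is an assumption made for the sketch.
    """
    distinct = set(machine_versions.values())
    if len(distinct) > 1:
        return "multiple Khepri machine versions in use: %s" % sorted(distinct)
    return None  # all nodes agree, so the clustering tests can run
```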
2025-02-12 17:13:24 +01:00
Jean-Sébastien Pédron f549425615
rabbitmq_ct_broker_helpers: Use node 2 as the cluster seed node
[Why]
When running mixed-version tests, nodes 1/3/5/... are using the primary
umbrella, so usually the newest version. Nodes 2/4/6/... are using the
secondary umbrella, thus the old version.

When clustering, we used to use node 1 (running a new version) as the
seed node, meaning other nodes would join it.

This complicates things with feature flags because we have to make sure
that we start node 1 with new stable feature flags disabled to allow old
nodes to join.

This is also a problem with Khepri machine versions because the cluster
would start with the latest version, which old nodes might not have.

[How]
This patch changes the logic to use a node running the secondary
umbrella as the seed node instead. If there is no node running it, we
pick the first node as before.
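
A minimal Python sketch of this selection rule, assuming a hypothetical
predicate `uses_secondary_umbrella` that tells whether a node runs the
secondary umbrella (the real helper is written in Erlang and is not
reproduced here):

```python
def pick_seed_node(nodes, uses_secondary_umbrella):
    """Pick the cluster seed node from an ordered, non-empty list of nodes.

    Prefer the first node running the secondary umbrella (the old version
    in mixed-version tests); fall back to the first node otherwise.
    """
    for node in nodes:
        if uses_secondary_umbrella(node):
            return node
    return nodes[0]


# In mixed-version tests, nodes 2/4/6/... use the secondary umbrella.
def is_even(node_number):
    return node_number % 2 == 0

seed = pick_seed_node([1, 2, 3], is_even)
```

With nodes numbered as described above, node 2 becomes the seed; when no
node uses the secondary umbrella, node 1 is picked as before.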

V2: Revert part of "rabbitmq_ct_helpers: Fix how we set
    `$RABBITMQ_FEATURE_FLAGS` in tests" (commit 57ed962ef6). These
    changes are no longer needed with the new logic.

V3: The check that verifies that the correct metadata store is used has
    a special case for nodes that use the secondary umbrella: if Khepri
    is supposed to be used but it's not, the feature flag is enabled.
    The reason is that the `v4.0.x` branch doesn't know about the `rel`
    configuration of `forced_feature_flags_on_init`. The nodes will
    have ignored this parameter and booted with the stable feature
    flags only.

    Many testsuites are adapted to the new clustering order. If they
    manage which node joins which node, either the order is changed in
    the testcases, or nodes are started with only required feature
    flags. For testsuites that rely on peer discovery where the order is
    unknown, nodes are started with only required feature flags.
2025-01-27 12:08:12 +01:00
Michael Klishin 968eefa1bb
Bump (c) line year
There are no functional changes to this massive diff.
2025-01-01 17:54:10 -05:00
Jean-Sébastien Pédron d54b2bff3c
rabbitmq_peer_discovery_etcd: Wait for etcd start in system_SUITE
[Why]
It was possible that testcases were executed before the etcd daemon was
ready, leading to test failures.

[How]
There was already a sanity check to verify that the etcd daemon was
working correctly, but it was itself a testcase.

This patch moves this code to the etcd start code to wait for it to be
ready.

This replaces the previous workaround of waiting for 2 seconds.
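
The general idea of polling for readiness instead of sleeping for a fixed
time can be sketched in Python as below. The `wait_until` helper and its
parameters are hypothetical; the `check` callable stands in for whatever
probe confirms that etcd is working.

```python
import time


def wait_until(check, timeout=30.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Call `check` repeatedly until it returns True or `timeout` elapses.

    Returns True as soon as the check passes, False if the deadline is
    reached first. The check is always attempted at least once.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Unlike a fixed sleep, this returns as soon as the daemon is ready and
reports a failure explicitly when it never becomes ready.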

While here, log anything printed to stdout/stderr by etcd after it
exited.

Fixes #12981.
2024-12-19 19:45:55 +01:00
David Ansari c23c632753 Fix etcd test failures
Running
```
make -C deps/rabbitmq_peer_discovery_etcd ct-system
```
on some macOS systems causes test failures because the client cannot
connect to etcd:
```
test failed to connect [localhost:2379] by <Gun Down> {down,
                                                       {shutdown,
                                                        econnrefused}}
```

The etcd log file didn't show any error message.
However, the etcd log file showed that the etcd listener got started
after the test case tried to connect.

This commit fixes the test failure.

A better solution would be to use the HTTP API or the etcdctl CLI to
poll the listener status. However, simply waiting for 2 seconds is good
enough for this test suite.
2024-12-19 17:49:47 +01:00
Jean-Sébastien Pédron e890b9d37f
rabbitmq_peer_discovery_{etcd,consul}: Fix error handling if Khepri is unsupported
[How]
We must check the return value of `rabbit_ct_broker_helpers:run_steps/2`
because it may request that the testsuite/testgroup/testcase be
skipped.
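
The pattern can be illustrated with a small Python sketch. The functions
below are hypothetical stand-ins, not the real Erlang helpers; the
`("skip", reason)` tuple mirrors the Common Test `{skip, Reason}`
convention.

```python
def is_skip(value):
    """Tell whether a step result is a skip request."""
    return isinstance(value, tuple) and len(value) == 2 and value[0] == "skip"


def run_steps(config, steps):
    """Hypothetical step runner: stop at the first step that asks to skip."""
    for step in steps:
        result = step(config)
        if is_skip(result):
            return result
        config = result
    return config


def init_per_suite(config, steps):
    """Propagate a skip request instead of treating it as a config."""
    result = run_steps(config, steps)
    if is_skip(result):
        return result
    return dict(result, initialized=True)
```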
2024-07-10 14:24:19 +02:00
Karl Nilsson 3390fc97fb etcd peer discovery fixes
Instead of relying on the complex and non-deterministic default node
selection mechanism inside peer discovery, this change makes the
etcd backend implementation perform the leader selection itself, based on
the etcd create_revision of each entry. Although it is not spelled out
anywhere explicitly, a property called "Create Revision" is likely
to remain consistent throughout the lifetime of the etcd key.

Either way this is likely to be an improvement on the current approach.
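
A minimal Python sketch of a deterministic selection based on
create_revision. It assumes that the entry with the lowest revision wins
and that ties are broken by node name; the commit message states neither,
so both rules and the `select_leader` name are illustrative only.

```python
def select_leader(entries):
    """Pick a leader from (node_name, create_revision) pairs.

    Assumption: the oldest entry (lowest create_revision) is selected,
    with the node name as a tie-breaker so every node computes the same
    answer from the same data.
    """
    if not entries:
        return None
    name, _revision = min(entries, key=lambda entry: (entry[1], entry[0]))
    return name
```

Because every node sorts the same entries the same way, they all agree on
the leader without extra coordination.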
2024-06-12 15:31:27 +01:00
Jean-Sébastien Pédron 50b490100d
rabbitmq_peer_discovery_etcd: Add clustering testcases
[Why]
The existing testsuite checked whether communication with an etcd node
worked, but didn't test actual cluster formation.

[How]
The new testcases try to create a cluster using the local etcd node
started by the testsuite. The first one starts one RabbitMQ node at a
time. The second one starts all of them concurrently.

While here, use the etcd source code added as a Git submodule in a
previous commit to compile etcd locally just for the testsuite.
2024-05-14 09:40:44 +02:00
Michael Klishin 9c79ad8d55 More missed license header updates #9969 2024-02-05 12:26:25 -05:00
Michael Klishin f414c2d512
More missed license header updates #9969 2024-02-05 11:53:50 -05:00
Michael Klishin 01092ff31f
(c) year bumps 2024-01-01 22:02:20 -05:00
Michael Klishin 1b642353ca
Update (c) according to [1]
1. https://investors.broadcom.com/news-releases/news-release-details/broadcom-and-vmware-intend-close-transaction-november-22-2023
2023-11-21 23:18:22 -05:00
Luke Bakken 7ac6fea9f3
Update etcd testing version
Also ensure init script passes `shellcheck`

Updated while looking into https://github.com/rabbitmq/rabbitmq-server/issues/5792
2023-01-22 10:04:01 -08:00
Michael Klishin ec4f1dba7d
(c) year bump: 2022 => 2023 2023-01-01 23:17:36 -05:00
Luke Bakken 7fe159edef
Yolo-replace format strings
Replaces `~s` and `~p` with their unicode-friendly counterparts.

```
git ls-files '*.erl' | xargs sed -i.ORIG -e 's/~s\>/~ts/g' -e 's/~p\>/~tp/g'
```
2022-10-10 10:32:03 +04:00
Michael Klishin c38a3d697d
Bump (c) year 2022-03-21 01:21:56 +04:00
Michael Klishin f7d32d69f8 Introduce a new CLI tool (scope), rabbitmq-tanzu
For Tanzu (commercial) plugins to attach their commands to instead of
polluting rabbitmqctl.

Pair: @pjk25
(cherry picked from commit 6e0f2436fa)
2021-11-30 14:54:09 +00:00
Michael Klishin 52479099ec
Bump (c) year 2021-01-22 09:00:14 +03:00
Michael Klishin 013c30370f Switch to MPL2 2020-07-14 15:47:29 +03:00
Michael Klishin 7c9b9f2f81 Test cases for short peer discovery mechanism alias 2020-06-03 01:39:31 +03:00
Michael Klishin b2d6625a2b init-etcd.sh: add /usr/sbin to PATH
for Debian
2020-04-10 10:44:33 +03:00
Luke Bakken a04ff31d6d Fix syntax errors 2020-04-08 14:44:41 -07:00
Michael Klishin 709f6598af More idempotent tests 2020-04-08 21:55:06 +03:00
Michael Klishin c89e8d4ff8 Output init-etcd.sh exit code and output on failures 2020-04-03 21:00:17 +03:00
Michael Klishin 46ff052218 Adapt init-etcd.sh for Linux 2020-04-03 20:30:48 +03:00
Michael Klishin e3da01186a Initial support for authentication with etcd
References #6.
2020-04-03 14:01:06 +03:00
Michael Klishin 5ca74074e3 New integration tests
they manage an etcd node transparently, similar to how
slapd(8) is managed in rabbitmq-auth-backend-ldap.
2020-04-03 11:34:51 +03:00
Michael Klishin bea151a5f8 Introduce a script that starts an etcd node
to be used by an integration test suite(s)
2020-04-03 11:33:52 +03:00
Michael Klishin e452478c9a Initial tests for v3 API-based implementation 2020-04-01 16:25:18 +03:00
Jean-Sébastien Pédron ab8fc13127 Update copyright (year 2020) 2020-03-10 16:41:31 +01:00
Michael Klishin ce69b14630 Adapt to rabbitmq/rabbitmq-peer-discovery-common API changes
for warning-free OTP 23 compatibility.
2019-12-29 16:58:23 +03:00
Michael Klishin a76c4a8287 Alias cluster_formation.etcd.lock_timeout to cluster_formation.etcd.lock_wait_time
For consistency with the name rabbitmq-peer-discovery-consul now uses. That backend
was updated to support both keys as well.
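
Supporting both keys can be sketched in Python as follows. The lookup
order (new name first, old name as fallback) is an assumption, and
`lock_wait_time` is a hypothetical helper rather than the plugin's schema
code.

```python
NEW_KEY = "cluster_formation.etcd.lock_wait_time"
OLD_KEY = "cluster_formation.etcd.lock_timeout"


def lock_wait_time(settings, default=None):
    """Read the lock wait time from either config key.

    Assumption: the new key takes precedence when both are present.
    """
    for key in (NEW_KEY, OLD_KEY):
        if key in settings:
            return settings[key]
    return default
```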

References rabbitmq/rabbitmq-peer-discovery-consul#20.
2018-10-10 00:09:16 +03:00
Luke Bakken 6c28ce0a48 Whitespace, add a couple tests for cluster name with slash 2018-09-10 11:32:18 -07:00
Michael Klishin c681bdbb10 Handle keys for which node name extraction returned an error 2018-09-10 18:02:41 +02:00
Michael Klishin b96bcf78fa Extract node name from a /nodes/{name} sequence
Relying on slash-separated segments is problematic because
some user-provided segments, such as key prefix values, might
contain slashes.
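
A minimal Python sketch of extracting the name from the `/nodes/{name}`
sequence rather than counting segments. The key layout in the usage
example and the `extract_node_name` helper are assumptions made for
illustration.

```python
def extract_node_name(key):
    """Return the node name following the last "/nodes/" in an etcd key.

    Returns None when the marker is missing or no name follows it, so
    callers can handle keys that do not describe a node.
    """
    marker = "/nodes/"
    index = key.rfind(marker)
    if index == -1:
        return None
    name = key[index + len(marker):]
    return name or None


# A prefix that itself contains slashes does not confuse the lookup.
name = extract_node_name("/rabbitmq/my/prefix/clusters/default/nodes/rabbit@host1")
```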

Closes #14.

[#159956331]
2018-09-10 17:42:15 +02:00
Michael Klishin 3c9683a365 Failing tests for #14
[#159956331]
2018-09-10 15:58:30 +02:00
Luke Bakken b0f00be3dc Add test for etcd_prefix and cluster_name
Fixes #10
2018-05-28 09:27:33 -07:00
Michael Klishin a568279850 Make cluster name and lock acquisition time configurable
via the new style config format.
2017-08-24 09:55:24 -06:00
Michael Klishin a8b4641aad New style config schema: add more etcd backend settings 2017-08-24 07:04:29 -04:00
Michael Klishin c067edd205 etcd_region does not exist
Looks like a copy-paste mistake from the AWS backend.
2017-08-24 06:38:55 -04:00
Michael Klishin a7b8c925e9 More tests 2017-06-13 03:42:22 +03:00
Michael Klishin ca1c0c431f Add a test for extract_nodes/1 2017-06-13 02:35:17 +03:00
Luke Bakken 74fb686403 Add cuttlefish schema tests and cuttlefish schema.
Fixes #2
2017-06-12 11:12:48 -07:00