Commit Graph

58467 Commits

Author SHA1 Message Date
dependabot[bot] 4860585c50
build(deps): bump org.apache.maven.plugins:maven-surefire-plugin
Bumps [org.apache.maven.plugins:maven-surefire-plugin](https://github.com/apache/maven-surefire) from 3.5.1 to 3.5.2.
- [Release notes](https://github.com/apache/maven-surefire/releases)
- [Commits](https://github.com/apache/maven-surefire/compare/surefire-3.5.1...surefire-3.5.2)

---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-surefire-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 18:17:42 +00:00
Michael Klishin 508ec97c88
Merge pull request #12646 from rabbitmq/metrics-flakes
metrics_SUITE: wait for tables in proper test
2024-11-04 12:08:26 -05:00
Michael Klishin 6e54084db0
Merge pull request #12642 from rabbitmq/prometheus-federation-flake
test: wait for links and metrics in prometheus_rabbitmq_federation_collector_SUITE
2024-11-04 12:00:48 -05:00
David Ansari 6034f3c411
Merge pull request #12638 from rabbitmq/amqp-connection-metrics
Expose AMQP connection metrics
2024-11-04 17:57:29 +01:00
Diana Parra Corbacho 054fcd676c metrics_SUITE: wait for tables in proper test 2024-11-04 16:46:23 +01:00
Diana Parra Corbacho 2fba2419d3 test: wait for links and metrics in prometheus_rabbitmq_federation_collector_SUITE 2024-11-04 08:42:42 +01:00
Michael Klishin fe587ae6d3
Merge pull request #12640 from rabbitmq/rabbitmq-server-12412
QQs: periodically apply policies if there's a discrepancy between the current and desired policy-driven state
2024-11-04 01:20:50 -05:00
Michael Klishin 734f6853bc
Merge branch 'main' into rabbitmq-server-12412 2024-11-04 00:39:51 -05:00
Michael Klishin d6a9db0c5a
Actions/alpha build: cosmetics 2024-11-04 00:36:59 -05:00
Michael Klishin 49ad8eadbd
Revert "Actions, alpha build: try passing in a different prerelease_identifier"
This reverts commit de0d8cf70b.
2024-11-04 00:36:59 -05:00
Michael Klishin a1a555e2ed
Actions, alpha build: try passing in a different prerelease_identifier 2024-11-04 00:36:58 -05:00
Michael Klishin aaebcc1048
Actions: trigger alpha build workflow run when workflow itself changes 2024-11-04 00:36:58 -05:00
Michael Klishin 3cf326e64e
Actions: try using a short commit SHA for alpha identifier 2024-11-04 00:36:58 -05:00
Michael Klishin 7ddd9d825a
Use a known repository_dispatch event type 2024-11-04 00:36:58 -05:00
Michael Klishin da615adce7
New workflow for triggering alpha releases
in rabbitmq/server-packages, an Actions-only
repo dedicated to open source RabbitMQ release
automation.
2024-11-04 00:36:58 -05:00
Michael Klishin 84e65cc075
Update SECURITY.md 2024-11-04 00:36:58 -05:00
Michael Klishin 8ea7e65e34
QQ: handle case where a stale read request results in member crash.
It is possible for a slow running follower with local consumers
to crash after a snapshot installation as it tries to read an entry
from its log that is no longer there (as it has been consumed and
completed by another node but still refers to prior consumers on the
current node).

This commit makes the log effect callback function more defensive
to check that the number of commands returned by the log effect
isn't different from what was requested. If it is different we
consider this a stale read request and return no further effects.

Conflicts:
	deps/rabbit/test/quorum_queue_SUITE.erl
2024-11-04 00:36:48 -05:00
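The defensive check described in the commit above can be illustrated with a minimal Python sketch. This is not the actual Erlang callback: `handle_log_read`, `requested_indexes`, `commands`, and `make_effects` are hypothetical names chosen for illustration only.

```python
def handle_log_read(requested_indexes, commands, make_effects):
    """Return follow-up effects, or none if the read looks stale.

    Hypothetical names: `requested_indexes` are the log indexes that were
    asked for, `commands` are the entries the log actually returned.
    """
    if len(commands) != len(requested_indexes):
        # Fewer (or more) entries than requested, for example because the
        # log was truncated after a snapshot installation: treat the
        # request as stale and produce no further effects.
        return []
    return make_effects(commands)


def deliver(cmds):
    # Illustrative effect builder: one delivery effect per command.
    return [("deliver", c) for c in cmds]


# A complete read yields effects; a short read yields none.
print(handle_log_read([1, 2], ["a", "b"], deliver))
print(handle_log_read([1, 2, 3], ["a", "b"], deliver))
```

Returning an empty list rather than raising keeps the member alive, which matches the intent stated in the commit message.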
GitHub 654bd047f3
bazel run gazelle 2024-11-04 00:34:52 -05:00
Michael Klishin 30c0b36772
Reduce AWS peer discovery workflow run rate
By running it

 * On push, when relevant code paths change
 * Every Monday morning

The peer discovery subsystem does not change
particularly often, and this plugin in particular
does not. Nonetheless, we currently run it for
every push unconditionally.
2024-11-04 00:34:51 -05:00
David Ansari af876ed6d1
Use log macros for AMQP
Using a log macro has the benefit that location data is added as
explained in https://www.erlang.org/doc/apps/kernel/logger.html#t:metadata/0
2024-11-04 00:34:51 -05:00
David Ansari 6fde076707
Support AMQP 1.0 token renewal
Closes #9259.

 ## What?
Allow an AMQP 1.0 client to renew an OAuth 2.0 token before it expires.

 ## Why?
This allows clients to keep the AMQP connection open instead of having
to create a new connection whenever the token expires.

 ## How?
As explained in https://github.com/rabbitmq/rabbitmq-server/issues/9259#issuecomment-2437602040
the client can `PUT` a new token on HTTP API v2 path `/auth/tokens`.
RabbitMQ will then:
1. Store the new token on the given connection.
2. Recheck access to the connection's vhost.
3. Clear all permission caches in the AMQP sessions.
4. Recheck write permissions to exchanges for links publishing to
   RabbitMQ, and recheck read permissions from queues for links
   consuming from RabbitMQ. The latter complies with the user
   expectation in #11364.
2024-11-04 00:34:51 -05:00
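The four renewal steps listed in the commit above could be sketched roughly as follows. This is an illustrative Python model, not RabbitMQ's implementation: every class, field, and callback name here (`Connection`, `Session`, `Link`, `renew_token`, `can_access_vhost`, `can_write`, `can_read`) is an assumption made for the example.

```python
from dataclasses import dataclass, field


class AccessRefused(Exception):
    """Raised when the new token no longer grants a required permission."""


@dataclass
class Link:
    role: str      # "publisher" (writes to an exchange) or "consumer" (reads from a queue)
    resource: str  # exchange or queue name


@dataclass
class Session:
    links: list
    permission_cache: set = field(default_factory=set)


@dataclass
class Connection:
    vhost: str
    sessions: list
    token: str = ""


def renew_token(conn, new_token, can_access_vhost, can_write, can_read):
    # 1. Store the new token on the connection.
    conn.token = new_token
    # 2. Recheck access to the connection's vhost.
    if not can_access_vhost(new_token, conn.vhost):
        raise AccessRefused(f"vhost {conn.vhost!r}")
    # 3. Clear all permission caches in the sessions.
    for session in conn.sessions:
        session.permission_cache.clear()
    # 4. Recheck write permissions for publishing links and read
    #    permissions for consuming links.
    for session in conn.sessions:
        for link in session.links:
            check = can_write if link.role == "publisher" else can_read
            if not check(new_token, link.resource):
                raise AccessRefused(f"{link.role} link on {link.resource!r}")
```

The permission checks are passed in as callables so the sketch stays independent of any particular authorization backend.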
Michael Klishin a6adf74620
Actions deps: manually apply #12630 #12631 2024-11-04 00:34:51 -05:00
Diana Parra Corbacho ff44f4d355
Test: metrics_SUITE queue_idemp wait for queue metrics 2024-11-04 00:34:51 -05:00
Diana Parra Corbacho 7ac5b17787
Test: wait for metrics 2024-11-04 00:34:51 -05:00
Diana Parra Corbacho ab9d225502
Tests: wait for connection closed in metrics_SUITE 2024-11-04 00:34:50 -05:00
Michal Kuratczyk df8f6d19aa
Abort restart-cluster if something goes wrong
For example, if the first restarted node doesn't start,
don't try to restart the other nodes. This mimics what
orchestrators such as Kubernetes or BOSH would do
(although they perform this check differently)
2024-11-04 00:34:50 -05:00
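The abort-on-failure behavior described in the commit above can be sketched in Python. The real logic lives in the project's make-based tooling; `restart_cluster`, `restart_node`, `wait_for_node`, and the `timeout_s` default are hypothetical names and values used only to illustrate the control flow.

```python
def restart_cluster(nodes, restart_node, wait_for_node, timeout_s=60):
    """Restart nodes one at a time and stop at the first failure.

    `restart_node(node)` and `wait_for_node(node, timeout_s) -> bool` are
    hypothetical callbacks supplied by the caller.
    """
    restarted = []
    for node in nodes:
        restart_node(node)
        if not wait_for_node(node, timeout_s):
            # Do not touch the remaining nodes if this one failed to start.
            raise RuntimeError(
                f"node {node!r} did not start; aborting after {restarted}"
            )
        restarted.append(node)
    return restarted
```

Raising on the first failure leaves the remaining nodes running, which is the point of the change: a broken restart should not take down the rest of the cluster.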
Marcial Rosales c8e1593679
Verify non-zero DNS and email SAN 2024-11-04 00:34:50 -05:00
Marcial Rosales c0ef442d6d
Use the correct variable name 2024-11-04 00:34:50 -05:00
Jean-Sébastien Pédron fe7beea4b8
rabbit_feature_flags: Log controller task on a single line 2024-11-04 00:34:50 -05:00
Jean-Sébastien Pédron 9802348683
rabbit_feature_flags: Report feature flags init error reason
[Why]
`failed_to_initialize_feature_flags_registry` was a little too vague.
2024-11-04 00:34:50 -05:00
Jean-Sébastien Pédron 937ca915c9
rabbit_feature_flags: Introduce hard vs. soft required feature flags
[Why]
Before this patch, required feature flags were basically checked during
boot: they must have been enabled when they were mere stable feature
flags. If they were not, the node refused to boot.

This was easy for the developer because making a feature flag required
allowed removing the entire compatibility code. Very satisfying.

Unfortunately, this was a pain point to end users, especially those who
did not pay attention to RabbitMQ and the release notes and were just
asking their package manager to update everything. They could end up
with a node that refuses to boot. The only solution was to downgrade,
enable the disabled stable feature flags, upgrade again.

[How]
This patch introduces two levels of requirement to required feature
flags:
* `hard`: this corresponds to the existing behavior where a node will
  refuse to boot if a hard required feature flag is not enabled before
  the upgrade.
* `soft`: such a required feature flag will be automatically enabled
  during the upgrade to a version where it is marked as required.

The level of requirement is set in the feature flag definition:

    -rabbit_feature_flag(
       {my_feature_flag,
        #{stability     => required,
          require_level => hard
         }}).

The default requirement level is `soft`. All existing required feature
flags now have a requirement level of `hard`.

The handling of soft required feature flags is done when the cluster
feature flags states are verified and synchronized. If a required
feature flag is not enabled yet, it is enabled at that time.

This means that as developers, we will have to keep compatibility code
forever for every soft required feature flag, like the feature flag
definition itself.
2024-11-04 00:34:49 -05:00
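The difference between the two requirement levels described in the commit above can be modeled with a short Python sketch. It is not the `rabbit_feature_flags` code: the function name `sync_required_flags` and the dictionary layout are assumptions that mirror the property names from the commit message (`stability`, `require_level`).

```python
def sync_required_flags(flags, enabled):
    """Return the set of enabled flags after handling required ones.

    `flags` maps a flag name to its properties, `enabled` is the set of
    currently enabled flag names. Both shapes are hypothetical.
    """
    result = set(enabled)
    for name, props in flags.items():
        if props.get("stability") != "required" or name in result:
            continue
        # Per the commit message, the default requirement level is `soft`.
        level = props.get("require_level", "soft")
        if level == "hard":
            # Existing behavior: refuse to boot.
            raise RuntimeError(f"required feature flag {name!r} is not enabled")
        # Soft requirement: enable the flag automatically.
        result.add(name)
    return result
```

For example, a disabled flag declared with only `{"stability": "required"}` is added to the returned set, while the same flag with `"require_level": "hard"` raises.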
Jean-Sébastien Pédron b5b598ce25
rabbit_prometheus_http_SUITE: Start broker once in `special_chars` group
`init_per_group/3`, which starts the broker, was already called earlier
in the function.

This fixes a bug where the node can't be stopped in `end_per_group/2`,
affecting the next group's ability to start one.
2024-11-04 00:34:49 -05:00
Diana Parra Corbacho 624b72bedb
queue_SUITE: use a different upstream for each queue on multi-federation tests 2024-11-04 00:34:49 -05:00
David Ansari ea7bc819fd
Add AMQP 1.0 event exchange test 2024-11-04 00:34:49 -05:00
Jean-Sébastien Pédron 2d61fac09c
rabbitmq-run.mk: Restart nodes in a cluster sequentially
... not in parallel.
2024-11-04 00:34:49 -05:00
Jean-Sébastien Pédron 7f1d1615f9
rabbitmq-run.mk: Use a 60 seconds timeout for `rabbitmqctl wait`
... not 60 milliseconds.
2024-11-04 00:34:49 -05:00
Michael Klishin 0a5974688d
Fix a typo in 4.0.3 release notes 2024-11-04 00:34:48 -05:00
Michael Klishin 88df855266
4.0.3 release notes 2024-11-04 00:34:48 -05:00
Michael Klishin 0a557f7d5e
Use fmt_string in this error message 2024-11-04 00:34:48 -05:00
Diana Parra Corbacho ef06f80bb8
Fix metrics_SUITE connection_metrics flake 2024-11-04 00:34:48 -05:00
Loïc Hoguin 3dbfcaa3a0
Make CI: Enable khepri mixed clusters testing 2024-11-04 00:34:48 -05:00
Loïc Hoguin 2235492d28
Make CI: Add mixed version testing
This is enabled on main and for pull requests. Bazel remains
used in previous branches.
2024-11-04 00:34:47 -05:00
dependabot[bot] 7e05aac424
build(deps): bump org.springframework.boot:spring-boot-starter-parent
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 00:34:47 -05:00
dependabot[bot] 6ade708dab
build(deps): bump org.springframework.boot:spring-boot-starter-parent
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-04 00:34:47 -05:00
David Ansari 238ce77585
Delete test access_failure
This test flakes in CI as described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2419293869

The test case fails with
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```

However, RabbitMQ closes the session as expected due to the missing read
permissions to the queue as shown in the RabbitMQ logs:
```
[debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
[info] <0.1321.0> Created user 'access_failure'
[debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
[debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
[info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
[debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
[debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
[warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
[warning] <0.1338.0>                                             {symbol,
[warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
[warning] <0.1338.0>                                             {utf8,
[warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
[warning] <0.1338.0>                                             undefined}
[debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
[warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
[warning] <0.1333.0> client unexpectedly closed TCP connection
```

```
let receiver = ReceiverLink(ac.Session, "test-receiver", src)
```
uses a null constructor for the onAttached callback.
ReceiverLink doesn't seem to block.

Given that the exact same authorization error is already tested in test
case attach_source_queue of amqp_auth_SUITE, it's safe to delete this F#
test.
2024-11-04 00:34:47 -05:00
David Ansari 52b6419876
Remove test flake
Prior to this commit tests
* leader_transfer_quorum_queue_credit_single
* leader_transfer_quorum_queue_credit_batches
flaked in CI during 4.1 (main) and 4.0 mixed version testing.

The following error occurred on node 0:
```
[error] <0.1950.0> Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0
[warning] <0.1950.0> Closing session for connection <0.1945.0>: {'v1_0.error',
[warning] <0.1950.0>                                             {symbol,<<"amqp:internal-error">>},
[warning] <0.1950.0>                                             {utf8,
[warning] <0.1950.0>                                              <<"Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0">>},
[warning] <0.1950.0>                                             undefined}
```

Therefore we enable this feature flag for both tests.

This commit also simplifies some test setups that were necessary for
4.0/3.13 mixed version testing, but aren't necessary anymore for 4.1/4.0
mixed version testing.
2024-11-04 00:34:47 -05:00
David Ansari 70597737e4
Support x-cc message annotation (#12559)
Support x-cc message annotation

Support an `x-cc` message annotation in AMQP 1.0
similar to the [CC](https://www.rabbitmq.com/docs/sender-selected) header in AMQP 0.9.1.

The value of the `x-cc` message annotation must be a list of strings.
A message annotation is used since application properties allow only simple types.
2024-11-04 00:34:47 -05:00
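The "list of strings" constraint on `x-cc` from the commit above can be illustrated with a small Python sketch. It assumes, by analogy with the AMQP 0.9.1 CC header, that the extra values act as additional routing keys; the function name `routing_keys` and the plain-dict representation of message annotations are hypothetical.

```python
def routing_keys(routing_key, message_annotations):
    """Combine the primary routing key with any `x-cc` routing keys."""
    cc = message_annotations.get("x-cc", [])
    # The annotation value must be a list whose elements are all strings.
    if not isinstance(cc, list) or not all(isinstance(k, str) for k in cc):
        raise ValueError("x-cc must be a list of strings")
    return [routing_key, *cc]
```

A message with routing key `"orders"` and `{"x-cc": ["audit", "billing"]}` would be routed with three keys in this model, while `{"x-cc": "audit"}` is rejected because the value is a single string rather than a list.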
David Ansari 9c2ee91a3c
Validate setting permissions works
in order to troubleshoot the flake described in
https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2419293869
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
2024-11-04 00:34:46 -05:00
David Ansari 7a5277e1c4
Fix test flake
As described in https://github.com/rabbitmq/rabbitmq-server/issues/12413#issuecomment-2385379386
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
2024-11-04 00:34:45 -05:00
Michael Klishin af0d8206c8 Actions/alpha build: cosmetics 2024-11-03 23:03:33 -05:00