Commit Graph

167 Commits

Author SHA1 Message Date
Karl Nilsson 9a5d0f9d85 Make stream coordinator machine versioned
In order to retain deterministic results of state machine applications
during upgrades we need to make the stream coordinator versioned such
that we only use the new logic once the stream coordinator switches to
machine version 1.
2022-01-07 12:11:11 +00:00
dcorbacho 0bd8d41b72 Skip new import testcase on mixed environments 2022-01-03 17:37:06 +01:00
Michael Klishin 19ae35aa14
#3925 follow-up: don't include Erlang client headers 2021-12-28 01:24:32 +03:00
Michael Klishin b569ab5d74
Rename two newly introduced test modules 2021-12-28 00:35:55 +03:00
dcorbacho c88605aab4
Import definitions: support user limits 2021-12-26 04:32:00 +03:00
Luke Bakken d1496a2c7c
Fix tests 2021-12-26 04:32:00 +03:00
Luke Bakken 043641c99f
Use protected ets so that data can be read quickly 2021-12-26 04:31:59 +03:00
Thuan Duong Ba dc6fb24761 minor fix on condition to stop batching when total batch size is large 2021-12-20 17:39:06 -08:00
Thuan Duong Ba 1ab485b44c minor update for batching messages when syncthroughput is 0 2021-12-20 17:39:06 -08:00
Thuan Duong Ba 157bffa332 Support configure max sync throughput in CMQs 2021-12-20 17:39:06 -08:00
polaris-alioth 6431584a10 Prevent creating unnamed policy when loading definition 2021-12-19 12:52:26 +08:00
Philip Kuryloski 249e8c853c Adjust the way rabbit_fifo.hrl is referenced in rabbit_fifo_SUITE
For erlang_ls convenience
2021-12-16 16:41:15 +01:00
Michael Klishin ebd79836c1 Revisit operator policy merging rules for boolean fields
For booleans, we can prefer the operator policy value
unconditionally, without any safety implications.

Per discussion with @binarin @pjk25

(cherry picked from commit 6edb7396fd)
2021-12-10 19:48:16 +00:00
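The boolean rule described in ebd79836c1 can be illustrated with a minimal sketch. The helper name and the use of `None` for "key not set" are assumptions made for illustration; this is not the actual merging code.

```python
def merge_boolean_field(user_value, operator_value):
    """Resolve a boolean key that may be set in a user and an operator policy.

    Hypothetical helper: None means "this policy does not set the key".
    """
    # When the operator policy defines the key, its value wins unconditionally.
    if operator_value is not None:
        return operator_value
    # Otherwise fall back to whatever the user policy says.
    return user_value
```

For example, `merge_boolean_field(True, False)` yields `False`, because the operator policy value is preferred.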
Loïc Hoguin 1b0eb9a4a3
Fix case where confirms may not be sent
A channel that first sends a mandatory publish before enabling
confirms mode may not receive confirms for messages published
after that. This is because the publish_seqno was increased
also for mandatory publishes even if confirms were disabled.
But the mandatory feature has nothing to do with publish_seqno.

The issue exists since at least
38e5b687de

The test case introduced focuses on multiple=false. The issue
also exists for multiple=true but it has a different impact:
sending multiple=true,delivery_tag=2 results in both messages
1 and 2 being acked, even if message 2 doesn't exist as far
as the client is concerned. If the message does exist
it might get confirmed earlier than it should have been. The
issue is a bigger problem the more mandatory messages were
sent before enabling confirms mode.
2021-12-08 15:53:47 +01:00
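The sequence-number drift described in 1b0eb9a4a3 can be modelled with a small sketch. This is a simplified model, not the channel implementation: the `Channel` class, its `buggy` flag, and the method names are all assumptions made for illustration.

```python
class Channel:
    """Toy model of a channel's publish sequence counter (hypothetical)."""

    def __init__(self, buggy=False):
        self.confirm_enabled = False
        self.publish_seqno = 1  # next delivery tag to hand out
        self.buggy = buggy      # True reproduces the pre-fix behaviour

    def enable_confirms(self):
        self.confirm_enabled = True

    def publish(self, mandatory=False):
        """Return the delivery tag to confirm, or None if no confirm is due."""
        if self.confirm_enabled:
            tag = self.publish_seqno
            self.publish_seqno += 1
            return tag
        # Pre-fix behaviour: a mandatory publish bumped the counter
        # even though confirms were not enabled yet.
        if self.buggy and mandatory:
            self.publish_seqno += 1
        return None
```

With `buggy=True`, a mandatory publish followed by `enable_confirms()` makes the next publish return tag 2, while the client expects tag 1. With `buggy=False` the same sequence returns tag 1.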
Luke Bakken 9ff201c3ab
Remove flaky assertion
Thanks @kjnilsson
2021-12-01 06:57:25 -08:00
dcorbacho 5e9664f9e7 Query total number of messages on stream leader on queue.declare 2021-11-30 15:09:30 +01:00
David Ansari 45f69f8829 Add missing Ra commands to the log
Before this commit, the tests were not including any settle, return, or
discard Ra commands.

Do not pattern match against 'ra_event' because nowadays:
_Opts = [local, ra_event]
2021-11-26 16:16:45 +01:00
Michael Klishin 4f09fd109c
quorum_queue_SUITE: bump some timeouts 2021-11-24 18:04:35 +03:00
Michael Klishin 6a08e143e9
quorum_queue_SUITE: drop a debug line 2021-11-24 16:47:20 +03:00
Luke Bakken 6d545447b9
Fix quorum queue crash during consumer cancel with return
Fixes #3729
2021-11-23 08:59:47 -08:00
Michael Klishin e22e667a10
Do not count unroutable message in global totals 2021-11-23 16:37:46 +03:00
Luke Bakken 6aaf7ec597
Merge pull request #3740 from rabbitmq/rabbitmq-server-3739
Distribution listener settings support in rabbitmq.conf
2021-11-16 06:36:48 -08:00
Michael Klishin 8a30cf1c86
Distribution listener settings support in rabbitmq.conf
* distribution.listener.interface
* distribution.listener.port_range.min
* distribution.listener.port_range.max

Closes #3739
2021-11-16 16:37:28 +03:00
Karl Nilsson bc7b339e7a Stream coordinator: only update amqqueue record if stream id matches
From the coordinator's POV each stream has a unique id consisting of the
vhost, queuename and a high resolution timestamp even if several stream ids
relate to the same queue record.

When performing the mnesia update the coordinator now checks that the current stream id
matches that of the update_mnesia action and does not change the queue record if
the stream id is not the same.

This should avoid "old" incarnations of a stream queue updating newer ones
with incorrect information.
2021-11-16 12:32:33 +00:00
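The guard described in bc7b339e7a can be sketched as follows. The dict-based record, the `stream_id` key, and the function name are assumptions for illustration only; the real coordinator works on amqqueue records in mnesia.

```python
def apply_update_mnesia(queue_record, action_stream_id, new_fields):
    """Apply an update only if it targets the current stream incarnation.

    Hypothetical layout: the record is a dict whose "stream_id" is a
    (vhost, queue_name, timestamp) tuple.
    """
    if queue_record.get("stream_id") != action_stream_id:
        # The action belongs to an older incarnation: leave the record alone.
        return queue_record
    # Same incarnation: return a copy with the new fields merged in.
    return {**queue_record, **new_fields}
```

An action carrying the id of an earlier incarnation of the same queue name is ignored, so it cannot overwrite the newer record.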
Karl Nilsson 1c6e45257d QQ: set better timeouts for commands
Refactor how the single active consumer check is performed when consuming.

Improve timeouts in rabbit_fifo_client.
2021-11-08 11:07:41 +00:00
Michael Klishin 686dccf410 Introduce a target cluster size hint setting
This is meant to be used by deployment tools,
core features and plugins
that expect a certain minimum
number of cluster nodes
to be present.

For example, certain setup steps
in distributed plugins might require
at least three nodes to be available.

This is just a hint, not an enforced
requirement. The default value is 1
so that for single node clusters,
there would be no behavior changes.
2021-11-03 08:42:58 +00:00
Karl Nilsson 691de2bea4 Take all clustered nodes into account when declaring stream.
Deriving a max-cluster-size only from running nodes would create situations where,
in a three-node cluster with only two nodes running, it would select a non-running
node as a follower.
2021-10-18 15:44:53 +01:00
Karl Nilsson 5520c6cafe Stream queue: handle unsupported header value types
As AMQP 0.9.1 headers are translated into AMQP 1.0 application properties
they are not able to contain complex values such as arrays or tables.

RabbitMQ federation does use array and table values, so to avoid crashing when
delivering a federated message to a stream queue we drop them. These header values
should be considered internal, however, so dropping them before the final queue delivery should not be a huge problem.
2021-10-13 10:27:00 +01:00
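The filtering described in 5520c6cafe can be sketched as below. It assumes headers are represented as a plain dict, with arrays as lists or tuples and tables as nested dicts; the function name is hypothetical.

```python
def to_application_properties(headers):
    """Keep only simple header values; drop arrays and tables.

    Assumption: arrays arrive as list/tuple and tables as dict.
    """
    complex_types = (list, tuple, dict)
    return {
        key: value
        for key, value in headers.items()
        if not isinstance(value, complex_types)
    }
```

Scalar headers pass through unchanged, while a federation-style header holding an array or table is silently dropped instead of causing a crash.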
Philip Kuryloski 9c9fb7ffb0 Shard cluster_management_SUITE by testcase to better manage timeouts
The suite-level timeout in the .erl file, I've learned, is actually per
case. By sharding by testcase, we can better match the common test
level and bazel level timeouts, such that we can get logs from remote
test run failures.
2021-09-30 10:38:39 +02:00
Philip Kuryloski 860653c97a Adjust the clustering_management_SUITE timeout at the ct level
Previously the bazel timeout and common test timeout were equal, which
meant that in practice the bazel timeout was often reached first, in
which case we don't receive the test logs
2021-09-23 13:55:18 +02:00
Philip Kuryloski 7dc0c29227 Use only 3 nodes for feature_flags_with_unpriveleged_user_SUITE
The test does not appear reliable when it runs in Github actions. This
is currently the only test that does so. Other tests run on BuildBuddy workers.
2021-09-22 17:22:49 +02:00
Philip Kuryloski 6e6279eb2b Reduce a test timeout
The original value of 15 minutes was inherited from a larger suite. 5
should be sufficient, as a passing run is typically around 2 minutes.
2021-09-21 10:16:38 +02:00
Karl Nilsson eaa216da82 QQ: emit release cursors after consumer cancel
If this is not done apps that consume/cancel from empty queues in a loop
will grow the raft log in an unbounded manner. This could also be the
case for the garbage_collect command.
2021-09-17 17:09:30 +01:00
Karl Nilsson 5779059bd5 QQ: fix memory leak when cancelling consumer
If the queue is empty when a consumer is cancelled it would leave the
consumer id inside the service queue. If an application subscribes/unsubscribes
in a loop from an empty queue this would cause the service queue to never be
cleared up.

NB: whenever we make a change to how the quorum queue state machine is
calculated we need to consider how this affects determinism as during an
upgrade different members may calculate a different service queue state.
In this case it should be ok as they will eventually converge on the same
state once all "dead" consumer ids have been removed from the queue.

In any case it should not affect how messages are assigned to consumers.
2021-09-17 14:53:33 +01:00
Philip Kuryloski eea99e1cd5 Split the feature_flags_SUITE into two parts for CI/Bazel
Two testcases in the original suite fail if the test is run as the
root user. Currently under remote execution with bazel this is the
only working option. There is a workaround in place, but the entire
suite when run that way takes around 12 minutes. This splits the suite
so that the minimal set of cases is executed using the slower workaround.
2021-09-17 11:08:48 +02:00
Michal Kuratczyk 624767281f Enable metrics collection in run_tests
Proposed `min-masters` implementation relies on metrics so they need to
be collected during queue_master_location tests.
2021-09-10 14:51:11 +02:00
Gerhard Lazu 6a1faa6fd6
Keep checking that replica recovered in rabbit_stream_queue
Rather than sleeping for 6 seconds, we want to check that replica
recovered multiple times within 30 seconds, and either eventually
succeed, or fail if this does not recover within 30 seconds, the default
await_condition time interval.

Pair: @kjnilsson

Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
2021-08-31 17:02:21 +01:00
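The polling approach described in 6a1faa6fd6 (check repeatedly instead of sleeping for a fixed time) can be sketched as below. This is a generic illustration rather than the test helper itself; the function signature and the one-second interval are assumptions, and only the 30 second default comes from the commit message.

```python
import time


def await_condition(check, timeout=30.0, interval=1.0):
    """Poll check() until it returns True or the timeout elapses.

    Returns True on success and False if the condition never held.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would assert on the result, for example `assert await_condition(replica_recovered)`, where `replica_recovered` is a hypothetical zero-argument predicate. Compared with a fixed sleep, this finishes as soon as the condition holds and still fails once the timeout is exceeded.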
Philip Kuryloski 09fb5c5321 Skip additional tests in mixed versions
The tests in question won't pass consistently as they are at the mercy
of how the quorum queue is placed across the mixed version nodes
2021-08-30 17:17:25 +02:00
Michael Klishin 83f007be54
Merge pull request #3341 from rabbitmq/local-exclusive-queues
Always place exclusive queues on the local node
2021-08-28 09:54:49 +03:00
Michael Klishin 3b4b4dc222
Exclude roundtrip definition import cases from mixed version runs
References #3333
2021-08-26 19:10:11 +03:00
Michael Klishin 54f7b6d77c
Re-format two definition import input files 2021-08-26 19:03:14 +03:00
Michael Klishin 42a3dfa81b
Exclude the #3333 test case from mixed version runs 2021-08-26 17:25:07 +03:00
Michal Kuratczyk d3dcd48ea5 Always place exclusive queues on the local node
Prior to this change, exclusive queues have been subject to the queue
location process, just like other queues. Therefore, if
queue_master_locator was not client-local and x-queue-master-locator was
not set to client-local, an exclusive queue was likely to be located on
a different node than the connection it is exclusive to.  This is
suboptimal and may lead to inconsistencies when the queue's node goes
down while the connection's node is still up.
2021-08-26 13:05:55 +02:00
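The placement rule from d3dcd48ea5 can be summarised in a short sketch. The function and parameter names are hypothetical; `locate` stands in for whatever queue location strategy is configured.

```python
def pick_queue_node(exclusive, connection_node, locate):
    """Choose the node that will host a newly declared queue.

    Hypothetical helper: `locate` is a zero-argument callable that runs
    the configured queue location strategy.
    """
    if exclusive:
        # Exclusive queues always live next to their owning connection.
        return connection_node
    # All other queues still go through the location process.
    return locate()
```

With this rule, an exclusive queue and its connection share a node, so they go down together.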
Michael Klishin 2e61f51773
Commit definition import case16 file 2021-08-24 04:41:51 +03:00
Michael Klishin 6f97707dac
Definition import: correctly import vhost metadata 2021-08-24 04:41:04 +03:00
Michael Klishin 6a0058fe7c
Introduce TLS-related rabbitmq.conf settings for definition import
currently only used by the HTTPS mechanism but can be used by
any other.
2021-08-17 20:42:53 +03:00
Michael Klishin f3a5235408
Refactor definition import to allow for arbitrary sources
The classic local filesystem source is still supported
using the same traditional configuration key, load_definitions.

Configuration schema follows peer discovery in spirit:

 * definitions.import_backend configures the mechanism to use,
   which can be a module provided by a plugin
 * definitions.* keys can be defined by plugins and contain any
   keys a specific mechanism needs

For example, the classic local filesystem source can now be
configured like this:

``` ini
definitions.import_backend = local_filesystem
definitions.local.path = /path/to/definitions.d/definition.json
```

An HTTPS source can be configured similarly:

``` ini
definitions.import_backend = https
definitions.https.url = https://hostname/path/to/definitions.json
```

HTTPS may require additional configuration keys related to TLS/x.509
peer verification. Such extra keys will be added as the need for them
becomes evident.

References #3249
2021-08-14 14:53:45 +03:00
Loïc Hoguin 24c25ab3cc
Add tests for the regression introduced in #3041 2021-08-11 12:50:04 +02:00
Jean-Sébastien Pédron 6c8cf4c510
Logging: Fix crash when Epoch-based timestamps are used with JSON
The code was passing a number (the timestamp) to
unicode:characters_to_binary/1 which expects an iolist to convert to
UTF-8.

We now verify if we have a number before calling that function. If this
is a number (integer or float), we keep it as is because JSON supports
that type.
2021-08-10 12:34:11 +02:00
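The check described in 6c8cf4c510 can be sketched as follows. The real fix lives in the Erlang logging code; this version only mirrors the idea, and the function name is an assumption.

```python
import json


def json_safe(value):
    """Prepare a log field value for JSON encoding.

    Numbers are kept as they are because JSON supports them natively;
    anything else is converted to text.
    """
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)
```

An Epoch-based timestamp such as `1628591651` therefore stays a JSON number, e.g. `json.dumps({"time": json_safe(1628591651)})`, while a formatted timestamp is emitted as a string.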
Michael Klishin 2efc3d22fa
Merge pull request #3176 from rabbitmq/stream-error-handling
Better error handling for streams
2021-07-27 22:25:06 +03:00