[Why]
The previous implementation assumed that ERTS was in the Erlang root
dir. This is true for a standard Erlang install. However, it's not the
case for an Erlang release that does not include the runtime.
[How]
The bin directory (`bindir`) is already available as an argument to
init. We can simply query it and use the value as is.
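As an illustration, a minimal sketch of reading `bindir` through the
standard init API (the actual call site in the change may differ):
```
%% -bindir is passed to the emulator at boot; init:get_argument/1
%% returns its value at runtime.
{ok, [[BinDir]]} = init:get_argument(bindir),
ErlPath = filename:join(BinDir, "erl").
```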
... instead of the `dist` target.
[Why]
We already do that when building tests. Thus it is more consistent to do
the same.
Also, it makes sense to ensure everything is ready before the `dist`
step. For instance, an Erlang release would not depend on the `dist`
target, just the build and it would still need the CLI to be ready.
as selective receives are efficient in OTP 26:
```
OTP-18431  Application(s): compiler, stdlib
           Related Id(s): PR-6739

Improved the selective receive optimization, which can now be enabled
for references returned from other functions. This greatly improves the
performance of gen_server:send_request/3, gen_server:wait_response/2,
and similar functions.
```
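For illustration, a sketch of the call pattern this optimization speeds
up (the helper is assumed, not RabbitMQ code):
```
%% The reply is awaited via a selective receive on the request
%% reference; OTP 26 optimises this receive even though the reference
%% was created inside gen_server:send_request/2.
get_status(ServerRef) ->
    ReqId = gen_server:send_request(ServerRef, get_status),
    case gen_server:wait_response(ReqId, 5000) of
        {reply, Reply} -> Reply;
        timeout -> erlang:error(timeout);
        {error, {Reason, _ServerRef}} -> erlang:error(Reason)
    end.
```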
By default, hibernate the SSL connection after 6 seconds, reducing its memory footprint.
This reduces memory usage of RabbitMQ by multiple GBs with thousands of
idle SSL connections.
This commit chooses a default value of 6 seconds because that hibernate_after
value is currently hard coded for rabbit_writer.
rabbit_mqtt_reader uses 1 second, rabbit_channel uses 1 - 10 seconds.
This value can be overridden via advanced.config, similar to:
```
[{rabbit, [
    {ssl_options, [
        {hibernate_after, 30000},
        {keyfile, "/etc/.../server_key.pem"},
        {certfile, "/etc/.../server_certificate.pem"},
        {cacertfile, "/etc/.../ca_certificate.pem"},
        {verify, verify_none}
    ]}
]}].
```
See https://www.erlang.org/doc/man/ssl.html#type-hibernate_after
```
When an integer-value is specified, TLS/DTLS-connection goes into
hibernation after the specified number of milliseconds of inactivity,
thus reducing its memory footprint. When undefined is specified
(this is the default), the process never goes into hibernation.
```
Relates
https://github.com/rabbitmq/rabbitmq-server/discussions/5346
https://groups.google.com/g/rabbitmq-users/c/be8qtkkpg5s/m/dHUa-Lh2DwAJ

Fixes #9371
Since each AMQP 1.0 connection opens several direct AMQP connections, we
must assign each direct connection a unique name to prevent multiple
entries in the `connection_created_stats` table.
Also, use `pg_local` to track AMQP 1.0 connections instead of walking
the supervisor trees.
Nuke authz_backends from connection created event 💥
Fix regex for connection name because UniqueId is part of it now (channel number)
messaging components which use flow control when calling credit_flow:block/1,
credit_flow:unblock/1 and/or via credit_flow:peer_down/1 and handle_bump_msg/1.
This can crash all messaging components (connections, channels, queues, msg-store)
when trace is enabled and flow is set or cleared (amqp-1.0 included).
Erlang 27 erlc requires us to match on both but earlier versions
complain that one head of the function is unreachable because the other
one always matches.
So we avoid pattern matching entirely.
Without this change, our tests against OTP master started failing since
the negative zero change was introduced, for example:
https://github.com/rabbitmq/rabbitmq-server/actions/runs/6141080013/job/16660857916
The warning (treated as error) is:
```
rabbit_numerical.erl:33:8: matching on the float 0.0 will no longer also match -0.0 in OTP 27. If you specifically intend to match 0.0 alone, write +0.0 instead.
```
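One version-agnostic way to avoid the float pattern match, sketched
here (not necessarily the exact fix applied):
```
%% `==` compares numerically, so this guard accepts both +0.0 and -0.0
%% on every OTP release, without matching on a float literal.
is_zero(F) when is_float(F), F == 0.0 -> true;
is_zero(F) when is_float(F) -> false.
```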
Why:
The default rabbit.collect_statistics_interval is 5,000 ms which
collides with the hibernate timeout of 5,000 ms.
It happens that the writer process hibernates and is woken up just a few
microseconds later to emit stats. This is wasteful because hibernation
incurs costly garbage collections.
Inserting some logs before the writer emits stats and before the writer
hibernates shows the following (when consuming from a queue and sending a
single message to that queue via the Management UI):
```
2023-09-12 09:46:22.820514+02:00 [info] <0.784.0> rabbit_writer:216 hibernating...
2023-09-12 09:46:22.820779+02:00 [info] <0.784.0> rabbit_writer:274 emitting stats...
2023-09-12 09:46:27.821451+02:00 [info] <0.784.0> rabbit_writer:216 hibernating...
```
How:
Change hibernate timeout from 5s to 6s.
[Why]
For Mnesia, we don't really care. However, for the upcoming use of
Khepri, we need this because the cluster will require at least a quorum
of nodes to be started before it becomes available.
[How]
We simply rely on the shell's job management and wait for them to
complete.
We do the same for `stop-{brokers,cluster}` mostly for consistency's
sake.
It also makes starting nodes slightly faster.
This PR implements an approach for a "protocol (data format) agnostic core" where the format of the message isn't converted at point of reception.
Currently all non-AMQP 0.9.1 originating messages are converted into an AMQP 0.9.1 flavoured basic_message record before being sent to a queue. If the messages are then consumed by the originating protocol they are converted back from AMQP 0.9.1. For some protocols such as MQTT 3.1 this isn't too expensive as MQTT is mostly a fairly easily mapped subset of AMQP 0.9.1, but for others such as AMQP 1.0 the conversions are awkward and in some cases lossy, even if consuming from the originating protocol.
This PR instead wraps all incoming messages in their originating form into a generic, extensible message container type (mc). The container module exposes an API to get common message details such as size and various properties (ttl, priority etc) directly from the source data type. Each protocol needs to implement the mc behaviour such that when a message originating from one protocol is consumed by another protocol we convert it to the target protocol at that point.
The message container also contains annotations, dead letter records and other metadata we need to record during the lifetime of a message. The original protocol message is never modified unless it is consumed.
This includes conversion modules to and from amqp, amqpl (AMQP 0.9.1) and mqtt.
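To give a flavour of the idea, a hypothetical sketch of a protocol
module in the spirit of the mc behaviour; the callback names and
signatures here are illustrative assumptions, not the actual mc API:
```
-module(mc_myproto).
-export([size/1, property/2, convert_to/2]).

%% answer queries directly from the source data type, no up-front conversion
size(#{payload := Payload}) ->
    byte_size(Payload).

property(ttl, #{ttl := Ttl}) -> Ttl;
property(_Key, _State) -> undefined.

%% convert only at the point a message is consumed by another protocol
convert_to(mc_amqpl, #{payload := Payload}) ->
    {amqpl_message, Payload}. % placeholder for the real conversion
```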
COMMIT HISTORY:
* Refactor away from using the delivery{} record
In many places including exchange types. This should make it
easier to move towards using a message container type instead of
basic_message.
Add mc module and move direct replies outside of exchange
Lots of changes incl classic queues
Implement stream support incl amqp conversions
simplify mc state record
move mc.erl
mc dlx stuff
recent history exchange
Make tracking work
But doesn't take a protocol agnostic approach as we just convert
everything into AMQP legacy and back. Might be good enough for now.
Tracing as a whole may want a bit of a re-vamp at some point.
tidy
make quorum queue peek work by legacy conversion
dead lettering fixes
dead lettering fixes
CMQ fixes
rabbit_trace type fixes
fixes
fix
Fix classic queue props
test assertion fix
feature flag and backwards compat
Enable message_container feature flag in some SUITEs
Dialyzer fixes
fixes
fix
test fixes
Various
Manually update a gazelle generated file
until a gazelle enhancement can be made
https://github.com/rabbitmq/rules_erlang/issues/185
Add message_containers_SUITE to bazel
and regen bazel files with gazelle from rules_erlang@main
Simplify essential property access
Such as durable, ttl and priority by extracting them into annotations
at message container init time.
Move type
to remove dependency on amqp10 stuff in mc.erl
mostly because I don't know how to make bazel do the right thing
add more stuff
Refine routing header stuff
wip
Cosmetics
Do not use "maybe" as type name as "maybe" is a keyword since OTP 25
which makes Erlang LS complain.
* Dedup death queue names
* Fix function clause crashes
Fix failing tests in the MQTT shared_SUITE:
A classic queue message ID can be undefined as set in
fbe79ff47b/deps/rabbit/src/rabbit_classic_queue_index_v2.erl (L1048)
Fix failing tests in the MQTT shared_SUITE-mixed:
When feature flag message_containers is disabled, the
message is not an #mc{} record, but a #basic_message{} record.
* Fix is_utf8_no_null crash
Prior to this commit, the function crashed if invalid UTF-8 was
provided, e.g.:
```
1> rabbit_misc:is_valid_shortstr(<<"😇"/utf16>>).
** exception error: no function clause matching rabbit_misc:is_utf8_no_null(<<216,61,222,7>>) (rabbit_misc.erl, line 1481)
```
* Implement mqtt mc behaviour
For now via amqp translation.
This is still work in progress, but the following SUITEs pass:
```
make -C deps/rabbitmq_mqtt ct-shared t=[mqtt,v5,cluster_size_1] FULL=1
make -C deps/rabbitmq_mqtt ct-v5 t=[mqtt,cluster_size_1] FULL=1
```
* Shorten mc file names
Module name length matters because for each persistent message the #mc{}
record is persisted to disk.
```
1> iolist_size(term_to_iovec({mc, rabbit_mc_amqp_legacy})).
30
2> iolist_size(term_to_iovec({mc, mc_amqpl})).
17
```
This commit renames the mc modules:
```
ag -l rabbit_mc_amqp_legacy | xargs sed -i 's/rabbit_mc_amqp_legacy/mc_amqpl/g'
ag -l rabbit_mc_amqp | xargs sed -i 's/rabbit_mc_amqp/mc_amqp/g'
ag -l rabbit_mqtt_mc | xargs sed -i 's/rabbit_mqtt_mc/mc_mqtt/g'
```
* mc: make deaths an annotation + fixes
* Fix mc_mqtt protocol_state callback
* Fix test will_delay_node_restart
```
make -C deps/rabbitmq_mqtt ct-v5 t=[mqtt,cluster_size_3]:will_delay_node_restart FULL=1
```
* Bazel run gazelle
* mix format rabbitmqctl.ex
* Ensure ttl annotation is reflected in amqp legacy protocol state
* Fix id access in message store
* Fix rabbit_message_interceptor_SUITE
* dialyzer fixes
* Fix rabbit:rabbit_message_interceptor_SUITE-mixed
set_annotation/3 should not result in duplicate keys
* Fix MQTT shared_SUITE-mixed
Up to 3.12 non-MQTT publishes were always QoS 1 regardless of delivery_mode.
75a953ce28/deps/rabbitmq_mqtt/src/rabbit_mqtt_processor.erl (L2075-L2076)
From now on, non-MQTT publishes are QoS 1 if durable.
This makes more sense.
The MQTT plugin must send a #basic_message{} to an old node that does
not understand message containers.
* Field content of 'v1_0.data' can be binary
Fix
```
bazel test //deps/rabbitmq_mqtt:shared_SUITE-mixed \
--test_env FOCUS="-group [mqtt,v4,cluster_size_1] -case trace" \
-t- --test_sharding_strategy=disabled
```
* Remove route/2 and implement route/3 for all exchange types.
This removes the route/2 callback from rabbit_exchange_type and
makes route/3 mandatory instead. This is a breaking change and
will require all implementations of exchange types to update their
code, however this is necessary anyway for them to correctly handle
the mc type.
stream filtering fixes
* Translate directly from MQTT to AMQP 0.9.1
* handle undecoded properties in mc_compat
amqpl: put clause in right order
recover death details from amqp data
* Replace callback init_amqp with convert_from
* Fix return value of lists:keyfind/3
* Translate directly from AMQP 0.9.1 to MQTT
* Fix MQTT payload size
MQTT payload can be a list when converted from AMQP 0.9.1 for example
First conversions tests
Plus some other conversion related fixes.
bazel
bazel
translate amqp 1.0 null to undefined
mc: property/2 and correlation_id/message_id return type tagged values.
To ensure we can support a variety of types better.
The type tags are AMQP 1.0 flavoured.
fix death recovery
mc_mqtt: impl new api
Add callbacks to allow protocols to compact data before storage
And make it readable if needing to query things repeatedly.
bazel fix
* more decoding
* tracking mixed versions compat
* mc: flip default of `durable` annotation to save some data.
Assuming most messages are durable and that in-memory messages suffer
less from persistence overhead, it makes sense for a non-existent
`durable` annotation to mean durable=true.
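A minimal sketch of that default (the accessor is illustrative):
```
%% Absence of the `durable` annotation means durable = true, so the
%% common case stores nothing for it.
is_durable(Annotations) ->
    maps:get(durable, Annotations, true).
```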
* mc conversion tests and tidy up
* mc make x_header unstrict again
* amqpl: death record fixes
* bazel
* amqp -> amqpl conversion test
* Fix crash in mc_amqp:size/1
Body can be a single amqp-value section (instead of
being a list) as shown by test
```
make -C deps/rabbitmq_amqp1_0/ ct-system t=java
```
on branch native-amqp.
* Fix crash in lists:flatten/1
Data can be a single amqp-value section (instead of
being a list) as shown by test
```
make -C deps/rabbitmq_amqp1_0 ct-system t=dotnet:roundtrip_to_amqp_091
```
on branch native-amqp.
* Fix crash in rabbit_writer
Running test
```
make -C deps/rabbitmq_amqp1_0 ct-system t=dotnet:roundtrip_to_amqp_091
```
on branch native-amqp resulted in the following crash:
```
crasher:
initial call: rabbit_writer:enter_mainloop/2
pid: <0.711.0>
registered_name: []
exception error: bad argument
in function size/1
called as size([<<0>>,<<"Sw">>,[<<160,2>>,<<"hi">>]])
*** argument 1: not tuple or binary
in call from rabbit_binary_generator:build_content_frames/7 (rabbit_binary_generator.erl, line 89)
in call from rabbit_binary_generator:build_simple_content_frames/4 (rabbit_binary_generator.erl, line 61)
in call from rabbit_writer:assemble_frames/5 (rabbit_writer.erl, line 334)
in call from rabbit_writer:internal_send_command_async/3 (rabbit_writer.erl, line 365)
in call from rabbit_writer:handle_message/2 (rabbit_writer.erl, line 265)
in call from rabbit_writer:handle_message/3 (rabbit_writer.erl, line 232)
in call from rabbit_writer:mainloop1/2 (rabbit_writer.erl, line 223)
```
because #content.payload_fragments_rev is currently supposed to
be a flat list of binaries instead of being an iolist.
This commit fixes this crash inefficiently by calling
iolist_to_binary/1. A better solution would be to allow AMQP legacy's #content.payload_fragments_rev
to be an iolist.
* Add accidentally deleted line back
* mc: optimise mc_amqp internal format
By removing the outer records for message and delivery annotations
as well as application properties and footers.
* mc: optimise mc_amqp map_add by using upsert
* mc: refactoring and bug fixes
* mc_SUITE routing header assertions
* mc remove serialize/1 callback as only used by amqp
* mc_amqp: avoid returning a nested list from protocol_state
* test and bug fix
* move infer_type to mc_util
* mc fixes and additional assertions
* Support headers exchange routing for MQTT messages
When a headers exchange is bound to the MQTT topic exchange, routing
will be performed based on both MQTT topic (by the topic exchange) and
MQTT User Property (by the headers exchange).
This combines the best worlds of both MQTT 5.0 and AMQP 0.9.1 and
enables powerful routing topologies.
When the User Property contains the same name multiple times, only the
last name (and value) will be considered by the headers exchange.
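A sketch of last-occurrence-wins de-duplication (an illustrative
helper, not the actual conversion code):
```
%% maps:from_list/1 keeps the right-most value for duplicate keys.
dedup_user_property(Pairs) ->
    maps:to_list(maps:from_list(Pairs)).
```
For example, `[{<<"k">>, <<"1">>}, {<<"k">>, <<"2">>}]` becomes
`[{<<"k">>, <<"2">>}]`.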
* Fix crash when sending from stream to amqpl
When publishing a message via the stream protocol and consuming it via
AMQP 0.9.1, the following crash occurred prior to this commit:
```
crasher:
initial call: rabbit_channel:init/1
pid: <0.818.0>
registered_name: []
exception exit: {{badmatch,undefined},
[{rabbit_channel,handle_deliver0,4,
[{file,"rabbit_channel.erl"},
{line,2728}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1594}]},
{rabbit_channel,handle_cast,2,
[{file,"rabbit_channel.erl"},
{line,728}]},
{gen_server2,handle_msg,2,
[{file,"gen_server2.erl"},{line,1056}]},
{proc_lib,wake_up,3,
[{file,"proc_lib.erl"},{line,251}]}]}
```
This commit first gives `mc:init/3` the chance to set exchange and
routing_keys annotations.
If not set, `rabbit_stream_queue` will set these annotations assuming
the message was originally published via the stream protocol.
* Support consistent hash exchange routing for MQTT 5.0
When a consistent hash exchange is bound to the MQTT topic exchange,
MQTT 5.0 messages can be routed to queues consistently based on the
Correlation-Data in the PUBLISH packet.
* Convert MQTT 5.0 User Property
* to AMQP 0.9.1 headers
* from AMQP 0.9.1 headers
* to AMQP 1.0 application properties and message annotations
* from AMQP 1.0 application properties and message annotations
* Make use of Annotations in mc_mqtt:protocol_state/2
mc_mqtt:protocol_state/2 includes Annotations as parameter.
It's cleaner to make use of these Annotations when computing the
protocol state instead of relying on the caller (rabbitmq_mqtt_processor)
to compute the protocol state.
* Enforce AMQP 0.9.1 field name length limit
The AMQP 0.9.1 spec prohibits field names longer than 128 characters.
Therefore, when converting AMQP 1.0 message annotations, application
properties or MQTT 5.0 User Property to AMQP 0.9.1 headers, drop any
names longer than 128 characters.
* Fix type specs
Apply feedback from Michael Davis
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
* Add mc_mqtt unit test suite
Implement mc_mqtt:x_header/2
* Translate indicator that payload is UTF-8 encoded
when converting between MQTT 5.0 and AMQP 1.0
* Translate single amqp-value section from AMQP 1.0 to MQTT
Convert to a text representation, if possible, and indicate to MQTT
client that the payload is UTF-8 encoded. This way, the MQTT client will
be able to parse the payload.
If conversion to text representation is not possible, encode the payload
using the AMQP 1.0 type system and indicate the encoding via Content-Type
message/vnd.rabbitmq.amqp.
This Content-Type is not registered.
Type "message" makes sense since it's a message.
Vendor tree "vnd.rabbitmq.amqp" makes sense since merely subtype "amqp" is not
registered.
* Fix payload conversion
* Translate Response Topic between MQTT and AMQP
Translate MQTT 5.0 Response Topic to AMQP 1.0 reply-to address and vice
versa.
The Response Topic must be a UTF-8 encoded string.
This commit re-uses the already defined RabbitMQ target addresses:
```
"/topic/" RK Publish to amq.topic with routing key RK
"/exchange/" X "/" RK Publish to exchange X with routing key RK
```
By default, the MQTT topic exchange is configured to be amq.topic using
the 1st target address.
When an operator modifies the mqtt.exchange, the 2nd target address is
used.
* Apply PR feedback
and fix formatting
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
* tidy up
* Add MQTT message_containers test
* consistent hash exchange: avoid amqp legacy conversion
When hashing on a header value.
* Avoid converting to amqp legacy when using exchange federation
* Fix test flake
* test and dialyzer fixes
* dialyzer fix
* Add MQTT protocol interoperability tests
Test receiving from and sending to MQTT 5.0 and
* AMQP 0.9.1
* AMQP 1.0
* STOMP
* Streams
* Regenerate portions of deps/rabbit/app.bzl with gazelle
I'm not exactly sure how this happened, but gazelle seems to have been
run with an older version of the rules_erlang gazelle extension at
some point. This caused generation of a structure that is no longer
used. This commit updates the structure to the current pattern.
* mc: refactoring
* mc_amqpl: handle delivery annotations
Just in case they are included.
Also use iolist_to_iovec to create flat list of binaries when
converting from amqp with amqp encoded payload.
---------
Co-authored-by: David Ansari <david.ansari@gmx.de>
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
Co-authored-by: Rin Kuryloski <kuryloskip@vmware.com>
Due to how the `echo` statement adds a space character, the output
marker was not found while parsing the output of
`rabbitmq-env-conf.bat`. This, in turn, resulted in _none_ of the
environment variables defined there being re-exported.
In addition, the parsing of the output did not succeed as the default is
to expect `sh`-style output.
It is rare for users to use this file on Windows, which is why this
behavior has gone unnoticed for years.
Related to:
* https://pivotal-esc.atlassian.net/browse/VESC-1077
* https://vmware.slack.com/archives/C0RDGG81Z/p1689274725888439
* https://github.com/rabbitmq/rabbitmq-server/pull/5486 (tangentially related)
Add some simple win32 parsing tests
Otherwise, if the FHC runs into an exception because of an ETS
write failure, most code paths begin failing.
This may help with the condition in #8784.
[Why]
Peer discovery is not Mnesia-specific and will be used once we introduce
Khepri.
[How]
The whole peer discovery driving code is moved from `rabbit_mnesia` to
`rabbit_peer_discovery`. When `rabbit_mnesia` calls that code, it simply
passes a callback for the Mnesia-specific cluster expansion code.
[Why]
Up to RabbitMQ 3.7.x, versions were compared to determine if two nodes
were compatible. In 3.8.0, feature flags were introduced to fulfill this
check.
The current branch of RabbitMQ is no longer compatible with 3.7.x
because several feature flags are now required. This means the user must
upgrade to an intermediate version first.
Therefore, we don't need to keep comparing versions.
[How]
The check is now simply calling the feature flags subsystem check.
We still keep versions comparison for plugins. However, we can get rid
of the special case for RabbitMQ 3.6.6.
Currently it is not possible for an AMQP 1.0 client to find
out that a message is unroutable, as the message is settled with the
accepted state. This is because the current implementation
relies only on the publish confirms AMQP 091 extension.
This commit changes this by settling the message with the
released state. It uses the mandatory flag mechanism from
AMQP 091 and "internally" extends it to provide not only
the message in the callback, but the publishing sequence
as well. This applies only for AMQP 1.0, not for other
cases. Publish confirms and the mandatory flag are used
in conjunction in this case.
References #7823
Allow rabbitmq_exchange:route_return() to return infos in addition to
the matched binding keys.
For example, some exchange types read the full #amqqueue{} record from the database
and can in future return it from the route/3 function to avoid reading the full
record again in the channel.
For MQTT 5.0 destination queues, the topic exchange does not only have
to return the destination queue names, but also the matched binding
keys.
This is needed to implement MQTT 5.0 subscription options No Local,
Retain As Published and Subscription Identifiers.
Prior to this commit, as the trie was walked down, we remembered the
edges being walked and assembled the final binding key with
list_to_binary/1.
list_to_binary/1 is very expensive with long lists (long topic names),
even in OTP 26.
The CPU flame graph showed ~3% of CPU usage was spent only in
list_to_binary/1.
Unfortunately and unnecessarily, the current topic exchange
implementation stores topic levels as lists.
It would be better to store topic levels as binaries:
split_topic_key/1 should ideally use binary:split/3 similar as follows:
```
1> P = binary:compile_pattern(<<".">>).
{bm,#Ref<0.1273071188.1488322568.63736>}
2> Bin = <<"aaa.bbb..ccc">>.
<<"aaa.bbb..ccc">>
3> binary:split(Bin, P, [global]).
[<<"aaa">>,<<"bbb">>,<<>>,<<"ccc">>]
```
The compiled pattern could be placed into persistent term.
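For example, a sketch of caching the compiled pattern (the key name is
an assumption):
```
split_topic(Topic) ->
    Pattern = case persistent_term:get(topic_dot_pattern, undefined) of
                  undefined ->
                      P = binary:compile_pattern(<<".">>),
                      persistent_term:put(topic_dot_pattern, P),
                      P;
                  P ->
                      P
              end,
    binary:split(Topic, Pattern, [global]).
```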
This commit decided to avoid migrating Mnesia tables to use binaries
instead of lists. Mnesia migrations are non-trivial, especially with the
current feature flag subsystem.
Furthermore the Mnesia topic tables are already getting migrated to
their Khepri counterparts in 3.13.
Adding additional migration only for Mnesia does not make sense.
So, instead of assembling the binding key as we walk down the trie and
then calling list_to_binary/1 in the leaf, it
would be better to just fetch the binding key from the database in the leaf.
As we reach the leaf of the trie, we know both source and destination.
Unfortunately, we cannot fetch the binding key efficiently with the
current rabbit_route (sorted by source exchange) and
rabbit_reverse_route (sorted by destination) tables as the key is in
the middle between source and destination.
If there are a huge number of bindings for a given source exchange (very
realistic in MQTT use cases) or a large number of bindings for a given
destination (also realistic), it would require scanning these large
number of bindings.
Therefore this commit takes the simplest possible solution:
The solution leverages the fact that binding arguments are already part of
table rabbit_topic_trie_binding.
So, if we simply include the binding key into the binding arguments, we
can fetch and return it efficiently in the topic exchange
implementation.
The following patch, which omits fetching the empty-list binding argument
(the default), makes routing slower because the function
`analyze_pattern.constprop.0` requires significantly more (~2.5%) CPU time:
```
@@ -273,7 +273,11 @@ trie_bindings(X, Node) ->
node_id = Node,
destination = '$1',
arguments = '$2'}},
- mnesia:select(?MNESIA_BINDING_TABLE, [{MatchHead, [], [{{'$1', '$2'}}]}]).
+ mnesia:select(
+ ?MNESIA_BINDING_TABLE,
+ [{MatchHead, [{'andalso', {'is_list', '$2'}, {'=/=', '$2', []}}], [{{'$1', '$2'}}]},
+ {MatchHead, [], ['$1']}
+ ]).
```
Hence, this commit always fetches the binding arguments.
All MQTT 5.0 destination queues will create a binding that
contains the binding key in the binding arguments.
Not only does this solution avoid expensive list_to_binary/1 calls, but
it also means that Erlang app rabbit (specifically the topic exchange
implementation) does not need to be aware of MQTT anymore:
It just returns the binding key when the binding args tell it to do so.
In future, once the Khepri migration completed, we should be able to
relatively simply remove the binding key from the binding arguments
again to free up some storage space.
Note that one of the advantages of a trie data structure is its space
efficiency: you don't have to store the same prefixes multiple
times.
However, for RabbitMQ the binding key is already stored at least N times
in various routing tables, so storing it a few times more via the
binding arguments should be acceptable.
The speed improvements are favoured over a few more MBs ETS usage.
Previously, the Will Message could be kept in memory in the MQTT
connection process state. Upon termination, the Will Message is sent.
The new MQTT 5.0 feature Will Delay Interval requires storing the Will
Message outside of the MQTT connection process state.
The Will Message should not be stored node local because the client
could reconnect to a different node.
Storing the Will Message in Mnesia is not an option because we want to
get rid of Mnesia. Storing the Will Message in a Ra cluster or in Khepri
is only an option if the Will Payload is small as there is currently no
way in Ra to **efficiently** snapshot large binary data (Note that these
Will Messages are not consumed in a FIFO style workload like messages in
quorum queues. A Will Message needs to be stored for as long as the
Session lasts - up to 1 day by default, but could also be much longer if
RabbitMQ is configured with a higher maximum session expiry interval.)
Usually Will Payloads are small: they are just a notification that the
client's MQTT session ended abnormally. However, we don't know how users leverage
the Will Message feature. The MQTT protocol allows for large Will Payloads.
Therefore, the solution implemented in this commit - which should work
well enough - is storing the Will Message in a queue.
Each MQTT session which has a Session Expiry Interval and a Will Delay
Interval of > 0 seconds will create a queue when the current Network
Connection ends, and store its Will Message there. The Will Message has a
message TTL set (corresponds to the Will Delay Interval) and the queue
has a queue TTL set (corresponds to the Session Expiry Interval).
If the client does not reconnect within the Will Delay Interval, the
message is dead lettered to the configured MQTT topic exchange
(amq.topic by default).
The Will Delay Interval can be set by both publishers and subscribers.
Therefore, the Will Message is the 1st session state that RabbitMQ keeps
for publish-only MQTT clients.
One current limitation of this commit is that a Will Message that is
delayed (i.e. Will Delay Interval is set) and retained (i.e. Will Retain
flag set) will not be retained.
One solution to retain delayed Will Messages is that the retainer
process consumes from a queue and the queue binds to the topic exchange
with a topic starting with `$`, for example `$retain/#`.
The AMQP 0.9.1 Will Message that is dead lettered could then get a
CC header added such that it is published not only with the Will Topic,
but also with the `$retain` topic. For example, if the Will Topic is `a/b`,
it will publish with routing key `a/b` and CC header `$retain/a/b`.
The reason this is not implemented in this commit is that to keep the
currently broken retained message store behaviour, we would require
creating at least one queue per node and publishing only to that local
queue. In future, once we have a replicated retained message store based
on a Stream for example, we could just publish all retained messages to
the `$retain` topic and therefore into the Stream.
So, for now, we list "retained and delayed Will Messages" as a limitation
that they actually won't be retained.
A recent change in OTP made the unit_SUITE fail with a case clause.
cdd7200cbe
Reverting this old commit seems to fix the problem for OTP master
without breaking OTP25/26.
Co-authored-by: @lhoguin
This commit replaces file combining with single-file compaction
where data is moved near the beginning of the file before
updating the index entries. The file is then truncated when
all existing readers are gone. This allows removing the lock
that existed before and enables reading multiple messages at
once from the shared files. This also helps us avoid many
ets operations and simplify the code greatly.
This commit still has some issues: reading a single message
is currently slow due to the removal of FHC in the client
code. This will be resolved by implementing read buffering
in a similar way as FHC but without keeping files open
more than necessary.
The dirty recovery code also likely has a number of issues
because of the compaction changes.
See #7869. #7875 resulted in elixir apps (besides the cli) in the deps
dir. This triggered dormant makefile logic to compile such deps. It
turns out that it's unnecessary to pre-compile them, given the cli's
mix.exs file.
Follow up of https://github.com/rabbitmq/rabbitmq-server/pull/7913
and https://github.com/rabbitmq/rabbitmq-server/pull/7921
This commit uses the approach explained in https://github.com/erlang/otp/issues/7130#issuecomment-1512808759
We cannot use the macro `?OTP_RELEASE` since macros are evaluated at
compile time. RabbitMQ can be compiled with OTP 25 and executed with OTP
26. Therefore, we use `erlang:system_info(otp_release)` instead.
As `erlang:system_info/1` might be costly, we store the send function in
persistent_term.
For OTP 25, we use the "old tcp send workaround" (i.e.
`erlang:port_command/2`) which avoids expensive selective receives.
For OTP 26, we use `gen_tcp:send/2` which uses the optimised selective
receive.
Once the minimum required version becomes OTP 26, we can just switch to
`gen_tcp:send/2` and delete the `inet_reply` handling code in the various
RabbitMQ reader and writer processes.
Note that `rabbit_net:port_command/2` is not only used by RabbitMQ server,
but also by the AMQP 0.9.1 client.
Therefore, instead of putting the OTP version (or send function) into
persistent_term within the rabbit app, we just do it the first time
`rabbit_net:port_command/2` is invoked.
(`rabbit_common` is just a library without supervision hierarchy.)
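A minimal sketch of that dispatch (the persistent_term key and the
fallback wrapper are illustrative assumptions):
```
send(Sock, Data) ->
    Send = case persistent_term:get({rabbit_net, send_fun}, undefined) of
               undefined ->
                   Fun = case erlang:system_info(otp_release) of
                             "25" -> fun port_send_workaround/2;
                             _ -> fun gen_tcp:send/2
                         end,
                   persistent_term:put({rabbit_net, send_fun}, Fun),
                   Fun;
               Fun ->
                   Fun
           end,
    Send(Sock, Data).

%% stand-in for the erlang:port_command/2 based path; the caller is
%% expected to handle the asynchronous {inet_reply, Sock, Status} message.
port_send_workaround(Sock, Data) ->
    true = erlang:port_command(Sock, Data),
    ok.
```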
Revert the change introduced in https://github.com/rabbitmq/rabbitmq-server/pull/7900
RabbitMQ 3.12 will require OTP 25 (not yet OTP 26).
The use of macro `OTP_RELEASE` was wrong because RabbitMQ can be
compiled with OTP 25, but run with OTP 26. Macros are evaluated at
compile time.
An alternative runtime equivalent would have been
```
1> erlang:system_info(otp_release).
"26"
```
Bazel build files are now maintained primarily with `bazel run
gazelle`. This will analyze and merge changes into the build files as
necessitated by certain code changes (e.g. the introduction of new
modules).
In some cases there are hints to gazelle in the build files, such as `#
gazelle:erlang...` or `# keep` comments. xref checks on plugins that
depend on the cli are a good example.
In OTP 25, gen_tcp:send/2 has poor performance when the Erlang mailbox
is large because the selective receive is not optimised.
https://erlang.org/download/otp_src_26.0-rc3.readme
```
OTP-18520  Application(s): erts
           Related Id(s): GH-6455

gen_tcp:send/*, gen_udp:send/* and gen_sctp:send/* have
been optimized to use the infamous receive reference
optimization, so now sending should not have bad
performance when the calling process has a large
message queue.
```
These functions extend the functionality of `erlang:is_process_alive/1`
to take into account the node a process is running on and its cluster
membership.
These functions are moved away from `rabbit_mnesia` because we don't
want `rabbit_mnesia` to be a central piece of RabbitMQ.
Classic-mirrored-queue-related modules continue to use `rabbit_mnesia`
functions, therefore relying on Mnesia, because they depend entirely on
Mnesia anyway. They will go away at the same time as our use of Mnesia.
So by keeping this code untouched, we avoid possible regressions.
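A hypothetical sketch of such a cluster-aware check (the names are
assumptions):
```
is_process_alive(Pid) ->
    Node = node(Pid),
    case Node =:= node() of
        true ->
            erlang:is_process_alive(Pid);
        false ->
            %% consider cluster membership before asking the remote node
            lists:member(Node, nodes()) andalso
                rpc:call(Node, erlang, is_process_alive, [Pid]) =:= true
    end.
```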
This is the latest commit in the series; it fixes (almost) all the
problems with missing and circular dependencies for typing.
The remaining unsolved problems are:
- `lg` dependency for `rabbit` - the problem is that it's the only
dependency that contains a NIF, and there is no way to make dialyzer
ignore it - it looks like the unknown check cannot be suppressed by
dialyzer directives. In the future, making `lg` a proper dependency
could be a good thing anyway.
- some missing elixir functions in `rabbitmq_cli` (CSV, JSON and
logging related).
- `eetcd` dependency for `rabbitmq_peer_discovery_etcd` - this one
uses sub-directories in `src/`, which confuses dialyzer (or our bazel
machinery is not able to properly handle it). I've tried the latest
rules_erlang which flattens directory for .beam files, but it wasn't
enough for dialyzer - it wasn't able to find core erlang files. This
is a niche plugin and an unusual dependency, so probably not worth
investigating further.
This commit is pure refactoring making the code base more maintainable.
Replace rabbit_misc:pipeline/3 with the new OTP 25 experimental maybe
expression because
"Frequent ways in which people work with sequences of failable
operations include folds over lists of functions, and abusing list
comprehensions. Both patterns have heavy weaknesses that makes them less
than ideal."
https://www.erlang.org/eeps/eep-0049#obsoleting-messy-patterns
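For illustration, a small failable pipeline written with `maybe` (the
helpers are hypothetical, not RabbitMQ code):
```
-module(maybe_example).
-feature(maybe_expr, enable). % experimental in OTP 25
-export([parse/1]).

parse(Packet) ->
    maybe
        {ok, Header} ?= parse_header(Packet),
        {ok, Payload} ?= parse_payload(Header, Packet),
        {ok, {Header, Payload}}
    else
        {error, Reason} -> {error, Reason}
    end.

parse_header(<<Len, Rest/binary>>) when byte_size(Rest) >= Len ->
    {ok, Len};
parse_header(_) ->
    {error, bad_header}.

parse_payload(Len, <<_, Rest/binary>>) ->
    {ok, binary:part(Rest, 0, Len)};
parse_payload(_, _) ->
    {error, bad_payload}.
```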
Additionally, this commit is more restrictive in the type spec of
rabbit_mqtt_processor state fields.
Specifically, many fields were defined to be `undefined | T` where
`undefined` was only temporarily until the first CONNECT packet was
processed by the processor.
It's better to initialise the MQTT processor upon first CONNECT packet
because there is no point in having a processor without having received
any packet.
This allows many type specs in the processor to change from `undefined |
T` to just `T`.
Additionally, memory is saved by removing the `received_connect_packet`
field from the `rabbit_mqtt_reader` and `rabbit_web_mqtt_handler`.
- Use the same base .plt everywhere, so there is no need to list
standard apps everywhere
- Fix typespecs: some typos and the use of not-exported types
* Add rabbitmq_cli dialyze to bazel
and fix a number of warnings
Because we stop mix from recompiling rabbit_common in bazel, many
unknown functions are reported, so this dialyzer analysis is somewhat
incomplete.
* Use erlang dialyzer for rabbitmq_cli rather than mix dialyzer
Since this resolves all of the rabbit functions, there are far fewer
unknown functions.
Requires yet to be released rules_erlang 3.9.2
* Temporarily use pre-release rules_erlang
So that checks can run on this PR without a release
* Fix additional dialyzer warnings in rabbitmq_cli
* rabbitmq_cli: mix format
* Additional fixes for ignored return values
* Revert "Temporarily use pre-release rules_erlang"
This reverts commit c16b5b6815.
* Use rules_erlang 3.9.2
The MQTT protocol specs define the term "MQTT Control Packet".
The MQTT specs never talk about "frame".
Let's reflect this naming in the source code since things get confusing
otherwise:
Packets belong to MQTT.
Frames belong to AMQP 0.9.1 or web sockets.
Prior to this commit, 1 MQTT publisher publishing to 1 Million target
classic queues requires around 680 MB of process memory.
After this commit, it requires around 290 MB of process memory.
This commit requires feature flag classic_queue_type_delivery_support
and introduces a new one called no_queue_name_in_classic_queue_client.
Instead of storing the binary queue name 4 times, this commit now stores
it only 1 time.
The monitor_registry is removed since only classic queue clients monitor
their classic queue server processes.
The classic queue client does not store the queue name anymore. Instead
the queue name is included in messages handled by the classic queue
client.
Storing the queue name in the record ctx was unnecessary.
More potential future memory optimisations:
* When routing to destination queues, looking up the queue record,
delivering to queue: Use streaming / batching instead of fetching all
at once
* Only fetch ETS columns that are necessary instead of whole queue
records
* Do not hold the same vhost binary in memory many times. Instead,
maintain a mapping.
* Remove unnecessary tuple fields.
"Each Client connecting to the Server has a unique ClientId"
"If the ClientId represents a Client already connected to
the Server then the Server MUST disconnect the existing
Client [MQTT-3.1.4-2]."
Instead of tracking client IDs via Raft, we use local ETS tables in this
commit.
Previous tracking of client IDs via Raft:
(+) consistency (does the right thing)
(-) state of Ra process becomes large > 1GB with many (> 1 Million) MQTT clients
(-) Ra process becomes a bottleneck when many MQTT clients (e.g. 300k)
disconnect at the same time because monitor (DOWN) Ra commands get
written resulting in Ra machine timeout.
(-) if we need consistency, we ideally want a single source of truth,
e.g. only Mnesia, or only Khepri (but not Mnesia + MQTT ra process)
While above downsides could be fixed (e.g. avoiding DOWN commands by
instead doing periodic cleanups of client ID entries using session interval
in MQTT 5 or using subscription_ttl parameter in current RabbitMQ MQTT config),
in this case we do not necessarily need the consistency guarantees Raft provides.
In this commit, we try to comply with [MQTT-3.1.4-2] on a best-effort
basis: If there are no network failures and no messages get lost,
existing clients with duplicate client IDs get disconnected.
In the presence of network failures / lost messages, two clients with
the same client ID can end up publishing or receiving from the same
queue. Arguably, that's acceptable and less worse than the scaling
issues we experience when we want stronger consistency.
Note that it is also the responsibility of the client to not connect
twice with the same client ID.
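An illustrative sketch of that best-effort check against a local ETS
table (the table name and message shape are assumptions):
```
register_client_id(ClientId) when is_binary(ClientId) ->
    case ets:lookup(mqtt_client_ids, ClientId) of
        [{ClientId, OldPid}] ->
            %% [MQTT-3.1.4-2]: ask the existing connection to close
            OldPid ! {duplicate_client_id, self()};
        [] ->
            ok
    end,
    true = ets:insert(mqtt_client_ids, {ClientId, self()}).
```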
This commit also ensures that the client ID is a binary to save memory.
A new feature flag is introduced, which when enabled, deletes the Ra
cluster named 'mqtt_node'.
Independent of that feature flag, client IDs are tracked locally in ETS
tables.
If that feature flag is disabled, client IDs are additionally tracked in
Ra.
The feature flag is required such that clients can continue to connect
to all nodes except for the node being updated in a rolling update.
This commit also fixes a bug where previously all MQTT connections were
cluster-wide closed when one RabbitMQ node was put into maintenance
mode.
This function returns the data directory where all subsystems should
store their files.
Historically, this was the Mnesia directory. But semantically, this
should be the reverse: RabbitMQ owns the data directory and Mnesia is
configured to put its files there too.
`rabbit_mnesia:dir/0` now calls `rabbit:data_dir/0`.
Other subsystems will be modified in a follow-up commit to call
`rabbit:data_dir/0` instead of `rabbit_mnesia:dir/0`.
The location and name of this directory remains the same for
compatibility reasons. Therefore, it still contains "mnesia" in its name.
However, semantically, we want this directory to be unrelated to Mnesia.
In the end, many subsystems write files and directories there, including
Mnesia, all Ra systems and in the future, Khepri.
This value is used internally by `rabbit_env` and usually not read by
RabbitMQ otherwise.
This patch prepares the rename of `mnesia_dir` to `data_dir`, in order
to no longer rely semantically on Mnesia configuration or usage to
locate data, whether it is stored in Mnesia or not.
Seems like we are not using it anywhere in our code base.
It's unlikely that it's used somewhere else and even if it is,
the API is backwards compatible - we just pass 0, as if the priority_queue
was empty.
That was done in PR #3865.
The changes introduced in #3865 can cause message arrival ordering guarantees
between two logical erlang processes (sending messages via delegate) to
be violated, as a message sent to a single destination can overtake a prior
message sent as part of a fan-out. This is due to the fact that the fan-out
takes a different route via the delegate process than the direct delivery that
bypasses it.
This commit only reverses it for the `invoke_no_result/2|3` API and leaves the
optimisation in for the synchronous `invoke/` API. This means that the message
send ordering you expect between erlang processes still can be violated when
mixing invoke and invoke_no_result invocations. As far as I can see there are
no places where the code relies on this and there are uses of invoke (mgmt db)
that very well could benefit from avoiding the additional copying.
This category should be unused with the decommissioning of the old
upgrade subsystem (in favor of the feature flags subsystem). It means:
1. The upgrade log file will not be created by default anymore.
2. The `$RABBITMQ_UPGRADE_LOG` environment variable is now unsupported.
The configuration variables remain to avoid breaking an existing and
working configuration.
Fix publish of libs to hex.pm
@lhoguin noticed that the hex packages for the amqp_client, amqp10_client and related projects do not currently work with erlang.mk. This PR fixes this issue.
Tested using this project: https://github.com/lukebakken/amqp-clients-test.git
For the following flags I see an improvement of
30k/s to 34k/s on my machine:
-x 1 -y 1 -A 1000 -q 1000 -c 1000 -s 1000 -f persistent
-u cqv2 --queue-args=x-queue-version=2
Discovered by @dumbbell
Ensure externally read strings are saved as utf-8 encoded binaries. This
is necessary since `cmd.exe` on Windows uses ISO-8859-1 encoding and
directories can have latin1 characters, like `RabbitMQ Sérvér`.
The `é` is represented by decimal `233` in the ISO-8859-1 encoding. The
unicode code point is the same decimal value, `233`, so you will see
this in the charlist data. However, when encoded using utf-8, this
becomes the two-byte sequence `C3 A9` (hexadecimal).
When reading strings from env variables and configuration, they will be
unicode charlists, with each list item representing a unicode code
point. All of Erlang's string functions can handle strings in this form.
Once these strings are written to ETS or Mnesia, they will be converted
to utf-8 encoded binaries. Prior to these changes just
`list_to_binary/1` was used.
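A sketch of the conversion described above:
```
%% list_to_binary/1 encodes each code point as a single latin1 byte,
%% while unicode:characters_to_binary/1 produces utf-8: for
%% "RabbitMQ Sérvér", each `é` (code point 233) becomes C3 A9.
to_utf8(String) when is_list(String) ->
    unicode:characters_to_binary(String).
```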
Fix xref error
re:replace requires an iodata, which is not a list of unicode code points
Correctly parse unicode vhost tags
Fix many format strings to account for utf8 input. Try again to fix unicode vhost tags
More format string fixes, try to get the CONFIG_FILE var correct
Be sure to use the `unicode` option for re:replace when necessary
More unicode format strings, add unicode option to re:split
More format strings updated
Change ~s to ~ts for vhost format strings
Change ~s to ~ts for more vhost format strings
Change ~s to ~ts for more vhost format strings
Add unicode format chars to disk monitor
Quote the directory on unix
Finally figure out the correct way to pass unicode to the port
Stop sending connection_stats from protocol readers to rabbit_event.
Stop sending queue_stats from queues to rabbit_event.
Sending these stats every 5 seconds to the event manager process is
superfluous because no one handles these events.
They seem to be a relic from before rabbit_core_metrics ETS tables got
introduced in 2016.
Delete test head_message_timestamp_statistics because it tests that
head_message_timestamp is set correctly in queue_stats events
although queue_stats events are used nowhere.
The functionality of head_message_timestamp itself is still tested in
deps/rabbit/test/priority_queue_SUITE.erl and
deps/rabbit/test/temp/head_message_timestamp_tests.py
in e.g. the `advanced.config` file, or manually at runtime.
This also adds tracing through use of `rabbit_event`, controllable by
a compile-time flag, e.g. TRACE_SUP2.
A couple users reported `badmatch` crashes due to scenarios where
`inet:peername/1` does not return the expected value, most likely due to
the port closing between the time they are listed and when
`inet:peername/1` is called.
Fixes #5496
Discussion in #5490
Thoas is more efficient both in terms of encoding
time and peak memory footprint.
In the process we have discovered an issue:
https://github.com/lpil/thoas/issues/15
Pair: @pjk25
This avoids printing the full stacktrace when the error comes from the
sysctl invocation, the error message itself is sufficient
In practice, when testing with bazel with macos, sysctl is blocked by
the sandbox, so logging the stacktrace is rather noisy for tests
This gen_statem-based process is responsible for handling concurrency
when feature flags are enabled and synchronized when a cluster is
expanded.
This clarifies and stabilizes the behavior of the feature flag subsystem
w.r.t. situations where e.g. a feature flag migration function takes
time to update data and a new node joins a cluster and synchronizes its
feature flag states with the cluster. There was a chance that the
feature flag was marked as enabled on the joining node, even though the
migration function didn't take care of that node.
With this new feature flags controller, enabling or synchronizing
feature flags blocks and delays any concurrent operations which try to
modify feature flags states too.
This change also clarifies where and when the migration function is
called: it is called at least once on each node that knows the feature
flag and when the state goes from "disabled" to "enabled" on that node.
Note that even if the feature flag is being enabled on a subset of the
nodes (because other nodes already have it enabled), it is marked as
"state_changing" everywhere during the migration. This is to prevent
that a node where it is enabled assumes it is enabled on all nodes that
know the feature flag.
There is a new feature as well: just after a feature flag is enabled,
the migration function is called a second time for any post-enable
actions. The feature flag is marked as enabled between these "enable"
and "post-enable" steps. The success or failure of this "post-enable"
run does not affect the state of the feature flag (i.e. it is ignored).
A new migration function API is introduced to allow more advanced
things. The new API is:
```
my_migration_function(
    #ffcommand{name = ...,
               props = ...,
               command = enable | post_enable,
               extra = #{...}})
```
The record is defined in `include/feature_flags.hrl`. Here is the
meaning of each field:
* `name` and `props` are the equivalent of the `FeatureName` and
`FeatureProps` arguments of the previous migration function API.
* `command` is basically the same as the previous `Arg` arguments.
* `extra` is a map containing context-specific information. For instance, it
contains the list of nodes where the feature flag state changes.
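A hypothetical callback using this API (the `extra` key and the body
are illustrative assumptions):
```
my_migration_function(#ffcommand{name = FeatureName,
                                 command = enable,
                                 extra = Extra}) ->
    Nodes = maps:get(nodes, Extra, []),
    logger:info("enabling ~ts on ~0p", [FeatureName, Nodes]),
    ok;
my_migration_function(#ffcommand{command = post_enable}) ->
    ok. % post-enable actions; success or failure is ignored
```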
This whole new behavior is behind a new feature flag called
`feature_flags_v2`. If a feature flag uses the new migration function
API, `feature_flags_v2` will be automatically enabled.
If many feature flags are enabled at once (like when a fresh RabbitMQ
node is started for the first time), `feature_flags_v2` will be enabled
first if it is in the list.