Commit Graph

1421 Commits

Author SHA1 Message Date
David Ansari 3dc799afb5 Fix test expectation
Matching against #{} does not validate that the map is empty.
2023-06-21 17:14:08 +01:00
Chunyi Lyu 08fd9d00e3 Set topic alias when publishing retained message
- remove topic alias from message props when storing retained msgs
- set topic alias for outbound before sending retained msgs
2023-06-21 17:14:08 +01:00
David Ansari 0e00e3479e CONNACK with Bad Authentication Method
RabbitMQ MQTT already supports authenticating clients via
username + password, OAuth tokens, and certificates.
We could make use of RabbitMQ SASL mechanisms in the future,
if needed. For now, if the client wants to perform extended
authentication, we return Bad Authentication Method in the CONNACK
packet.
2023-06-21 17:14:08 +01:00
David Ansari 3b65c97184 Add type alias for topic and client ID 2023-06-21 17:14:08 +01:00
Chunyi Lyu 12cdc69572 Test MQTT 5 with proxy protocol suite 2023-06-21 17:14:08 +01:00
Chunyi Lyu 1eac345f1c Run cluster suite against v5 enabled cluster 2023-06-21 17:14:08 +01:00
David Ansari 14d81b430f Add Topic Aliases from server to client
Once the server's Topic Alias cache for messages from server to client
is full, this commit does not replace any existing aliases.
So, the first topics "win" and stay in the cache forever.
This matches the behaviour of VerneMQ and EMQX.
For now that's good enough.
In the future, we can easily change that behaviour to some smarter strategy,
for example
1. Hash the TopicName to a Topic Alias and replace the old
   alias, or
2. For the Topic Alias Cache from server to client, keep 2 Maps:
   #{TopicName => TopicAlias} and #{TopicAlias => TopicName} and a
   counter that wraps to 1 once the Topic Alias Maximum is reached and
   just replace an existing Alias if the TopicName is not cached.
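The two-map strategy in option 2 can be sketched as follows. This is a hypothetical sketch, not RabbitMQ code: module name, function names and the cache shape are all illustrative.

```erlang
-module(alias_cache_sketch).
-export([new/1, lookup/2]).

%% Cache: two maps (TopicName => Alias and Alias => TopicName) plus a
%% counter that wraps to 1 once the Topic Alias Maximum is reached.
new(Max) when is_integer(Max), Max >= 1 ->
    #{max => Max, next => 1, t2a => #{}, a2t => #{}}.

%% Returns {Alias, NewCache}. If the topic is not cached, assign the
%% next alias, evicting whichever topic currently holds it.
lookup(Topic, #{t2a := T2A} = Cache) when is_map_key(Topic, T2A) ->
    {maps:get(Topic, T2A), Cache};
lookup(Topic, #{max := Max, next := Next, t2a := T2A0, a2t := A2T}) ->
    T2A = case A2T of
              #{Next := OldTopic} -> maps:remove(OldTopic, T2A0);
              _ -> T2A0
          end,
    Next1 = case Next of Max -> 1; _ -> Next + 1 end,
    {Next, #{max => Max, next => Next1,
             t2a => T2A#{Topic => Next},
             a2t => A2T#{Next => Topic}}}.
```

With a maximum of 2, the third distinct topic reuses alias 1 and evicts the first topic, unlike the first-topics-win behaviour of this commit.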

Also, refactor Topic Alias Maximum:
* Remove code duplication
* Allow an operator to prohibit Topic Aliases by allowing value 0 to be
  configured
* Change config name to topic_alias_maximum so that it matches exactly
  the MQTT feature name
* Fix wrong code formatting
* Add the invalid or unknown Topic Alias to the log message for easier
  troubleshooting
2023-06-21 17:14:08 +01:00
Chunyi Lyu fd52caa211 Support topic alias from client to broker 2023-06-21 17:14:08 +01:00
Chunyi Lyu 60f6784d30 Make Topic Alias Maximum configurable
- default to 20, configurable through cuttlefish config
- add test to v5 suite for invalid topic alias in publish
2023-06-21 17:14:08 +01:00
David Ansari bb20618b13 Return matched binding keys faster
For MQTT 5.0 destination queues, the topic exchange must return not
only the destination queue names, but also the matched binding keys.
This is needed to implement MQTT 5.0 subscription options No Local,
Retain As Published and Subscription Identifiers.

Prior to this commit, as the trie was walked down, we remembered the
edges being walked and assembled the final binding key with
list_to_binary/1.

list_to_binary/1 is very expensive with long lists (long topic names),
even in OTP 26.
The CPU flame graph showed ~3% of CPU usage was spent only in
list_to_binary/1.

Unfortunately and unnecessarily, the current topic exchange
implementation stores topic levels as lists.

It would be better to store topic levels as binaries:
split_topic_key/1 should ideally use binary:split/3, as follows:
```
1> P = binary:compile_pattern(<<".">>).
{bm,#Ref<0.1273071188.1488322568.63736>}
2> Bin = <<"aaa.bbb..ccc">>.
<<"aaa.bbb..ccc">>
3> binary:split(Bin, P, [global]).
[<<"aaa">>,<<"bbb">>,<<>>,<<"ccc">>]
```
The compiled pattern could be placed into persistent term.
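A minimal sketch of that idea (module name and persistent_term key are illustrative): compile the pattern once, cache it in persistent_term, and reuse it on every split.

```erlang
-module(topic_split_sketch).
-export([split/1]).

%% Hypothetical persistent_term key; any unique term would do.
-define(KEY, {?MODULE, pattern}).

split(Topic) when is_binary(Topic) ->
    Pattern =
        try persistent_term:get(?KEY)
        catch error:badarg ->
                %% First call: compile and cache the pattern.
                P = binary:compile_pattern(<<".">>),
                persistent_term:put(?KEY, P),
                P
        end,
    binary:split(Topic, Pattern, [global]).
```

This reproduces the shell session above: empty topic levels come back as `<<>>`.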

This commit deliberately avoids migrating the Mnesia tables to use binaries
instead of lists. Mnesia migrations are non-trivial, especially with the
current feature flag subsystem.
Furthermore the Mnesia topic tables are already getting migrated to
their Khepri counterparts in 3.13.
Adding an additional migration only for Mnesia does not make sense.

So, instead of assembling the binding key as we walk down the trie and
then calling list_to_binary/1 in the leaf, it
would be better to just fetch the binding key from the database in the leaf.

As we reach the leaf of the trie, we know both source and destination.
Unfortunately, we cannot fetch the binding key efficiently with the
current rabbit_route (sorted by source exchange) and
rabbit_reverse_route (sorted by destination) tables as the key is in
the middle between source and destination.
If there is a huge number of bindings for a given source exchange (very
realistic in MQTT use cases) or a large number of bindings for a given
destination (also realistic), it would require scanning this large
number of bindings.

Therefore this commit takes the simplest possible solution:
The solution leverages the fact that binding arguments are already part of
table rabbit_topic_trie_binding.
So, if we simply include the binding key into the binding arguments, we
can fetch and return it efficiently in the topic exchange
implementation.

The following patch, which omits fetching the empty list binding argument
(the default), makes routing slower because function
`analyze_pattern.constprop.0` requires significantly more (~2.5%) CPU time:
```
@@ -273,7 +273,11 @@ trie_bindings(X, Node) ->
                                    node_id       = Node,
                                    destination   = '$1',
                                    arguments     = '$2'}},
-    mnesia:select(?MNESIA_BINDING_TABLE, [{MatchHead, [], [{{'$1', '$2'}}]}]).
+    mnesia:select(
+      ?MNESIA_BINDING_TABLE,
+      [{MatchHead, [{'andalso', {'is_list', '$2'}, {'=/=', '$2', []}}], [{{'$1', '$2'}}]},
+       {MatchHead, [], ['$1']}
+      ]).
```
Hence, this commit always fetches the binding arguments.

All MQTT 5.0 destination queues will create a binding that
contains the binding key in the binding arguments.

Not only does this solution avoid expensive list_to_binary/1 calls, but
it also means that the Erlang app rabbit (specifically the topic exchange
implementation) does not need to be aware of MQTT anymore:
It just returns the binding key when the binding args tell it to do so.

In the future, once the Khepri migration is completed, we should be able to
relatively simply remove the binding key from the binding arguments
again to free up some storage space.

Note that one of the advantages of a trie data structure is its space
efficiency: the same prefixes do not have to be stored multiple
times.
However, for RabbitMQ the binding key is already stored at least N times
in various routing tables, so storing it a few times more via the
binding arguments should be acceptable.
The speed improvements are favoured over a few more MBs ETS usage.
2023-06-21 17:14:08 +01:00
David Ansari f425f87192 Make retained message stores compatible with pre 3.13
The format of #mqtt_msg{} changes from 3.12 to 3.13.
In 3.13 the record contains 2 additional fields:
* props
* timestamp

The old #mqtt_msg{} might still be stored by the retained message store
in ets or dets.

This commit converts such an old message format when read from the
database.

The alternative would have been to run a migration function over the
whole table which is slightly more complex to implement.

Instead of giving the new message format a different record name,
e.g. #mqtt_msg_v2{}, this commit re-uses the same name such
that the new code only handles the record name #mqtt_msg{}.
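A conversion along these lines could look like the following sketch. Only the record name and the two appended fields (props, timestamp) come from this commit message; the old record's field count and the default values are assumptions for illustration.

```erlang
-module(mqtt_msg_upgrade_sketch).
-export([maybe_upgrade/1]).

%% An Erlang record is a tagged tuple, so an old-format #mqtt_msg{}
%% read from ets/dets can be detected by its tuple size (assumed to be
%% 6 here) and upgraded by appending the two new fields.
maybe_upgrade(Msg) when is_tuple(Msg),
                        element(1, Msg) =:= mqtt_msg,
                        tuple_size(Msg) =:= 6 ->
    %% Append defaults for `props' and `timestamp' (values assumed).
    erlang:append_element(erlang:append_element(Msg, #{}), undefined);
maybe_upgrade(Msg) ->
    %% Already in the new format: pass through unchanged.
    Msg.
```

Converting lazily on read avoids the table-wide migration function mentioned above.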
2023-06-21 17:14:08 +01:00
David Ansari d7882b00dc PUBACK with reason code "No matching subscribers"
Support reason code "No matching subscribers" in PUBACK.

This somewhat corresponds to the `mandatory` message property
in AMQP 0.9.1.
2023-06-21 17:14:08 +01:00
David Ansari 23837c5270 DISCONNECT v5 clients with Server Shutting Down
When RabbitMQ enters maintenance mode / is being drained, all client
connections are closed.

This commit sends a DISCONNECT packet to (Web) MQTT 5.0 clients with
Reason Code "Server shutting down" before the connection is closed.
2023-06-21 17:14:08 +01:00
David Ansari 8c0b0e9338 Support MQTT 5.0 Properties
The following PUBLISH and Will properties are forwarded unaltered by the
server:
* Payload Format Indicator
* Content Type
* Response Topic
* Correlation Data
* User Property

Not only must these properties be forwarded unaltered from an MQTT
publishing client to an MQTT receiving client, but it would also be nice
to allow for protocol interoperability:
Think about RPC request-response style patterns where the requester is
an MQTT client and the responder is an AMQP 0.9.1 or STOMP client.

We reuse the P_basic fields where possible:
* content_type (if <= 255 bytes)
* correlation_id (if <= 255 bytes)

Otherwise, we add custom AMQP 0.9.1 headers.

The headers follow the naming pattern "x-mqtt-<property>" where
<property> is the MQTT v5 property, if that property makes sense only
(or mainly) in the MQTT world:
* x-mqtt-user-property
* x-mqtt-payload-format-indicator

If the MQTT v5 property also makes sense outside of the MQTT world, we
name it more generically:
* x-correlation (if > 255 bytes)
* x-reply-to-topic (since P_basic.reply_to assumes a queue name)

In the future, we can think about adding a header x-reply-to-exchange
and have the MQTT plugin set its value to the configured mqtt.exchange
such that clients don't have to assume the default topic exchange amq.topic.
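The naming scheme above can be sketched as a mapping. The helper name and return shape are illustrative, not RabbitMQ's actual API; the property names follow the MQTT v5 spec.

```erlang
-module(mqtt_prop_map_sketch).
-export([map_property/1]).

%% Reuse P_basic fields where the value fits (<= 255 bytes),
%% otherwise fall back to custom AMQP 0.9.1 headers.
map_property({'Content-Type', V}) when byte_size(V) =< 255 ->
    {p_basic, content_type, V};
map_property({'Correlation-Data', V}) when byte_size(V) =< 255 ->
    {p_basic, correlation_id, V};
map_property({'Correlation-Data', V}) ->
    {header, <<"x-correlation">>, V};
map_property({'Response-Topic', V}) ->
    %% P_basic.reply_to assumes a queue name, hence a generic header.
    {header, <<"x-reply-to-topic">>, V};
map_property({'Payload-Format-Indicator', V}) ->
    {header, <<"x-mqtt-payload-format-indicator">>, V};
map_property({'User-Property', Pairs}) ->
    {header, <<"x-mqtt-user-property">>, Pairs}.
```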
2023-06-21 17:14:08 +01:00
David Ansari fb7af48df6 Support Will Delay Interval
Previously, the Will Message could be kept in memory in the MQTT
connection process state. Upon termination, the Will Message is sent.

The new MQTT 5.0 feature Will Delay Interval requires storing the Will
Message outside of the MQTT connection process state.

The Will Message should not be stored node local because the client
could reconnect to a different node.

Storing the Will Message in Mnesia is not an option because we want to
get rid of Mnesia. Storing the Will Message in a Ra cluster or in Khepri
is only an option if the Will Payload is small as there is currently no
way in Ra to **efficiently** snapshot large binary data (Note that these
Will Messages are not consumed in a FIFO style workload like messages in
quorum queues. A Will Message needs to be stored for as long as the
Session lasts - up to 1 day by default, but could also be much longer if
RabbitMQ is configured with a higher maximum session expiry interval.)
Usually Will Payloads are small: They are just a notification that an
MQTT session ended abnormally. However, we don't know how users leverage
the Will Message feature. The MQTT protocol allows for large Will Payloads.

Therefore, the solution implemented in this commit - which should work
well enough - is storing the Will Message in a queue.
Each MQTT session which has a Session Expiry Interval and Will Delay
Interval of > 0 seconds will create a queue if the current Network
Connection ends where it stores its Will Message. The Will Message has a
message TTL set (corresponds to the Will Delay Interval) and the queue
has a queue TTL set (corresponds to the Session Expiry Interval).
If the client does not reconnect within the Will Delay Interval, the
message is dead lettered to the configured MQTT topic exchange
(amq.topic by default).
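The TTL pairing described above can be sketched as queue arguments. The argument names are the standard RabbitMQ queue arguments and the exchange name is the documented default; whether the message TTL is set per queue (as here) or per message, and the concrete values, are assumptions of this sketch.

```erlang
-module(will_queue_args_sketch).
-export([args/2]).

%% Queue expires with the Session Expiry Interval; the Will Message
%% expires with the Will Delay Interval and is then dead lettered to
%% the MQTT topic exchange. Both intervals are in seconds, the
%% arguments in milliseconds.
args(SessionExpiryIntervalSecs, WillDelayIntervalSecs) ->
    [{<<"x-expires">>, long, timer:seconds(SessionExpiryIntervalSecs)},
     {<<"x-dead-letter-exchange">>, longstr, <<"amq.topic">>},
     {<<"x-message-ttl">>, long, timer:seconds(WillDelayIntervalSecs)}].
```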

The Will Delay Interval can be set by both publishers and subscribers.
Therefore, the Will Message is the 1st session state that RabbitMQ keeps
for publish-only MQTT clients.

One current limitation of this commit is that a Will Message that is
delayed (i.e. Will Delay Interval is set) and retained (i.e. Will Retain
flag set) will not be retained.
One solution to retain delayed Will Messages is that the retainer
process consumes from a queue and the queue binds to the topic exchange
with a topic starting with `$`, for example `$retain/#`.
The AMQP 0.9.1 Will Message that is dead lettered could then get a
CC header added such that it is published not only with the Will Topic,
but also with the `$retain` topic. For example, if the Will Topic is `a/b`,
it will be published with routing key `a/b` and CC header `$retain/a/b`.

The reason this is not implemented in this commit is that to keep the
currently broken retained message store behaviour, we would require
creating at least one queue per node and publishing only to that local
queue. In future, once we have a replicated retained message store based
on a Stream for example, we could just publish all retained messages to
the `$retain` topic and therefore into the Stream.
So, for now, we list "retained and delayed Will Messages" as a
limitation: they actually won't be retained.
2023-06-21 17:14:08 +01:00
David Ansari becb92ca6f Do not set queue pid in MQTT connection process
Every queue type sets the queue pid when it creates the queue.

Prior to this commit, the queue pid set within the MQTT connection
process was a bit confusing as the queue pid will be different for
classic queues and quorum queues.
2023-06-21 17:14:08 +01:00
David Ansari 60a6af0054 Rename will_msg to will_payload
when only the payload is meant.
See [v5 3.1.3.4]
2023-06-21 17:14:08 +01:00
David Ansari 605e033f43 Fix test decode_basic_properties
This commit fixes 2 separate issues:
1. No quorum queue got created in v5 because Session Expiry Interval was 0.
2. Fix a function_clause error. Pass the decoded properties further to other
functions looking up headers.
2023-06-21 17:14:08 +01:00
David Ansari 48a442b23e Change routing options from list to map
as small maps with atom keys are optimized in OTP 26.
Rename v2 to return_binding_keys to make the routing option clearer.
2023-06-21 17:14:08 +01:00
David Ansari 0183909453 Fix failing test
due to rebasing onto main.

mqtt5 branch adds a new header
```
{<<"x-mqtt-retain">>, bool, false}
```
which caused the incoming_message_interceptors test case to fail.
2023-06-21 17:14:08 +01:00
David Ansari ce573c35fa Support MQTT 5.0 Subscription Option Retain Handling
The MQTT v5 spec is a bit vague on Retain Handling 1:
"If Retain Handling is set to 1 then if the subscription did not
already exist, the Server MUST send all retained message matching the
Topic Filter of the subscription to the Client, and if the subscription
did exist the Server MUST NOT send the retained messages.
[MQTT-3.3.1-10]." [v5 3.3.1.3]

Does a subscription with the same topic filter but different
subscription options mean that "the subscription did exist"?

This commit interprets "subscription exists" as both topic filter and
subscription options must be the same.

Therefore, if a client creates a subscription with a topic filter that
is identical to a previous subscription and subscription options that
are different and Retain Handling 1, the server sends the retained
message.
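This interpretation can be sketched as follows (hypothetical helper, not the actual implementation; the subscription-options representation is illustrative):

```erlang
-module(retain_handling_sketch).
-export([send_retained/4]).

%% Retain Handling 0: always send retained messages on subscribe.
%% Retain Handling 2: never send them.
%% Retain Handling 1: send only if no identical subscription (same
%% topic filter AND same subscription options) already exists.
send_retained(0, _TopicFilter, _Opts, _ExistingSubs) -> true;
send_retained(2, _TopicFilter, _Opts, _ExistingSubs) -> false;
send_retained(1, TopicFilter, Opts, ExistingSubs) ->
    not lists:member({TopicFilter, Opts}, ExistingSubs).
```

So a re-subscribe with the same topic filter but changed options counts as a new subscription and receives the retained message.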
2023-06-21 17:14:08 +01:00
David Ansari e2b545f270 Support MQTT 5.0 features No Local, RAP, Subscription IDs
Support subscription options "No Local" and "Retain As Published"
as well as Subscription Identifiers.

All three MQTT 5.0 features can be set on a per subscription basis.
Due to wildcards in topic filters, multiple subscriptions
can match a given topic. Therefore, to implement Retain As Published and
Subscription Identifiers, the destination MQTT connection process needs
to know what subscription(s) caused it to receive the message.

There are a few ways how this could be implemented:

1. The destination MQTT connection process is aware of all its
   subscriptions. Whenever it receives a message, it can match the
   message's routing key / topic against all its known topic filters.
   However, to iteratively match the routing key against all topic
   filters for every received message can become very expensive in the
   worst case when the MQTT client creates many subscriptions containing
   wildcards. This could be the case for an MQTT client that acts as a
   bridge or proxy or dispatcher: It could subscribe via a wildcard for
   each of its own clients.

2. Instead of iteratively matching the topic of the received message
   against all topic filters that contain wildcards, a better approach
   would be for every MQTT subscriber connection process to maintain a
   local trie datastructure (similar to how topic exchanges are
   implemented) and therefore perform matching more efficiently.
   However, this does not sound optimal either because routing is
   effectively performed twice: in the topic exchange and again against
   a much smaller trie in each destination connection process.

3. Given that the topic exchange already performs routing, a much more
   sensible way would be to send the matched binding key(s) to the
   destination MQTT connection process. A subscription (topic filter)
   maps to a binding key in AMQP 0.9.1 routing. Therefore, for the first
   time in RabbitMQ, the routing function should not only output a list
   of unique destination queues, but also the binding keys (subscriptions)
   that caused the message to be routed to the destination queue.

This commit therefore implements the 3rd approach.
The downside of the 3rd approach is that it requires API changes to the
routing function and topic exchange.

Specifically, this commit adds a new function rabbit_exchange:route/3
that accepts a list of routing options. If that list contains version 2,
the caller of the routing function knows how to handle the return value
that could also contain binding keys.

This commit allows an MQTT connection process, the channel process, and
at-most-once dead lettering to handle binding keys. Binding keys are
included as AMQP 0.9.1 headers into the basic message.
Therefore, whenever a message is sent from an MQTT client or AMQP 0.9.1
client or AMQP 1.0 client or STOMP client, the MQTT receiver will know
the subscription identifier that caused the message to be received.

Note that due to the low number of allowed wildcard characters (# and
+), the cardinality of matched binding keys shouldn't be high even if
the topic contains for example 3 levels and the message is sent to for
example 5 million destination queues. In other words, sending multiple
distinct basic messages to the destination shouldn't hurt the delegate
optimisation too much. The delegate optimisation implemented for classic
queues and rabbit_mqtt_qos0_queue(s) still takes place for all basic
messages that contain the same set of matched binding keys.

The topic exchange returns all matched binding keys by remembering the
edges walked down to the leaves. As an optimisation, only for MQTT
queues are binding keys being returned. This does add a small dependency
from app rabbit to app rabbitmq_mqtt which is not optimal. However, this
dependency should be simple to remove when omitting this optimisation.

Another important feature of this commit is persisting subscription
options and subscription identifiers because they are part of the
MQTT 5.0 session state.

In MQTT v3 and v4, the only subscription information that was part of
the session state was the topic filter and the QoS level.
Both pieces of information were implicitly stored in the form of bindings:
The topic filter as the binding key and the QoS level as the destination
queue name of the binding.

For MQTT v5 we need to persist more subscription information.
From a domain perspective, it makes sense to store subscription options
as part of subscriptions, i.e. bindings, even though they are currently
not used in routing.
Therefore, this commit stores subscription options as binding arguments.

Storing subscription options as binding arguments in turn comes with
new challenges: How to handle mixed version clusters and upgrading an
MQTT session from v3 or v4 to v5?
Imagine an MQTT client connects via v5 with Session Expiry Interval > 0
to a new node in a mixed version cluster, creates a subscription,
disconnects, and subsequently connects via v3 to an old node. The
client should continue to receive messages.

To simplify such edge cases, this commit introduces a new feature flag
called mqtt_v5. If mqtt_v5 is disabled, clients cannot connect to
RabbitMQ via MQTT 5.0.

This still doesn't entirely solve the problem of MQTT session upgrades
(v4 to v5 client) or session downgrades (v5 to v4 client).

Ideally, once mqtt_v5 is enabled, all MQTT bindings contain non-empty binding
arguments. However, this will require a feature flag migration function
to modify all MQTT bindings. To be more precise, all MQTT bindings need
to be deleted and added because the binding argument is part of the
Mnesia table key.

Since feature flag migration functions are non-trivial to implement in
RabbitMQ (they can run on every node multiple times and concurrently),
this commit takes a simpler approach:
All v3 / v4 sessions keep the empty binding argument [].
All v5 sessions use the new binding argument [#mqtt_subscription_opts{}].

This requires only handling a session upgrade / downgrade by
creating a binding (with the new binding arg) and deleting the old
binding (with the old binding arg) when processing the CONNECT packet.

Note that such session upgrades or downgrades should be rather rare in
practice. Therefore these binding transactions shouldn't hurt performance.

The No Local option is implemented within the MQTT publishing connection
process: The message is not sent to the MQTT destination if the
destination queue name matches the current MQTT client ID and the
message was routed due to a subscription that has the No Local flag set.
This avoids unnecessary traffic on the MQTT queue.
The alternative would have been that the "receiving side" (same process)
filters the message out - which would have been more consistent in how
Retain As Published and Subscription Identifiers are implemented, but
would have caused unnecessary load on the MQTT queue.
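The publish-side filter can be sketched as follows. All names are illustrative; in particular, requiring *every* matched binding to carry the No Local flag (via lists:all/2) before skipping delivery is an assumption about how multiple matched bindings are combined.

```erlang
-module(no_local_sketch).
-export([skip_delivery/3]).

%% Skip sending to a destination queue when the queue belongs to the
%% publishing client itself and every matched binding that routed the
%% message there has the No Local flag set.
skip_delivery(QueueClientId, PublisherClientId, MatchedBindingOpts) ->
    QueueClientId =:= PublisherClientId andalso
        lists:all(fun(#{no_local := NoLocal}) -> NoLocal end,
                  MatchedBindingOpts).
```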
2023-06-21 17:14:08 +01:00
David Ansari 51d659fd07 Fix failing property in packet_prop_SUITE
1. Shrinking times out if there is an error, therefore remove the 60
   seconds Bazel timeout by using a medium size bazel test suite.
2. The MQTT 5.0 spec mandates for binary data types and UTF 8 string
   data types to have values of maximum 65,535 bytes.
   Therefore, ensure this test suite does not generate data greater than
   that limit.
2023-06-21 17:14:08 +01:00
Chunyi Lyu d601c6432e Send disconnect packet from server
- when clients connect with a duplicate client id;
disconnect with reason code session taken over 142
- when keep alive has timed out;
disconnect with reason code keep alive timeout 141
2023-06-21 17:14:08 +01:00
David Ansari e273b4c87b Fix two small bugs 2023-06-21 17:14:08 +01:00
David Ansari 2270a30af0 Point emqtt to rabbitmq/emqtt:master
emqtt repos:
emqx/emqtt PR #196 is based on rabbitmq:otp-26-compatibility
emqx/emqtt PR #198 is based on ansd:master
rabbitmq/master contains both of these 2 PRs cherry-picked.

rabbitmq-server repos:
main branch points emqtt to rabbitmq:otp-26-compatibility
mqtt5 branch points emqtt to rabbitmq:master

Therefore, the current mqtt5 branch is OTP 26 compatible and can support
multiple subscription identifiers.
2023-06-21 17:14:08 +01:00
David Ansari ad5152bdd6 bazel run gazelle 2023-06-21 17:14:08 +01:00
David Ansari 80d972e308 Add property test for MQTT encoder / decoder
All MQTT packets that can be sent in both directions (from client to
server and server to client) are tested in packet_prop_SUITE.

The symmetric property is very concise because encoding and then decoding an
MQTT packet should yield the original MQTT packet.

The input data variety of the previous example based tests was very
small.
2023-06-21 17:14:08 +01:00
David Ansari 3b3ccd4d42 Simplify UNSUBACK reply
Whether a payload is sent to the client is decided by the serialiser.
2023-06-21 17:14:08 +01:00
David Ansari cd7f396bea Simplify code and remove code duplication 2023-06-21 17:14:08 +01:00
David Ansari 6f4f9506a4 Add a test case for large Receive Maximum value 2023-06-21 17:14:08 +01:00
Chunyi Lyu 818a1f410b Max 10 unack msgs from server to client
- This is a revision of commit 670d6b2, which implemented
Receive Maximum and sent as many unack msgs
as the client had set as their receive max.
This commit adds back the server-side unack msgs limit (10),
which is a much safer limit than a user-provided max.
2023-06-21 17:14:08 +01:00
Chunyi Lyu bb9fed85f5 Add return codes in unsuback packet 2023-06-21 17:14:08 +01:00
Chunyi Lyu d1b173de8c Return v5 failure reason codes for suback
- mqtt v5 has more descriptive return values for suback
- added two possible failure reason codes for suback packet
one for permission error, another for quota exceeded error
- modified auth suite to assert on reason codes for v5
- no new test case since failures were already covered
2023-06-21 17:14:08 +01:00
Chunyi Lyu 471540dbdc Implement client ReceiveMaximum
- rename processor state prefetch to receive_maximum
to better match property name for mqtt 5
- defaults to 10 (as previously) when not set;
not saved in session state, the configuration is per
connection
2023-06-21 17:14:08 +01:00
David Ansari acd249cb0f Add a test for Session Expiry Interval 2023-06-21 17:14:08 +01:00
Chunyi Lyu 68d59bcaf3 Update sess exp interval when client reconnect
- when client reconnecting with clean start false,
server respects the new session expiry interval provided
by the client
2023-06-21 17:14:08 +01:00
David Ansari df64e3a41c Rename #mqtt_topic{} to #mqtt_subscription{}
because that's what the record represents and is
more in line with the terminology used in the MQTT
specification.
2023-06-21 17:14:08 +01:00
David Ansari 2efd9c06b8 Support Session Expiry Interval
Allow Session Expiry Interval to be changed when client DISCONNECTs.

Deprecate config subscription_ttl in favour of max_session_expiry_interval_secs
because the Session Expiry Interval will also apply to publishers that
connect with a will message and will delay interval.
"The additional session state of an MQTT v5 server includes:
* The Will Message and the Will Delay Interval
* If the Session is currently not connected, the time at which the Session
  will end and Session State will be discarded."

The Session Expiry Interval picked by the server and sent to the client
in the CONNACK is the minimum of max_session_expiry_interval_secs and
the requested Session Expiry Interval by the client in CONNECT.
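That computation is a simple minimum; sketched here with illustrative names (not RabbitMQ's actual function):

```erlang
-module(session_expiry_sketch).
-export([effective_interval/2]).

%% The Session Expiry Interval returned in the CONNACK is the minimum
%% of the server's max_session_expiry_interval_secs and the interval
%% requested by the client in CONNECT.
effective_interval(ClientRequestedSecs, MaxConfiguredSecs) ->
    min(ClientRequestedSecs, MaxConfiguredSecs).
```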

This commit favours dynamically changing the queue argument x-expires
over creating millions of different policies since that many policies
will come with new scalability issues.

Dynamically changing queue arguments is not allowed by AMQP 0.9.1
clients. However, it should be perfectly okay for the MQTT plugin to do
so for the queues it manages. MQTT clients are not aware that these
queues exist.
2023-06-21 17:14:08 +01:00
David Ansari 6e9aa952ea Remove code duplication 2023-06-21 17:14:08 +01:00
Chunyi Lyu 8ce0813bda Close conn when will msg qos is 2 [MQTT-3.2.2-12]
If a Server receives a CONNECT packet containing a Will QoS that
exceeds its capabilities, it MUST reject the connection. It SHOULD
use a CONNACK packet with Reason Code 0x9B (QoS not supported) as
described in section 4.13 Handling errors, and MUST close the Network Connection
2023-06-21 17:14:08 +01:00
David Ansari c31ce01443 Dead letter negatively ACKed MQTT v5 messages
MQTT v5 allows client and server to negatively ack a message by setting
a reason code of 128 or greater indicating failure.

"If PUBACK or PUBREC is received containing a Reason Code of 0x80 or greater
the corresponding PUBLISH packet is treated as acknowledged, and MUST NOT be
retransmitted [MQTT-4.4.0-2]."

Even though the spec prohibits resending such messages, if a client does
not accept a message, RabbitMQ can still dead letter the message.
2023-06-21 17:14:08 +01:00
David Ansari 66fe9630b5 Add Message Expiry Interval for retained messages
MQTT v5 spec:
"If the current retained message for a Topic expires, it is discarded
and there will be no retained message for that topic."

This commit also supports Message Expiry Interval for retained messages
when a node is restarted.
Therefore, the insertion timestamp needs to be stored on disk.
Upon recovery, the Erlang timers are re-created.
2023-06-21 17:14:08 +01:00
Chunyi Lyu c39079f657 Disconnect at pub qos > server max qos
- "If the Server included a Maximum QoS in its CONNACK response
to a Client and it receives a PUBLISH packet with a QoS greater than this
then it uses DISCONNECT with Reason Code 0x9B (QoS not supported)"
- only affects mqtt v5, server max qos is 1
2023-06-21 17:14:08 +01:00
David Ansari 044ee02b36 Add MQTT v5 feature Message Expiry Interval
This commit does not yet implement Message Expiry Interval of
* retained messages: "If the current retained message for a Topic
  expires, it is discarded and there will be no retained message for
  that topic."
2023-06-21 17:14:08 +01:00
Chunyi Lyu d237a6b0c9 Allow setting max packet size by cuttlefish 2023-06-21 17:14:08 +01:00
David Ansari 2ef1f79fdd Test server restart with retained messages
"Retained messages do not form part of the Session State in the Server,
they are not deleted as a result of a Session ending."

Both retained message stores ETS and DETS implement recovery.
This commit adds a test that recovery works as intended.
2023-06-21 17:14:08 +01:00
David Ansari e50e994ef4 Return Assigned Client Identifier in CONNACK
"If the Client connects using a zero length Client Identifier, the Server
MUST respond with a CONNACK containing an Assigned Client Identifier."
2023-06-21 17:14:08 +01:00
David Ansari f1f8167ec4 Add MQTT v5 feature Maximum Packet Size set by server
"Allow the Client and Server to independently specify the maximum
packet size they support. It is an error for the session partner
to send a larger packet."

This commit implements the part where the Server specifies the maximum
packet size.

"In the case of an error in a CONNECT packet it MAY send a CONNACK
packet containing the Reason Code, before closing the Network
Connection. In the case of an error in any other packet it SHOULD send a
DISCONNECT packet containing the Reason Code before closing the Network
Connection."

This commit implements only the "SHOULD" (second) part, not the "MAY"
(first) part.

There are now 2 different global wide MQTT settings on the server:
1. max_packet_size_unauthenticated which applies to the CONNECT packet
   (and maybe AUTH packet in the future)
2. max_packet_size_authenticated which applies to all other MQTT
   packets (that is, after the client successfully authenticated).

These two settings will apply to all MQTT versions.
In MQTT v5, if a non-CONNECT packet is too large, the server will send a
DISCONNECT packet to the client with Reason Code "Packet Too Large"
before closing the network connection.
2023-06-21 17:14:08 +01:00
David Ansari 49f1071591 Add MQTT v5 feature Maximum Packet Size set by client
"Allow the Client and Server to independently specify the maximum
packet size they support. It is an error for the session partner
to send a larger packet."

This commit implements the part where the Client specifies the maximum
packet size.

As per the protocol spec, the server drops an MQTT packet instead of
sending it if the packet is too large.
A debug message is logged for "infrequent" packet types.

For PUBLISH packets, the message is rejected to the queue such that it
will be dead lettered, if dead lettering is configured.
At the very least, Prometheus metrics for dead lettered messages will
be increased, even if dead lettering is not configured.
2023-06-21 17:14:08 +01:00
David Ansari c44b546f73 Test MQTT v5 in existing MQTT suites 2023-06-21 17:14:08 +01:00
David Ansari be6ff92692 Serialise and parse MQTT 5.0 packets 2023-06-21 17:14:08 +01:00
Michael Klishin 55442aa914 Replace @rabbitmq.com addresses with rabbitmq-core@groups.vmware.com
Don't ask why we have to do it. Because reasons!
2023-06-20 15:40:13 +04:00
David Ansari c82dbfd1bb Add missing MQTT test assertion
util:expect_publishes/3 returns 'ok' or
{'publish_not_received', Payload}
2023-06-13 09:40:03 +00:00
David Ansari f485e51d80 Fix Native MQTT crash if properties encoded
Fixes https://github.com/rabbitmq/rabbitmq-server/discussions/8252

The MQTT connection must decode AMQP 0.9.1 properties as they are
getting encoded for example in:
712c2b9ec9/deps/rabbit/src/rabbit_variable_queue.erl (L2219)
and
712c2b9ec9/deps/rabbit/src/rabbit_quorum_queue.erl (L1680)

Prior to this commit, the MQTT connection process could crash with:
```
[{rabbit_mqtt_processor,deliver_one_to_client,
     [{{resource,<<"/">>,queue,
           <<"mqtt-subscription-mqtt-explorer-c5351d21qos0">>},
       <0.4546.0>,undefined,false,
       {basic_message,
           {resource,<<"/">>,exchange,<<"amq.topic">>},
           [<<"plant.v1.M3.BCD423.rev.fillStateChangedEvent">>],
           {content,60,none,
               <<80,64,16,97,112,112,108,105,99,97,116,105,111,110,47,
                 106,115,111,110,2,0,0,0,0,100,103,141,186>>,
               rabbit_framing_amqp_0_9_1,
               [<<"{\"plantIdentificationCode\":\"M3/BCD423/rev\",\"isFullSensorTriggered\":false,\"numberOfCarriers\":20,\"maxNumberOfCarriers\":1174,\"numberOfEmpties\":0,\"numberOfCarriersWithPayload\":20,\"numberOfCarriersWithOrder\":0,\"trackLength\":30000,\"trackLengthOccupied\":1130}">>]},
           <<31,230,178,158,209,240,53,221,100,60,64,5,227,237,58,21>>,
           true}},
      false,
      {state,
          {cfg,#Port<0.282>,mqtt311,true,undefined,
              {resource,<<"/">>,exchange,<<"amq.topic">>},
              undefined,false,none,<0.680.0>,flow,none,10,<<"/">>,
              <<"mqtt-explorer-c5351d21">>,undefined,
              {192,168,10,131},
              1883,
              {192,168,10,130},
              53244,1684508087392,#Fun<rabbit_mqtt_reader.0.106886>},
          {rabbit_queue_type,
              #{{resource,<<"/">>,queue,
                    <<"mqtt-subscription-mqtt-explorer-c5351d21qos0">>} =>
                    {ctx,rabbit_classic_queue,
                        {rabbit_classic_queue,<0.4546.0>,#{},
                            #{<0.4546.0> => ok}}}}},
          #{},#{},1,
          #{<<"#">> => 0,<<"$SYS/#">> => 0},
          {auth_state,
              {user,<<"rabbit">>,[],
                  [{rabbit_auth_backend_internal,
                       #Fun<rabbit_auth_backend_internal.3.114557357>}]},
              #{<<"client_id">> => <<"mqtt-explorer-c5351d21">>}},
          registered,#{},0}],
     [{file,"rabbit_mqtt_processor.erl"},{line,1414}]},
 {lists,foldl,3,[{file,"lists.erl"},{line,1350}]},
 {rabbit_mqtt_processor,handle_queue_event,2,
     [{file,"rabbit_mqtt_processor.erl"},{line,1345}]},
 {rabbit_mqtt_reader,handle_cast,2,
     [{file,"rabbit_mqtt_reader.erl"},{line,134}]},
 {gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,1123}]},
 {gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,1200}]},
 {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}
```
2023-05-19 17:05:31 +00:00
Rin Kuryloski c7d0427d62
Merge pull request #8084 from rabbitmq/rin/nightly-compare-build-systems
Github Actions pipeline to compare build systems nightly
2023-05-15 17:26:21 +02:00
Rin Kuryloski eb94a58bc9 Add a workflow to compare the bazel/erlang.mk output
To catch any drift between the builds
2023-05-15 13:54:14 +02:00
David Ansari 044f6e3bac Move plugin rabbitmq-message-timestamp to the core
As reported in https://groups.google.com/g/rabbitmq-users/c/x8ACs4dBlkI/
plugins that implement rabbit_channel_interceptor break with
Native MQTT in 3.12 because Native MQTT does not use rabbit_channel anymore.
Specifically, these plugins don't work anymore in 3.12 when sending a message
from an MQTT publisher to an AMQP 0.9.1 consumer.

Two of these plugins are
https://github.com/rabbitmq/rabbitmq-message-timestamp
and
https://github.com/rabbitmq/rabbitmq-routing-node-stamp

This commit moves both plugins into rabbitmq-server.
Therefore, these plugins are deprecated starting in 3.12.

Instead of using these plugins, the user gets the same behaviour by
configuring rabbitmq.conf as follows:
```
incoming_message_interceptors.set_header_timestamp.overwrite = false
incoming_message_interceptors.set_header_routing_node.overwrite = false
```

While the two plugins could not be used together, this commit
allows both headers to be set.

We name the top level configuration key `incoming_message_interceptors`
because only incoming messages are intercepted.
Currently, only `set_header_timestamp` and `set_header_routing_node` are
supported. (We might support more in the future.)
Both can set `overwrite` to `false` or `true`.
The meaning of `overwrite` is the same as documented in
https://github.com/rabbitmq/rabbitmq-message-timestamp#always-overwrite-timestamps
i.e. whether headers should be overwritten if they are already present
in the message.
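The `overwrite` semantics can be sketched as a minimal Python model (a sketch of the rule described above, not the actual Erlang interceptor code):

```python
def intercept(headers, key, value, overwrite):
    """Add `key` to the message headers: with overwrite=False an existing
    header wins; with overwrite=True the interceptor's value wins."""
    if overwrite or key not in headers:
        headers[key] = value
    return headers
```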

Both `set_header_timestamp` and `set_header_routing_node` behave exactly
to plugins `rabbitmq-message-timestamp` and `rabbitmq-routing-node-stamp`,
respectively.

Upon node boot, the configuration is put into persistent_term to not
cause any performance penalty in the default case where these settings
are disabled.

The channel and MQTT connection process will intercept incoming messages
and - if configured - add the desired AMQP 0.9.1 headers.

For now, this allows using Native MQTT in 3.12 with the old plugins
behaviour.

In the future, once "message containers" are implemented,
we can think about more generic message interceptors where plugins can be
written to modify arbitrary headers or message contents for various protocols.

Likewise, in the future, once MQTT 5.0 is implemented, we can think
about an MQTT connection interceptor which could function similar to a
`rabbit_channel_interceptor` allowing to modify any MQTT packet.
2023-05-15 08:37:52 +00:00
David Ansari 967e262272 Add MQTT client id to connection closed event
As requested in https://github.com/rabbitmq/rabbitmq-server/discussions/6331#discussioncomment-5796154
include all infos that were emitted in the MQTT connection created event also
in the MQTT connection closed event.
This ensures infos such as MQTT client ID are part of the connection
closed event.
Therefore, it's easy for the user to correlate between the two event
types.
Note that the MQTT plugin emits connection created and connection closed events only if
the CONNECT packet was successfully processed, i.e. authentication was successful.

Remove the disconnected_at property because it was never used.
rabbit_event already adds a timestamp to any event.
2023-05-04 09:15:55 +00:00
David Ansari 5ec22acd29 Use more defensive rabbit_data_coercion:to_list/1 2023-04-28 09:19:39 +00:00
David Ansari 83eede7ef2 Keep storing MQTT client IDs as lists in Ra
Up to 3.11.x an MQTT client ID is tracked in Ra
as a list of bytes as returned by binary_to_list/1 in
48467d6e12/deps/rabbitmq_mqtt/src/rabbit_mqtt_frame.erl (L137)

This has two downsides:
1. Lists consume more memory than binaries (when tracking many clients).
2. It violates the MQTT spec which states
   "The ClientId MUST be a UTF-8 encoded string as defined in Section 1.5.3 [MQTT-3.1.3-4]." [v4 3.1.3.1]

Therefore, the original idea was to always store MQTT client IDs as
binaries starting with Native MQTT in 3.12.
However, this leads to client ID tracking misbehaving in mixed version
clusters since new nodes would register client IDs as binaries and old
nodes would register client IDs as lists. This means that a client
registering on a new node with the same client ID as a connection to the
old node did not terminate the connection on the old node.
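The mixed-version mismatch can be illustrated with a small Python model of the tracking table (the dict stands in for the Ra machine state; Python needs a tuple where Erlang uses a list as the key — this is a sketch, not RabbitMQ code):

```python
# Old nodes track the client ID as a list of bytes (binary_to_list/1);
# new nodes would track it as a binary. The two keys never match, so
# duplicate-client detection misses the existing registration.
tracked = {}
old_node_key = tuple(list(b"client-1"))   # list-of-bytes representation
tracked[old_node_key] = "conn-on-old-node"

new_node_key = b"client-1"                # binary representation
assert new_node_key not in tracked        # old connection is NOT terminated
```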

Therefore, for backwards compatibility, we leave the client ID as a list of bytes
in the Ra machine state because the feature flag delete_ra_cluster_mqtt_node
introduced in v3.12 will delete the Ra cluster anyway and
the new client ID tracking via pg local will store client IDs as
binaries.

An interesting side note learned here is that the compiled file
rabbit_mqtt_collector must not be changed. This commit only modifies
function specs. However as soon as the compiled code is changed, this
module becomes a new version. The new version causes the anonymous ra query
function to fail in mixed clusters: When the old node does a
ra:leader_query where the leader is on the new node, the query function
fails on the new node with `badfun` because the new node does not have
the same module version. For more context, read:
https://web.archive.org/web/20181017104411/http://www.javalimit.com/2010/05/passing-funs-to-other-erlang-nodes.html
2023-04-28 07:57:23 +00:00
Michal Kuratczyk 858ed1bff6
Switch to an emqtt fork/branch for OTP26
This change should be reverted once emqx/emqtt is OTP26 compatible.
Our fork/branch isn't either at this point, but at least partially
works. Let's use this branch for now to uncover server-side OTP26
incompatibilities (and continue working on OTP26 support for emqtt of
course).
2023-04-26 11:06:23 +02:00
Rin Kuryloski a944439fba Replace globs in bazel with explicit lists of files
As this is preferred in rules_erlang 3.9.14
2023-04-25 17:29:12 +02:00
Michael Klishin 686fb9a4e9
Merge pull request #7966 from rabbitmq/otp26-compatibility
OTP26 compatibility: `{verify_none}`
2023-04-24 18:03:49 +04:00
David Ansari e514f85c71 Bump test cluster creation timeouts
in MQTT tests.

For example, in ff_SUITE we see sporadic failures in BuildBuddy
where the cluster cannot be created within 2 minutes:

```
*** CT Error Notification 2023-04-24 10:58:55.628 ***🔗
rabbit_ct_helpers:port_receive_loop failed on line 945
Reason: {timetrap_timeout,120000}

...

=== Ended at 2023-04-24 10:58:55
=== Location: [{rabbit_ct_helpers,port_receive_loop,945},
              {rabbit_ct_helpers,exec,920},
              {rabbit_ct_broker_helpers,cluster_nodes1,858},
              {rabbit_ct_broker_helpers,cluster_nodes1,840},
              {rabbit_ct_helpers,run_steps,141},
              {ff_SUITE,init_per_group,last_expr},
              {test_server,ts_tc,1782},
              {test_server,run_test_case_eval1,1379},
              {test_server,run_test_case_eval,1223}]
=== Reason: timetrap timeout
===
*** init_per_group failed.
    Skipping all cases.
```

The default time limit for a test case is 30 minutes.
2023-04-24 15:23:15 +02:00
Michal Kuratczyk d04b3afe9b verify_none in a couple of tests 2023-04-24 13:11:44 +00:00
Rin Kuryloski 854d01d9a5 Restore the original -include_lib statements from before #6466
since this broke erlang_ls

requires rules_erlang 3.9.13
2023-04-20 12:40:45 +02:00
Michael Klishin c0ed80c625
Merge pull request #6466 from rabbitmq/gazelle
Use gazelle for some maintenance of bazel BUILD files
2023-04-19 09:33:44 +04:00
Rin Kuryloski 8de8f59d47 Use gazelle generated bazel files
Bazel build files are now maintained primarily with `bazel run
gazelle`. This will analyze and merge changes into the build files as
necessitated by certain code changes (e.g. the introduction of new
modules).

In some cases there are hints to gazelle in the build files, such as `#
gazelle:erlang...` or `# keep` comments. xref checks on plugins that
depend on the cli are a good example.
2023-04-17 18:13:18 +02:00
David Ansari d670a7c50e Make test rabbit_mqtt_qos0_queue_overflow less flaky
Similarly to https://github.com/rabbitmq/rabbitmq-server/pull/7663,
increase the message size and decrease the client buffer sizes.

This change is needed because we switched from erlang:port_command/2 to
gen_tcp:send/2. The former is a bit more asynchronous than the latter
because the latter waits for the inet_reply from the port.
2023-04-17 15:06:53 +00:00
Rin Kuryloski 8a7eee6a86 Ignore warnings when building plt files for dependencies
As we don't generally care if a dependency has warnings, only the
target
2023-04-17 10:09:24 +02:00
David Ansari 0130d3ac36 Clean up exclusive durable queues after unclean shutdown
What:

Delete bindings of exclusive durable queues after an
unclean shutdown.

Why:

Native MQTT in 3.12 uses only durable queues to ease transition to
Khepri. Since auto-delete queues are not deleted after an unclean
shutdown, Native MQTT uses exclusive (instead of auto-delete) queues
for clean sessions.

While this bug is not specific to Native MQTT, this bug is most relevant
for the upcoming 3.12 release since exclusive durable queues are rarely used
otherwise.

How:

During queue recovery, not all bindings are recovered yet.
Therefore, if, during queue recovery, an exclusive, durable queue needs to
be deleted, only durable bindings should be queried.

Queue types need to make sure that their exclusive, durable queues including their
bindings are deleted before starting with binding recovery.
Otherwise binding deletion and binding recovery get interleaved leading to topic
bindings being created and left behind.
Therefore, a classic queue process replies to the recovery process after it
deleted its queue record and associated bindings from the database.
2023-03-23 21:59:19 +00:00
Michael Klishin 170862c4be
Merge pull request #7659 from rabbitmq/rin/queue-deleted-events-include-queue-type
Include the queue type in the queue_deleted rabbit_event
2023-03-18 00:14:29 +04:00
David Ansari f5564da94a Fix flaky test rabbit_mqtt_qos0_queue_overflow
The test always succeeds on `main` branch.

The test also always succeeds on `mc` branch when running remotely:
```
bazel test //deps/rabbitmq_mqtt:reader_SUITE --test_env FOCUS="-group tests -case rabbit_mqtt_qos0_queue_overflow" --config=rbe-25 -t- --runs_per_test=50
```

However, the test flakes when running on `mc` branch locally on the MAC:
```
make -C deps/rabbitmq_mqtt ct-reader t=tests:rabbit_mqtt_qos0_queue_overflow FULL=1
```
with the following local changes:
```
~/workspace/rabbitmq-server/deps/rabbitmq_mqtt mc *6 !1 >                                                                                                                                                                                                                                                  3s direnv rb 2.7.2
diff --git a/deps/rabbitmq_mqtt/test/reader_SUITE.erl b/deps/rabbitmq_mqtt/test/reader_SUITE.erl
index fb71eae375..21377a2e73 100644
--- a/deps/rabbitmq_mqtt/test/reader_SUITE.erl
+++ b/deps/rabbitmq_mqtt/test/reader_SUITE.erl
@@ -27,7 +27,7 @@ all() ->

 groups() ->
     [
-     {tests, [],
+     {tests, [{repeat_until_any_fail, 30}],
       [
        block_connack_timeout,
        handle_invalid_packets,
@@ -43,7 +43,7 @@ groups() ->
     ].

 suite() ->
-    [{timetrap, {seconds, 60}}].
+    [{timetrap, {minutes, 60}}].

 %% -------------------------------------------------------------------
 %% Testsuite setup/teardown.
```
With these local changes, the test fails prior to this commit after the
2nd run and does not fail after this commit.
2023-03-17 15:54:33 +00:00
Rin Kuryloski c61d16c971 Include the queue type in the queue_deleted rabbit_event
This is useful for understanding if a deleted queue was matching any
policies given the more selective policies introduced in #7601.

Does not apply to bulk deletion of transient queues on node down.
2023-03-17 11:50:14 +01:00
David Ansari 64d4e4f63a MQTT: Fix 3.12-beta.1 cluster creation
Deploying a 5 node RabbitMQ cluster with rabbitmq_mqtt plugin enabled
using the cluster-operator with rabbitmq image
rabbitmq:3.12.0-beta.1-management often fails with the following
error:
```
Feature flags: failed to enable `delete_ra_cluster_mqtt_node`:
{error,
    {no_more_servers_to_try,
        [{error,
            noproc},
        {error,
            noproc},
        {timeout,
            {mqtt_node,
                'rabbit@mqtt-rabbit-3-12-server-2.mqtt-rabbit-3-12-nodes.default'}}]}}
```

During rabbitmq_mqtt plugin start, the plugin decides whether it should
create a Ra cluster:
If feature flag delete_ra_cluster_mqtt_node is enabled, no Ra cluster
gets created.
If this feature flag is disabled, a Ra cluster is created.

Even though all feature flags are enabled by default for a fresh
cluster, the feature flag subsystem cannot provide any promise when feature
flag migration functions run during the boot process:
The migration functions can run before plugins are started or many
seconds after plugins are started.
There is also no API that tells when feature flag initialisation on the
local node is completed.

Therefore, during a fresh 3.12 cluster start with rabbitmq_mqtt enabled,
some nodes decide to start a Ra cluster (because not all feature flags
are enabled yet when the rabbitmq_mqtt plugin is started).
Prior to this commit, when the feature flag delete_ra_cluster_mqtt_node
got enabled, Ra cluster deletion timed out because the Ra cluster never got
initialised successfully: Members never proceeded past the pre-vote
phase because their Ra peers already had feature flag delete_ra_cluster_mqtt_node
enabled and therefore did not participate.

One possible fix is to remove the feature flag delete_ra_cluster_mqtt_node
and have the newly upgraded 3.12 node delete its Ra membership during a
rolling update. However, this approach is too risky because during a
rolling update of a 3 node cluster A, B, and C the following issue
arises:
1. C upgrades successfully and deletes its Ra membership
2. B shuts down. At this point A and B are members of the Ra cluster
   while only A is online (i.e. not a majority). MQTT clients which try
   to re-connect to A fail because A times out registering the client in
   ad5cc7e250/deps/rabbitmq_mqtt/src/rabbit_mqtt_processor.erl (L174-L179)

Therefore this commit fixes 3.12 cluster creation as follows:
The Ra cluster deletion timeout is reduced from 60 seconds to 15 seconds,
and the Ra server is force deleted if Ra cluster deletion fails.
Force deleting the server will wipe the Ra cluster data on disk.
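The fix's deletion logic can be sketched as a small Python model (the function names are placeholders for Ra's delete operations, not real API calls):

```python
def delete_mqtt_ra_cluster(delete_cluster, force_delete_server,
                           timeout_seconds=15):
    """Try a graceful Ra cluster deletion with the reduced 15s timeout;
    on timeout, force-delete the local server, wiping its on-disk data."""
    try:
        delete_cluster(timeout=timeout_seconds)
        return "deleted"
    except TimeoutError:
        force_delete_server()
        return "force_deleted"
```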
2023-03-11 14:23:51 +00:00
David Ansari 4c602b7698 Add validator for non negative integer
A value of 0 means overload protection is disabled.
A negative value doesn't really make sense.
2023-03-07 14:35:28 +01:00
David Ansari d72ff9248b Expose mqtt.mailbox_soft_limit via Cuttlefish 2023-03-07 14:35:28 +01:00
David Ansari 942c4ab542 Add missing BEAM file
to fix failing Bazel test
2023-03-02 10:25:07 +01:00
David Ansari dd372619f8 Add missing exchange write access check
when an MQTT will message is published.
2023-03-02 10:25:07 +01:00
David Ansari dfc2ee634b Use integer as will message correlation
Instead of using atom `undefined`, use an integer as correlation term
when sending the will message to destination queues.

Classic queue clients, for example, expect a non-negative integer.
Quorum queues expect any term.
2023-03-01 12:52:44 +01:00
David Ansari 0058380fbd Web MQTT: Send CONNACK error code before closing connection
"If the Client supplies a zero-byte ClientId with CleanSession set to 0,
the Server MUST respond to the CONNECT Packet with a CONNACK return code 0x02
(Identifier rejected) and then close the Network Connection" [MQTT-3.1.3-8].

In Web MQTT, the CONNACK was not sent to the client because the Web MQTT
connection process terminated before sending the CONNACK to the client.
2023-02-28 10:33:57 +01:00
David Ansari bf2a97a20a Bump emqx/emqtt to 1.8.2 2023-02-21 17:25:19 +01:00
David Ansari 2dc45e8084 Fix flaky MQTT Java test
Every ~30 runs, test case `sessionRedelivery` was failing with error:
```
[ERROR] sessionRedelivery{TestInfo}  Time elapsed: 1.298 s  <<< ERROR!
org.eclipse.paho.client.mqttv3.MqttException: Client is currently disconnecting
	at com.rabbitmq.mqtt.test.MqttTest.sessionRedelivery(MqttTest.java:535)
```

The problem was that the Java client was still in connection state
`DISCONNECTING` which throws a Java exception when `connect()`ing.

So, the problem was client side.

We already check for `isConnected()` to be `false` which internally
checks for
```
conState == CONNECTED
```
However, there is no public client API to check for other connection
states. Therefore just waiting for a few milliseconds fixes the flake.
2023-02-20 10:42:18 +01:00
David Ansari b165adb958 Add timeout for test AMQP 0.9.1 connection to open
We see sporadic test failures where a test case hangs in the
receive until the Bazel suite timeout is reached.

There is no point in a test case to wait forever for an AMQP 0.9.1
connection to establish. Let's time out after 1 minute.

This will make the test case fail faster.
2023-02-16 19:25:06 +01:00
Michael Klishin 8ac0829c15
Merge pull request #7196 from rabbitmq/dialyzer-enable-Wunkown
Fix all dependencies for the dialyzer
2023-02-14 13:41:22 -03:00
David Ansari 97634e9099 Add rabbit_queue_type:is_enabled/1
This commits partially reverts 575f4e78bc

Function `rabbit_queue_type:is_enabled/1` seems to be useful for future
queue types.

See https://github.com/rabbitmq/rabbitmq-server/pull/7269#issuecomment-1429838579
2023-02-14 14:51:11 +00:00
Michael Klishin d0dc951343
Merge pull request #7058 from rabbitmq/add-node-lists-functions-to-clarify-intent
rabbit_nodes: Add list functions to clarify which nodes we are interested in
2023-02-13 23:06:50 -03:00
Michael Klishin 054381a99b
Merge pull request #7269 from rabbitmq/ff-stream_queue
Remove compatibility for feature flag stream_queue
2023-02-13 22:35:08 -03:00
Alexey Lebedeff 949b53543d Fix all dependencies for the dialyzer
This is the latest commit in the series; it fixes (almost) all the
problems with missing and circular dependencies for typing.

The only 2 unsolved problems are:

- `lg` dependency for `rabbit` - the problem is that it's the only
  dependency that contains NIF. And there is no way to make dialyzer
  ignore it - it looks like the `unknown` check is not suppressible by dialyzer
  directives. In the future making `lg` a proper dependency can be a
  good thing anyway.

- some missing Elixir functions in `rabbitmq_cli` (CSV, JSON and
  logging related).

- `eetcd` dependency for `rabbitmq_peer_discovery_etcd` - this one
  uses sub-directories in `src/`, which confuses dialyzer (or our bazel
  machinery is not able to properly handle it). I've tried the latest
  rules_erlang which flattens directory for .beam files, but it wasn't
  enough for dialyzer - it wasn't able to find core erlang files. This
  is a niche plugin and an unusual dependency, so probably not worth
  investigating further.
2023-02-13 17:37:44 +01:00
David Ansari 575f4e78bc Remove compatibility for feature flag stream_queue
Remove compatibility code for feature flag `stream_queue`
because this feature flag is required in 3.12.

See #7219
2023-02-13 15:31:40 +00:00
David Ansari efa56bf0cc Fix flaky test
Very rarely, the assertion failed with
```
=== Ended at 2023-02-13 13:25:52
=== Location: [{shared_SUITE,global_counters,590},
              {test_server,ts_tc,1782},
              {test_server,run_test_case_eval1,1291},
              {test_server,run_test_case_eval,1223}]
=== === Reason: {assertEqual,
                     [{module,shared_SUITE},
                      {line,590},
                      {expression,"get_global_counters ( Config , ProtoVer )"},
                      {expected,
                          #{consumers => 0,messages_confirmed_total => 2,
                            messages_received_confirm_total => 2,
                            messages_received_total => 5,
                            messages_routed_total => 3,
                            messages_unroutable_dropped_total => 1,
                            messages_unroutable_returned_total => 1,
                            publishers => 0}},
                      {value,
                          #{consumers => 1,messages_confirmed_total => 2,
                            messages_received_confirm_total => 2,
                            messages_received_total => 5,
                            messages_routed_total => 3,
                            messages_unroutable_dropped_total => 1,
                            messages_unroutable_returned_total => 1,
                            publishers => 1}}]}
  in function  shared_SUITE:global_counters/2 (shared_SUITE.erl, line 590)
  in call from test_server:ts_tc/3 (test_server.erl, line 1782)
  in call from test_server:run_test_case_eval1/6 (test_server.erl, line 1291)
  in call from test_server:run_test_case_eval/9 (test_server.erl, line 1223)
```

The DISCONNECT packet is sent one-way from client to server.
2023-02-13 14:14:14 +00:00
Jean-Sébastien Pédron d65637190a
rabbit_nodes: Add list functions to clarify which nodes we are interested in
So far, we had the following functions to list nodes in a RabbitMQ
cluster:
* `rabbit_mnesia:cluster_nodes/1` to get members of the Mnesia cluster;
  the argument was used to select members (all members or only those
  running Mnesia and participating in the cluster)
* `rabbit_nodes:all/0` to get all members of the Mnesia cluster
* `rabbit_nodes:all_running/0` to get all members who currently run
  Mnesia

Basically:
* `rabbit_nodes:all/0` calls `rabbit_mnesia:cluster_nodes(all)`
* `rabbit_nodes:all_running/0` calls `rabbit_mnesia:cluster_nodes(running)`

We also have:
* `rabbit_node_monitor:alive_nodes/1` which filters the given list of
  nodes to only select those currently running Mnesia
* `rabbit_node_monitor:alive_rabbit_nodes/1` which filters the given
  list of nodes to only select those currently running RabbitMQ

Most of the code uses `rabbit_mnesia:cluster_nodes/1` or the
`rabbit_nodes:all*/0` functions. `rabbit_mnesia:cluster_nodes(running)`
or `rabbit_nodes:all_running/0` is often used as a close approximation
of "all cluster members running RabbitMQ". This list might be incorrect
in times where a node is joining the clustered or is being worked on
(i.e. Mnesia is running but not RabbitMQ).

With Khepri, there won't be the same possible approximation because we
will try to keep Khepri/Ra running even if RabbitMQ is stopped to
expand/shrink the cluster.

So in order to clarify what we want when we query a list of nodes, this
patch introduces the following functions:
* `rabbit_nodes:list_members/0` to get all cluster members, regardless
  of their state
* `rabbit_nodes:list_reachable/0` to get all cluster members we can
  reach using Erlang distribution, regardless of the state of RabbitMQ
* `rabbit_nodes:list_running/0` to get all cluster members who run
  RabbitMQ, regardless of the maintenance state
* `rabbit_nodes:list_serving/0` to get all cluster members who run
  RabbitMQ and are accepting clients

In addition to the list functions, there are the corresponding
`rabbit_nodes:is_*(Node)` checks and `rabbit_nodes:filter_*(Nodes)`
filtering functions.

The code is modified to use these new functions. One possible
significant change is that the new list functions will perform RPC calls
to query the nodes' state, unlike `rabbit_mnesia:cluster_nodes(running)`.
2023-02-13 12:58:40 +01:00
David Ansari 700c122f9e Add test that MQTT ignores configured default queue type
The queue type being created for MQTT connections is solely determined
by the rabbitmq_mqtt plugin, not by per vhost defaults.

If the per vhost default queue type is configured to be a quorum queue,
we still want to create classic queues for MQTT connections.
2023-02-09 19:18:34 +01:00
David Ansari bd50a41d67 Decrease MQTT mailbox_soft_limit
Let's decrease the mailbox_soft_limit from 1000 to 200.
Obviously, both values are a bit arbitrary.
However, MQTT workloads usually do not have high throughput patterns for
a single MQTT connection. The only valid scenario where an MQTT
connections' process mailbox could have many messages is in large fan-in
scenarios where many MQTT devices sending messages at once to a single MQTT
device - which is rather unusual.

It makes more sense to protect against cluster wide memory alarms by
decreasing the mailbox_soft_limit.
2023-02-09 19:18:34 +01:00
Jean-Sébastien Pédron ddc5b3ee0a
Merge pull request #7228 from rabbitmq/mqtt-ff
Remove mixed version check in MQTT tests
2023-02-09 17:36:58 +01:00
David Ansari 3cd7d80d04 Enable feature flag in test case
Always enable feature flag rabbit_mqtt_qos0_queue
in test case rabbit_mqtt_qos0_queue_overflow because this test case does
not make sense without the mqtt_qos0 queue type.

Note that enabling the feature flag should always succeed because this
test case runs on a single node, and therefore on a new version in mixed
version tests.
2023-02-09 11:17:52 +00:00
David Ansari 91b56bd85d Remove mixed version check in MQTT tests
In the MQTT test assertions, instead of checking whether the test runs
in mixed version mode where all non-required feature flags are disabled
by default, check whether the given feature flag is enabled.

Prior to this commit, once feature flag rabbit_mqtt_qos0_queue becomes
required, the test cases would have failed.
2023-02-09 10:53:50 +00:00
David Ansari db107500c5 Remove compatibility code for management agent feature flags
Remove compatibility code for feature flags
* drop_unroutable_metric
* empty_basic_get_metric

since they are required in 3.12.0.

See https://github.com/rabbitmq/rabbitmq-server/pull/7219
2023-02-09 09:53:46 +00:00
David Ansari 327e3da8cb Run MQTT maintenance test in mixed version cluster
Nowadays, the old RabbitMQ nodes in mixed version cluster
tests on `main` branch run in version 3.11.7.

Since maintenance mode was wrongly closing cluster-wide MQTT connections
only in RabbitMQ <3.11.2 (and <3.10.10), we can re-enable this mixed
version test.
2023-02-08 16:26:10 +01:00
David Ansari 3432dcb161 Shard shared_SUITE
such that MQTT and WebMQTT tests of the shared_SUITE can run in parallel.
Before this commit, the shared_SUITE runs 14 minutes, after this commit
the shared_SUITE runs 4 minutes in GitHub actions.
2023-02-07 16:36:08 +01:00
David Ansari 146570df5e Delete AMQP 0.9.1 header x-mqtt-dup
AMQP 0.9.1 header x-mqtt-dup was determined by the incoming MQTT PUBLISH
packet's DUP flag. Its only use was to determine the outgoing MQTT
PUBLISH packet's DUP flag. However, that's wrong behaviour because
the MQTT 3.1.1 protocol spec mandates:
"The value of the DUP flag from an incoming PUBLISH packet is not
propagated when the PUBLISH Packet is sent to subscribers by the Server.
The DUP flag in the outgoing PUBLISH packet is set independently to the
incoming PUBLISH packet, its value MUST be determined solely by whether
the outgoing PUBLISH packet is a retransmission."
[MQTT-3.3.1-3]

Native MQTT fixes this wrong behaviour. Therefore, we can delete this
AMQP 0.9.1 header.
2023-02-07 16:36:08 +01:00
David Ansari 5f6a1f96ca Remove unused NodeId parameter 2023-02-07 16:36:08 +01:00
David Ansari bec8f9a21c Support topic variable expansion for vhost and username
Native MQTT introduced a regression where the "{username}" and "{vhost}"
variables were not expanded in permission patterns.

This regression was unnoticed because the java_SUITE's
topicAuthorisationVariableExpansion test was wrongfully passing because
its topic started with "test-topic" which matched another allow listed
topic (namely "test-topic") instead of the pattern
"{username}.{client_id}.a".
This other java_SUITE regression got introduced by commit
26a17e8530

This commit fixes both the buggy Java test and the actual regression
introduced in Native MQTT.
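Variable expansion in topic permission patterns works roughly like this (a Python sketch of the rule; the actual expansion happens in the Erlang authorisation code):

```python
def expand_topic_pattern(pattern, username, vhost, client_id=None):
    """Expand the {username}, {vhost} (and, for MQTT, {client_id})
    variables in a topic permission pattern."""
    pattern = pattern.replace("{username}", username).replace("{vhost}", vhost)
    if client_id is not None:
        pattern = pattern.replace("{client_id}", client_id)
    return pattern
```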
2023-02-07 16:36:08 +01:00
David Ansari 1ba4823495 Delete unused files 2023-02-07 16:36:08 +01:00
David Ansari 804777f575 Explicitly match `maybe` return values 2023-02-07 16:36:08 +01:00
David Ansari 79c12b60bc Use maybe expression instead of messy patterns
This commit is pure refactoring making the code base more maintainable.

Replace rabbit_misc:pipeline/3 with the new OTP 25 experimental maybe
expression because
"Frequent ways in which people work with sequences of failable
operations include folds over lists of functions, and abusing list
comprehensions. Both patterns have heavy weaknesses that makes them less
than ideal."
https://www.erlang.org/eeps/eep-0049#obsoleting-messy-patterns

Additionally, this commit is more restrictive in the type spec of
rabbit_mqtt_processor state fields.
Specifically, many fields were defined to be `undefined | T` where
`undefined` was only temporarily until the first CONNECT packet was
processed by the processor.
It's better to initialise the MQTT processor upon first CONNECT packet
because there is no point in having a processor without having received
any packet.
This allows many type specs in the processor to change from `undefined |
T` to just `T`.
Additionally, memory is saved by removing the `received_connect_packet`
field from the `rabbit_mqtt_reader` and `rabbit_web_mqtt_handler`.
2023-02-07 16:36:08 +01:00
David Ansari 67a428a913 Add API functions to behaviour module
Include API functions to the rabbit_mqtt_retained_msg_store
behaviour module.

"There is a best practice to have the behaviour module include
the API also as it helps other parts of the code to be correct
and a bit more dialyzable."

This commit also fixes a bug where the retainer process had only
60 milliseconds shutdown time before being unconditionally killed.
60 milliseconds can be too short to dump a large ETS table containing
many retained messages to disk.
2023-02-07 15:42:16 +01:00
David Ansari 2d0826c335 Add OAuth 2.0 MQTT system test
Add a test that rabbitmq_auth_backend_oauth2 works with MQTT.

See https://github.com/rabbitmq/rabbitmq-oauth2-tutorial#mqtt-protocol
2023-02-03 14:08:51 +00:00
David Ansari 0706cad7a1 Fix pid tracking when decommissioning an MQTT node
Prior to this commit test `deps.rabbitmq_mqtt.cluster_SUITE`
`connection_id_tracking_with_decommissioned_node` was flaky and sometimes
failed with
```
{cluster_SUITE,connection_id_tracking_with_decommissioned_node,160}
{test_case_failed,failed to match connection count 0}
```
2023-02-02 19:33:29 +00:00
Jean-Sébastien Pédron 9a99480bc9
Merge pull request #6821 from rabbitmq/rabbit-db-modules
Move missing Mnesia-specific code to rabbit_db_* modules
2023-02-02 15:40:11 +01:00
Diana Parra Corbacho 9cf10ed8a7 Unit test rabbit_db_* modules, spec and API updates 2023-02-02 15:01:42 +01:00
Chunyi Lyu 6205e5e5e3 MQTT print stacktrace/payload only at debug level 2023-01-31 10:36:08 +00:00
David Ansari cbb389bb2a Remove MQTT processor field peer_addr
as it seems to always match peer_host.

Commit 7e09b85426 adds peer address
provided by WebMQTT plugin.

However, this seems unnecessary since function rabbit_net:peername/1 on
the unwrapped socket provides the same address.

The peer address was the address of the proxy when the proxy protocol
was enabled.

This commit simplifies code and reduces memory consumption.
2023-01-30 12:17:19 +00:00
David Ansari 02cf072ae4 Restrict MQTT CONNECT packet size
In MQTT 3.1.1, the CONNECT packet consists of
1. 10-byte variable header
2. ClientId (up to 23 bytes must be supported)
3. Will Topic
4. Will Message (maximum length 2^16 bytes)
5. User Name
6. Password

Restricting the CONNECT packet size to 2^16 = 65,536 bytes
seems to be a reasonable default.

The value is configurable via the MQTT app parameter
`max_packet_size_unauthenticated`.

Instead of being called `max_packet_size_connect`, the name
`max_packet_size_unauthenticated` is deliberately generic
because MQTT 5 introduces an AUTH packet type.
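
As a sketch (the application name follows the plugin; the value shown is just an example), the limit can be raised in advanced.config:

```erlang
%% Example advanced.config: raise the pre-authentication MQTT packet
%% size limit from the 65,536-byte default to 128 KiB.
[
 {rabbitmq_mqtt, [
  {max_packet_size_unauthenticated, 131072}
 ]}
].
```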
2023-01-29 15:00:19 +00:00
Chunyi Lyu 209f23fa2f
Revert "Format MQTT code with `erlfmt`" 2023-01-27 18:25:57 +00:00
Chunyi Lyu 1de9fcf582 Format mqtt files with erlfmt 2023-01-27 11:06:41 +00:00
Chunyi Lyu 50e25778bb Adding missing function specs 2023-01-25 19:34:24 +00:00
David Ansari 1f106fcd98 Fix wrong and add missing type specs 2023-01-25 17:13:54 +00:00
David Ansari ec137bc783 Add nested config record for rarely changing fields
When a single field in a record is updated, all remaining
fields' pointers are copied. Hence, if the record is large,
a lot will be copied.

Therefore, put static or rarely changing fields into their own record.

The same was done for the state in rabbit_channel or rabbit_fifo
for example.

Also, merge #info{} record into the new #cfg{} record.
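
A minimal sketch of the idea (record and field names are illustrative, not the actual processor records): since updating one record field copies the pointers of all other fields, rarely changing fields are moved into a nested record that is written once and then only read:

```erlang
%% Illustrative only. Static data goes into #cfg{}, created once at
%% CONNECT time and never updated afterwards.
-record(cfg, {socket,
              proto_ver,
              client_id,
              vhost}).

%% The outer record keeps only the frequently updated fields, so each
%% update copies few words.
-record(state, {cfg,            %% #cfg{}
                unacked_msgs,   %% changes on every QoS 1 publish
                queue_states}).
```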
2023-01-25 15:59:35 +00:00
David Ansari 9c2f5975ea Support tracing in Native MQTT 2023-01-24 17:32:59 +00:00
Chunyi Lyu 6cc65ecceb Export opaque types for event and mqtt_packet state 2023-01-24 17:32:59 +00:00
David Ansari 9db8626abf Re-enable dialyzer option Wunmatched_returns 2023-01-24 17:32:59 +00:00
David Ansari 8a2a82e19b Remove feature flag no_queue_name_in_classic_queue_client
as it was unnecessary to introduce it in the first place.

Remove the queue name from all queue type clients and pass the queue
name to the queue type callbacks that need it.

We have to leave feature flag classic_queue_type_delivery_support
required because we removed the monitor registry
1fd4a6d353/deps/rabbit/src/rabbit_queue_type.erl (L322-L325)

Implements review from Karl:
"rather than changing the message format we could amend the queue type
callbacks involved with the stateful operation to also take the queue
name record as an argument. This way we don't need to maintain the extra
queue name (which uses memory for known but obscurely technical reasons
with how maps work) in the queue type state (as it is used in the queue
type state map as the key)"
2023-01-24 17:32:59 +00:00
David Ansari cd8962b5fd Remove optional rabbit_queue_type callbacks
Instead of having optional rabbit_queue_type callbacks, add stub
implementations to rabbit_mqtt_qos0_queue throwing an exception.
The exception uses erlang:error/2 including stack trace and arguments
of the unsupported functions to ease debugging in case these functions
were ever to be called.

Dialyzer suppressions are added for these functions such that dialyzer
won't complain about:
```
rabbit_mqtt_qos0_queue.erl:244:1: Function init/1 only terminates with explicit exception
```
2023-01-24 17:32:59 +00:00
Chunyi Lyu d86ce70fd6 Add missing type definitions in mqtt records 2023-01-24 17:32:59 +00:00
David Ansari d4cfbddd35 Parse at most maximum packet length of 256MB
"This allows applications to send Control Packets of size up to
268,435,455 (256 MB)."
http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718023
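
The 256 MB limit follows from MQTT's Remaining Length encoding: at most 4 bytes, each carrying 7 value bits plus a continuation flag. A quick arithmetic check (plain Python, not RabbitMQ code):

```python
# MQTT Remaining Length: up to 4 bytes, 7 value bits per byte
# (the high bit of each byte is a continuation flag).
max_remaining_length = sum(0x7F << (7 * i) for i in range(4))
print(max_remaining_length)  # 268435455, i.e. 2**28 - 1
```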
2023-01-24 17:32:59 +00:00
David Ansari 3f85f8d904 Do not log message payload
because it might contain large output.
2023-01-24 17:32:59 +00:00
David Ansari 437cbb73be Use delegate for stateless deliveries
For example when at-most-once dead lettering does a fan out to many
target classic queues this commit will reduce inter-node data traffic by
using delegate.
2023-01-24 17:32:59 +00:00
David Ansari 63ccf3ea3b Reduce inter-node traffic for MQTT QoS 0 queue type
Use delegate.

For large fan-outs with medium to large message size,
this commit will reduce inter-node data traffic by
multiple orders of magnitude preventing busy distribution
ports.
2023-01-24 17:32:59 +00:00
David Ansari a4db85de0d Make pipeline fail when there are dialyzer warnings
We want the build to fail if there are any dialyzer warnings in
rabbitmq_mqtt or rabbitmq_web_mqtt. Otherwise we rely on people manually
executing and checking the results of dialyzer.

Also, we want any test to fail that is flaky.
Flaky tests can indicate subtle errors in either test or program execution.
Instead of marking them as flaky, we should understand and - if possible -
fix the underlying root cause.

Fix OTP 25.0 dialyzer warning

Type gen_server:format_status() is known in OTP 25.2, but not in 25.0
2023-01-24 17:32:59 +00:00
David Ansari dcb00f1324 Reduce test load
Run each test on a RabbitMQ cluster of either size 1 or size 3.
Running a test on both cluster sizes does not result in higher test
coverage.
This puts less pressure on Buildbuddy and reduces overall test
execution time.
2023-01-24 17:32:59 +00:00
David Ansari b5febb8a08 Hold subscriptions in process state
Although the first element (destination queue) of the compound key
in rabbit_reverse_route is provided, scaling tests with millions of
subscribers have shown via timer:tc/1 that the mnesia:match_object/3
query often takes > 100 microseconds, sometimes even a few milliseconds.
So, querying bindings is not super expensive, but moderately expensive
when done many times concurrently.

Especially when mass disconnecting millions of clients, `msacc` showed
that all schedulers were 80% busy in `ets`.

To put less pressure on the CPU, this commit instead opts to slightly
increase memory usage.

When first connecting a client, we only query bindings and cache them in
the process state if a prior session for the same client is present.
Thereafter, bindings are not queried again.
2023-01-24 17:32:59 +00:00
David Ansari f46f0541ea Fix "Clean Session" state for QoS 0 subscriptions
Up to RabbitMQ 3.11, the following bug existed.

The MQTT 3.1.1. protocol spec mandates:
```
The Session state in the Server consists of:
* The Client’s subscriptions.
* ...
```

However, because QoS 0 queues were auto-delete up to 3.11 (or exclusive
prior to this commit), QoS 0 queues and therefore their bindings were
deleted when a non-clean session terminated.

When the same client re-connected, its QoS 0 subscription was lost.
Note that storing **messages** for QoS 0 subscription is optional while the
client is disconnected. However, storing the subscription itself (i.e.
bindings in RabbitMQ terms) is not optional: The client must receive new
messages for its QoS 0 subscriptions after it reconnects without having
to send a SUBSCRIBE packet again.

"After the disconnection of a Session that had CleanSession set to 0,
the Server MUST store further QoS 1 and QoS 2 messages that match any
subscriptions that the client had at the time of disconnection as part
of the Session state [MQTT-3.1.2-5]. It MAY also store QoS 0 messages
that meet the same criteria."
This commit additionally implements the last sentence.
2023-01-24 17:32:59 +00:00
David Ansari 863b7ea16a Include non-AMQP connections in connection count
Prior to this commit:
```
rabbitmqctl status
...
Totals

Connection count: 0
Queue count: 64308
Virtual host count: 1
...
```
only counted AMQP connections, but did not include MQTT or stream
connections.

Let's include the count of all connections in the output of
`rabbitmqctl status`.
2023-01-24 17:32:59 +00:00
David Ansari 35afeffceb Eliminate bindings query CPU bottleneck
Prior to this commit, there was a CPU bottleneck (not present in 3.11.x)
when creating, deleting or disconnecting many MQTT subscribers.

Example:
Add 120 MQTT connections per second each creating a subscription.
Starting at around 300k MQTT subscribers, all 45 CPUs on the server were
maxed out spending time in `ets` according to msacc.

When running a similar workload with only 30k MQTT subscribers on a
local server with only 5 CPUs, all 5 CPUs were maxed out and the CPU
flame graph showed that 86% of the CPU time is spent in function
rabbit_mqtt_processor:topic_names/2.

This commit uses the rabbit_reverse_route table to query MQTT
subscriptions for a given client ID. CPU usage is now drastically lower.

The configured source topic exchange is always the same in the MQTT
plugin. There is however a high cardinality in the destination queues
(MQTT client IDs) and routing keys (topics).
2023-01-24 17:32:59 +00:00
David Ansari a341912b75 Expand clean_session=false test 2023-01-24 17:32:59 +00:00
David Ansari 9283b4f4f6 Add test AMQP 0.9.1 to MQTT with QoS 0 2023-01-24 17:32:59 +00:00
David Ansari fb93a3c17d Block only publishing (Web) MQTT connections
When a cluster wide memory or disk alarm is fired, in AMQP 0.9.1 only
connections that are publishing messages get blocked.
Connections that only consume can continue to empty the queues.

Prior to this commit, all MQTT connections got blocked during a memory
or disk alarm. This has two downsides:
1. MQTT connections that only consume, but never publish, cannot empty
   queues anymore.
2. If the memory or disk alarm takes long, the MQTT client does not
   receive a PINGRESP from the server when it sends a PINGREQ potentially
   leading to mass client disconnection (depending on the MQTT client
   implementation).

This commit makes sure that MQTT connections that never sent a single
PUBLISH packet (e.g. "pure" MQTT subscribers) are not blocked during
memory or disk alarms.

In contrast to AMQP 0.9.1, new connections are still blocked from being
accepted because accepting (many) new MQTT connections also leads to
increased resource usage.

The implementation in this commit is simpler, but also more naive
than the logic in rabbit_reader: rabbit_reader blocks connections more
dynamically whereas rabbit_mqtt_reader and rabbit_web_mqtt_handler
block a connection if the connection ever sent a single PUBLISH packet
during its lifetime.
2023-01-24 17:32:59 +00:00
David Ansari 6ba2dc4afc Switch to Logger macros
Convert from the old rabbit_log* API to the new Logger macros for MQTT
and Web MQTT connections.

Advantages:
* metadata mfa, file, line, pid, gl, time is auto-inserted by Logger.
* Log lines output by the shared module rabbit_mqtt_processor now
  include via the domain whether it's an MQTT or Web MQTT connection.

Instead of using domain [rabbitmq, connection], this commit now uses
the smaller and more specialized domains [rabbitmq, connection, mqtt] and
[rabbitmq, connection, web_mqtt] for MQTT and Web MQTT processes
respectively, resulting in the following example output:
"msg":"Received a CONNECT,", "domain":"rabbitmq.connection.mqtt"
or
"msg":"Received a CONNECT,", "domain":"rabbitmq.connection.web_mqtt"
2023-01-24 17:32:59 +00:00
David Ansari d651f87ea7 Share tests between MQTT and Web MQTT
New test suite deps/rabbitmq_mqtt/test/shared_SUITE contains tests that
are executed against both MQTT and Web MQTT.

This has two major advantages:
1. Eliminates test code duplication between rabbitmq_mqtt and
rabbitmq_web_mqtt making the tests easier to maintain and to understand.
2. Increases test coverage of Web MQTT.

It's acceptable to add a **test** dependency from rabbitmq_mqtt to
rabbitmq_web_mqtt. Obviously, there should be no such dependency
for non-test code.
2023-01-24 17:32:59 +00:00
David Ansari 7c1aa49361 Increase MQTT test coverage and fix edge cases 2023-01-24 17:32:59 +00:00
David Ansari c9df098f5c Handle topic, username, password as binaries
Topic, username, and password are parsed as binaries.
Storing topics as lists or converting between
lists and binaries back and forth several times is
unnecessary and expensive.
2023-01-24 17:32:59 +00:00
David Ansari fb6c8da2fc Block Web MQTT connection if memory or disk alarm
Previously (until RabbitMQ v3.11.x), a memory or disk alarm did
not block the Web MQTT connection because this feature was only
implemented halfway: the part that registers the Web MQTT
connection with rabbit_alarm was missing.
2023-01-24 17:32:59 +00:00
David Ansari a8b69b43c1 Fix dialyzer issues and add function specs
Fix all dialyzer warnings in rabbitmq_mqtt and rabbitmq_web_mqtt.

Add more function specs.
2023-01-24 17:32:58 +00:00
David Ansari 1720aa0e75 Allow CLI listing rabbit_mqtt_qos0_queue queues 2023-01-24 17:30:10 +00:00
David Ansari 56e97a9142 Fix MQTT in management plugin
1. Allow to inspect an (web) MQTT connection.
2. Show MQTT client ID on connection page as part of client_properties.
3. Handle force_event_refresh (when management_plugin gets enabled
   after (web) MQTT connections got created).
4. Reduce code duplication between protocol readers.
5. Display '?' instead of 'NaN' in UI for absent queue metrics.
6. Allow an (web) MQTT connection to be closed via management_plugin.

For 6. this commit takes the same approach as already done for the stream
plugin:
The stream plugin registers neither with {type, network} nor {type,
direct}.
We cannot use gen_server:call/3 anymore to close the connection
because the web MQTT connection cannot handle gen_server calls (only
casts).
Strictly speaking, this commit requires a feature flag to allow to force
closing stream connections from the management plugin during a rolling
update. However, given that this is rather an edge case, and there is a
workaround (connect to the node directly hosting the stream connection),
this commit will not introduce a new feature flag.
2023-01-24 17:30:10 +00:00
Chunyi Lyu cb68e4866e Resolve some dialyzer issues
- mqtt processor publish_to_queue/2 is called in
process_request(?PUBLISH,_, _) and maybe_send_will/3. In
both places #mqtt_msg{} is initialized with a value so it will
never be 'undefined'.
- all possible values are already matched in mqtt_processor
human_readable_vhost_lookup_strategy/1; deleted the unneeded
catch-all function clause.
- Removed an unnecessary case match in mqtt_reader init/1.
Return values for 'rabbit_net:connection_string' are {ok, _} or
{error, _}. {'network_error', Reason} will never match.
- Fix function spec for mqtt_util gen_client_id/1. Return type of
rabbit_misc:base64url is string(), not binary().
2023-01-24 17:30:10 +00:00
Chunyi Lyu 4fa8e830ad Allow undefined in some mqtt record type fields
- to get rid of dialyzer warnings like "Record construction...
violates the declared type of field XYZ"
2023-01-24 17:30:10 +00:00
Chunyi Lyu 340e930d28 Web mqtt returns 1002 with mqtt parsing error
- it is an MQTT protocol error
2023-01-24 17:30:10 +00:00
Chunyi Lyu 4ca12b767a Fix func spec for mqtt process_request
- it also returns {stop, disconnect, state()} when receiving
a disconnect packet
- remove match for a {timeout, _} return when calling register_client.
register_client only returns {ok, _} and {error, _} according to its
function spec
2023-01-24 17:30:10 +00:00
Chunyi Lyu fb913009c4 Add func specs for mqtt process_packet and process_request
- removed return matching for {error, Error} when calling process_packet
because that's not the return type
2023-01-24 17:30:10 +00:00
David Ansari bd0acb33e4 Remove test helper util:connect_to_node/3
because this method is superfluous given that util:connect
already exists.
2023-01-24 17:30:10 +00:00
David Ansari 97fefff0fe Add overflow drop-head to rabbit_mqtt_qos0_queue type
Traditionally, queue types implement flow control by keeping state in
both sending and receiving Erlang processes (for example credit based flow
control keeps the number of credits within the process dictionary).

The rabbit_mqtt_qos0_queue cannot keep such state in sending or receiving
Erlang process because such state would consume a large amount of memory
in case of large fan-ins or large fan-outs.
The whole idea of the rabbit_mqtt_qos0_queue type is to not keep any
state in the rabbit_queue_type client. This makes this new queue
type scale so well.

Therefore the new queue type cannot (easily) implement flow control
throttling individual senders.

In this commit, we take a different approach:
Instead of implementing flow control throttling individual senders,
the receiving MQTT connection process drops QoS 0 messages from the
rabbit_mqtt_qos0_queue if it is overflowed with messages AND its MQTT
client is not capable of receiving messages fast enough.

This is a simple and sufficient solution because it's better to drop QoS
0 (at most once) messages instead of causing cluster-wide memory alarms.

The user can opt out of dropping messages by setting the new env setting
mailbox_soft_limit to 0.

Additionally, we reduce the send_timeout from 30 seconds default in
Ranch to 15 seconds default in MQTT. This will detect hanging MQTT
clients faster causing the MQTT connection to be closed.
2023-01-24 17:30:10 +00:00
Chunyi Lyu de28560d8f Extract connect to node helper in rmq mqtt tests 2023-01-24 17:30:10 +00:00
Chunyi Lyu aea7ff8f8d Use helper to connect to node in mqtt cluster suite 2023-01-24 17:30:10 +00:00
David Ansari 61f6ca7b66 Support iodata() when sending message to MQTT client
When the MQTT connection receives an AMQP 0.9.1 message, it will contain
a list of payload fragments.

This commit avoids the expensive operation of turning that list into a binary.

All I/O methods accept iodata():
* erlang:port_command/2
* ssl:send/2
* In Web MQTT, cowboy websockets accept iodata():
0d04cfffa3/src/cow_ws.erl (L58)
2023-01-24 17:30:10 +00:00
David Ansari 15636fdb90 Rename frame to packet
The MQTT protocol specs define the term "MQTT Control Packet".
The MQTT specs never talk about "frame".

Let's reflect this naming in the source code since things get confusing
otherwise:
Packets belong to MQTT.
Frames belong to AMQP 0.9.1 or web sockets.
2023-01-24 17:30:10 +00:00
David Ansari 3980c28596 Allow higher load on Mnesia by default
Prior to this commit, when connecting or disconnecting many thousands of
MQTT subscribers, RabbitMQ printed many times:
```
[warning] <0.241.0> Mnesia('rabbit@mqtt-rabbit-1-server-0.mqtt-rabbit-1-nodes.default'): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
```

Each MQTT subscription causes queues and bindings to be written into Mnesia.

In order to allow for higher Mnesia load, the user can configure
```
[
 {mnesia,[
  {dump_log_write_threshold, 10000}
 ]}
].
```
in advanced.config

or set this value via
```
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="-mnesia dump_log_write_threshold 10000"
```

The Mnesia default for dump_log_write_threshold is 1,000.
The Mnesia default for dump_log_time_threshold is 180,000 ms.

It is reasonable to increase the default for dump_log_write_threshold from
1,000 to 5,000 and in return decrease the default dump_log_time_threshold
from 3 minutes to 1.5 minutes.
This way, users can achieve higher MQTT scalability by default.

This setting cannot be changed at Mnesia runtime, it needs to be set
before Mnesia gets started.
Since the rabbitmq_mqtt plugin can be enabled dynamically after Mnesia
started, this setting must therefore apply globally to RabbitMQ.

Users can continue to set their own defaults via advanced.config or
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS. They continue to be respected
as shown by the new test suite included in this commit.
2023-01-24 17:30:10 +00:00
David Ansari 86de0a1557 Reduce memory usage of MQTT connection process
by removing the two fields referencing a function:
mqtt2amqp_fun and amqp2mqtt_fun

Each field required 1 word + 11 words for the function reference.

Therefore, for 1 million MQTT connections this commit saves:
    (1+11) * 2 * 1,000,000 words
    = 192 MB of memory
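
The savings estimate can be checked, assuming the usual 8 bytes per word on a 64-bit Erlang VM (plain arithmetic, not RabbitMQ code):

```python
words_per_field = 1 + 11      # field pointer + function reference
fields = 2                    # mqtt2amqp_fun and amqp2mqtt_fun
connections = 1_000_000
bytes_per_word = 8            # 64-bit Erlang VM
saved_bytes = words_per_field * fields * connections * bytes_per_word
print(saved_bytes)  # 192000000, i.e. 192 MB
```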

In addition, the code is now simpler to understand.

There is no change in behaviour except for the sparkplug environment
variable being read upon application start.

We put the compiled regexes into persistent term because they are the
same for all MQTT connections.
2023-01-24 17:30:10 +00:00
David Ansari 7782142020 Reduce memory usage of reader processes
Reduce memory usage of (Web) MQTT connection process and STOMP reader
process by storing the connection name as binary instead of string.

Previously:
82 = erts_debug:size("192.168.2.104:52497 -> 192.168.2.175:1883").

The binary <<"192.168.2.104:52497 -> 192.168.2.175:1883">>
requires 8 words.

So, for 1 million MQTT connections, this commit should save
    (82 - 8) words * 1,000,000
    = 592 MB
of memory.
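
The arithmetic, again assuming 8 bytes per word on a 64-bit Erlang VM (plain arithmetic, not RabbitMQ code):

```python
string_words = 82   # erts_debug:size/1 of the connection name string
binary_words = 8    # heap words for the equivalent binary
connections = 1_000_000
bytes_per_word = 8  # 64-bit Erlang VM
saved_bytes = (string_words - binary_words) * connections * bytes_per_word
print(saved_bytes)  # 592000000, i.e. 592 MB
```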
2023-01-24 17:30:10 +00:00
Chunyi Lyu 46e8a65d96 Check if state.stats_timer is undefined to avoid crashing
- if #state.stats_timer is undefined, rabbit_event:if_enabled crashes
- remove compression related TODO from web_mqtt. It's a intentional
default behavior set in: https://github.com/rabbitmq/rabbitmq-web-mqtt/pull/35
2023-01-24 17:30:10 +00:00
David Ansari f842ffd250 Add feature flag rabbit_mqtt_qos0_queue
Routing to a queue of type rabbit_mqtt_qos0_queue hosted on
a remote node requires knowledge of that queue type on the
local node.
2023-01-24 17:30:10 +00:00
Chunyi Lyu 30a9ea521e Use connect helper func in more mqtt tests
- reduce code duplication
- connect helper does not unlink the connection process by default
2023-01-24 17:30:10 +00:00
David Ansari 1493cbe13d Rename message_id to packet_id
MQTT spec only talks about "Packet Identifier",
but never about "Message Identifier".

RabbitMQ has message identifiers (for example the classic queue store
uses message identifiers to uniquely identify internal messages).

So, let's not confuse these two terms and be specific.
2023-01-24 17:30:10 +00:00
David Ansari 7bc8208a1b Remove local record definitions from header files
Record #state{} is purely local to rabbit_mqtt_reader.
Record #proc_state{} is purely local to rabbit_mqtt_processor.

Therefore, move these record definitions to the defining module.
This avoids unnecessarily exposing internal information.

Now, that #proc_state{} is defined in rabbit_mqtt_processor,
rename #proc_state to #state{}.
2023-01-24 17:30:10 +00:00
David Ansari 6815ceb54a Fix mixed version reader_SUITE will test 2023-01-24 17:30:10 +00:00
David Ansari 76f4598d92 Send last will if client did not DISCONNECT
"The Will Message MUST be published when the Network Connection is subsequently
closed unless the Will Message has been deleted by the Server on receipt of a
DISCONNECT Packet [MQTT-3.1.2-8].
Situations in which the Will Message is published include, but are not limited to:
•	An I/O error or network failure detected by the Server.
•	The Client fails to communicate within the Keep Alive time.
•	The Client closes the Network Connection without first sending a DISCONNECT Packet.
•	The Server closes the Network Connection because of a protocol error.
"

Prior to this commit, the will message was not sent in all scenarios
where it should have been sent.

In this commit, the will message is always sent unless the client sent a
DISCONNECT packet to the server.

We achieve this by sending the will message in the terminate callback.

Note that the Reason passed into the terminate callback of
rabbit_web_mqtt_handler is the atom 'stop' (that is, we cannot pass a custom
reason here).

Therefore, in order to know within the terminate callback whether the client
sent a DISCONNECT packet, we have to modify the process state.
Rather than including a new field into the process state record which requires
1 additional word per MQTT connection (i.e. expensive with millions of
MQTT connection processes - we want to keep the process state small),
we instead modify the state just before stopping the process to
{SendWill, State}.
2023-01-24 17:30:10 +00:00
David Ansari 14b3b93b25 Make ff_SUITE less flaky 2023-01-24 17:30:10 +00:00
David Ansari 65bc0c395b Fix global counters
Prior to this commit messages_delivered for queue_type_qos0 is
wrongfully incremented if clean session is false

Also, delete duplicate code.
2023-01-24 17:30:10 +00:00
Chunyi Lyu 075bc06623 Handle messages_delivered_consume_*_ack counters at delivery
- increment messages_delivered_consume_auto_ack if subscribe queue
is type mqtt_qos0 queue or if publishing QoS is 0
- increment messages_delivered_consume_manual_ack if both publishing
and subscribing are QoS 1
- increment messages_acknowledged at queue_type:settle()
2023-01-24 17:30:10 +00:00
David Ansari 16fa12244e Avoid exceptions in mixed version cluster
1. Avoid following exceptions in mixed version clusters when new MQTT
   connections are created:
```
{{exception,{undef,[{rabbit_mqtt_util,remove_duplicate_clientid_connections,
                                      [{<<"/">>,
                                        <<"publish_to_all_queue_types">>},
                                       <0.1447.0>],
                                      []}]}},
 [{erpc,execute_cast,3,[{file,"erpc.erl"},{line,621}]}]}
```
If feature flag delete_ra_cluster_mqtt_node is disabled, let's still
populate pg with MQTT client IDs such that we don't have to migrate them
from the Ra cluster to pg when we enable the feature flag.
However, for actually closing duplicate MQTT client ID connections, if
that feature flag is disabled, let's rely on the Ra cluster to take care
of it.

2. Write a test ensuring the QoS responses are in the right order when a
   single SUBSCRIBE packet contains multiple subscriptions.
2023-01-24 17:30:10 +00:00
David Ansari 573934259a Consume from queue once
Each MQTT connection consumes from its queue at most once
(unless when failing over).
2023-01-24 17:30:10 +00:00
Chunyi Lyu c3779d9996 Implement message consuming counters in mqtt 2023-01-24 17:30:10 +00:00
David Ansari 6e527fb940 Replace existing subscription
"If a Server receives a SUBSCRIBE Packet containing a Topic Filter
that is identical to an existing Subscription’s Topic Filter then
it MUST completely replace that existing Subscription with a new
Subscription. The Topic Filter in the new Subscription will be
identical to that in the previous Subscription, although its
maximum QoS value could be different. Any existing retained
messages matching the Topic Filter MUST be re-sent, but the flow
of publications MUST NOT be interrupted [MQTT-3.8.4-3]."
2023-01-24 17:30:10 +00:00
David Ansari 0ba0a6e8f8 Several small improvements
1. The mqtt_qos0 queue type uses now QName in the delivery.
This makes the code simpler although it might be a bit less efficient
because the tuple containing binaries is sent around and a hash is
computed within rabbit_queue_type:module/2

2. Do not construct a new binary on every PUBACK. This is expensive with
   many PUBACKs per second. Instead, we store the QoS1 queue name in the
   process state (but only if the connection also consumes from that
   queue).

3. To make the code more readable, and less specialised, always handle
   queue actions when we call rabbit_queue_type:settle/5.
   This method only returns an action (delivery) when settling to the stream
   queue, which the MQTT plugin never does because an MQTT connection
   does not consume from a stream. It's not expensive at all to handle
   an empty list of queue actions.
2023-01-24 17:30:10 +00:00
David Ansari b2c87c59a0 Minor reformatting and renaming 2023-01-24 17:30:10 +00:00
David Ansari e06d3e7069 Unblock queue when it gets deleted
When a quorum queue or stream gets deleted while the MQTT connection
process (or channel) is blocked by that deleted queue due to soft limit
being exceeded, unblock that queue.

In this commit, an unblock action is also returned with the eol.
2023-01-24 17:30:09 +00:00
Chunyi Lyu 80f8e0754f Implement consumer global counter for clean sess false
- remove has_subs from proc state; query the database to check
if a connection has subscription or not
2023-01-24 17:29:08 +00:00
David Ansari 016451ee87 Reset application env in MQTT flow tests
so that they run independently of other tests
2023-01-24 17:29:07 +00:00
Chunyi Lyu 0b43f002f5 Remove subscriptions map from proc state in mqtt
- subscriptions information can be retrieved directly from mnesia
- when unsubscribing, we check if there is a binding between topic name
and queue (check for both qos0 queue name and qos1 queue name) to
unbind
- added a boolean value has_subs in proc state which will indicate
if connection has any active subscriptions. Used for setting consumer
global counter
2023-01-24 17:29:07 +00:00
David Ansari aad7e1cdf6 Add test for consuming MQTT classic queue going down 2023-01-24 17:29:07 +00:00
David Ansari bda52dbf64 Support consuming classic mirrored queue failover
Some users use classic mirrored queues for MQTT queues by
applying a policy.

Given that classic mirrored queues are deprecated, but still supported
in RabbitMQ 3.x, native MQTT must support classic mirrored queues.
2023-01-24 17:29:07 +00:00
David Ansari b97006c4b9 Output username in connection closed event 2023-01-24 17:29:07 +00:00
David Ansari 16b5ec5659 Add missing unblock stream queue action
Test flow control for all queue types.
2023-01-24 17:29:07 +00:00
David Ansari 7fc2234117 Test ETS and NOOP retained message stores 2023-01-24 17:29:07 +00:00
David Ansari 6533532039 Simplify counters
by storing mqtt310 and mqtt311 atoms directly in the processor state.
2023-01-24 17:29:07 +00:00
David Ansari ab5007a53b Handle queue deletion
Handle deletion of queues correctly that an MQTT connection is
publishing to with QoS 1.
2023-01-24 17:29:07 +00:00
Chunyi Lyu de984d026b Subs from 1 connection counts as 1 consumer in global counter
- rename proc state isPublisher to has_published
- create macro for v3 and v4 mqtt protocol name for global
counters
- sub groups in integration suite
2023-01-24 17:29:07 +00:00
David Ansari 38e5e20bb8 Add tests 2023-01-24 17:29:07 +00:00
Chunyi Lyu 17c5dffe7a Set common global counters for mqtt
- protocols are set to mqtt310 or mqtt311 depending
on the protocol version set by the client connection
- added a boolean isPublisher to the proc state to track
whether the connection was ever used to publish. This is used to
set the publisher_create and publisher_delete global counters.
- tests added in integration_SUITE
2023-01-24 17:29:07 +00:00
David Ansari 9fd5704e30 Fix mixed version Web MQTT system tests
In mixed version tests all non-required feature flags are disabled.
Therefore, MQTT client ID tracking happens in Ra.
The Ra client sends commands asynchronously when registering and
deregistering the client ID.

Also, add more tests.
2023-01-24 17:29:07 +00:00
Chunyi Lyu 96854a8c4c Use emqtt:publish in mqtt tests
- rename publish_qos1 to publish_qos1_timeout
since it's only used for handling publisher timeouts
more gracefully in tests
2023-01-24 17:29:07 +00:00
David Ansari 319af3872e Handle duplicate packet IDs
"If a Client re-sends a particular Control Packet, then it MUST use the
same Packet Identifier in subsequent re-sends of that packet."

A client can re-send a PUBLISH packet with the same packet ID.
If the MQTT connection process already received the original packet and
sent it to destination queues, it will ignore this re-send.

The packet ID will be acked to the publisher once a confirmation from
all target queues is received.

There should be no risk of "stuck" messages within the MQTT connection
process because quorum and stream queue clients re-send the message and
classic queues will send a monitor DOWN message in case they are down.
2023-01-24 17:29:07 +00:00
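The de-duplication described in the commit above can be sketched roughly as follows. This is an illustrative Python sketch with invented names (the actual implementation is Erlang inside the MQTT connection process), showing a re-sent PUBLISH with a pending packet ID being ignored and the PUBACK being sent only once all target queues have confirmed:

```python
class PacketIdTracker:
    def __init__(self):
        # packet_id -> set of queue names whose confirms are outstanding
        self.pending = {}

    def on_publish(self, packet_id, target_queues):
        if packet_id in self.pending:
            return False  # duplicate re-send: already routed, ignore it
        self.pending[packet_id] = set(target_queues)
        return True  # first delivery: route to the target queues

    def on_queue_confirm(self, packet_id, queue):
        queues = self.pending.get(packet_id)
        if queues is None:
            return False
        queues.discard(queue)
        if not queues:  # all target queues confirmed
            del self.pending[packet_id]
            return True  # ack the packet ID back to the publisher now
        return False
```

Once the entry is removed, a later PUBLISH reusing the same packet ID is treated as a new message, matching the MQTT rule that packet IDs become reusable after acknowledgement.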
David Ansari 4c15299196 Delete old emqttc client
Instead use latest emqtt client for Web MQTT tests.
2023-01-24 17:29:07 +00:00
Chunyi Lyu 645531bc95 Register mqtt connections in case of event refresh 2023-01-24 17:29:07 +00:00
David Ansari 14f59f1380 Handle soft limit exceeded as queue action
Instead of performing credit_flow within quorum queue and stream queue
clients, return new {block | unblock, QueueName} actions.

The queue client process can then decide what to do.

For example, the channel continues to use credit_flow such that the
channel gets blocked sending any more credits to rabbit_reader.

However, the MQTT connection process does not use credit_flow. It
instead blocks its reader directly.
2023-01-24 17:29:07 +00:00
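The action-based flow control above can be sketched like this. A hypothetical Python sketch (names invented; the real code is Erlang and the actions are `{block, QueueName}` / `{unblock, QueueName}` tuples), showing the queue client surfacing actions and the owning process deciding how to apply back-pressure:

```python
class MqttConnection:
    def __init__(self):
        self.soft_limited = set()  # queues that asked us to slow down
        self.reader_blocked = False

    def handle_queue_action(self, action, queue_name):
        if action == "block":
            self.soft_limited.add(queue_name)
        elif action == "unblock":
            self.soft_limited.discard(queue_name)
        # Unlike the channel (which uses credit_flow), the MQTT connection
        # blocks its reader directly while any queue is over its soft limit.
        self.reader_blocked = bool(self.soft_limited)
```

The point of the design is that the queue clients stay policy-free: each owning process (channel or MQTT connection) interprets the same actions differently.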
David Ansari 816fedf080 Enable flow control to target classic queue 2023-01-24 17:29:07 +00:00
David Ansari 33bf2150a5 Add test for publishing via MQTT to different queue types 2023-01-24 17:29:07 +00:00
Chunyi Lyu 8126925617 Implement format_status for mqtt reader
- truncate the queue type state from the mqtt proc_state, which
could be huge with many destination queues. Instead, format_status
now returns the number of destination queues.
2023-01-24 17:29:07 +00:00
David Ansari 627ea8588a Add rabbit_event tests for MQTT
Add tests that MQTT plugin sends correct events to rabbit_event.

Add event connection_closed.
2023-01-24 17:29:07 +00:00
Chunyi Lyu b74dea4435 Send rabbit event declaring mqtt_qos0 queue 2023-01-24 17:29:07 +00:00
David Ansari 07ad410d81 Skip queue when MQTT QoS 0
This commit allows for huge fanouts if the MQTT subscriber connects with
clean_session = true and QoS 0. Messages are not sent to a conventional queue.
Instead, messages are forwarded directly from MQTT publisher connection
process or channel to MQTT subscriber connection process.
So, the queue process is skipped.

The MQTT subscriber connection process acts as the queue process.
Its mailbox is a superset of the queue. This new queue type is called
rabbit_mqtt_qos0_queue.
Given that the only current use case is MQTT, this queue type is
currently defined in the MQTT plugin.
The rabbit app is not aware that this new queue type exists.

The new queue gets persisted as any other queue such that routing via
the topic exchange continues to work as usual. This allows routing
across different protocols without any additional changes, e.g. huge
fanout from AMQP client (or management UI) to all MQTT devices.

The main benefit is that memory usage of the publishing process is kept at
0 MB once garbage collection kicked in (when hibernating the gen_server).
This is achieved by having this queue type's client not maintain any
state. Previously, without this new queue type, the publisher process
maintained state of 500MB to all the 1 million destination queues even
long after stopping sending messages to these queues.

Another big benefit is that no queue processes need to be created.
Prior to this commit, with 1 million MQTT subscribers, 3 million Erlang
processes got created: 1 million MQTT connection processes, 1 million classic
queue processes, and 1 million classic queue supervisor processes.
After this commit, only the 1 million MQTT connection processes get
created. Hence, a few GBs of process memory will be saved.

Yet another big benefit is that because the new queue type's client
auto-settles the delivery when sending, the publishing process only
awaits confirmation from queues that potentially have at-least-once
consumers. So, the publishing process is not blocked on sending the
confirm back to the publisher if, say, 1 message is routed to 1
million MQTT QoS 0 subscribers while 1 copy is routed to an important
quorum queue or stream and a single one of the million MQTT
connection processes is down.

Other benefits include:
* Lower publisher confirm latency
* Reduced inter-node network traffic

In a certain sense, this commit allows RabbitMQ to act as a high scale
and high throughput MQTT router (that obviously can lose messages at any
time given the QoS is 0).
For example, it allows use cases such as using RabbitMQ to send messages cheaply
and quickly to 1 million devices that happen to be online at the given
time: e.g. send a notification to any online mobile device.
2023-01-24 17:29:07 +00:00
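The core idea of rabbit_mqtt_qos0_queue can be sketched in a few lines. This is a conceptual Python sketch with assumed names (the real queue type is an Erlang behaviour and the "mailbox" is the subscriber connection process's mailbox): publishing is a stateless fire-and-forget send, so the publisher keeps no per-queue client state and no queue process exists:

```python
import queue

class Qos0Subscriber:
    def __init__(self):
        # Stands in for the Erlang process mailbox of the MQTT
        # subscriber connection, which acts as the queue itself.
        self.mailbox = queue.Queue()

def publish(subscribers, message):
    # No queue process, no publisher-side client state, delivery
    # auto-settled: losing a subscriber loses that copy, which is
    # acceptable at QoS 0.
    for sub in subscribers:
        sub.mailbox.put(message)
```

Because the client side holds no state, the publisher's memory stays flat regardless of fanout, which is the 500 MB-to-0 MB improvement the commit message describes.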
David Ansari af68fb4484 Decrease memory usage of queue_type state
Prior to this commit, 1 MQTT publisher publishing to 1 Million target
classic queues requires around 680 MB of process memory.

After this commit, it requires around 290 MB of process memory.

This commit requires feature flag classic_queue_type_delivery_support
and introduces a new one called no_queue_name_in_classic_queue_client.

Instead of storing the binary queue name 4 times, this commit now stores
it only 1 time.

The monitor_registry is removed since only classic queue clients monitor
their classic queue server processes.

The classic queue client does not store the queue name anymore. Instead
the queue name is included in messages handled by the classic queue
client.

Storing the queue name in the record ctx was unnecessary.

More potential future memory optimisations:
* When routing to destination queues, looking up the queue record,
  delivering to queue: Use streaming / batching instead of fetching all
  at once
* Only fetch ETS columns that are necessary instead of whole queue
  records
* Do not hold the same vhost binary in memory many times. Instead,
  maintain a mapping.
* Remove unnecessary tuple fields.
2023-01-24 17:29:07 +00:00
David Ansari 4b1c2c870b Emit cluster-wide MQTT connection infos
When listing MQTT connections with the CLI, whether feature flag
delete_ra_cluster_mqtt_node is enabled or disabled, in both cases
return cluster wide MQTT connections.

If connection tracking is done in Ra, the CLI target node returns all
connection infos because Ra is aware of all MQTT connections.

If connection tracking is done in (local-only) pg, all nodes return
their local MQTT connection infos.
2023-01-24 17:29:07 +00:00
David Ansari 3e28a52066 Convert rabbit_mqtt_reader from gen_server2 to gen_server
There is no need to use gen_server2.

gen_server2 requires lots of memory with millions of MQTT connections
because it creates 1 entry per connection into ETS table
'gen_server2_metrics'.

Instead of using handle_pre_hibernate, erasing the permission cache
is now done by using a timeout.

We do not need a hibernate backoff feature; simply hibernate after 1
second.

It's okay for MQTT connection processes to hibernate because they
usually send data rather rarely (compared to AMQP connections).
2023-01-24 17:29:07 +00:00
David Ansari 199238d76e Use pg to track MQTT client IDs
Instead of tracking {Vhost, ClientId} to ConnectionPid mappings in our
custom process registry, i.e. custom local ETS table with a custom
gen_server process managing that ETS table, this commit uses the pg module
because pg is better tested.

To save memory with millions of MQTT client connections, we want to save
the mappings only locally on the node where the connection resides and
therefore not replicate them across all nodes.

According to Maxim Fedorov:
"The easy way to provide per-node unique pg scope is to start it like
pg:start_link(node()). At least that's what we've been doing to have
node-local scopes. It will still try to discover scopes on nodeup from
nodes joining the cluster, but since you cannot have nodes with the
same name in one cluster, using node() for local-only scopes worked
well for us."

So that's what we're doing in this commit.
2023-01-24 17:29:07 +00:00
David Ansari ab8957ba9c Use best-effort client ID tracking
"Each Client connecting to the Server has a unique ClientId"

"If the ClientId represents a Client already connected to
the Server then the Server MUST disconnect the existing
Client [MQTT-3.1.4-2]."

Instead of tracking client IDs via Raft, we use local ETS tables in this
commit.

Previous tracking of client IDs via Raft:
(+) consistency (does the right thing)
(-) state of Ra process becomes large > 1GB with many (> 1 Million) MQTT clients
(-) Ra process becomes a bottleneck when many MQTT clients (e.g. 300k)
    disconnect at the same time because monitor (DOWN) Ra commands get
    written, resulting in Ra machine timeouts.
(-) if we need consistency, we ideally want a single source of truth,
    e.g. only Mnesia, or only Khepri (but not Mnesia + MQTT ra process)

While above downsides could be fixed (e.g. avoiding DOWN commands by
instead doing periodic cleanups of client ID entries using session interval
in MQTT 5 or using subscription_ttl parameter in current RabbitMQ MQTT config),
in this case we do not necessarily need the consistency guarantees Raft provides.

In this commit, we try to comply with [MQTT-3.1.4-2] on a best-effort
basis: If there are no network failures and no messages get lost,
existing clients with duplicate client IDs get disconnected.

In the presence of network failures / lost messages, two clients with
the same client ID can end up publishing or receiving from the same
queue. Arguably, that's acceptable and less bad than the scaling
issues we experience when we want stronger consistency.

Note that it is also the responsibility of the client to not connect
twice with the same client ID.

This commit also ensures that the client ID is a binary to save memory.

A new feature flag is introduced, which when enabled, deletes the Ra
cluster named 'mqtt_node'.

Independent of that feature flag, client IDs are tracked locally in ETS
tables.
If that feature flag is disabled, client IDs are additionally tracked in
Ra.

The feature flag is required such that clients can continue to connect
to all nodes except for the node being updated in a rolling update.

This commit also fixes a bug where previously all MQTT connections were
cluster-wide closed when one RabbitMQ node was put into maintenance
mode.
2023-01-24 17:29:07 +00:00
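The best-effort enforcement of [MQTT-3.1.4-2] described above can be sketched as follows. A Python sketch with invented names (the real registry is a node-local ETS table plus pg): a new connection with the same (vhost, client ID) asks the existing one to disconnect, which works when no messages are lost but is not a consensus-backed guarantee:

```python
class ClientIdRegistry:
    def __init__(self):
        self.by_id = {}  # (vhost, client_id) -> connection

    def register(self, vhost, client_id, conn):
        key = (vhost, client_id)
        old = self.by_id.get(key)
        if old is not None and old is not conn:
            # Best effort: tell the existing client to go away.
            old.disconnect("duplicate client ID")
        self.by_id[key] = conn

    def unregister(self, vhost, client_id, conn):
        # Only remove the entry if it still points at this connection,
        # so a late unregister from a replaced connection is a no-op.
        if self.by_id.get((vhost, client_id)) is conn:
            del self.by_id[(vhost, client_id)]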
David Ansari 43bd548dcc Handle deprecated classic queue delivery
when feature flag classic_queue_type_delivery_support is disabled.
2023-01-24 17:29:07 +00:00
David Ansari 5710a9474a Support MQTT Keepalive in WebMQTT
Share the same MQTT keepalive code between rabbit_mqtt_reader and
rabbit_web_mqtt_handler.

Add MQTT keepalive test in both plugins rabbitmq_mqtt and
rabbitmq_web_mqtt.
2023-01-24 17:29:07 +00:00
David Ansari 6f00ccb3ad Get all existing rabbitmq_web_mqtt tests green 2023-01-24 17:29:07 +00:00
David Ansari a02cbb73a1 Get all existing rabbitmq_mqtt tests green 2023-01-24 17:29:07 +00:00
David Ansari 23dac495ad Support QoS 1 for sending and receiving 2023-01-24 17:29:07 +00:00
David Ansari cdd253ee87 Receive many messages from classic queue
Before this commit, a consumer from a classic queue was receiving max
200 messages:
bb5d6263c9/deps/rabbit/src/rabbit_queue_consumers.erl (L24)

MQTT consumer process must give credit to classic queue process
due to internal flow control.
2023-01-24 17:29:07 +00:00
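The credit mechanism mentioned above can be sketched like this. A toy Python sketch with invented names (RabbitMQ's internal credit_flow is Erlang message passing between the consumer and the classic queue process): the queue may send at most the granted credit, and the consumer grants more as it processes deliveries, which is why a consumer that never granted credit stalled at the initial allowance:

```python
class CreditedConsumer:
    def __init__(self, initial_credit=200):
        self.credit = initial_credit  # initial allowance from the consumer
        self.received = []

    def deliver(self, msg):
        if self.credit <= 0:
            return False  # queue must wait until more credit is granted
        self.credit -= 1
        self.received.append(msg)
        return True

    def ack_and_grant(self, more=200):
        # Called as the consumer works off deliveries.
        self.credit += more
```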
David Ansari 99337b84d3 Emit stats
'connection' field is not needed anymore because it was
previously the internal AMQP connection PID
2023-01-24 17:29:07 +00:00
David Ansari 218ee196c4 Make proxy_protocol tests green 2023-01-24 17:29:07 +00:00
David Ansari 77da78f478 Get most auth_SUITE tests green
Some tests which require clean_start=false
or QoS1 are skipped for now.

Differentiate between v3 and v4:
v4 allows for an error code in SUBACK frame.
2023-01-24 17:29:07 +00:00
David Ansari 73ad3bafe7 Revert maybe expression
rabbit_misc:pipeline looks better and doesn't require experimental
feature
2023-01-24 17:29:07 +00:00
David Ansari f4d1f68212 Move authn / authz into rabbitmq_mqtt 2023-01-24 17:29:07 +00:00
David Ansari eac0622f37 Consume with QoS0 via queue_type interface 2023-01-24 17:29:07 +00:00
David Ansari 24b0a6bcb2 Publish with QoS0 via queue_type interface 2023-01-24 17:29:07 +00:00
David Ansari 8710565b2a Use 1 instead of 22 Erlang processes per MQTT connection
* Create MQTT connections without proxying via AMQP
* Do authn / authz in rabbitmq_mqtt instead of rabbit_direct:connect/5
* Remove rabbit_heartbeat process and per connection supervisors

Current status:

Creating 10k MQTT connections with clean session succeeds:
./emqtt_bench conn -V 4 -C true -c 10000 -R 500
2023-01-24 17:29:07 +00:00
Michael Klishin 3bfba02281
Merge pull request #6919 from rabbitmq/rin/rework-elixir-dialyze
Rework elixir dialyze
2023-01-21 12:11:02 -06:00
Rin Kuryloski b84e746ee9 Rework plt/dialyze for rabbitmqctl and plugins that depend on it
This allows us to stop ignoring undefined callback warnings

When mix compiles rabbitmqctl, it produces a 'consolidated' directory
alongside the 'ebin' dir. Some of the modules in consolidated are
intended to be used instead of those provided by elixir. We now handle
the conflicts properly in the bazel build.
2023-01-19 17:29:23 +01:00
Alexey Lebedeff b6cd708a08 Fix all dialyzer warnings in rabbitmq_web_mqtt 2023-01-19 17:23:23 +01:00
Rin Kuryloski 5ef8923462 Avoid the need to pass package name to rabbitmq_integration_suite 2023-01-18 15:25:27 +01:00
Rin Kuryloski a317b30807 Use improved assert_suites2 macro from rules_erlang 3.9.0 2023-01-18 15:07:06 +01:00
Michael Klishin ec4f1dba7d
(c) year bump: 2022 => 2023 2023-01-01 23:17:36 -05:00
Jean-Sébastien Pédron 15d9cdea61
Call `rabbit:data_dir/0` instead of `rabbit_mnesia:dir/0`
This is a follow-up commit to the parent commit. To quote part of the
parent commit's message:

> Historically, this was the Mnesia directory. But semantically, this
> should be the reverse: RabbitMQ owns the data directory and Mnesia is
> configured to put its files there too.

Now all subsystems call `rabbit:data_dir/0`. They are not tied to Mnesia
anymore.
2022-11-30 14:41:32 +01:00
David Ansari 5bf8192982 Support code coverage
Previously it was not possible to see code coverage for the majority of
test cases: integration tests that create RabbitMQ nodes.
It was only possible to see code coverage for unit tests.
This commit makes it possible to see code coverage for tests that create
RabbitMQ nodes.

The only thing you need to do is set the `COVER` variable, for example
```
make -C deps/rabbitmq_mqtt ct COVER=1
```
will show you coverage across all tests in the MQTT plugin.

Whenever a RabbitMQ node is started `ct_cover:add_nodes/1` is called.
Contrary to the documentation which states

> To have effect, this function is to be called from init_per_suite/1 (see common_test) before any tests are performed.

I found that it also works in init_per_group/1 or even within the test cases themselves.

Whenever a RabbitMQ node is stopped or killed `ct_cover:remove_nodes/1`
is called to transfer results from the RabbitMQ node to the CT node.

Since the erlang.mk file writes a file called `test/ct.cover.spec`
including the line:
```
{export,".../rabbitmq-server/deps/rabbitmq_mqtt/cover/ct.coverdata"}.
```
results across all test suites will be accumulated in that file.

The accumulated result can be seen through the link `Coverage log` on the test suite result pages.
2022-11-10 15:04:31 +01:00
David Ansari 694501b923 Close local MQTT connections when draining node
When a node gets drained (i.e. goes into maintenance mode), only local
connections should be terminated.

However, prior to this commit, all MQTT connections got terminated
cluster-wide when a single node was drained.
2022-10-13 11:14:01 +00:00
Luke Bakken 7fe159edef
Yolo-replace format strings
Replaces `~s` and `~p` with their unicode-friendly counterparts.

```
git ls-files *.erl | xargs sed -i.ORIG -e s/~s>/~ts/g -e s/~p>/~tp/g
```
2022-10-10 10:32:03 +04:00
Michal Kuratczyk 2855278034
Migrate from supervisor2 to supervisor 2022-09-27 13:53:06 +02:00
David Ansari 307e6730cc Point emqtt test dependency from ansd to emqx
Given that https://github.com/emqx/emqtt/pull/169 has been merged
and a new tag has been set on emqx/emqtt,
we do not need the fork ansd/emqtt anymore.
2022-09-21 19:07:25 +02:00
Péter Gömöri c4b7cd98bf Add login_timeout to mqtt and stomp reader
Similarly to handshake_timeout in amqp reader.
2022-09-12 17:48:48 +02:00
David Ansari b953b0f10e Stop sending stats to rabbit_event
Stop sending connection_stats from protocol readers to rabbit_event.
Stop sending queue_stats from queues to rabbit_event.
Sending these stats every 5 seconds to the event manager process is
superfluous because no one handles these events.

They seem to be a relic from before the rabbit_core_metrics ETS tables got
introduced in 2016.

Delete test head_message_timestamp_statistics because it tests that
head_message_timestamp is set correctly in queue_stats events
although queue_stats events are used nowhere.
The functionality of head_message_timestamp itself is still tested in
deps/rabbit/test/priority_queue_SUITE.erl and
deps/rabbit/test/temp/head_message_timestamp_tests.py
2022-09-09 10:52:38 +00:00
David Ansari 4c997f84bd Fix MQTT protocol version in MQTT tests
This is a follow-up commit of https://github.com/rabbitmq/rabbitmq-server/pull/5693

The allowed values of emqtt client library are:
```
{proto_ver, v3 | v4 | v5}
```

Therefore, `{proto_ver, 3}` did not have any effect and used the default
protocol version v4.

Let's fix the misleading version in our tests and be explicit that
we use v4.
2022-09-06 10:11:53 +02:00
David Ansari 1c96bf1315 Point emqtt test dependency to a tree reference
Since I force pushed to master branch of
https://github.com/ansd/emqtt, the old commit does
not belong to any branch anymore.

While Bazel is happy, make complains:
```
make -C deps/rabbitmq_mqtt ct
 DEP    emqtt (f6d7ddd391890f4db5f77c775e83cf0ffe3d2d76)
fatal: reference is not a tree: f6d7ddd391890f4db5f77c775e83cf0ffe3d2d76
```
2022-09-02 14:03:26 +00:00
David Ansari ac2a5d3dd3 Upgrade MQTT Erlang client
The rabbitmq_mqtt tests used an outdated MQTT Erlang client.
It was a fork that has not been updated for > 4 years.
This commit upgrades the client to the latest version.
Therefore, we can delete our fork https://github.com/rabbitmq/emqttc.git
2022-08-31 14:12:23 +00:00
David Ansari 49ed70900e Fix failing proxy_protocol test
Prior to this commit, test
```
make -C deps/rabbitmq_web_mqtt ct-proxy_protocol t=http_tests:proxy_protocol
```

was failing with reason
```
exception error: no function clause matching
                 rabbit_net:sockname({rabbit_proxy_socket,#Port<0.96>,
```
2022-08-25 20:00:49 +02:00
David Ansari 28db862d56 Avoid crash when client disconnects before server handles MQTT CONNECT
In case of a resource alarm, the server accepts incoming TCP
connections, but does not read from the socket.
When a client connects during a resource alarm, the MQTT CONNECT frame
is therefore not processed.

While the resource alarm is ongoing, the client might time out waiting
on a CONNACK MQTT packet.

When the resource alarm clears on the server, the MQTT CONNECT frame
gets processed.

Prior to this commit, this results in the following crash on the server:
```
** Reason for termination ==
** {{badmatch,{error,einval}},
    [{rabbit_mqtt_processor,process_login,4,
                            [{file,"rabbit_mqtt_processor.erl"},{line,585}]},
     {rabbit_mqtt_processor,process_request,3,
                            [{file,"rabbit_mqtt_processor.erl"},{line,143}]},
     {rabbit_mqtt_processor,process_frame,2,
                            [{file,"rabbit_mqtt_processor.erl"},{line,69}]},
     {rabbit_mqtt_reader,process_received_bytes,2,
                         [{file,"src/rabbit_mqtt_reader.erl"},{line,307}]},
```

After this commit, the server just logs:
```
[error] <0.887.0> MQTT protocol error on connection 127.0.0.1:55725 -> 127.0.0.1:1883: peername_not_known
```

In case the client already disconnected, we want the server to bail out
early, i.e. not authenticate and register the client at all
since that can be expensive when many clients connected while the
resource alarm was ongoing.

To detect whether the client disconnected, we rely on inet:peername/1
which will return an error when the peer is not connected anymore.

Ideally we could use some better mechanism for detecting whether the
client disconnected.

The MQTT reader does receive a {tcp_closed, Socket} message once the
socket becomes active. However, we don't really want to read frames
ahead (i.e. ahead of the received CONNECT frame), one reason being that:
"Clients are allowed to send further Control Packets immediately
after sending a CONNECT Packet; Clients need not wait for a CONNACK Packet
to arrive from the Server."

Setting socket option `show_econnreset` does not help either because the client
closes the connection normally.

Co-authored-by: Péter Gömöri @gomoripeti
2022-08-25 18:42:37 +02:00
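The disconnect check described in the commit above relies on the peer-name lookup failing for a socket that is no longer connected. A minimal Python sketch of the same idea (the real code uses Erlang's inet:peername/1; the function name here is invented):

```python
import socket

def peer_still_connected(sock):
    # getpeername() raises (e.g. ENOTCONN) once the socket is no longer
    # connected, so we can bail out before doing expensive authentication
    # and client registration for a CONNECT processed after a resource
    # alarm clears.
    try:
        sock.getpeername()
        return True
    except OSError:
        return False
```

As the commit message notes, this is a pragmatic rather than ideal disconnect check, since reading ahead for a tcp_closed-style event would mean consuming packets past the CONNECT frame.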
Rin Kuryloski 165f946ffd Remove .travis.yml.patch files 2022-08-16 09:48:46 +02:00
Rin Kuryloski 575c5f9975 Remove all of the .travis.yml files
since we no longer use them
2022-08-16 09:46:31 +02:00
Jean-Sébastien Pédron 6e9ee4d0da
Remove test code which depended on the `quorum_queue` feature flags
These checks are now irrelevant as the feature flag is required.
2022-08-01 12:41:30 +02:00
Philip Kuryloski a250a533a4 Remove elixir related -ignore_xref calls
As they are no longer necessary with xref2 and the erlang.mk updates
2022-06-09 23:18:40 +02:00
Philip Kuryloski 15a79466b1 Use the new xref2 macro from rules_erlang
That adopts the modern erlang.mk xref behaviour
2022-06-09 23:18:28 +02:00
Philip Kuryloski 327f075d57 Make rabbitmq-server work with rules_erlang 3
Also rework elixir dependency handling, so we no longer rely on mix to
fetch the rabbitmq_cli deps

Also:

- Specify ra version with a commit rather than a branch
- Fixup compilation options for erlang 23
- Add missing ra reference in MODULE.bazel
- Add missing flag in oci.yaml
- Reduce bazel rbe jobs to try to save memory
- Use bazel built erlang for erlang git master tests
- Use the same cache for all the workflows but windows
- Avoid using `mix local.hex --force` in elixir rules
  - Fetching seems blocked in CI, and this should reduce hex api usage in
    all builds, which is always nice
- Remove xref and dialyze tags since rules_erlang 3 includes them in
  the defaults
2022-06-08 14:04:53 +02:00
Loïc Hoguin dc70cbf281
Update Erlang.mk and switch to new xref code 2022-05-31 13:51:12 +02:00
David Ansari 20677395cd Check queue and exchange existence with ets:member/2
This reduces memory usage and improves code readability.
2022-05-10 10:16:40 +00:00
Péter Gömöri 52cb5796a3 Remove leftover compiler option for get_stacktrace 2022-05-03 18:40:49 +02:00
Michael Klishin 7c47d0925a
Revert "Correct a double quote introduced in #4603"
This reverts commit 6a44e0e2ef.

That wiped a lot of files unintentionally
2022-04-20 16:05:56 +04:00
Michael Klishin 6a44e0e2ef
Correct a double quote introduced in #4603 2022-04-20 16:01:29 +04:00
Luke Bakken dba25f6462
Replace files with symlinks
This prevents duplicated and out-of-date instructions.
2022-04-15 06:04:29 -07:00
Philip Kuryloski 2dd9bde891 Bring over PROJECT_APP_EXTRA_KEYS values from make to bazel 2022-04-07 17:39:33 +02:00
Philip Kuryloski a22234f6eb Updates for rules_erlang 2.5.0
rabbitmq_cli uses some private rules_erlang apis that have changed in
the upcoming release

Additionally:
- Avoid including both standard and test versions of amqp_client in
integration test suites
- Eliminate most of the compilation order hints (explicit first_srcs)
in the bazel build
- Fix an include statement - in bazel, an app is not available to
itself as a library at compilation time
2022-04-07 14:54:37 +02:00
Michael Klishin 0ae3f19698
mqtt.queue_type => mqtt.durable_queue_type 2022-03-31 19:48:00 +04:00
Gabriele Santomaggio 2c49748c70
Add quorum queues support for MQTT
Enable the quorum queue for MQTT only if CleanSession is False.
QQs don't support auto-delete flag so in case Clean session is True
the queue will be a classic queue.

Add another group test non_parallel_tests_quorum.
For Mixed test the quorum_queue feature flag must be enabled.

Add log message
2022-03-30 08:49:17 -07:00
Michael Klishin c38a3d697d
Bump (c) year 2022-03-21 01:21:56 +04:00
Philip Kuryloski dabf053cf8 Additional dialyzer warning fixes
Currently loading of the rabbitmq_cli defined behaviors compiled with
Elixir does not work, so we ignore the callback definitions contained therein
2022-02-25 18:14:35 +01:00
Philip Kuryloski 226e00fcd2 Tighten up dialyzer usage
now that rules_erlang no longer cascades up dialyzer warnings from deps
2022-02-24 11:18:41 +01:00
Philip Kuryloski d8201726ae Ignore dialyzer warnings for most apps 2022-02-21 09:19:56 +01:00
Philip Kuryloski 2ec7ed8a41 Mark rabbitmq_mqtt:auth_SUITE as flaky 2022-02-02 16:06:35 +01:00
Philip Kuryloski efcd881658 Use rules_erlang v2
bazel-erlang has been renamed rules_erlang. v2 is a substantial
refactor that brings Windows support. While this alone isn't enough to
run all rabbitmq-server suites on windows, one can at least now start
the broker (bazel run broker) and run the tests that do not start a
background broker process
2022-01-18 13:43:46 +01:00
Michael Klishin f7d32d69f8 Introduce a new CLI tool (scope), rabbitmq-tanzu
For Tanzu (commercial) plugins to attach their commands to instead of
polluting rabbitmqctl.

Pair: @pjk25
(cherry picked from commit 6e0f2436fa)
2021-11-30 14:54:09 +00:00
Alexey Lebedeff e0723d5e66 Prevent crash logs when mqtt user is missing permissions
Fixes #2941

This adds proper exception handlers in the right places. And tests
ensure that it indeed provides nice neat logs without large
stacktraces for every amqp operation.

Unnecessary checking for subscribe permissions on topic was dropped,
as `queue.bind` does exactly the same check. Topic permissions tests
were also added, and they indeed confirm that there was no change in
behaviour.

Ideally the same explicit topic permission check should be dropped for
publishing, but it's more complicated - so for now there only a
detailed comment in the source code explaining it.

A few other things were also optimized away:
- Using amqp client to test for queue existence
- Creating queues/starting consumption too eagerly, even if not yet
  requested by the client
2021-11-12 18:03:05 +01:00
Michael Klishin 0f6a9dac27
Introduce rabbit_nodes:all/0 2021-09-20 22:24:25 +03:00
Philip Kuryloski 2b6296c4e2 Mark //deps/rabbitmq_mqtt:cluster_SUITE as flaky 2021-09-08 11:53:05 +02:00
Philip Kuryloski f95fc8aa0c Increase some suite timeouts in bazel 2021-07-23 09:43:05 +02:00
Michael Klishin 7de491fd82
Merge pull request #3187 from rabbitmq/less-chatty-mqtt
Change a log line from INFO to DEBUG
2021-07-12 19:07:29 +03:00
Philip Kuryloski 8f9de08de7 Also assert no missing suites for all other deps 2021-07-12 18:05:55 +02:00
Michal Kuratczyk 41922b96cf
Change a log line from INFO to DEBUG
This line is printed on every new MQTT connection, which leads to very chatty logs when there are a lot of connections. Given that the way MQTT uses vhosts is generally static (once set up, always the same for all connections), I think this can be a debug message instead.
2021-07-12 16:50:25 +02:00
Philip Kuryloski 8c7e7e0656 Revert "Default all `rabbitmq_integration_suite` to flaky in bazel"
This reverts commit 70cb8147b2.
2021-06-23 20:53:14 +02:00
Philip Kuryloski 70cb8147b2 Default all `rabbitmq_integration_suite` to flaky in bazel
Most tests that can start rabbitmq nodes have some chance of
flaking. Rather than chase individual flakes for now, this commit
changes the default (though it can still be overriden, as is the case
for config_scheme_SUITE in many places, since I have yet to see that
particular suite flake).
2021-06-21 16:10:38 +02:00
Philip Kuryloski 55b3b6a370 Mark //deps/rabbitmq_mqtt:java_SUITE as flaky in bazel 2021-06-21 11:16:45 +02:00
Philip Kuryloski 30f9a95b9f Add dialyze for remaining tier-1 plugins 2021-06-01 10:19:10 +02:00
Philip Kuryloski 98e71c45d8 Perform xref checks on many tier-1 plugins 2021-05-21 12:03:22 +02:00
Michael Klishin a755dca8e9
MQTT: use consistent Ra operation timeout values
of more than the default 5s which is really low.
2021-05-18 14:35:48 +03:00
Philip Kuryloski c13c2af614 Bazel file refactoring 2021-05-11 12:03:27 +02:00
Philip Kuryloski 36321ee126 Test rabbitmq_mqtt with bazel 2021-04-19 09:50:42 +02:00
Carl Hörberg 681cb78b0d Test that proxy dest address is picked up in all plugins 2021-03-31 11:28:40 +02:00
kjnilsson 62677cbacf
MQTT ra systems changes 2021-03-22 21:44:19 +03:00
Philip Kuryloski a63f169fcb Remove duplicate rabbitmq-components.mk and erlang.mk files
Also adjust the references in rabbitmq-components.mk to account for
post monorepo locations
2021-03-22 15:40:19 +01:00
Michael Klishin 5e0d7041cd
Merge pull request #2910 from rabbitmq/configure-num-conns-sup
Make ranch parameter `num_conns_sups` configurable
2021-03-19 21:59:30 +03:00
dcorbacho a41ece3950 Make ranch parameter `num_conns_sups` configurable
Defaults to 1
rabbit - num_conns_sup
rabbitmq_mqtt - num_conns_sup
rabbitmq_stomp - num_conns_sup
2021-03-18 21:38:13 +01:00
kjnilsson 52f745dcde Update rabbitmq-components.mk
use v1.x branch of ra
2021-03-18 15:14:40 +00:00
Loïc Hoguin d5e3bdd623
Add ADDITIONAL_PLUGINS variable
This allows including additional applications or third party
plugins when creating a release, running the broker locally,
or just building from the top-level Makefile.

To include Looking Glass in a release, for example:

$ make package-generic-unix ADDITIONAL_PLUGINS="looking_glass"

A Docker image can then be built using this release and will
contain Looking Glass:

$ make docker-image

Beware macOS users! Applications such as Looking Glass include
NIFs. NIFs must be compiled in the right environment. If you
are building a Docker image then make sure to build the NIF
on Linux! In the two steps above, this corresponds to Step 1.

To run the broker with Looking Glass available:

$ make run-broker ADDITIONAL_PLUGINS="looking_glass"

This commit also moves Looking Glass dependency information
into rabbitmq-components.mk so it is available at all times.
2021-03-12 12:29:28 +01:00
Michael Klishin 91964db0e6
MQTT: correct a typo in mqtt_machine
Introduced in #2861
2021-03-12 05:33:03 +03:00
Michael Klishin 97ff62d3b2
Drop trailing newlines from logged messages where possible
Lager strips trailing newline characters but OTP logger with the default
formatter adds a newline at the end. To avoid unintentional multi-line log
messages we have to revisit most messages logged.

Some log entries are intentionally multiline, others
are printed to stdout directly: newlines are required there
for sensible formatting.
2021-03-11 15:17:37 +01:00
Jean-Sébastien Pédron cdcf602749
Switch from Lager to the new Erlang Logger API for logging
The configuration remains the same for the end-user. The only exception
is the log root directory: it is now set through the `log_root`
application env. variable in `rabbit`. People using the Cuttlefish-based
configuration file are not affected by this exception.

The main change is how the logging facility is configured. It now
happens in `rabbit_prelaunch_logging`. The `rabbit_lager` module is
removed.

The supported outputs remain the same: the console, text files, the
`amq.rabbitmq.log` exchange and syslog.

The message text format slightly changed: the timestamp is more precise
(now to the microsecond) and the level can be abbreviated to always be
4-character long to align all messages and improve readability. Here is
an example:

    2021-03-03 10:22:30.377392+01:00 [dbug] <0.229.0> == Prelaunch DONE ==
    2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>
    2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>  Starting RabbitMQ 3.8.10+115.g071f3fb on Erlang 23.2.5
    2021-03-03 10:22:30.377860+01:00 [info] <0.229.0>  Licensed under the MPL 2.0. Website: https://rabbitmq.com

The example above also shows that multiline messages are supported and
each line is prepended with the same prefix (the timestamp, the level
and the Erlang process PID).

JSON is also supported as a message format and now for any outputs.
Indeed, it is possible to use it with e.g. syslog or the exchange. Here
is an example of a JSON-formatted message sent to syslog:

    Mar  3 11:23:06 localhost rabbitmq-server[27908] <0.229.0> - {"time":"2021-03-03T11:23:06.998466+01:00","level":"notice","msg":"Logging: configured log handlers are now ACTIVE","meta":{"domain":"rabbitmq.prelaunch","file":"src/rabbit_prelaunch_logging.erl","gl":"<0.228.0>","line":311,"mfa":["rabbit_prelaunch_logging","configure_logger",1],"pid":"<0.229.0>"}}

For quick testing, the values accepted by the `$RABBITMQ_LOGS`
environment variable were extended:
  * `-` still means stdout
  * `-stderr` means stderr
  * `syslog:` means syslog on localhost
  * `exchange:` means logging to `amq.rabbitmq.log`

`$RABBITMQ_LOG` was also extended. It now accepts a `+json` modifier (in
addition to the existing `+color` one). With that modifier, messages are
formatted as JSON instead of plain text.

The `rabbitmqctl rotate_logs` command is deprecated. The reason is
Logger does not expose a function to force log rotation. However, it
will detect when a file was rotated by an external tool.

From a developer point of view, the old `rabbit_log*` API remains
supported, though it is now deprecated. It is implemented as regular
modules: there is no `parse_transform` involved anymore.

In the code, it is recommended to use the new Logger macros. For
instance, `?LOG_INFO(Format, Args)`. If possible, messages should be
augmented with some metadata. For instance (note the map after the
message):

    ?LOG_NOTICE("Logging: switching to configured handler(s); following "
                "messages may not be visible in this log output",
                #{domain => ?RMQLOG_DOMAIN_PRELAUNCH}),

Domains in Erlang Logger parlance are the way to categorize messages.
Some predefined domains, matching previous categories, are currently
defined in `rabbit_common/include/logging.hrl` or headers in the
relevant plugins for plugin-specific categories.

At this point, very few messages have been converted from the old
`rabbit_log*` API to the new macros. It can be done gradually when
working on a particular module or logging.

The Erlang builtin console/file handler, `logger_std_h`, has been forked
because it lacks date-based file rotation. The configuration of
date-based rotation is identical to Lager. Once the dust has settled for
this feature, the goal is to submit it upstream for inclusion in Erlang.
The forked module is called `rabbit_logger_std_h` and is based on
`logger_std_h` in Erlang 23.0.
2021-03-11 15:17:36 +01:00
dcorbacho 61f7b2a723 Update to ranch 2.0 2021-03-08 23:11:05 +01:00
Michael Klishin b6c4831e75
Bump Lager to 3.9.1 2021-03-04 04:36:39 +03:00
Loïc Hoguin 66ac1bf5e9
Bump observer_cli to 1.6.1
More responsive when the system is overloaded with file calls.
2021-03-01 21:55:27 +03:00
Michael Klishin 8fe3df9343
Upgrade Lager to 3.9.0 for OTP 24 compatibility
`lager_util:expand_path/1` use changes are
due to erlang-lager/lager#540
2021-02-26 00:52:15 +03:00
Michael Klishin f73e851f9c
Bump observer_cli to 1.6.0 2021-02-24 12:53:55 +03:00
Michael Klishin a5098b28a7
Bump Lager to 3.8.2 for OTP 24 compatibility 2021-02-24 12:53:30 +03:00
Michael Klishin b11a79cccf
Bump (c) year in header files 2021-02-04 07:04:58 +03:00
Arnaud Cogoluègnes b921ac11a8
Merge pull request #2712 from rabbitmq/rabbitmq-stream-prometheus
Add stream prometheus plugin
2021-01-27 16:46:37 +01:00
Michael Klishin 52479099ec
Bump (c) year 2021-01-22 09:00:14 +03:00
Arnaud Cogoluègnes b5315c0166
Merge branch 'master' into rabbitmq-stream-prometheus 2021-01-18 11:26:06 +01:00
Michael Klishin e8fccbaf48
MQTT auth_SUITE: synchronise concurrent setup with the test 2021-01-13 16:41:03 +03:00
Arnaud Cogoluègnes bf72683eb2
Add stream prometheus plugin 2021-01-11 16:49:56 +01:00
Arnaud Cogoluègnes cbd3c8dfdd
Merge branch 'master' into rabbitmq-stream-management 2021-01-04 09:50:47 +01:00
kjnilsson 04a55e0ee6 bug fixes 2020-12-22 15:16:17 +00:00
kjnilsson 160e41687d MQTT machine versions 2020-12-22 10:21:21 +00:00
kjnilsson 067a42e066 Optimise MQTT state machine
It was particularly slow when processing down commands.
2020-12-21 15:58:32 +00:00
Arnaud Cogoluègnes 224e9914b2
Merge branch 'master' into rabbitmq-stream-management 2020-12-04 10:26:42 +01:00
kjnilsson 6fdb7d29ec Handle errors in crashing_queues_SUITE
The connection may crash during the previous declaration, and a caught
error would then be returned by amqp_connection:open_channel/1, which
wasn't handled previously. Exactly how things fail in this test is most
likely very timing dependent and may vary.

Also fixes an MQTT test where the process that set up a mock auth ETS
table was transient when an RPC timeout was introduced
2020-12-03 13:56:09 +00:00
Arnaud Cogoluègnes 23d7e8114c
Introduce stream management plugin 2020-11-19 14:48:25 +01:00
Jean-Sébastien Pédron 47686ee1f0
Remove unused .github directories
They were valid until the switch to the "monorepository" when everything
was merged into a single Git repository.
2020-11-17 13:33:16 +01:00
Arnaud Cogoluègnes 07125203b9 Update rabbitmq-components.mk 2020-11-03 14:27:43 +01:00
Michael Klishin 89235cb9fc Update rabbitmq-components.mk 2020-10-21 12:55:39 +03:00
Michael Klishin 79a02256f1 Merge pull request #238 from rabbitmq/auth-attempt-metrics
Add auth attempt metrics
2020-10-14 23:56:29 +03:00
dcorbacho d80e8e1bec Add protocol to auth attempt metrics 2020-09-23 11:16:13 +01:00
Luke Bakken 1daae3064d Revert "Switch to classic OTP supervisor for two modules"
This reverts commit 1bead422a9.
2020-08-31 15:51:39 -07:00
Luke Bakken abb0ab5bd9 Revert "Closes #233"
This reverts commit c45b8d813a.
2020-08-31 15:50:32 -07:00
dcorbacho b138241b52 Add auth attempt metrics 2020-08-28 13:19:05 +01:00
Luke Bakken 65937d3b15 Update rabbitmq-components.mk 2020-08-04 08:41:48 -07:00
Jean-Sébastien Pédron 7d7c8e11d2 Update rabbitmq-components.mk 2020-07-30 12:06:54 +02:00
Luke Bakken f92e4b24ca Update rabbitmq-components.mk 2020-07-29 10:02:04 -07:00
dcorbacho 40e2e3fb13 Update erlang.mk 2020-07-21 14:33:00 +01:00
Michael Klishin 3cc3974f82 Update rabbitmq-components.mk 2020-07-21 13:12:50 +03:00
Michael Klishin 05d99be34e Update rabbitmq-components.mk 2020-07-21 03:43:02 +03:00
dcorbacho 99bf86bb87 Revert drop of Exhibit B on MPL 2.0 2020-07-20 17:01:38 +01:00
dcorbacho e92f7999a2 Update LICENSE 2020-07-20 11:43:03 +01:00
Michael Klishin 4b33266425 Update MPL2 license file, drop Exhibit B
and add a VMware copyright notice.

We did not mean to make this code Incompatible with Secondary Licenses
as defined in [1].

1. https://www.mozilla.org/en-US/MPL/2.0/FAQ/
2020-07-17 14:53:09 +03:00
dcorbacho dae65d8e8d Merge branch 'master' into rabbitmq-server-2321 2020-07-14 15:48:30 +01:00
D Corbacho 1a9632576d Merge pull request #236 from rabbitmq/switch-to-MPL-2.0
Switch to Mozilla Public License 2.0 (MPL 2.0)
2020-07-13 17:40:24 +01:00
dcorbacho 119eb99e8d Switch to Mozilla Public License 2.0 (MPL 2.0) 2020-07-13 17:39:36 +01:00
Michael Klishin 30e6cbdd24 Extract rabbit_networking:stop_ranch_listener_of_protocol/1
Part of rabbitmq/rabbitmq-server#2321
2020-07-09 22:02:09 +03:00
kjnilsson 3cf84a19b2 Fix mqtt_machine crash bug
When a client performs repeated requests, the state machine would crash
with a match exception.

Add unit test suite for mqtt_machine.
2020-07-09 14:41:00 +01:00
Michael Klishin d7474cee33 Cosmetics 2020-07-08 20:00:00 +03:00
Michael Klishin 13cbcfff79 Cosmetics 2020-07-08 19:21:21 +03:00
Michael Klishin 033128cb35 Make sure MQTT plugin closes its connections when a node is put into maintenance mode
Part of rabbitmq/rabbitmq-server#2321
2020-07-08 19:10:54 +03:00
Michael Klishin 73ed4d3772 Unify Ranch ref construction for all listeners
This makes the refs predictable and easy to compute
from a listener record. Then suspending all listeners
becomes a lot simpler.

While at it, make protocol applications clean up
their listeners when they stop. This way tests
and other callers that have to stop the app
would not need to know anything about
its listeners.

Part of rabbitmq/rabbitmq-server#2321
2020-06-24 04:27:34 +03:00
Jean-Sébastien Pédron 378b6719e4 Update erlang.mk 2020-06-23 17:14:34 +02:00
Michael Klishin c45b8d813a Closes #233 2020-06-18 03:32:35 +03:00
Michael Klishin 1bead422a9 Switch to classic OTP supervisor for two modules
supervisor2 features are not needed there, so why not use
the standard thing that's evolving together with Erlang/OTP.

Part of #233.
2020-06-18 03:04:13 +03:00
Michael Klishin 6280926bb2 Bump Recon to 2.5.1
for Erlang 23 compatibility of 'rabbitmq-diagnostics observer'

References zhongwencool/observer_cli#68.
2020-06-09 08:22:16 +03:00
Michael Klishin ae37b4723a Use a higher Ra operation timeout 2020-06-02 21:37:14 +03:00
Jean-Sébastien Pédron dcc5f7b553 Update copyright (year 2020) 2020-03-10 16:39:48 +01:00
Gerhard Lazu 30d4f0a4e5 Update rabbitmq-components.mk 2020-03-06 09:19:17 +00:00
Gerhard Lazu 64db4888d1 Update erlang.mk 2020-03-06 09:18:01 +00:00
Jean-Sébastien Pédron f48167a514 Travis CI: Update config from rabbitmq-common 2020-03-04 14:24:30 +01:00
Jean-Sébastien Pédron 3f09bfb1f3 Travis CI: Update config from rabbitmq-common 2020-03-04 11:17:16 +01:00
Jean-Sébastien Pédron 9e8def2ed5 Travis CI: Update config from rabbitmq-common 2020-03-03 14:53:39 +01:00
Jean-Sébastien Pédron 2b62f489dc Travis CI: Refresh config patch 2020-03-03 14:30:05 +01:00
Michael Klishin 8b638f413a Avoid using erlang:get_stacktrace/0 for improved OTP 23/24 compat
(cherry picked from commit 251a40f705)
2020-02-27 22:34:56 +03:00
Michael Klishin ed5c1c954a Randomized Raft node startup delay
Same fundamental idea as Raft itself uses to avoid
cluster fragmentation on parallel node boot.

(cherry picked from commit f8a9b4b8b6)
2020-02-26 02:23:33 +03:00
Michael Klishin bbc170af7b Synchronise plugin start with that of Raft node
While at it, start listeners after the client ID
tracker is ready. Otherwise we run the risk of taking
client connections in before they can be accepted.

(cherry picked from commit 5da74b6e82)
2020-02-26 02:23:26 +03:00
Michael Klishin e6a8d93bb5 Inject a delay before joining client ID tracking cluster
We have considered multiple options for preventing a split cluster
scenario when N nodes are started in parallel and are initially unaware of
each other. They are all fairly involved and run various risks, e.g.
of losing consistency for cluster members that need to rejoin a newly
discovered set of members.

A simple delay to see if there may be any peers seems to be a straightforward
solution that would make a practical difference.

In the future, consistent client ID tracking should be a feature the user
can opt out of, because it potentially tilts the MQTT plugin too far towards
C on the consistency/availability spectrum.

Pair: @kjnilsson
2020-02-24 17:58:03 +03:00