Documentation updates
This commit is contained in:
parent
d82a77cecc
commit
9b5fe90736
|
|
@ -3,9 +3,9 @@
|
||||||
## Introduction
|
## Introduction
|
||||||
|
|
||||||
This plugin adds a consistent-hash exchange type to RabbitMQ. This
|
This plugin adds a consistent-hash exchange type to RabbitMQ. This
|
||||||
exchange type uses consistent hashing ([intro blog post 1](http://www.martinbroadhurst.com/Consistent-Hash-Ring.html), [intro blog post 2](http://michaelnielsen.org/blog/consistent-hashing/)) to distribute
|
exchange type uses consistent hashing (intro blog posts: [one](http://www.martinbroadhurst.com/Consistent-Hash-Ring.html), [two](http://michaelnielsen.org/blog/consistent-hashing/), [three](https://akshatm.svbtle.com/consistent-hash-rings-theory-and-implementation)) to distribute
|
||||||
messages between the bound queues. It is recommended to get a basic understanding of the
|
messages between the bound queues. It is recommended to get a basic understanding of the
|
||||||
concept before evaluating this plugin.
|
concept before evaluating this plugin and its alternatives.
|
||||||
|
|
||||||
[rabbitmq-sharding](https://github.com/rabbitmq/rabbitmq-sharding) is another plugin
|
[rabbitmq-sharding](https://github.com/rabbitmq/rabbitmq-sharding) is another plugin
|
||||||
that provides a way to partition a stream of messages among a set of consumers
|
that provides a way to partition a stream of messages among a set of consumers
|
||||||
|
|
@ -13,23 +13,22 @@ while trading off total stream ordering for processing parallelism.
|
||||||
|
|
||||||
## Problem Definition
|
## Problem Definition
|
||||||
|
|
||||||
In various scenarios, you may wish to ensure that messages sent to an
|
In various scenarios it may be desired to ensure that messages sent to an
|
||||||
exchange are consistently and equally distributed across a number of
|
exchange are reasonably [uniformly distributed](https://en.wikipedia.org/wiki/Uniform_distribution_(discrete)) across a number of
|
||||||
different queues based on the routing key of the message, a nominated
|
queues based on the routing key of the message, a [nominated
|
||||||
header (see "Routing on a header" below), or a message property (see
|
header](#routing-on-a-header), or a [message property](#routing-on-a-header).
|
||||||
"Routing on a message property" below). You could arrange for this to
|
Technically this can be accomplished using a direct or topic exchange,
|
||||||
occur yourself by using a direct or topic exchange, binding queues
|
binding queues to that exchange and then publishing messages to that exchange that
|
||||||
to that exchange and then publishing messages to that exchange that
|
|
||||||
match the various binding keys.
|
match the various binding keys.
|
||||||
|
|
||||||
However, arranging things this way can be problematic:
|
However, arranging things this way can be problematic:
|
||||||
|
|
||||||
1. It is difficult to ensure that all queues bound to the exchange
|
1. It is difficult to ensure that all queues bound to the exchange
|
||||||
will receive a (roughly) equal number of messages without baking in to
|
will receive a (roughly) equal number of messages (distribution uniformity)
|
||||||
the publishers quite a lot of knowledge about the number of queues and
|
without baking in to the publishers quite a lot of knowledge about the number of queues and
|
||||||
their bindings.
|
their bindings.
|
||||||
|
|
||||||
2. If the number of queues changes, it is not easy to ensure that the
|
2. When the number of queues changes, it is not easy to ensure that the
|
||||||
new topology still distributes messages between the different queues
|
new topology still distributes messages between the different queues
|
||||||
evenly.
|
evenly.
|
||||||
|
|
||||||
|
|
@ -41,20 +40,33 @@ to the computed hash (and the hash space wraps around). The effect of
|
||||||
this is that when a new bucket is added or an existing bucket removed,
|
this is that when a new bucket is added or an existing bucket removed,
|
||||||
only a very few hashes change which bucket they are routed to.
|
only a very few hashes change which bucket they are routed to.
|
||||||
|
|
||||||
## How Consistent Hashing Routing Works
|
## Supported RabbitMQ Versions
|
||||||
|
|
||||||
|
This plugin supports RabbitMQ 3.3.x and later versions.
|
||||||
|
|
||||||
|
## Supported Erlang Versions
|
||||||
|
|
||||||
|
This plugin supports the same [Erlang versions](https://rabbitmq.com/which-erlang.html) as RabbitMQ core.
|
||||||
|
|
||||||
|
|
||||||
|
## How It Works
|
||||||
|
|
||||||
In the case of Consistent Hashing as an exchange type, the hash is
|
In the case of Consistent Hashing as an exchange type, the hash is
|
||||||
calculated from the hash of the routing key of each message
|
calculated from a message property (most commonly the routing key).
|
||||||
received. Thus messages that have the same routing key will have the
|
Thus messages that have the same routing key will have the
|
||||||
same hash computed, and thus will be routed to the same queue,
|
same hash value computed for them, and thus will be routed to the same queue,
|
||||||
assuming no bindings have changed.
|
assuming no bindings have changed.
|
||||||
|
|
||||||
When you bind a queue to a consistent-hash exchange, the binding key
|
### Binding Weights
|
||||||
is a number-as-a-string which indicates the number of points in the
|
|
||||||
hash space at which you wish the queue to appear. The actual points
|
|
||||||
are generated randomly.
|
|
||||||
|
|
||||||
The hashing distributes *routing keys* among queues, not *messages*
|
When a queue is bound to a Consistent Hash exchange, the binding key
|
||||||
|
is a number-as-a-string which indicates the binding weight: the number
|
||||||
|
of buckets (sections of the range) that will be associated with the
|
||||||
|
target queue.
|
||||||
|
|
||||||
|
### Consistent Hashing-based Routing
|
||||||
|
|
||||||
|
The hashing distributes *routing keys* among queues, not *message payloads*
|
||||||
among queues; all messages with the same routing key will go the
|
among queues; all messages with the same routing key will go the
|
||||||
same queue. So, if you wish for queue A to receive twice as many
|
same queue. So, if you wish for queue A to receive twice as many
|
||||||
routing keys routed to it than are routed to queue B, then you bind
|
routing keys routed to it than are routed to queue B, then you bind
|
||||||
|
|
@ -75,14 +87,33 @@ deletion/death of that queue that does permit the possibility of the
|
||||||
message being sent to a queue which then disappears before the message
|
message being sent to a queue which then disappears before the message
|
||||||
is processed. Hence in general, at most one queue.
|
is processed. Hence in general, at most one queue.
|
||||||
|
|
||||||
The exchange type is "x-consistent-hash".
|
The exchange type is `"x-consistent-hash"`.
|
||||||
|
|
||||||
## Supported RabbitMQ Versions
|
|
||||||
|
|
||||||
This plugin supports RabbitMQ 3.3.x and later versions.
|
|
||||||
|
|
||||||
|
|
||||||
## Examples
|
## Usage Examples
|
||||||
|
|
||||||
|
### Overview
|
||||||
|
|
||||||
|
In the below example the queues `q0` and `q1` get bound each with the weight of 1
|
||||||
|
in the hash space to the exchange `e` which means they'll each get
|
||||||
|
roughly the same number of routing keys. The queues `q2` and `q3`
|
||||||
|
however, get 2 buckets each (their weight is 2) which means they'll each get roughly the
|
||||||
|
same number of routing keys too, but that will be approximately twice
|
||||||
|
as many as `q0` and `q1`. The example then publishes 100,000 messages to our
|
||||||
|
exchange with random routing keys, the queues will get their share of
|
||||||
|
messages roughly equal to the binding keys ratios. After this has
|
||||||
|
completed, running `rabbitmqctl list_queues` should show that the
|
||||||
|
messages have been distributed approximately as desired.
|
||||||
|
|
||||||
|
Note the `routing_key`s in the bindings are numbers-as-strings. This
|
||||||
|
is because AMQP 0-9-1 specifies the `routing_key` field must be a string.
|
||||||
|
|
||||||
|
It is important to ensure that the messages being published
|
||||||
|
to the exchange have a range of different `routing_key`s: if a very
|
||||||
|
small set of routing keys are being used then there's a possibility of
|
||||||
|
messages not being evenly distributed between the various queues. If
|
||||||
|
the routing key is a pseudo-random session ID or such, then good
|
||||||
|
results should follow.
|
||||||
|
|
||||||
### Erlang
|
### Erlang
|
||||||
|
|
||||||
|
|
@ -102,11 +133,11 @@ test() ->
|
||||||
[amqp_channel:call(Chan, #'queue.declare' { queue = Q }) || Q <- Queues],
|
[amqp_channel:call(Chan, #'queue.declare' { queue = Q }) || Q <- Queues],
|
||||||
[amqp_channel:call(Chan, #'queue.bind' { queue = Q,
|
[amqp_channel:call(Chan, #'queue.bind' { queue = Q,
|
||||||
exchange = <<"e">>,
|
exchange = <<"e">>,
|
||||||
routing_key = <<"10">> })
|
routing_key = <<"1">> })
|
||||||
|| Q <- [<<"q0">>, <<"q1">>]],
|
|| Q <- [<<"q0">>, <<"q1">>]],
|
||||||
[amqp_channel:call(Chan, #'queue.bind' { queue = Q,
|
[amqp_channel:call(Chan, #'queue.bind' { queue = Q,
|
||||||
exchange = <<"e">>,
|
exchange = <<"e">>,
|
||||||
routing_key = <<"20">> })
|
routing_key = <<"2">> })
|
||||||
|| Q <- [<<"q2">>, <<"q3">>]],
|
|| Q <- [<<"q2">>, <<"q3">>]],
|
||||||
Msg = #amqp_msg { props = #'P_basic'{}, payload = <<>> },
|
Msg = #amqp_msg { props = #'P_basic'{}, payload = <<>> },
|
||||||
[amqp_channel:call(Chan,
|
[amqp_channel:call(Chan,
|
||||||
|
|
@ -120,34 +151,7 @@ amqp_connection:close(Conn),
|
||||||
ok.
|
ok.
|
||||||
```
|
```
|
||||||
|
|
||||||
As you can see, the queues `q0` and `q1` get bound each with 10 points
|
## Routing on a Header
|
||||||
in the hash space to the exchange `e` which means they'll each get
|
|
||||||
roughly the same number of routing keys. The queues `q2` and `q3`
|
|
||||||
however, get 20 points each which means they'll each get roughly the
|
|
||||||
same number of routing keys too, but that will be approximately twice
|
|
||||||
as many as `q0` and `q1`. We then publish 100,000 messages to our
|
|
||||||
exchange with random routing keys, the queues will get their share of
|
|
||||||
messages roughly equal to the binding keys ratios. After this has
|
|
||||||
completed, running `rabbitmqctl list_queues` should show that the
|
|
||||||
messages have been distributed approximately as desired.
|
|
||||||
|
|
||||||
Note the `routing_key`s in the bindings are numbers-as-strings. This
|
|
||||||
is because AMQP specifies the routing_key must be a string.
|
|
||||||
|
|
||||||
The more points in the hash space each binding has, the closer the
|
|
||||||
actual distribution will be to the desired distribution (as indicated
|
|
||||||
by the ratio of points by binding). However, large numbers of points
|
|
||||||
(many thousands) will substantially decrease performance of the
|
|
||||||
exchange type.
|
|
||||||
|
|
||||||
Equally, it is important to ensure that the messages being published
|
|
||||||
to the exchange have a range of different `routing_key`s: if a very
|
|
||||||
small set of routing keys are being used then there's a possibility of
|
|
||||||
messages not being evenly distributed between the various queues. If
|
|
||||||
the routing key is a pseudo-random session ID or such, then good
|
|
||||||
results should follow.
|
|
||||||
|
|
||||||
## Routing on a header
|
|
||||||
|
|
||||||
Under most circumstances the routing key is a good choice for something to
|
Under most circumstances the routing key is a good choice for something to
|
||||||
hash. However, in some cases you need to use the routing key for some other
|
hash. However, in some cases you need to use the routing key for some other
|
||||||
|
|
@ -169,7 +173,7 @@ be used. For example using the Erlang client as above:
|
||||||
If you specify "hash-header" and then publish messages without the named
|
If you specify "hash-header" and then publish messages without the named
|
||||||
header, they will all get routed to the same (arbitrarily-chosen) queue.
|
header, they will all get routed to the same (arbitrarily-chosen) queue.
|
||||||
|
|
||||||
## Routing on a message property
|
## Routing on a Message Property
|
||||||
|
|
||||||
In addition to a value in the header property, you can also route on the
|
In addition to a value in the header property, you can also route on the
|
||||||
``message_id``, ``correlation_id``, or ``timestamp`` message property. To do so,
|
``message_id``, ``correlation_id``, or ``timestamp`` message property. To do so,
|
||||||
|
|
@ -185,23 +189,44 @@ property to be used. For example using the Erlang client as above:
|
||||||
}).
|
}).
|
||||||
```
|
```
|
||||||
|
|
||||||
Note that you can not declare an exchange that routes on both "hash-header" and
|
Note that you declaring an exchange that routes on both "hash-header" and
|
||||||
"hash-property". If you specify "hash-property" and then publish messages without
|
"hash-property" is not allowed. When "hash-property" is specified, and a message
|
||||||
a value in the named property, they will all get routed to the same
|
is published without a value in the named property, they will all get routed to the same
|
||||||
(arbitrarily-chosen) queue.
|
(arbitrarily-chosen) queue.
|
||||||
|
|
||||||
|
|
||||||
## Getting Help
|
## Getting Help
|
||||||
|
|
||||||
Any comments or feedback welcome, to the
|
Any comments or feedback welcome, to the
|
||||||
[RabbitMQ mailing list](https://groups.google.com/forum/#!forum/rabbitmq-users).
|
[RabbitMQ mailing list](https://groups.google.com/forum/#!forum/rabbitmq-users).
|
||||||
|
|
||||||
|
|
||||||
|
## Implementation Details
|
||||||
|
|
||||||
|
The hash function used in this plugin as of RabbitMQ 3.7.8
|
||||||
|
is [A Fast, Minimal Memory, Consistent Hash Algorithm](https://arxiv.org/abs/1406.2294) by Lamping and Veach.
|
||||||
|
|
||||||
|
When a queue is bound to a consistent hash exchange, the protocol method, `queue.bind`,
|
||||||
|
carries a weight in the routing (binding) key. The binding is given
|
||||||
|
a number of buckets on the hash ring (hash space) equal to the weight.
|
||||||
|
When a queue is unbound, the buckets added for the binding are deleted.
|
||||||
|
These two operations use linear algorithms to update the ring.
|
||||||
|
|
||||||
|
To perform routing the exchange extract the appropriate value for hashing,
|
||||||
|
hashes it and retrieves a bucket number from the ring, then the bucket and
|
||||||
|
its associated queue.
|
||||||
|
|
||||||
|
The implementation assumes there is only one binding between a consistent hash
|
||||||
|
exchange and a queue. Having more than one bunding is unnecessary because
|
||||||
|
queue weight can be provided at the time of binding.
|
||||||
|
|
||||||
## Continuous Integration
|
## Continuous Integration
|
||||||
|
|
||||||
[](https://travis-ci.org/rabbitmq/rabbitmq-consistent-hash-exchange)
|
[](https://travis-ci.org/rabbitmq/rabbitmq-consistent-hash-exchange)
|
||||||
|
|
||||||
## Copyright and License
|
## Copyright and License
|
||||||
|
|
||||||
(c) 2013-2015 Pivotal Software Inc.
|
(c) 2013-2018 Pivotal Software Inc.
|
||||||
|
|
||||||
Released under the Mozilla Public License 1.1, same as RabbitMQ.
|
Released under the Mozilla Public License 1.1, same as RabbitMQ.
|
||||||
See [LICENSE](https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange/blob/master/LICENSE) for
|
See [LICENSE](https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange/blob/master/LICENSE) for
|
||||||
|
|
|
||||||
Loading…
Reference in New Issue