This way multiple brokers can be run at the same time. Before that, the
only option was to use `bazel run start-cluster`, but it's not
granular enough to run rabbits from different checkouts or with
different configs. The goal is to have the equivalent of
`make start-cluster` and `make stop-cluster`.
To create a 3-node RabbitMQ cluster:
```
bazel run --config=local start-cluster
```
To set the number of nodes or use a custom directory:
```
bazel run --config=local start-cluster NODES=5 TEST_TMPDIR="$HOME/scratch/myrabbit"
```
To stop the cluster:
```
bazel run --config=local stop-cluster
```
or, if the cluster was started with the second command:
```
bazel run --config=local stop-cluster NODES=5 TEST_TMPDIR="$HOME/scratch/myrabbit"
```
bazel-erlang has been renamed rules_erlang. v2 is a substantial
refactor that brings Windows support. While this alone isn't enough to
run all rabbitmq-server suites on Windows, one can at least now start
the broker (`bazel run broker`) and run the tests that do not start a
background broker process.
Unlike with GNU make, mixed-version testing with bazel uses a package-generic-unix for the secondary umbrella rather than the source. This brings the benefit of being able to mixed-version test releases built with older Erlang versions (even though all nodes will run under the single Erlang version given to bazel).
This introduces new test labels, adding a `-mixed` suffix to every existing test. They can be skipped if necessary with `--test_tag_filters` (see the GitHub Actions workflow for an example).
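For example, a hedged way to exclude them in a local invocation (the `//...` target pattern here is an assumption):
```
bazel test //... --test_tag_filters=-mixed
```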
As part of this change, it is now possible to run an old release of rabbit with the `rabbitmq_run` rule, such as:
`bazel run @rabbitmq-server-generic-unix-3.8.17//:rabbitmq-run run-broker`
In newer Erlang versions, beam.smp no longer writes a pidfile until the
rabbit application starts. It also no longer passes `-mnesia dir` and
`-sname`, which are required in order to start the node with only the
application start-up delayed.
Handle that so the Pacemaker HA setup keeps working with newer Erlang
and rabbitmq-server versions.
Fix '[ x == x ]' bashisms as well to silence errors in the RA logs.
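A minimal sketch of that kind of fix (the variable names are illustrative):
```
node=rabbit@messaging-0 master=rabbit@messaging-0

# bashism: '==' is not POSIX test(1) syntax and logs errors under dash/sh
[ "$node" == "$master" ] && echo same

# portable replacement
[ "$node" = "$master" ] && echo same
```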
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Adds WORKSPACE.bazel, BUILD.bazel & *.bzl files for partial build & test with Bazel. Introduces a build-time dependency on https://github.com/rabbitmq/bazel-erlang
RABBITMQ_NODE_PORT is exported by default and set to 5672. Re-exporting it in that
case will actually break the case where we set up rabbit with TLS on the default port:
2021-02-28 07:44:10.732 [error] <0.453.0> Failed to start Ranch listener
{acceptor,{172,17,1,93},5672} in ranch_ssl:listen([{cacerts,'...'},{key,'...'},{cert,'...'},{ip,{172,17,1,93}},{port,5672},
inet,{keepalive,true}, {versions,['tlsv1.1','tlsv1.2']},{certfile,"/etc/pki/tls/certs/rabbitmq.crt"},{keyfile,"/etc/pki/tls/private/rabbitmq.key"},
{depth,1},{secure_renegotiate,true},{reuse_sessions,true},{honor_cipher_order,true},{verify,verify_none},{fail_if_no_peer_cert,false}])
for reason eaddrinuse (address already in use)
This is because, by explicitly always exporting it, we force rabbit to listen on
that port via TCP, and that is a problem when we want to do SSL on that port.
Since 5672 is already the default port, we can simply avoid exporting it when
the user does not customize the port.
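A minimal sketch of the resulting guard, assuming a hypothetical `OCF_RESKEY_node_port` carries the user's setting:
```
# Export RABBITMQ_NODE_PORT only when the user overrides the default (5672);
# otherwise leave it unset so a TLS listener can bind the default port.
if [ -n "${OCF_RESKEY_node_port}" ] && [ "${OCF_RESKEY_node_port}" != "5672" ]; then
    export RABBITMQ_NODE_PORT="${OCF_RESKEY_node_port}"
fi
```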
Tested both in a non-TLS env (A) and in a TLS env (B) successfully:
(A) Non-TLS
[root@messaging-0 /]# grep -ir -e tls -e ssl /etc/rabbitmq
[root@messaging-0 /]#
[root@messaging-0 /]# pcs status |grep rabbitmq
* rabbitmq-bundle-0 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-0
* rabbitmq-bundle-1 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-1
* rabbitmq-bundle-2 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-2
(B) TLS
[root@messaging-0 /]# grep -ir -e tls -e ssl /etc/rabbitmq/ |head -n3
/etc/rabbitmq/rabbitmq.config: {ssl, [{versions, ['tlsv1.1', 'tlsv1.2']}]},
/etc/rabbitmq/rabbitmq.config: {ssl_listeners, [{"172.17.1.48", 5672}]},
/etc/rabbitmq/rabbitmq.config: {ssl_options, [
[root@messaging-0 ~]# pcs status |grep rabbitmq
* rabbitmq-bundle-0 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-0
* rabbitmq-bundle-1 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-1
* rabbitmq-bundle-2 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-2
Note: I don't believe we should export RABBITMQ_NODE_PORT at all, since you can
specify all ports in the rabbit configuration anyway, but I prefer to play it safe
here as folks might rely on being able to customize this.
Signed-off-by: Michele Baldessari <michele@acksyn.org>
Currently every call to unblock_client_access() is followed by a log line
showing which function requested the unblocking. When we pass the parameter
OCF_RESKEY_avoid_using_iptables=true, it makes no sense to log the
unblocking of iptables, since it is effectively a no-op.
Let's move that logging inside the unblock_client_access() function
and pass it a parameter identifying the calling function.
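A hedged sketch of the resulting shape (the iptables rule shown is illustrative, not the agent's exact code):
```
unblock_client_access() {
    local caller="$1"
    # With avoid_using_iptables=true the call is a no-op, so log nothing
    ocf_is_true "${OCF_RESKEY_avoid_using_iptables}" && return 0
    iptables -D INPUT -p tcp --dport "${RABBITMQ_NODE_PORT}" -j REJECT 2>/dev/null
    ocf_log info "unblocked access to RMQ port (requested by ${caller})"
}
```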
Tested on a cluster with rabbitmq bundles with avoid_using_iptables=true
and observed no spurious logging any longer:
[root@messaging-0 ~]# journalctl |grep 'unblocked access to RMQ port' |wc -l
0
We introduce the OCF_RESKEY_allowed_cluster_nodes parameter, which can be used to
specify which nodes of the cluster rabbitmq is expected to run on. When this
variable is not set, the resource agent assumes that all nodes of the cluster
(the output of crm_node -l) are eligible to run rabbitmq. The use case here is
clusters that have a large number of nodes, where only a specific subset is used
for rabbitmq (usually this is done with some constraints).
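A minimal sketch of the fallback, assuming `crm_node -l` prints one "id name state" triple per line:
```
# Default to every cluster member when the parameter is left unset
if [ -z "${OCF_RESKEY_allowed_cluster_nodes}" ]; then
    OCF_RESKEY_allowed_cluster_nodes=$(crm_node -l | awk '{print $2}')
fi
```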
Tested in a 9-node cluster as follows:
[root@messaging-0 ~]# pcs resource config rabbitmq
Resource: rabbitmq (class=ocf provider=rabbitmq type=rabbitmq-server-ha)
Attributes: allowed_cluster_nodes="messaging-0 messaging-1 messaging-2" avoid_using_iptables=true
Meta Attrs: container-attribute-target=host master-max=3 notify=true ordered=true
Operations: demote interval=0s timeout=30 (rabbitmq-demote-interval-0s)
monitor interval=5 timeout=30 (rabbitmq-monitor-interval-5)
monitor interval=3 role=Master timeout=30 (rabbitmq-monitor-interval-3)
notify interval=0s timeout=20 (rabbitmq-notify-interval-0s)
promote interval=0s timeout=60s (rabbitmq-promote-interval-0s)
start interval=0s timeout=200s (rabbitmq-start-interval-0s)
stop interval=0s timeout=200s (rabbitmq-stop-interval-0s)
[root@messaging-0 ~]# pcs status |grep -e rabbitmq -e messaging
* Online: [ controller-0 controller-1 controller-2 database-0 database-1 database-2 messaging-0 messaging-1 messaging-2 ]
...
* Container bundle set: rabbitmq-bundle [cluster.common.tag/rhosp16-openstack-rabbitmq:pcmklatest]:
* rabbitmq-bundle-0 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-0
* rabbitmq-bundle-1 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-1
* rabbitmq-bundle-2 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-2
Console output is handled in the SysV init scripts consistently (no more
differences between the Debian and RPM packages). See the previous
commit.
This fixes an issue for users who used to define $RABBITMQ_LOG_BASE in
the environment and called this script directly (i.e. not using the SysV
init scripts). Before commit 4b7048205d
(which made it to RabbitMQ 3.8.4), `rabbitmq-script-wrapper` took
$RABBITMQ_LOG_BASE from rabbitmq-env(8) or the environment. After the
mentioned commit, $RABBITMQ_LOG_BASE was hard-coded to set up console
redirection (in the case of Debian only) because rabbitmq-env(8) no
longer had the variable and thus was not sourced.
For those users, it meant they couldn't override $RABBITMQ_LOG_BASE in
the environment and call this script, even if they wanted to change the
location of RabbitMQ's actual log files.
Now that console redirection is handled by the SysV init scripts, we can
get rid of that code in `rabbitmq-script-wrapper`.
Fixes rabbitmq/rabbitmq-server-release#131.
Currently the resource agent hard-codes iptables calls to block off
client access before the resource becomes master. This was done
historically because many libraries were fairly buggy at detecting a
not-yet-functional rabbitmq, so they were being helped by getting
a TCP RST packet, after which they would go on to try their next
configured server.
It makes sense to be able to disable this behaviour because
most libraries have by now gotten better at detecting timeouts when
talking to rabbit, and because when you run rabbitmq inside a bundle
(the pacemaker term for a container with an OCF resource inside) you
normally do not have access to iptables.
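A hedged sketch of the opt-out this parameter provides (the rule itself is illustrative):
```
block_client_access() {
    # Inside bundles there is usually no iptables access, so allow opting out
    ocf_is_true "${OCF_RESKEY_avoid_using_iptables}" && return 0
    # Send a TCP RST so buggy clients fail fast and try their next server
    iptables -I INPUT -p tcp --dport "${RABBITMQ_NODE_PORT}" \
        -j REJECT --reject-with tcp-reset
}
```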
Tested by creating a three-node bundle cluster inside a container:
Container bundle set: rabbitmq-bundle [cluster.common.tag/rhosp16-openstack-rabbitmq:pcmklatest]
Replica[0]
rabbitmq-bundle-podman-0 (ocf::heartbeat:podman): Started controller-0
rabbitmq-bundle-0 (ocf::pacemaker:remote): Started controller-0
rabbitmq (ocf::rabbitmq:rabbitmq-server-ha): Master rabbitmq-bundle-0
Replica[1]
rabbitmq-bundle-podman-1 (ocf::heartbeat:podman): Started controller-1
rabbitmq-bundle-1 (ocf::pacemaker:remote): Started controller-1
rabbitmq (ocf::rabbitmq:rabbitmq-server-ha): Master rabbitmq-bundle-1
Replica[2]
rabbitmq-bundle-podman-2 (ocf::heartbeat:podman): Started controller-2
rabbitmq-bundle-2 (ocf::pacemaker:remote): Started controller-2
rabbitmq (ocf::rabbitmq:rabbitmq-server-ha): Master rabbitmq-bundle-2
The ocf resource was created inside a bundle with:
pcs resource create rabbitmq ocf:rabbitmq:rabbitmq-server-ha avoid_using_iptables="true" \
meta notify=true container-attribute-target=host master-max=3 ordered=true \
op start timeout=200s stop timeout=200s promote timeout=60s bundle rabbitmq-bundle
Signed-off-by: Michele Baldessari <michele@acksyn.org>
This was used to define `$RABBITMQ_LOG_BASE`, but this variable is no
longer defined there.
rabbitmq-env would also load `rabbitmq-env.conf`, which could redefine
`$RABBITMQ_LOG_BASE`, but this is a corner case and doesn't fit
packaging well: packages already prepare a location for log files and
will clean that location up on removal.
Now, we set the `$RABBITMQ_LOG_BASE` value in those scripts and get rid
of the rabbitmq-env load.
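A minimal sketch of what the scripts can do instead, assuming the usual packaged location:
```
# Set the value directly instead of sourcing rabbitmq-env
RABBITMQ_LOG_BASE="${RABBITMQ_LOG_BASE:-/var/log/rabbitmq}"
export RABBITMQ_LOG_BASE
```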
This commit updates URLs to prefer the https protocol. Redirects are not followed to avoid accidentally expanding intentionally shortened URLs (i.e. if using a URL shortener).
# Fixed URLs
## Fixed Success
These URLs were switched to an https URL with a 2xx status. While the status was successful, your review is still recommended.
* [ ] http://www.apache.org/licenses/LICENSE-2.0 with 1 occurrence migrated to:
https://www.apache.org/licenses/LICENSE-2.0 ([https](https://www.apache.org/licenses/LICENSE-2.0) result 200).
Instead of calling crm_node directly, it is preferable to use the
ocf_attribute_target function. This function returns `crm_node -n`
as usual, except when run inside a bundle (aka a container in pcmk
language). Inside a bundle it returns the bundle name or, if the
meta attribute container-attribute-target is set to 'host', the
physical node name where the bundle is running.
Typically, when running a rabbitmq cluster inside containers, it is
desirable to set 'container-attribute-target=host' on the rabbit
cluster resource so that the RA is aware of which host it is running on.
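A minimal usage sketch (the ocf-shellfuncs path assumes the standard resource-agents layout):
```
# Bundle-aware replacement for `crm_node -n`
. /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs
NODENAME=$(ocf_attribute_target)
```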
Tested both on baremetal (without containers):
Master/Slave Set: rabbitmq-master [rabbitmq]
Masters: [ controller-0 controller-1 controller-2 ]
And with bundles as well.
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Exit codes from sysexits.h were introduced in the rabbitmq CLI with
https://github.com/rabbitmq/rabbitmq-server/pull/412. The OCF
agent for the non-clustered setup was not updated, and some exit codes
were incorrectly reported as unexpected.
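A hedged sketch of the kind of mapping involved (the value 69 comes from sysexits.h; which OCF code each value maps to here is an assumption):
```
# Hypothetical excerpt from a status helper inside the agent
EX_UNAVAILABLE=69   # sysexits.h: service unavailable
case $rc in
    0)               return $OCF_SUCCESS ;;
    $EX_UNAVAILABLE) return $OCF_NOT_RUNNING ;;   # node down, not unexpected
    *)               return $OCF_ERR_GENERIC ;;
esac
```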
In is_clustered_with(), the commands that we run to check whether the
remote node is clustered with us or partitioned from us may fail. When
they fail, that actually doesn't tell us anything about the remote node.
Until now, we were treating such failures as hints that the remote
node is not in a sane state with us. But doing so has a pretty negative
impact, as it can cause rabbitmq to get restarted on the remote node,
causing quite some disruption.
So instead of doing this, ignore the error (it is still logged).
There was a comment in the code wondering what the best behavior is;
based on experience, I think preferring stability is the slightly more
acceptable poison of the two options.
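A minimal sketch of the changed control flow (the helper name is hypothetical):
```
# A failed check now means "unknown", not "remote node is unhealthy"
if ! out=$(check_remote_cluster_membership "$node" 2>&1); then
    ocf_log warn "is_clustered_with: check failed for $node: $out (ignored)"
    return 0   # prefer stability over restarting rabbitmq on the remote node
fi
```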
Right now, every time we get a start notification, all nodes ensure
the rabbitmq app is started. This makes little sense, as nodes that are
already active don't need to do that.
On top of that, this had the side effect of updating the start time for
each of these nodes, which could result in the master moving to another
node.
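A hedged sketch of the fix's shape (helper names are assumptions):
```
# Only nodes where the app is not yet running need to act on the start
# notification; skipping active nodes leaves their start times untouched.
if ! rabbit_app_is_running; then
    try_to_start_rmq_app
fi
```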
If there's nothing starting and nothing active, then we end up doing
`-z " "`, which does not have the same result as `-z ""`. Instead, just
test each set of nodes for emptiness separately.
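The difference in one line each (a one-space string is not empty):
```
[ -z " " ] && echo blank   # prints nothing: " " is a non-empty string
[ -z "" ]  && echo empty   # prints "empty"
```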
It may happen that two nodes have the same start time, and one of them
is the master. When this happens, the node actually gets the same score
as the master and can get promoted. There's no reason to avoid being
stable here, so let's keep the same master in that scenario.
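A minimal sketch of the tie-break (variable names are illustrative):
```
# On a start-time tie, keep the current master ahead of the challenger
if [ "$node_start_time" = "$master_start_time" ] && [ "$node" != "$master_node" ]; then
    score=$((score - 1))
fi
```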
Remove argument quoting, which is not necessary as long as the command
is passed to `/sbin/runuser` and `/bin/su` as arguments instead of as a
string.
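An illustrative contrast (the command shown is an assumption):
```
# As a single string: the command is re-parsed by a shell and needs quoting
/bin/su rabbitmq -c "rabbitmqctl list_queues"

# As arguments: no extra quoting layer is required
/sbin/runuser -u rabbitmq -- rabbitmqctl list_queues
```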
Fixes #44.
[#150221349]