rabbitmq-server

Commit Graph

Author	SHA1	Message	Date
Bogdan Dobrelya	7f67adcb5c	OCF RA: fix start/stop handling In newer Erlang, beam.smp no longer writes a pidfile, until the rabbit applicataion starts. It also no longer passes -mneisa dir and -sname, which are required in order to start the node only delaying the application start up. Handle that so the Pacemaker HA setup keeps working with newer Erlang and rabbitmq-server versions. Fix '[ x == x ]' bashisms as well to silence errors in the RA logs. Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>	2021-06-30 17:40:37 +02:00
Pavel Heimlich	020b22f9ea	fix hostname(1) calls on Solaris	2021-06-17 13:23:05 +02:00
Michele Baldessari	cc2d46a30d	Only export RABBITMQ_NODE_PORT when it is not the default RABBITMQ_NODE_PORT is exported by default and set to 5672. Re-exporting it in that case will actually break the case where we set up rabbit with tls on the default port: 2021-02-28 07:44:10.732 [error] <0.453.0> Failed to start Ranch listener {acceptor,{172,17,1,93},5672} in ranch_ssl:listen([{cacerts,'...'},{key,'...'},{cert,'...'},{ip,{172,17,1,93}},{port,5672}, inet,{keepalive,true}, {versions,['tlsv1.1','tlsv1.2']},{certfile,"/etc/pki/tls/certs/rabbitmq.crt"},{keyfile,"/etc/pki/tls/private/rabbitmq.key"}, {depth,1},{secure_renegotiate,true},{reuse_sessions,true},{honor_cipher_order,true},{verify,verify_none},{fail_if_no_peer_cert,false}]) for reason eaddrinuse (address already in use) This is because by explicitely always exporting it, we force rabbit to listen to that port via tcp and that is a problem when we want to do SSL on that port. Since 5672 is the default port already we can just avoid exporting this port when the user does not customize the port. Tested both in a non-TLS env (A) and in a TLS-env (B) successfully: (A) Non-TLS [root@messaging-0 /]# grep -ir -e tls -e ssl /etc/rabbitmq [root@messaging-0 /]# [root@messaging-0 /]# pcs status \|grep rabbitmq * rabbitmq-bundle-0 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-0 * rabbitmq-bundle-1 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-1 * rabbitmq-bundle-2 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-2 (B) TLS [root@messaging-0 /]# grep -ir -e tls -e ssl /etc/rabbitmq/ \|head -n3 /etc/rabbitmq/rabbitmq.config: {ssl, [{versions, ['tlsv1.1', 'tlsv1.2']}]}, /etc/rabbitmq/rabbitmq.config: {ssl_listeners, [{"172.17.1.48", 5672}]}, /etc/rabbitmq/rabbitmq.config: {ssl_options, [ [root@messaging-0 ~]# pcs status \|grep rabbitmq * rabbitmq-bundle-0 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-0 * rabbitmq-bundle-1 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-1 * rabbitmq-bundle-2 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-2 Note: I don't believe we should export RABBITMQ_NODE_PORT at all, since you can specify all ports in the rabbit configuration anyways, but prefer to play it safe here as folks might rely on being able to customize this. Signed-off-by: Michele Baldessari <michele@acksyn.org>	2021-02-28 15:51:39 +01:00
Michele Baldessari	97117f86dd	Stop logging unblock client access unconditionally Currently every call to unblock_client_access() is followed by a log line showing which function requested the unblocking. When we pass the parameter OCF_RESKEY_avoid_using_iptables=true it makes no sense to log unblocking of iptables since it is effectively a no-op. Let's move that logging inside the unblock_client_access() function allowing a parameter to log which function called it. Tested on a cluster with rabbitmq bundles with avoid_using_iptables=true and observed no spurious logging any longer: [root@messaging-0 ~]# journalctl \|grep 'unblocked access to RMQ port' \|wc -l 0	2021-02-28 15:51:39 +01:00
Michele Baldessari	cf039f9a54	Allow rabbitmq to run in a larger cluster composed of also non-rabbitmq nodes We introduce the OCF_RESKEY_allowed_cluster_node parameter which can be used to specify which nodes of the cluster rabbitmq is expected to run on. When this variable is not set the resource agent assumes that all nodes of the cluster (output of crm_node -l) are eligible to run rabbitmq. The use case here is clusters that have a large numbers of node, where only a specific subset is used for rabbitmq (usually this is done with some constraints). Tested in a 9-node cluster as follows: [root@messaging-0 ~]# pcs resource config rabbitmq Resource: rabbitmq (class=ocf provider=rabbitmq type=rabbitmq-server-ha) Attributes: allowed_cluster_nodes="messaging-0 messaging-1 messaging-2" avoid_using_iptables=true Meta Attrs: container-attribute-target=host master-max=3 notify=true ordered=true Operations: demote interval=0s timeout=30 (rabbitmq-demote-interval-0s) monitor interval=5 timeout=30 (rabbitmq-monitor-interval-5) monitor interval=3 role=Master timeout=30 (rabbitmq-monitor-interval-3) notify interval=0s timeout=20 (rabbitmq-notify-interval-0s) promote interval=0s timeout=60s (rabbitmq-promote-interval-0s) start interval=0s timeout=200s (rabbitmq-start-interval-0s) stop interval=0s timeout=200s (rabbitmq-stop-interval-0s) [root@messaging-0 ~]# pcs status \|grep -e rabbitmq -e messaging * Online: [ controller-0 controller-1 controller-2 database-0 database-1 database-2 messaging-0 messaging-1 messaging-2 ] ... * Container bundle set: rabbitmq-bundle [cluster.common.tag/rhosp16-openstack-rabbitmq:pcmklatest]: * rabbitmq-bundle-0 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-0 * rabbitmq-bundle-1 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-1 * rabbitmq-bundle-2 (ocf::rabbitmq:rabbitmq-server-ha): Master messaging-2	2021-02-28 15:51:39 +01:00
Michele Baldessari	6c33da543b	Allow operator to disable iptables client blocking Currently the resource agent hard-codes iptables calls to block off client access before the resource becomes master. This was done historically because many libraries were fairly buggy detecting a not-yet functional rabbitmq, so they were being helped by getting a tcp RST packet and they would go on trying their next configured server. It makes sense to be able to disable this behaviour because most libraries by now have gotten better at detecting timeouts when talking to rabbit and because when you run rabbitmq inside a bundle (pacemaker term for a container with an OCF resource inside) you normally do not have access to iptables. Tested by creating a three-node bundle cluster inside a container: Container bundle set: rabbitmq-bundle [cluster.common.tag/rhosp16-openstack-rabbitmq:pcmklatest] Replica[0] rabbitmq-bundle-podman-0 (ocf:💓podman): Started controller-0 rabbitmq-bundle-0 (ocf::pacemaker:remote): Started controller-0 rabbitmq (ocf::rabbitmq:rabbitmq-server-ha): Master rabbitmq-bundle-0 Replica[1] rabbitmq-bundle-podman-1 (ocf:💓podman): Started controller-1 rabbitmq-bundle-1 (ocf::pacemaker:remote): Started controller-1 rabbitmq (ocf::rabbitmq:rabbitmq-server-ha): Master rabbitmq-bundle-1 Replica[2] rabbitmq-bundle-podman-2 (ocf:💓podman): Started controller-2 rabbitmq-bundle-2 (ocf::pacemaker:remote): Started controller-2 rabbitmq (ocf::rabbitmq:rabbitmq-server-ha): Master rabbitmq-bundle-2 The ocf resource was created inside a bundle with: pcs resource create rabbitmq ocf:rabbitmq:rabbitmq-server-ha avoid_using_iptables="true" \ meta notify=true container-attribute-target=host master-max=3 ordered=true \ op start timeout=200s stop timeout=200s promote timeout=60s bundle rabbitmq-bundle Signed-off-by: Michele Baldessari <michele@acksyn.org>	2020-01-31 08:26:39 +01:00
Spring Operator	8bcebe2185	URL Cleanup This commit updates URLs to prefer the https protocol. Redirects are not followed to avoid accidentally expanding intentionally shortened URLs (i.e. if using a URL shortener). # Fixed URLs ## Fixed Success These URLs were switched to an https URL with a 2xx status. While the status was successful, your review is still recommended. * [ ] http://www.apache.org/licenses/LICENSE-2.0 with 1 occurrences migrated to: https://www.apache.org/licenses/LICENSE-2.0 ([https](https://www.apache.org/licenses/LICENSE-2.0) result 200).	2019-03-21 03:25:18 -05:00
Michele Baldessari	c587ba79eb	Use ocf_attribute_target instead of crm_node Instead of calling crm_node directly it is preferrable to use the ocf_attribute_target function. This function will return crm_node -n as usual, except when run inside a bundle (aka container in pcmk language). Inside a bundle it will return the bundle name or, if the meta attribute meta_container_attribute_target is set to 'host', it will return the physical node name where the bundle is running. Typically when running a rabbitmq cluster inside containers it is desired to set 'meta_container_attribute_target=host' on the rabbit cluster resource so that the RA is aware on which host it is running. Tested both on baremetal (without containers): Master/Slave Set: rabbitmq-master [rabbitmq] Masters: [ controller-0 controller-1 controller-2 ] And with bundles as well. Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>	2018-11-19 22:06:23 +01:00
Vincent Untz	056f7ed2ec	OCF RA: Do not consider local failures as remote node problems In is_clustered_with(), commands that we run to check if the node is clustered with us, or partitioned with us may fail. When they fail, it actually doesn't tell us anything about the remote node. Until now, we were considering such failures as hints that the remote node is not in a sane state with us. But doing so has pretty negative impact, as it can cause rabbitmq to get restarted on the remote node, causing quite some disruption. So instead of doing this, ignore the error (it's still logged). There was a comment in the code wondering what is the best behavior; based on experience, I think preferring stability is the slightly more acceptable poison between the two options.	2017-12-20 10:24:21 +01:00
Vincent Untz	ea745e62c4	OCF RA: Fix syntax error (cherry picked from commit a9b4a4ff97a96e798de51933fc44f61aa6bc88a3)	2017-12-14 07:07:02 +03:00
Michael Klishin	7e93369f0c	Merge pull request #64 from vuntz/ocf-fix-notify-start OCF RA: Fix various issues with start notification handler	2017-12-12 19:19:39 +03:00
Vincent Untz	a6dc3f91b0	OCF RA: Fix logging in start notification handler The "post-start end" log message was written too early (some things were still done afterwards), and not in all cases (it was inside a if statement).	2017-12-08 14:17:38 +01:00
Vincent Untz	2f284bf595	OCF RA: Do not start rabbitmq if notification of start is not about us Right now, every time we get a start notification, all nodes will ensure the rabbitmq app is started. This makes little sense, as nodes that are already active don't need to do that. On top of that, this had the sideeffect of updating the start time for each of these nodes, which could result in the master moving to another node.	2017-12-08 14:15:24 +01:00
Vincent Untz	a8e7a62513	OCF RA: Fix test for no node in start notification handler If there's nothing starting and nothing active, then we do a -z " ", which doesn't have the same result as -z "". Instead, just test for emptiness for each set of nodes.	2017-12-08 14:13:59 +01:00
Vincent Untz	62a4f75611	OCF RA: Avoid promoting nodes with same start time as master It may happen that two nodes have the same start time, and one of these is the master. When this happens, the node actually gets the same score as the master and can get promoted. There's no reason to avoid being stable here, so let's keep the same master in that scenario.	2017-12-08 13:32:45 +01:00
Michael Klishin	0da346eb88	Merge pull request #21 from vuntz/ocf-limit_nofile OCF RA: Add new limit_nofile parameter to both OCF resource agents	2017-04-05 17:49:34 +03:00
Vincent Untz	73080ac783	OCF RA: Only set limit for open files when higher than current value This allows to set the limit via some other way.	2017-04-04 15:13:52 +02:00
Michael Klishin	91ffc30b66	Merge pull request #24 from vuntz/ocf-vhost OCF RA: Add vhost parameter to rabbitmq-server-ha.ocf	2017-04-04 16:11:08 +03:00
Vincent Untz	89d65b51aa	OCF RA: Add new limit_nofile parameter to rabbitmq-server-ha OCF RA This enables to change the limit of open files, as the default on distributions is usually too low for rabbitmq. Default is 65535.	2017-04-04 15:08:51 +02:00
Vincent Untz	525eaba13a	OCF RA: Add default_vhost parameter to rabbitmq-server-ha.ocf This enables the cluster to focus on a vhost that is not /, in case the most important vhost is something else. For reference, other vhosts may exist in the cluster, but these are not guaranteed to not suffer from any data loss. This patch doesn't address this issue. Closes https://github.com/rabbitmq/rabbitmq-server-release/issues/22	2017-04-04 14:41:50 +02:00
Vincent Untz	9bd1b0a5f3	OCF RA: Don't hardcode primitive name in rabbitmq-server-ha.ocf We can compute the name of the primitive automatically from environment variables, instead of hard-coding p_rabbitmq-server; this makes the resource agent more flexible. Closes https://github.com/rabbitmq/rabbitmq-server-release/issues/23	2017-03-31 13:24:27 +02:00
Dmitry Mescheryakov	67cdbe3067	Correctly return exit code from stop Panicking and returning non-success on stop often leads to resource becoming unmanaged on that node. Before we called get_status to verify that RabbitMQ is dead. But sometimes it returns error even though RabbitMQ is not running. There is no reason to call it - we will just verify that there is no beam process running. Related fuel bug - https://bugs.launchpad.net/fuel/+bug/1626933	2016-10-17 19:43:46 +03:00
Alexey Lebedeff	1d564c8746	OCF RA: Check partitions on non-master nodes Partitions reported by `rabbit_node_monitor:partitions/0` are not commutative (i.e. node1 can report itself as partitioned with node2, but not vice versa). Given that we now have strong notion of master in OCF script, we can check for those fishy situations during master health check, and order damaged nodes to restart. Fuel bug: https://bugs.launchpad.net/fuel/+bug/1628487	2016-09-29 16:13:18 +03:00
Jean-Sébastien Pédron	e97ca28ac7	scripts: Take package-specific files from rabbitmq-server [#130659985]	2016-09-21 16:25:24 +02:00

24 Commits