14515 Commits

Author SHA1 Message Date
Michael Klishin 4f7da6bef9 Merge branch 'rabbitmq-server-541' into stable
This is only a part of what #541 is supposed to cover but
it already helped in a particular node shutdown lockup we've
observed => worth merging earlier.

Per discussion with @dcorbacho.
2016-01-29 17:13:53 +03:00
Michael Klishin 548c1ca6bf Merge pull request #588 from rabbitmq/rabbitmq-management-117
Specify hash algorithm in change_password_hash
2016-01-29 02:59:51 +03:00
Michael Klishin 25f7e3d4f5 Trailing ws 2016-01-28 22:51:31 +03:00
Daniil Fedotov f1f28eac1b Specify hash algorithm in change_password_hash 2016-01-28 15:17:06 +00:00
Michael Klishin 6aba71548f Merge branch 'stable' into rabbitmq-server-541 2016-01-27 16:16:45 +03:00
Michael Klishin 9af509002c Merge pull request #584 from rabbitmq/rabbitmq-server-581
Unblock receive after 15s
2016-01-27 16:16:12 +03:00
Diana Corbacho dc6cb6ba63 Increase supervisor timeout to net_ticktime + 10s 2016-01-27 12:17:27 +00:00
Diana Corbacho a540dcb152 Unblock receive after 15s 2016-01-27 12:15:40 +00:00
Diana Corbacho 266d94ac27 Introduce timeout in rabbit_channel_sup 2016-01-27 12:07:52 +00:00
Jean-Sébastien Pédron 5dfd117a18 rabbit.erl: Do not run systemd-notify on Windows
This silences a warning logged during RabbitMQ startup.
2016-01-27 11:51:16 +01:00
Jean-Sébastien Pédron 44cd71d1a4 Merge pull request #535 from rabbitmq/rabbitmq-server-307
Ignore duplicate down_from_ch
2016-01-27 09:08:19 +01:00
Michael Klishin 9b96066d59 Merge branch 'rabbitmq-server-493' into stable 2016-01-27 04:06:33 +03:00
Jean-Sébastien Pédron bb8c65c0e7 Create directories and files on Windows before conversion to short filenames
If the directory or file does not exist before RabbitMQ starts, we can't let
RabbitMQ create it, otherwise, it's created with its short filename, not its
long one.

With this new correction, we can "escape" all variables instead of only
RABBITMQ_BASE.

Fixes #493.
2016-01-26 18:42:57 +01:00
Jean-Sébastien Pédron 8431261b43 rabbitmq-server.bat: Honor RABBITMQ_LOGS=- to log to stdout
Note that at the time of this commit, Lager does not support logging
to stdout on Windows. This commit still improves consistency between
Unix and Windows.

References #493.
2016-01-26 11:38:30 +01:00
Jean-Sébastien Pédron 0cf09727a6 Use RABBITMQ_HOME to set the path to RabbitMQ ebin directory
Compared to the script's parent directory (stored in TDP0),
RABBITMQ_HOME is converted to a short filename to avoid non-ASCII in the
path.

Fixes #493.
2016-01-26 11:29:43 +01:00
Jean-Sébastien Pédron 4fdacff37d Use short filenames in Windows startup scripts
On Windows, cmd.exe and batch scripts apparently do not support Unicode.
However, Windows uses UTF-16 to encode filenames on disk. In batch
scripts, filenames are converted to some one-byte-wide charset. Once
passed to Erlang and RabbitMQ, those filenames are incorrect. In
particular, the management UI is unhappy because filenames obviously
contain invalid UTF-8 characters.

Using short filenames makes sure filenames only contain US-ASCII
characters.

To convert them, we use "for" expansion. At the same time, filenames are
made absolute. It works even better than realpath.exe because the latter
also converts filenames to another charset again.

Fixes #493.
2016-01-26 11:29:39 +01:00
Michael Klishin 6d3636afe1 Merge pull request #573 from binarin/rabbitmq-server-systemd-notify-zero-deps
Use systemd-notify(1) shell helper as fallback
2016-01-22 17:06:52 +03:00
Alexey Lebedeff 466dea8ba6 Use systemd-notify(1) shell helper as fallback
Currently the external Erlang library `sd_notify` is used to make a systemd
unit with `Type=notify` work correctly. This library contains some C
code and thus cannot be built into an architecture-independent package.

But it is not actually needed, as systemd provides systemd-notify(1)
helper for shell scripts which serves exactly the same purpose.

The only thing is that you need to add `NotifyAccess=all` to your unit
file to make everything work well.
2016-01-22 16:46:47 +03:00
Michael Klishin 116f9bd449 Merge pull request #571 from binarin/rabbitmq-server-ocf-shell-quoting
Fix usage of uninitialized variable in OCF script
2016-01-21 15:04:22 +03:00
Alexey Lebedeff 6599946f13 Fix usage of uninitialized variable in OCF script 2016-01-21 14:56:30 +03:00
Michael Klishin 09efccd62c Merge pull request #563 from binarin/rabbitmq-server-ocf-list-channels-diagnostics
Improve OCF script diagnostics for timed-out 'list_channels'
2016-01-21 14:12:07 +03:00
Michael Klishin 8021fef203 Merge pull request #566 from rabbitmq/rabbitmq-server-319
Remove duplicate code in pre_publish and publish functions
2016-01-20 17:34:33 +03:00
Alexey Lebedeff 3108dabf61 Improve 'list_channels' diagnostics in OCF
The timeout(1) manpage mentions 124 as another valid return code, in addition to 128 + signal-number.
2016-01-20 17:14:38 +03:00
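The two families of exit status mentioned in this commit can be told apart with a few comparisons. The following is a minimal sketch, not the OCF script's actual logic; the function name `describe_timeout_status` and the returned strings are hypothetical and chosen only for illustration.

```python
TIMEOUT_EXPIRED = 124  # status timeout(1) reports when the time limit is hit


def describe_timeout_status(status):
    """Classify an exit status observed after running a command under timeout(1).

    Hypothetical helper for illustration only.
    """
    if status == 0:
        return "ok"
    if status == TIMEOUT_EXPIRED:
        return "timed out"
    if status > 128:
        # 128 + signal-number: the command was terminated by a signal
        return "killed by signal %d" % (status - 128)
    return "failed with status %d" % status
```

For example, a status of 137 is 128 + 9, so it is reported as a kill by signal 9.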
Loïc Hoguin 7a0b50ce24 Remove duplicate code in pre_publish and publish functions 2016-01-20 14:15:16 +01:00
Michael Klishin afd30fb0a1 Merge pull request #560 from dmitrymex/reset-master-score
Reset master score if we decide to restart RabbitMQ on timeout
2016-01-20 15:29:18 +03:00
Alexey Lebedeff e78bc2d9b7 Improve rabbitmq OCF script diagnostics
Currently a time-out when running 'rabbitmqctl list_channels' is treated
as a sign that the current node is unhealthy. But that may not be the
case, as the hanging channel could actually be on some other
node. Given that we currently have more than one bug related to
'list_channels', it makes sense to improve diagnostics here.

This patch doesn't change any behaviour, only improves logging after
time-out happens. If time-outs continue to occur (even with latest
rabbitmq versions or with backported fixes), we could switch to this
improved list_channels and kill rabbitmq only if stuck channels are
located on current node. But I hope that all related rabbitmq bugs
were already closed.
2016-01-20 12:30:02 +03:00
Dmitry Mescheryakov 9846fbf5d3 Reset master score if we decide to restart RabbitMQ on timeout
Doing otherwise might not trigger the restart while it is clearly
needed.
2016-01-19 19:15:31 +03:00
Michael Klishin 6f6825a074 Merge pull request #558 from galanoff/stable
Add optional prefix for RabbitMQ node FQDNs
2016-01-19 16:50:29 +03:00
Kyrylo Galanov 3202d21514 Add optional prefix for RabbitMQ node FQDNs
This allows instantiating multiple rabbit clusters constructed
from prefix-based instances of rabbit nodes.
2016-01-18 17:44:25 +02:00
Jean-Sébastien Pédron 115babd868 Merge pull request #527 from binarin/rabbitmq-server-better-startup-diagnostics
Improve diagnostics in 'rabbitmq-server' script
2016-01-15 18:06:00 +01:00
Michael Klishin c02d7ba8e4 Merge pull request #547 from bogdando/bug/1531838
Fix rabbitMQ OCF monitor detection of running master
2016-01-15 14:06:10 +03:00
Michael Klishin 9a1b1caad9 Merge pull request #543 from binarin/rabbitmq-server-rotate-logs-data-loss
Fix 'rabbitmqctl rotate_logs' behaviour
2016-01-15 02:37:58 +03:00
Michael Klishin d54c06ab2f Merge pull request #552 from binarin/rabbitmq-server-549
Limit number of unique node names for rabbitmqctl
2016-01-14 23:48:40 +03:00
Alexey Lebedeff 5eeed4886c Limit number of unique node names for rabbitmqctl
It prevents atom table overflow in a long running broker.

Fixes #549
2016-01-14 17:16:25 +03:00
Bogdan Dobrelya 6fd4eb5bcb Fix rabbitMQ OCF monitor detection of running master
When the monitor detects the node as OCF_RUNNING_MASTER, this may be
lost while the monitor checks are in progress.
* Rework the prev_rc by the rc_check to fix this.
* Also add info log if detected as running master.
* Break the monitor check loop early, if it shall be exiting to be
  restarted by pacemaker.
* Do not recheck the master status and do not update the master score,
  if the node was already detected by monitor as OCF_RUNNING_MASTER.
  By that point, the running and healthy master shall not be checked
  against other nodes uptime as it is pointless and only takes more
  time and resources for the action monitor to finish.
* Fail early, if monitor detected the node as OCF_RUNNING_MASTER, but
  the rabbit beam process is not running
* For OCF_CHECK_LEVEL>20, exclude the current node from the check
  loop as we already checked it before

Related Fuel bug:
https://launchpad.net/bugs/1531838

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-01-14 10:03:57 +01:00
Alexey Lebedeff dbbe3d6e47 Fix 'rabbitmqctl rotate_logs' behaviour
When 'rabbitmqctl rotate_logs' is called without any parameters, it
clears logs unconditionally. And given that this form is used in
logrotate config files, this could result in data loss.

This could be reproduced with the following scenario:
1) 'max_size' is set globally in logrotate config
2) One of two rabbitmq logs is greater than that limit
3) Daily logrotate run was already performed today, and now we
   are calling it manually. In this case logrotate will copy only file
   that is bigger than max_size, but 'rabbitmqctl rotate_logs' will
   clear both of them - leading to data loss.
2016-01-12 17:03:19 +03:00
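The scenario above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not code from the patch: the function `bytes_lost`, the log names, and the sizes are all hypothetical, and the model only captures the two behaviours the message describes (logrotate copies only the files above the limit, while a no-argument 'rabbitmqctl rotate_logs' clears every log).

```python
def bytes_lost(log_sizes, max_size):
    """Return how many bytes are cleared without having been copied first.

    log_sizes maps a log name to its size in bytes (hypothetical data).
    """
    # logrotate copies only the files that exceed the size limit
    copied = {name for name, size in log_sizes.items() if size > max_size}
    # a no-argument rotate_logs clears every log unconditionally
    cleared = set(log_sizes)
    return sum(log_sizes[name] for name in cleared - copied)


# Illustrative names and sizes: one log above the limit, one below it.
logs = {"rabbit.log": 2000, "rabbit-sasl.log": 300}
lost = bytes_lost(logs, max_size=1000)  # the smaller log is cleared uncopied
```

In this model the loss disappears only when every log exceeds the limit, since only then is every cleared file also copied.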
Michael Klishin 86db8bccc9 Merge pull request #542 from rabbitmq/rabbitmq-server-528
Make number of Ranch acceptors configurable
2016-01-12 16:14:55 +03:00
Michael Klishin 577183deed Merge branch 'stable' into rabbitmq-server-528 2016-01-12 14:31:56 +03:00
Loïc Hoguin ce484f2fa9 Make number of Ranch acceptors configurable 2016-01-12 11:25:57 +01:00
Michael Klishin 1dcaad8f48 Merge pull request #540 from bogdando/bug/1529897
OCF: Fuel bug 1529897
2016-01-12 13:25:47 +03:00
Joseph Yiasemides 93b9e37c3e Include alarm information in output for cluster status
After this change `rabbitmqctl cluster_status` will print information
about alarms raised across a cluster.
2016-01-11 19:04:05 +03:00
Bogdan Dobrelya 5a3418f2f1 Fix get_status, action_stop, proc_stop when beam's unresponsive
* Fix get_status() to catch beam state and output errors
* Fix action_stop() to force name-based matching when there is no
pidfile and the beam's unresponsive
* Fix proc_stop to use name based matching if no pidfile
found
* Fix proc_stop to retry sending the signal when using the name
based match as well

W/o this patch, the situation is possible when:
- beam's running and cannot process signals, but is reported "not running"
by the get_status(), while in fact it shall be reported as generic error
- which_applications() returned error, while its output is still
being parsed for the "what" match, while it shall not.
- action stop and proc_stop give up when there is no pidfile and the beam's
running but unresponsive.

The solution is to make get_status to return generic error and action
stop to use the rabbit process name matching for killing it.

Related Fuel bug:
https://bugs.launchpad.net/fuel/+bug/1529897

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-01-11 15:45:15 +01:00
Bogdan Dobrelya 968623d98d Fix proc_kill when there is no pid found
W/o this fix, the rabbit OCF cannot make
proc_stop to try to kill the pid-less beam process
by its name matching because the proc_kill()'s
1st parameter cannot be passed empty.

The fix is to use the "none" value when the pid-less
process must be matched by the service_name instead.

Also, fix proc_kill to deal with multi-process
pid files as well (many pids, space-separated).

Related Fuel bugs:
https://launchpad.net/bugs/1529897
https://launchpad.net/bugs/1532723

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-01-11 13:36:59 +01:00
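The two cases this commit describes, the "none" value and a pid file holding several space-separated pids, can be sketched as follows. This is an illustration only, not the shell code of the OCF script; the function `select_kill_targets` and its return shape are hypothetical.

```python
NO_PID = "none"  # value the commit message names for the pid-less case


def select_kill_targets(pid_arg, service_name):
    """Decide whether to kill by pid list or by process name.

    Hypothetical helper: returns ("name", service_name) for the pid-less
    case, otherwise ("pids", [...]) with every pid found in pid_arg.
    """
    if pid_arg.strip() == NO_PID:
        # No pid is known, so the process must be matched by name instead
        return ("name", service_name)
    # A multi-process pid file holds several pids separated by spaces
    return ("pids", [int(pid) for pid in pid_arg.split()])
```

Passing an explicit value such as "none" avoids the problem noted above, where the first parameter cannot be passed empty.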
Michael Klishin 7e1022d11b Merge pull request #533 from rabbitmq/rabbitmq-federation-7
Set deleting exchange status
2016-01-11 14:33:19 +03:00
Michael Klishin 87e68fef82 Merge pull request #538 from bogdando/bug/1529897
Syntax and local vars usage fixes to OCF HA
2016-01-11 13:20:33 +03:00
Bogdan Dobrelya b00c0576dd Syntax and local vars usage fixes to OCF HA
Related Fuel bug:
https://launchpad.net/bugs/1529897

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2016-01-11 10:59:20 +01:00
Michael Klishin 630beb3a5c Clear temporary runtime exchange parameters on boot
This makes sure that values that were set right before
node failure or restart are not retained.
2016-01-10 01:40:19 +03:00
Michael Klishin 7b864ffa82 Elaborate 2016-01-09 20:38:12 +03:00
Michael Klishin 438f16d2b8 Ensure exchange-delete-in-progress is always cleared 2016-01-09 20:30:04 +03:00
Michael Klishin 26b670a90e Merge branch 'stable' into rabbitmq-federation-7 2016-01-09 18:01:05 +03:00