Commit Graph

24893 Commits

Author SHA1 Message Date
Luke Bakken 6de0656fce
Add redbug library
`redbug` complements `recon` well and has a better tracing interface IMHO.
2022-01-11 16:22:05 -08:00
Luke Bakken 95a60fc3be
Replace one use of filelib:is_regular/1
This specific case is called multiple times by the Prometheus plugin. It eventually calls `file:read_file_info/1`, which leaks memory on Windows.

See #3936
2022-01-11 09:08:34 -08:00
Luke Bakken fd781441f3
Fix issue with fsutil
Fsutil has language-specific messages. Fix by using powershell.exe instead.

Follow-up to #3895

Reported here:
https://groups.google.com/g/rabbitmq-users/c/ypk51AtmrSM
2022-01-08 05:37:41 -08:00
Karl Nilsson 82470e9d1c Stream coordinator: handle machine_version command 2022-01-07 12:30:03 +00:00
Karl Nilsson 9a5d0f9d85 Make stream coordinator machine versioned
In order to retain deterministic results of state machine applications
during upgrades we need to make the stream coordinator versioned such
that we only use the new logic once the stream coordinator switches to
machine version 1.
2022-01-07 12:11:11 +00:00
tomyouyou e5ccf267ff 'rabbit_stream_coordinator:select_leader' runs with wrong comparison
Each candidate in the list is a tuple {node, tail}, where the tail is {epoch, offset}.
However, 'select_leader' treats the tail as {offset, epoch}.

Suppose there are two candidates:
[{node1,{1,100}},{node2,{2,99}}] 

It selects node1 as the leader instead of node2, which has the larger epoch.
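
The comparison problem can be reproduced outside Erlang. The following is a minimal Python sketch, not the actual `rabbit_stream_coordinator` code; the function names `select_leader_buggy` and `select_leader_fixed` are hypothetical and exist only for illustration. It assumes the leader is simply the candidate with the largest tail.

```python
# Candidates as (node, (epoch, offset)), mirroring the example in this message.
candidates = [("node1", (1, 100)), ("node2", (2, 99))]


def select_leader_buggy(cands):
    # Misreads the tail as (offset, epoch), so it effectively
    # compares (100, 1) with (99, 2) and the larger offset wins.
    return max(cands, key=lambda c: (c[1][1], c[1][0]))[0]


def select_leader_fixed(cands):
    # Compares the tail as (epoch, offset): the larger epoch wins,
    # and the offset only breaks ties within the same epoch.
    return max(cands, key=lambda c: c[1])[0]


print(select_leader_buggy(candidates))  # node1
print(select_leader_fixed(candidates))  # node2
```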
2022-01-07 10:06:16 +00:00
Michael Klishin 5a07f728c3
Merge pull request #3956 from tomyouyou/record_dist_ip
Distribution listener IP address is hardcoded in listener metadata
2022-01-05 17:38:35 +04:00
tomyouyou fac249b755
The wrong distribution listener IP address is recorded when it is configured.
Add an item to the configuration file (/etc/rabbitmq/rabbitmq.config):
{kernel, [{inet_dist_use_interface, {8193,291,0,0,0,0,0,1}}]}

Use the netstat command to check the IP address of the distribution port (25672):
netstat -anp | grep 25672
tcp6       0      0 2001:123::1:25672       :::*                    LISTEN      2075/beam.smp

However, 'rabbitmqctl status' shows:
...
Interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
...
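
The 8-tuple in the configuration item is Erlang's representation of an IPv6 address as eight 16-bit integers. As a quick check that it matches the address reported by netstat, here is a small Python sketch; the helper name `erlang_ipv6_tuple_to_str` is hypothetical and not part of RabbitMQ.

```python
import ipaddress


def erlang_ipv6_tuple_to_str(groups):
    """Render an Erlang-style IPv6 8-tuple in compressed textual form."""
    if len(groups) != 8 or any(not 0 <= g <= 0xFFFF for g in groups):
        raise ValueError("expected eight 16-bit integers")
    # Each group contributes two bytes, most significant byte first.
    packed = b"".join(g.to_bytes(2, "big") for g in groups)
    return str(ipaddress.IPv6Address(packed))


print(erlang_ipv6_tuple_to_str((8193, 291, 0, 0, 0, 0, 0, 1)))  # 2001:123::1
```

The all-zero tuple renders as `::`, which is the form that appears as `[::]` in the 'rabbitmqctl status' output above.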
2022-01-05 16:03:24 +08:00
Michael Klishin d2e0fae1aa
Bump (c) year in node startup banner 2022-01-05 09:03:46 +04:00
skorzhevsky ec1ee0c011
Fix config example in README.md for rabbitmq_auth_backend_cache 2022-01-04 14:30:04 +03:00
Michael Klishin c75ac14efa
Merge pull request #3936 from rabbitmq/lukebakken/fix-all-read-file
Fix all uses of file:read_file/1
2022-01-04 12:46:34 +04:00
Philip Kuryloski 12f58beb04
Merge pull request #3941 from rabbitmq/bazel-run-dev-broker
Disable +deterministic in compilation_mode dbg under bazel
2022-01-04 09:30:15 +01:00
Luke Bakken 7f0285834e
Fix all uses of file:read_file/1
This is to address another memory leak on win32 reported here:

https://groups.google.com/g/rabbitmq-users/c/UE-wxXerJl8

"RabbitMQ constant memory increase (binary_alloc) in idle state"

The root cause is the Prometheus plugin making repeated calls to `rabbit_misc:otp_version/0` which then calls `file:read_file/1` and leaks memory on win32.

See https://github.com/erlang/otp/issues/5527 for the report to the Erlang team.

Turn `badmatch` into actual error
2022-01-03 11:33:36 -08:00
dcorbacho 0bd8d41b72 Skip new import testcase on mixed environments 2022-01-03 17:37:06 +01:00
Philip Kuryloski 70ef6b6984 Disable +deterministic in compilation_mode dbg under bazel
This allows compiling and reloading code from the erlang shell when
running the broker with `bazel run -c dbg broker`
2022-01-03 12:38:06 +01:00
David Ansari e10247feee Reduce CPU usage of rabbit_framing_amqp_0_9_1:decode_method_fields/2
This diff will generate module rabbit_framing_amqp_0_9_1 with
re-ordered clauses of function decode_method_fields/2.

After this change, the basic class will be at the top of the function
clauses. That's better because methods such as basic.publish and basic.ack
are called far more often than methods of classes such as connection,
channel or exchange.

Note that "the compiler does not rearrange clauses that match binaries."
See https://www.erlang.org/doc/efficiency_guide/functions.html#pattern-matching

Measurement taken on an Ubuntu 20.04 VM before and after this change:
1. Install latest Erlang from master (i.e. Erlang 25)
2. RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+JPperf true +S 1" make run-broker
   (Single scheduler makes the flame graph easier to read.)
3. Run rabbitmq-perf-test without any parameters
4. sudo perf record -g -F 9999 -p <pid> -- sleep 5
   where <pid> is the output of 'os:getpid().' (in the Erlang shell).
   This samples CPU stack traces via frame pointers of
   rabbitmq-server at 9999 Hertz for 5 seconds.
5. Generate a differential flame graph as described in
   https://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html

Before this change, stack frame rabbit_framing_amqp_0_9_1:decode_method_fields/2
was present in 1.57% of all stack traces, most of the time (> 1%) running
on the CPU, i.e. directly consuming CPU cycles.
This does not sound like a lot, but is actually quite a lot for a single
function!

The differential flame graph depicts a single dark blue frame: function decode_method_fields
with a reduction of ~0.85%.
2021-12-31 02:26:21 +01:00
Michael Klishin 4993fefa08
#3925 follow-up: add a rabbit_common Bazel dep 2021-12-28 01:30:07 +03:00
Michael Klishin 19ae35aa14
#3925 follow-up: don't include Erlang client headers 2021-12-28 01:24:32 +03:00
Michael Klishin 7ded41f26a
#3925 follow-up: update Bazel files to match new suite names 2021-12-28 01:19:00 +03:00
Michael Klishin b569ab5d74
Rename two newly introduced test modules 2021-12-28 00:35:55 +03:00
Michael Klishin dfa730b737
delegate: documentation edits 2021-12-26 04:32:00 +03:00
Michael Klishin 202f881601
Make xref happy 2021-12-26 04:32:00 +03:00
dcorbacho c88605aab4
Import definitions: support user limits 2021-12-26 04:32:00 +03:00
Michael Klishin 5d2a735ae7
Cosmetics 2021-12-26 04:32:00 +03:00
Lajos Gerecs c972f07816
wrap authentication calls in try catch to avoid leaking error 2021-12-26 04:32:00 +03:00
Luke Bakken d1496a2c7c
Fix tests 2021-12-26 04:32:00 +03:00
Luke Bakken 043641c99f
Use protected ets so that data can be read quickly 2021-12-26 04:31:59 +03:00
Luke Bakken eecfa0a1e9
Clarify warning message 2021-12-26 04:31:59 +03:00
Luke Bakken fe8ae3c713
Restore old win32 free disk query using `dir` as a last resort 2021-12-26 04:31:59 +03:00
Luke Bakken c6271a90a0
Be smarter about extracting the drive letter from a directory on win32 2021-12-26 04:31:59 +03:00
Luke Bakken 6200887f84
Disk monitor improvements
Related to VESC-1015

* Remove `infinity` timeouts
* Improve free disk space retrieval on win32

Run commands with a timeout

This PR fixes an issue I observed while reproducing VESC-1015 on Windows
10. Within an hour or so of running a 3-node cluster that has health
checks being run against it, one or more nodes' memory use would spike.
I would see that the rabbit_disk_monitor process is stuck executing
os:cmd to retrieve free disk space information. Thus, all
gen_server:call calls to the process would never return, especially
since they used an infinity timeout.
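
The general idea of bounding an external command with a timeout can be sketched as follows. This is a Python illustration of the technique only, not the Erlang change made in rabbit_disk_monitor; the helper name `run_with_timeout` and the timeout values are assumptions chosen for the example.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run a command and return its stdout, or None if it exceeds timeout_s."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so the caller
        # is never left waiting on a stuck command.
        return None
    return done.stdout


# A command that finishes quickly returns its output.
fast = run_with_timeout([sys.executable, "-c", "print('ok')"], 10)
# A command that hangs longer than the limit is abandoned.
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
print(repr(fast), slow)
```

A caller can then treat `None` as "no fresh reading available" and fall back to the previous value instead of blocking.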

Do something with timeout

Fix unit_disk_monitor_mocks_SUITE
2021-12-26 04:31:59 +03:00
tomyouyou 2d58ce7c5f
Optimisation for 'delegate'
This is copied from https://github.com/rabbitmq/rabbitmq-common/pull/349


If a message is sent to only one queue (the case in most application scenarios), passing it through 'delegate' is pointless: it only increases message latency and the chance of 'delegate' congestion.

Here is some test data:
node1: Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
node2: Pentium(R) Dual-Core CPU E5300 @ 2.60GHz

Join node1 and node2 to a cluster. Create 100 queues on node2, and start 100 consumers to receive messages from these queues.
Start 100 publishers on node1 to send messages to the queues on node2. Each publisher sends 10k messages at a rate of 100/s (theoretically 10k/s in total), for 1 million messages across all publishers.

Before optimisation:
{1,[{msg_time,812312(=<1ms),177922(=<5ms),9507(=<50ms),221(=<500ms),38(=<1000ms),0,0,0,0,1061,1069,0,0}]}

After optimisation:
{1,[{msg_time,902854(=<1ms),93993(=<5ms),3038(=<50ms),96(=<500ms),19(=<1000ms),0,0,0,0,1049,1060,0,0}]}

Additional information:

The time counted here is how long a message stays in the cluster, that is, Time(leaving node2) - Time(reaching node1).
"812312(=<1ms)" is the number of messages with time consumption less than or equal to 1ms.
Overall, the optimisation is effective.
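
To make the two result lines easier to compare, the labelled counts can be turned into percentages. The sketch below is Python and uses only the five labelled counts from each line. Each set of five sums to exactly 1,000,000, so it treats every count as its own bucket; the remaining unlabelled fields are not explained in this message and are left out.

```python
buckets = ["<=1ms", "<=5ms", "<=50ms", "<=500ms", "<=1000ms"]
before = [812312, 177922, 9507, 221, 38]
after = [902854, 93993, 3038, 96, 19]


def shares(counts):
    """Return each count as a percentage of the total."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


for label, b, a in zip(buckets, shares(before), shares(after)):
    print(f"{label:>9}: {b:8.4f}% -> {a:8.4f}%")
```

Read this way, the share of messages delivered within 1ms rises from about 81.2% to about 90.3%, and every slower bucket shrinks.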
2021-12-26 04:31:58 +03:00
Thuan Duong Ba 542d2cf7a5 add bazel rule definition for rabbit_mirror_queue_misc_SUITE and rabbit_mirror_queue_sync_SUITE 2021-12-20 17:39:06 -08:00
Thuan Duong Ba fe8bd1508a fix the sync pause time calculation 2021-12-20 17:39:06 -08:00
Thuan Duong Ba 83b94ca6a9 reset counter after each sync throughput check interval 2021-12-20 17:39:06 -08:00
Thuan Duong Ba dc6fb24761 minor fix on condition to stop batching when total batch size is large 2021-12-20 17:39:06 -08:00
Thuan Duong Ba 1ab485b44c minor update for batching messages when syncthroughput is 0 2021-12-20 17:39:06 -08:00
Thuan Duong Ba 157bffa332 Support configuring max sync throughput in CMQs 2021-12-20 17:39:06 -08:00
Michael Klishin cee6c25bc0
A slightly improved log message wording 2021-12-20 12:56:20 +05:00
Anh Thi Lan Nguyen 89fd4aba46
Increase token expiration time 2021-12-20 12:36:10 +05:00
Anh Thi Lan Nguyen 8bcfbd594f
Start SSL app for testing server 2021-12-20 12:36:10 +05:00
Anh Thi Lan Nguyen 77608eb624
Standardise README.md 2021-12-20 12:36:10 +05:00
Anh Thi Lan Nguyen 20af75bcdd
Correct configuration example in README.md 2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen 19ea17c652
Add timeout for httpc request 2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen 0e46c873de
Add configurable crl_check and fail_if_no_peer_cert
- Add configuration: crl_check, fail_if_no_peer_cert
- Correct configuration: hostname_verification
2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen b803f9ea75
Add wildcard configuration
A "wildcard" configuration is added to enable key server verification with a wildcard certificate.
2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen 9565e7d975
Update README.md
- Update new configuration document
- Add configurable "depth" for key server verification
2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen 0ff2e0c4e4
Set peer_verification default as verify_none 2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen f658a51cbc
Use better configuration names
- "strict" changes to "https.peer_verification"
- "cacertfile" changes to "https.cacertfile"
2021-12-20 12:36:09 +05:00
Anh Thi Lan Nguyen 5abfc2b547
Oauth2 plugin improvements
- Validate JWKS server when getting keys
- Restrict usable algorithms
2021-12-20 12:36:08 +05:00