In order to retain deterministic results of state machine applications
during upgrades we need to make the stream coordinator versioned such
that we only use the new logic once the stream coordinator switches to
machine version 1.
The list consists of candidates, each a tuple {node, tail}, where the tail is {epoch, offset}.
However, 'select_leader' assumes the tail is {offset, epoch}.
Suppose there are two candidates:
[{node1,{1,100}},{node2,{2,99}}]
As a result, node1 is selected as the leader instead of node2, which has the larger epoch.
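The effect of the swapped fields can be sketched in Python, whose tuple comparison mirrors Erlang term order for tuples of integers (this is an illustration of the selection logic, not the coordinator code):

```python
# Candidates as (node, tail), with tail actually stored as (epoch, offset).
candidates = [("node1", (1, 100)), ("node2", (2, 99))]

# Correct: compare tails as (epoch, offset) -- the larger epoch wins first.
correct = max(candidates, key=lambda c: c[1])

# Buggy: read the tail as (offset, epoch) and compare by (epoch, offset),
# which effectively swaps the fields and compares (100, 1) vs (99, 2).
buggy = max(candidates, key=lambda c: (c[1][1], c[1][0]))

print(correct[0])  # node2 (larger epoch)
print(buggy[0])    # node1 (larger offset, despite the smaller epoch)
```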
hexpm-cli v7 introduces changes that break pipelines:
https://github.com/rabbitmq/hexpm-cli/issues/3
It would be better to drop hexpm-cli and use the hex publishing
support present in newer versions of erlang.mk
Add an item to the configuration file (/etc/rabbitmq/rabbitmq.config):
{kernel, [{inet_dist_use_interface, {8193,291,0,0,0,0,0,1}}]}
Use the netstat command to check the IP address of the distribution port (25672):
netstat -anp | grep 25672
tcp6 0 0 2001:123::1:25672 :::* LISTEN 2075/beam.smp
However, 'rabbitmqctl status' shows:
...
Interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
...
This is to address another memory leak on win32 reported here:
https://groups.google.com/g/rabbitmq-users/c/UE-wxXerJl8
"RabbitMQ constant memory increase (binary_alloc) in idle state"
The root cause is the Prometheus plugin making repeated calls to `rabbit_misc:otp_version/0` which then calls `file:read_file/1` and leaks memory on win32.
See https://github.com/erlang/otp/issues/5527 for the report to the Erlang team.
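One mitigation this root cause suggests is to read the version once and cache it, so that metric scrapes stop exercising file:read_file/1 on every call. A minimal sketch in Python; the caching approach, names, and counters here are assumptions for illustration, not the actual patch:

```python
import functools, os, tempfile

# Stand-in for OTP's version file, so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "OTP_VERSION")
with open(path, "w") as f:
    f.write("25.0\n")

reads = {"count": 0}

@functools.lru_cache(maxsize=1)
def otp_version():
    # Read the version file once; subsequent calls return the cached
    # value instead of touching the filesystem (and the leaking win32
    # code path) on every metrics scrape.
    reads["count"] += 1
    with open(path) as f:
        return f.read().strip()

for _ in range(1000):  # e.g. one call per Prometheus scrape
    assert otp_version() == "25.0"
print(reads["count"])  # 1 -- the file was read only once
```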
Turn `badmatch` into actual error
This diff will generate module rabbit_framing_amqp_0_9_1 with
re-ordered clauses of function decode_method_fields/2.
After this change, the basic class clauses will be at the top of the
function. That's better because basic.publish and basic.ack, for
example, are called far more often than methods of class connection,
channel or exchange.
Note that "the compiler does not rearrange clauses that match binaries."
See https://www.erlang.org/doc/efficiency_guide/functions.html#pattern-matching
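The win from clause ordering can be sketched with a Python analog (the prefix list, method names, and "tests performed" counter are illustrative, not the generated Erlang code): when clauses are tried top to bottom and the compiler does not rearrange them, a workload dominated by basic.* performs far fewer failed match attempts if the basic clauses come first.

```python
def decode_hot_first(method):
    # basic.* first: the common case in AMQP 0-9-1 traffic.
    for i, prefix in enumerate(("basic.", "channel.", "connection.", "exchange.")):
        if method.startswith(prefix):
            return i + 1  # number of prefix tests performed
    raise ValueError(method)

def decode_hot_last(method):
    # Same dispatch, but with the hot clauses at the bottom.
    for i, prefix in enumerate(("connection.", "channel.", "exchange.", "basic.")):
        if method.startswith(prefix):
            return i + 1
    raise ValueError(method)

# A workload dominated by basic.publish, as in a typical perf-test run:
workload = ["basic.publish"] * 98 + ["channel.open", "connection.start"]
print(sum(decode_hot_first(m) for m in workload))  # 103 tests total
print(sum(decode_hot_last(m) for m in workload))   # 395 tests total
```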
Measurement taken on an Ubuntu 20.04 VM before and after this change:
1. Install latest Erlang from master (i.e. Erlang 25)
2. RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS="+JPperf true +S 1" make run-broker
(Single scheduler makes the flame graph easier to read.)
3. Run rabbitmq-perf-test without any parameters
4. sudo perf record -g -F 9999 -p <pid> -- sleep 5
where <pid> is the output of 'os:getpid().' (in the Erlang shell).
This samples CPU stack traces via frame pointers of
rabbitmq-server at 9999 Hertz for 5 seconds.
5. Generate a differential flame graph as described in
https://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html
Before this change, stack frame rabbit_framing_amqp_0_9_1:decode_method_fields/2
was present in 1.57% of all stack traces, most of the time (> 1%) running
on the CPU, i.e. directly consuming CPU cycles.
This does not sound like a lot, but is actually quite a lot for a single
function!
The differential flame graph depicts a single dark blue frame: function decode_method_fields
with a reduction of ~0.85%.
Related to VESC-1015
* Remove `infinity` timeouts
* Improve free disk space retrieval on win32
Run commands with a timeout
This PR fixes an issue I observed while reproducing VESC-1015 on Windows
10. Within an hour or so of running a 3-node cluster that has health
checks being run against it, one or more nodes' memory use would spike.
I would see that the rabbit_disk_monitor process was stuck executing
os:cmd to retrieve free disk space information. Thus, all
gen_server:call calls to the process would never return, especially
since they used an infinity timeout.
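The "run commands with a timeout" pattern can be sketched as follows (Python used for illustration; the function name and timeout value are hypothetical, and this is not the rabbit_disk_monitor code): a hung external command then fails fast instead of blocking the caller forever.

```python
import subprocess

def run_with_timeout(cmd, timeout_s=5.0):
    # Run an external command, but give up after timeout_s seconds so a
    # stuck command (like the hung free-disk-space query on win32)
    # cannot wedge the calling process indefinitely.
    try:
        out = subprocess.run(cmd, capture_output=True, text=True,
                             timeout=timeout_s, check=True)
        return out.stdout
    except subprocess.TimeoutExpired:
        # Caller can retry later or report "unknown" instead of hanging.
        return None

print(run_with_timeout(["echo", "ok"]))  # "ok\n"
```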
Do something with timeout
Fix unit_disk_monitor_mocks_SUITE