The `:io.format/2` call was originally passed a single-quoted string
(i.e. a charlist in Elixir terminology), which emits a warning in more
recent Elixir versions:
warning: single-quoted strings represent charlists. Use ~c"" if you indeed want a charlist or use "" instead
└─ nofile:1:12
This warning would pop up a few times when using `make dialyze` within
a deps directory. To resolve it we can switch the quoting so that the
eval string is wrapped in single quotes (equivalent for shell since this
line doesn't use variables) and the format argument is wrapped in double
quotes. This uses a binary in Elixir instead, but that's OK because
`io:format/2`'s `io:format()` parameter may be an atom, a string,
or a binary.
This trick was copied from Makefile:49 which uses the same quoting.
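The quoting swap can be sketched as follows. The expression is a hypothetical stand-in and not the actual line from the Makefile; the sketch only prints what the shell would hand to the evaluator in each case.

```shell
#!/bin/sh
# Hypothetical eval strings; the real Makefile line is not reproduced here.

# Before: shell double quotes outside, so the Elixir code inside uses
# single quotes, which Elixir parses as a charlist (and now warns about).
old_expr=":io.format('~s~n', [:ok])"

# After: shell single quotes outside, so the Elixir code inside can use
# double quotes, which Elixir parses as a binary.
new_expr=':io.format("~s~n", [:ok])'

# Print exactly what the shell passes along in each case.
printf '%s\n' "$old_expr"
printf '%s\n' "$new_expr"
```

Single quotes disable shell expansion, so this form is only safe when the string contains no shell variables, which is the case described above.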
For example, if the first restarted node doesn't start,
don't try to restart the other nodes. This mimics what
orchestrators such as Kubernetes or BOSH would do
(although they perform this check differently).
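A minimal sketch of the stop-at-first-failure behaviour, assuming a hypothetical `restart_node` helper (here a stub that fails for a node named "bad"); the real check is not shown in this message and may look different.

```shell
#!/bin/sh
# restart_node is a hypothetical stand-in; any node named "bad" fails
# so that the early exit can be observed.
restart_node() {
  [ "$1" != "bad" ]
}

# Restart nodes in order and give up at the first one that fails.
restart_nodes() {
  restarted=""
  for node in "$@"; do
    if ! restart_node "$node"; then
      echo "node $node did not start; skipping the remaining nodes" >&2
      return 1
    fi
    restarted="$restarted $node"
  done
}
```

Calling `restart_nodes a bad c` restarts only `a` and returns a non-zero status.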
All CT logs will now be under <toplevel>/logs. An improved
test workflow would be to always keep the logs/all_runs.html
page open in the browser and refresh it whenever tests are
run in any of the rabbit applications.
Because `ct_master` is yet another Erlang node, and it is used
to run multiple CT nodes, meaning it is in a cluster of CT
nodes, the tests that change the net_ticktime could not
work properly anymore. This is because net_ticktime must
be the same value across the cluster.
The same value had to be set for all tests in order to solve
this. This is why it was changed to 5s across the board. The
lower net_ticktime was used in most places to speed up tests
that must deal with cluster failures, so that value is good
enough for these cases.
One test in amqp_client was using the net_ticktime to test
the behavior of the direct connection timeout with varying
net_ticktime configurations. The test now mocks the
`net_kernel:get_net_ticktime()` function to achieve the
same result.
This has no real impact on performance[1] but should
make it clear which application can run the broker
and/or publish to Hex.pm. In particular, applications
that we can't run the broker from will now give up
early if we try to.
Note that while the broker can't normally run from the
amqp_client application's directory, it can run from
tests and some of the tests start the broker.
[1] on my machine
No real need to have two files, especially since it contains
only a few variable definitions. Plan is to only keep
separate files for larger features such as dist or run.
The DIST step used rsync for copying files; changing this
to using cp/rm provides a noticeable speed boost.
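As an illustration only, mirroring a directory with rm and cp could look like the sketch below. The `mirror_dir` function and all paths are hypothetical; the actual DIST recipe is not reproduced here.

```shell
#!/bin/sh
# Mirror a source directory into a destination using only rm and cp.
# mirror_dir and the paths below are hypothetical.
mirror_dir() {
  src=$1
  dest=$2
  rm -rf "$dest"                   # drop the old copy so no stale files survive
  mkdir -p "$(dirname "$dest")"    # make sure the parent directory exists
  cp -R "$src" "$dest"             # copy the whole tree in one go
}

# Small demonstration in a temporary directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/ebin" "$tmp/plugins/app"
echo "beam" > "$tmp/src/ebin/a.beam"
echo "stale" > "$tmp/plugins/app/old.beam"

mirror_dir "$tmp/src" "$tmp/plugins/app"
ls -R "$tmp/plugins/app"
```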
Before this commit the situation was as follows. With
FAST_RUN_BROKER=1 we are pretty fast but don't benefit
from parallel make:
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=1
2,04s user 1,57s system 90% cpu 4,016 total
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=1 -j8
2,08s user 1,55s system 89% cpu 4,069 total
With FAST_RUN_BROKER=0 we are slow; on the other hand
we greatly benefit from parallel make:
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0
3,29s user 1,93s system 81% cpu 6,425 total
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0 -j8
3,36s user 1,90s system 142% cpu 3,695 total
This method achieves such a result because the DIST step,
which takes a lot of time, can be run in parallel. In
addition, this method results in only the necessary
plugins being available in the path, so the node doesn't
discover unrelated plugins during startup, which saves
time.
By changing rsync to cp/rm, we get great results even
without parallel make:
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0
3,28s user 1,64s system 105% cpu 4,684 total
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0 -j8
3,27s user 1,65s system 135% cpu 3,640 total
We are within 1s of FAST_RUN_BROKER=1 by default, and
faster than FAST_RUN_BROKER=1 with parallel make. On
top of that, we greatly benefit when rebuilding as the
DIST files do not need to be rebuilt every time:
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0
2,94s user 1,40s system 107% cpu 4,035 total
make -C deps/rabbitmq_management run-broker FAST_RUN_BROKER=0 -j8
2,85s user 1,51s system 138% cpu 3,140 total
Therefore it only makes sense to remove FAST_RUN_BROKER,
and instead use the old method which is both more correct
and has more potential for optimisation.
It has largely been superseded by `perf`. It is no longer
generally useful. It can always be added to BUILD_DEPS for
the rare cases it is needed, or installed locally and
pointed to by setting its path to ERL_LIBS.
When FAST_RUN_BROKER=1 was introduced it helped reduce the
time to run the broker, but mistakenly was always starting
all plugins even when running run-broker against a specific
plugin, for example `make -C deps/rabbitmq_management run-broker`.
Starting broker... completed with 36 plugins.
With this commit only the target plugin (and its dependencies)
will be started.
Starting broker... completed with 3 plugins.
This also has a positive effect on start performance:
make -C deps/rabbitmq_management run-broker 2,28s user 2,11s system 88% cpu 4,943 total
make -C deps/rabbitmq_management run-broker 2,00s user 1,61s system 94% cpu 3,807 total
With the monorepo I do not believe we need to worry about
where the scripts are. They are always in deps/rabbit/scripts.
So we copy them from there directly.
This does not improve performance but should work better on
older environments.
Rather than one by one. I do not know why it was done that
way, but since that dates back before the monorepo, it may
no longer be necessary.
This has a small but noticeable speedup when building:
make -C deps/rabbit 0,44s user 0,11s system 101% cpu 0,546 total
make -C deps/rabbitmq_management 0,65s user 0,18s system 101% cpu 0,816 total
make -C deps/rabbit 0,41s user 0,11s system 101% cpu 0,510 total
make -C deps/rabbitmq_management 0,57s user 0,21s system 101% cpu 0,778 total
Commit 5840834fa8 introduced
copying of scripts and escripts to the plugin currently
being worked on. At some point after that, in recent weeks,
the target would run for all applications being compiled,
and not just the plugin we have asked Make to build. This
led to increased build times.
This change ensures the scripts and escripts are only
copied for the directory we have run Make in. So if
we simply do "make" this is the top-level directory.
If we do "make -C deps/rabbit" this is the rabbit
application's directory.
Doing this has a huge impact on performance when
rebuilding with no changes done to the source code:
make 6,63s user 2,60s system 106% cpu 8,668 total
make 2,88s user 1,30s system 117% cpu 3,567 total
And a less pronounced impact on specific applications
(results will vary):
make -C deps/rabbit 0,50s user 0,18s system 100% cpu 0,677 total
make -C deps/rabbit 0,43s user 0,12s system 101% cpu 0,551 total
This enables code reloading for single nodes and
for clusters. For single nodes using `make run-broker`
at least two terminals must be used: the one with
the broker, and the one doing the reloading.
When only using a single node (`make run-broker`),
the `RELOAD=1` variable can be passed to Make when
recompiling. This will lead to code reloading at
the end of compilation.
make RELOAD=1
Alternatively the "reload-broker" target can be used:
make all reload-broker
Special care must be given when the -j flag is used.
In that case it may be needed to separate compilation
and reloading in two separate calls:
make && make reload-broker
The same considerations apply to clusters, only there
isn't a shorthand variable for them. Instead, the
"reload-cluster" target must be used:
make all reload-cluster
Or:
make && make reload-cluster
Either of these will compile and reload the modules
on all nodes of the cluster.
It is also possible to reload modules only on some nodes,
using `make reload-broker` with the variable "RABBITMQ_NODENAME"
set to the node you want to trigger a reload on. This could
be used to test scenarios where the code differs between
the nodes in the cluster.
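For the partial reload case, a dry-run sketch that only prints the commands it would run. The node names are hypothetical; only the `reload-broker` target and the `RABBITMQ_NODENAME` variable come from the text above.

```shell
#!/bin/sh
# Dry run: print the reload commands rather than running them.
# The node names passed in are hypothetical.
reload_cmds() {
  for node in "$@"; do
    printf 'make reload-broker RABBITMQ_NODENAME=%s\n' "$node"
  done
}

# Reload only two nodes of a larger cluster.
reload_cmds rabbit-1 rabbit-3
```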
During development we want `make run-broker` to execute fast,
yet still pick up the changes we made in rabbit applications.
We can already do this by setting the appropriate variables.
This commit makes it so that this is the default. Now instead
of depending on the `dist` target we run plugins from the deps/
directory. And we depend on `all` to pick up changes.
This is equivalent to running
`make run-broker PLUGINS_FROM_DEPS_DIR=1 DIST_TARGET=all`.
It can be disabled by setting `FAST_RUN_BROKER=0`.
It doesn't invalidate the `NOBUILD=1` variable which lets us
run the broker without recompiling (used in tests). It also
doesn't make `NOBUILD=1` faster (or slower).
The difference when running `make run-broker` by default is
roughly half the time of what it was before:
make run-broker 16,67s user 10,42s system 101% cpu 26,567 total
make run-broker 8,75s user 4,40s system 102% cpu 12,873 total
And it also applies to `make start-cluster`:
make start-cluster 26,32s user 15,15s system 141% cpu 29,279 total
make start-cluster 18,09s user 8,76s system 170% cpu 15,726 total
[Why]
So far, we use the CLI to create the cluster after starting the
individual nodes.
It's faster to use peer discovery and gives more exposure to the
feature. Thus it will be easier to quickly test changes to the peer
discovery subsystem with a simple `make start-cluster`.
[How]
We pass the classic configuration `cluster_nodes` application
environment variable to all nodes' command line.
This only applies when the target is `start-cluster`, not `start-brokers`.
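A rough sketch of how such a value could be assembled. The node names, the host and the `-rabbit cluster_nodes` command-line form are assumptions made for illustration; the term shape `{[Node, ...], disc}` follows the classic config format.

```shell
#!/bin/sh
# Build an Erlang term listing every node, in the shape the classic
# config format uses for cluster_nodes: {[Node, ...], disc}.
# Node names and host are hypothetical.
cluster_nodes_term() {
  host=$1
  shift
  list=""
  for name in "$@"; do
    # Append a quoted atom, comma-separated after the first entry.
    list="${list:+$list,}'$name@$host'"
  done
  printf "{[%s],disc}" "$list"
}

term=$(cluster_nodes_term localhost rabbit-1 rabbit-2 rabbit-3)

# Assumed way of handing the value to each node (erl's -App Key Value form).
printf '%s\n' "-rabbit cluster_nodes \"$term\""
```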
... instead of the `dist` target.
[Why]
We already do that when building tests. Thus it is more consistent to do
the same.
Also, it makes sense to ensure everything is ready before the `dist`
step. For instance, an Erlang release would not depend on the `dist`
target, just the build and it would still need the CLI to be ready.
[Why]
For Mnesia, we don't really care. However, for the upcoming use of
Khepri, we need that because the cluster will only become available
once at least a quorum of nodes has started.
[How]
We simply rely on the shell's job management and wait for them to
complete.
We do the same for `stop-{brokers,cluster}` mostly for consistency's
sake.
It also makes starting nodes slightly faster.
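The job-management approach can be sketched as below, assuming a hypothetical `start_node` helper that stands in for whatever starts a single node.

```shell
#!/bin/sh
# start_node is a hypothetical stand-in for whatever starts one node.
start_node() {
  echo "started $1"
}

# Start every node as a background job, then wait for all of them.
start_all() {
  pids=""
  for node in "$@"; do
    start_node "$node" &      # run each start as a background job
    pids="$pids $!"
  done
  status=0
  for pid in $pids; do
    wait "$pid" || status=1   # remember whether any job failed
  done
  return "$status"
}

start_all rabbit-1 rabbit-2 rabbit-3
```

Waiting on each PID rather than issuing a bare `wait` lets the caller see whether any node failed to start.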
See #7869. #7875 resulted in Elixir apps (besides the CLI) in the deps
dir. This triggered dormant Makefile logic to compile such deps. It
turns out that it's unnecessary to pre-compile them, given the CLI's
mix.exs file.
The location and name of this directory remain the same for
compatibility reasons. Therefore, it still contains "mnesia" in its name.
However, semantically, we want this directory to be unrelated to Mnesia.
In the end, many subsystems write files and directories there, including
Mnesia, all Ra systems and in the future, Khepri.