We need these to debug various rabbitmq-related issues. Ephemeral debug
containers are not enabled on GKE Autopilot (not sure if GKE enables
this alpha feature at all), so we are doing this instead.
error: ephemeral containers are disabled for this cluster (error from server: "the server could not find the requested resource").
Backport to v3.9.x & v3.8.x if build passes
Pair @Gsantomaggio
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
Add a make target that helps finding out SHA256 for specific OTP
versions. Inline comments have all the context.
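As a rough illustration of what finding the SHA256 for an OTP version involves, here is a minimal Python sketch that hashes an already-downloaded source archive. The function name and the file path are hypothetical; the make target itself may work differently.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`.

    The file is read in chunks so that large archives do not need to
    fit in memory. `path` is a hypothetical local download location.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as archive:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("otp_src.tar.gz")` on a downloaded archive (the file name is an assumption) returns the hex digest to compare or record.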
Pushing straight to master to find out if it works. If it does, will
back-port to v3.9.x & v3.8.x
cc @ansd
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
This is the best way of ensuring that everything works. RabbitMQ is not
just the core components, it's all the plugins that ship with it. While
we don't expect anyone to enable all plugins at the same time (for
example enabling all peer discovery plugins at the same time doesn't
make sense), we need to be minimally confident that everything works.
As we work on & QA various features in dev, this is the quickest way of
doing just that.
For example, today we were testing some Stream features with
@GSantomaggio and discovered that we needed to use a different image for
it, pivotalrabbitmq/rabbitmq-stream, because in the dev image we don't
enable the stream plugin by default. There is no reason why we would
need a different container image to test a core feature that ships with
RabbitMQ as a Tier 1 plugin. To be honest, I strongly believe that
everyone should be looking at these specific images (otp-max & otp-max-1) which
bundle the majority of the RabbitMQ experience in a single artefact,
including OpenSSL & Erlang.
While I would normally push this straight into the main branch, I'm
doing it as a PR so that the following people notice it and have the
opportunity to comment on this standalone:
- @michaelklishin
- @GSantomaggio
- @Zerpet
- @MirahImage
- @ansd
As next steps, after all checks pass, can someone from the above list
action the following please:
- merge into our main branch
- backport to v3.9.x
- deploy to lre-3-9
🙌
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
See https://github.com/docker-library/rabbitmq/pull/492
Co-authored-by: Michal Kuratczyk <mkuratczyk@vmware.com>
Ensure compilation and runtime Erlang are in sync
Bump to Ubuntu 20.04
docker-library/rabbitmq also uses Ubuntu 20.04
Certain keyservers fail seasonally & intermittently - every few months
one stops working reliably - so then we alternate between them. Last
time we had this problem, pgpkeys.eu was failing, and now pgpkeys.uk
seems to be failing and resulting in flaky builds:
https://github.com/rabbitmq/rabbitmq-server/runs/2663558141?check_suite_focus=true
Trying pgpkeys.eu now, and preparing to go to the Ubuntu one next if
this one proves to not be as reliable as we would like it to be.
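The alternation between keyservers described above could, in principle, be automated by trying each one in order. The Python sketch below only illustrates that idea and is not what this commit does: the `fetch` callable is a hypothetical stand-in for whatever actually retrieves a key, and the server list is supplied by the caller.

```python
def fetch_key_with_fallback(key_id, keyservers, fetch):
    """Try each keyserver in order and return (server, key) for the first success.

    `fetch(server, key_id)` is a hypothetical callable that returns the key
    or raises an exception when the keyserver is unavailable.
    """
    errors = []
    for server in keyservers:
        try:
            return server, fetch(server, key_id)
        except Exception as exc:  # a flaky keyserver should not abort the build
            errors.append(f"{server}: {exc}")
    raise RuntimeError("all keyservers failed: " + "; ".join(errors))
```

With a list such as `["pgpkeys.eu", "pgpkeys.uk"]`, a build would only fail when every listed keyserver is down at the same time.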
As a different approach, I am wondering if we should store these PGP
keys somewhere instead? WDYT @dumbbell?
Thanks @pjk25 for flagging this!
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
in addition to tagging the docker image with the Git commit SHA.
This enables the cluster-operator to run system-tests against latest
rabbitmq-server master branch which allows us to fix any breaking
changes early and to test unreleased features such as rabbitmq_stream
plugin.
Before this, someone would need to run the package-generic-unix and
docker-image make targets in order to produce a container image. This
wouldn't always work: more than one archive in the PACKAGES dir, VERSION
& RABBITMQ_VERSION had to be specified, + characters in VERSION. It also
required me to have Docker running, which I am reluctant to leave
running on macOS because it uses CPU even when idle, and starting it
every time that I wanted an ad-hoc container image implied waiting a few
minutes, and then waiting another 15 minutes for the container image to
be produced and published to hub.docker.com.
This commit delegates all the above to GitHub Actions. The best part is
that it happens asynchronously - every commit triggers it - and a
container image for that commit will appear on hub.docker.com within ~15
minutes. In other words, anyone can test any commit in their container
runtime within 20 minutes, without needing to lift a finger. No Docker,
no make targets, just reference the new tag (matches the git sha) and
off you go. It's the beginning of something better, for sure.
Did some cleanup in the Dockerfile part of this. Dropped
RABBITMQ_VERSION build arg, env & commented steps. This is cruft that we
no longer need or even use.
Also fixed VERSION in push target and broke the dependency on dist so
that they can be run independently.
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
The previous approach assumed `make package-generic-unix` had only been run once.
This borrows a snippet from rabbitmq-components.mk to select the `current` version
for the docker build by default.
These directories are mutated by the server-release concourse
pipelines, and so their contents will remain in the separate
rabbitmq-packaging repository
Also, don't depend on package-generic-unix, otherwise that will be
re-built every time we want to build the docker image, which is
unnecessary. It will fail now if a generic-unix doesn't exist, but
that's OK.
Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
This shortcut uses cmd.exe, and when the RabbitMQ install directory is not on the same drive as cmd.exe, the cd command will not work unless the /d switch is added.
Like other packaging files, they should not be part of RabbitMQ itself.
One day, they will probably be moved to a dedicated repository, like
other packaging files.
We used to have a special case for OpenSUSE because we were using the
"Ledest" Erlang community packages.
We just switched to the Erlang:Factory packages in CI for OpenSUSE. They
are compatible with other official and community packages, and also
Erlang packages from other RPM-based distributions.
Note that the build dependencies are not tested at all: we still use
Debian to build the RPM packages and dependency checking is disabled.
Therefore, it's more than possible that the RPM source package does not
build successfully.
We find two series of Erlang packages for OpenSUSE. The first one
provides `erlang-epmd` and `erlang` only depends on it. The official
packages and most community packages use that form.
In CI, we used to use packages from "Ledest". It uses a second form
where the EPMD package is called `epmd` and `erlang` does not depend on
it. Therefore, in order to have EPMD installed, I added `epmd` as a
dependency.
Now that there are many good community packages providing the latest and
greatest versions of Erlang, we can use other recommended community
packages, if the official OpenSUSE packages only provide an old version
of Erlang.
Therefore, in order to be compatible with the first form, we can't
depend on `epmd` as it doesn't exist in that series.
If we use xargs(1) to call tar(1), we are limited by the number of
arguments we can put on the command line. Since we switched to using
directories to "package" plugins instead of .ez archives, the number of
files exploded. This led to incomplete generic-unix archives (e.g. some
plugins and CLI scripts were missing).
Now, the list of files is written to a manifest, exactly as we do
to create the source archive.
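To illustrate why a manifest sidesteps the command-line limit, here is a minimal Python sketch of the same idea: the file list goes into a file, and the archive is built by reading that file, so the number of entries is never bound by argument-length limits. The function names are hypothetical and the real Makefile uses tar(1), not Python.

```python
import os
import tarfile


def list_files(root):
    """Return the sorted relative paths of all regular files under `root`."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(paths)


def write_manifest(root, manifest_path):
    """Write one relative path per line; keep the manifest outside `root`."""
    paths = list_files(root)
    with open(manifest_path, "w") as manifest:
        for path in paths:
            manifest.write(path + "\n")


def archive_from_manifest(root, manifest_path, archive_path):
    """Create a gzip-compressed tar archive containing every manifest entry."""
    with open(manifest_path) as manifest, tarfile.open(archive_path, "w:gz") as tar:
        for line in manifest:
            rel = line.rstrip("\n")
            if rel:
                tar.add(os.path.join(root, rel), arcname=rel)
```

Because entries are streamed from the manifest one at a time, adding more plugin directories only makes the manifest longer; it cannot silently truncate the archive.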
Console output is handled in the SysV init scripts consistently (no more
differences between the Debian and RPM packages). See the previous
commit.
This fixes an issue for users who used to define $RABBITMQ_LOG_BASE in
the environment and called this script directly (i.e. not using the SysV
init scripts). Before commit 4b7048205d
(which made it to RabbitMQ 3.8.4), `rabbitmq-script-wrapper` took
$RABBITMQ_LOG_BASE from rabbitmq-env(8) or the environment. After the
mentioned commit, $RABBITMQ_LOG_BASE was hard-coded to set up console
redirection (in the case of Debian only) because rabbitmq-env(8) didn't
have the variable anymore and thus was not sourced.
For those users, it meant they couldn't override $RABBITMQ_LOG_BASE in
the environment and call this script, even if they wanted to change the
location of RabbitMQ actual log files.
Now that console redirection is handled by the SysV init scripts, we can
get rid of that code in `rabbitmq-script-wrapper`.
Fixes rabbitmq/rabbitmq-server-release#131.
Historically we were using $RABBITMQ_LOG_BASE to configure the
redirection. The variable's default value was set in rabbitmq-env(8) which
made it to the SysV init scripts because they sourced it in the past.
This was removed in commit 4b7048205d as
part of the transition to rabbit_env/rabbitmq_prelaunch to handle the
environment in the Erlang code (see rabbitmq/rabbitmq-server#2180).
Instead, the value of $RABBITMQ_LOG_BASE was hard-coded. Unfortunately,
this caused a regression because users couldn't configure it from
rabbitmq-env.conf anymore (only /etc/default/rabbitmq-server).
Anyway, the semantics were slightly incorrect: $RABBITMQ_LOG_BASE is used
in the configuration of log files RabbitMQ is responsible for. Console
redirection is the responsibility of the SysV init scripts and the
package which creates the directory and sets ownership.
This patch introduces a new $RABBITMQ_SERVER_CONSOLE_OUTPUT_DIR variable
which is specific to the SysV init scripts. For backward compatibility,
we still look at the value of $RABBITMQ_LOG_BASE if the user set it in
/etc/default/rabbitmq-server.
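The lookup order described above can be sketched as follows. This is a Python illustration of the precedence only; the real logic lives in the SysV init scripts, and the default directory used here is an assumption rather than something stated in this commit.

```python
def console_output_dir(env, default="/var/log/rabbitmq"):
    """Pick the directory used for console output redirection.

    Precedence: the new init-script-specific variable first, then
    $RABBITMQ_LOG_BASE for backward compatibility, then a default.
    `default` is an assumed value for illustration only.
    """
    for name in ("RABBITMQ_SERVER_CONSOLE_OUTPUT_DIR", "RABBITMQ_LOG_BASE"):
        value = env.get(name)
        if value:  # unset or empty variables fall through to the next candidate
            return value
    return default
```

Passing a plain dict (or `os.environ`) as `env` keeps the function easy to test without touching the real environment.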
While here, align the Debian SysV init script behavior with the RPM
version of the script: console redirection is always configured in the
SysV init script, not in `rabbitmq-script-wrapper`. A subsequent commit
will take care of cleaning `rabbitmq-script-wrapper`.
The solution to redirect output when using start-stop-daemon(8) was
taken from the following post:
https://stackoverflow.com/questions/8251933/how-can-i-log-the-stdout-of-a-process-started-by-start-stop-daemon
References rabbitmq/rabbitmq-server-release#131.
After starting the RabbitMQ server process, the startup script will
wait for the server to start by calling `rabbitmqctl wait` and will
time out after 10 s.
The startup time of the server depends on how quickly the Mnesia
database becomes available and the server will time out after
`mnesia_table_loading_retry_timeout` ms times
`mnesia_table_loading_retry_limit` retries. By default this wait is
30,000 ms times 10 retries, i.e. 300 s.
The mismatch between these two timeout values might lead to the
startup script failing prematurely while the server is still waiting
for the Mnesia tables.
This change introduces variable `RABBITMQ_STARTUP_TIMEOUT` and the
`--timeout` option into the startup script. The default value for this
timeout is set to 10 minutes (600 seconds).
This change also updates the systemd service file to match the timeout
values between the two service management methods.
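The arithmetic behind the mismatch can be checked with a small Python sketch. The function names are hypothetical; the numbers are the defaults quoted in this message.

```python
def mnesia_max_wait_s(retry_timeout_ms=30_000, retry_limit=10):
    """Worst-case time the server waits for Mnesia tables, in seconds."""
    return retry_timeout_ms * retry_limit / 1000


def startup_timeout_covers_mnesia(startup_timeout_s,
                                  retry_timeout_ms=30_000,
                                  retry_limit=10):
    """True if the startup script waits at least as long as the server might."""
    return startup_timeout_s >= mnesia_max_wait_s(retry_timeout_ms, retry_limit)
```

With the defaults, the Mnesia wait is 300 s, so the old 10 s script timeout does not cover it, while the new 600 s default does. Operators who raise the Mnesia retry settings would need to raise `RABBITMQ_STARTUP_TIMEOUT` accordingly.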
Signed-off-by: Nicolas Bock <nicolas.bock@canonical.com>