This adds the rake task rake
gitlab:cleanup:orphan_job_artifact_files. This rake task cleans all
orphan job artifact files it can find on disk.
It performs a search on the complete folder of all artifacts on
disk. Then it filters out all the job artifact ID for which it could
not find a record with matching ID in the database. For these, the
file is deleted from disk.
This backports all EE schema changes to CE, including EE migrations,
ensuring both use the same schema.
== Updated tests
A spec related to ghost and support bot users had to be modified to make
it pass. The spec in question assumes that the "support_bot" column
exists when defining the spec. In the single codebase setup this is not
the case, as the column is backported in a later migration. Any attempt
to use a different schema version or use of "around" blocks to
conditionally disable specs won't help, as reverting the backport
migration would also drop the "support_bot" column. Removing the
"support_bot" tests entirely appears to be the only solution.
We also need to update some foreign key tests now that we have
backported the EE columns. Fortunately, these changes are very minor.
== Backporting migrations
This commit moves EE specific migrations (except those for the Geo
tracking database) and related files to CE, and also removes any traces
of the ee/db directory.
Some migrations had to be modified or removed, as they no longer work
with the schema being backported. These migrations were all quite old,
so we opted for removing them where modifying them would take too much
time and effort.
Some old migrations were modified in EE, while also existing in CE. In
these cases we took the EE code, and in one case removed them entirely.
It's not worth spending time trying to merge these changes somehow as we
plan to remove old migrations around the release of 12.0, see
https://gitlab.com/gitlab-org/gitlab-ce/issues/59177 for more details.
It used to be the case that GitLab created symlinks for each repository
to one copy of the Git hooks, so these ran when required. This changed
to set the hooks dynamically on Gitaly when invoking Git.
The side effect is that we didn't need all these symlinks anymore, which
Gitaly doesn't create anymore either. Now that means that the tests in
GitLab-Rails should test for it either.
Related: https://gitlab.com/gitlab-org/gitaly/issues/1392#note_175619926
Many customers forget to include the gitlab-secrets.json file. This adds
a warning that both gitlab-secrets.json and gitlab.rb are not included
in the backup.
The `#reload` makes to load all objects into memory,
and the main purpose of `#reload` is to drop the association cache.
The `#reset` seems to solve exactly that case.
This backports EE specific changes for the Rake task `gitlab:env:info`,
wrapping them in a conditional. There is no way to inject code in the
middle of a Rake task in EE, so unfortunately this is the best we can
do.
In this commit, some methods that aren't being used
are removed from `Gitlab::Shell`. They are the ff:
- `#remove_keys_not_found_in_db`
- `#batch_read_key_ids`
- `#list_key_ids`
The corresponding methods in `Gitlab::Keys` have been
removed as well.
This is a small polishing on the storage migration and storage rollback
rake tasks. By aborting a migration while a rollback is already
scheduled we want to prevent unexpected consequences.
This brings back some of the changes in
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/20339.
For users using Gitaly on top of NFS, accessing the Git data directly
via Rugged is more performant than Gitaly. This merge request introduces
the feature flag `rugged_find_commit` to activate Rugged paths.
There are also Rake tasks `gitlab:features:enable_rugged` and
`gitlab:features:disable_rugged` to enable/disable these feature
flags altogether.
Part of four Rugged changes identified in
https://gitlab.com/gitlab-org/gitlab-ce/issues/57317.
Rollback is done similar to Migration for the Hashed Storage.
It also shares the same ExclusiveLease key to prevent both happening
at the same time.
All Hashed Storage related workers now share the same queue namespace
which allows for assigning dedicated workers easily.
If there are any clients connected to the DB, PostgreSQL won't let you
drop the database. It's possible that Sidekiq, Unicorn, or some other
client will be hanging onto a connection, preventing the DROP DATABASE
from working. To workaround this problem, this method cancels all the
connections so that the db:reset command will work.
Note that there's still a slight possibility a client connects after its
connection is terminated. If this is an issue, we could solve it by
revoking CONNECT access, but for now it seems this works.
Closes https://gitlab.com/gitlab-org/gitlab-development-kit/issues/450
Specs were reviewed and improved to better cover the current behavior.
There was some standardization done as well to facilitate the
implementation of the rollback functionality.
StorageMigratorWorker was extracted to HashedStorage namespace were
RollbackerWorker will live one as well.
To separate the different kinds of repositories we have at GitLab this
table will be renamed to pool_repositories. A project can, for now at
least, be member of none, or one of these. The table will get additional
columns in a later merge request where more logic is implemented for the
model.
Further included is a small refactor of logic around hashing ids for the
disk_path, mainly to ensure a previous implementation is reusable.
The disk_path for the pool_repositories table no longer has a NOT NULL
constraint, but given the hashing of the ID requires the DB to assign
the record an ID, an after_create hook is used to update the value.
A related MR is:
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/23143, adding
tables for 'normal' repositories and wiki_repositories.
Also, since Gitaly now takes care of checking for storage paths
existence/accessibility, we can remove those check from the
gitlab:gitlab_shell_check task and advance further into 0 direct disk
approach on gitlab-rails