As per https://gitlab.com/gitlab-org/gitlab-ce/issues/46043, project
templates should be squashed before updating, so that repositories
created from these templates don't include the full history of the
backing repository.
This rake task had been broken for a while. This fixes the breakages,
adds a test to help avoid future breakages, and adds a few ergonomic
improvements to the task itself.
In some cases ActiveSession.cleanup was not called after authentication,
so for some users the ActiveSession lookup keys kept growing without ever
being cleaned up. This Rake task manually iterates over the lookup keys
and removes entries that have no existing ActiveSession.
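
The cleanup described above can be sketched roughly as follows. This is a minimal illustration that uses a plain Hash and Set in place of Redis; the method name `clean_lookup_entries` and the data layout are assumptions for the example, not the task's actual code.

```ruby
require "set"

# Hypothetical stand-ins: `lookup` maps a user ID to the session IDs recorded
# in that user's lookup key, and `existing_sessions` is the set of session IDs
# that still exist. Neither mirrors GitLab's real Redis layout.
def clean_lookup_entries(lookup, existing_sessions)
  removed = 0
  lookup.keys.each do |user_id|
    # Split the recorded session IDs into live and stale ones.
    live, stale = lookup[user_id].partition { |id| existing_sessions.include?(id) }
    removed += stale.size
    lookup[user_id] = live
  end
  removed
end

lookup = { 1 => %w[a b c], 2 => %w[d] }
existing = Set.new(%w[a d])
puts "Removed #{clean_lookup_entries(lookup, existing)} stale lookup entries"
```

Returning the number of removed entries makes it easy for a task to report what it did.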
This adds the rake task rake
gitlab:cleanup:orphan_job_artifact_files. This rake task cleans all
orphan job artifact files it can find on disk.
It performs a search on the complete folder of all artifacts on
disk. Then it picks out the job artifact IDs for which it could
not find a record with a matching ID in the database. For these, the
file is deleted from disk.
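
As a rough sketch of that filtering step: the directory layout below (artifact ID as the last directory component) and the helper names are assumptions made for the example, and the sketch only reports what it would delete.

```ruby
require "set"

# Assumption: the artifact ID is the name of the directory that directly
# contains the file. This layout is hypothetical.
def artifact_id_from(path)
  File.basename(File.dirname(path)).to_i
end

# Returns the paths whose artifact ID has no matching database record.
def orphan_paths(paths_on_disk, ids_in_database)
  paths_on_disk.reject { |path| ids_in_database.include?(artifact_id_from(path)) }
end

paths = [
  "/artifacts/2019_01/10/101/artifacts.zip",
  "/artifacts/2019_01/11/102/artifacts.zip",
  "/artifacts/2019_01/12/103/metadata.gz"
]
known_ids = Set.new([101, 103])

orphan_paths(paths, known_ids).each { |path| puts "Would delete #{path}" }
```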
The various LDAP check Rake tasks have long supported a SANITIZE
environment variable. When present, identifiable information such as
user names and project/group names is obscured. Until now,
the LDAP check did not honor this. Now it will only say how many
users were found. This should at least give the indication that
the LDAP configuration found something, but will not leak what
it is. Resolves #56131
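
A minimal sketch of the behavior, assuming a hypothetical helper `ldap_check_lines` and an illustrative message format (the real task's output wording is not shown in this commit message):

```ruby
# Hypothetical helper: `users` is a list of names already fetched from LDAP.
# With sanitizing enabled only the count is reported, never the names.
def ldap_check_lines(users, sanitize:)
  return ["Found #{users.length} users (output sanitized)"] if sanitize

  users
end

# "When present": any value of SANITIZE enables sanitizing in this sketch.
sanitize = ENV.key?("SANITIZE")
puts ldap_check_lines(%w[alice bob], sanitize: sanitize)
```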
We've already migrated all the legacy artifacts to the new realm,
which is the ci_job_artifacts table.
It's time to remove the old code base that is no longer used.
It used to be the case that GitLab created symlinks for each repository
to one copy of the Git hooks, so these ran when required. This changed
to set the hooks dynamically on Gitaly when invoking Git.
The side effect is that we no longer need all these symlinks, and
Gitaly no longer creates them either. That means the tests in
GitLab-Rails shouldn't test for them either.
Related: https://gitlab.com/gitlab-org/gitaly/issues/1392#note_175619926
This is a small polish of the storage migration and storage rollback
rake tasks. By aborting a migration while a rollback is already
scheduled, we want to prevent unexpected consequences.
Specs were reviewed and improved to better cover the current behavior.
There was some standardization done as well to facilitate the
implementation of the rollback functionality.
StorageMigratorWorker was extracted to the HashedStorage namespace, where
RollbackerWorker will live as well.
Pool repositories are persisted in the database, and when the DB is
restored, the data need to be restored on disk. This is done by
resetting the state machine and rescheduling the object pool creation.
This is not an exact replica of the state at the time the backup was
created. However, the data is consistent again.
Dumping isn't required, as internally GitLab uses git bundles, which
bundle all refs and include all objects they require. Objects are
therefore duplicated again as more repositories get backed up, which
does require more data to be stored.
Fixes https://gitlab.com/gitlab-org/gitaly/issues/1355
Add an index to the `store` column on `uploads`. This makes counting
local uploads faster.
Also, there is no longer a need to check for objects with `store = NULL`.
See https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18557
---
### Query plans
Query:
```sql
SELECT COUNT(*)
FROM "uploads"
WHERE ("uploads"."store" = ? OR "uploads"."store" IS NULL)
```
#### Without index
```
gitlabhq_production=# EXPLAIN ANALYZE SELECT uploads.* FROM uploads WHERE (uploads.store = 1 OR uploads.store IS NULL);
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on uploads (cost=0.00..601729.54 rows=578 width=272) (actual time=6.170..2308.256 rows=545 loops=1)
   Filter: ((store = 1) OR (store IS NULL))
   Rows Removed by Filter: 4411957
 Planning time: 38.652 ms
 Execution time: 2308.454 ms
(5 rows)
```
#### Add index
```
gitlabhq_production=# create index uploads_tmp1 on uploads (store);
CREATE INDEX
```
#### With index
```
gitlabhq_production=# EXPLAIN ANALYZE SELECT uploads.* FROM uploads WHERE (uploads.store = 1 OR uploads.store IS NULL);
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on uploads (cost=11.46..1238.88 rows=574 width=272) (actual time=0.155..0.577 rows=545 loops=1)
   Recheck Cond: ((store = 1) OR (store IS NULL))
   Heap Blocks: exact=217
   -> BitmapOr (cost=11.46..11.46 rows=574 width=0) (actual time=0.116..0.116 rows=0 loops=1)
         -> Bitmap Index Scan on uploads_tmp1 (cost=0.00..8.74 rows=574 width=0) (actual time=0.095..0.095 rows=545 loops=1)
               Index Cond: (store = 1)
         -> Bitmap Index Scan on uploads_tmp1 (cost=0.00..2.44 rows=1 width=0) (actual time=0.020..0.020 rows=0 loops=1)
               Index Cond: (store IS NULL)
 Planning time: 0.274 ms
 Execution time: 0.637 ms
(10 rows)
```
Closes https://gitlab.com/gitlab-org/gitlab-ee/issues/6070
We started syncing all wikis regardless of whether they are disabled or
not. We couldn't do that in one stage because we needed a smooth update
and had to deprecate things first. This is the second stage, which
finally removes the unused columns in the geo_node_status table.
If doing a schema load, the post_migrations should also be marked as up,
even if SKIP_POST_DEPLOYMENT_MIGRATIONS was set, otherwise future
migration runs will be broken.
Rake tasks cleaning up the Git storage were still using direct disk
access, which won't work if the disks aren't attached. To mitigate
this, a migration issue was created.
To port gitlab:cleanup:dirs, and gitlab:cleanup:repos, a new RPC was
required, ListDirectories. This was implemented in Gitaly, through
https://gitlab.com/gitlab-org/gitaly/merge_requests/868.
To be able to use the new RPC the Gitaly server was bumped to v0.120.
This is an RPC that will not use feature gates, as this doesn't scale on
.com so there is no way to test it at scale. Furthermore, we _know_ it
doesn't scale, but this might be a useful task for smaller instances.
Lastly, the tests are slightly updated to also work when the disk isn't
attached. Even though this is not planned, it was very little effort and
thus I applied the boy scout rule.
Closes https://gitlab.com/gitlab-org/gitaly/issues/954
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/40529
These tasks already happen through housekeeping, by default on
every 10th push. This removes the need for these tasks.
Side note, this removes one of my first contributions to GitLab, as back
then I introduced these tasks through: 54e6c0045b
Closes https://gitlab.com/gitlab-org/gitaly/issues/768
Because no Git repository was actually created at the temporary path we
were using, `git fsck` would traverse up until it found a repository,
which in our case was the CE or EE repository.
Adds a new method 'puts_time' that prepends the current time to a
message when printing it. All instances of 'progress.puts'
in the gitlab:backup:restore tasks are replaced with puts_time.
Example output:
2018-06-03 16:33:25 -0400 -- Restoring uploads ..
Closes #46448
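
For illustration, a method with this behavior could look roughly like the sketch below. The `io:` and `now:` keyword arguments are assumptions added here to keep the example self-contained; the actual method's signature is not shown in this commit message.

```ruby
# Hypothetical re-creation: prints `message` prefixed with a timestamp.
# `io` and `now` are illustrative parameters, not part of the original method.
def puts_time(message, io: $stdout, now: Time.now)
  io.puts "#{now} -- #{message}"
end

puts_time "Restoring uploads .."
```

`Time#to_s` yields the `YYYY-MM-DD HH:MM:SS +ZZZZ` form seen in the example output above for non-UTC times.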
Direct disk access is done through Gitaly now, so the legacy path was
deprecated. However, this path was still used in Gitlab::Shell, which
required the refactoring in this commit.
This commit also removes direct path access on the project model, as that
lookup wasn't needed anymore in most cases.
Closes https://gitlab.com/gitlab-org/gitaly/issues/1111
Fixes an issue where, when using branch versions, the component wouldn't
be updated after the first branch checkout. We also save one step, since
checking out the FETCH_HEAD with `-f` already does what `reset --hard`
did.
It seems that bad things happen when two gRPC stubs share one gRPC
channel so let's stop doing that. The downside of this is that we
create more gRPC connections; one per stub.
Currently we specify versions for Gitlab-Shell, Workhorse and Gitaly
using version strings, to which we prepend 'v' and assume are tags.
These changes allow branches or tags with other name formats to be
specified by prepending '=' to the version string (à la govendor).
We also simplify the process to reset to the given version (now a
branch or tag): Right now there's a check that supposedly tries to avoid
fetching the version from the remote if it already exists locally. But
the previous logic already clones if the directory doesn't exist or
fetches if it does, so this check is pointless. We can safely assume the
version exists once we get to the reset stage.
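
The version string convention can be sketched as follows; `checkout_ref` is a hypothetical helper name used only for this example.

```ruby
# Hypothetical helper: turns a configured version string into the Git ref
# to check out. A leading '=' means "use this branch or tag name verbatim";
# otherwise the string is assumed to be a tag and gets a 'v' prefix.
def checkout_ref(version)
  if version.start_with?("=")
    version[1..-1]
  else
    "v#{version}"
  end
end

puts checkout_ref("0.120.0")      # plain version string, treated as a tag
puts checkout_ref("=my-branch")   # explicit ref name, used as-is
```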
This will be necessary when adding gitaly settings. This version
doesn't make any functional changes, but allows us to include this
breaking change in 9.0 and add the needed extra settings in the future
with backwards compatibility.
- The pages are created when build artifacts for `pages` job are uploaded
- Pages serve the content under: http://group.pages.domain.com/project
- Pages can be used to serve the group page from a special project named after the host: group.pages.domain.com
- Users can provide their own 403 and 404 error pages by creating 403.html and 404.html in the group page project
- Pages can be explicitly removed from the project by clicking Remove Pages in Project Settings
- The size of pages is limited by Application Setting: max pages size, which limits the maximum size of unpacked archive (default: 100MB)
- The public/ is extracted from artifacts and content is served as static pages
- The Pages asynchronous worker uses `dd` to limit the unpacked tar size
- Pages needs to be explicitly enabled and domain needs to be specified in gitlab.yml
- Pages are part of backups
- Pages notify the deployment status using Commit Status API
- Pages use a new sidekiq queue: pages
- Pages use a separate nginx config which needs to be explicitly added
Sometimes admins will change the LDAP configuration, not realizing
that problems will occur if the users' LDAP identities are not
also updated to use the new provider name. This task will give
admins a single command to run to update identities and will
prevent having to run multiple Rails console queries.
It was previously possible for invalid credential errors to go unnoticed
in this task. Users would believe everything was configured correctly and
then sign in would fail with 'invalid credentials'. This adds a specific
bind check, plus catches errors connecting to the server. Also, specs :)
- Offloads uploading to GitLab Workhorse
- Use /authorize request for fast uploading
- Added backup recipes for artifacts
- Support download acceleration using X-Sendfile
Improve regexp to prevent false positives
If a filename happened to contain "db" and enough "rwx" characters before, then
this test would previously fail. For example:
```
drwxr-xr-x gitlab-runner/gitlab-runner 0 2015-04-02 07:46 uploads/tmp/cassidy.stokes8477/gitlabhq/36d972fa55d6b44810fc6fd843473adb/
```
Adding a space before the "db" match string tightens up the regexp and reduces the
chance of an unintended match.
See merge request !489
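
To make the false positive concrete, here is a small sketch. Both patterns are illustrative approximations of a check that looks for a `db/` entry in a tar listing; they are assumptions for this example, not copied from the spec.

```ruby
# Illustrative patterns only (assumed for this example).
# Without the space, any path that merely ends in "db/" matches.
def loose_match?(line)
  line.match?(/^.{4,9}[rwx].*db\/$/)
end

# With the space, the entry name itself has to be "db/".
def tight_match?(line)
  line.match?(/^.{4,9}[rwx].* db\/$/)
end

upload_dir = "drwxr-xr-x gitlab-runner/gitlab-runner 0 2015-04-02 07:46 " \
             "uploads/tmp/cassidy.stokes8477/gitlabhq/36d972fa55d6b44810fc6fd843473adb/"
db_dir = "drwxr-xr-x gitlab-runner/gitlab-runner 0 2015-04-02 07:46 db/"

puts "loose pattern, upload dir: #{loose_match?(upload_dir)}"
puts "tight pattern, upload dir: #{tight_match?(upload_dir)}"
puts "tight pattern, db dir:     #{tight_match?(db_dir)}"
```

The upload directory ends in `...adb/`, so only the pattern without the space matches it.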
DEPRECATION: `expect { }.not_to raise_error(SpecificErrorClass)` is deprecated. Use `expect { }.not_to raise_error` (with no args) instead. Called from /home/travis/build/gitlabhq/gitlabhq/spec/tasks/gitlab/backup_rake_spec.rb:42:in `block (4 levels) in <top (required)>'.