By default statement_timeout will only be disabled for the lifetime of
the transaction, therefore not leaking outside of it.
With `transaction: false` it will be set for the entire session, but this
requires a block to be passed. It yields control and cleans up the session
after the block finishes, also preventing leaking outside of it.
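A rough sketch of the two modes, assuming a connection object that responds
to `execute`. The `RecordingConnection` class, the method signature and the
exact SQL statements are illustrative assumptions, not the actual helper:

```ruby
# Hypothetical stand-in for a database connection: it only records the SQL
# it is asked to run, so the sketch works without PostgreSQL.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end

# Sketch of the two modes described above (not the actual helper).
def disable_statement_timeout(connection, transaction: true)
  if transaction
    # SET LOCAL only lasts until the current transaction ends.
    connection.execute('SET LOCAL statement_timeout TO 0')
  else
    raise ArgumentError, 'a block is required when transaction: false' unless block_given?

    begin
      # A plain SET applies to the whole session, so it has to be undone.
      connection.execute('SET statement_timeout TO 0')
      yield
    ensure
      # Runs even if the block raises, so the setting cannot leak.
      connection.execute('RESET statement_timeout')
    end
  end
end
```

The `ensure` clause is what keeps the session-wide setting from leaking when
the block raises.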
This works the same way as change_column_type_using_background_migration, but
for renaming a column. It takes a table, not a relation, to match its concurrent
counterpart.
Also, generalise the cleanup migrations to reduce code duplication.
This commit does a number of things:
1. Reduces the number of queries needed by performing a single query to get all
the tuples for the relevant rows.
2. Uses a transaction to query the tuple counts to ensure that the data
is retrieved from the primary.
Closes #46742
Uses PostgreSQL tuple estimates to provide a much faster yet approximate
count. See https://wiki.postgresql.org/wiki/Slow_Counting for more details.
We only use this fast method if the table has been analyzed or vacuumed
within the last hour.
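The freshness check can be sketched roughly as follows. The `TableStats`
struct and the `fast_count` name are hypothetical; in PostgreSQL the
equivalent values would come from `pg_class` and `pg_stat_user_tables`:

```ruby
# Hypothetical shape of the statistics for one table.
TableStats = Struct.new(:reltuples, :last_analyze, :last_vacuum)

MAX_STATS_AGE = 60 * 60 # one hour, in seconds

# Returns the estimated row count, or nil when the statistics are missing or
# too old and the caller should fall back to an exact COUNT(*).
def fast_count(stats, now: Time.now)
  freshest = [stats.last_analyze, stats.last_vacuum].compact.max
  return nil if freshest.nil? || now - freshest > MAX_STATS_AGE

  stats.reltuples.to_i
end
```

Returning nil instead of a stale estimate leaves the fallback decision to the
caller.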
Closes #46255
Index creation does not have an effect if the index is present already.
Index removal does not have an effect if the index is not present.
This helps to avoid patterns like this in migrations:
```
if index_exists?(...)
  remove_concurrent_index(...)
end
```
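The idempotent behaviour can be illustrated with an in-memory stand-in. The
`FakeSchema` class below is hypothetical and its methods only borrow the
helper names; the boolean return values are an assumption made for the
example:

```ruby
require 'set'

# Hypothetical in-memory stand-in for the indexes of one table.
class FakeSchema
  def initialize(indexes = [])
    @indexes = Set.new(indexes)
  end

  def index_exists?(name)
    @indexes.include?(name)
  end

  # Returns true if the index was created, false if it was already there.
  def add_concurrent_index(name)
    return false if index_exists?(name)

    @indexes << name
    true
  end

  # Returns true if the index was removed, false if there was nothing to do.
  def remove_concurrent_index(name)
    return false unless index_exists?(name)

    @indexes.delete(name)
    true
  end
end
```

With the existence check inside the helper, calling either method twice is
harmless and callers no longer need their own guard.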
Prior to this commit we would essentially update all rows in a table,
even those where the source column (e.g. `issues.closed_at`) was NULL.
This in turn could lead to statement timeouts when using the default
batch size of 10 000 rows per job.
To work around this we don't schedule jobs for rows where the source
value is NULL. We also don't update rows where the source column is NULL
(as an extra precaution) or the target column already has a non-NULL
value. Using this approach it should be possible to update 10 000 rows
in the "issues" table in about 7.5 - 8 seconds.
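As a sketch, the per-batch UPDATE could look like the statement built below.
The method name, the keyword arguments and the target column name used in the
test are hypothetical, and identifiers are interpolated without quoting to
keep the example short:

```ruby
# Builds the UPDATE for one batch of ids. Rows with a NULL source value, or
# with a target column that is already filled in, are left untouched.
def batch_update_sql(table:, source:, target:, start_id:, end_id:)
  <<~SQL
    UPDATE #{table}
    SET #{target} = #{source}
    WHERE id BETWEEN #{Integer(start_id)} AND #{Integer(end_id)}
      AND #{source} IS NOT NULL
      AND #{target} IS NULL
  SQL
end
```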
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/42158
This adds a minimum interval to BackgroundMigrationWorker, ensuring
background migrations of the same class only run once every 5 minutes.
This prevents a thundering herd problem where scheduled migrations all
run at once due to their delays having expired (e.g. as the result
of a queue being paused for a long time).
If a job was recently executed it's rescheduled with a delay that equals
the remaining time of the job's lease. This means that if the lease
expires in two minutes we only need to wait two minutes, instead of
five.
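The delay calculation can be sketched as below. This models the lease simply
as the time of the last run; the `reschedule_delay` name and its arguments
are assumptions for illustration only:

```ruby
MINIMUM_INTERVAL = 5 * 60 # seconds

# Returns 0 when the job may run now, otherwise the number of seconds left
# until the interval (the "lease") for this migration class has passed.
# `last_run_at` is nil when the class has not run recently.
def reschedule_delay(last_run_at, now: Time.now, interval: MINIMUM_INTERVAL)
  return 0 if last_run_at.nil?

  remaining = interval - (now - last_run_at)
  remaining.positive? ? remaining.ceil : 0
end
```

A job that last ran three minutes ago is therefore delayed by two minutes
rather than a full five.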
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/41624
In a previous attempt (rolled back in
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/16021) we tried
to migrate `issues.closed_at` from timestamp to timestamptz using a
regular migration. This had a bad impact on GitLab.com and as such was
rolled back.
This commit re-implements the original migrations using generic
background migrations, allowing us to still migrate the data in a single
release but without a negative impact on availability.
To ensure the database schema is up to date the background migrations
are performed inline in development and test environments. We also make
sure not to migrate data that doesn't need migrating in the first place
or that has already been migrated.
When a project is using hashed storage, the repositories and
attachments wouldn't be saved on disk using the `full_path`. So the
migration would not do anything.
However, it is best to just skip moving when hashed storage is enabled.
Replaces all the explicit include metadata syntax in the specs (tag:
true) with the implicit one (:tag).
Added a cop to prevent future errors and handle autocorrection.
This adds a bunch of checks to migrations that may create or drop
triggers. Dropping triggers/functions is done using "IF EXISTS" so we
don't throw an error if the object in question has already been dropped.
We now also raise a custom error (message) when the user does not have
TRIGGER privileges. This should prevent the schema from entering an
inconsistent state while also providing the user with enough information
on how to solve the problem.
The recommendation of using SUPERUSER permissions is a bit extreme but
we require this anyway (Omnibus also configures users with this
permission).
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/36633
These attributes are stored in binary in the database, but exposed as
strings. This allows one to query/create data using plain SHA1 hashes as
Strings, while storing them more efficiently as binary.
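The conversion itself can be sketched in plain Ruby. The two method names are
hypothetical; the point is only that a 40-character hex SHA1 fits in 20 bytes
of binary storage:

```ruby
# Hex SHA1 string (40 characters) -> 20 raw bytes, as it would be stored.
def sha_to_binary(hex)
  [hex].pack('H*')
end

# Raw bytes read from the database -> hex string exposed to callers.
def binary_to_sha(bytes)
  bytes.unpack1('H*')
end
```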
* master: (56 commits)
File view buttons
Don't reset the session when the example failed, because we need capybara-screenshot to have access to it
Resolve "MR comment + system note highlight don't have the same width"
Add feature spec for dashboard state filter tabs
Wording of Mysql support.
a new feature checklist and more elaborate documentation requirements
Filter archived project in API v3 only if param present
Revert to using links instead of buttons in Issuable Index tabs.
Do not run the codeclimate job on docs-only changes
Only show gray footer space if environment actions exist
Migrate Gitlab::Git::Blob.find to Gitaly
Backport filtered search lazy token consistent state fix
Add a comment explaining how the branch clean up happens
Fix Github::Representation::PullRequest#source_branch_exists?
Add CHANGELOG
Fix GitHub importer performance on branch existence check
Rebuild the dynamic path before validating it
Rename stage ref migration specs to match a class name
Enable Style/DotPosition Rubocop 👮
Revert "Merge branch 'winh-merge-request-related-issues' into 'master'"
...
Conflicts:
db/post_migrate/20170526185921_migrate_build_stage_reference.rb
Starting with GitLab 9.1.0 we will no longer allow downtime migrations
unless absolutely necessary. This commit updates the various developer
guides and adds code that is necessary to make zero downtime migrations
less painful.
This was initially not implemented simply because I forgot about the
size limit of constraint names in PostgreSQL (63 bytes). Using the old
technique we can't add foreign keys for certain tables. For example,
adding a foreign key on
protected_branch_merge_access_levels.protected_branch_id would lead to
the following key name:
fk_protected_branch_merge_access_levels_protected_branches_protected_branch_id
This key is 78 bytes long, thus violating the PostgreSQL size
requirements.
The hashing strategy is copied from Rails' foreign_key_name() method,
which unfortunately is private and subject to change without notice.
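A sketch of such a hashing strategy is shown below. The `fk_` prefix, the
digest input format and the 10-character truncation are assumptions modelled
on the Rails approach, not a copy of the actual method:

```ruby
require 'digest'

POSTGRESQL_IDENTIFIER_LIMIT = 63 # bytes

# Derives a short, deterministic constraint name from the table and column,
# so the result has the same length no matter how long the inputs are.
def hashed_foreign_key_name(table, column)
  digest = Digest::SHA256.hexdigest("#{table}_#{column}_fk")

  "fk_#{digest[0, 10]}"
end
```

The resulting name is always 13 bytes long, well below the 63 byte limit.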
This method allows one to create foreign keys without blocking access to
the source table, but only on PostgreSQL.
When creating a regular foreign key the "ALTER TABLE" statement used for
this won't return until all data has been validated. This statement in
turn will acquire a lock on the source table. As a result this lock can
be held for quite a long time, depending on the number of rows
and system load.
By breaking up the foreign key creation process in two steps (creation,
and validation) we can reduce the amount of locking to a minimum.
Locking is still necessary for the "ALTER TABLE" statement that adds the
constraint, but this is a fast process and so will only block access for
a few milliseconds.
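The two steps map onto two PostgreSQL statements, sketched here as a method
that only builds the SQL. The method name and keyword arguments are
hypothetical, the referenced column is assumed to be `id`, and identifiers
are not quoted to keep the example short:

```ruby
# Returns the two statements in the order they should be executed.
def concurrent_foreign_key_sql(source:, target:, column:, name:)
  [
    # Step 1: add the constraint without checking existing rows. This needs
    # a lock on the source table, but only briefly.
    "ALTER TABLE #{source} ADD CONSTRAINT #{name} " \
      "FOREIGN KEY (#{column}) REFERENCES #{target} (id) NOT VALID;",
    # Step 2: validate the existing rows as a separate statement.
    "ALTER TABLE #{source} VALIDATE CONSTRAINT #{name};"
  ]
end
```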
This ensures that whatever locks are acquired aren't held onto until the
end of the transaction (= after _all_ rows have been updated). Timing
wise there's also no difference between using a transaction and not
using one.
By passing a block to update_column_in_batches() one can now customize
the queries executed. This in turn can be used to only update a specific
set of rows instead of simply all the rows in the table.
These helpers can be used to perform migrations without taking down the
entire application.
For example, the method "add_column_with_default" can be used to add a
new column with a default value without locking the entire table.