- When renaming a column concurrently, drop any existing trigger before
attempting to create a new one.
When running migration specs multiple times (as happens during
local development), the `down` methods of previous migrations are called.
If any of the called methods contains a call to
rename_column_concurrently, a trigger will be created and not removed.
So, the next time a migration spec is run, if the same down method is
executed again, it will cause an error when attempting to create the
trigger (since it already exists). Dropping the trigger if it already
exists will prevent this problem.
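A minimal sketch of the idea, not the actual helper: the method name `install_rename_trigger_sql` and the trigger and function names below are hypothetical and only illustrate the drop-then-create order.

```ruby
# Hypothetical sketch: build the SQL used to (re)install a rename trigger.
# The method name and the trigger/function names are assumptions.
def install_rename_trigger_sql(table, trigger, function)
  [
    # Dropping first makes the sequence safe to run more than once.
    "DROP TRIGGER IF EXISTS #{trigger} ON #{table}",
    "CREATE TRIGGER #{trigger} BEFORE INSERT OR UPDATE ON #{table} " \
      "FOR EACH ROW EXECUTE PROCEDURE #{function}()"
  ]
end

puts install_rename_trigger_sql('users', 'trigger_abc123', 'trigger_abc123')
```

Because the `DROP TRIGGER IF EXISTS` statement is a no-op when the trigger is missing, running the same `down` method twice no longer fails.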
This improves the `add_timestamps_with_timezone` helper by allowing the
column names to be configured. This has the advantage that unnecessary
columns can be avoided, saving space.
A helper for removing the columns is also provided, to be used in the
`down` method of migrations.
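As a rough illustration only, the sketch below models both helpers as functions that return the schema operations they would perform. The method names, the `columns:` keyword, and the `:datetime_with_timezone` type symbol are assumptions made for this example.

```ruby
# Hypothetical sketch of configurable timestamp columns.
DEFAULT_TIMESTAMP_COLUMNS = %i[created_at updated_at].freeze

# Operations an `up` migration would perform.
def add_timestamps_operations(table, columns: DEFAULT_TIMESTAMP_COLUMNS)
  columns.map { |column| [:add_column, table, column, :datetime_with_timezone] }
end

# Operations the matching `down` migration would perform.
def remove_timestamps_operations(table, columns: DEFAULT_TIMESTAMP_COLUMNS)
  columns.map { |column| [:remove_column, table, column] }
end

# Only add created_at, avoiding an unneeded updated_at column.
p add_timestamps_operations(:my_table, columns: [:created_at])
p remove_timestamps_operations(:my_table, columns: [:created_at])
```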
In https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/21497, we
migrated all project import data into a separate table,
`project_import_data`. In addition, we also added:
```
ignore_column :import_status, :import_jid, :import_error
```
In https://gitlab.com/gitlab-com/gl-infra/production/issues/908, we
observed some of these `import_error` columns consumed megabytes of
error backtraces and caused slow loading of projects whenever a `SELECT
* from projects` query loaded the row into memory.
Since we have long migrated away from these columns, we can now drop
these columns entirely.
1. Ignore tables that use STI in reltuples count strategy.
Models that use Rails' single table inheritance, such as `Group` and
`CiService`, need an additional WHERE clause to count the total
properly, which isn't supported by the reltuples strategy. For now,
we just omit these from the statistics sampling and rely on the other
strategies to get this data.
2. Fix tablesample count strategy not counting groups properly.
Models such as `Group` need a WHERE clause to distinguish them from
namespaces. We now add in the WHERE clause if STI is in use.
Closes https://gitlab.com/gitlab-org/gitlab-ee/issues/7435
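A small sketch of why STI models need the extra clause, assuming the default inheritance column name `type`. The `count_sql` method is hypothetical and only builds the query text.

```ruby
# Hypothetical sketch: add a WHERE clause on the inheritance column for
# STI models. `sti_type` is nil for models that do not use STI.
def count_sql(table, sti_type: nil, inheritance_column: 'type')
  sql = "SELECT COUNT(*) FROM #{table}"
  sql += " WHERE #{inheritance_column} = '#{sti_type}'" if sti_type
  sql
end

puts count_sql('projects')
puts count_sql('namespaces', sti_type: 'Group')
```

Without the clause, a count for `Group` would include every row in `namespaces`, which is also why a plain reltuples estimate for the table cannot be used for such models.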
A tablesample count executes in two phases:
* Estimate table sizes based on reltuples.
* Based on the estimate:
  * If the table is considered 'small', execute an exact relation count.
  * Otherwise, count on a sample of the table using TABLESAMPLE.
The size of the sample is chosen in a way that we always scan roughly
the same number of rows (see TABLESAMPLE_ROW_TARGET).
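The two phases can be sketched as follows. The constant values and the `SMALL_TABLE_THRESHOLD` cut-off are assumed for illustration, and the method names are hypothetical.

```ruby
# Assumed values for illustration; the real constants may differ.
TABLESAMPLE_ROW_TARGET = 100_000
SMALL_TABLE_THRESHOLD = 100_000 # hypothetical cut-off for "small" tables

# Decide how to count a table, given its reltuples-based estimate.
def tablesample_plan(table, estimate)
  if estimate <= SMALL_TABLE_THRESHOLD
    return { strategy: :exact, sql: "SELECT COUNT(*) FROM #{table}" }
  end

  # Pick a percentage so that roughly TABLESAMPLE_ROW_TARGET rows are scanned.
  percent = (TABLESAMPLE_ROW_TARGET.to_f / estimate * 100).round(4)
  {
    strategy: :tablesample,
    percent: percent,
    sql: "SELECT COUNT(*) FROM #{table} TABLESAMPLE SYSTEM (#{percent})"
  }
end

# Scale a sampled count back up to an estimate for the whole table.
def extrapolate(sample_count, percent)
  (sample_count * 100 / percent).round
end

plan = tablesample_plan('issues', 10_000_000)
puts plan[:sql]
puts extrapolate(99_500, plan[:percent])
```

With these assumed numbers, a table estimated at 10 million rows is sampled at 1%, so the scan cost stays roughly constant regardless of table size.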
We want to run CI with Rails 4 for the short term (until we are sure that
we will ship with Rails 5). The problem is that Rails 4 cannot handle the
Rails 5 schema.rb properly: specifically, the `t.index` directive cannot
handle multiple indexes on the same column.
Because the combination of Rails 4 and a Rails 5 schema will be used
only in CI for the short term, we can just ignore these incompatibility
failures. This patch adds a `rails5` helper for specs.
The helper now decides whether to disable statement_timeout per
transaction or per session, based on how it is called.
When called with a block, it executes the block with a session-based
statement_timeout; otherwise it defaults to the existing
behavior.
By default the statement_timeout setting only applies for the lifetime
of the transaction, so it does not leak outside of it.
With `transaction: false` it is set for the entire session, but this
requires a block to be passed. The helper yields control and cleans up
the session after the block finishes, which also prevents leaking.
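A simplified sketch of the two modes, using a stand-in connection that records SQL instead of executing it. `FakeConnection` and the exact statements are assumptions for illustration, and this version switches on the presence of a block only.

```ruby
# A stand-in connection that records SQL instead of running it.
class FakeConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end

# Hypothetical sketch: session-based with a block, transaction-based without.
def disable_statement_timeout(connection)
  unless block_given?
    # SET LOCAL only lasts until the end of the current transaction.
    connection.execute('SET LOCAL statement_timeout TO 0')
    return
  end

  begin
    connection.execute('SET statement_timeout TO 0')
    yield
  ensure
    # Restore the session default even if the block raises.
    connection.execute('RESET statement_timeout')
  end
end

conn = FakeConnection.new
disable_statement_timeout(conn) { conn.execute('CREATE INDEX CONCURRENTLY ...') }
puts conn.statements
```

The `ensure` clause is what keeps the disabled timeout from leaking into later work on the same session.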
This works the same way as change_column_type_using_background_migration, but
for renaming a column. It takes a table, not a relation, to match its concurrent
counterpart.
Also, generalise the cleanup migrations to reduce code duplication.
This commit does a number of things:
1. Reduces the number of queries needed by performing a single query to get all
the tuples for the relevant rows.
2. Uses a transaction to query the tuple counts to ensure that the data
is retrieved from the primary.
Closes #46742
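As an illustration of the single-query approach, the sketch below builds one `pg_class` lookup for all tables and wraps it in a transaction. The method name is hypothetical, and the plain `BEGIN`/`COMMIT` statements stand in for whatever transaction wrapper the application uses.

```ruby
# Hypothetical sketch: one query for all tables, wrapped in a transaction.
def reltuples_statements(table_names)
  # Quote each name as a SQL string literal, escaping embedded quotes.
  quoted = table_names.map { |name| "'#{name.gsub("'", "''")}'" }.join(', ')
  [
    'BEGIN',
    'SELECT relname, reltuples::bigint AS estimate ' \
      "FROM pg_class WHERE relname IN (#{quoted})",
    'COMMIT'
  ]
end

puts reltuples_statements(%w[projects users])
```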
Uses PostgreSQL tuple estimates to provide a much faster yet approximate
count. See https://wiki.postgresql.org/wiki/Slow_Counting for more details.
We only use this fast method if the table has been analyzed or vacuumed
within the last hour.
Closes #46255
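The freshness check can be sketched like this. The `stats` hash mimics a row combining `pg_class.reltuples` with the maintenance timestamps from `pg_stat_user_tables`; the method name and hash layout are assumptions.

```ruby
RECENT_WINDOW = 60 * 60 # one hour, in seconds

# Returns the estimate, or nil when the statistics are too old to trust.
def fast_estimate(stats, now: Time.now)
  last_maintenance = stats.values_at(
    :last_analyze, :last_autoanalyze, :last_vacuum, :last_autovacuum
  ).compact.max

  return nil if last_maintenance.nil?
  return nil if now - last_maintenance > RECENT_WINDOW

  stats[:reltuples].to_i
end

now = Time.now
p fast_estimate({ reltuples: 1234.0, last_analyze: now - 600 }, now: now)
p fast_estimate({ reltuples: 1234.0, last_vacuum: now - 7200 }, now: now)
```

A `nil` result signals that the caller should fall back to another counting method.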
Index creation does not have an effect if the index is present already.
Index removal does not have an effect if the index is not present.
This helps to avoid patterns like this in migrations:
```
if index_exists?(...)
  remove_concurrent_index(...)
end
```
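To show the intended behavior, here is a toy in-memory model. `FakeSchema` is hypothetical and its methods only mirror the names of the migration helpers; the boolean return values are added for demonstration.

```ruby
require 'set'

# A toy schema that tracks indexes in memory.
class FakeSchema
  def initialize
    @indexes = Set.new
  end

  def index_exists?(table, column)
    @indexes.include?([table, column])
  end

  # Returns true if an index was created, false if it already existed.
  def add_concurrent_index(table, column)
    return false if index_exists?(table, column)

    @indexes << [table, column]
    true
  end

  # Returns true if an index was removed, false if there was none.
  def remove_concurrent_index(table, column)
    return false unless index_exists?(table, column)

    @indexes.delete([table, column])
    true
  end
end

schema = FakeSchema.new
p schema.add_concurrent_index(:users, :email)
p schema.add_concurrent_index(:users, :email)
p schema.remove_concurrent_index(:users, :email)
p schema.remove_concurrent_index(:users, :email)
```

With the existence check inside the helpers, callers no longer need to guard each call themselves.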
Prior to this commit we would essentially update all rows in a table,
even those where the source column (e.g. `issues.closed_at`) was NULL.
This in turn could lead to statement timeouts when using the default
batch size of 10 000 rows per job.
To work around this we don't schedule jobs for rows where the source
value is NULL. We also don't update rows where the source column is NULL
(as an extra precaution) or the target column already has a non-NULL
value. Using this approach it should be possible to update 10 000 rows
in the "issues" table in about 7.5 - 8 seconds.
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/42158
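A sketch of what the narrowed batch update might look like. The method name, the `id`-range batching, and the target column name used in the example call are assumptions.

```ruby
# Hypothetical sketch of a batched copy that skips rows with nothing to do.
def copy_column_sql(table, source, target, start_id, end_id)
  <<~SQL.strip
    UPDATE #{table}
    SET #{target} = #{source}
    WHERE id BETWEEN #{Integer(start_id)} AND #{Integer(end_id)}
      AND #{source} IS NOT NULL
      AND #{target} IS NULL
  SQL
end

puts copy_column_sql('issues', 'closed_at', 'closed_at_for_type_change', 1, 10_000)
```

The two extra conditions keep the update from touching rows whose source is NULL or whose target was already filled in, which is where the time savings come from.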
This adds a minimum interval to BackgroundMigrationWorker, ensuring
background migrations of the same class only run once every 5 minutes.
This prevents a thundering herd problem where scheduled migrations all
run at once due to their delays having been expired (e.g. as the result
of a queue being paused for a long time).
If a job was recently executed it's rescheduled with a delay that equals
the remaining time of the job's lease. This means that if the lease
expires in two minutes we only need to wait two minutes, instead of
five.
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/41624
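The rescheduling rule reduces to a small calculation, sketched here with plain second counts instead of a real lease store. The method name and the way the last run time is tracked are assumptions.

```ruby
MINIMUM_INTERVAL = 5 * 60 # seconds, matching the 5 minutes described above

# Returns 0 when the job may run now, otherwise the seconds left on the lease.
# `last_run_at` and `now` are timestamps in seconds; nil means "never ran".
def reschedule_delay(last_run_at, now)
  return 0 if last_run_at.nil?

  remaining = MINIMUM_INTERVAL - (now - last_run_at)
  remaining.positive? ? remaining : 0
end

# A job of the same class ran three minutes ago: wait the remaining two.
p reschedule_delay(1_000, 1_180)
```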
In a previous attempt (rolled back in
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/16021) we tried
to migrate `issues.closed_at` from timestamp to timestamptz using a
regular migration. This had a bad impact on GitLab.com and as such was
rolled back.
This commit re-implements the original migrations using generic
background migrations, allowing us to still migrate the data in a single
release but without a negative impact on availability.
To ensure the database schema is up to date the background migrations
are performed inline in development and test environments. We also make
sure to not migrate data that doesn't need migrating in the first place
or has already been migrated.
When a project is using hashed storage, the repositories and
attachments aren't saved on disk using the `full_path`, so the
migration would not do anything.
Still, it is best to just skip moving when hashed storage is enabled.
Replaces all the explicit metadata syntax in the specs (`tag: true`)
with the implicit one (`:tag`).
Added a cop to prevent future errors and handle autocorrection.