- When renaming a column concurrently, drop any existing trigger before
attempting to create a new one.
When running migration specs multiple times (as happens during
local development), the down methods of previous migrations are called.
If any of the called methods contains a call to
rename_column_concurrently, a trigger will be created and not removed.
So, the next time a migration spec is run, if the same down method is
executed again, it will cause an error when attempting to create the
trigger (since it already exists). Dropping the trigger if it already
exists will prevent this problem.
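A minimal sketch of the idea, with a hypothetical helper and trigger name (the actual migration helper derives the name dynamically and also recreates the trigger function):

```ruby
# Illustrative only; names are hypothetical.
def recreate_rename_trigger(table, trigger)
  # Dropping first makes the creation idempotent, so re-running a
  # migration's `down` method no longer raises "trigger already exists".
  execute("DROP TRIGGER IF EXISTS #{trigger} ON #{table}")

  execute(<<-SQL)
    CREATE TRIGGER #{trigger}
    BEFORE INSERT OR UPDATE ON #{table}
    FOR EACH ROW EXECUTE PROCEDURE #{trigger}()
  SQL
end
```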
This improves the `add_timestamps_with_timezone` helper by allowing the
column names to be configured. This has the advantage that unnecessary
columns can be avoided, saving space.
A helper for removing the columns is also provided, to be used in the
`down` method of migrations.
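A hedged sketch of how a migration might use this; the `columns:` option and the `remove_timestamps` helper name are assumptions based on the description above:

```ruby
class AddCreatedAtToFooTable < ActiveRecord::Migration
  DOWNTIME = false

  def up
    # Only add the column we actually need, skipping e.g. updated_at.
    add_timestamps_with_timezone(:foo_table, columns: [:created_at])
  end

  def down
    remove_timestamps(:foo_table, columns: [:created_at])
  end
end
```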
So funny story, true story. I tried to run the test locally, but
didn't make it past setting up Gitaly.
Here's what I tried:
First attempt:
`git clone gitlab-ce`
`cd gitlab-ce && bundle install`
`be rspec`
This didn't work because I was missing config/database.yml. I
didn't see a `script/bootstrap`, so I looked in the readme, which
redirected me to a webpage, which redirected me to the
gitlab-development-kit.
Second attempt:
`gem install gitlab-development-kit`
cd gitlab-development-kit
gdk init
gdk install
This broke somewhere along the way because it couldn't install Gitaly
because my Go version was too low. But it did clone the gitlab repo
again, and this time it did have a config/database.yml.
So I tried to cd into it and `be rspec
spec/lib/gitlab/database/migration_helpers_spec.rb`, which complained
about the database not being configured, so I:
- Changed the socket to localhost (in the config/database.yml)
- `createdb <dev_db>` `createdb test_db`
- `be rake db:test:prepare`
Great success, it was doing things! But then it failed when it came to
the Gitaly step.
Since I only wanted to change these three lines, at that point I gave up
and entrusted the pipeline to do its thing.
What I would have liked:
- An 'It's a Rails system, I know this' readme/docs (it's in there
somewhere, I just couldn't find it)
- A way to run tests without having to use Gitaly
- Not having to install all the things for a small fix (I get why you'd
want this, but to me it's overkill)
The rather cryptic:
"fk_#{Digest::SHA256.hexdigest("#{table}_#{column}_fk").first(10)}"
was too much for emacs to handle*. Since it was coming from the Rails
codebase, I took their way of doing the same thing and applied it here.
I think it's easier to read and it also makes emacs render the
migration helpers pretty again.
* not true, emacs can handle anything, leave emacs alone!
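For reference, the split-up version looks roughly like this; the method name is illustrative and `String#first` comes from ActiveSupport:

```ruby
def concurrent_foreign_key_name(table, column)
  identifier = "#{table}_#{column}_fk"
  hashed_identifier = Digest::SHA256.hexdigest(identifier).first(10)

  "fk_#{hashed_identifier}"
end
```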
1. Ignore tables that use STI in reltuples count strategy.
Models that use Rails' single-table inheritance, such as `Group` and
`CiService`, need an additional WHERE clause to count the total
properly, which isn't supported by the reltuples strategy. For now,
we just omit these from the statistics sampling and rely on the other
strategies to get this data.
2. Fix tablesample count strategy not counting groups properly.
Models such as `Group` need a WHERE clause to distinguish them from
other namespaces. We now add the WHERE clause if STI is in use, as
sketched below.
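A rough sketch of the STI handling; how the model and relation are obtained here is illustrative:

```ruby
relation = model.all

# Group uses STI on top of the namespaces table, so without this condition
# the count would include every namespace, not just groups.
if model.finder_needs_type_condition?
  relation = relation.where(model.inheritance_column => model.sti_name)
end
```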
Closes https://gitlab.com/gitlab-org/gitlab-ee/issues/7435
A tablesample count executes in two phases:
* Estimate table sizes based on reltuples.
* Based on the estimate:
* If the table is considered 'small', execute an exact relation count.
* Otherwise, count on a sample of the table using TABLESAMPLE.
The size of the sample is chosen in a way that we always roughly scan
the same amount of rows (see TABLESAMPLE_ROW_TARGET).
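A hedged sketch of phase two; the table name, the row target value, and the `estimated_rows` input (which would come from the reltuples phase) are illustrative:

```ruby
TABLESAMPLE_ROW_TARGET = 30_000
estimated_rows = 2_000_000 # from the reltuples estimate

percentage = TABLESAMPLE_ROW_TARGET.to_f / estimated_rows * 100

sampled = ActiveRecord::Base.connection.select_value(<<-SQL).to_i
  SELECT count(*) FROM issues TABLESAMPLE SYSTEM (#{percentage})
SQL

# Scale the sampled count back up to an estimate for the whole table.
estimate = (sampled * 100 / percentage).to_i
```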
The original code caused Rails to generate invalid SQL. The problem
lies in the `.arel` method in `ActiveRecord::Relation`. When there was
a `limit` on the relation, the `LIMIT` statement was carried over to
Arel, but the value wasn't.
```ruby
relation = Event.limit(2)
relation.to_sql
#=> "SELECT `events`.* FROM `events` LIMIT 2"
relation.arel.to_sql
#=> "SELECT `events`.* FROM `events` LIMIT ?"
```
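One possible workaround (an assumption, not necessarily the fix applied in this commit) is to re-apply the limit value onto the Arel AST explicitly:

```ruby
arel = relation.arel

# Replace the unbound placeholder with the literal limit value.
arel.take(relation.limit_value) if relation.limit_value

arel.to_sql
#=> "SELECT `events`.* FROM `events` LIMIT 2"
```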
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/51729
This commit adds the module `FromUnion`, which provides the class method
`from_union`. This simplifies the process of selecting data from the
result of a UNION, and reduces the likelihood of making mistakes. As a
result, instead of this:
union = Gitlab::SQL::Union.new([foo, bar])
Foo.from("(#{union.to_sql}) #{Foo.table_name}")
We can now write this instead:
Foo.from_union([foo, bar])
This commit also includes some changes to make this new setup work
properly. For example, a bug in Rails 4
(https://github.com/rails/rails/issues/24193) would break the use of
`from("sub-query-here").includes(:relation)` in certain cases. There was
also a CI query which appeared to repeat a lot of conditions from an
outer query on an inner query, which isn't necessary.
Finally, we include a RuboCop cop to ensure developers use this new
module, instead of using Gitlab::SQL::Union directly.
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/51307
It will decide whether to disable statement_timeout per transaction or
per session, based on how it's called.
When called with a block, the block is executed using a session-based
statement_timeout; otherwise it defaults to the existing behavior.
By default the statement_timeout override will only be enabled for the
transaction's lifetime, therefore not leaking outside of it.
With `transaction: false` it will be set for the entire session, but a
block must be passed. It yields control and cleans up the session after
the block finishes, also preventing leaking outside of it.
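A hedged sketch of the two calling styles; the exact signature is an assumption based on the description above:

```ruby
# Default: the override only lasts for the wrapping transaction.
disable_statement_timeout

# Session-wide: requires a block; the session setting is cleaned up once
# the block finishes.
disable_statement_timeout(transaction: false) do
  add_concurrent_index :issues, :closed_at
end
```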
This changes the BackgroundMigration worker so it checks for the health
of the DB before performing a background migration. This in turn allows
us to reduce the minimum interval, without having to worry about blowing
things up if we schedule too many migrations.
In this setup, the BackgroundMigration worker will reschedule jobs as
long as the database is considered to be in an unhealthy state. Once the
database has recovered, the migration can be performed.
To determine if the database is in a healthy state, we look at the
replication lag of any replication slots defined on the primary. If the
lag is deemed too great (100 MB by default) for too many slots, the
migration is rescheduled for a later point in time.
The health checking code is hidden behind a feature flag, allowing us to
disable it if necessary.
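A minimal sketch of the scheduling loop described above; the `healthy_database?` check, delay constant, and structure are illustrative, not the actual worker code:

```ruby
class BackgroundMigrationWorker
  include Sidekiq::Worker

  RESCHEDULE_DELAY = 10.minutes

  def perform(class_name, arguments = [])
    if healthy_database? # e.g. replication slot lag below the 100 MB threshold
      Gitlab::BackgroundMigration.perform(class_name, arguments)
    else
      # The database is under pressure: try again later instead of piling on.
      self.class.perform_in(RESCHEDULE_DELAY, class_name, arguments)
    end
  end
end
```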
This works the same way as change_column_type_using_background_migration, but
for renaming a column. It takes a table, not a relation, to match its concurrent
counterpart.
Also, generalise the cleanup migrations to reduce code duplication.
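Hypothetical usage of the new helper, with the signature inferred from its `change_column_type_using_background_migration` counterpart:

```ruby
class RenameFooToBarOnIssues < ActiveRecord::Migration
  include Gitlab::Database::MigrationHelpers

  DOWNTIME = false

  disable_ddl_transaction!

  def up
    # Takes the table (not a relation), matching rename_column_concurrently.
    rename_column_using_background_migration(:issues, :foo, :bar)
  end
end
```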
This commit does a number of things:
1. Reduces the number of queries needed by performing a single query to
get all the tuples for the relevant rows.
2. Uses a transaction to query the tuple counts to ensure that the data
is retrieved from the primary.
Closes #46742
Uses PostgreSQL tuple estimates to provide a much faster yet approximate
count. See https://wiki.postgresql.org/wiki/Slow_Counting for more details.
We only use this fast method if the table has been analyzed or vacuumed
within the last hour.
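The reltuples lookup boils down to something like this (table name illustrative); the freshness check against pg_stat_user_tables is omitted here:

```ruby
estimate = ActiveRecord::Base.connection.select_value(<<-SQL).to_i
  SELECT reltuples::bigint
  FROM pg_class
  WHERE relname = 'issues'
SQL
```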
Closes #46255
In Arel 7.0.0 (Arel 7.1.4 is used in Rails 5.0) the `engine` parameter
of `Arel::UpdateManager#initialize` was removed.
This commit makes the GitLab database helpers work in both Rails 4 and
Rails 5.
Fixes errors like this one:
```
1) Gitlab::Database::MigrationHelpers#update_column_in_batches when running outside of a transaction updates all the rows in a table
Failure/Error:
update_arel = Arel::UpdateManager.new(ActiveRecord::Base)
.table(table)
.set([[table[column], value]])
.where(table[:id].gteq(start_id))
ArgumentError:
wrong number of arguments (given 1, expected 0)
# ./lib/gitlab/database/migration_helpers.rb:317:in `new'
# ./lib/gitlab/database/migration_helpers.rb:317:in `block in update_column_in_batches'
# ./lib/gitlab/database/migration_helpers.rb:307:in `loop'
# ./lib/gitlab/database/migration_helpers.rb:307:in `update_column_in_batches'
# ./spec/lib/gitlab/database/migration_helpers_spec.rb:367:in `block (4 levels) in <top (required)>'
```
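One way to construct the manager that works on both Arel versions (a sketch; the commit may use a different version check):

```ruby
update_arel =
  if Arel::UpdateManager.instance_method(:initialize).arity.zero?
    Arel::UpdateManager.new                     # Arel >= 7 (Rails 5)
  else
    Arel::UpdateManager.new(ActiveRecord::Base) # Arel 6 (Rails 4)
  end

update_arel.table(table)
  .set([[table[column], value]])
  .where(table[:id].gteq(start_id))
```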
Direct disk access is done through Gitaly now, so the legacy path was
deprecated. This path was still used in Gitlab::Shell, however, which
required the refactoring in this commit.
Also included is the removal of direct path access on the project model,
as that lookup isn't needed anymore in most cases.
Closes https://gitlab.com/gitlab-org/gitaly/issues/1111
[10.6] Prevent notes on confidential issues from being sent to chat
See merge request gitlab/gitlabhq!2366
# Conflicts:
# app/helpers/services_helper.rb
Index creation does not have an effect if the index is present already.
Index removal does not have an effect if the index is not present.
This helps to avoid patterns like this in migrations:
```
if index_exists?(...)
remove_concurrent_index(...)
end
```
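With the idempotent helpers the guard can simply be dropped (index arguments are illustrative):

```ruby
# Safe to call even if the index has already been removed.
remove_concurrent_index :projects, :pending_delete
```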
The concurrency issue originates from inserts on
`user_interacted_projects` from the app while running the post-deploy
migration.
This change comes with a strategy to lock the table while removing
duplicates and creating the unique index (and similar for FK
constraints).
Also, we'll have a non-unique index until the post-deploy migration is
finished to speed up queries during that time.
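A sketch of the locking strategy, using the standard PostgreSQL ctid-based deduplication; the table and column names come from the description, the rest is illustrative:

```ruby
# Inside the post-deploy migration's transaction:
execute 'LOCK user_interacted_projects IN EXCLUSIVE MODE'

# Remove duplicate (user_id, project_id) pairs, keeping one row of each.
execute <<-SQL
  DELETE FROM user_interacted_projects a
  USING user_interacted_projects b
  WHERE a.ctid < b.ctid
    AND a.user_id = b.user_id
    AND a.project_id = b.project_id
SQL

add_index :user_interacted_projects, [:project_id, :user_id], unique: true
```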
Closes #44205.
Prior to this commit we would essentially update all rows in a table,
even those where the source column (e.g. `issues.closed_at`) was NULL.
This in turn could lead to statement timeouts when using the default
batch size of 10 000 rows per job.
To work around this we don't schedule jobs for rows where the source
value is NULL. We also don't update rows where the source column is NULL
(as an extra precaution) or the target column already has a non-NULL
value. Using this approach it should be possible to update 10 000 rows
in the "issues" table in about 7.5 - 8 seconds.
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/42158
This adds a minimum interval to BackgroundMigrationWorker, ensuring
background migrations of the same class only run once every 5 minutes.
This prevents a thundering herd problem where scheduled migrations all
run at once due to their delays having been expired (e.g. as the result
of a queue being paused for a long time).
If a job was recently executed it's rescheduled with a delay that equals
the remaining time of the job's lease. This means that if the lease
expires in two minutes we only need to wait two minutes, instead of
five.
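A rough sketch of the lease handling; names are illustrative and the remaining-time lookup (`ttl` here) is an assumption:

```ruby
def perform(class_name, arguments = [])
  lease = Gitlab::ExclusiveLease.new("background_migration:#{class_name}",
                                     timeout: 5.minutes)

  if lease.try_obtain
    Gitlab::BackgroundMigration.perform(class_name, arguments)
  else
    # Wait only for the remainder of the lease instead of a full 5 minutes.
    self.class.perform_in(lease.ttl, class_name, arguments)
  end
end
```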
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/41624
In a previous attempt (rolled back in
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/16021) we tried
to migrate `issues.closed_at` from timestamp to timestamptz using a
regular migration. This has a bad impact on GitLab.com and as such was
rolled back.
This commit re-implements the original migrations using generic
background migrations, allowing us to still migrate the data in a single
release but without a negative impact on availability.
To ensure the database schema is up to date, the background migrations
are performed inline in development and test environments. We also make
sure not to migrate data that doesn't need migrating in the first place,
or that has already been migrated.
* upstream/master: (671 commits)
Make rubocop happy
Use guard clause
Improve language
Prettify
Use temp branch
Pass info about who started the job and which job triggered it
Docs: add indexes for monitoring and performance monitoring
clearer-documentation-on-inline-diffs
Add docs for commit diff discussion in merge requests
sorting for tags api
Clear BatchLoader after each spec to prevent holding onto records longer than necessary
Include project in BatchLoader key to prevent returning blobs for the wrong project
moved lfs_blob_ids method into ExtractsPath module
Converted JS modules into exported modules
spec fixes
Bump gitlab-shell version to 5.10.3
Clear caches before updating MR diffs
Use new Ruby version 2.4 in GitLab QA images
moved lfs blob fetch from extractspath file
Update GitLab QA dependencies
...
* upstream/master: (126 commits)
Update VERSION to 10.3.0-pre
Update CHANGELOG.md for 10.2.0
default fill color for SVGs
ignore hashed repos (for now) when using `rake gitlab:cleanup:repos`
Use Redis cache for branch existence checks
Update CONTRIBUTING.md: Link definition of done to criteria
Use `make install` for Gitaly setups in non-test environments
FileUploader should check for hashed_storage?(:attachments) to use disk_path
Set the default gitlab-shell timeout to 3 hours
Update composite pipelines index to include "id"
Use arrays in Pipeline#latest_builds_with_artifacts
Fix blank states using old css
Skip confirmation user api
Custom issue tracker
Revert "check for `read_only?` first before seeing if request is disallowed"
add `#with_metadata` scope to remove a N+1 from the notes' API
Fix promoting milestone updating all issuables without milestone
Batchload blobs for diff generation
check for `read_only?` first before seeing if request is disallowed
use `Gitlab::Routing.url_helpers` instead of `Rails.application.routes.url_helpers`
...
When a project is using hashed storage, the repositories and
attachments wouldn't be saved on disk using the `full_path`, so the
migration would not do anything.
Still, it's best to just skip moving when hashed storage is enabled.
* upstream/master: (507 commits)
Add dropdowns documentation
Convert migration to populate latest merge request ID into a background migration
Set 0.69.0 instead of latest for codeclimate image
De-duplicate background migration matchers defined in spec/support/migrations_helpers.rb
Update database_debugging.md
Update database_debugging.md
Move installation of apps higher
Change to Google Kubernetes Cluster and add internal links
Add Ingress description from official docs
Add info on creating your own k8s cluster from the cluster page
Add info about the installed apps in the Cluster docs
Resolve "lock/confidential issuable sidebar custom svg icons iteration"
Update HA README.md to clarify GitLab support does not troubleshoot DRBD.
Update license_finder to 3.1.1
Make sure NotesActions#noteable returns a Noteable in the update action
Cache the number of user SSH keys
Adjust openid_connect_spec to use `raise_error`
Resolve "Clicking on GPG verification badge jumps to top of the page"
Add changelog for container repository path update
Update container repository path reference
...
Prior to this commit, running
Namespace#force_share_with_group_lock_on_descendants would result in
updating _all_ namespaces in the namespaces table, not just the
descendants. This is the result of ActiveRecord::Relation#update_all not
taking into account the CTE. To work around this we use the CTE query as
a sub-query instead of directly calling #update_all.
To prevent this from happening the relations returned by
Gitlab::GroupHierarchy are now marked as read-only, resulting in an
error being raised when methods such as #update_all are used.
Fortunately on GitLab.com our statement timeouts appear to have
prevented this query from actually doing any damage other than causing
a very large amount of dead tuples.
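A sketch of the sub-query workaround, with illustrative scopes:

```ruby
descendant_ids = Gitlab::GroupHierarchy
  .new(Group.where(id: group.id))
  .base_and_descendants
  .select(:id)

# Using the CTE-backed relation as a sub-query means #update_all only
# touches the descendants, instead of every row in namespaces.
Namespace.where(id: descendant_ids)
  .update_all(share_with_group_lock: true)
```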
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/37916
This adds a bunch of checks to migrations that may create or drop
triggers. Dropping triggers/functions is done using "IF EXISTS" so we
don't throw an error if the object in question has already been dropped.
We now also raise a custom error (message) when the user does not have
TRIGGER privileges. This should prevent the schema from entering an
inconsistent state while also providing the user with enough information
on how to solve the problem.
The recommendation of using SUPERUSER permissions is a bit extreme but
we require this anyway (Omnibus also configures users with this
permission).
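A sketch of one way to test for the TRIGGER privilege up front; the actual check and error message in this commit may differ:

```ruby
def trigger_permission?(table)
  connection = ActiveRecord::Base.connection
  result = connection.select_value(
    "SELECT has_table_privilege(#{connection.quote(table)}, 'TRIGGER')"
  )

  # Older adapters return 't'/'f' strings instead of booleans.
  result == true || result == 't'
end

unless trigger_permission?('issues')
  raise 'Your database user is not allowed to create or drop triggers; ' \
    'grant the TRIGGER privilege or run this migration as a superuser.'
end
```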
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/36633
Due to a missing `on_delete: :cascade`, users would hit the error that
looked like:
```
PG::ForeignKeyViolation: ERROR: update or delete on table "protected_tags"
violates foreign key constraint "fk_rails_f7dfda8c51" on table
"protected_tag_create_access_levels" DETAIL: Key (id)=(1385) is still
referenced from table "protected_tag_create_access_levels". : DELETE FROM
"protected_tags" WHERE "protected_tags"."id" = 1385
```
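A hedged sketch of the kind of migration this implies; the class name and the use of `add_concurrent_foreign_key` are assumptions:

```ruby
class AddCascadeToProtectedTagCreateAccessLevelsFk < ActiveRecord::Migration
  include Gitlab::Database::MigrationHelpers

  DOWNTIME = false

  disable_ddl_transaction!

  def up
    remove_foreign_key :protected_tag_create_access_levels,
      column: :protected_tag_id

    add_concurrent_foreign_key :protected_tag_create_access_levels,
      :protected_tags, column: :protected_tag_id, on_delete: :cascade
  end
end
```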
Closes #36013
These attributes are stored in binary in the database, but exposed as
strings. This allows one to query/create data using plain SHA1 hashes as
Strings, while storing them more efficiently as binary.
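The pattern looks roughly like this; the model and column here are illustrative:

```ruby
class MergeRequestDiff < ActiveRecord::Base
  include ShaAttribute

  # Stored as bytea, read and written as a plain hex string.
  sha_attribute :base_commit_sha
end

# Callers keep using ordinary SHA1 strings:
MergeRequestDiff.where(base_commit_sha: '4b825dc642cb6eb9a060e54bf8d69288fbee4904')
```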
* master: (56 commits)
File view buttons
Don't reset the session when the example failed, because we need capybara-screenshot to have access to it
Resolve "MR comment + system note highlight don't have the same width"
Add feature spec for dashboard state filter tabs
Wording of Mysql support.
a new feature checklist and more elaborate documentation requirements
Filter archived project in API v3 only if param present
Revert to using links instead of buttons in Issuable Index tabs.
Do not run the codeclimate job on docs-only changes
Only show gray footer space if environment actions exist
Migrate Gitlab::Git::Blob.find to Gitaly
Backport filtered search lazy token consistent state fix
Add a comment explaining how the branch clean up happens
Fix Github::Representation::PullRequest#source_branch_exists?
Add CHANGELOG
Fix GitHub importer performance on branch existence check
Rebuild the dynamic path before validating it
Rename stage ref migration specs to match a class name
Enable Style/DotPosition Rubocop 👮
Revert "Merge branch 'winh-merge-request-related-issues' into 'master'"
...
Conflicts:
db/post_migrate/20170526185921_migrate_build_stage_reference.rb
When using update_column_in_batches the upper limit on the batch size is
now 1000. This ensures that for very large tables we don't lock tens of
thousands of rows during the update. This in turn should reduce the
likelihood of running into deadlocks.
MySQL doesn't allow us to create a trigger for a column that doesn't
exist yet. Failing with this error:
```
Mysql2::Error: Unknown column 'build_events' in 'NEW': CREATE TRIGGER trigger_6a80c097c862_insert
BEFORE INSERT
ON `services`
FOR EACH ROW
SET NEW.`build_events` = NEW.`job_events`
```
Creating the new column before creating the trigger avoids this.
Starting with GitLab 9.1.0 we will no longer allow downtime migrations
unless absolutely necessary. This commit updates the various developer
guides and adds code that is necessary to make zero downtime migrations
less painful.