In a previous attempt (rolled back in
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/16021) we tried
to migrate `issues.closed_at` from timestamp to timestamptz using a
regular migration. This has a bad impact on GitLab.com and as such was
rolled back.
This commit re-implements the original migrations using generic
background migrations, allowing us to still migrate the data in a single
release but without a negative impact on availability.
To ensure the database schema is up to date the background migrations
are performed inline in development and test environments. We also make
sure to not migrate that that doesn't need migrating in the first place
or has already been migrated.
The previous implementation iterated across the entire patch set
to determine the number of lines added, deleted, and changed. Rugged
has a native method `Rugged::Diff#stat` that does this already,
which appears to be a little faster and require less RAM than doing
this ourselves.
Improves performance in #41524
When searching for issues, an additional subquery
is added which filters only issues in a project. If global context is
used (no project is specified) this query filters all projects user has
access to.
In that case we can skip this filter because filtering only projects
user has access to is added anyway.
The filter is used only if a custom project context is specified
Related to #40540
Given the priorities shifted for the Gitaly team, this endpoint does not
get a dedicated endpoint yet. To make it work in a cloud native
environment the request needs to go to Gitaly, not rugged. This is
achieved by rerouting to the generic TreeEntry endpoint.
Reduce cardinality of some of GitLab's Prometheus metrics and fix observed duration reporting.
Closes#41045
See merge request gitlab-org/gitlab-ce!15881
When searching for merge requests, an additional subquery
is added which by default filters only merge requests which belong
to source or target project user has permission for.
This filter is not needed because more restrictive filter
which checks if user has permission for target project
is used in the query.
So unless a custom projects filter is used by user, it's possible
to skip the default projects filter and speed up the final query.
Related to #40540
* upstream/master: (671 commits)
Make rubocop happy
Use guard clause
Improve language
Prettify
Use temp branch
Pass info about who started the job and which job triggered it
Docs: add indexes for monitoring and performance monitoring
clearer-documentation-on-inline-diffs
Add docs for commit diff discussion in merge requests
sorting for tags api
Clear BatchLoader after each spec to prevent holding onto records longer than necessary
Include project in BatchLoader key to prevent returning blobs for the wrong project
moved lfs_blob_ids method into ExtractsPath module
Converted JS modules into exported modules
spec fixes
Bump gitlab-shell version to 5.10.3
Clear caches before updating MR diffs
Use new Ruby version 2.4 in GitLab QA images
moved lfs blob fetch from extractspath file
Update GitLab QA dependencies
...
By importing this Ruby code into gitlab-rails (and gitaly-ruby), we avoid
200ms of startup time for each gitlab_projects subprocess we are eliminating.
By not having a gitlab_projects subprocess between gitlab-rails / sidekiq and
any git subprocesses (e.g. for fork_project, fetch_remote, etc, calls), we can
also manage these git processes more cleanly, and avoid sending SIGKILL to them
i.e.
Using compare and swap we update the expires_at value.
The thread that actually is able to update this value will also set
the cache holding method_call enabled state
The Gitaly CommitService is being hammered by n + 1 calls, mostly when
finding commits. This leads to this gRPC being turned of on production:
https://gitlab.com/gitlab-org/gitaly/issues/514#note_48991378
Hunting down where it came from, most of them were due to
MergeRequest#show. To prove this, I set a script to request the
MergeRequest#show page 50 times. The GDK was being scraped by
Prometheus, where we have metrics on controller#action and their Gitaly
calls performed. On both occations I've restarted the full GDK so all
caches had to be rebuild.
Current master, 806a68a81f, needed 435 requests
After this commit, 154 requests
Instead of using the factories. Since the factories might be using
columns that aren't available in the schema at version the particular
spec is running in.
When a merge request is created from email, use message body
as merge request description. If message body is empty then
merge request description is still created from the source
branch commit (if there is only single commit in the merge
request).
If message body is empty and there are multiple commits in
the source branch, then merge request description is left empty.
Closes#40968
On GitLab.com, InfluxSampler#sample_objects appears to take 1.2 s or so to
iterate through 1059 objects. This had led to delays of a couple hundred
milliseconds in processing in the main thread. Remove this code since it's not
really being used.
Closesgitlab-com/infrastructure#3250
If the source import directory were different from the destination directory,
GitLab would first create a blank repository and then erroneously move the
original one into a subdirectory. Adding an import type prevents this the project
from being initialized in the first place. It was accidentally removed in
1f917cbd49.
Closes#40765
TableOfContents filter generates hrefs for each header in markdown,
if the header text consists from digits and redacted symbols only,
e.g. "123" or "1.0 then the auto-generated href has the same format
as issue references.
If the generated id contains only digits, then 'anchor-' prefix is
prepended to the id.
Closes#38473
Moving the check out of the general requests, makes sure we don't have
any slowdown in the regular requests.
To keep the process performing this checks small, the check is still
performed inside a unicorn. But that is called from a process running
on the same server.
Because the checks are now done outside normal request, we can have a
simpler failure strategy:
The check is now performed in the background every
`circuitbreaker_check_interval`. Failures are logged in redis. The
failures are reset when the check succeeds. Per check we will try
`circuitbreaker_access_retries` times within
`circuitbreaker_storage_timeout` seconds.
When the number of failures exceeds
`circuitbreaker_failure_count_threshold`, we will block access to the
storage.
After `failure_reset_time` of no checks, we will clear the stored
failures. This could happen when the process that performs the checks
is not running.
create_merge_request_handler_spec needs a repository for some tests but
this repository lasts on disk by default which causes failures of other
tests. TestEnv.clean_test_path is used to get rid of the repository
after each test.
Closes#40900
Later migrations added fields to the EE DB which were used by factories which were used in these specs.
And in CE on MySQL, a single appearance row is enforced.
The migration and migration specs should not depend on the codebase staying the same.
* new merge request can be created by sending an email to the specific
email address (similar to creating issues by email)
* for the first iteration, source branch must be specified in the mail
subject, other merge request parameters can not be set yet
* user should enable "Receive notifications about your own activity" in
user settings to receive a notification about created merge request
Part of #32878
* Hopefully fixes spec failures in which the table doesn’t exist
* Decouples the background migration from the post-deploy migration, e.g. we could easily run it again even though the table is dropped when finished.
So the path on source installs cannot be too long for our column.
And fix the column length test since Route.path is limited to 255 chars, it doesn’t matter how many nested groups there are.
Copy `KubernetesService` logic in `Clusters::Platforms::Kubernetes` to make it interchangeable. And implement a selector.
See merge request gitlab-org/gitlab-ce!15515
* upstream/master: (170 commits)
support ordering of project notes in notes api
Redirect to an already forked project if it exists
Reschedule the migration to populate fork networks
Create fork networks for forks for which the source was deleted.
Fix item name and namespace text overflow in Projects dropdown
Minor backport from EE
fix link that was linking to `html` instead of `md`
Backport epic tasklist
Add timeouts for Gitaly calls
SSHUploadPack over Gitaly is now OptOut
fix icon colors in commit list
Fix star icon color/stroke
Backport border inline edit
Add checkboxes to automatically run AutoDevops pipeline
BE for automatic pipeline when enabling Auto DevOps
I am certainly weary of debugging sidekiq but I don't think that's what was meant
Ensure MRs always use branch refs for comparison
Fix issue comment submit button disabled on GFM paste
Lock seed-fu at the correct version in Gemfile.lock
Improve indexes on merge_request_diffs
...
That way we can join forks-of-forks into the same network even if
their original source was deleted.
Consider the following:
`awesome-oss/badger` is forked to `coolstuff/badger`, which is
forked to `user-a/badger` which in turn is forked to
`user-b/badger`.
When `awesome-oss/badger` is deleted, we will now create a fork
network with `coolstuff/badger` as the root. The `user-a/badger`
and `user-b/badger` projects will be added to the network.
If a merge request was created with a branch name that also matched a tag name,
we'd generate a comparison to or from the tag respectively, rather than the
branch. Merging would still use the branch, of course.
To avoid this, ensure that when we get the branch heads, we prepend the
reference prefix for branches, which will ensure that we generate the correct
comparison.
The st_commits and st_diffs columns on merge_request_diffs historically held the
YAML-serialised data for a merge request diff, in a variety of formats.
Since 9.5, these have been migrated in the background to two new tables:
merge_request_diff_commits and merge_request_diff_files. That has the advantage
that we can actually query the data (for instance, to find out how many commits
we've stored), and that it can't be in a variety of formats, but must match the
new schema.
This is the final step of that journey, where we drop those columns and remove
all references to them. This is a breaking change to the importer, because we
can no longer import diffs created in the old format, and we cannot guarantee
the export will be in the new format unless it was generated after this commit.
Update Prometheus Gem version and disable Prometheus method call instrumentation by default.
Closes gitlab-ee#4139 and #40457
See merge request gitlab-org/gitlab-ce!15558
Compared to the merge_request_diff association:
1. It's simpler to query. The query uses a foreign key to the
merge_request_diffs table, so no ordering is necessary.
2. It's faster for preloading. The merge_request_diff association has to load
every diff for the MRs in the set, then discard all but the most recent for
each. This association means that Rails can just query for N diffs from N
MRs.
3. It's more complicated to update. This is a bidirectional foreign key, so we
need to update two tables when adding a diff record. This also means we need
to handle this as a special case when importing a GitLab project.
There is some juggling with this association in the merge request model:
* `MergeRequest#latest_merge_request_diff` is _always_ the latest diff.
* `MergeRequest#merge_request_diff` reuses
`MergeRequest#latest_merge_request_diff` unless:
* Arguments are passed. These are typically to force-reload the association.
* It doesn't exist. That means we might be trying to implicitly create a
diff. This only seems to happen in specs.
* The association is already loaded. This is important for the reasons
explained in the comment, which I'll reiterate here: if we a) load a
non-latest diff, then b) get its `merge_request`, then c) get that MR's
`merge_request_diff`, we should get the diff we loaded in c), even though
that's not the latest diff.
Basically, `MergeRequest#merge_request_diff` is the latest diff in most cases,
but not quite all.
* upstream/master: (126 commits)
Update VERSION to 10.3.0-pre
Update CHANGELOG.md for 10.2.0
default fill color for SVGs
ignore hashed repos (for now) when using `rake gitlab:cleanup:repos`
Use Redis cache for branch existence checks
Update CONTRIBUTING.md: Link definition of done to criteria
Use `make install` for Gitaly setups in non-test environments
FileUploader should check for hashed_storage?(:attachments) to use disk_path
Set the default gitlab-shell timeout to 3 hours
Update composite pipelines index to include "id"
Use arrays in Pipeline#latest_builds_with_artifacts
Fix blank states using old css
Skip confirmation user api
Custom issue tracker
Revert "check for `read_only?` first before seeing if request is disallowed"
add `#with_metadata` scope to remove a N+1 from the notes' API
Fix promoting milestone updating all issuables without milestone
Batchload blobs for diff generation
check for `read_only?` first before seeing if request is disallowed
use `Gitlab::Routing.url_helpers` instead of `Rails.application.routes.url_helpers`
...
In !15082, we changed the behavior of the middleware to call
`Rails.application.routes.recognize_path` whenever a new route arrived.
However, this can be a CPU-intensive task because Rails needs to allocate
memory and compile 850+ different regular expressions, which are complicated
in GitLab.
As a short-term fix, we can do a lightweight string match before
we do the heavier comparison.
Closes#40185, gitlab-com/infrastructure#3240
When a project is using hashed storage, the repositories and
attachments wouldn't be saved on disk using the `full_path`. So the
migration would not do anything.
However: best to just skip moving when hashed storage is enabled.
Conflicts used to take a `Repository` and pass that to
`Gitlab::Highlight.highlight`, which would call `#gitattribute` on the
repository. Now they use a `Gitlab::Git::Repository`, which didn't have that
method defined - but defining it on `Gitlab::Git::Repository` does make it
available on `Repository` through `method_missing`, so we can do that and both
cases will work.