also, I refactored the MergeRequest#fetch_ref method to express
the side-effect that this method has.
MergeRequest#fetch_ref -> MergeRequest#fetch_ref!
Repository#fetch_source_branch -> Repository#fetch_source_branch!
Resolve "ActiveRecord::StatementInvalid: PG::QueryCanceled: ERROR: canceling statement due to statement timeout"
Closes#39054
See merge request gitlab-org/gitlab-ce!15063
For MRs with many thousands of commits, `SELECT DISTINCT(sha)` will be very
slow.
What we can't do to fix this:
1. Add an index. Postgres won't use it for DISTINCT without a lot of ceremony.
2. Do the `uniq` in Ruby. That can still be very slow with hundreds of
thousands of commits.
3. Use a subquery. We haven't removed the `st_commits` column yet, but we will
soon.
Until 3 is available to us, we can just do 2, but also add a limit clause. There
is no ordering, so this may return different results, but our goal with these
MRs is just to get them to load, so it's not a huge deal.
In GitLab EE, a GitLab instance can be read-only (e.g. when it's a Geo
secondary node). But in GitLab CE it also might be useful to have the
"read-only" idea around. So port it back to GitLab CE.
Also having the principle of read-only in GitLab CE would hopefully
lead to less errors introduced, doing write operations when there
aren't allowed for read-only calls.
Closesgitlab-org/gitlab-ce#37534.
MergeRequest#create_merge_request_diff and MergeRequest#reload_diff are
the only places where we generate a new MR diff so that's where we
should fetch the ref.
This also ensures that the ref is not fetched when we call
merge_request.merge_request_diffs.create in
Github::Import#fetch_pull_requests.
Signed-off-by: Rémy Coutable <remy@rymai.me>
In this particular case the use of UNION ALL leads to a better query
plan compared to using 1 big query that uses an OR statement to combine
different data sources.
See https://gitlab.com/gitlab-org/gitlab-ce/issues/38508 for more
information.
This ensures the open issues/MR count caches are refreshed properly when
creating new issues or MRs. This MR also includes a change to the cache
keys to ensure all caches are rebuilt on the fly.
This particular problem was not caught in the test suite due to a null
cache being used, resulting in all calls that would use a cache using
the underlying data directly. In production the code would fail because
a newly saved record returns an empty hash in #changes meaning checks
such as `state_changed? || confidential_changed?` would return false for
new rows, thus never updating the counters.
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/38061
This ensures the issues/MR cache of the sidebar is only updated when the
state or confidential flags changes, instead of changing this for every
update.
* upstream/master: (225 commits)
Add changelog entry
Backports EE 2756 logic to CE.
Make rubocop happy
Make profile settings dropdown consistent
Add filter by my reaction
Update spec initialization with it being a shared component
Update identicon path and selector
Renamed to `identicon` and make shared component
Merge branch 'master-i18n' into 'master'
Fix broken Frontend JS guide
Replace 'project/star.feature' spinach test with an rspec analog
Adds position fixed to right sidebar
Fixes the margin of the top buttons of the pipeline page
Remove commented out code
Better align fallback image emojis
Decrease Metrics/CyclomaticComplexity threshold to 15
Add changelog
Respect the default visibility level when creating a group
Further break with_repo_branch_commit into parts
Make sure inspect doesn't generate crazy string
...
Every project page displays a navigation menu that in turn displays the
number of open issues and merge requests. This means that for every
project page we run two COUNT(*) queries, each taking up roughly 30
milliseconds on GitLab.com. By caching these numbers and refreshing them
whenever necessary we can reduce loading times of all these pages by up
to roughly 60 milliseconds.
The number of open issues does not include confidential issues. This is
a trade-off to keep the code simple and to ensure refreshing the data
only needs 2 COUNT(*) queries instead of 3. A downside is that if a
project only has 5 confidential issues the counter will be set to 0.
Because we now have 3 similar counting service classes the code
previously used in Projects::ForksCountService has mostly been moved to
Projects::CountService, which in turn is reused by the various service
classes.
Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/36622
Having two states that essentially mean the same thing is very much like
having a boolean "true" and boolean "mostly-true": it's rather silly.
This commit merges the "reopened" state into the "opened" state while
taking care of system notes still showing messages along the lines of
"Alice reopened this issue".
A big benefit from having only two states (opened and closed) is that
indexing and querying becomes simpler and more performant. For example,
to get all the opened queries we no longer have to query both states:
SELECT *
FROM issues
WHERE project_id = 2
AND state IN ('opened', 'reopened');
Instead we can query a single state directly, which can be much faster:
SELECT *
FROM issues
WHERE project_id = 2
AND state = 'opened';
Further, only having two states makes indexing easier as we will only
ever filter (and thus scan an index) using a single value. Partial
indexes could help but aren't supported on MySQL, complicating the
development process and not being helpful for MySQL.
For merge requests created after 9.4, we have a `merge_request_diff_commits`
table we can get all the SHAs from very quickly. We just need to exclude these
when we load from the legacy format, by ignoring diffs with no serialised
commits.
Once these have been migrated in the background, every MR will see this
improvement.
This is an ID-less table with just three columns: an association to the merge
request diff the commit belongs to, the relative order of the commit within the
merge request diff, and the commit SHA itself.
Previously we stored much more information about the commits, so that we could
display them even when they were deleted from the repo. Since 8.0, we ensure
that those commits are kept around for as long as the target repo itself is, so
we don't need to duplicate that data in the database.
This is allowed for existing instances so we don't end up 76 offenses
right away, but for new code one should _only_ use this if they _have_
to remove non database data. Even then it's usually better to do this in
a service class as this gives you more control over how to remove the
data (e.g. in bulk).
This removes the need for relying on Rails' "dependent" option for data
removal, which is _incredibly_ slow (even when using :delete_all) when
deleting large amounts of data. This also ensures data consistency is
enforced on DB level and not on application level (something Rails is
really bad at).
This commit also includes various migrations to add foreign keys to
tables that eventually point to "projects" to ensure no rows get
orphaned upon removing a project.
Fix https://gitlab.com/gitlab-org/gitlab-ce/issues/27070
Deprecate "chat commands" in favor of "slash commands"
We looked for things like:
- `slash commmand`
- `slash_command`
- `slash-command`
- `SlashCommand`
I don't know why this happens exactly, but given an upstream and fork repository
from a customer, both of which required GC, resolving conflicts would corrupt
the fork so badly that it couldn't be cloned.
This isn't a perfect fix for that case, because the MR may still need to be
merged manually, but it does ensure that the repository is at least usable.
My best guess is that when we generate the index for the conflict
resolution (which we previously did in the target project), we obtain a
reference to an OID that doesn't exist in the source, even though we already
fetch the refs from the target into the source.
Explicitly setting the source project as the place to get the merge index from
seems to prevent repository corruption in this way.
The problem is that we often go via a diff object constructed from the diffs
stored in the DB. Those diffs, by definition, don't overflow, so we don't have
access to the 'correct' `real_size` - that is stored on the MR diff object
iself.
Rename column in the database
Rename fields related to import/export feature
Rename API endpoints
Rename documentation links
Rename the rest of occurrences in the code
Replace the images that contain the words "build succeeds" and docs referencing to them
Make sure pipeline is green and nothing is missing.
updated doc images
renamed only_allow_merge_if_build_succeeds in projects and fixed references
more updates
fix some spec failures
fix rubocop offences
fix v3 api spec
fix MR specs
fixed issues with partials
fix MR spec
fix alignment
add missing v3 to v4 doc
wip - refactor v3 endpoints
fix specs
fix a few typos
fix project specs
copy entities fully to V3
fix entity error
more fixes
fix failing specs
fixed missing entities in V3 API
remove comment
updated code based on feedback
typo
fix spec
Previously, we created an unmergeable todo when a merge request:
1. Had merge when pipeline succeeds set.
2. Became unmergeable.
However, when merge when pipeline succeeds fails due to unmergeability,
the flag isn't actually removed. And a merge request can become
unmergeable multiple times, as every time the target branch is updated
we need to re-check the mergeable status. This means that if the todo
was marked done, and the MR was checked again, a new todo would be
created for the same event.
Instead of checking this, we should create the todo from the service
responsible for merging when the pipeline succeeds. That way the todo is
guaranteed to only be created when we care about it.
This was wrong when there were over 100 files in the diff, because we
did not use the same diff options as subclasses of
`Gitlab::Diff::FileCollection::Base` when getting the raw diffs. (The
reason we don't use those classes directly is because they may perform
highlighting, which isn't needed for just counting the diffs.)
We don't need to create the intermediate FileCollection object with its
associated highlighting in order to count how many files changed. Both
the compare object and the MR diff object have a raw_diffs method that
is perfectly fine for this case.
Fix missing Note access checks in by moving Note#search to updated NoteFinder
Split from !2024 to partially solve https://gitlab.com/gitlab-org/gitlab-ce/issues/23867
## Which fixes are in this MR?
⚠️ - Potentially untested
💣 - No test coverage
🚥 - Test coverage of some sort exists (a test failed when error raised)
🚦 - Test coverage of return value (a test failed when nil used)
✅ - Permissions check tested
### Note lookup without access check
- [x] ✅ app/finders/notes_finder.rb:13 :download_code check
- [x] ✅ app/finders/notes_finder.rb:19 `SnippetsFinder`
- [x] ✅ app/models/note.rb:121 [`Issue#visible_to_user`]
- [x] ✅ lib/gitlab/project_search_results.rb:113
- This is the only use of `app/models/note.rb:121` above, but importantly has no access checks at all. This means it leaks MR comments and snippets when those features are `team-only` in addition to the issue comments which would be fixed by `app/models/note.rb:121`.
- It is only called from SearchController where `can?(current_user, :download_code, @project)` is checked, so commit comments are not leaked.
### Previous discussions
- [x] https://dev.gitlab.org/gitlab/gitlabhq/merge_requests/2024/diffs#b915c5267a63628b0bafd23d37792ae73ceae272_13_13 `: download_code` check on commit
- [x] https://dev.gitlab.org/gitlab/gitlabhq/merge_requests/2024/diffs#b915c5267a63628b0bafd23d37792ae73ceae272_19_19 `SnippetsFinder` should be used
- `SnippetsFinder` should check if the snippets feature is enabled -> https://gitlab.com/gitlab-org/gitlab-ce/issues/25223
### Acceptance criteria met?
- [x] Tests added for new code
- [x] TODO comments removed
- [x] Squashed and removed skipped tests
- [x] Changelog entry
- [ ] State Gitlab versions affected and issue severity in description
- [ ] Create technical debt issue for NotesFinder.
- Either split into `NotesFinder::ForTarget` and `NotesFinder::Search` or consider object per notable type such as `NotesFinder::OnIssue`. For the first option could create `NotesFinder::Base` which is either inherited from or which can be included in the other two.
- Avoid case statement anti-pattern in this finder with use of `NotesFinder::OnCommit` etc. Consider something on the finder for this? `Model.finder(user, project)`
- Move `inc_author` to the controller, and implement `related_notes` to replace `non_diff_notes`/`mr_and_commit_notes`
See merge request !2035
The target branch of a merge request has to be a branch in the project
for which the merge request is submitted. When a branch changes in a fork,
it does not make sense to reload diffs of merge requests in the upstream
project that use the same branch name as the target branch.
Please note that it does make sense to reload diffs when the source branch
changes.
When a merge request can only be merged when all discussions are
resolved. This feature allows to easily delegate those discussions to a
new issue, while marking them as resolved in the merge request.
The user is presented with a new issue, prepared with mentions of all
unresolved discussions, including the first unresolved note of the
discussion, time and link to the note.
When the issue is created, the discussions in the merge request will get
a system note directing the user to the newly created issue.
Flushing the events cache worked by updating a recent number of rows in
the "events" table. This has the result that on PostgreSQL a lot of dead
tuples are produced on a regular basis. This in turn means that
PostgreSQL will spend considerable amounts of time vacuuming this table.
This in turn can lead to an increase of database load.
For GitLab.com we measured the impact of not using events caching and
found no measurable increase in response timings. Meanwhile not flushing
the events cache lead to the "events" table having no more dead tuples
as now rows are only inserted into this table.
As a result of this we are hereby removing events caching as it does not
appear to help and only increases database load.
For more information see the following comment:
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/6578#note_18864037
Also allow merge request to be merged with skipped pipeline and the
"only allow merge when pipeline is green" feature enabled
Signed-off-by: Rémy Coutable <remy@rymai.me>
When reading conflicts:
1. Add a `type` field. `text` works as before, and has `sections`;
`text-editor` is a file with ambiguous conflict markers that can only
be resolved in an editor.
2. Add a `content_path` field pointing to a JSON representation of the
file's content for a single file.
3. Hitting `content_path` returns a similar datastructure to the `file`,
but without the `content_path` and `sections` fields, and with a
`content` field containing the full contents of the file (with
conflict markers).
When writing conflicts:
1. Instead of `sections` being at the top level, they are now in a
`files` array. This matches the read format better.
2. The `files` array contains file hashes, each of which must contain:
a. `new_path`
b. `old_path`
c. EITHER `sections` (which works as before) or `content` (with the
full content of the resolved file).
Now a merge request with a blank description will no longer produce a
merge commit message like this:
```
Merge branch 'foo' into 'master'
Bring the wonders of foo into the world
See merge request !7283
```
What an improvement! 🎉
Show all pipelines from all merge_request_diffs
This way we could also show pipelines from commits which
were discarded due to a force push.
Closes#21889
See merge request !6414
- Don't use `TableReferences` - using `.arel_table` is shorter!
- Move some database-related code to `Gitlab::Database`
- Remove the `MergeRequest#issues_closed` and
`Issue#closed_by_merge_requests` associations. They were either
shadowing or were too similar to existing methods. They are not being
used anywhere, so it's better to remove them to reduce confusion.
- Use Rails 3-style validations
- Index for `MergeRequest::Metrics#first_deployed_to_production_at`
- Only include `CycleAnalyticsHelpers::TestGeneration` for specs that
need it.
- Other minor refactorings.
- Move things common to `Issue` and `MergeRequest` into `Issuable`
- Move more database-specific functions into `Gitlab::Database`
- Indentation changes and other minor refactorings.
However, if MergeRequest#all_commits_sha would want to handle
non-persisted merge request, by judging its name, it should not
just give 1 SHA, but all of them.
But we don't really care all_commits_sha for non-persisted merge
request anyway. So I think we should just ignore that case.
Better to not implementing something than implementing it in a
wrong and confusing way.
1. Change multiple updates to a single `update_all`
2. Use cascading deletes
3. Extract an average function for the database median.
4. Move database median to `lib/gitlab/database`
5. Use `delete_all` instead of `destroy_all`
6. Minor refactoring
All the code that pre-calculates metrics for use in the cycle analytics
page.
- Ci::Pipeline -> build start/finish
- Ci::Pipeline#merge_requests
- Issue -> record default metrics after save
- MergeRequest -> record default metrics after save
- Deployment -> Update "first_deployed_to_production_at" for MR metrics
- Git Push -> Update "first commit mention" for issue metrics
- Merge request create/update/refresh -> Update "merge requests closing issues"
1. These changes bring down page load time for 100 issues from more than
a minute to about 1.5 seconds.
2. This entire commit is composed of these types of performance
enhancements:
- Cache relevant data in `IssueMetrics` wherever possible.
- Cache relevant data in `MergeRequestMetrics` wherever possible.
- Preload metrics
3. Given these improvements, we now only need to make 4 SQL calls:
- Load all issues
- Load all merge requests
- Load all metrics for the issues
- Load all metrics for the merge requests
4. A list of all the data points that are now being pre-calculated:
a. The first time an issue is mentioned in a commit
- In `GitPushService`, find all issues mentioned by the given commit
using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at`
flag for each of them.
- There seems to be a (pre-existing) bug here - files (and
therefore commits) created using the Web CI don't have
cross-references created, and issues are not closed even when
the commit title is "Fixes #xx".
b. The first time a merge request is deployed to production
When a `Deployment` is created, find all merge requests that
were merged in before the deployment, and set the
`first_deployed_to_production_at` flag for each of them.
c. The start / end time for a merge request pipeline
Hook into the `Pipeline` state machine. When the `status` moves to
`running`, find the merge requests whose tip commit matches the
pipeline, and record the `latest_build_started_at` time for each
of them. When the `status` moves to `success`, record the
`latest_build_finished_at` time.
d. The merge requests that close an issue
- This was a big cause of the performance problems we were having
with Cycle Analytics. We need to use `ReferenceExtractor` to make
this calculation, which is slow when we have to run it on a large
number of merge requests.
- When a merge request is created, updated, or refreshed, find the
issues it closes, and create an instance of
`MergeRequestsClosingIssues`, which acts as a join model between
merge requests and issues.
- If a `MergeRequestsClosingIssues` instance links a merge request
and an issue, that issue closes that merge request.
5. The `Queries` module was changed into a class, so we can cache the
results of `issues` and `merge_requests_closing_issues` across
various cycle analytics stages.
6. The code added in this commit is untested. Tests will be added in the
next commit.