The previous implementation iterated across the entire patch set
to determine the number of lines added, deleted, and changed. Rugged
has a native method `Rugged::Diff#stat` that does this already,
which appears to be a little faster and require less RAM than doing
this ourselves.
Improves performance in #41524
Given the priorities shifted for the Gitaly team, this endpoint does not
get a dedicated endpoint yet. To make it work in a cloud native
environment the request needs to go to Gitaly, not rugged. This is
achieved by rerouting to the generic TreeEntry endpoint.
Uses `list_commits_by_oid` on the CommitService, to request the needed
commits for pipelines. These commits are needed to display the user that
created the commit and the commit title.
This includes fixes for tests failing that depended on the commit
being `nil`. However, now these are batch loaded, this doesn't happen
anymore and the commits are an instance of BatchLoader.
* upstream/master: (671 commits)
Make rubocop happy
Use guard clause
Improve language
Prettify
Use temp branch
Pass info about who started the job and which job triggered it
Docs: add indexes for monitoring and performance monitoring
clearer-documentation-on-inline-diffs
Add docs for commit diff discussion in merge requests
sorting for tags api
Clear BatchLoader after each spec to prevent holding onto records longer than necessary
Include project in BatchLoader key to prevent returning blobs for the wrong project
moved lfs_blob_ids method into ExtractsPath module
Converted JS modules into exported modules
spec fixes
Bump gitlab-shell version to 5.10.3
Clear caches before updating MR diffs
Use new Ruby version 2.4 in GitLab QA images
moved lfs blob fetch from extractspath file
Update GitLab QA dependencies
...
This does two things:
- Pass commit oids instead of `Gitlab::Git::Commit`s. We only need the
former.
- Depend on only the target repository for conflict listing. For
conflict resolution, treat one repository as a remote one so that we can
implement it as such in Gitaly.
By importing this Ruby code into gitlab-rails (and gitaly-ruby), we avoid
200ms of startup time for each gitlab_projects subprocess we are eliminating.
By not having a gitlab_projects subprocess between gitlab-rails / sidekiq and
any git subprocesses (e.g. for fork_project, fetch_remote, etc, calls), we can
also manage these git processes more cleanly, and avoid sending SIGKILL to them
Moving the check out of the general requests, makes sure we don't have
any slowdown in the regular requests.
To keep the process performing this checks small, the check is still
performed inside a unicorn. But that is called from a process running
on the same server.
Because the checks are now done outside normal request, we can have a
simpler failure strategy:
The check is now performed in the background every
`circuitbreaker_check_interval`. Failures are logged in redis. The
failures are reset when the check succeeds. Per check we will try
`circuitbreaker_access_retries` times within
`circuitbreaker_storage_timeout` seconds.
When the number of failures exceeds
`circuitbreaker_failure_count_threshold`, we will block access to the
storage.
After `failure_reset_time` of no checks, we will clear the stored
failures. This could happen when the process that performs the checks
is not running.
* upstream/master: (170 commits)
support ordering of project notes in notes api
Redirect to an already forked project if it exists
Reschedule the migration to populate fork networks
Create fork networks for forks for which the source was deleted.
Fix item name and namespace text overflow in Projects dropdown
Minor backport from EE
fix link that was linking to `html` instead of `md`
Backport epic tasklist
Add timeouts for Gitaly calls
SSHUploadPack over Gitaly is now OptOut
fix icon colors in commit list
Fix star icon color/stroke
Backport border inline edit
Add checkboxes to automatically run AutoDevops pipeline
BE for automatic pipeline when enabling Auto DevOps
I am certainly weary of debugging sidekiq but I don't think that's what was meant
Ensure MRs always use branch refs for comparison
Fix issue comment submit button disabled on GFM paste
Lock seed-fu at the correct version in Gemfile.lock
Improve indexes on merge_request_diffs
...
If a merge request was created with a branch name that also matched a tag name,
we'd generate a comparison to or from the tag respectively, rather than the
branch. Merging would still use the branch, of course.
To avoid this, ensure that when we get the branch heads, we prepend the
reference prefix for branches, which will ensure that we generate the correct
comparison.
* upstream/master: (126 commits)
Update VERSION to 10.3.0-pre
Update CHANGELOG.md for 10.2.0
default fill color for SVGs
ignore hashed repos (for now) when using `rake gitlab:cleanup:repos`
Use Redis cache for branch existence checks
Update CONTRIBUTING.md: Link definition of done to criteria
Use `make install` for Gitaly setups in non-test environments
FileUploader should check for hashed_storage?(:attachments) to use disk_path
Set the default gitlab-shell timeout to 3 hours
Update composite pipelines index to include "id"
Use arrays in Pipeline#latest_builds_with_artifacts
Fix blank states using old css
Skip confirmation user api
Custom issue tracker
Revert "check for `read_only?` first before seeing if request is disallowed"
add `#with_metadata` scope to remove a N+1 from the notes' API
Fix promoting milestone updating all issuables without milestone
Batchload blobs for diff generation
check for `read_only?` first before seeing if request is disallowed
use `Gitlab::Routing.url_helpers` instead of `Rails.application.routes.url_helpers`
...
Conflicts used to take a `Repository` and pass that to
`Gitlab::Highlight.highlight`, which would call `#gitattribute` on the
repository. Now they use a `Gitlab::Git::Repository`, which didn't have that
method defined - but defining it on `Gitlab::Git::Repository` does make it
available on `Repository` through `method_missing`, so we can do that and both
cases will work.
* upstream/master: (507 commits)
Add dropdowns documentation
Convert migration to populate latest merge request ID into a background migration
Set 0.69.0 instead of latest for codeclimate image
De-duplicate background migration matchers defined in spec/support/migrations_helpers.rb
Update database_debugging.md
Update database_debugging.md
Move installation of apps higher
Change to Google Kubernetes Cluster and add internal links
Add Ingress description from official docs
Add info on creating your own k8s cluster from the cluster page
Add info about the installed apps in the Cluster docs
Resolve "lock/confidential issuable sidebar custom svg icons iteration"
Update HA README.md to clarify GitLab support does not troubleshoot DRBD.
Update license_finder to 3.1.1
Make sure NotesActions#noteable returns a Noteable in the update action
Cache the number of user SSH keys
Adjust openid_connect_spec to use `raise_error`
Resolve "Clicking on GPG verification badge jumps to top of the page"
Add changelog for container repository path update
Update container repository path reference
...
Moving more git operations to be executed by Gitaly, now the check if a
repository exists is an opt out endpoint.
Can be disabled, for the time being, by performing in the rails console:
> Feature.get('gitaly_repository_exists').disable
=> true
Part of gitlab-org/gitaly#314
Prior to this MR there were two GitHub related importers:
* Github::Import: the main importer used for GitHub projects
* Gitlab::GithubImport: importer that's somewhat confusingly used for
importing Gitea projects (apparently they have a compatible API)
This MR renames the Gitea importer to Gitlab::LegacyGithubImport and
introduces a new GitHub importer in the Gitlab::GithubImport namespace.
This new GitHub importer uses Sidekiq for importing multiple resources
in parallel, though it also has the ability to import data sequentially
should this be necessary.
The new code is spread across the following directories:
* lib/gitlab/github_import: this directory contains most of the importer
code such as the classes used for importing resources.
* app/workers/gitlab/github_import: this directory contains the Sidekiq
workers, most of which simply use the code from the directory above.
* app/workers/concerns/gitlab/github_import: this directory provides a
few modules that are included in every GitHub importer worker.
== Stages
The import work is divided into separate stages, with each stage
importing a specific set of data. Stages will schedule the work that
needs to be performed, followed by scheduling a job for the
"AdvanceStageWorker" worker. This worker will periodically check if all
work is completed and schedule the next stage if this is the case. If
work is not yet completed this worker will reschedule itself.
Using this approach we don't have to block threads by calling `sleep()`,
as doing so for large projects could block the thread from doing any
work for many hours.
== Retrying Work
Workers will reschedule themselves whenever necessary. For example,
hitting the GitHub API's rate limit will result in jobs rescheduling
themselves. These jobs are not processed until the rate limit has been
reset.
== User Lookups
Part of the importing process involves looking up user details in the
GitHub API so we can map them to GitLab users. The old importer used
an in-memory cache, but this obviously doesn't work when the work is
spread across different threads.
The new importer uses a Redis cache and makes sure we only perform
API/database calls if absolutely necessary. Frequently used keys are
refreshed, and lookup misses are also cached; removing the need for
performing API/database calls if we know we don't have the data we're
looking for.
== Performance & Models
The new importer in various places uses raw INSERT statements (as
generated by `Gitlab::Database.bulk_insert`) instead of using Rails
models. This allows us to bypass any validations and callbacks,
drastically reducing the number of SQL queries and Gitaly RPC calls
necessary to import projects.
To ensure the code produces valid data the corresponding tests check if
the produced rows are valid according to the model validation rules.
* upstream/master: (1723 commits)
Resolve "Editor icons"
Refactor issuable destroy action
Ignore routes matching legacy_*_redirect in route specs
Gitlab::Git::RevList and LfsChanges use lazy popen
Gitlab::Git::Popen can lazily hand output to a block
Merge branch 'master-i18n' into 'master'
Remove unique validation from external_url in Environment
Expose `duration` in Job API entity
Add TimeCop freeze for DST and Regular time
Harcode project visibility
update a changelog
Put a condition to old migration that adds fast_forward column to MRs
Expose project visibility as CI variable
fix flaky tests by removing unneeded clicks and focus actions
fix flaky test in gfm_autocomplete_spec.rb
Use Gitlab::Git operations for repository mirroring
Encapsulate git operations for mirroring in Gitlab::Git
Create a Wiki Repository's raw_repository properly
Add `Gitlab::Git::Repository#fetch` command
Fix Gitlab::Metrics::System#real_time and #monotonic_time doc
...
This allows input to start processing immediately without waiting for the process to complete.
This also allows long or infinite inputs to be partially processed,
which will termiate the process when reading stops with SIGPIPE.
also, I refactored the MergeRequest#fetch_ref method to express
the side-effect that this method has.
MergeRequest#fetch_ref -> MergeRequest#fetch_ref!
Repository#fetch_source_branch -> Repository#fetch_source_branch!
Now, when requesting a commit from the Repository model, the results are
not cached. This means we're fetching the same commit by oid multiple times
during the same request. To prevent us from doing this, we now cache
results. Caching is done only based on object id (aka SHA).
Given we cache on the Repository model, results are scoped to the
associated project, eventhough the change of two repositories having the
same oids for different commits is small.
Instead of only checking once within a timeout, check multiple times
within a timeout.
That means with a timeout of 30 seconds and 3 retries. Each try would
be allowed 20 seconds.
The circuitbreaker now has 2 failure modes:
- Backing off: This will raise the `Gitlab::Git::Storage::Failing`
exception. Access to the shard is blocked temporarily.
- Circuit broken: This will raise the
`Gitlab::Git::Storage::CircuitBroken` exception. Access to the shard
will be blocked until the failures are reset.
When calling pre-receive, post-receive, and update hooks, add the GitLab
username as the GL_USERNAME environment variable.
This patch only handles cases where pushes are over http, or via
the web interface. Later, we will address the ssh case.
If the ref doesn't exist, and the source branch is deleted, we can't get it back
easily. Previously, we ignored this error by shelling out, so replicate that
behaviour.
* upstream/master: (168 commits)
Update CHANGELOG.md for 10.0.1
Remove Grit settings from default settings
Fix duplicate key errors in PostDeployMigrateUserExternalMailData migration
Workaround for #38259
Workaround for n+1 in Projects::TreeController#show
Removed old icons from project page
Make branches page translatable
fix typo in icons section
Don't show it if there's no project.
Update CHANGELOG.md for 10.0.0
Inform user that current shared projects will remain shared
Allow the git circuit breaker to correctly handle missing repository storages
Reserve refs/replace cos `git-replace` is using it
Resolve "Better SVG Usage in the Frontend"
Replace the 'project/service.feature' spinach test with an rspec analog
Replace the 'project/shortcuts.feature' spinach test with an rspec analog
Removed two legacy config options
Fix rendering double note issue.
IssueNotes: Switch back to Write pane when note cancel or submit.
Upgrade Nokogiri because of CVE-2017-9050
...
In gitlab-org/gitlab-ee!2976, we saw that a given OID could point
to a commit, which would cause the delta size check to fail.
Gitaly already returns nil if the OID isn't a blob, so this change
makes the Rugged implementation consistent.
* upstream/master: (225 commits)
Add changelog entry
Backports EE 2756 logic to CE.
Make rubocop happy
Make profile settings dropdown consistent
Add filter by my reaction
Update spec initialization with it being a shared component
Update identicon path and selector
Renamed to `identicon` and make shared component
Merge branch 'master-i18n' into 'master'
Fix broken Frontend JS guide
Replace 'project/star.feature' spinach test with an rspec analog
Adds position fixed to right sidebar
Fixes the margin of the top buttons of the pipeline page
Remove commented out code
Better align fallback image emojis
Decrease Metrics/CyclomaticComplexity threshold to 15
Add changelog
Respect the default visibility level when creating a group
Further break with_repo_branch_commit into parts
Make sure inspect doesn't generate crazy string
...
Users of project mirrors would see that the number of branches did not
match the number in the branch dropdown because remote branches were
counted when Rugged was in use. With Gitaly, only local branches
are counted.
Closes#36934