Previously, this wasn't needed: text was normally set to the highlighted
contents anyway. Now, it is: we store different things in text and rich_text.
This caused https://gitlab.com/gitlab-com/production/issues/439.
If a file is deleted, its new_pos is 0 (less than
total_blob_lines), but there is no reason to add the
bottom 'match' line in this case because there is not
extra content which could be expanded.
This currently causes 500's errors when loading the MR page
(Discussion) in a few scenarios.
We were not considering detailed diff headers such as
"--- a/doc/update/mysql_to_postgresql.md\n+++ b/doc/update/mysql_to_postgresql.md"
to crop the diff. In order to address it, we're now using
Gitlab::Diff::Parser, clean the diffs and builds Gitlab::Diff::Line objects
we can iterate and filter on.
We request Gitaly in a N+1 manner to build discussion diffs. Once the diffs are from different revisions, it's hard to make a single request to the service in order to build the whole response.
With this change we solve this problem and simplify a lot fetching this piece of info.
This adds a method to track errors that can be recovered from in
sentry.
It is useful when debugging performance issues, or exceptions that are
hard to reproduce.
Previously, we kept them all in the cache. We don't need the highlight results
for older diffs - if someone does view that (which is rare), we can do the
highlighting on the fly.
Rugged sometimes chops a context line in between bytes, resulting in the context
line having an invalid encoding: https://github.com/libgit2/rugged/issues/716
When that happens, we will try to detect the encoding for the diff, and
sometimes we'll get it wrong. If that difference in encoding results in a
difference in string lengths between the diff and the underlying blobs, we'd
fail to highlight any inline diffs, and return a 500 status to the user.
As we're using the underlying blobs, the encoding is 'correct' anyway, so if the
string range is invalid, we can just discard the inline diff highlighting. We
still report to Sentry to ensure that we can investigate further in future.
Due to the refactoring in !16082, `Blob#batch` no longer falls back
to a default `blob_size_limit`. Since `Repository#batch_blobs` was using
a default `nil` value, this would cause issues in the `Blob#find_by_rugged`
method.
This fix here is to be consistent and use a non-nil default value in
`Repository#batch_blobs`.
The problem was masked in development and tests because Gitaly is always
enabled by default for all features.
Closes#41735
Old merge requests can have diffs without corresponding blobs. (This also may be
possible for commit diffs in corrupt repositories.)
We can't use the `&.` operator on the blobs, because the blob objects are never
nil, but `BatchLoader` instances that delegate to `Blob`. We can't use
`Object#try`, because `Blob` doesn't inherit from `Object`.
`BatchLoader` provides a `__sync` method that returns the delegated object, but
using `itself` also works because it's forwarded, and will work for
non-`BatchLoader` instances too. So the simplest solution is to just use that
with the `&.` operator.
If a merge request was created with a branch name that also matched a tag name,
we'd generate a comparison to or from the tag respectively, rather than the
branch. Merging would still use the branch, of course.
To avoid this, ensure that when we get the branch heads, we prepend the
reference prefix for branches, which will ensure that we generate the correct
comparison.
For some old merge requests, we don't have enough information to figure out the
old blob and the new blob for the file. This means that we can't highlight the
diff correctly, but we can still display it without highlighting.