This change is a fairly straightforward refactor to extract the tracing
and correlation-id code from the gitlab rails codebase into the new
LabKit-Ruby project.
The corresponding import into LabKit-Ruby was in
https://gitlab.com/gitlab-org/labkit-ruby/merge_requests/1
The code itself remains very similar for now.
Extracting it allows us to reuse it in other projects, such as
Gitaly-Ruby. This will give us the advantages of correlation-ids and
distributed tracing in that project too.
The change in
https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/24199 caused
requests coming from a load balancer to arrive as 127.0.0.1 instead of
the actual IP.
`Rack::Request#ip` behaves slightly differently different than
`ActionDispatch::Request#remote_ip`: the former will return the first
X-Forwarded-For IP if all of the IPs are trusted proxies, while the
second one filters out all proxies and falls back to REMOTE_ADDR, which
is 127.0.0.1.
For now, we can revert back to using `Rack::Request` because these
middlewares don't manipulate parameters. The actual fix problem involves
fixing Rails: https://github.com/rails/rails/issues/28436.
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/58573
`queue_duration` is a useful metric that is currently in api_json.log
but not in production_json.log. We should add it because it tells us how
long the request sat in Workhorse before Unicorn processed it. Having
this field enables the support team to better troubleshoot when delays
began to happen.
As mentioned in
https://gitlab.com/gitlab-org/gitlab-ee/issues/9035#note_129093444,
Rails 5 switched ActionDispatch::Request so that it no longer inherits
Rack::Request directly. A middleware that uses Rack::Request to
read the environment may see stale request parameters if
another middleware modifies the environment via ActionDispatch::Request.
To be safe, we should be using ActionDispatch::Request everywhere.
Mixing and matching the use of Rack::Request and ActionDispatch::Request
in Rails 5 is bad, particularly if you have middleware that
manipulates or accesses environment variables.
`Gitlab::Middleware::Multipart` attempts to rewrite request parameters
to the proper values (e.g. replacing `data_file` with
`UploadedFile`). It does this by calling `Rack::Request#update_params`,
which essentially updates `env['rack.request.form_hash']`.
By changing to `ActionDispatch::Request`, the Go middleware was causing
the request parameters to be stored inside
`env['action_dispatch.request.request_parameters']`. Later calls to
`Rack::Request#update_params` would not have any effect because it would
attempt to update `env['rack.request.form_has']` instead of
`env['action_dispatch.request.request_parameters']`. As a result, the
controller still saw the old parameters.
Since the Go middleware appears to be using `ActionDispatch::Request`
for authorization methods, we can switch the multipart middleware to
use it too.
Closes https://gitlab.com/gitlab-org/gitlab-ee/issues/9035
The Correlation ID is taken or generated from received X-Request-ID.
Then it is being passed to all executed services (sidekiq workers
or gitaly calls).
The Correlation ID is logged in all structured logs as `correlation_id`.
In Ruby 2.4, `URI.join("http://test//", "a").to_s` will
remove the double slash, however it's not the case in
Ruby 2.5. Using chomp should work better for the intention,
as we're not trying to allow things like ../ or / paths
resolution.
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/53180
When direct_upload is enabled and a for file is being uploaded,
then workhorse uses `public/uploads/tmp` path. If `uploads.storage_path`
i sset to a different directory, then upload fails because
`public/uploads/tmp` is not in allowed paths.
The previous implementation would hit the database each time
and provide a dummy response. If the database goes down, this
means all application workers would be taken out of service.
Simplify this check by using a Rails middleware that intercepts
this endpoint and returns a 200 response.
Currently we check if uploaded file is under
`Gitlab.config.uploads.storage_path`, the problem is that
uploads are placed in `uploads` subdirectory which is symlink.
In allow_path? method we check real (expanded) paths, which causes
that `Gitlab.config.uploads.storage_path` is expaned into symlink
path and there is a mismatch with upload file path.
By adding `Gitlab.config.uploads.storage_path/uploads` into allowed
paths, this path is expaned during path check.
`Gitlab.config.uploads.storage_path` is left there intentionally in case
some uploader wouldn't use `uploads` subdir.