This changes the format of metadata to handle paths, that may contain
whitespace characters, new line characters and non-UTF-8 characters.
Now those paths along with metadata in JSON format are stored as
length-prefixed strings (uint32 prefix).
Metadata file has a custom format:
1. First string field is metadata version field (string)
2. Second string field is metadata errors field (JSON strong)
3. All subsequent fields is pair of path (string) and path metadata
in JSON format.
Path's metadata contains all fields that where possible to extract from
ZIP archive like date of modification, CRC, compressed size,
uncompressed size and comment.
`StringPath` class is something similar to Ruby's `Pathname` class,
but does not involve any IO operations. `StringPath` objects require
passing string representation of path, and array of paths that
represents universe to constructor to be intantiated.
LDAP Sync blocked user edgecases
Allow GitLab admins to block otherwise valid GitLab LDAP users
(https://gitlab.com/gitlab-org/gitlab-ce/issues/3462)
Based on the discussion on the original issue, we are going to differentiate "normal" block operations to the ldap automatic ones in order to make some decisions when its one or the other.
Expected behavior:
- [x] "ldap_blocked" users respond to both `blocked?` and `ldap_blocked?`
- [x] "ldap_blocked" users can't be unblocked by the Admin UI
- [x] "ldap_blocked" users can't be unblocked by the API
- [x] Block operations that are originated from LDAP synchronization will flag user as "ldap_blocked"
- [x] Only "ldap_blocked" users will be automatically unblocked by LDAP synchronization
- [x] When LDAP identity is removed, we should convert `ldap_blocked` into `blocked`
Mockup for the Admin UI with both "ldap_blocked" and normal "blocked" users:

There will be another MR for the EE version.
See merge request !2242
Sampling data at a fixed interval means we can potentially miss data
from events occurring between sampling intervals. For example, say we
sample data every 15 seconds but Unicorn workers get killed after 10
seconds. In this particular case it's possible to miss interesting data
as the sampler will never get to actually submitting data.
To work around this (at least for the most part) the sampling interval
is randomized as following:
1. Take the user specified sampling interval (15 seconds by default)
2. Divide it by 2 (referred to as "half" below)
3. Generate a range (using a step of 0.1) from -"half" to "half"
4. Every time the sampler goes to sleep we'll grab the user provided
interval and add a randomly chosen "adjustment" to it while making
sure we don't pick the same value twice in a row.
For a specified timeout of 15 this means the actual intervals can be
anywhere between 7.5 and 22.5, but never can the same interval be used
twice in a row.
The rationale behind this change is that on dev.gitlab.org I'm sometimes
seeing certain Gitlab::Git/Rugged objects being retained, but only for a
few minutes every 24 hours. Knowing the code of Gitlab and how much
memory it uses/leaks I suspect we're missing data due to workers getting
terminated before the sampler can write its data to InfluxDB.
Where a vew is called from doesn't matter as much. We already know what
action they belong to and this is more than enough information. By
removing the file/line number from the list of tags we should also be
able to reduce the number of series stored in InfluxDB.
This gives a very rough estimate of how much memory is allocated during
a transaction. This only works reliably when using a single-threaded
application server and a Ruby implementation with a GIL as otherwise
memory allocated by other threads might skew the statistics. Sadly
there's no way around this as Ruby doesn't provide a reliable way of
gathering accurate object sizes upon allocation on a per-thread basis.
Without this it's impossible to find out what methods/views/queries are
executed by a certain controller or Sidekiq worker. While this will
increase the total number of series it should stay within reasonable
limits due to the amount of "actions" being small enough.
Every time I push to GitLab, I get > 2 emails saying a spec failed when
I don't care about benchmarks and other specs that have `allow_failure` set to `true`.
Since filtering by these values is very rare (they're mostly just
displayed as-is) we don't need to waste any index space by saving them
as tags. By storing them as values we also greatly reduce the number of
series in InfluxDB.
This reverts commit 7549102bb7.
Apparently I was wrong about
ActiveSupport::Notifications::Event#duration returning the duration in
seconds, instead it returns it in milliseconds already.
This will be used to store/increment the total query/view rendering
timings on a per transaction basis. This in turn can greatly reduce the
amount of metrics stored.
This particular setup had 3 problems:
1. Storing SQL queries as tags is very inefficient as InfluxDB ends up
indexing every query (and they can get pretty large). Storing these
as values instead means we can't always display the SQL as easily.
2. We already instrument ActiveRecord query methods, thus we already
have timing information about database queries.
3. SQL obfuscation is difficult to get right and I'd rather not expose
sensitive data by accident.
While it's useful to keep track of the different versions (Ruby, GitLab,
etc) doing so for every point wastes disk space and possibly also RAM
(which InfluxDB is all to eager to gobble up). If we want to see the
performance differences between different GitLab versions simply looking
at the performance since the last release date should suffice.
This removes the need for Sidekiq and any overhead/problems introduced
by TCP. There are a few things to take into account:
1. When writing data to InfluxDB you may still get an error if the
server becomes unavailable during the write. Because of this we're
catching all exceptions and just ignore them (for now).
2. Writing via UDP apparently requires the timestamp to be in
nanoseconds. Without this data either isn't written properly.
3. Due to the restrictions on UDP buffer sizes we're writing metrics one
by one, instead of writing all of them at once.