Commit Graph

434 Commits

Author SHA1 Message Date
Grzegorz Bizon 09a4a5aff8 Render only valid paths in artifacts metadata
In this version we will support only relative paths in artifacts
metadata. Support for absolute paths will be introduced later.
2016-01-14 12:48:16 +01:00
Grzegorz Bizon 61fb47a432 Simplify implementation of build artifacts browser (refactoring) 2016-01-14 12:48:15 +01:00
Grzegorz Bizon 387b27813d Change format of artifacts metadata from text to binary 0.0.1
This changes the format of metadata to handle paths that may contain
whitespace characters, new line characters and non-UTF-8 characters.

Now those paths along with metadata in JSON format are stored as
length-prefixed strings (uint32 prefix).

Metadata file has a custom format:

1.   First string field is metadata version field (string)
2.   Second string field is metadata errors field (JSON string)
3.   All subsequent fields are pairs of path (string) and path metadata
     in JSON format.

Path's metadata contains all fields that were possible to extract from
ZIP archive like date of modification, CRC, compressed size,
uncompressed size and comment.
2016-01-14 12:48:15 +01:00
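
The length-prefixed layout described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper names (`write_string`, `read_string`, `write_metadata`, `read_metadata`) are hypothetical, and big-endian byte order for the uint32 prefix is an assumption, since the message does not state it.

```ruby
require 'json'
require 'stringio'

# Write one length-prefixed string: a uint32 byte length, then the raw bytes.
# Big-endian ("N") is an assumption; the commit message does not give the byte order.
def write_string(io, string)
  bytes = string.b
  io.write([bytes.bytesize].pack('N'))
  io.write(bytes)
end

# Read one length-prefixed string, or return nil at a clean end of input.
def read_string(io)
  prefix = io.read(4)
  return nil if prefix.nil?
  raise EOFError, 'truncated length prefix' if prefix.bytesize < 4

  length = prefix.unpack1('N')
  data = io.read(length) || String.new
  raise EOFError, 'truncated string' if data.bytesize < length

  data
end

# Field order follows the commit message: version, errors (JSON),
# then repeated pairs of path and path metadata (JSON).
def write_metadata(io, version, errors, entries)
  write_string(io, version)
  write_string(io, JSON.generate(errors))
  entries.each do |path, meta|
    write_string(io, path)
    write_string(io, JSON.generate(meta))
  end
end

def read_metadata(io)
  version = read_string(io)
  errors = JSON.parse(read_string(io).force_encoding('UTF-8'))
  entries = {}
  while (path = read_string(io))
    # Paths stay as raw bytes, so whitespace, newlines and non-UTF-8 bytes survive.
    entries[path] = JSON.parse(read_string(io).force_encoding('UTF-8'))
  end
  { version: version, errors: errors, entries: entries }
end
```

Because every field carries its own byte length, no delimiter has to be escaped, which is what lets paths contain newlines or arbitrary bytes.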
Grzegorz Bizon 1b1793c253 Show file size in artifacts browser using metadata 2016-01-14 12:48:15 +01:00
Grzegorz Bizon cd3b8bbd2f Add method that checks if path exists in `StringPath` 2016-01-14 12:48:15 +01:00
Grzegorz Bizon a5e1905d28 Render 404 when artifacts path is invalid 2016-01-14 12:48:15 +01:00
Grzegorz Bizon f948c00757 Do not depend on universe when checking parent in `StringPath` 2016-01-14 12:48:15 +01:00
Grzegorz Bizon a7f99b67a0 Extract artifacts metadata implementation to separate class 2016-01-14 12:48:15 +01:00
Grzegorz Bizon df41148662 Improve path sanitization in `StringPath` 2016-01-14 12:48:15 +01:00
Grzegorz Bizon 3de8a4620a Parse artifacts metadata stored in JSON format 2016-01-14 12:48:15 +01:00
Grzegorz Bizon 304c39b6dc Fix rubocop offenses in `StringPath` specs 2016-01-14 12:48:13 +01:00
Grzegorz Bizon b19e958d86 Add support for parent directories in `StringPath`
This support is not complete yet, as the parent directory that is first
in the collection returned by `directories!` is not iterable yet.
2016-01-14 12:48:13 +01:00
Grzegorz Bizon 37b2c5dd55 Add support for root path for `StringPath` 2016-01-14 12:48:13 +01:00
Grzegorz Bizon d382335dcd Add implementation of remaining methods in `StringPath` 2016-01-14 12:48:13 +01:00
Grzegorz Bizon c184eeb848 Improve `StringPath` specs (DRY) 2016-01-14 12:48:13 +01:00
Grzegorz Bizon 518b206287 Add `parent` iteration implementation to `StringPath` 2016-01-14 12:48:13 +01:00
Grzegorz Bizon 73d2c7a553 Add new methods to StringPath 2016-01-14 12:48:12 +01:00
Grzegorz Bizon f5d5308658 Add implementation of StringPath class
`StringPath` class is something similar to Ruby's `Pathname` class,
but does not involve any IO operations. To be instantiated, `StringPath`
objects require a string representation of the path and an array of
paths that represents the universe, both passed to the constructor.
2016-01-14 12:48:12 +01:00
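
The idea of a path object that never touches the filesystem can be illustrated with a small sketch. The class below is hypothetical (`SketchPath` is not the `StringPath` from these commits), and it assumes that directory entries in the universe end with a trailing slash; root-path handling is left out.

```ruby
# A path that is resolved only against a known list of paths (the "universe"),
# never against the filesystem.
class SketchPath
  attr_reader :path, :universe

  def initialize(path, universe)
    @path = path
    @universe = universe
  end

  # A path exists only if the universe lists it.
  def exists?
    universe.include?(path)
  end

  # Assumption: directories are written with a trailing slash.
  def directory?
    path.end_with?('/')
  end

  # Parent directory, or nil for a top-level entry.
  def parent
    parent_path = path.chomp('/').rpartition('/').first
    return nil if parent_path.empty?

    self.class.new("#{parent_path}/", universe)
  end

  # Direct children: universe entries whose parent is this path.
  def children
    universe
      .map { |entry| self.class.new(entry, universe) }
      .select { |entry| entry.parent&.path == path }
  end
end
```

With a universe of `['docs/', 'docs/readme.md', 'docs/img/', 'docs/img/logo.png']`, the children of `docs/` are `docs/readme.md` and `docs/img/`, all computed from strings alone.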
Douwe Maan 4d64a32c88 Merge branch 'feature/ldap-sync-edgecases' into 'master'
LDAP Sync blocked user edgecases

Allow GitLab admins to block otherwise valid GitLab LDAP users
(https://gitlab.com/gitlab-org/gitlab-ce/issues/3462)

Based on the discussion on the original issue, we are going to differentiate "normal" block operations from the automatic LDAP ones, so that we can make decisions depending on which kind of block applies.

Expected behavior:

- [x] "ldap_blocked" users respond to both `blocked?` and `ldap_blocked?`
- [x] "ldap_blocked" users can't be unblocked by the Admin UI
- [x] "ldap_blocked" users can't be unblocked by the API
- [x] Block operations that originate from LDAP synchronization will flag user as "ldap_blocked"
- [x] Only "ldap_blocked" users will be automatically unblocked by LDAP synchronization
- [x] When LDAP identity is removed, we should convert `ldap_blocked` into `blocked`
 
Mockup for the Admin UI with both "ldap_blocked" and normal "blocked" users:
![image](/uploads/4f56fc17b73cb2c9e2a154a22e7ad291/image.png)

There will be another MR for the EE version.

See merge request !2242
2016-01-14 11:00:08 +00:00
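
The expected behavior listed in the merge request above can be modeled in plain Ruby as a rough sketch. The class and method names other than `blocked?` and `ldap_blocked?` are hypothetical, and the real change presumably lives in the `User` model's state handling rather than in a standalone class like this one.

```ruby
# Plain-Ruby model of the three states discussed: active, blocked, ldap_blocked.
class SketchUser
  attr_reader :state

  def initialize
    @state = :active
  end

  # "ldap_blocked" users respond to both predicates.
  def blocked?
    %i[blocked ldap_blocked].include?(state)
  end

  def ldap_blocked?
    state == :ldap_blocked
  end

  # Manual block by an admin. Overwriting an existing ldap_blocked state
  # here is an assumption; the merge request does not cover that case.
  def block
    @state = :blocked
  end

  # Block that originates from LDAP synchronization.
  def ldap_block
    @state = :ldap_blocked
  end

  # Manual unblock (Admin UI or API): refused for ldap_blocked users.
  def unblock
    return false if ldap_blocked? || !blocked?

    @state = :active
    true
  end

  # Automatic unblock during LDAP synchronization: only for ldap_blocked users.
  def ldap_unblock
    return false unless ldap_blocked?

    @state = :active
    true
  end

  # When the LDAP identity is removed, ldap_blocked becomes a normal block.
  def remove_ldap_identity
    @state = :blocked if ldap_blocked?
  end
end
```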
Gabriel Mazetto dd6fc01ff8 fixed LDAP activation on login to use new ldap_blocked state 2016-01-14 03:31:27 -02:00
Yorick Peterse 057eb824b5 Randomize metrics sample intervals
Sampling data at a fixed interval means we can potentially miss data
from events occurring between sampling intervals. For example, say we
sample data every 15 seconds but Unicorn workers get killed after 10
seconds. In this particular case it's possible to miss interesting data
as the sampler will never get to actually submitting data.

To work around this (at least for the most part) the sampling interval
is randomized as follows:

1. Take the user specified sampling interval (15 seconds by default)
2. Divide it by 2 (referred to as "half" below)
3. Generate a range (using a step of 0.1) from -"half" to "half"
4. Every time the sampler goes to sleep we'll grab the user provided
   interval and add a randomly chosen "adjustment" to it while making
   sure we don't pick the same value twice in a row.

For a specified timeout of 15 this means the actual intervals can be
anywhere between 7.5 and 22.5, but never can the same interval be used
twice in a row.

The rationale behind this change is that on dev.gitlab.org I'm sometimes
seeing certain Gitlab::Git/Rugged objects being retained, but only for a
few minutes every 24 hours. Knowing the code of Gitlab and how much
memory it uses/leaks I suspect we're missing data due to workers getting
terminated before the sampler can write its data to InfluxDB.
2016-01-13 12:57:46 +01:00
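
The four randomization steps in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the sampler's actual code: `RandomizedInterval` and `next_interval` are hypothetical names, and the interval is assumed to be positive so that more than one adjustment exists.

```ruby
# Produces sleep intervals of "interval plus a random adjustment",
# never using the same adjustment twice in a row.
class RandomizedInterval
  def initialize(interval = 15)
    raise ArgumentError, 'interval must be positive' unless interval.positive?

    @interval = interval
    half = interval.to_f / 2
    # All candidate adjustments from -half to half, in steps of 0.1.
    @adjustments = (-half).step(half, 0.1).to_a
    @last = nil
  end

  def next_interval
    adjustment = @adjustments.sample
    # Re-draw until the adjustment differs from the previous one.
    adjustment = @adjustments.sample while adjustment == @last
    @last = adjustment
    @interval + adjustment
  end
end
```

A sampler loop would then call something like `sleep(randomized.next_interval)` on each iteration, so that for an interval of 15 the sleep time falls between 7.5 and 22.5 seconds.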
Yorick Peterse 83ad5fa5cb Merge branch 'remove-application-frames-from-views' into 'master'
See merge request !2392
2016-01-12 15:44:57 +00:00
Yorick Peterse 355c341fe7 Stop tracking call stacks for instrumented views
Where a view is called from doesn't matter as much. We already know what
action they belong to and this is more than enough information. By
removing the file/line number from the list of tags we should also be
able to reduce the number of series stored in InfluxDB.
2016-01-12 15:41:22 +01:00
Yorick Peterse 5679ee0120 Track memory allocated during a transaction
This gives a very rough estimate of how much memory is allocated during
a transaction. This only works reliably when using a single-threaded
application server and a Ruby implementation with a GIL as otherwise
memory allocated by other threads might skew the statistics. Sadly
there's no way around this as Ruby doesn't provide a reliable way of
gathering accurate object sizes upon allocation on a per-thread basis.
2016-01-12 14:59:30 +01:00
Yorick Peterse 35b501f30a Tag all transaction metrics with an "action" tag
Without this it's impossible to find out what methods/views/queries are
executed by a certain controller or Sidekiq worker. While this will
increase the total number of series it should stay within reasonable
limits due to the amount of "actions" being small enough.
2016-01-11 16:51:01 +01:00
Robert Speicher af68897acd Merge branch 'api-project-upload' into 'master'
Add API project upload endpoint

Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/4317

See merge request !2329
2016-01-08 20:29:43 +00:00
Gabriel Mazetto d6dc088aff LDAP synchronization block/unblock new states 2016-01-08 16:26:04 -02:00
Stan Hu 69209612e1 Suppress e-mails on failed builds if allow_failure is set
Every time I push to GitLab, I get > 2 emails saying a spec failed when
I don't care about benchmarks and other specs that have `allow_failure` set to `true`.
2016-01-07 10:45:39 -08:00
Douwe Maan 1e927d39b4 Update spec 2016-01-07 15:51:12 +01:00
Yorick Peterse 7b10cb6f0f Store request methods/URIs as values
Since filtering by these values is very rare (they're mostly just
displayed as-is) we don't need to waste any index space by saving them
as tags. By storing them as values we also greatly reduce the number of
series in InfluxDB.
2016-01-07 13:05:00 +01:00
Yorick Peterse 364b07cff0 Removed UUIDs from metrics transactions
While useful for finding out what methods/views belong to a transaction
this might result in too much data being stored in InfluxDB.
2016-01-07 12:44:15 +01:00
Yorick Peterse 7ed3a5a240 Revert "Store SQL/view timings in milliseconds"
This reverts commit 7549102bb7.

Apparently I was wrong about
ActiveSupport::Notifications::Event#duration returning the duration in
seconds; it already returns it in milliseconds.
2016-01-07 11:47:06 +01:00
Yorick Peterse 7549102bb7 Store SQL/view timings in milliseconds
Transaction timings are also already stored in milliseconds, this keeps
things consistent.
2016-01-06 16:37:14 +01:00
Douglas Barbosa Alexandre 837a9065f0 Ensure that we're only importing local pull requests 2016-01-05 15:24:55 -02:00
Douglas Barbosa Alexandre 98909dd12c Generate separate comments when importing GitHub Issues into GitLab 2016-01-05 15:24:55 -02:00
Douglas Barbosa Alexandre dc72a8b305 Refactoring GithubImport::Importer 2016-01-05 15:24:55 -02:00
Yorick Peterse 8de491a68f Fix Rubocop styling in AR subscriber specs 2016-01-04 14:21:39 +01:00
Yorick Peterse 2ee8f55599 Automatically prefix transaction series names
This ensures Rails and Sidekiq transactions are split into the series
"rails_transactions" and "sidekiq_transactions" respectively.
2016-01-04 13:17:02 +01:00
Yorick Peterse 2ea464bb27 Use separate series for Rails/Sidekiq sample stats
This removes the need for any tags to differentiate between Sidekiq and
Rails statistics while still being able to separate the two.
2016-01-04 12:45:31 +01:00
Yorick Peterse 825b46f8a3 Track total method call times per transaction
This makes it easier to see where time is spent without having to
aggregate all the individual points in the method_calls series.
2016-01-04 12:19:45 +01:00
Yorick Peterse 66a997a914 Track total query/view timings in transactions 2016-01-04 12:14:36 +01:00
Yorick Peterse 96075be6f4 Ability to increment custom transaction values
This will be used to store/increment the total query/view rendering
timings on a per transaction basis. This in turn can greatly reduce the
amount of metrics stored.
2016-01-04 11:37:46 +01:00
Yorick Peterse cafc784ee1 Removed tracking of hostnames for metrics
This isn't hugely useful and mostly wastes InfluxDB space. We can re-add
this whenever needed (but only once we really need it).
2015-12-31 17:55:10 +01:00
Yorick Peterse bd9f86bb8a Use separate series for Rails/Sidekiq transactions
This removes the need for tagging all metrics with a "process_type" tag.
2015-12-31 17:52:51 +01:00
Yorick Peterse 55ed6e1c96 Cache InfluxDB settings after the first use
This ensures we don't need to load anything from either PostgreSQL or
the Rails cache whenever creating new InfluxDB connections.
2015-12-31 17:47:07 +01:00
Yorick Peterse a6c60127e3 Removed tracking of raw SQL queries
This particular setup had 3 problems:

1. Storing SQL queries as tags is very inefficient as InfluxDB ends up
   indexing every query (and they can get pretty large). Storing these
   as values instead means we can't always display the SQL as easily.
2. We already instrument ActiveRecord query methods, thus we already
   have timing information about database queries.
3. SQL obfuscation is difficult to get right and I'd rather not expose
   sensitive data by accident.
2015-12-31 17:14:02 +01:00
Yorick Peterse c936e4e3c8 Removed various default metrics tags
While it's useful to keep track of the different versions (Ruby, GitLab,
etc) doing so for every point wastes disk space and possibly also RAM
(which InfluxDB is all too eager to gobble up). If we want to see the
performance differences between different GitLab versions simply looking
at the performance since the last release date should suffice.
2015-12-31 11:26:04 +01:00
Yorick Peterse 620e7bb3d6 Write to InfluxDB directly via UDP
This removes the need for Sidekiq and any overhead/problems introduced
by TCP. There are a few things to take into account:

1. When writing data to InfluxDB you may still get an error if the
   server becomes unavailable during the write. Because of this we're
   catching all exceptions and just ignore them (for now).
2. Writing via UDP apparently requires the timestamp to be in
   nanoseconds. Without this, data isn't written properly.
3. Due to the restrictions on UDP buffer sizes we're writing metrics one
   by one, instead of writing all of them at once.
2015-12-29 14:53:45 +01:00
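
The three points in the commit above (ignored write errors, nanosecond timestamps, one metric per datagram) can be sketched with Ruby's standard `UDPSocket`. The function names are hypothetical, the host and port defaults are assumptions, and the line formatting handles only numeric values without any escaping.

```ruby
require 'socket'

# Nanoseconds since the Unix epoch for a Time value.
def nanoseconds(time)
  time.to_i * 1_000_000_000 + time.nsec
end

# Format one point in InfluxDB line protocol:
#   series,tag=value field=value timestamp
# Only numeric field values are handled; escaping is omitted for brevity.
def format_point(series, tags, values, time)
  tag_part = tags.map { |key, value| ",#{key}=#{value}" }.join
  value_part = values.map { |key, value| "#{key}=#{value}" }.join(',')
  "#{series}#{tag_part} #{value_part} #{nanoseconds(time)}"
end

# Send each point in its own datagram to stay within UDP buffer limits.
# Host and port are assumed defaults, not values taken from the commit.
def write_points(points, host: 'localhost', port: 8089)
  socket = UDPSocket.new
  points.each do |series, tags, values, time|
    begin
      socket.send(format_point(series, tags, values, time), 0, host, port)
    rescue StandardError
      # Write errors are deliberately ignored, as the commit describes.
      next
    end
  end
ensure
  socket&.close
end
```

Since UDP is connectionless, a metric that fails to send is simply dropped instead of blocking or failing the request that produced it.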
Yorick Peterse 03478e6d5b Strip newlines from obfuscated SQL
Newlines aren't really needed and they may mess with InfluxDB's line
protocol.
2015-12-29 13:40:08 +01:00
Yorick Peterse ed214a11ca Handle missing settings table for metrics
This ensures we can still boot, even when the "application_settings"
table doesn't exist.
2015-12-28 22:38:34 +01:00