gitlab-ce

Commit Graph

Author	SHA1	Message	Date
Paweł Chojnacki	746f0ec367	Add sidekiq metrics endpoint and add http server to sidekiq	2017-08-07 17:13:02 +00:00
Paweł Chojnacki	2c3d52161a	Update Prometheus gem to version that explicitly calls `munmap`	2017-07-19 08:54:39 +00:00
Pawel Chojnacki	2d0741e562	Rename ConnectionRackMiddleware to RequestsRackMiddleware. + fix tests after metrics rename	2017-07-13 00:46:17 +02:00
Ben Kochie	e9363229b7	Update rack metric names * Follow Prometheus naming conventions[0]. * Simplify metrics by adding response lables to the histogram. * Use standard `http_request_duration_seconds_...` names for the histogram. [0]: https://prometheus.io/docs/practices/naming/#metric-names	2017-07-12 14:37:08 +02:00
Paweł Chojnacki	26ac691a68	Instrument Unicorn with Ruby exporter	2017-07-04 15:28:34 +00:00
Grzegorz Bizon	0430b76441	Enable Style/DotPosition Rubocop 👮	2017-06-21 13:48:12 +00:00
Pawel Chojnacki	5f2dc999bd	use proper `if defined?` check.	2017-06-20 12:22:56 +02:00
Pawel Chojnacki	ed5c7d11b1	Do not enable prometheus metrics when data folder is not present. + Set defaults correctly only for when not in production or staging + set ENV['prometheus_multiproc_dir'] in config/boot.rb instead of config.ru Test prometheus metrics unmemoized	2017-06-19 18:52:23 +02:00
Pawel Chojnacki	7b75004d60	Add missing trailing newlines	2017-06-02 19:45:58 +02:00
Pawel Chojnacki	68b946e3c8	Fix circular dependency condition with `current_application_settings` `current_application_settings` used by `influx_metrics_enabled` executed a markdown parsing code that was measured using `Gitlab::Metrics.measure` But since the Gitlab::Metrics::InfluxDb was not yet build so Gitlab::Metrics did not yet have `measure` method. Causing the NoMethodError. However If run was successful at least once then result was cached in a file and this code never executed again. Which caused this issue to only show up in CI preparation step.	2017-06-02 19:45:58 +02:00
Pawel Chojnacki	b668aaf426	Split the metrics implementation to separate modules for Influx and Prometheus	2017-06-02 19:45:58 +02:00
Pawel Chojnacki	ae8f7666e5	Add prometheus text formatter + rename controler method to #index from #metrics + remove assertion from nullMetric	2017-06-02 19:45:58 +02:00
Pawel Chojnacki	c134a72cdb	Move Prometheus presentation logic to PrometheusText + Use NullMetrics to mock metrics when unused + Use method_missing in NullMetrics mocking + Update prometheus gem to version that correctly uses transitive dependencies + Ensure correct folders are used in Multiprocess prometheus client tests. + rename Sessions controller's metric	2017-06-02 19:45:58 +02:00
Pawel Chojnacki	138a5577a9	remove prometheus sampler	2017-06-02 19:45:58 +02:00
Pawel Chojnacki	5bc099c2de	Prometheus metrics first pass metrics wip	2017-06-02 19:45:57 +02:00
Douwe Maan	1fe7501b49	Revert "Prefer leading style for Style/DotPosition" This reverts commit cb10b725c8929b8b4460f89c9d96c773af39ba6b.	2017-02-23 09:33:05 -06:00
Douwe Maan	206953a430	Prefer leading style for Style/DotPosition	2017-02-23 09:32:22 -06:00
Douwe Maan	5c7f9d69e3	Fix code for cops	2017-02-23 09:31:57 -06:00
Douwe Maan	8a4d68c53e	Enable Style/ConditionalAssignment	2017-02-23 09:31:57 -06:00
Douwe Maan	b7d8df503c	Enable Style/MutableConstant	2017-02-23 09:31:56 -06:00
Douwe Maan	f40716f48a	No more and/or	2017-02-21 16:31:14 -06:00
Rémy Coutable	892ff3a3ae	Check for env[Grape::Env::GRAPE_ROUTING_ARGS] instead of endpoint.route `endpoint.route` is calling `env[Grape::Env::GRAPE_ROUTING_ARGS][:route_info]` but `env[Grape::Env::GRAPE_ROUTING_ARGS]` is `nil` in the case of a 405 response Signed-off-by: Rémy Coutable <remy@rymai.me>	2017-01-12 23:15:25 -05:00
Rémy Coutable	c28b0a539d	Don't instrument 405 Grape calls Fixes #26051. Signed-off-by: Rémy Coutable <remy@rymai.me>	2017-01-09 10:02:52 +01:00
Rémy Coutable	90c6a1a319	Use Grape's new Route methods - Use Route#request_method instead of Route#route_method - Use Route#path instead of Route#route_path Signed-off-by: Rémy Coutable <remy@rymai.me>	2016-12-21 13:46:52 +01:00
Paco Guzman	a5079f68e6	Adds response mime type to transaction metric action when it's not HTML	2016-08-25 16:33:41 +02:00
Yorick Peterse	d345591fc8	Tracking of custom events GitLab Performance Monitoring is now able to track custom events not directly related to application performance. These events include the number of tags pushed, repositories created, builds registered, etc. The use of these events is to get a better overview of how a GitLab instance is used and how that may affect performance. For example, a large number of Git pushes may have a negative impact on the underlying storage engine. Events are stored in the "events" measurement and are not prefixed with "rails_" or "sidekiq_", this makes it easier to query events with the same name triggered from different parts of the application. All events being stored in the same measurement also makes it easier to downsample data. Currently the following events are tracked: * Creating repositories * Removing repositories * Changing the default branch of a repository * Pushing a new tag * Removing an existing tag * Pushing a commit (along with the branch being pushed to) * Pushing a new branch * Removing an existing branch * Importing a repository (along with the URL we're importing) * Forking a repository (along with the source/target path) * CI builds registered (and when no build could be found) * CI builds being updated * Rails and Sidekiq exceptions Fixes gitlab-org/gitlab-ce#13720	2016-08-17 10:04:04 +02:00
Yorick Peterse	905f8d763a	Reduce instrumentation overhead This reduces the overhead of the method instrumentation code primarily by reducing the number of method calls. There are also some other small optimisations such as not casting timing values to Floats (there's no particular need for this), using Symbols for method call metric names, and reducing the number of Hash lookups for instrumented methods. The exact impact depends on the code being executed. For example, for a method that's only called once the difference won't be very noticeable. However, for methods that are called many times the difference can be more significant. For example, the loading time of a large commit (nrclark/dummy_project@81ebdea5df) was reduced from around 19 seconds to around 15 seconds using these changes.	2016-07-28 16:56:17 +02:00
Paco Guzman	330de255b7	RailsCache metrics now includes fetch_hit/fetch_miss and read_hit/read_miss info.	2016-07-05 12:28:06 +02:00
Paco Guzman	e9a4d117f2	Instrument cache fetch hit and cache fetch misses	2016-07-05 12:28:06 +02:00
Yorick Peterse	d7b4f36a3c	Use clock_gettime for all performance timestamps Process.clock_gettime allows getting the real time in nanoseconds as well as allowing one to get a monotonic timestamp. This offers greater accuracy without the overhead of having to allocate a Time instance. In general using Time.now/Time.new is about 2x slower than using Process.clock_gettime(). For example: require 'benchmark/ips' Benchmark.ips do |bench| bench.report 'Time.now' do Time.now.to_f end bench.report 'clock_gettime' do Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond) end bench.compare! end Running this benchmark gives: Calculating ------------------------------------- Time.now 108.052k i/100ms clock_gettime 125.984k i/100ms ------------------------------------------------- Time.now 2.343M (± 7.1%) i/s - 11.670M clock_gettime 4.979M (± 0.8%) i/s - 24.945M Comparison: clock_gettime: 4979393.8 i/s Time.now: 2342986.8 i/s - 2.13x slower Another benefit of using Process.clock_gettime() is that we can simplify the code a bit since it can give timestamps in nanoseconds out of the box.	2016-06-28 17:51:25 +02:00
Paco Guzman	9101915cb7	Add Sidekiq queue duration to transaction metrics.	2016-06-23 13:09:52 +02:00
Yorick Peterse	be3b878443	Track method call times/counts as a single metric Previously we'd create a separate Metric instance for every method call that would exceed the method call threshold. This is problematic because it doesn't provide us with information to accurately get the _total_ execution time of a particular method. For example, if the method "Foo#bar" was called 4 times with a runtime of ~10 milliseconds we'd end up with 4 different Metric instances. If we were to then get the average/95th percentile/etc of the timings this would be roughly 10 milliseconds. However, the _actual_ total time spent in this method would be around 40 milliseconds. To solve this problem we now create a single Metric instance per method. This Metric instance contains the _total_ real/CPU time and the call count for every instrumented method.	2016-06-17 13:09:55 -04:00
Paco Guzman	2e552c6bf0	Filter out sensitive parameters of metrics data	2016-06-17 18:14:25 +02:00
Yorick Peterse	ab91f1226f	Filter out classes without names in the sampler We can't do a lot with classes without names as we can't filter by them, have no idea where they come from, etc. As such it's best to just ignore these.	2016-06-14 18:09:06 +02:00
Yorick Peterse	0ca7b3ba37	Merge branch '18449-instrument-grape-endpoints' into 'master' Instrument Grape API endpoints See merge request !4587	2016-06-14 14:29:55 +00:00
Paco Guzman	dadc531353	Instrument private/protected methods By default instrumentation will instrument public, protected and private methods, because usually heavy work is done on private method or at least that’s what facts is showing	2016-06-14 15:17:51 +02:00
Paco Guzman	509082bafb	Instrument Grape Endpoint with Metrics::RackMiddleware Generating the following tags Grape#GET /projects/:id/archive from Grape::Route objects like { :path => /:version/projects/:id/archive(.:format) :version => “v3”, :method => “GET” } Use an instance variable to cache raw_path transformations. This variable is only going to growth to the number of endpoints of the API, not with exact different requests We can store this cache as an instance variable because middleware are initialised only once	2016-06-14 13:06:46 +02:00
Paco Guzman	120fbbd487	Measure CPU time for instrumented methods	2016-06-14 12:49:31 +02:00
Pablo Carranza	b9306c2e82	Add cache count metrics to rails cache	2016-05-15 19:47:41 +01:00
Yorick Peterse	945c5b3fe6	Removed tracking of total method execution times Because method call timings are inclusive (that is, they include the time of any sub method calls) this would lead to the total method execution time often being far greater than the total transaction time. Because this is incredibly confusing it's best to simply _not_ track the total method execution time, after all it's not that useful to begin with. Fixes gitlab-org/gitlab-ce#17239	2016-05-12 15:15:45 +02:00
Yorick Peterse	7e6f0ac0e0	Count the number of SQL queries per transaction Fixes gitlab-org/gitlab-ce#15335	2016-04-18 14:53:13 +02:00
Yorick Peterse	7b6785b3b1	Use Module#prepend for method instrumentation By using Module#prepend we can define a Module containing all proxy methods. This removes the need for setting up crazy method alias chains and in turn prevents us from having to deal with all that madness (e.g. methods calling each other recursively). Fixes gitlab-org/gitlab-ce#15281	2016-04-18 11:16:31 +02:00
Yorick Peterse	16926a676b	Store block timings as transaction values This makes it easier to query, simplifies the code, and makes it possible to figure out what transaction the data belongs to (simply because it's now stored _in_ the transaction). This new setup keeps track of both the real/wall time _and_ CPU time spent in a block, both measured using milliseconds (to keep all units the same).	2016-04-11 13:09:36 +02:00
Yorick Peterse	833808d737	Merge branch 'instrument-rails-cache' into 'master' Instrument Rails cache code See merge request !3619	2016-04-08 20:52:21 +00:00
Yorick Peterse	c56f702ec3	Instrument Rails cache code This allows us to track how much time of a transaction is spent in dealing with cached data.	2016-04-08 17:54:52 +02:00
Yorick Peterse	aa7cddc4fc	Use more accurate timestamps for InfluxDB. This changes the timestamp of metrics to be more accurate/unique by using Time#to_f combined with a small random jitter value. This combination hopefully reduces the amount of collisions, though there's no way to fully prevent any from occurring. Fixes gitlab-com/operations#175	2016-04-08 16:39:44 +02:00
Yorick Peterse	b74308c0a7	Correct arity for instrumented methods w/o args This ensures that an instrumented method that doesn't take arguments reports an arity of 0, instead of -1. If Ruby had a proper method for finding out the required arguments of a method (e.g. Method#required_arguments) this would not have been an issue. Sadly the only two methods we have are Method#parameters and Method#arity, and both are equally painful to use. Fixes gitlab-org/gitlab-ce#12450	2016-01-25 21:28:59 +01:00
Yorick Peterse	057eb824b5	Randomize metrics sample intervals Sampling data at a fixed interval means we can potentially miss data from events occurring between sampling intervals. For example, say we sample data every 15 seconds but Unicorn workers get killed after 10 seconds. In this particular case it's possible to miss interesting data as the sampler will never get to actually submitting data. To work around this (at least for the most part) the sampling interval is randomized as following: 1. Take the user specified sampling interval (15 seconds by default) 2. Divide it by 2 (referred to as "half" below) 3. Generate a range (using a step of 0.1) from -"half" to "half" 4. Every time the sampler goes to sleep we'll grab the user provided interval and add a randomly chosen "adjustment" to it while making sure we don't pick the same value twice in a row. For a specified timeout of 15 this means the actual intervals can be anywhere between 7.5 and 22.5, but never can the same interval be used twice in a row. The rationale behind this change is that on dev.gitlab.org I'm sometimes seeing certain Gitlab::Git/Rugged objects being retained, but only for a few minutes every 24 hours. Knowing the code of Gitlab and how much memory it uses/leaks I suspect we're missing data due to workers getting terminated before the sampler can write its data to InfluxDB.	2016-01-13 12:57:46 +01:00
Yorick Peterse	2367160015	Make the metrics sampler interval configurable	2016-01-13 12:29:48 +01:00
Yorick Peterse	83ad5fa5cb	Merge branch 'remove-application-frames-from-views' into 'master' See merge request !2392	2016-01-12 15:44:57 +00:00
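
The randomized sampling interval described in 057eb824b5 can be sketched roughly as follows. This is a minimal illustration built only from the four steps in that commit message, not the code from the commit itself; the class name `RandomizedInterval` and the method `next_interval` are hypothetical.

```ruby
# Hypothetical sketch of the interval randomization described in 057eb824b5.
# Names are illustrative and not taken from the GitLab code base.
class RandomizedInterval
  def initialize(interval = 15)
    # 1. Take the user-specified sampling interval (15 seconds by default).
    @interval = interval.to_f
    # 2. Divide it by 2.
    half = @interval / 2
    # 3. Generate the candidate adjustments from -half to +half in steps of 0.1.
    @adjustments = (-half).step(half, 0.1).map { |value| value.round(1) }
    @last = nil
  end

  # 4. Add a randomly chosen adjustment to the interval, never picking the
  #    same adjustment twice in a row.
  def next_interval
    adjustment = @adjustments.sample
    adjustment = @adjustments.sample while adjustment == @last
    @last = adjustment
    @interval + adjustment
  end
end

sampler = RandomizedInterval.new(15)
3.times { puts sampler.next_interval }
```

With the default of 15 seconds, each returned value falls between 7.5 and 22.5, matching the range given in the commit message.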


83 Commits
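
Several of the commits above concern method instrumentation: 7b6785b3b1 switches to `Module#prepend`, be3b878443 tracks a single metric per method with a total time and call count, and d7b4f36a3c moves timestamps to `Process.clock_gettime`. The sketch below shows how those three ideas might fit together. It assumes a modern Ruby (keyword-argument forwarding as in Ruby 3), and the `Instrumentation` module, its `timings` store, and the `Foo` class are hypothetical names rather than GitLab's actual API.

```ruby
# Hypothetical sketch: prepend-based instrumentation with one entry per method.
module Instrumentation
  # One entry per instrumented method, holding the call count and total real time.
  def self.timings
    @timings ||= Hash.new { |hash, key| hash[key] = { calls: 0, real_ms: 0.0 } }
  end

  # Wraps klass#name in a proxy module so no alias chains are needed.
  def self.instrument(klass, name)
    label = "#{klass.name}##{name}"

    proxy = Module.new do
      define_method(name) do |*args, **kwargs, &block|
        start = Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
        begin
          # Explicit arguments are required when calling super from define_method.
          super(*args, **kwargs, &block)
        ensure
          finish = Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
          entry = Instrumentation.timings[label]
          entry[:calls] += 1
          entry[:real_ms] += finish - start
        end
      end
    end

    klass.prepend(proxy)
  end
end

class Foo
  def bar(value)
    value * 2
  end
end

Instrumentation.instrument(Foo, :bar)
4.times { Foo.new.bar(1) }
puts Instrumentation.timings["Foo#bar"][:calls]
```

Because the proxy module sits in front of the class in the ancestor chain, `super` reaches the original method directly, which is the property the `Module#prepend` commit relies on to avoid recursive alias chains.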