Previously, prometheus_notifications_errors_total was incremented by
one whenever a batch of alerts was affected by an error during sending
to a specific alertmanager. However, the corresponding metric
prometheus_notifications_sent_total, counting all alerts that were
sent (including those where the send ended in error), is incremented
by the batch size, i.e. the number of alerts.
Therefore, the ratio used in the mixin for the
PrometheusErrorSendingAlertsToSomeAlertmanagers alert is inconsistent.
This commit changes the increment of
prometheus_notifications_errors_total to the number of alerts that
were sent in the attempt that ended in an error. It also adjusts the
metric's help string accordingly and makes the wording in the alert in
the mixin more precise.
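To illustrate why both counters need the same unit, here is a minimal sketch in Go. The `notifierCounters` type and its methods are hypothetical stand-ins, not the notifier's actual code; they only model the two counters named above.

```go
package main

import (
	"errors"
	"fmt"
)

// errSend is an example error used to simulate a failed send.
var errSend = errors.New("alertmanager unreachable")

// notifierCounters is a hypothetical stand-in for the two counters
// discussed above; it is not the Prometheus notifier's real type.
type notifierCounters struct {
	sent   int // models prometheus_notifications_sent_total
	errors int // models prometheus_notifications_errors_total
}

// recordSend accounts for one attempt to send a batch of alerts.
// Both counters move by the batch size, so errors/sent is a ratio of alerts.
func (c *notifierCounters) recordSend(batchSize int, err error) {
	c.sent += batchSize
	if err != nil {
		c.errors += batchSize // the old behavior corresponds to += 1 here
	}
}

// errorRatio returns the fraction of alerts whose send attempt failed.
func (c *notifierCounters) errorRatio() float64 {
	if c.sent == 0 {
		return 0
	}
	return float64(c.errors) / float64(c.sent)
}

func main() {
	var c notifierCounters
	c.recordSend(10, nil)
	c.recordSend(10, errSend)
	fmt.Printf("error ratio: %.2f\n", c.errorRatio()) // prints "error ratio: 0.50"
}
```

With the old increment of one per failed batch, the same two sends would yield 1/20 instead of 10/20, which understates the share of affected alerts.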
Signed-off-by: beorn7 <beorn@grafana.com>
* CHANGELOG - scraping change introduced in 2.52.0
Change introduced in #12933; old behavior partially recovered in #14685
Signed-off-by: Konrad <zuo.zp8@gmail.com>
Remote-write creates several shards to parallelise sending, each with
its own HTTP connection. We do not want them all combined onto one
socket by HTTP/2.
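As a rough sketch of the idea (not necessarily how this commit does it), Go's `net/http` lets a client opt out of HTTP/2 by giving the transport a non-nil, empty `TLSNextProto` map. The names `newShardClient` and `http2Disabled` are hypothetical.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newShardClient returns a client with its own transport and with HTTP/2
// disabled, so requests from different shards are not multiplexed onto a
// single connection. The function name is hypothetical.
func newShardClient() *http.Client {
	tr := &http.Transport{
		// A non-nil, empty TLSNextProto map disables HTTP/2 in net/http.
		TLSNextProto:      map[string]func(string, *tls.Conn) http.RoundTripper{},
		ForceAttemptHTTP2: false,
	}
	return &http.Client{Transport: tr}
}

// http2Disabled reports whether the client's transport opts out of HTTP/2
// in the way newShardClient configures it.
func http2Disabled(c *http.Client) bool {
	tr, ok := c.Transport.(*http.Transport)
	return ok && tr.TLSNextProto != nil && len(tr.TLSNextProto) == 0 && !tr.ForceAttemptHTTP2
}

func main() {
	const shards = 3
	clients := make([]*http.Client, shards)
	for i := range clients {
		clients[i] = newShardClient()
	}
	fmt.Println("clients:", len(clients), "HTTP/2 disabled:", http2Disabled(clients[0]))
}
```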
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Fix some edge cases when OOO is enabled
Signed-off-by: Vanshikav123 <vanshikav928@gmail.com>
Signed-off-by: Vanshika <102902652+Vanshikav123@users.noreply.github.com>
Signed-off-by: Jesus Vazquez <jesusvzpg@gmail.com>
Co-authored-by: Jesus Vazquez <jesusvzpg@gmail.com>
scrape: Remove implicit fallback to the Prometheus text format
Remove implicit fallback to the Prometheus text format in case of invalid/missing Content-Type and fail the scrape instead. Add ability to specify a `fallback_scrape_protocol` in the scrape config.
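The decision described above can be sketched as follows. This is a simplified illustration, not the scrape code itself: `chooseFormat` is a hypothetical helper, and the fallback is passed as a plain media type string rather than a value of the `fallback_scrape_protocol` setting.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
)

// errNoProtocol is returned when neither the response nor the configuration
// identifies a format, which makes the scrape fail.
var errNoProtocol = errors.New("invalid or missing Content-Type and no fallback protocol configured")

// chooseFormat returns the media type to parse the scrape body with.
// An empty fallback means that no fallback protocol is configured.
func chooseFormat(contentType, fallback string) (string, error) {
	if mediaType, _, err := mime.ParseMediaType(contentType); err == nil {
		return mediaType, nil
	}
	if fallback != "" {
		return fallback, nil // explicit, configured fallback
	}
	return "", errNoProtocol // no implicit fallback to the text format
}

func main() {
	for _, tc := range []struct{ contentType, fallback string }{
		{"application/openmetrics-text; version=1.0.0", ""},
		{"", "text/plain"},
		{"", ""},
	} {
		format, err := chooseFormat(tc.contentType, tc.fallback)
		fmt.Printf("content-type=%q fallback=%q -> format=%q err=%v\n",
			tc.contentType, tc.fallback, format, err)
	}
}
```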
---------
Signed-off-by: alexgreenbank <alex.greenbank@grafana.com>
Signed-off-by: Alex Greenbank <alex.greenbank@grafana.com>
Co-authored-by: Björn Rabenstein <beorn@grafana.com>
promql: Fix stddev/stdvar when aggregating histograms, NaNs, and Infs
Native histograms are ignored when calculating stddev or stdvar.
However, for the first series of each group, a `groupedAggregation` is
always created. If the first series that was encountered is a histogram
then it acts as the equivalent of a 0 point.
This change creates the first `groupedAggregation` with the `seen` field set to `false` if the point is a
histogram, thus ignoring it like the rest of the aggregation function does. A new `groupedAggregation`
will then be created once an actual float value is encountered.
This commit also sets the `floatValue` field of the `groupedAggregation` to `NaN`, if the first
float value of a group is `NaN` or `±Inf`, so that the outcome is consistently `NaN` once those
values are in the mix.
(The added tests fail without this change).
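The behavior can be sketched with a small standalone function. This is only an illustration under simplifying assumptions: `point` and `stdvar` are hypothetical, and the running-variance update stands in for whatever the engine's `groupedAggregation` does internally.

```go
package main

import (
	"fmt"
	"math"
)

// posInf is an example non-finite input value.
var posInf = math.Inf(1)

// point is a hypothetical sample: either a float or a native histogram.
type point struct {
	f           float64
	isHistogram bool
}

// stdvar returns the population variance of the float samples and reports
// whether any float sample was seen. Histograms are skipped, including
// when one is the first sample of the group.
func stdvar(points []point) (float64, bool) {
	var (
		seen            bool
		count, mean, m2 float64
	)
	for _, p := range points {
		if p.isHistogram {
			continue // never acts as an implicit 0 point
		}
		if !seen {
			seen, count, mean = true, 1, p.f
			if math.IsNaN(p.f) || math.IsInf(p.f, 0) {
				m2 = math.NaN() // keeps the outcome NaN from here on
			}
			continue
		}
		count++
		delta := p.f - mean
		mean += delta / count
		m2 += delta * (p.f - mean)
	}
	if !seen {
		return 0, false
	}
	return m2 / count, true
}

func main() {
	v, ok := stdvar([]point{{isHistogram: true}, {f: 1}, {f: 3}})
	fmt.Println(v, ok)
	v, ok = stdvar([]point{{f: posInf}, {f: 1}})
	fmt.Println(v, ok)
}
```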
---------
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
Signed-off-by: beorn7 <beorn@grafana.com>
Co-authored-by: beorn7 <beorn@grafana.com>
The `info` function is an experiment to improve UX
around including labels from info metrics.
`info` has to be enabled via the feature flag `--enable-feature=promql-experimental-functions`.
This MVP of `info` simplifies the implementation by assuming:
* Only the `target_info` metric is supported
* `target_info`'s identifying labels are `job` and `instance`
Also:
* Encode info samples' original timestamp as sample value
* Deduce info series select hints from top-most VectorSelector
---------
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Ying WANG <ying.wang@grafana.com>
Co-authored-by: Augustin Husson <augustin.husson@amadeus.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Co-authored-by: Björn Rabenstein <github@rabenste.in>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Go's sorting functions can re-order equal elements, so the strategy of
sorting by the fallback ordering first does not always work.
Pulling the fallback into the main comparison function is more reliable
and more efficient.
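A minimal sketch of the approach, using a hypothetical `sample` type rather than the types touched by this commit: the tie-breaker lives inside the one comparison function, so the result does not depend on the sort being stable.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// sample is a hypothetical element with a primary and a fallback sort key.
type sample struct {
	name  string  // fallback ordering key
	value float64 // primary ordering key
}

// compareSamples orders by value first and falls back to name on ties.
func compareSamples(a, b sample) int {
	if c := cmp.Compare(a.value, b.value); c != 0 {
		return c
	}
	return cmp.Compare(a.name, b.name)
}

// sortSamples sorts in place with a single pass over one comparison function.
func sortSamples(s []sample) {
	slices.SortFunc(s, compareSamples)
}

func main() {
	s := []sample{{"c", 1}, {"a", 2}, {"b", 1}, {"a", 1}}
	sortSamples(s)
	fmt.Println(s) // prints "[{a 1} {b 1} {c 1} {a 2}]"
}
```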
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
* Make rate possible non-counter annotation consistent
Previously a PossibleNonCounterInfo annotation would be left in cases
where a range-vector selects 1 float data point, even if no more points
are selected in order to calculate a rate.
This change ensures an output float exists before emitting such an
annotation.
This fixes an inconsistency where a series with mixed data (i.e., a float
and a native histogram) would emit an annotation without any points.
For example,
```
load 1m
series{label="a"} 1 {{schema:1 sum:10 count:5 buckets:[1 2 3]}}
eval instant at 1m rate(series[1m1s])
```
Would have a PossibleNonCounterInfo annotation.
Whereas
```
load 1m
series{label="a"} {{schema:1 sum:10 count:5 buckets:[1 2 3]}} {{schema:1 sum:15 count:10 buckets:[1 2 3]}}
eval instant at 1m rate(series[1m1s])
```
Would not.
---------
Signed-off-by: Joshua Hesketh <josh@nitrotech.org>
Conflicts:
cmd/prometheus/main.go
docs/command-line/prometheus.md
docs/feature_flags.md
web/ui/build_ui.sh
web/web.go
Resolved by dropping the UTF-8 feature flag and adding the
`auto-reload-config` feature flag.
For the new web ui pick all changes from `main`.
This commit removes support for the following API versions:
* `discovery.k8s.io/v1beta1` API version of EndpointSlice (no longer
served as of v1.25).
* `networking.k8s.io/v1beta1` API version of Ingress (no longer served
as of v1.22).
Closes #12884
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Add the ability to adjust the `GOGC` variable from the Prometheus
configuration file.
* Create a new top-level `runtime` section in the config.
* Adjust from the Go default of 100 to 50 to reduce wasted memory.
* Use the `GOGC` env value if no configuration is used.
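The precedence implied by the list above might look like the sketch below. `resolveGOGC` and `defaultGOGC` are hypothetical names, and the handling of `GOGC=off` is an assumption based on how the Go runtime treats that value; `debug.SetGCPercent` is the standard library call for changing the setting at run time.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
)

// defaultGOGC is the new default described above (Go's own default is 100).
const defaultGOGC = 50

// resolveGOGC picks the GC percentage: the configured value wins, then the
// GOGC environment value, then the default. A nil pointer means "not set".
func resolveGOGC(configured *int, env string) int {
	if configured != nil {
		return *configured
	}
	if env == "off" {
		return -1 // assumption: mirror the runtime, where "off" disables GC
	}
	if v, err := strconv.Atoi(env); err == nil {
		return v
	}
	return defaultGOGC
}

func main() {
	percent := resolveGOGC(nil, os.Getenv("GOGC"))
	previous := debug.SetGCPercent(percent)
	fmt.Printf("GC percent changed from %d to %d\n", previous, percent)
}
```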
Signed-off-by: SuperQ <superq@gmail.com>
* [PATCH] Allow having evaluation delay for rule groups
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* [PATCH] Fix lint
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* [PATCH] Move the option to ManagerOptions
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* [PATCH] Include evaluation_delay in the group config
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
* Fix comments
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Add a server configuration option.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Appease the linter #1
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Add the new server flag documentation
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Improve documentation of the new flag and configuration
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Use named parameters for clarity on the `Rule` interface
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Add `initial` to the flag help
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Change the CHANGELOG area from `ruler` to `rules`
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Rename evaluation_delay to `rule_query_offset`/`query_offset` and make it a global configuration option.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* more docs
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Improve wording on CHANGELOG
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Add `RuleQueryOffset` to the default config in tests in case it changes
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Update docs/configuration/recording_rules.md
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Rename `RuleQueryOffset` to `QueryOffset` when in the group context.
Signed-off-by: gotjosh <josue.abreu@gmail.com>
* Improve docstring and documentation on the `rule_query_offset`
Signed-off-by: gotjosh <josue.abreu@gmail.com>
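As a rough sketch of what the offset means for a rule evaluation: the query runs at the evaluation time minus the offset. The helper names are hypothetical, and the rule that a group-level `query_offset` takes precedence over the global `rule_query_offset` is an assumption of this example.

```go
package main

import (
	"fmt"
	"time"
)

// effectiveOffset returns the group-level offset if one is set and the
// global offset otherwise. A nil pointer means "not set for this group".
func effectiveOffset(global time.Duration, group *time.Duration) time.Duration {
	if group != nil {
		return *group
	}
	return global
}

// queryTime returns the timestamp at which the rule's query is evaluated.
func queryTime(evalTime time.Time, offset time.Duration) time.Time {
	return evalTime.Add(-offset)
}

// example evaluates at a fixed time with a global offset of 1m, once
// without and once with a group-level offset of 30s.
func example() (withGlobal, withGroup time.Time) {
	evalTime := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	global := time.Minute
	group := 30 * time.Second
	withGlobal = queryTime(evalTime, effectiveOffset(global, nil))
	withGroup = queryTime(evalTime, effectiveOffset(global, &group))
	return withGlobal, withGroup
}

func main() {
	withGlobal, withGroup := example()
	fmt.Println("global offset only:", withGlobal.Format(time.TimeOnly))
	fmt.Println("group offset set:  ", withGroup.Format(time.TimeOnly))
}
```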
---------
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: gotjosh <josue.abreu@gmail.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Julius Volz <julius.volz@gmail.com>
Add method `PostingsForLabelMatching` to `tsdb.IndexReader`, to obtain postings for labels with a certain name and values accepted by a provided callback, and use it from `tsdb.PostingsForMatchers`.
The intention is to optimize regexp matcher paths, especially not having to load all label values before matching on them.
Plus tests, and refactor some `tsdb/index.Reader` methods.
Benchmarking shows a memory reduction of up to ~100% and a speedup of up to ~50%.
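The callback-based lookup can be pictured with a toy inverted index. This is not the `tsdb` implementation: `index` and `postingsForLabelMatching` are hypothetical, and a real index reads values from disk rather than from a map. The point is only that each value is tested as it is visited instead of first collecting all values into a slice.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// index is a toy inverted index: label name -> label value -> series IDs.
type index map[string]map[string][]uint64

// postingsForLabelMatching returns the sorted, deduplicated series IDs of
// all values of the given label that the match callback accepts.
func (ix index) postingsForLabelMatching(name string, match func(value string) bool) []uint64 {
	var out []uint64
	for value, ids := range ix[name] {
		if match(value) { // test each value as it is visited
			out = append(out, ids...)
		}
	}
	slices.Sort(out)
	return slices.Compact(out)
}

func main() {
	ix := index{
		"job": {"api-a": {1, 3}, "api-b": {2, 3}, "db": {4}},
	}
	ids := ix.postingsForLabelMatching("job", func(v string) bool {
		return strings.HasPrefix(v, "api-")
	})
	fmt.Println(ids) // prints "[1 2 3]"
}
```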
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>