[[search-aggregations-metrics-avg-aggregation]]
=== Avg aggregation
++++
<titleabbrev>Avg</titleabbrev>
++++

A `single-value` metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents. These values can be extracted from specific numeric or <<histogram,histogram>> fields in the documents.

Assuming the data consists of documents representing exam grades (between 0
and 100) of students, we can average their scores with:

[source,console]
--------------------------------------------------
POST /exams/_search?size=0
{
  "aggs": {
    "avg_grade": { "avg": { "field": "grade" } }
  }
}
--------------------------------------------------
// TEST[setup:exams]

The above aggregation computes the average grade over all documents. The aggregation type is `avg`, and the `field` setting defines the numeric field on which the average is computed. The request returns the following:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "avg_grade": {
      "value": 75.0
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

The name of the aggregation (`avg_grade` above) also serves as the key by which the aggregation result can be retrieved from the returned response.

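As a minimal sketch of that lookup, assuming the response body has already been parsed into a Python dictionary (the `response` literal below is a trimmed, hand-written copy of the result shown above, not the output of a real client call):

```python
# Trimmed copy of the response shown above. A real response also carries
# the fields elided by "..." (took, timed_out, _shards, hits).
response = {
    "aggregations": {
        "avg_grade": {"value": 75.0}
    }
}

# The aggregation name chosen in the request is the lookup key in the response.
avg_grade = response["aggregations"]["avg_grade"]["value"]
print(avg_grade)
```

Renaming the aggregation in the request changes only the key used in this lookup.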

==== Script

Let's say the exam was exceedingly difficult, and you need to apply a grade correction. Average a <<runtime,runtime field>> to get a corrected average:

[source,console]
----
POST /exams/_search?size=0
{
  "runtime_mappings": {
    "grade.corrected": {
      "type": "double",
      "script": {
        "source": "emit(Math.min(100, doc['grade'].value * params.correction))",
        "params": {
          "correction": 1.2
        }
      }
    }
  },
  "aggs": {
    "avg_corrected_grade": {
      "avg": {
        "field": "grade.corrected"
      }
    }
  }
}
----
// TEST[setup:exams]
// TEST[s/size=0/size=0&filter_path=aggregations/]
////
[source,console-result]
----
{
  "aggregations": {
    "avg_corrected_grade": {
      "value": 80.0
    }
  }
}
----
////
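The arithmetic behind the runtime field can be sketched outside Elasticsearch. The two grades below are hypothetical and are not taken from the `exams` index; the `corrected` helper is an illustrative stand-in for the script's `emit` expression, not part of any API.

```python
# Hypothetical grades, used only to illustrate the calculation.
grades = [50.0, 100.0]
correction = 1.2  # same value as params.correction in the request


def corrected(grade: float, factor: float) -> float:
    """Scale a grade by the correction factor and cap the result at 100."""
    return min(100.0, grade * factor)


# One corrected value per document, as the runtime field would emit.
corrected_grades = [corrected(g, correction) for g in grades]

# The avg aggregation then averages the emitted values.
avg_corrected_grade = sum(corrected_grades) / len(corrected_grades)
print(avg_corrected_grade)
```

The cap matters: without `min`, a grade of 100 would be corrected to 120 and pull the average above the valid range.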

==== Missing value

The `missing` parameter defines how documents that are missing a value should be treated.
By default they are ignored, but it is also possible to treat them as if they
had a value.

[source,console]
--------------------------------------------------
POST /exams/_search?size=0
{
  "aggs": {
    "grade_avg": {
      "avg": {
        "field": "grade",
        "missing": 10     <1>
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:exams]

<1> Documents without a value in the `grade` field are treated as if they had the value `10` when the average is computed.

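To see how much the `missing` setting can move the result, here is a small sketch of the two behaviors. The `avg_with_missing` helper and the sample grades are hypothetical; `None` stands in for a document that has no `grade` value.

```python
from typing import Optional, Sequence


def avg_with_missing(
    values: Sequence[Optional[float]], missing: Optional[float] = None
) -> Optional[float]:
    """Average values, skipping None unless a substitute is given."""
    if missing is None:
        # Default behavior: documents without a value are ignored.
        kept = [v for v in values if v is not None]
    else:
        # With a substitute: documents without a value count as `missing`.
        kept = [missing if v is None else v for v in values]
    if not kept:
        return None  # nothing to average in this sketch
    return sum(kept) / len(kept)


# Hypothetical grades; the third document has no grade.
grades = [50.0, 100.0, None]

ignored = avg_with_missing(grades)                   # (50 + 100) / 2
substituted = avg_with_missing(grades, missing=10)   # (50 + 100 + 10) / 3
print(ignored, substituted)
```

A low substitute such as `10` lowers the average noticeably, so the value should reflect how a missing grade is meant to be interpreted.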

[[search-aggregations-metrics-avg-aggregation-histogram-fields]]
==== Histogram fields

When the average is computed on <<histogram,histogram fields>>, the result of the aggregation is the weighted average
of all elements in the `values` array, where each element is weighted by the number in the same position in the `counts` array.

For example, for the following index that stores pre-aggregated histograms with latency metrics for different networks:

[source,console]
--------------------------------------------------
PUT metrics_index/_doc/1
{
  "network.name" : "net-1",
  "latency_histo" : {
    "values" : [0.1, 0.2, 0.3, 0.4, 0.5], <1>
    "counts" : [3, 7, 23, 12, 6] <2>
  }
}

PUT metrics_index/_doc/2
{
  "network.name" : "net-2",
  "latency_histo" : {
    "values" : [0.1, 0.2, 0.3, 0.4, 0.5], <1>
    "counts" : [8, 17, 8, 7, 6] <2>
  }
}

POST /metrics_index/_search?size=0
{
  "aggs": {
    "avg_latency": {
      "avg": { "field": "latency_histo" }
    }
  }
}
--------------------------------------------------

For each histogram field, the `avg` aggregation multiplies each number in the `values` array <1> by its associated count
in the `counts` array <2> and adds up the products. It then divides that sum by the total of all counts across all histograms and returns the following result:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "avg_latency": {
      "value": 0.29690721649
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:test not setup]
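The result can be reproduced by hand from the two documents above. The following is a sketch of the weighted-average arithmetic only, not of how Elasticsearch implements the aggregation internally; the variable names are illustrative.

```python
# The two pre-aggregated histograms indexed in the example above.
histograms = [
    {"values": [0.1, 0.2, 0.3, 0.4, 0.5], "counts": [3, 7, 23, 12, 6]},
    {"values": [0.1, 0.2, 0.3, 0.4, 0.5], "counts": [8, 17, 8, 7, 6]},
]

# Sum of value * count over every bucket of every histogram.
weighted_sum = sum(
    value * count
    for histo in histograms
    for value, count in zip(histo["values"], histo["counts"])
)

# Total number of observations represented by all histograms.
total_count = sum(count for histo in histograms for count in histo["counts"])

avg_latency = weighted_sum / total_count
print(round(avg_latency, 11))
```

Here the weighted sum is 16.4 + 12.4 = 28.8 and the total count is 51 + 46 = 97, so the average is 28.8 / 97, which matches the `value` in the response.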