[role="xpack"]
[[sql-functions-aggs]]
=== Aggregate Functions
Functions for computing a _single_ result from a set of input values.
{es-sql} supports aggregate functions only alongside <<sql-syntax-group-by,grouping>> (implicit or explicit).
[[sql-functions-aggs-general]]
[discrete]
=== General Purpose
[[sql-functions-aggs-avg]]
==== `AVG`
.Synopsis:
[source, sql]
--------------------------------------------------
AVG(numeric_field) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*: Returns the {wikipedia}/Arithmetic_mean[Average] (arithmetic mean) of input values.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggAvg]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggAvgScalars]
--------------------------------------------------
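The `null` handling described for `AVG` can be modelled outside {es-sql} with a few lines of Python. This is only an illustrative sketch: the helper name `avg` and the sample values are hypothetical, and `None` stands in for SQL `null`.

```python
from typing import Optional, Sequence


def avg(values: Sequence[Optional[float]]) -> Optional[float]:
    """Arithmetic mean that skips None and returns None when nothing is left."""
    present = [v for v in values if v is not None]
    if not present:
        # Models the documented case: only null values -> null result.
        return None
    return sum(present) / len(present)


print(avg([10, None, 20]))  # 15.0 (the null is ignored, not counted as 0)
print(avg([None, None]))    # None
```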
[[sql-functions-aggs-count]]
==== `COUNT`
.Synopsis:
[source, sql]
--------------------------------------------------
COUNT(expression) <1>
--------------------------------------------------
*Input*:
<1> a field name, wildcard (`*`) or any numeric value. For `COUNT(*)` or
`COUNT(<literal>)`, all values are considered, including `null` or missing
ones. For `COUNT(<field_name>)`, `null` values are not considered.
*Output*: numeric value
*Description*: Returns the total number (count) of input values.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggCountStar]
--------------------------------------------------
[[sql-functions-aggs-count-all]]
==== `COUNT(ALL)`
.Synopsis:
[source, sql]
--------------------------------------------------
COUNT(ALL field_name) <1>
--------------------------------------------------
*Input*:
<1> a field name. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: numeric value
*Description*: Returns the total number (count) of all _non-null_ input values. `COUNT(<field_name>)` and `COUNT(ALL <field_name>)` are equivalent.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggCountAll]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggCountAllScalars]
--------------------------------------------------
[[sql-functions-aggs-count-distinct]]
==== `COUNT(DISTINCT)`
.Synopsis:
[source, sql]
--------------------------------------------------
COUNT(DISTINCT field_name) <1>
--------------------------------------------------
*Input*:
<1> a field name
*Output*: numeric value. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Description*: Returns the total number of _distinct non-null_ values in input values.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggCountDistinct]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggCountDistinctScalars]
--------------------------------------------------
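The difference between `COUNT(*)`, `COUNT(<field_name>)` (equivalently `COUNT(ALL <field_name>)`) and `COUNT(DISTINCT <field_name>)` comes down to how `null` and duplicate values are treated. The following Python sketch mimics that counting logic on a small in-memory data set; the row layout and the helper names are hypothetical and chosen only for illustration, with `None` standing in for `null`.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical rows; the second one has a null `name`.
rows: List[Row] = [
    {"name": "a"},
    {"name": None},
    {"name": "a"},
    {"name": "b"},
]


def count_star(rows: List[Row]) -> int:
    """COUNT(*): every row counts, including rows with null or missing values."""
    return len(rows)


def count_field(rows: List[Row], field: str) -> int:
    """COUNT(field) / COUNT(ALL field): null values are not considered."""
    return sum(1 for r in rows if r.get(field) is not None)


def count_distinct(rows: List[Row], field: str) -> int:
    """COUNT(DISTINCT field): distinct non-null values only."""
    return len({r[field] for r in rows if r.get(field) is not None})


print(count_star(rows))              # 4
print(count_field(rows, "name"))     # 3
print(count_distinct(rows, "name"))  # 2
```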
[[sql-functions-aggs-first]]
==== `FIRST/FIRST_VALUE`
.Synopsis:
[source, sql]
----------------------------------------------
FIRST(
field_name <1>
[, ordering_field_name]) <2>
----------------------------------------------
*Input*:
<1> target field for the aggregation
<2> optional field used for ordering
*Output*: same type as the input
*Description*: Returns the first non-`null` value (if one exists) of the `field_name` input column sorted by
the `ordering_field_name` column. If `ordering_field_name` is not provided, only the `field_name`
column is used for the sorting. E.g.:
[cols="<,<"]
|===
s| a | b
| 100 | 1
| 200 | 1
| 1 | 2
| 2 | 2
| 10 | null
| 20 | null
| null | null
|===
[source, sql]
----------------------
SELECT FIRST(a) FROM t
----------------------
will result in:
[cols="<"]
|===
s| FIRST(a)
| 1
|===
and
[source, sql]
-------------------------
SELECT FIRST(a, b) FROM t
-------------------------
will result in:
[cols="<"]
|===
s| FIRST(a, b)
| 100
|===
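The two results above can be reproduced with a small Python model of the table `t`. This is a sketch of the described behaviour, not of how {es-sql} executes the query. Two details are assumptions inferred from the example rather than stated rules: rows whose ordering value is `null` sort after all other rows, and ties on the ordering column are broken by the target column. The helper name `first` is hypothetical.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int]]  # (a, b)

# Same rows as the example table `t`.
t: List[Row] = [
    (100, 1), (200, 1), (1, 2), (2, 2), (10, None), (20, None), (None, None),
]


def first(rows: List[Row], use_ordering: bool = False) -> Optional[int]:
    """Model of FIRST(a) and FIRST(a, b) on (a, b) rows."""
    # Null targets can never be returned, so drop them up front.
    candidates = [(a, b) for a, b in rows if a is not None]
    if not candidates:
        return None
    if not use_ordering:
        # FIRST(a): only `a` is used for sorting, so the smallest value wins.
        return min(a for a, _ in candidates)
    # FIRST(a, b): sort by b ascending. Assumptions: null b sorts last,
    # and equal b values are ordered by a.
    ordered = sorted(
        candidates,
        key=lambda r: (r[1] is None, r[1] if r[1] is not None else 0, r[0]),
    )
    return ordered[0][0]


print(first(t))                     # 1
print(first(t, use_ordering=True))  # 100
```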
["source","sql",subs="attributes,macros"]
-----------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[firstWithOneArg]
-----------------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[firstWithOneArgAndGroupBy]
--------------------------------------------------------------------
["source","sql",subs="attributes,macros"]
-----------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[firstWithTwoArgs]
-----------------------------------------------------------
["source","sql",subs="attributes,macros"]
---------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[firstWithTwoArgsAndGroupBy]
---------------------------------------------------------------------
`FIRST_VALUE` is a name alias and can be used instead of `FIRST`, e.g.:
["source","sql",subs="attributes,macros"]
--------------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[firstValueWithTwoArgsAndGroupBy]
--------------------------------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[firstValueWithTwoArgsAndGroupByScalars]
--------------------------------------------------------------------------
[NOTE]
`FIRST` cannot be used in a `HAVING` clause.
[NOTE]
`FIRST` cannot be used with columns of type <<text, `text`>> unless
the field is also <<before-enabling-fielddata,saved as a keyword>>.
[[sql-functions-aggs-last]]
==== `LAST/LAST_VALUE`
.Synopsis:
[source, sql]
--------------------------------------------------
LAST(
field_name <1>
[, ordering_field_name]) <2>
--------------------------------------------------
*Input*:
<1> target field for the aggregation
<2> optional field used for ordering
*Output*: same type as the input
*Description*: It's the inverse of <<sql-functions-aggs-first>>. Returns the last non-`null` value (if one exists) of the
`field_name` input column sorted descending by the `ordering_field_name` column. If `ordering_field_name` is not
provided, only the `field_name` column is used for the sorting. E.g.:
[cols="<,<"]
|===
s| a | b
| 10 | 1
| 20 | 1
| 1 | 2
| 2 | 2
| 100 | null
| 200 | null
| null | null
|===
[source, sql]
------------------------
SELECT LAST(a) FROM t
------------------------
will result in:
[cols="<"]
|===
s| LAST(a)
| 200
|===
and
[source, sql]
------------------------
SELECT LAST(a, b) FROM t
------------------------
will result in:
[cols="<"]
|===
s| LAST(a, b)
| 2
|===
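As with `FIRST`, the two results above can be checked against a small Python model of the table `t`. This is a sketch only. It assumes, based on the example rather than on a stated rule, that rows with a `null` ordering value are considered after all rows that have one, and that ties on the ordering column are resolved by the target column. The helper name `last` is hypothetical.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int]]  # (a, b)

# Same rows as the example table `t`.
t: List[Row] = [
    (10, 1), (20, 1), (1, 2), (2, 2), (100, None), (200, None), (None, None),
]


def last(rows: List[Row], use_ordering: bool = False) -> Optional[int]:
    """Model of LAST(a) and LAST(a, b) on (a, b) rows."""
    # Null targets can never be returned, so drop them up front.
    candidates = [(a, b) for a, b in rows if a is not None]
    if not candidates:
        return None
    if not use_ordering:
        # LAST(a): only `a` is used for sorting, so the largest value wins.
        return max(a for a, _ in candidates)
    # LAST(a, b): prefer rows that actually have an ordering value
    # (assumption: null b is considered last), then take the largest (b, a).
    with_order = [r for r in candidates if r[1] is not None]
    if with_order:
        return max(with_order, key=lambda r: (r[1], r[0]))[0]
    return max(a for a, _ in candidates)


print(last(t))                     # 200
print(last(t, use_ordering=True))  # 2
```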
["source","sql",subs="attributes,macros"]
-----------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[lastWithOneArg]
-----------------------------------------------------------
["source","sql",subs="attributes,macros"]
-------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[lastWithOneArgAndGroupBy]
-------------------------------------------------------------------
["source","sql",subs="attributes,macros"]
-----------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[lastWithTwoArgs]
-----------------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[lastWithTwoArgsAndGroupBy]
--------------------------------------------------------------------
`LAST_VALUE` is a name alias and can be used instead of `LAST`, e.g.:
["source","sql",subs="attributes,macros"]
-------------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[lastValueWithTwoArgsAndGroupBy]
-------------------------------------------------------------------------
["source","sql",subs="attributes,macros"]
-------------------------------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[lastValueWithTwoArgsAndGroupByScalars]
-------------------------------------------------------------------------
[NOTE]
`LAST` cannot be used in a `HAVING` clause.
[NOTE]
`LAST` cannot be used with columns of type <<text, `text`>> unless
the field is also <<before-enabling-fielddata,saved as a keyword>>.
[[sql-functions-aggs-max]]
==== `MAX`
.Synopsis:
[source, sql]
--------------------------------------------------
MAX(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: same type as the input
*Description*: Returns the maximum value across input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggMax]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggMaxScalars]
--------------------------------------------------
[NOTE]
`MAX` on a field of type <<text, `text`>> or <<keyword, `keyword`>> is translated into
<<sql-functions-aggs-last>> and therefore cannot be used in a `HAVING` clause.
[[sql-functions-aggs-min]]
==== `MIN`
.Synopsis:
[source, sql]
--------------------------------------------------
MIN(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: same type as the input
*Description*: Returns the minimum value across input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggMin]
--------------------------------------------------
[NOTE]
`MIN` on a field of type <<text, `text`>> or <<keyword, `keyword`>> is translated into
<<sql-functions-aggs-first>> and therefore cannot be used in a `HAVING` clause.
[[sql-functions-aggs-sum]]
==== `SUM`
.Synopsis:
[source, sql]
--------------------------------------------------
SUM(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `bigint` for integer input, `double` for floating points
*Description*: Returns the sum of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggSum]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggSumScalars]
--------------------------------------------------
[[sql-functions-aggs-statistics]]
[discrete]
=== Statistics
[[sql-functions-aggs-kurtosis]]
==== `KURTOSIS`
.Synopsis:
[source, sql]
--------------------------------------------------
KURTOSIS(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
{wikipedia}/Kurtosis[Quantify] the shape of the distribution of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggKurtosis]
--------------------------------------------------
[NOTE]
====
`KURTOSIS` cannot be used on top of scalar functions or operators but only directly on a field. So, for example,
the following is not allowed and an error is returned:
[source, sql]
---------------------------------------
SELECT KURTOSIS(salary / 12.0), gender FROM emp GROUP BY gender
---------------------------------------
====
[[sql-functions-aggs-mad]]
==== `MAD`
.Synopsis:
[source, sql]
--------------------------------------------------
MAD(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
{wikipedia}/Median_absolute_deviation[Measure] the variability of the input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggMad]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggMadScalars]
--------------------------------------------------
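For readers unfamiliar with the median absolute deviation, the quantity can be computed exactly in a few lines of Python: take the median of the values, then the median of the absolute distances from it. This is a sketch of the textbook definition only. The helper name `mad` is hypothetical, and the value {es-sql} returns may be an approximation that differs slightly from this exact computation.

```python
from statistics import median
from typing import Optional, Sequence


def mad(values: Sequence[Optional[float]]) -> Optional[float]:
    """Exact median absolute deviation, skipping None (standing in for null)."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    center = median(present)
    # Median of the absolute deviations from the median.
    return float(median([abs(v - center) for v in present]))


print(mad([1, 1, 2, 2, 4, 6, 9]))  # 1.0 (median is 2; deviations 0,0,1,1,2,4,7)
```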
[[sql-functions-aggs-percentile]]
==== `PERCENTILE`
.Synopsis:
[source, sql]
--------------------------------------------------
PERCENTILE(
field_name, <1>
percentile[, <2>
method[, <3>
method_parameter]]) <4>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
<2> a numeric expression (must be a constant and not based on a field). If
`null`, the function returns `null`.
<3> optional string literal for the <<search-aggregations-metrics-percentile-aggregation-approximation,percentile algorithm>>. Possible values: `tdigest` or `hdr`. Defaults to `tdigest`.
<4> optional numeric literal that configures the <<search-aggregations-metrics-percentile-aggregation-approximation,percentile algorithm>>. Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm.
*Output*: `double` numeric value
*Description*:
Returns the nth {wikipedia}/Percentile[percentile] (represented by the `percentile` parameter)
of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggPercentile]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggPercentileScalars]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggPercentileWithPercentileConfig]
--------------------------------------------------
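To make the notion of a percentile concrete, the Python sketch below computes one exactly, using linear interpolation between the two closest ranks. It is not the `tdigest` or `hdr` algorithm: those are approximations, so results from {es-sql} can differ from this exact value, especially on small or skewed data sets. The helper name `percentile` and the interpolation rule are choices made for this illustration.

```python
import math
from typing import Optional, Sequence


def percentile(values: Sequence[Optional[float]], p: Optional[float]) -> Optional[float]:
    """Exact p-th percentile (0 <= p <= 100) with linear interpolation."""
    if p is None:
        # Mirrors the documented rule: a null percentile yields null.
        return None
    present = sorted(v for v in values if v is not None)
    if not present:
        return None
    # Fractional position of the percentile within the sorted values.
    pos = (len(present) - 1) * p / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return present[lo] + (present[hi] - present[lo]) * (pos - lo)


print(percentile([1, 2, 3, 4, 5], 50))         # 3.0
print(percentile([10, 20, None, 30, 40], 25))  # 17.5
```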
[[sql-functions-aggs-percentile-rank]]
==== `PERCENTILE_RANK`
.Synopsis:
[source, sql]
--------------------------------------------------
PERCENTILE_RANK(
field_name, <1>
value[, <2>
method[, <3>
method_parameter]]) <4>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
<2> a numeric expression (must be a constant and not based on a field). If
`null`, the function returns `null`.
<3> optional string literal for the <<search-aggregations-metrics-percentile-aggregation-approximation,percentile algorithm>>. Possible values: `tdigest` or `hdr`. Defaults to `tdigest`.
<4> optional numeric literal that configures the <<search-aggregations-metrics-percentile-aggregation-approximation,percentile algorithm>>. Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm.
*Output*: `double` numeric value
*Description*:
Returns the {wikipedia}/Percentile_rank[percentile rank] of the given value (the `value` parameter)
among the input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggPercentileRank]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggPercentileRankScalars]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggPercentileRankWithPercentileConfig]
--------------------------------------------------
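A percentile rank answers the reverse question of a percentile: given a value, what percentage of the observations are at or below it? The Python sketch below computes this exactly under that "at or below" convention, which is an assumption of this illustration. The helper name `percentile_rank` is hypothetical, and because `tdigest` and `hdr` are approximations, {es-sql} results may differ from the exact figure.

```python
from typing import Optional, Sequence


def percentile_rank(values: Sequence[Optional[float]], value: Optional[float]) -> Optional[float]:
    """Percentage of non-null observations that are <= value."""
    if value is None:
        # Mirrors the documented rule: a null value yields null.
        return None
    present = [v for v in values if v is not None]
    if not present:
        return None
    at_or_below = sum(1 for v in present if v <= value)
    return 100.0 * at_or_below / len(present)


print(percentile_rank([1, 2, 3, 4], 2))         # 50.0
print(percentile_rank([1, 2, None, 3, 4], 10))  # 100.0
```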
[[sql-functions-aggs-skewness]]
==== `SKEWNESS`
.Synopsis:
[source, sql]
--------------------------------------------------
SKEWNESS(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
{wikipedia}/Skewness[Quantify] the asymmetric distribution of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggSkewness]
--------------------------------------------------
[NOTE]
====
`SKEWNESS` cannot be used on top of scalar functions but only directly on a field. So, for example, the following is
not allowed and an error is returned:
[source, sql]
---------------------------------------
SELECT SKEWNESS(ROUND(salary / 12.0, 2)), gender FROM emp GROUP BY gender
---------------------------------------
====
[[sql-functions-aggs-stddev-pop]]
==== `STDDEV_POP`
.Synopsis:
[source, sql]
--------------------------------------------------
STDDEV_POP(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
Returns the {wikipedia}/Standard_deviations[population standard deviation] of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggStddevPop]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggStddevPopScalars]
--------------------------------------------------
[[sql-functions-aggs-stddev-samp]]
==== `STDDEV_SAMP`
.Synopsis:
[source, sql]
--------------------------------------------------
STDDEV_SAMP(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
Returns the {wikipedia}/Standard_deviations[sample standard deviation] of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggStddevSamp]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggStddevSampScalars]
--------------------------------------------------
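`STDDEV_POP` and `STDDEV_SAMP` differ only in the divisor: the population form divides the sum of squared deviations by `n`, the sample form by `n - 1`. The Python sketch below shows the difference using the standard library's `statistics` module. The helper names are hypothetical, and returning `None` for a sample of fewer than two values is a choice made for this sketch, not documented {es-sql} behaviour.

```python
from statistics import pstdev, stdev
from typing import Optional, Sequence


def stddev_pop(values: Sequence[Optional[float]]) -> Optional[float]:
    """Population standard deviation (divides by n), skipping None."""
    present = [v for v in values if v is not None]
    return pstdev(present) if present else None


def stddev_samp(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sample standard deviation (divides by n - 1), skipping None."""
    present = [v for v in values if v is not None]
    # statistics.stdev needs at least two points; None below that is a
    # sketch-level assumption.
    return stdev(present) if len(present) >= 2 else None


data = [2, 4, 4, 4, 5, 5, 7, 9, None]
print(stddev_pop(data))   # 2.0
print(stddev_samp(data))  # slightly larger, since the divisor is 7 instead of 8
```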
[[sql-functions-aggs-sum-squares]]
==== `SUM_OF_SQUARES`
.Synopsis:
[source, sql]
--------------------------------------------------
SUM_OF_SQUARES(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
Returns the sum of squares of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggSumOfSquares]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggSumOfSquaresScalars]
--------------------------------------------------
[[sql-functions-aggs-var-pop]]
==== `VAR_POP`
.Synopsis:
[source, sql]
--------------------------------------------------
VAR_POP(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
Returns the {wikipedia}/Variance[population variance] of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggVarPop]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggVarPopScalars]
--------------------------------------------------
[[sql-functions-aggs-var-samp]]
==== `VAR_SAMP`
.Synopsis:
[source, sql]
--------------------------------------------------
VAR_SAMP(field_name) <1>
--------------------------------------------------
*Input*:
<1> a numeric field. If this field contains only `null` values, the function
returns `null`. Otherwise, the function ignores `null` values in this field.
*Output*: `double` numeric value
*Description*:
Returns the {wikipedia}/Variance[sample variance] of input values in the field `field_name`.
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggVarSamp]
--------------------------------------------------
["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs/docs.csv-spec[aggVarSampScalars]
--------------------------------------------------
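The variance functions relate directly to `SUM_OF_SQUARES`: the population variance equals the mean of the squares minus the square of the mean, while the sample variance rescales that by `n / (n - 1)`. The Python sketch below checks this identity on a small data set. The helper names are hypothetical, `None` stands in for `null`, and returning `None` for a sample of fewer than two values is an assumption of this sketch rather than documented behaviour.

```python
from statistics import pvariance, variance
from typing import List, Optional, Sequence


def _present(values: Sequence[Optional[float]]) -> List[float]:
    """Drop None values, mirroring how null values are ignored."""
    return [v for v in values if v is not None]


def sum_of_squares(values: Sequence[Optional[float]]) -> Optional[float]:
    xs = _present(values)
    return float(sum(x * x for x in xs)) if xs else None


def var_pop(values: Sequence[Optional[float]]) -> Optional[float]:
    xs = _present(values)
    return float(pvariance(xs)) if xs else None


def var_samp(values: Sequence[Optional[float]]) -> Optional[float]:
    xs = _present(values)
    # Sample variance needs at least two points (sketch-level assumption).
    return float(variance(xs)) if len(xs) >= 2 else None


data = [2, 4, 4, 4, 5, 5, 7, 9, None]
n = len(_present(data))
mean = sum(_present(data)) / n

# Identity: VAR_POP = SUM_OF_SQUARES / n - mean^2
print(sum_of_squares(data) / n - mean ** 2)  # 4.0
print(var_pop(data))                         # 4.0
print(var_samp(data))                        # VAR_POP * n / (n - 1)
```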