[[esql-agg-functions]]
==== {esql} aggregate functions
++++
<titleabbrev>Aggregate functions</titleabbrev>
++++

The <<esql-stats-by>> command supports these aggregate functions:

// tag::agg_list[]
* <<esql-agg-avg>>
* <<esql-agg-count>>
* <<esql-agg-count-distinct>>
* <<esql-agg-max>>
* <<esql-agg-median>>
* <<esql-agg-median-absolute-deviation>>
* <<esql-agg-min>>
* <<esql-agg-percentile>>
* experimental:[] <<esql-agg-st-centroid>>
* <<esql-agg-sum>>
* <<esql-agg-values>>
// end::agg_list[]

include::avg.asciidoc[]
include::count.asciidoc[]
include::count-distinct.asciidoc[]
include::max.asciidoc[]
include::median.asciidoc[]
include::median-absolute-deviation.asciidoc[]
include::min.asciidoc[]
include::percentile.asciidoc[]
include::st_centroid_agg.asciidoc[]
include::sum.asciidoc[]
include::values.asciidoc[]