[[esql-mv-functions]]
==== {esql} multivalue functions

++++
<titleabbrev>Multivalue functions</titleabbrev>
++++

{esql} supports these multivalue functions:

// tag::mv_list[]
* <<esql-mv_append>>
* <<esql-mv_avg>>
* <<esql-mv_concat>>
* <<esql-mv_count>>
* <<esql-mv_dedupe>>
* <<esql-mv_first>>
* <<esql-mv_last>>
* <<esql-mv_max>>
* <<esql-mv_median>>
* <<esql-mv_median_absolute_deviation>>
* <<esql-mv_min>>
* <<esql-mv_percentile>>
* <<esql-mv_pseries_weighted_sum>>
* <<esql-mv_slice>>
* <<esql-mv_sort>>
* <<esql-mv_sum>>
* <<esql-mv_zip>>
// end::mv_list[]

include::layout/mv_append.asciidoc[]
include::layout/mv_avg.asciidoc[]
include::layout/mv_concat.asciidoc[]
include::layout/mv_count.asciidoc[]
include::layout/mv_dedupe.asciidoc[]
include::layout/mv_first.asciidoc[]
include::layout/mv_last.asciidoc[]
include::layout/mv_max.asciidoc[]
include::layout/mv_median.asciidoc[]
include::layout/mv_median_absolute_deviation.asciidoc[]
include::layout/mv_min.asciidoc[]
include::layout/mv_percentile.asciidoc[]
include::layout/mv_pseries_weighted_sum.asciidoc[]
include::layout/mv_slice.asciidoc[]
include::layout/mv_sort.asciidoc[]
include::layout/mv_sum.asciidoc[]
include::layout/mv_zip.asciidoc[]