Commit Graph

182 Commits

Author SHA1 Message Date
Iván Cea Fontenla 595d907f61
ESQL: SpatialCentroid aggregation tests and docs (#111236) 2024-07-26 10:41:18 +02:00
Fang Xing 66dd2687d5
[ES|QL] Generate docs for unregistered esql functions from annotations (#108749)
* render docs for operators
2024-07-22 14:58:17 -04:00
Iván Cea Fontenla 195b916e2b
ESQL: TOP aggregation IP support (#111105)
Added IP support to TOP() aggregation.

Slightly adapted the StringTemplate organization for esql/compute to
(also?) work with specific datatypes. Right now it may be a bit messy,
but we need the specific support for cases like this.
2024-07-22 22:35:48 +10:00
Iván Cea Fontenla 101775b93d
Added Sum aggregation tests and docs (#110984)
- Added SUM() agg tests (Which autogenerates docs)
- Converted non-finite doubles to nulls in aggregator

The complete set of tests depends on
https://github.com/elastic/elasticsearch/issues/110437, as commented in
code. After completion, the test can be uncommented and everything
should work fine
2024-07-22 21:43:58 +10:00
Iván Cea Fontenla 96e1b15b9d
ESQL: Support IP fields in MAX and MIN aggregations (#110921)
- Support IP in MAX() and MIN()
  - Used a custom IpArrayState for it, as it's quite different from the `X-ArrayState.java.st` generated ones
- Add IP test cases for aggregation tests
2024-07-19 23:23:13 +10:00
Iván Cea Fontenla 0e68117935
Added Percentile aggregation tests and Kibana docs (#111050)
- Added Percentile aggregation tests and autogen docs
- Added a new "appendix" section to FunctionInfo. Existing Percentile docs had a final, long section with info, and we need this to keep it. We have a "detailedDescription" attribute already, but it's right after the description, and it would make it harder to read the important bits of the function (types, examples...). So I'm not reusing it.
2024-07-19 14:28:11 +02:00
Carlos Delgado 453b82706d
Add the EXP ES|QL function (#110879) 2024-07-16 16:36:01 +02:00
Iván Cea Fontenla 43a3af66e8
ESQL: Add boolean support to TOP aggregation (#110718)
- Added a custom implementation of BooleanBucketedSort to keep the top booleans
- Added boolean aggregator to TOP
- Added tests (Boolean aggregator tests, Top tests for boolean, and added boolean fields to CSV cases)
2024-07-16 03:14:29 +10:00
Nik Everett 9f001169c6
ESQL: Document the pattern to count TRUE (#110820)
This adds an example to the docs of counting the TRUE results
of an expression. You do `COUNT(a > 0 OR NULL)`. That turns the `FALSE`
into `NULL`. Which you need to do because `COUNT(false)` is `1` -
because it's a value. But `COUNT(null)` is `0` - because it's the
absence of values.

We would like to make something more intuitive for this one day. But for
now, this is what works.
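
The counting pattern above can be sketched outside ES|QL. This is a minimal Python illustration, not the ES|QL implementation: `or3` and `count` are hypothetical helpers that model three-valued `OR` and a `COUNT` that skips nulls, with `None` standing in for `NULL`.

```python
from typing import Optional


def or3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued OR: TRUE wins; otherwise NULL (None) wins over FALSE."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def count(values) -> int:
    """Count the values that are present, i.e. everything except None."""
    return sum(1 for v in values if v is not None)


a_values = [3, -1, 0, 7]
plain = [a > 0 for a in a_values]                     # FALSE is still a value
with_or_null = [or3(a > 0, None) for a in a_values]   # FALSE becomes NULL

print(count(plain))         # 4: every row is counted
print(count(with_or_null))  # 2: only the TRUE rows are counted
```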
2024-07-12 14:08:22 -04:00
Nik Everett 55532c8d6f
ESQL: All descriptions are a full sentence (#110791)
This asserts that all functions have descriptions that are complete
sentences.
2024-07-11 16:44:15 -04:00
Iván Cea Fontenla 2901711c46
ESQL: Add boolean support to Max and Min aggs (#110527)
- Added support for Booleans on Max and Min
- Added some helper methods to BitArray (`set(index, value)` and `fill(from, to, value)`). This way, the container is more similar to other BigArrays, and it's easier to work with

Part of https://github.com/elastic/elasticsearch/issues/110346, as Max
and Min are dependencies of Top.
2024-07-10 23:10:32 +10:00
Iván Cea Fontenla 5d3512fb33
ESQL: Fix Max doubles bug with negatives and add tests for Max and Min (#110586)
`MAX()` currently doesn't work with doubles smaller than
`Double.MIN_VALUE` (Note that `Double.MIN_VALUE` returns the smallest
non-zero positive, not the smallest double).

This PR adds tests for Max and Min, and fixes the bug (Detected by the
tests).

Also, as the tests now generate the docs, replaced the old docs with the
generated ones, and updated the Max&Min examples.
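
To illustrate why `Double.MIN_VALUE` is a trap, here is a small Python sketch. It assumes the failure mode is a running maximum seeded with the wrong constant; the commit message does not describe the actual cause, so treat `max_with_seed` and the seeding as hypothetical.

```python
import math

# Value of Java's Double.MIN_VALUE: the smallest positive non-zero double.
JAVA_DOUBLE_MIN_VALUE = 4.9e-324


def max_with_seed(values, seed):
    """Running maximum that starts from `seed` (hypothetical helper)."""
    result = seed
    for v in values:
        if v > result:
            result = v
    return result


negatives = [-3.5, -1.25, -10.0]

# Seeding with the smallest positive double hides every negative input.
wrong = max_with_seed(negatives, JAVA_DOUBLE_MIN_VALUE)

# Seeding with negative infinity lets any finite value win.
right = max_with_seed(negatives, -math.inf)

print(wrong > 0)  # True: the seed leaked into the result
print(right)      # -1.25
```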
2024-07-09 21:05:00 +10:00
Iván Cea Fontenla 38cd0b333e
ESQL: AVG aggregation tests and ignore complex surrogates (#110579)
Some work around aggregation tests, with AVG as an example:
- Added tests and autogenerated docs for AVG
- As AVG uses "complex" surrogates (a combination of functions), we can't trivially execute them without a complete plan. As I'm not sure it's worth it for most aggregations, I'm skipping those cases for now, so as to avoid blocking other aggs tests.

The bad side effect of skipping those tests is that most tests in AvgTests are actually ignored (74 of 100)
2024-07-09 12:01:46 +02:00
Fang Xing 8abc8857f2
[ES|QL] weighted_avg (#109993)
* weighted_avg
2024-07-02 18:29:02 -04:00
Nik Everett 6fbc52d170
ESQL docs: Push down needs index and doc_values (#110353)
This adds a `NOTE` to each comparison saying that pushing the comparison
to the search index requires that the field have an `index` and
`doc_values`. This is unique compared to the rest of Elasticsearch which
only requires an `index` and it's caused by our insistence that
comparisons only return true for single-valued fields. We can in future
accelerate comparisons without `doc_values`, but we just haven't written
that code yet.
2024-07-02 14:22:50 -04:00
Iván Cea Fontenla c89ee3b648
ESQL: Renamed TopList to Top (#110347)
Rename TopList aggregation to Top, after internal discussions
2024-07-02 03:52:24 +10:00
Iván Cea Fontenla fc0313f429
ESQL: Add aggregations testing base and docs (#110042)
- Added a new `AbstractAggregationTestCase` base class for tests, that shares most of the code of function tests, adapted for aggregations. Including both testing and docs generation.
  - Reused the `AbstractFunctionTestCase` class to also let us test evaluators if the aggregation is foldable
- Added a `TopListTests` example
  - This includes the docs for Top_list _(Also added a missing include of Ip_prefix docs)_
- Adapted Kibana docs to use `type: "agg"` (@drewdaemon)

The current tests are very basic: Consume a page, generate an output,
all in Single aggregation mode (No intermediates, no grouping). More
complex testing will be added in future PRs

Initial PR of https://github.com/elastic/elasticsearch/issues/109917
2024-06-27 21:21:55 +10:00
Craig Taverner 536d614694
ES|QL ST_DISTANCE Function (#108764)
* WIP Started refactoring in preparation for ST_DISTANCE

* Initial evaluators for ST_DISTANCE

* Update docs/changelog/108764.yaml

* Fix invalid changelog generated by CI

* Register function and get unit tests working

* Fixed failing meta function description tests, and refined descriptions

* Added initial CsvTests and calculate Geo differently to Cartesian

* Added more csv-spec tests and changed to arcDistance for accuracy

* Added generated docs files

* Link to generated docs

* Fix examples tag for linking from generated docs

* Skip wrapper function

And note that we might want to include instead some of the related intelligence from Circle2D::HaversineDistance class

* Added ST_DWITHIN and more tests for ST_DISTANCE and ST_DWITHIN

* Code style

* Added more tests, this time for sorting on distance

* Fixes after rebase on main

* The ST_DWITHIN cannot use BinarySpatialFunction because it is ternary

So we moved the common code to a separate SpatialTypeResolver, and made a simpler TernarySpatialFunction based on a simple TernaryScalarFunction. This had additional consequences, simplifying the points-only cases.

The main reason for this change was to support StDWithinTests which need to test a lot of things that involve varying all three input types, generating expected error strings, etc. The original hack of just adding to BinarySpatialFunction worked for the actual integration tests, but clearly did not satisfy all the use cases tested by the unit tests.

We also restricted ST_DWITHIN to take only a double as the third argument, because otherwise the number of evaluators would explode, since we need a separate evaluator for each Block type, and Integer and Double use different block types.

* Fixed function count after rebasing on main

* Update docs/changelog/108764.yaml

* Added generated docs for ST_DWITHIN

* Connect docs for ST_DWITHIN

* Add back issue link

* Remove support for ST_DWITHIN

* Update docs/changelog/108764.yaml

* Bring back link to issue in changelog

* Update x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/spatial/StDistance.java

Co-authored-by: Ignacio Vera <iverase@gmail.com>

* Revert reformatting of function descriptions

We should put this into a separate PR

* Github merged commit with incorrectly formatted whitespace

---------

Co-authored-by: Ignacio Vera <iverase@gmail.com>
2024-06-21 11:59:44 +02:00
Nik Everett b35f0ed48d
ESQL: Make a table of all inline casts (#109713)
This adds a test that generates
`docs/reference/esql/functions/kibana/inline_cast.json` which is a json
object whose keys are the names of valid inline casts and whose values
are the resulting data types.

I also moved one of the maps we use to make the inline casts to
`DataType`, which is a place where we want it.
2024-06-18 06:23:11 -04:00
Nik Everett 2aade9dd66
ESQL: Warn about division (#109716)
When you divide two integers or two longs we round towards 0. Like
Postgres or Java or Rust or C. Other systems, like MySQL or SPL or
JavaScript or Python, always produce a floating point number. We should
warn folks about this. It's genuinely unexpected for some folks. OTOH,
converting into a floating point number would be unexpected for other
folks. Oh well, let's document what we've got.
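
The difference between the two behaviors can be shown with a short Python sketch. `div_toward_zero` is a hypothetical helper that mimics the round-towards-0 rule described above; it is not ES|QL code. Python itself is a useful contrast because `/` yields a float and `//` rounds down rather than towards 0.

```python
def div_toward_zero(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as described above."""
    q = abs(a) // abs(b)
    # The result is negative only when the operands have different signs.
    return q if (a < 0) == (b < 0) else -q


print(div_toward_zero(7, 2))   # 3
print(div_toward_zero(-7, 2))  # -3: rounded toward zero
print(-7 // 2)                 # -4: Python's floor division rounds down
print(7 / 2)                   # 3.5: Python's true division gives a float
```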
2024-06-14 08:36:27 -04:00
Luigi Dell'Aquila 47edae4fbd
ES|QL: reduce memory footprint for MvAppendTests with shapes (#109517)
Fixing MvAppendTests CB exceptions by generating smaller geometries: the
test generates a lot of documents and the CB is too small for multiple
big shapes.

Fixes https://github.com/elastic/elasticsearch/issues/109409
2024-06-13 02:44:49 +10:00
Luigi Dell'Aquila 3d0c65d0c5
ES|QL: add tests for COALESCE() function on VERSION type (#109468) 2024-06-07 18:01:42 +02:00
Nik Everett 7916e6a231
ESQL: Implement LOOKUP, an "inline" enrich (#107987)
This adds support for `LOOKUP`, a command that implements a sort of
inline `ENRICH`, using data that is passed in the request:

```
$ curl -uelastic:password -HContent-Type:application/json -XPOST \
    'localhost:9200/_query?error_trace&pretty&format=txt' \
-d'{
    "query": "ROW a=1::LONG | LOOKUP t ON a",
    "tables": {
        "t": {
            "a:long":     [    1,     4,     2],
            "v1:integer": [   10,    11,    12],
            "v2:keyword": ["cat", "dog", "wow"]
        }
    },
    "version": "2024.04.01"
}'
      v1       |      v2       |       a       
---------------+---------------+---------------
10             |cat            |1
```

This required these PRs: * #107624 * #107634 * #107701 * #107762 *
#107923 * #107894 * #107982 * #108012 * #108020 * #108169 * #108191 *
#108334 * #108482 * #108696 * #109040 * #109045

Closes #107306
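
The `LOOKUP` request above can be mimicked in a few lines of Python. This is only a sketch of the matching idea using the same table data; `lookup` is a hypothetical helper, and filling unmatched rows with `None` is an assumption that the commit message does not confirm.

```python
def lookup(rows, table, on):
    """Add the table's other columns to each row whose `on` value matches."""
    keys = table[on]
    out = []
    for row in rows:
        enriched = dict(row)
        # Position of the first matching key, or None when nothing matches.
        idx = keys.index(row[on]) if row[on] in keys else None
        for col, vals in table.items():
            if col != on:
                enriched[col] = vals[idx] if idx is not None else None
        out.append(enriched)
    return out


# Same data as the "tables" section of the request, without type suffixes.
t = {"a": [1, 4, 2], "v1": [10, 11, 12], "v2": ["cat", "dog", "wow"]}

print(lookup([{"a": 1}], t, on="a"))  # [{'a': 1, 'v1': 10, 'v2': 'cat'}]
```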
2024-06-07 11:38:51 +10:00
Parker Timmins bb3ff8e924
ESQL: add REPEAT string function (#109220)
Add support for the string manipulation function REPEAT(string, number). This function concatenates the string argument with itself the specified number of times. If number is 0 an empty string is returned. If number is less than 0, null is returned and a warning is logged. If number is less than 0 and is a constant, the query will fail without executing.
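
The runtime rules for `REPEAT` can be sketched in Python as follows. `repeat` is a hypothetical stand-in, the warning text is made up for illustration, and the plan-time failure for a constant negative number is not modeled here.

```python
import warnings
from typing import Optional


def repeat(string: str, number: int) -> Optional[str]:
    """Concatenate `string` with itself `number` times."""
    if number < 0:
        # Hypothetical warning text; the real message is not in the commit.
        warnings.warn("number must not be negative")
        return None
    # Multiplying by 0 yields the empty string, matching the description.
    return string * number


print(repeat("ab", 3))        # ababab
print(repr(repeat("ab", 0)))  # ''
```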
2024-06-04 16:32:43 -05:00
Luigi Dell'Aquila 5f6e8f687b
ES|QL: add MV_APPEND function (#107001)
Adding `MV_APPEND(value1, value2)` function, that appends two values
creating a single multi-value. If one or both of the inputs are
multi-values, the result is the concatenation of all the values, e.g.

```
MV_APPEND([a, b], [c, d]) -> [a, b, c, d]
```

~I think for this specific case it makes sense to consider `null` values
as empty arrays, so that~ ~MV_APPEND(value, null) -> value~ ~It is
pretty uncommon for ESQL (all the other functions, apart from
`COALESCE`, short-circuit to `null` when one of the values is null), so
let's discuss this behavior.~

[EDIT] considering the feedback from Andrei, I changed this logic and
made it consistent with the other functions: now if one of the
parameters is null, the function returns null
2024-06-05 03:42:29 +10:00
Luigi Dell'Aquila 21952c7e36
ES|QL: add geo tests for mv_dedupe (#109342)
Adding more unit tests for MV_DEDUPE function, covering geo_point,
geo_shape, cartesian_point and cartesian_shape. This also adds docs for
Kibana.

Fixes https://github.com/elastic/elasticsearch/issues/108982
2024-06-05 03:33:14 +10:00
Iván Cea Fontenla f16f71e2a2
ESQL: Add ip_prefix function (#109070)
Added ESQL function to get the prefix of an IP. It works now with both
IPv4 and IPv6. For users planning to use it with mixed IPs, we may need
to add a function like "is_ipv4()" first.

**About the skipped test:** There's currently a "bug" in the
evaluators/functions that return null. Evaluators can't handle them.
We'll work on support for that in another PR. It affects other
functions, like `substring()`. In this function, however, it only
affects "wrong" cases (like an invalid prefix), so it has no impact.

Fixes https://github.com/elastic/elasticsearch/issues/99064
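
As a rough illustration of taking the prefix of an IP, here is a Python sketch built on the standard `ipaddress` module. The function name, its parameters, and the idea of passing one prefix length per IP version are assumptions made for this example; the commit message does not spell out the ES|QL signature.

```python
import ipaddress


def ip_prefix(ip: str, prefix_len_v4: int, prefix_len_v6: int) -> str:
    """Zero out the host bits of `ip`, using the length for its IP version."""
    addr = ipaddress.ip_address(ip)
    length = prefix_len_v4 if addr.version == 4 else prefix_len_v6
    # strict=False allows an address that has host bits set.
    network = ipaddress.ip_network(f"{addr}/{length}", strict=False)
    return str(network.network_address)


print(ip_prefix("1.2.3.4", 24, 32))       # 1.2.3.0
print(ip_prefix("2001:db8::ff", 24, 32))  # 2001:db8::
```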
2024-05-29 10:23:45 -04:00
Luigi Dell'Aquila a5b1848c14
ES|QL: more tests for coalesce() function (#109032)
Adding more unit tests for `coalesce()` function, in particular adding
tests for `ip`, `date` and spatial data types.

This also generates the right signatures for Kibana.

Related to https://github.com/elastic/elasticsearch/issues/108982
2024-05-27 04:36:06 -04:00
Alexander Spies 16a5d248b7
ESQL: Clone ql for esql (#108773)
Part of https://github.com/elastic/elasticsearch/issues/106679

* Copy the `ql` project into a different project _just for esql_, call it `esql-core`.
* Make `esql` depend only on the latter.
* Fix `EsqlNodeSubclassTests`; I'm confused why this didn't bite us earlier.
* Update the warning regexes in some csv tests as the exceptions have other package names now.

**Note to reviewers:** Exclude the first commit when viewing the diff,
as that contains only the actual copying of `ql`. The remaining commits
are the actually meaningful ones. _The `build.gradle` files probably
require the most attention._
2024-05-22 04:35:17 -04:00
Iván Cea Fontenla 62b372b4dc
ESQL: CBRT function (#108574)
- Added the cube root function to ESQL (`CBRT(x)`). Nearly identical to SQRT, but without the negative numbers exception
- Added docs generation support for Windows end lines (CRLF), as within the examples, it was writing the "\r" without the "\n" (Which was being converted to "\\n"), and some other inconsistencies
- Some updates to `package-info.java` documentation over how to create functions
- Fixes https://github.com/elastic/elasticsearch/issues/108675

Functions issue: https://github.com/elastic/elasticsearch/issues/98545
2024-05-15 16:50:15 +02:00
Fang Xing 172c05918c
[DOCS] ES|QL implicit casting (#108618)
* implicit casting doc
2024-05-15 09:07:09 -04:00
Fang Xing 11de886346
[ES|QL] Add/Modify annotations for spatial and conditional functions for better doc generation (#107722)
* annotation for spatial functions and conditional functions
2024-05-10 14:49:25 -04:00
Luigi Dell'Aquila fed808850d
ES|QL: Add unit tests for now() function (#108498) 2024-05-10 14:28:19 +02:00
Bogdan Pintea de725aef80
Add docs clarifications on DATE_DIFF args (#108301)
This adds some clarifications on the time unit strings the function
takes as arguments, noting the differences between these and the time
span literals, as well as the abbreviations' source.
2024-05-07 12:59:01 +02:00
Bogdan Pintea b26d7d3e14
Introduce an IP functions group (#108304)
This takes the CIDR_MATCH out of the operators group and adds it to a
new `IP functions` group.
The change also rearranges the groups, grouping together the
type-specific functions and ordering them alphabetically.
2024-05-06 13:43:30 +02:00
Fang Xing 4daac77e3b
[ES|QL] Add/Modify annotations for operators for better doc generation (#108220)
* annotation for operators
2024-05-03 22:59:51 -04:00
Bogdan Pintea 5f4ef87c47
Fix docs generation of signatures for variadic functions (#107865)
This fixes the generation of the signatures for variadic functions,
except for those that take a list as last argument; i.e.  functions with
optional arguments (like ROUND) or functions with overloading-like
signatures (like BUCKET).
2024-05-03 15:37:22 +02:00
Fang Xing 7ae08306a0
mv functions (#107839)
Add annotations for MV functions for better doc generation.
2024-05-01 10:47:22 -04:00
Bogdan Pintea 4b5c5e2ded
Update BUCKET docs in source (#108005)
This applies the changes proposed in a review to the source, so that
they're synchronized to the generated output.
2024-04-29 14:27:20 +02:00
Nhat Nguyen 22aad7b201
Support metrics counter types in ESQL (#107877)
This commit adds support for numeric metrics counter fields in ES|QL. 
These counter types, including counter_long, counter_integer, and
counter_double, are different from their parent types. Users will have
limited interaction with these counter types, restricted to:

- Retrieving values without any processing
- Casting to their root type (e.g., to_long(a_long_counter))
- Using them in the metrics rate aggregation

These restrictions are intentional to prevent misuse. If users want to 
use them as numeric values, explicit casting to their root types is
required.
2024-04-26 12:15:48 -07:00
Bogdan Pintea a21242054b
ESQL: Document BUCKET as a grouping function (#107864)
This adds the documentation for BUCKET as a grouping function and the
addition of the "direct" invocation mode providing a span (in addition
to the auto mode).
2024-04-25 12:38:12 -04:00
Bogdan Pintea 7af45cc52e
ESQL: Document the cast operator (::) (#107871)
This documents the cast operator, `::`.
2024-04-25 10:10:59 -04:00
Bogdan Pintea 31f2fb85df
Docs: move STARTS/ENDS_WITH under string functions in the docs (#107867)
This moves STARTS_WITH and ENDS_WITH under the string functions
section (as they're not operators).
2024-04-25 09:41:11 -04:00
Bogdan Pintea 9482673fbe
Docs: move base64 functions under string functions (#107866)
This moves the TO_BASE64 and FROM_BASE64 from the type conversion
functions under string functions (they take a string as input and output
another string).
2024-04-25 13:57:45 +02:00
Fang Xing ad15d50863
[ES|QL] more doc generation via annotations (#107541)
Annotations for math functions, datetime functions, string functions, type conversion functions.
2024-04-22 14:43:36 -04:00
Mark Tozzi f620961812
[ESQL] Add in the autogenerated docs for a bunch of functions (#107633) 2024-04-18 14:09:30 -04:00
Bogdan Pintea a2c2e8fe47
ESQL: extend BUCKET with spans. Turn it into a grouping function (#107272)
This extends `BUCKET` function to accept a two-parameters-only
invocation: the first parameter remains as is, while the second is a
span. It can be a numeric (floating point) span, if the first argument
is numeric, or a date period or time duration, if the first argument is
a date.

Also, the function can now be invoked with the alias BIN.

Additionally, the function has been turned into a grouping-only function
and thus can only be used within a `STATS` command.
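
The span-based invocation can be pictured as rounding a value down to the start of its bucket. The Python sketch below assumes that reading, and assumes time buckets are aligned to the Unix epoch; both helpers are hypothetical, and calendar-based date periods such as months are not modeled.

```python
import math
from datetime import datetime, timedelta, timezone


def bucket_numeric(value: float, span: float) -> float:
    """Lower bound of the span-wide bucket that contains `value`."""
    return math.floor(value / span) * span


def bucket_time(ts: datetime, span: timedelta) -> datetime:
    """Start of the span-long bucket containing `ts`, aligned to the epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # timedelta // timedelta is an integer count of whole spans.
    return epoch + ((ts - epoch) // span) * span


print(bucket_numeric(7300.0, 5000.0))  # 5000.0
ts = datetime(2024, 4, 16, 12, 57, 18, tzinfo=timezone.utc)
print(bucket_time(ts, timedelta(hours=1)))  # 2024-04-16 12:00:00+00:00
```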
2024-04-16 12:57:18 +02:00
Fang Xing 353abef214
[ES|QL] Base64 decoding and encoding functions (#107390)
* add base64 functions
2024-04-15 18:39:26 -04:00
Nik Everett aac17616a3
ESQL: Improve tests and docs for some functions (#107331)
This improves the tests and docs for a few functions, specifically `E`,
`FLOOR`, `PI`, `POW`, and `ROUND`. The examples and tested signatures
will get copied into the docs and kibana signatures.


Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-04-11 12:41:56 -04:00
Fang Xing 0075c1fb1e
[ES|QL] String literal implicit casting (#106932)
* string literal casting for scalar functions and arithmetic operations.
2024-04-10 21:20:12 -04:00