* Support ST_INTERSECTS between geometry column and other geometry or string
* Pushdown to lucene for ST_INTERSECTS on GEO_POINT
* Get geo_shape working in ST_INTERSECTS bypassing SingleValueQuery
* Initial work to support cartesian shape queries in ESQL
* Fixed CSV tests for combined ST_INTERSECTS and ST_CENTROID
* Fixed bug in point-in-shape query for CARTESIAN_POINT
* Added unit tests for SpatialIntersects and fixed a few bugs found
* Added comments to public ShapeQueryBuilder class
* Move calls to random() later to avoid security exception
* Refined type checking support in ST_INTERSECTS
Improved the combinations supported as preparation for removing the ugly try/catch way of detecting the difference between WKT and WKB in some code.
* Fixed bugs in incorrect use of doc-values in parameter type matching
Also made a few refinements, including removing one try/catch approach to differentiating between WKT and WKB.
* Removed second place where we used try/catch to differentiate WKT from WKB
This was a workaround for a mistake in the planning, where we incorrectly mapped incoming types to the wrong FieldEvaluators. We fixed that mistake in an earlier commit.
* Fixed flaky tests where GEO was treated as CARTESIAN
We wrongly assumed that constant incoming types had no CRS, even when they did. For shapes crossing the dateline this led to different (incorrect) behaviour.
* Fixed a flaky test by removing some point==point optimizations
* Moved spatial intersects to 'spatial' package
When we developed the ST_CENTROID work, this was requested, so let's do it here too.
* Use normal switch on enums
* Cleanup some static utility methods
Now all code paths that can convert a constant string to a geometry use the same code.
* Fixed bugs with non-quantized coordinates, and cleaned up code a little
* Fixed failing test after change to evaluator class names
* Refactored SpatialRelatesFunction into three files, and made evaluatorRules static
This was a general cleanup to make the code more organized, but it also made the evaluator rules static, so we don't re-create them on every query parse.
* Fixed compile error after rebase
* Removed ConstantAndConstant support, using fold() correctly instead
* Better error on circles
* Make sure compound predicates are supported in use-doc-values pushdown
* Testing ENRICH with ST_INTERSECTS
This required adding new data for an ENRICH index, and this data could be tested with a few other related tests, which were also added.
* Added missing mixed-cluster rules for testing only with 8.14
* Fixed some mixed-cluster issues where we failed to mark test for only 8.14
Also added an interesting polygon-polygon intersection case from real data.
* Fix flaky test where cartesian polygons were generated from geo
* Remove support for string literals in ST_INTERSECTS
* Fix failing tests after removing string support
* Removed unused code from previous string literal support (WKT parsing)
* Support case where both fields are points and doc-values
If we have an ST_INTERSECTS and an ST_CENTROID, the centroid asks to load the points as doc-values, and the ST_INTERSECTS therefore needs to support two doc-values points.
* Disallow more than one field from doc-values for ST_INTERSECTS
* Remove unused evaluator classes
* Add tests for multiple doc-values if not in same intersects
* Fix errors after rebase on main
* Fixed bug in missing support for spatial function expressions in EVAL
When a spatial aggregate expects doc-values, this was not being communicated to spatial functions in EVAL, only in WHERE.
* Reduce flaky tests when reading directly from enrich source indices
The test framework does not expect enrich source indices to be used directly in queries, leading to duplicated results on multi-node clusters, so we edit the queries to be less sensitive to this case.
* Fixed failing test
* Code style
* Fixed test file name and added function name annotation
* Added documentation for st_intersects
* Fixed failing show functions test
* Code review changes, notably simplifying the type resolution
* Fixed broken docs link
* Add two new OGC functions ST_X and ST_Y
Recently Nik did work that involved extracting the X and Y coordinates from geo_point data using `to_string(field)` followed by a DISSECT command to re-parse the string to get the X and Y coordinates.
This is much more efficiently achieved using existing known OGC functions `ST_X` and `ST_Y`.
* Update docs/changelog/105768.yaml
* Fixed invalid changelog yaml
* Fixed mixed cluster tests
* Fixed tests and added docs
* Removed false impression that these functions were different for geo/cartesian
With the use of WKB as the core type in the compute engine, many spatial functions are actually the same between these two types, so we should not give the impression they are different.
* Code review comments and reduced object creation.
* Revert temporary StringUtils hack, and fix bug in x/y extraction from WKB
* Revert object creation reduction
* Fixed mistakes in documentation
* Fix automatic generation of spatial function types files
The automatic mapping of spatial function names from class names was not working for spatial types, so the automatic generation of these files did not happen, and in fact existing files were deleted.
In addition, the generation of aggregation function types files does not yet exist at all, so the st_centroid.asciidoc file was always deleted. Until such support exists, this file's contents will be moved back into the function definition file.
The railroad diagrams for syntax are now also created. However, not all functions in the documentation use them, and none of the `TO_*` type-casting functions do, so we will not link to them from the docs and will leave that decision to the docs team. Personally, while these diagrams are pretty, they add no informational content and make the documentation look cluttered.
* Refined to use an annotation which is more generic
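As an illustration of the `ST_X` and `ST_Y` items above (including the x/y extraction from WKB), here is a minimal Python sketch of reading the coordinates from a 2D WKB point. It follows the standard OGC WKB point layout (byte-order flag, 32-bit geometry type, two doubles); the helper names `point_xy_from_wkb`, `st_x` and `st_y` are hypothetical, and this is not the ESQL implementation.

```python
import struct

WKB_POINT = 1  # geometry type code for a 2D Point in OGC WKB


def point_xy_from_wkb(wkb: bytes) -> tuple:
    """Return (x, y) from a 2D WKB point. Illustrative sketch only."""
    # The first byte is the byte-order flag: 1 = little-endian, 0 = big-endian.
    order = "<" if wkb[0] == 1 else ">"
    # The next four bytes hold the geometry type as an unsigned 32-bit integer.
    (geom_type,) = struct.unpack_from(order + "I", wkb, 1)
    if geom_type != WKB_POINT:
        raise ValueError(f"expected a WKB point, got geometry type {geom_type}")
    # Two IEEE 754 doubles follow the 5-byte header: x first, then y.
    x, y = struct.unpack_from(order + "dd", wkb, 5)
    return x, y


def st_x(wkb: bytes) -> float:
    return point_xy_from_wkb(wkb)[0]


def st_y(wkb: bytes) -> float:
    return point_xy_from_wkb(wkb)[1]


# Build a little-endian WKB point for POINT(12.5 55.75) and read it back.
sample = struct.pack("<BIdd", 1, WKB_POINT, 12.5, 55.75)
print(st_x(sample), st_y(sample))
```

Because the byte layout is the same whether the point is geographic or cartesian, one extraction routine can serve both types, which matches the note above about not giving the impression that the functions differ.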
This creates the `MV_FIRST` and `MV_LAST` functions that return the
first and last values from a multivalue field. They are no-ops on a
single-valued field. They are quite similar to `MV_MIN` and `MV_MAX`
except they work on positional data rather than relative size. That
sounds like a large distinction, but in practice our multivalued fields
are often sorted. And when they operate on sorted arrays `MV_MIN` does
*the same* thing as `MV_FIRST`.
But there are some cases where it really does matter - say you are
`SPLIT`ing something - so `MV_FIRST(SPLIT("foo;bar;baz", ";"))` gets you
`foo` like you'd expect. No sorting needed.
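A small Python sketch of the distinction, modelling a multivalue field as a list and a single-valued field as a plain value. The function names mirror the ESQL ones but are only stand-ins here, and Python's string ordering is assumed as a proxy for the real comparison:

```python
def mv_first(values):
    # Positional: take the first element; a single value passes through unchanged.
    return values[0] if isinstance(values, list) else values


def mv_last(values):
    # Positional: take the last element; a single value passes through unchanged.
    return values[-1] if isinstance(values, list) else values


def mv_min(values):
    # Ordering-based: take the smallest element regardless of position.
    return min(values) if isinstance(values, list) else values


parts = "foo;bar;baz".split(";")   # unsorted: ['foo', 'bar', 'baz']
print(mv_first(parts))             # foo
print(mv_min(parts))               # bar
print(mv_first(sorted(parts)) == mv_min(parts))  # True: same result once sorted
```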
Relates to #103879
Improve the docs for is_nan, is_finite, is_infinite functions.
This also adjusts the CamelCase to snake_case conversion, to not
consider the last capital letter (like in `IsNaN`).
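A minimal Python sketch of that conversion rule, assuming it amounts to "put an underscore before every capital letter except the first and the last character". The function name `camel_to_snake` is hypothetical, and the real converter may treat other runs of capitals differently:

```python
def camel_to_snake(name: str) -> str:
    """Convert CamelCase to snake_case, ignoring a capital in the last position."""
    out = []
    last = len(name) - 1
    for i, ch in enumerate(name):
        # Start a new word at a capital letter, unless it is the very first
        # character or the very last one (as in the trailing N of IsNaN).
        if ch.isupper() and 0 < i < last:
            out.append("_")
        out.append(ch.lower())
    return "".join(out)


print(camel_to_snake("IsNaN"))       # is_nan
print(camel_to_snake("IsFinite"))    # is_finite
print(camel_to_snake("IsInfinite"))  # is_infinite
```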
This adds a tiny blurb for each operator to the docs with a railroad
diagram of the operator's syntax and a table of the input and output
types. This also fixes the tests to correctly generate the tables for
operators.
This adds the missing unit tests for the conversion functions.
It also extends the type support by adding the `TEXT` type to those functions that support `KEYWORD` already (which also simplifies the testing, actually). Some functions did have it, some didn't; they now all do.
The change also fixes two defects uncovered by the better test coverage: `ToInteger` and `ToUnsignedLong` were missing some necessary exception declarations in the decorators for the evaluators.
It also updates `ToInteger`'s `fromDouble()` conversion to use a newly added utility, so that the failed conversions contain the right message (`out of [integer] range`, instead of the confusing `out of [long] range`).
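To illustrate why the target type belongs in the message, here is a hedged Python sketch of a double-to-integer conversion that checks against the 32-bit range directly. Only the `out of [integer] range` fragment comes from the change; the function name, the rest of the message, and the rounding mode are assumptions made for illustration.

```python
import math

INT_MIN, INT_MAX = -(2**31), 2**31 - 1  # 32-bit signed integer bounds


def double_to_int(value: float) -> int:
    """Convert a double to a 32-bit integer or fail with a type-specific message."""
    if not math.isfinite(value):
        raise ValueError(f"[{value}] out of [integer] range")
    rounded = round(value)  # assumption: the real rounding mode may differ
    if not INT_MIN <= rounded <= INT_MAX:
        # Naming the target type avoids the confusing "[long]" wording.
        raise ValueError(f"[{value}] out of [integer] range")
    return rounded


print(double_to_int(41.7))
try:
    double_to_int(3e9)
except ValueError as e:
    print(e)
```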
Related: #102488, #102552.
This adds more tests for some of the `MV_` functions and updates their
docs now that the railroad diagram and table generated by the tests
covers all of the types.
This prevents `CONCAT` from using an unbounded amount of memory by
hooking its temporary value into the circuit breaker. To do so, it
makes *all* `ExpressionEvaluator`s `Releasable`. Most of the changes in
this PR just plumb that through to every evaluator. The rest of the
changes correctly release evaluators after their use.
I considered another tactic but didn't like it as much, even though the
number of changes would be smaller - I could have created a fresh,
`Releasable` temporary value for every `Page`. It would be pretty
contained to keep the releasable there. But I wanted to share the temporary
state across runs to avoid a bunch of allocations.
Here's a script that used to crash before this PR but is fine after:
```
curl -uelastic:password -XDELETE localhost:9200/test
curl -HContent-Type:application/json -uelastic:password -XPUT localhost:9200/test -d'{
  "mappings": {
    "properties": {
      "short": {
        "type": "keyword"
      }
    }
  }
}'
curl -HContent-Type:application/json -uelastic:password -XPUT localhost:9200/test/_doc/1?refresh -d'{"short": "short"}'
echo -n '{"query": "FROM test ' > /tmp/evil
for i in {0..9}; do
  echo -n '| EVAL short = CONCAT(short' >> /tmp/evil
  for j in {1..9}; do
    echo -n ', short' >> /tmp/evil
  done
  echo -n ')' >> /tmp/evil
done
echo '| EVAL len = LENGTH(short) | KEEP len"}'>> /tmp/evil
curl -HContent-Type:application/json -uelastic:password -XPOST localhost:9200/_query?pretty --data-binary @/tmp/evil
```
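To see why the script above is a problem, note that each of its ten `EVAL` steps concatenates ten copies of the previous value, so the five-character string grows tenfold per step. The Python sketch below works through that arithmetic and shows a toy breaker rejecting the allocation. `ToyBreaker`, its methods, and the 1 GiB limit are hypothetical and are not the Elasticsearch circuit breaker API.

```python
class BreakerTripped(Exception):
    pass


class ToyBreaker:
    """Tracks reserved bytes and refuses reservations beyond a fixed limit."""

    def __init__(self, limit_bytes: int):
        self.limit = limit_bytes
        self.used = 0

    def reserve(self, n: int) -> None:
        if self.used + n > self.limit:
            raise BreakerTripped(f"would use {self.used + n} of {self.limit} bytes")
        self.used += n

    def release(self, n: int) -> None:
        self.used -= n


def concat_length_after(evals: int, copies: int = 10, start: int = len("short")) -> int:
    # Each EVAL replaces the value with `copies` concatenated copies of itself.
    length = start
    for _ in range(evals):
        length *= copies
    return length


def first_eval_that_trips(breaker: ToyBreaker, evals: int = 10, copies: int = 10, start: int = 5):
    # Reserve the temporary buffer for each step, releasing it afterwards.
    length = start
    for step in range(1, evals + 1):
        length *= copies
        try:
            breaker.reserve(length)
        except BreakerTripped:
            return step
        breaker.release(length)
    return None


print(concat_length_after(10))                    # 50000000000 characters
print(first_eval_that_trips(ToyBreaker(2**30)))   # 9
```

Without any accounting the final value would need about 50 GB, while with the assumed 1 GiB budget the request fails at the ninth step instead of exhausting memory.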
This adds tests, supported types, and a signature image for `to_string`
and `to_version`. It also fixes the resolution of functions whose names
contain an `_`.
Finally, it updates the docs for `ceil` to render the image more nicely.
Add the 'right' function, which extracts a substring beginning from its
right end (opposite function of 'left').
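A minimal Python sketch of the intended semantics, next to its `left` counterpart. The handling of zero, negative, and over-long lengths is an assumption made for illustration and is not taken from the implementation:

```python
def left(s: str, length: int) -> str:
    # Take up to `length` characters from the start of the string.
    return s[:max(length, 0)]


def right(s: str, length: int) -> str:
    # Take up to `length` characters from the end of the string.
    # s[-0:] would return the whole string, so handle non-positive lengths first.
    if length <= 0:
        return ""
    return s[-length:]


print(left("elasticsearch", 6))   # elasti
print(right("elasticsearch", 6))  # search
```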
---------
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
CI will skip building them. Lots of CI machines don't have font support
so they can't generate these. But all local machines have a GUI so they
can.
Also, super-lazy initialize the font so CI doesn't bump into it by
accident.
Closes #99018
Add the unary scalar function CEIL.
Analogously to FLOOR, it rounds up its argument.
- Implement CEIL, add it to the function registry and make sure it is serializable.
- Add csv tests, unit tests and docs.
- Add additional csv tests with different data types and some edge cases for both CEIL and FLOOR
- Add unit tests and update docs for FLOOR.
Locks the railroad diagrams to always use the same font, this one named
`roboto mono`. This makes sure that when we render the railroad diagrams
we always size them the same way. Because everyone has a copy of roboto
mono. Because gradle resolves that dependency.
This generates a "railroad diagram" svg image that can be embedded into
the docs for any function to explain its syntax. It's basic, but it's
something we can iterate on.
It also generates a table of supported types from the list of types that
we test. It can be included in the docs for reference as well.