The current docs mention that Elasticsearch indexes prefixes between 2 and 5 characters in a separate field. 2 and 5 are default values, and the size of the indexed prefixes depends on the configuration settings.
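As a rough illustration of how the configured range decides which prefixes are indexed, here is a minimal Python sketch. The `min_chars` and `max_chars` names follow the `index_prefixes` mapping parameter; the helper function and the `body_text` field name are hypothetical, and the real indexing happens inside Elasticsearch.

```python
# Mapping fragment with non-default prefix lengths (field name is hypothetical).
mapping = {
    "properties": {
        "body_text": {
            "type": "text",
            "index_prefixes": {"min_chars": 1, "max_chars": 10},
        }
    }
}


def indexed_prefixes(term, min_chars=2, max_chars=5):
    """Return the prefixes of `term` whose length falls within the configured range."""
    longest = min(max_chars, len(term))
    return [term[:n] for n in range(min_chars, longest + 1)]


print(indexed_prefixes("quick"))        # default range of 2 to 5 characters
print(indexed_prefixes("quick", 1, 3))  # custom range of 1 to 3 characters
```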
* Soft-deprecation of point/geo_point formats
Since GeoJSON and WKT are now common formats for all three types
(geo_shape, geo_point and point), we decided to soft-deprecate the
other point formats by listing the formats in this order:
* GeoJSON (object with keys `type` and `coordinates`)
* WKT `POINT(x y)`
* Object with keys `lat` and `lon` (or `x` and `y` for point)
* Array [lon,lat]
* String `"lat,lon"` (or `"x,y"` in point)
* String with geohash (only in `geo_point`)
The geohash is last because it is only in one field type.
The string version is second to last because it is the most controversial,
being the only version to reverse the coordinate order relative to all other
formats (for geo_point only, since the coordinates are not reversed
in point).
In addition we replaced many examples in both documentation and tests
to prioritize WKT over the plain string format.
Many remaining examples of array format or object with keys still exist
and could be replaced by, for example, GeoJSON, if we feel the need.
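To make the coordinate-order difference concrete, the following Python sketch renders one point in the first five formats, in the order listed above. The helper name and the sample coordinates are hypothetical, and the geohash form is left out because it needs a separate encoder.

```python
def point_formats(lat, lon):
    """Return one geo_point in the listed formats, in the documented order."""
    return [
        {"type": "Point", "coordinates": [lon, lat]},  # GeoJSON: lon first
        f"POINT ({lon} {lat})",                        # WKT: x (lon), then y (lat)
        {"lat": lat, "lon": lon},                      # object with keys
        [lon, lat],                                    # array: lon first
        f"{lat},{lon}",                                # string: lat first (reversed)
    ]


for value in point_formats(41.12, -71.34):
    print(value)
```

Only the plain string puts the latitude first, which is the reversal described above.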
* Incorrect quote position
Documents the `EMPTY` and `NONE` `flag` values for the `regexp` query.
Also documents the `""` (empty string) value, which is an alias for `ALL`.
Closes #81978.
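A minimal sketch of a `regexp` query body that uses these `flags` values. The builder function, field name and pattern are hypothetical; the client-side mapping of `""` to `ALL` only mirrors the documented alias, and Elasticsearch accepts the empty string itself.

```python
def regexp_query(field, pattern, flags="ALL"):
    """Build a `regexp` query body; an empty `flags` string is treated as `ALL`."""
    if flags == "":
        flags = "ALL"  # documented alias for the empty string
    return {"regexp": {field: {"value": pattern, "flags": flags}}}


print(regexp_query("user.id", "k.*y", flags="NONE"))
print(regexp_query("user.id", "k.*y", flags=""))
```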
Changes:
* Notes that the query string query's `default_field` and `fields` parameters support wildcards.
* Adds an xref to the `index.query.default_field` docs to the `default_field` parameter.
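For illustration, a query string query with a wildcard in `fields` might look like the body below. The `expand_fields` helper is a hypothetical, client-side approximation built on `fnmatch`; how Elasticsearch resolves the patterns against the mapping may differ in detail.

```python
from fnmatch import fnmatchcase

# Query body using a wildcard field pattern (field names are hypothetical).
query = {"query_string": {"query": "quick fox", "fields": ["title", "user.*"]}}


def expand_fields(patterns, mapped_fields):
    """Approximate which mapped fields a list of wildcard patterns would select."""
    return [f for f in mapped_fields if any(fnmatchcase(f, p) for p in patterns)]


mapped = ["title", "user.id", "user.name", "body"]
print(expand_fields(query["query_string"]["fields"], mapped))
```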
The current `multi_match` docs contain an erroneous reference to the `combined_fields` query. This updates the reference to point to the correct query.
Relates to https://github.com/elastic/elasticsearch/pull/76893
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.
Relates to #79309, #31619
Changes:
* Documents the `wildcard` parameter for the `wildcard` query. This parameter is an alias for the `value` parameter.
* Reorders the parameters alphabetically.
Closes #79711
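As a small sketch, the two bodies below should be equivalent, since `wildcard` is an alias for `value`. The field name, the pattern and the `pattern_of` helper are hypothetical.

```python
# Two spellings of the same wildcard query (field name and pattern are hypothetical).
with_value = {"wildcard": {"user.id": {"value": "ki*y"}}}
with_alias = {"wildcard": {"user.id": {"wildcard": "ki*y"}}}


def pattern_of(query, field):
    """Read the pattern from either the `value` parameter or its `wildcard` alias."""
    params = query["wildcard"][field]
    return params.get("value", params.get("wildcard"))


print(pattern_of(with_value, "user.id") == pattern_of(with_alias, "user.id"))
```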
As the script only has access to the nested document, this should be
documented.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Adds additional information about how Elasticsearch uses polygon orientation. Elasticsearch only uses a polygon's orientation to determine if it crosses the international dateline. If so, Elasticsearch splits the polygon at the dateline.
Closes #74891
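To show what "orientation" means here, the sketch below classifies a closed ring as clockwise or counterclockwise with the shoelace formula. It is a generic geometry example with hypothetical function names, not the way Elasticsearch detects a dateline crossing.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def orientation(ring):
    """Classify a closed ring (first vertex repeated at the end)."""
    return "counterclockwise" if signed_area(ring) > 0 else "clockwise"


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(orientation(square))
print(orientation(square[::-1]))
```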
Changes:
* Notes that you can't use cross-cluster search to run a terms lookup on a remote index.
* Removes a redundant sentence noting `_source` is enabled by default.
Closes #61364.
Changes:
* Use "geopoint" when not referring to the literal field type
* Use "geoshape" when not referring to the literal field type or query type
* Use "GeoJSON" consistently
The current `ids` option doesn't allow pinning a specific document in a
single index when searching over multiple indices. This introduces a
`documents` option, which is an array of `_id` and `_index`
fields to allow index-specific pins.
Closes https://github.com/elastic/elasticsearch/issues/67855.
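A sketch of what an index-specific pin could look like. The `documents` key is taken from this description, so check the released docs for the final parameter name; the index names, IDs, organic query and helper function are hypothetical.

```python
def pinned_query(documents, organic):
    """Build a pinned query whose pins name both the index and the document ID."""
    return {"pinned": {"documents": documents, "organic": organic}}


body = pinned_query(
    documents=[
        {"_index": "my-index-000001", "_id": "1"},
        {"_index": "my-index-000002", "_id": "4"},
    ],
    organic={"match": {"description": "iphone"}},
)
print(body)
```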
In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the limit has been bumped from 1024 to 4096.
The semantics will effectively change when we upgrade to Lucene 9. This PR
is only about agreeing on a migration strategy and documenting this change.
To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.
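To illustrate the difference between a per-`bool` limit and a limit on the whole query tree, here is a simplified Python sketch. The helpers are hypothetical and only count clauses in a query body; Lucene's own accounting, for example after query rewriting, may differ.

```python
OCCURS = ("must", "should", "filter", "must_not")


def as_list(clauses):
    """A bool occurrence may hold a single clause or a list of clauses."""
    return clauses if isinstance(clauses, list) else [clauses]


def children(query):
    """Direct clauses of a `bool` query, across all occurrence types."""
    return [c for occur in OCCURS for c in as_list(query["bool"].get(occur, []))]


def total_clauses(query):
    """Count leaf clauses across the whole query tree."""
    if "bool" not in query:
        return 1
    return sum(total_clauses(c) for c in children(query))


def largest_bool(query):
    """Largest number of direct clauses held by any single `bool` query."""
    if "bool" not in query:
        return 0
    kids = children(query)
    return max([len(kids)] + [largest_bool(c) for c in kids])


inner = {"bool": {"must": [{"term": {"tag": t}} for t in ("c", "d", "e")]}}
tree = {"bool": {"should": [{"term": {"tag": "a"}}, {"term": {"tag": "b"}}, inner]}}
print(largest_bool(tree), total_clauses(tree))
```

In this example no single `bool` query holds more than 3 clauses, but the tree as a whole holds 5, which is the number a tree-wide limit would look at.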
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
`field_masking_span` is the only span query that does not begin with
`span_`. This commit deprecates the existing name and adds a new
name `span_field_masking` to better fit with the other queries.
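A small sketch of moving a query body from the deprecated name to the new one. The `rename_deprecated` helper and the field names are hypothetical; the inner structure of the query is unchanged.

```python
def rename_deprecated(query):
    """Return a copy of the query that uses `span_field_masking` instead of the old name."""
    if "field_masking_span" not in query:
        return query
    renamed = dict(query)
    renamed["span_field_masking"] = renamed.pop("field_masking_span")
    return renamed


old = {
    "field_masking_span": {
        "query": {"span_term": {"text.stems": "fox"}},
        "field": "text",
    }
}
print(rename_deprecated(old))
```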
* Removes docs and references for the following `geo_shape` mapping parameters:
  * `tree`
  * `tree_levels`
  * `strategy`
  * `distance_error_pct`
* Updates a related breaking change.
Relates to #70850
This PR introduces a new query called `combined_fields` for searching multiple
text fields. It takes a term-centric view, first analyzing the query string
into individual terms, then searching for each term in any of the fields as though
they were one combined field. It is based on Lucene's `CombinedFieldQuery`,
which takes a principled approach to scoring based on the BM25F formula.
This query provides an alternative to the `cross_fields` `multi_match` mode. It
has simpler behavior and a more robust approach to scoring.
Addresses #41106.
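As a rough sketch of the term-centric view, the example below shows a `combined_fields` request body and a toy matcher that treats several fields as one combined field. The field names and the `matches_all_terms` helper are hypothetical, whitespace splitting stands in for real analysis, and no BM25F scoring is attempted.

```python
# Request body (field names are hypothetical).
body = {
    "query": {
        "combined_fields": {
            "query": "database systems",
            "fields": ["title", "abstract", "body"],
            "operator": "and",
        }
    }
}


def matches_all_terms(doc, query, fields):
    """True if every query term occurs in at least one of the given fields."""
    combined = " ".join(doc.get(f, "") for f in fields).lower().split()
    return all(term in combined for term in query.lower().split())


doc = {"title": "Database internals", "abstract": "A survey of storage systems"}
print(matches_all_terms(doc, "database systems", ["title", "abstract", "body"]))
```

Here `database` is found in the title and `systems` in the abstract, so the document matches even though no single field contains both terms.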
This adds a "note" on the docs for the script query pointing folks to
runtime fields because they are more flexible. It also translates the
request example into runtime fields.
Relates to #69291
Co-authored-by: Adam Locke <adam.locke@elastic.co>
When performing a multi_match in cross_fields mode, we group fields based on
their analyzer and create a blended query per group. Our docs claimed that the
group scores were combined through a boolean query, but they are actually
combined through a dismax that incorporates the tiebreaker parameter.
This commit updates the docs and adds a test verifying the behavior.
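The difference between the two combination strategies can be sketched with plain arithmetic. The functions below are hypothetical and the group scores are made-up numbers; the dismax formula is the usual best score plus `tie_breaker` times the sum of the other scores.

```python
def dis_max_score(group_scores, tie_breaker=0.0):
    """Best group score plus tie_breaker times the sum of the remaining scores."""
    best = max(group_scores)
    return best + tie_breaker * (sum(group_scores) - best)


def bool_sum_score(group_scores):
    """What a plain boolean combination would give: the sum of all group scores."""
    return sum(group_scores)


scores = [3.0, 1.0, 2.0]
print(dis_max_score(scores, tie_breaker=0.5), bool_sum_score(scores))
```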
The doc is misleading. It says: The following intervals search returns documents containing `my favorite food` **immediately** followed by `hot water` or `cold porridge`
`max_gaps` applies only to the match query and is not used for checking proximity with the other match. The example given actually says: `This search would match a my_text value of my favorite food is cold`
Co-authored-by: Julien Guay <guay_j@yahoo.fr>
Currently, if you write a date range query with numeric 'to' or 'from' bounds,
they can be interpreted as years if no format is provided. We use
"strict_date_optional_time||epoch_millis" in this case that can interpret inputs
like 1000 as the year 1000 for example.
This PR changes this to always interpret and parse numbers with the "epoch_millis"
parser if no other formatter was provided.
Closes #63680
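To see why the two readings differ so much, the sketch below interprets the number 1000 as epoch milliseconds, which is the behavior described above. The helper name is hypothetical.

```python
from datetime import datetime, timezone


def as_epoch_millis(value):
    """Interpret a numeric bound as milliseconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


moment = as_epoch_millis(1000)
print(moment.isoformat())  # one second after the epoch, not the year 1000
```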
The original description of per-field boosting is incorrect. Boosting a
field does not imply that it is more important relative to other fields.
It simply means that the score is multiplied by the supplied boost
value. Due to the differences in each field's term and document
statistics, it's not possible to imply relative importance of fields
based on the per-field boost value alone.
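A toy calculation shows why a boost alone does not settle relative importance. The `field^boost` parsing helper is hypothetical and the raw scores are made-up numbers standing in for each field's own statistics.

```python
def parse_boost(field_spec):
    """Split a `field^boost` spec into (field, boost); the boost defaults to 1.0."""
    name, sep, boost = field_spec.partition("^")
    return name, float(boost) if sep else 1.0


# Made-up raw scores: `body` scores higher simply because its statistics differ.
raw_scores = {"title": 0.5, "body": 4.0}

for spec in ("title^3", "body"):
    field, boost = parse_boost(spec)
    print(field, raw_scores[field] * boost)
```

Even with a boost of 3, the `title` score in this example stays below the unboosted `body` score.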
Added case insensitive support for regex queries.
Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master.
This can be removed when Lucene 8.7 is released.
Closes #59235
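For illustration, a `regexp` query with the `case_insensitive` parameter set, followed by a Python analogue of the effect. The field name and pattern are hypothetical, and Python's `re` module uses a different regular expression dialect from Lucene, so the helper only demonstrates the matching behavior.

```python
import re

# Query body (field name and pattern are hypothetical).
query = {"regexp": {"user.id": {"value": "k.*y", "case_insensitive": True}}}


def matches(pattern, term, case_insensitive=False):
    """Match the whole term against the pattern, optionally ignoring case."""
    flags = re.IGNORECASE if case_insensitive else 0
    return re.fullmatch(pattern, term, flags) is not None


print(matches("k.*y", "KIMCHY"), matches("k.*y", "KIMCHY", case_insensitive=True))
```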
Changes:
* Moves "Notes" sections for the joining queries and percolate query
pages to the parent page
* Adds related redirects for the moved "Notes" pages
* Assigns explicit anchor IDs to other "Notes" headings. This was required for
the redirects to work.
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.
Because the instructions used cleartext HTTP to install packages,
this commit also replaces HTTP links with HTTPS links where possible.
In addition, a few community links have been removed, as they do not seem
to exist anymore.
Moves the search sort docs from the deprecated 'Request Body Search'
page to a new subpage of 'Run a search'.
No substantive changes were made to the content.
Moves the highlighting docs from the deprecated 'Request Body Search'
chapter to a new subpage of the 'Run a search' section.
No substantive changes were made to the content.
This commit highlights the ability for geo_point fields to be
used in geo_shape queries. It also adds an explicit geo_point
example in the geo_shape query documentation.
Closes #56927.
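A sketch of a `geo_shape` query that targets a `geo_point` field with an envelope. The field name, the coordinates and the `in_envelope` helper are hypothetical; the helper only mimics the containment check for a single point and ignores envelopes that cross the dateline.

```python
# Envelope given as [[min_lon, max_lat], [max_lon, min_lat]].
envelope = [[13.0, 53.0], [14.0, 52.0]]

query = {
    "geo_shape": {
        "location": {  # `location` is assumed to be mapped as geo_point
            "shape": {"type": "envelope", "coordinates": envelope},
            "relation": "intersects",
        }
    }
}


def in_envelope(lon, lat, box):
    """True if the point lies inside the envelope (no dateline handling)."""
    (min_lon, max_lat), (max_lon, min_lat) = box
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


print(in_envelope(13.4, 52.5, envelope))
```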
The `lowercase_expand_terms`, `locale` and `all_fields` parameters for
`simple_query_string` have been deprecated and a no-op since at least 6.0,
so we can remove them in 8.0.
Warn about potential performance impact when a large number of fields
is used with query string query and no default field.
Re-adds content from #35570.
That content was erroneously removed in #45296.
Co-authored-by: Peter Dyson <peter.dyson@geekpete.com>
This commit adds a new point field that is able to index an arbitrary pair of values (x/y)
in Cartesian space. It only supports filtering using shape queries at the moment.
Looking into #50237, I realized that two of the examples given in the
documentation around date math rounding for range queries on date fields using
`gt` and `lt` are slightly off by a nanosecond. This PR changes them to the
bounds that are currently parsed using these parameters.
The terms-lookup section of our terms query docs currently states that the
index, id and path fields are optional. They should instead be marked
as required.
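A minimal client-side sketch of that requirement. The `validate_terms_lookup` helper, the index name and the field names are hypothetical; Elasticsearch performs its own validation when it parses the request.

```python
REQUIRED_LOOKUP_KEYS = ("index", "id", "path")


def validate_terms_lookup(lookup):
    """Raise ValueError if a terms lookup is missing a required parameter."""
    missing = [key for key in REQUIRED_LOOKUP_KEYS if key not in lookup]
    if missing:
        raise ValueError("terms lookup is missing: " + ", ".join(missing))
    return lookup


lookup = validate_terms_lookup({"index": "my-index-000001", "id": "2", "path": "color"})
query = {"terms": {"color": lookup}}
print(query)
```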
Previously, the boost in a script_score query was wrongly applied only to the subquery.
This commit makes sure that the boost is applied to the whole score
that comes out of script.
Closes #48465
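The effect of the fix can be sketched with simple arithmetic. The functions below are hypothetical models of the two behaviors, reading "applied only to the subquery" as scaling the subquery score before the script runs; the script `_score + 10` and the numbers are made up.

```python
# Query body (the inner query and script are hypothetical).
body = {
    "script_score": {
        "query": {"match": {"message": "elasticsearch"}},
        "script": {"source": "_score + 10"},
        "boost": 3,
    }
}


def score_before_fix(subquery_score, script, boost):
    """Assumed old behavior: the boost only scales the subquery score."""
    return script(boost * subquery_score)


def score_after_fix(subquery_score, script, boost):
    """New behavior: the boost scales the score that comes out of the script."""
    return boost * script(subquery_score)


def script(score):
    """Python stand-in for the `_score + 10` script."""
    return score + 10.0


print(score_before_fix(2.0, script, 3.0), score_after_fix(2.0, script, 3.0))
```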
Add a new cluster setting `search.allow_expensive_queries`, which
defaults to `true`. If set to `false`, certain queries that are
usually slow cannot be executed and an error message
is returned. The affected queries are:
- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled)
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries
Closes: #29050
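As a rough sketch, the setting can be changed through a cluster settings body like the one below, and the gate can be modeled as a simple check on the query type. The `check_allowed` helper is hypothetical and deliberately simplified: it looks only at the top-level query type and ignores the conditional cases in the list above, such as prefix queries with `index_prefixes` enabled or range queries on text and keyword fields.

```python
# Body for the cluster update settings API.
settings_body = {"persistent": {"search.allow_expensive_queries": False}}

# Query types from the list above that are unconditionally affected in this sketch.
EXPENSIVE_QUERY_TYPES = {
    "script", "fuzzy", "regexp", "prefix", "wildcard",
    "has_parent", "has_child", "parent_id", "nested",
    "script_score", "percolate",
}


def check_allowed(query, allow_expensive=True):
    """Raise ValueError if the top-level query type is blocked by the setting."""
    query_type = next(iter(query))
    if not allow_expensive and query_type in EXPENSIVE_QUERY_TYPES:
        raise ValueError(f"[{query_type}] queries cannot be executed when "
                         "'search.allow_expensive_queries' is set to false")
    return query


check_allowed({"match": {"message": "hello"}}, allow_expensive=False)
check_allowed({"fuzzy": {"user.id": {"value": "ki"}}}, allow_expensive=True)
```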
It is fairly common to filter the geo point candidates in
geohash_grid and geotile_grid aggregations according to some
viewable bounding box. This change introduces the option of
specifying this filter directly in the tiling aggregation.
This is even more relevant to `geo_shape`, where the bounds will restrict
the shape to be within the bounds.
This optional `bounds` parameter is parsed in the same way as
the bounds specified in the geo_bounding_box query.
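A sketch of a `geohash_grid` aggregation that carries its own `bounds`. The builder function, field name, precision and corner coordinates are hypothetical; the corners use the `top_left` and `bottom_right` form that the geo_bounding_box query accepts, written here as WKT points.

```python
def grid_aggregation(field, precision, top_left, bottom_right):
    """Build a geohash_grid aggregation restricted to a bounding box."""
    return {
        "geohash_grid": {
            "field": field,
            "precision": precision,
            "bounds": {"top_left": top_left, "bottom_right": bottom_right},
        }
    }


body = {
    "size": 0,
    "aggs": {
        "tiles-in-bounds": grid_aggregation(
            "location", 8, "POINT (4.21875 53.4375)", "POINT (5.625 52.03125)"
        )
    },
}
print(body)
```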