Commit Graph

168 Commits

Author SHA1 Message Date
Panagiotis Bailis 7563a724f0
Updating retriever documentation to better explain how filters are applied (#112201) 2024-08-26 16:15:31 +03:00
Larisa Motova 9ac0718d90
Add docs for shard level stats in node stats (#111082)
Fixes #111081
2024-08-13 14:59:03 +03:00
Mayya Sharipova 405e39660b
Support k parameter for knn query (#110233)
Introduce an optional k param for knn query

If k is not set, knn query has the previous behaviour:
- `num_candidates` docs are collected from each shard. These `num_candidates` docs
are used for combining results with other queries and aggregations on each shard.
- docs from all shards are merged to produce the top global `size` results

If k is set, the behaviour is instead as follows:
- `k` docs are collected from each shard. These `k` docs are used for
combining results with other queries and aggregations on each shard.
- similarly, docs from all shards are merged to produce the top global `size`
results.

Having a `k` param makes it more intuitive for users to address their needs.
They can also skip the `num_candidates` param for this query,
as it is more of an internal detail for tuning how knn search operates.

Closes #108473
2024-06-28 09:59:28 -04:00
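The per-shard behaviour described in #110233 can be sketched as a toy model. This is only an illustration of the commit message, not Elasticsearch's implementation: the function names and the score lists are made up, and a real knn search is approximate rather than an exact sort.

```python
def collect_per_shard(shard_scores, num_candidates, k=None):
    """Return the scores each shard contributes.

    Toy rule taken from the commit message: a shard contributes its top `k`
    docs when `k` is set, otherwise its top `num_candidates` docs.
    """
    limit = k if k is not None else num_candidates
    return [sorted(scores, reverse=True)[:limit] for scores in shard_scores]


def merge_global(per_shard, size):
    """Merge the per-shard results and keep the global top `size` scores."""
    merged = sorted((s for shard in per_shard for s in shard), reverse=True)
    return merged[:size]


# Hypothetical similarity scores on two shards.
shards = [[0.9, 0.8, 0.4, 0.3], [0.95, 0.5, 0.2, 0.1]]

without_k = collect_per_shard(shards, num_candidates=3)
with_k = collect_per_shard(shards, num_candidates=3, k=2)
top = merge_global(with_k, size=3)
```

In this model, setting `k` only changes how many docs each shard hands on; the final merge to `size` results is the same in both cases.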
Jim Ferenczi a6470fb86d
Fix cluster level dense vector stats (#107962)
The cluster-level dense vector stats return the total number of dense vector indices globally, including the replicas.
This commit fixes the total to only include the value count of the primary indices.
This change aligns with the docs stats which also reports the number of primary documents when used in cluster stats.
The indices stats API still reports granular results for replicas and primaries so the information is not lost.
2024-06-18 17:45:02 +01:00
Kathleen DeRusso 8529bf71f6
Add SparseVectorStats (#108793)
* Add SparseVectorStats

* Update to use mappings in engine

* Update to be unique to primary shards

* Fix doc

* Fix null error in test

* Cleanup

* fix yaml

* remove comment

* add version to yaml

* Revert whitespace changes to stats doc

* fix yml test

* Checkstyle

* Fix NPE in test

* Update docs/changelog/108793.yaml

* Add link to sparse_vector field type in docs

* PR feedback

* Flesh out test a bit more

* PR feedback - alphabetize placement in docs

* Fix doc change
2024-06-17 11:42:14 -04:00
Nick Tindall 3ecdd77e97
[DOCS] Align docs to implementation for timeout parameters (#108593)
* [DOCS] Fix documentation for timeout-related parameters

Closes #108224
2024-05-16 13:05:39 +10:00
Nick Tindall 68a8664c21
[DOCS] Fix stored_fields parameter description (#98385) (#108445)
(referenced from get and multi_get API docs)

Closes #98385
2024-05-09 03:17:10 -04:00
David Turner fc287bde8b
Interpret `?timeout=-1` as infinite ack timeout (#107675)
APIs which perform cluster state updates typically accept the
`?master_timeout=` and `?timeout=` parameters to respectively set the
pending task queue timeout and the acking timeout for the cluster state
update. Both of these parameters accept the value `-1`, but
`?master_timeout=-1` means to wait indefinitely whereas `?timeout=-1`
means the same thing as `?timeout=0`, namely that acking times out
immediately on commit.

There are some situations where it makes sense to wait for as long as
possible for nodes to ack a cluster state update. In practice this wait
is bounded by other mechanisms (e.g. the lag detector will remove the
node from the cluster after a couple of minutes of failing to apply
cluster state updates) but these are not really the concern of clients.

Therefore with this commit we change the meaning of `?timeout=-1` to
mean that the acking timeout is infinite.
2024-04-30 09:54:15 -04:00
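The change in meaning of `?timeout=-1` from #107675 can be sketched as follows. The helper `ack_timeout_seconds` and its `legacy` flag are hypothetical, and values are treated as plain seconds for simplicity; the real parameter accepts time values with units.

```python
import math


def ack_timeout_seconds(raw, legacy=False):
    """Interpret a `?timeout=` value given as whole seconds (e.g. "30" or "-1").

    legacy=True models the old behaviour, where -1 meant the same as 0
    (acking times out immediately on commit). legacy=False models the new
    behaviour, where -1 means the acking timeout is infinite.
    """
    value = int(raw)
    if value == -1:
        return 0.0 if legacy else math.inf
    if value < 0:
        raise ValueError(f"unsupported timeout value: {raw!r}")
    return float(value)


old_meaning = ack_timeout_seconds("-1", legacy=True)
new_meaning = ack_timeout_seconds("-1")
```

Here `math.inf` stands in for "wait indefinitely"; as the commit message notes, other mechanisms still bound the wait in practice.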
Liam Thompson 33a71e3289
[DOCS] Refactor book-scoped variables in `docs/reference/index.asciidoc` (#107413)
* Remove `es-test-dir` book-scoped variable

* Remove `plugins-examples-dir` book-scoped variable

* Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables

- In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed.
- In `sql/index.asciidoc`, the `:sql-tests:` path was updated to a fuller path
- In `esql/index.asciidoc`, the `:esql-tests:` path was updated likewise

* Replace `es-repo-dir` with `es-ref-dir`

* Move `:include-xpack: true` to the few files that use it, remove from index.asciidoc
2024-04-17 14:37:07 +02:00
David Turner ccbb5badce
Fix support for infinite `?master_timeout` (#107050)
Specifying `?master_timeout=-1` on an API which performs a cluster state
update means that the cluster state update task will never time out
while waiting in the pending tasks queue. However this parameter is also
re-used in a few places where a timeout of `-1` means something else,
typically to timeout immediately. This commit fixes those places so that
`?master_timeout=-1` consistently means to wait forever.
2024-04-10 18:32:38 +01:00
Parker Timmins e59dd0b60e
Add total size in bytes to doc stats (#106840) 2024-03-29 09:40:37 -05:00
Tommaso Teofili 7bff3b3bec
Add modelId and modelText to KnnVectorQueryBuilder (#106068)
* Add modelId and modelText to KnnVectorQueryBuilder

Use QueryVectorBuilder within KnnVectorQueryBuilder to make it
possible to perform knn queries also when a query vector is not
immediately available. Supplying a text_embedding query_vector_builder
with model_text and model_id instead of the query_vector will result
in the generation of a query_vector by calling inference on the
specified model_id with the supplied model_text (during query
rewrite). This is consistent with the way query vectors are built
from model_id / model_text in KnnSearchBuilder (DFS phase).
2024-03-18 16:13:38 +01:00
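A minimal sketch of the two request shapes that #106068 describes for the knn query: one with an explicit `query_vector`, one with a `text_embedding` `query_vector_builder`. The helper name `knn_query`, the field name, and the model id are hypothetical; the body layout follows the parameter names in the commit message and should be checked against the query DSL reference.

```python
import json


def knn_query(field, query_vector=None, model_id=None, model_text=None):
    """Build a knn query body from either a vector or a model_id/model_text pair."""
    knn = {"field": field}
    if query_vector is not None:
        # The vector is already available, so pass it directly.
        knn["query_vector"] = query_vector
    elif model_id is not None and model_text is not None:
        # No vector yet: ask for one to be generated by inference on the model.
        knn["query_vector_builder"] = {
            "text_embedding": {"model_id": model_id, "model_text": model_text}
        }
    else:
        raise ValueError("provide query_vector, or both model_id and model_text")
    return {"query": {"knn": knn}}


with_vector = knn_query("my_vector", query_vector=[0.1, 0.2, 0.3])
with_builder = knn_query("my_vector", model_id="my-model", model_text="hello")
print(json.dumps(with_builder, indent=2))
```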
Panagiotis Bailis d471ccb5bb
Adding support for hex-encoded byte vectors on knn-search (#105393) 2024-03-13 09:24:51 +02:00
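As an illustration of what a hex-encoded byte vector might look like, the sketch below converts between a list of signed bytes and a hex string. It assumes one byte per dimension in two's complement form, which the commit title does not spell out; the helper names are hypothetical.

```python
def encode_byte_vector(values):
    """Encode signed bytes (-128..127) as a hex string, two hex digits per dimension."""
    for v in values:
        if not -128 <= v <= 127:
            raise ValueError(f"value out of byte range: {v}")
    # `v & 0xFF` maps a negative value to its two's complement unsigned byte.
    return bytes(v & 0xFF for v in values).hex()


def decode_byte_vector(hex_string):
    """Decode a hex string back into a list of signed bytes."""
    raw = bytes.fromhex(hex_string)
    return [b - 256 if b > 127 else b for b in raw]


encoded = encode_byte_vector([-5, 9, -12])
decoded = decode_byte_vector(encoded)
```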
Jack Conradson 68b0acac8f
Add retrievers using the parser-only approach (#105470)
This enhancement adds a new abstraction to the _search API called "retriever." A 
retriever is something that returns top hits. This adds three initial retrievers called
"standard", "knn", and "rrf". The retrievers use a parser-only approach where they
are parsed and then translated into a SearchSourceBuilder to execute the actual
search.
---------

Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
2024-03-12 10:11:55 -07:00
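A rough sketch of how the three retrievers named in #105470 could be nested in a `_search` request body. The helper functions, field names, and query values are hypothetical, and the exact body layout is an assumption that should be verified against the retriever documentation.

```python
import json


def standard_retriever(query):
    """Wrap a regular query as a "standard" retriever."""
    return {"standard": {"query": query}}


def knn_retriever(field, query_vector, k, num_candidates):
    """Describe a vector search as a "knn" retriever."""
    return {
        "knn": {
            "field": field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }


def rrf_retriever(*retrievers):
    """Combine the top hits of several child retrievers with "rrf"."""
    return {"rrf": {"retrievers": list(retrievers)}}


body = {
    "retriever": rrf_retriever(
        standard_retriever({"match": {"title": "vector search"}}),
        knn_retriever("my_vector", [0.1, 0.2, 0.3], k=10, num_candidates=50),
    )
}
print(json.dumps(body, indent=2))
```

Because the retrievers are parsed and then translated into a SearchSourceBuilder, a body like this is only a description of the search; no new execution path is implied by the sketch.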
Panagiotis Bailis 7ce8d76559
Making k and num_candidates optional for knn search (#101209) 2024-02-01 15:43:09 +02:00
Benjamin Trent 7fde357f3a
Improve docs around knn similarity search (#103158)
Adding equations to the docs around how to best calculate similarity & score. The similarity parameter for search was added in 8.8.

The max-inner-product mentions will be removed for all versions before 8.11 when backporting.

closes: https://github.com/elastic/elasticsearch/issues/102924
2023-12-11 14:56:16 -05:00
Abdon Pijpelink 70128f5b74
[DOCS] Mark 'ignore_throttled' deprecated in all docs (#101838) 2023-11-07 13:03:49 +01:00
Keith Massey 92ec9d6605
Add executed pipelines to bulk api response (#100031)
This change allows users to pass a new list_executed_pipelines parameter
to the bulk API, which results in an executed_pipelines list being returned.
2023-10-17 09:39:09 -05:00
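A minimal sketch of a bulk request that sets the `list_executed_pipelines` parameter from #100031. It only builds the path and the newline-delimited body; the helper name, index name, and documents are hypothetical, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def bulk_request(index, docs, list_executed_pipelines=False):
    """Return (path, ndjson_body) for a bulk request that indexes `docs`."""
    params = {"list_executed_pipelines": "true"} if list_executed_pipelines else {}
    path = f"/{index}/_bulk"
    if params:
        path += "?" + urlencode(params)
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # action line
        lines.append(json.dumps(doc))            # document source line
    # A bulk body is newline-delimited JSON and must end with a newline.
    return path, "\n".join(lines) + "\n"


path, body = bulk_request(
    "my-index", [{"message": "a"}, {"message": "b"}], list_executed_pipelines=True
)
```

With the parameter set, the commit message says an `executed_pipelines` list is returned in the response.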
Marantidis Kiriakos ea42c2e076
boxplot support for transform 52189 (#96515) 2023-07-24 10:11:26 +02:00
debadair 777598d602
[DOCS] Remove redirect pages (#88738)
* [DOCS] Remove manual redirects

* [DOCS] Removed refs to modules-discovery-hosts-providers

* [DOCS] Fixed broken internal refs

* Fixing bad cross links in ES book, and adding redirects.asciidoc[] back into docs/reference/index.asciidoc.

* Update docs/reference/search/point-in-time-api.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/setup/restart-cluster.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/sql/endpoints/translate.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/snapshot-restore/restore-snapshot.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update repository-azure.asciidoc

* Update node-tool.asciidoc

* Update repository-azure.asciidoc

---------

Co-authored-by: amyjtechwriter <61687663+amyjtechwriter@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2023-05-24 12:32:46 +01:00
Jean-Pierre Matsumoto 87b8f1cf73
Bad ref to 'node_id' parameter in Task Mgt doc (#90380)
* Bad ref to 'node_id' parameter in Task Mgt doc

Here in documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/8.4/tasks.html#tasks-api-query-params

Name of parameter to filter node is `nodes`.

* Change 'node_id' into 'nodes' and add 'nodes' as a common param

---------

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-05-03 11:55:29 +02:00
Przemysław Witek 2b70165ffd
[Transform] Allow specifying destination index aliases in the Transform's dest config (#94943) 2023-04-17 15:08:43 +02:00
Benjamin Trent f23b906891
Add new `similarity` field to `knn` clause in `_search` (#94828)
This adds a new parameter to `knn` that allows filtering nearest neighbor results that are outside a given similarity.

`num_candidates` and `k` are still required, as they control the nearest-neighbor vector search accuracy and exploration. For each shard the query will search `num_candidates` and only keep those that are within the provided `similarity` boundary, and then finally reduce to only the global top `k` as normal.

For example, when using the `l2_norm` indexed similarity value, this could be considered a `radius` post-filter on `knn`.

relates to: https://github.com/elastic/elasticsearch/issues/84929 && https://github.com/elastic/elasticsearch/pull/93574
2023-03-28 15:29:01 -04:00
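The `similarity` behaviour described in #94828 can be sketched as a toy model for the `l2_norm` case, where `similarity` acts like a radius. This uses an exact brute-force search with made-up vectors and hypothetical function names, whereas the real search is approximate.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shard_search(vectors, query, num_candidates, similarity):
    """Take the `num_candidates` nearest docs on a shard, then drop those
    whose distance exceeds the `similarity` radius."""
    nearest = sorted(
        (l2_distance(vec, query), doc_id) for doc_id, vec in vectors.items()
    )[:num_candidates]
    return [(dist, doc_id) for dist, doc_id in nearest if dist <= similarity]


def knn_with_similarity(shards, query, k, num_candidates, similarity):
    """Merge the filtered per-shard hits and reduce to the global top `k`."""
    hits = [
        hit
        for shard in shards
        for hit in shard_search(shard, query, num_candidates, similarity)
    ]
    return [doc_id for _, doc_id in sorted(hits)[:k]]


shards = [
    {"a": [1.0, 0.0], "b": [3.0, 0.0]},
    {"c": [0.0, 2.0], "d": [0.0, 5.0]},
]
result = knn_with_similarity(
    shards, [0.0, 0.0], k=3, num_candidates=2, similarity=2.5
)
```

Even with `k=3`, only two docs come back here, because the radius filter is applied before the final reduction to `k`.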
Keith Massey ebb860d1af
Adding to the documentation and tests for the _none pipeline (#93057) 2023-01-23 14:09:57 -06:00
István Zoltán Szabó e16bee0e72
[DOCS] Adds bullet points to the statuses of the health object in transform stats API docs (#91790) 2022-11-22 15:18:08 +01:00
Craig Taverner 81d5859f61
Added documentation for cartesian-bounds aggregation (#91623)
* Added documentation for cartesian-bounds aggregation

* Fixed rounding errors in docs tests
2022-11-18 11:00:41 +01:00
István Zoltán Szabó e715f3c737
[DOCS] Adds KNN object sub-properties individually to common params (#91503) 2022-11-10 17:23:55 +01:00
István Zoltán Szabó ed452fb53d
[DOCS] Adds knn object to common parameters (#91464) 2022-11-10 11:21:01 +01:00
Przemysław Witek 95f484c4fd
[Transform] Expand the docs section regarding mappings deduction in transform's dest index (#91077) 2022-10-24 13:43:22 +02:00
Craig Taverner 4c5d24610f
Centroid aggregation for cartesian points and shapes (#89216)
Added Cartesian support for centroid aggregation

* First draft of cartesian-centroid docs
  However, this is largely a duplicate of geo-centroid docs since they are essentially identical behaviour. We should consider merging them.
* Work on isAggregatable caused a minor logic conflict. When that work was done, Point and Shape were not aggregatable, but now they are.
2022-09-28 17:14:30 +02:00
István Zoltán Szabó 45646b78e2
[DOCS] Adds missing_bucket setting to transform APIs (#90111) 2022-09-19 15:22:48 +02:00
István Zoltán Szabó 9a71d1fa78
[DOCS] Clarifies retention policy for transforms (#89685) 2022-08-29 13:17:15 +02:00
István Zoltán Szabó accf737145
[DOCS] Adds unattended setting to transforms API docs. (#89335) 2022-08-29 11:46:52 +02:00
István Zoltán Szabó 226b8a260e
[DOCS] Modifies the description of frequency. (#89128) 2022-08-08 15:44:00 +02:00
Andrey ebde65d2a2
Remove suggest flag from index stats docs (#85479) 2022-07-14 12:50:53 -04:00
Przemysław Witek 8656a29675
[Transform] Implement per-transform num_failure_retries setting. (#87361) 2022-06-09 15:22:06 +02:00
Nik Everett 86effae55c
Docs: Data streams only support `create` (#87263)
This removes "data streams" from the docs for the `index`, `delete`,
and `update` actions because data streams only support the `create`
action.

Closes #87231
2022-06-08 13:41:42 -04:00
Przemysław Witek 70e37ae7c6
[Transform] Support `range` aggregation in transform (#86501) 2022-05-16 15:21:00 +02:00
James Rodewig f9a64b2e86
[DOCS] Fix `ignore_unavailable` parameter definition (#84071)
The current `ignore_unavailable` definition is a bit misleading. The parameter primarily determines if a request that targets a missing or closed index returns an error.
2022-02-17 08:24:06 -05:00
Lisa Cawley 0a16177f40
[DOCS] Fix formatting in cat transforms API (#82899) 2022-01-20 17:45:43 -08:00
Przemysław Witek 7be74a8046
Introduce `deduce_mappings` transform setting (#82256) 2022-01-18 09:01:23 +01:00
Ievgen Degtiarenko 11b52619c5
do not scroll if max docs is less than scroll size (update/delete by query) (#81654)
This change avoids opening a scroll during reindex/delete_by_query/update_by_query
if the configured max_docs is less than or equal to the number of documents returned by the scroll batch.
2021-12-21 15:26:51 +01:00
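The decision described in #81654 can be sketched as a single predicate. The function name `needs_scroll` and the treatment of an unset `max_docs` are assumptions made for illustration, not the actual implementation.

```python
def needs_scroll(max_docs, scroll_size):
    """Return True if a scroll is needed to process the request.

    If `max_docs` is unset (None), every matching document must be processed,
    so this sketch assumes a scroll is needed. If `max_docs` fits within a
    single batch of `scroll_size` documents, no scroll needs to be opened.
    """
    if max_docs is None:
        return True
    return max_docs > scroll_size


fits_in_one_batch = needs_scroll(max_docs=100, scroll_size=1000)
spans_batches = needs_scroll(max_docs=1001, scroll_size=1000)
```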
Lisa Cawley 4ed6e8ad3c
[DOCS] Adds missing timeout parameter to transform APIs (#81129) 2021-12-02 13:28:28 -08:00
James Rodewig cf818edcde
[DOCS] Fix syntax error in bulk `dynamic_templates` docs (#81264) 2021-12-02 14:05:21 -05:00
Lisa Cawley 8ab03d021c
[DOCS] Edits reset transforms API (#81027) 2021-11-25 08:40:50 -08:00
James Rodewig 7940e0777c
[DOCS] Re-add several query params to search API docs (#79716)
PR #55884 removed documentation for several query parameters from the search API
docs. During tests, I failed to notice that these are valid parameters but require other parameters to use.

Changes:

* Notes the following search API parameters require the `q` query string parameter:

  * `analyzer`
  * `analyze_wildcard`
  * `default_operator`
  * `df`
  * `lenient`

* Notes the following search API parameters require the `suggest_field` and `suggest_text` query parameters:

  * `suggest_mode`
  * `suggest_size`

* Re-adds the above parameters to the search API docs.

These changes also affect API documentation that reuses the search API parameters:

* Delete by query API
* Update by query API
* Count API
* Explain API
* Validate API

Closes #79674
2021-10-25 11:58:54 -04:00
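The parameter dependencies listed in #79716 can be expressed as a small checker. This is a hypothetical client-side helper; the commit message does not say how the server treats a request that omits the required parameters.

```python
# Parameters that only apply together with the `q` query string parameter.
REQUIRES_Q = {"analyzer", "analyze_wildcard", "default_operator", "df", "lenient"}
# Parameters that only apply together with `suggest_field` and `suggest_text`.
REQUIRES_SUGGEST = {"suggest_mode", "suggest_size"}


def missing_prerequisites(params):
    """Map each dependent parameter in `params` to the prerequisites it lacks."""
    problems = {}
    for name in sorted(params):
        if name in REQUIRES_Q and "q" not in params:
            problems[name] = ["q"]
        elif name in REQUIRES_SUGGEST:
            missing = [
                p for p in ("suggest_field", "suggest_text") if p not in params
            ]
            if missing:
                problems[name] = missing
    return problems


incomplete = missing_prerequisites({"df": "title", "suggest_mode": "always"})
complete = missing_prerequisites({"q": "kimchy", "df": "user"})
```

The same check would apply to the APIs that reuse the search parameters, such as delete by query and count.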
Przemysław Witek 1595d3a20f
[Transform] Add _meta field to TransformConfig (#79003) 2021-10-15 08:12:03 +02:00
István Zoltán Szabó 81851684d1
[DOCS] Fixes description of index_total property for GET Transforms stats API docs. (#77354) 2021-09-08 08:53:15 +02:00
James Rodewig cfae69717a
[DOCS] Update anchor and add redirect for aliases (#77349)
PRs #73062 and #73043 repurposed the `alias` anchor for a new guide for index
and data stream aliases. Previously, this anchor was used for our field alias
documentation.

Repurposing the anchor has caused continuity errors for users selecting
different versions of the ES docs. It could also cause confusion for users with
a `/current/` link to the `alias` page.

This updates the anchor for the alias guide and adds a redirect page to
disambiguate the `alias` anchor.

It also fixes a bread crumb issue for redirects following the 'Modifying your
Data' redirect page.

Closes #77034.
2021-09-07 09:42:42 -04:00
Przemysław Witek ec07e4213e
[Transform] Rename interim_results to align_checkpoints (#76609) 2021-08-18 13:58:50 +02:00