elasticsearch

Commit Graph

Author	SHA1	Message	Date
Kathleen DeRusso	7a1d532ffb	Pass over Sparse Vector docs for correctness (#110282 ) * Remove legacy mentions of text expansion queries * Add missing query_vector param to sparse_vector query docs * Fix formatting errors in sparse vector query dsl doc * Remove unnecessary test setup block	2024-07-02 13:37:25 -04:00
George Wallace	dea593db3f	Update behavioral-analytics-start.asciidoc (#110271 )	2024-06-28 09:01:48 -06:00
Kathleen DeRusso	19fc0d9cad	Deprecate text_expansion and weighted_tokens queries (#109880 )	2024-06-27 13:24:57 -04:00
István Zoltán Szabó	31f0253b43	[DOCS] Adds link to ES-Cohere notebook and clarifies requirements. (#110195 )	2024-06-26 17:22:40 +02:00
Pius	79623c7609	Update search-application-api.asciidoc (#110113 ) Add a subsection about cross cluster search support (or the lack of).	2024-06-26 12:20:28 +02:00
Benjamin Trent	1c1733d823	Add some docs explaining filter performance and behavior for HNSW (#110108 )	2024-06-25 08:42:24 -04:00
Kathleen DeRusso	41a61b069b	Mark Query Rules as GA (#110004 ) * Mark query rules APIs as stable * Remove preview label from docs * Update docs/changelog/110004.yaml	2024-06-21 15:26:51 -04:00
Benjamin Trent	3aed0afb2b	Add new int4 quantization to dense_vector (#109317 ) This adds a new quantization mechanism for HNSW and flat indices. Here we add `int4` quantization via the `int4_hnsw` and `int4_flat` index types. This quantization methodology further reduces the memory required for fast HNSW, meaning that the memory required is 8x smaller than with regular float32 values. An 8x reduction means that 1M 1024-dimension vectors go from requiring 3.8GB to 477MB. Recall continues to stay steady; there is some reduction, which is recoverable via slight oversampling and reranking. For example, over 500k CohereV3 vectors, only 5 extra vectors need to be gathered to achieve over 0.98 recall in a brute-force scenario. ![recall](https://github.com/elastic/elasticsearch/assets/4357155/b47a79d0-020d-4baa-8199-41a932df00f7)	2024-06-18 00:15:43 +10:00
Benjamin Trent	a5fbfe81b2	Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11	2024-06-07 07:24:43 -04:00
Panagiotis Bailis	1c3b3d8f11	Adding support for explain in rrf (#108682 )	2024-06-07 11:09:06 +03:00
Benjamin Trent	d3561f9cf3	Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11	2024-06-06 18:22:08 -04:00
István Zoltán Szabó	d89dae2a32	[DOCS] Modifies semantic search-related docs to refer to the `semantic_text` workflow (#109418 ) Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>	2024-06-06 16:45:46 +02:00
Benjamin Trent	ac53d6020b	Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11	2024-06-05 12:38:23 -04:00
Mark J. Hoy	80a22ec046	[Inference API] Add Docs for Mistral Embedding Support for the Inference API (#109319 ) * Initial docs for put-inference for Mistral * adds mistral embeddings to tutorial; add changelog * update mistral text and dimensions * fix mistral spelling error * fix azure AI studio; fix Mistral label * fix auto-formatted items * change pipeline button back to azure openai * put proper Azure AI Studio include in * fix missing azure-openai; fix huggingface hidden * fix mistral tab for reindex * re-add Mistral service settings to put inference	2024-06-05 11:23:29 -04:00
Benjamin Trent	9cd123d6cc	Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11	2024-06-02 16:46:19 -04:00
István Zoltán Szabó	95ce898436	[DOCS] Adds docs to semantic text (#108311 ) Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com> Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co>	2024-05-31 16:56:07 +02:00
elasticsearchmachine	7b5925f4b6	Merge remote-tracking branch 'origin/main' into lucene_snapshot	2024-05-30 10:01:52 +00:00
István Zoltán Szabó	1413c67d99	[DOCS] Amends inference reference docs and tutorials (#109159 ) * [DOCS] Fixes inference tutorial widgets. * [DOCS] Adds link to notebooks, rearranges sections in PUT inference API docs.	2024-05-29 17:43:10 +02:00
Liam Thompson	b6241711ef	[DOCS] Update CCS matrix for 8.14 (#109142 )	2024-05-29 14:00:19 +02:00
ChrisHegarty	45a51d558c	Merge branch 'main' into lucene_snapshot	2024-05-23 14:03:51 +01:00
Kathleen DeRusso	74d7010a8f	Rename rule query and add support for multiple rulesets (#108831 )	2024-05-22 15:20:34 -04:00
Panagiotis Bailis	06957f6e31	Adding score from RankDoc to SearchHit (#108870 )	2024-05-22 15:43:49 +03:00
Kathleen DeRusso	3911061869	Update Search Applications docs with more introductory information about search templates (#108697 ) * Update Search Applications docs with more introductory information about search templates * Add docs tests * Skip test * Fix test * Unskip test * Add comment RE: reasoning behind test setup	2024-05-22 08:04:14 -04:00
elasticsearchmachine	0ce5dadc6b	Merge remote-tracking branch 'origin/main' into lucene_snapshot	2024-05-15 10:02:23 +00:00
Liam Thompson	37668fa418	[DOCS] Add crosslink to update retriever.asciidoc (#108608 ) Link to https://www.elastic.co/guide/en/elasticsearch/reference/master/retrievers-overview.html from API reference	2024-05-15 10:10:08 +02:00
ChrisHegarty	e8faece732	Merge branch 'main' into lucene_snapshot	2024-05-13 11:35:51 +01:00
Mayya Sharipova	2337eb05a0	Unified Highlighter to support matched_fields (#107640 ) Add support to the Unified highlighter to combine matches on multiple fields to highlight a single field: "matched_fields". Based on Lucene PR: https://github.com/apache/lucene/pull/13268 Lucene PR is based on the concept of masked fields where masked fields are different from the original highlighted field. This PR in Elasticsearch uses the already existing highlighter parameter "matched_fields".	2024-05-09 10:35:29 -04:00
István Zoltán Szabó	06a0758769	[DOCS] Fixes typo in Cohere ES tutorial (#108456 ) * [DOCS] Fixes typo in Cohere ES tutorial. * [DOCS] Fixes list.	2024-05-09 15:09:11 +02:00
István Zoltán Szabó	fa2f81353e	[DOCS] Adds complete Cohere tutorial (#108415 )	2024-05-09 09:59:56 +02:00
Liam Thompson	b2ebaeee7b	[DOCS] Add retrievers overview (#107959 )	2024-05-07 18:20:49 +02:00
Liam Thompson	1be1110740	[DOCS] Clarify `retriever` is not API (#108295 )	2024-05-06 15:52:25 +02:00
Michael Peterson	a451511e3a	Change skip_unavailable default value to true (#105792 ) In order to improve the experience of cross-cluster search, we are changing the default value of the remote cluster `skip_unavailable` setting from `false` to `true`. This setting causes any cross-cluster _search (or _async_search) to entirely fail when any remote cluster with `skip_unavailable=false` is either unavailable (connection to it fails) or if the search on it fails on all shards. Setting `skip_unavailable=true` allows partial results from other clusters to be returned. In that case, the search response cluster metadata will show a `skipped` status, so the user can see that no data came in from that cluster. Kibana also now leverages this metadata in the cross-cluster search responses to allow users to see how many clusters returned data and drill down into which clusters did not (including failure messages). Currently, the user/admin has to specifically set the value to `true` in the configs, like so: ``` cluster: remote: remote1: seeds: 10.10.10.10:9300 skip_unavailable: true ``` even though that is probably what search admins want in the vast majority of cases. Setting `skip_unavailable=false` should be a conscious (and probably rare) choice by an Elasticsearch admin that a particular cluster's results are so essential to a search (or visualization in dashboard or Discover panel) that no results at all should be shown if it cannot return any results.	2024-04-29 15:53:47 -04:00
eyalkoren	ee262954ee	Adding aggregations support for the `_ignored` field (#101373 ) Enables aggregations on the _ignored metadata field replacing the stored field with doc values.	2024-04-29 16:41:34 +02:00
Jim Ferenczi	4380cd1bd5	Allow rescorer with field collapsing (#107779 ) This change adds the support for rescoring collapsed documents. The rescoring is applied on the top document per group on each shard. Closes #27243	2024-04-29 08:48:12 +01:00
Panagiotis Bailis	fdefe09041	Fix for from parameter when using sub_searches and rank (#106253 )	2024-04-25 20:11:44 +03:00
Luca Cavanna	223e7f829b	Avoid attempting to load the same empty field twice in fetch phase (#107551 ) During the fetch phase, there's a number of stored fields that are requested explicitly or loaded by default. That information is included in `StoredFieldsSpec` that each fetch sub phase exposes. We attempt to provide stored fields that are already loaded to the fields lookup that scripts as well as value fetchers use to load field values (via `SearchLookup`). This is done in `PreloadedFieldLookupProvider.` The current logic makes available values for fields that have been found, so that scripts or value fetchers that request them don't load them again ad-hoc. What happens though for stored fields that don't have a value for a specific doc, is that they are treated like any other field that was not requested, and loaded again, although they will not be found, which causes overhead. This change makes available to `PreloadedFieldLookupProvider` the list of required stored fields, so that it can better distinguish between fields that we already attempted to load (although we may not have found a value for them) and those that need to be loaded ad-hoc (for instance because a script is requesting them for the first time). This is an existing issue, that has become evident as we moved fetching of metadata fields to `FetchFieldsPhase`, that relies on value fetchers, and hence on `SearchLookup`. We end up attempting to load default metadata fields (`_ignored` and `_routing`) twice when they are not present in a document, which makes us call `LeafReader#storedFields` additional times for the same document providing a `SingleFieldVisitor` that will never find a value. Another existing issue that this PR fixes is for the `FetchFieldsPhase` to extend the `StoredFieldsSpec` that it exposes to include the metadata fields that the phase is now responsible for loading. 
That results in `_ignored` being included in the output of the debug stored fields section when profiling is enabled. The fact that it was previously missing is an existing bug (it was missing in `StoredFieldLoader#fieldsToLoad`). Yet another existing issue that this PR fixes is that `_id` has until now always been loaded on demand when requested via fetch fields or script. That is because it is not part of the preloaded stored fields that the fetch phase passes over to the `PreloadedFieldLookupProvider`. That causes overhead as the field has already been loaded, and should not be loaded once again when explicitly requested.	2024-04-17 19:37:04 +02:00
Liam Thompson	33a71e3289	[DOCS] Refactor book-scoped variables in `docs/reference/index.asciidoc` (#107413 ) * Remove `es-test-dir` book-scoped variable * Remove `plugins-examples-dir` book-scoped variable * Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables - In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed. - In `sql/index.asciidoc`, the `:sql-tests:` path was updated to fuller path - In `esql/index.asciidoc`, the `:esql-tests:` path was updated idem * Replace `es-repo-dir` with `es-ref-dir` * Move `:include-xpack: true` to few files that use it, remove from index.asciidoc	2024-04-17 14:37:07 +02:00
Salvatore Campagna	4dfcb0897e	Fetch meta fields in FetchFieldsPhase using ValueFetcher (#106325 ) Here we extract the logic to populate metadata fields such as _ignored, _routing, _size and the deprecated _type into FetchFieldsPhase so that we can use the ValueFetcher interface to retrieve field values. This allows us to fetch values no matter if the Mapper uses stored or doc values.	2024-04-15 11:02:18 +02:00
István Zoltán Szabó	afb492272a	[DOCS] Adds HuggingFace example to inference API tutorial (#107298 )	2024-04-10 17:57:18 +02:00
Bogdan Pintea	f9ae6db319	ESQL: Add docs for the OPTIONS directive (#107013 ) This adds the docs for the newly added `OPTIONS` directive to `FROM`.	2024-04-03 16:23:36 +02:00
Liam Thompson	573c03262f	[Docs] Fix CCS matrix for 8.13 (#107028 )	2024-04-03 10:54:49 +02:00
Albert Zaharovits	df0fd30e7a	[Doc] Privileges required to retrieve the status of async searches Document that users can retrieve the status of the async searches they submitted without any extra privileges.	2024-04-02 09:35:02 +03:00
Benjamin Trent	89bf4b33e8	Make int8_hnsw our default index for new dense-vector fields (#106836 ) For float32, there is no compelling reason to use all the memory required by default for HNSW. Using `int8_hnsw` provides a much saner default when it comes to cost vs relevancy. So, on all new indices that use `dense_vector` and want to index them for fast search, we will default to `int8_hnsw`. Users can still customize their parameters, or prefer `hnsw` over float32 if they so desire.	2024-04-01 08:23:32 -04:00
Albert Zaharovits	b4938e1645	Query API Key Information API support for the `typed_keys` request parameter (#106873 ) The typed_keys request parameter is the canonical parameter, also used in the regular index _search endpoint, to return the types of aggregations in the response. This is required by typed language clients of the _security/_query/api_key endpoint that are using aggregations. Closes #106817	2024-03-29 09:24:52 +02:00
Jack Conradson	5ef0b57f77	Remove rank and sub_searches elements from documentation (#106827 ) This change removes the technical preview elements rank and sub_searches from the search API documentation now that retrievers are available.	2024-03-27 10:51:13 -07:00
István Zoltán Szabó	a3d96b9333	[DOCS] Changes model_id path param to inference_id (#106719 )	2024-03-26 08:20:34 +01:00
Liam Thompson	e92420dc86	[DOCS] Update cross cluster search compatibility matrix (#106677 )	2024-03-22 15:28:30 +01:00
István Zoltán Szabó	32dbc28e82	[DOCS] Adds disclaimer to semantic search tutorials (#106590 )	2024-03-21 11:32:57 +01:00
Ioana Tagirta	d01adfff60	Add links to text_expansion in ELSER tutorial (#106490 ) * Add links to text_expansion in ELSER tutorial * Apply suggestions from code review Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> --------- Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2024-03-20 10:03:04 +01:00
Aurélien FOUCRET	e944619e01	Fix typo in the LTR guide. (#106276 )	2024-03-13 09:05:47 +01:00
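
Note on the memory figures quoted in commit 3aed0afb2b (int4 quantization): the 8x reduction can be checked with a small back-of-the-envelope calculation. The sketch below is not taken from the commit. It assumes only the raw vector values are counted (no HNSW graph, no per-vector correction terms or other overhead), and the helper name `vector_memory_bytes` is hypothetical.

```python
# Rough storage estimate for raw vector values only.
# Assumption: ignores HNSW graph links and any per-vector metadata.

GIB = 1024 ** 3
MIB = 1024 ** 2


def vector_memory_bytes(num_vectors: int, dims: int, bits_per_dim: int) -> int:
    """Bytes needed to store num_vectors vectors of dims components each."""
    return num_vectors * dims * bits_per_dim // 8


# 1M vectors with 1024 dimensions, as in the commit message.
float32_bytes = vector_memory_bytes(1_000_000, 1024, 32)
int4_bytes = vector_memory_bytes(1_000_000, 1024, 4)

print(f"float32: {float32_bytes / GIB:.2f} GiB")
print(f"int4:    {int4_bytes / GIB:.3f} GiB ({int4_bytes / MIB:.0f} MiB)")
print(f"ratio:   {float32_bytes // int4_bytes}x")
```

Under these assumptions float32 comes to about 3.8 GiB and int4 to about 0.477 GiB, which appears consistent with the 3.8GB and 477MB quoted in the commit message if the latter is read as one eighth of 3.8GB.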


1326 Commits