elasticsearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	73f537170b	Update nested knn search documentation about inner-hits (#104154 ) Adding a link tag for inner hits behavior and kNN search. Additionally adding a note that if you are using multiple knn clauses, that the inner hit name should be provided.	2024-01-10 07:46:42 -05:00
Kathleen DeRusso	bdde29720a	Update synonyms doc with warning about index creation (#103476 ) * Update synonyms doc with warning about index creation * PR feedback * Moved warning in docs	2023-12-18 13:18:51 -05:00
István Zoltán Szabó	c55495d502	[DOCS] Adds inference API end-to-end example (#103042 ) Co-authored-by: David Kyle <david.kyle@elastic.co>	2023-12-12 12:02:47 +01:00
Benjamin Trent	7fde357f3a	Improve docs around knn similarity search (#103158 ) Adding equations to the docs around how to best calculate similarity & score. The similarity parameter for search was added in 8.8. The max-inner-product mentions will be removed for all versions before 8.11 when backporting. closes: https://github.com/elastic/elasticsearch/issues/102924	2023-12-11 14:56:16 -05:00
Abdon Pijpelink	6b60a53732	Update rrf.asciidoc (#103078 ) (#103109 ) typo (cherry picked from commit `851cab63eb`) Co-authored-by: Ugo Sangiorgi <ugo.sangiorgi@elastic.co>	2023-12-11 13:02:49 +01:00
Benjamin Trent	47b57537ae	Add docs for the include_named_queries_score param (#103155 ) The only docs for this _search param were mentioned in the bool query docs. While it makes contextual sense to have it there, we should also add it as a _search parameter in the search API docs. It was introduced in 8.8.	2023-12-08 14:39:18 -05:00
Kathleen DeRusso	4dd9e2a772	[Query Rules] Add some usability clarifications to docs (#102990 ) * [Query Rules] Add some usability clarifications to docs * Fix typo	2023-12-06 17:16:56 -05:00
Benjamin Trent	f00364aefd	Add byte quantization for float vectors in HNSW (#102093 ) Adds new `quantization_options` to `dense_vector`. This allows for vectors to be automatically quantized to `byte` when indexed. Example: ``` PUT vectors { "mappings": { "properties": { "my_vector": { "type": "dense_vector", "index": true, "index_options": { "type": "int8_hnsw" } } } } } ``` When querying, the query vector is automatically quantized and used when querying the HNSW graph. This reduces the memory required to only `25%` of what was previously required for `float` vectors at a slight loss of accuracy. This is currently only available when `index: true` and when using `hnsw`	2023-11-29 12:29:55 -05:00
Luca Cavanna	7c9e8356e6	Merge branch 'main' into lucene_snapshot	2023-11-24 09:57:22 +01:00
Saikat Sarkar	d4f01fc7b3	Gather vector_operation count for knn search (#102032 )	2023-11-21 12:16:21 -07:00
Luca Cavanna	9cd96df179	Add support for index_filter to open pit (#102388 ) The open point in time API accepts a list of indices and opens a point in time view against those indices. Like we do already for field caps, this commit allows users to provide an index_filter parameter as part of the request body, that will be used to execute the can match phase and exclude the indices that can't possibly match such filter. Closes #99740	2023-11-21 15:35:49 +01:00
Kathleen DeRusso	4567d397fa	Clarify text expansion query docs to not suggest enabling track_total_hits for performance (#102102 )	2023-11-20 08:56:26 -05:00
István Zoltán Szabó	c303ab885a	[DOCS] Simplifies dense vector mapping in semantic search example (#102080 )	2023-11-14 10:52:56 +01:00
Abdon Pijpelink	70128f5b74	[DOCS] Mark 'ignore_throttled' deprecated in all docs (#101838 )	2023-11-07 13:03:49 +01:00
Abdon Pijpelink	49c5b03d57	[DOCS] Update CCS compatibility matrix for 8.11 (#101786 )	2023-11-06 08:41:15 +01:00
Mayya Sharipova	61c7483fc9	Make knn search a query (#98916 ) This introduced a new knn query: - knn query is executed during the Query phase similar to all other queries. - No k parameter, k defaults to size - num_candidates is the size of the queue of candidates to consider while searching the graph on each shard - For aggregations: "size" results are collected with total = size * shards. Aggregations will see size * shards results. - All filters from DSL are applied as post-filters, except: 1) alias filter is applied as pre-filter or 2) a filter provided as a parameter inside knn query.	2023-11-01 14:21:40 -04:00
James Rodewig	4c69746c24	[DOCS] Update tech preview copy (#101606 ) Updates the copy for tech preview and experimental features in the Elasticsearch docs. Relates to https://github.com/elastic/docs/pull/2807	2023-10-31 10:31:07 -04:00
Alan Woodward	f7a9783d45	Check that scripts produce correct json in render template action (#101518 ) If a mustache script that outputs badly-formed json is referred to in a render template request, then the error returned will be a 500 server error, rather than a 400 json parsing error. This is because rendering templates skips json parsing, and so the error ends up being caught in the REST layer instead. This commit changes the template rendering logic to always parse the output of the script, catching json errors higher in the stack and allowing us to return the correct status code. This also means that errors are correctly detected and returned as part of multi search template requests. Fixes #101477	2023-10-30 13:25:39 +00:00
István Zoltán Szabó	9b404099b4	[DOCS] Adds links to token section in ESLER conceptual. (#101033 )	2023-10-18 11:30:38 +02:00
Liam Thompson	eab813f8cb	[DOCS] Migrate Behavioral Analytics docs to ES ref (#100704 ) * [DOCS] Migrate Behavioral Analytics docs to ES ref * Fix typo * Fix attributes * Rename top level heading, fix requirements * Address review suggestions	2023-10-13 09:05:23 +02:00
István Zoltán Szabó	446ac9f378	[DOCS] Updates ELSER tutorial with inference processor changes (#100420 ) Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-10-11 17:33:20 +02:00
Abdon Pijpelink	62b85b1d0f	[DOCS] Refresh "Search your data" (#99482 ) * Restructure existing docs * Add draft content * Changes for MVP * Reword * Move Search Applications docs to ES reference - Renamed files and changed ids per https://github.com/elastic/elasticsearch/pull/100032 - Updated URL syntax for absolute URLs using attribute - Deleted redirects in redirects.asciidoc * Fix json source formatting * Use `source, js`, not `javascript` * Idem * Fix console-reponse * Skip tests for js blocks * This will definitely fix things * Use attributes * Remove commented out redirects * Fix header level in search-with-synonyms.asciidoc * Update docs/reference/search/search-your-data/knn-search.asciidoc Co-authored-by: Chris Cressman <chris@chriscressman.com> * Fix trailing comma bug Flagged in #enterprise-search Slack * Move semantic search under vector search --------- Co-authored-by: Liam Thompson <leemthompo@gmail.com> Co-authored-by: Chris Cressman <chris@chriscressman.com> Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2023-10-10 10:47:35 +02:00
Carlos Delgado	f2dfbfe8c4	[DOCS] Add sparse-vector field type to docs, changed references (#100348 )	2023-10-06 14:25:27 +02:00
Luca Cavanna	689a1e490a	Merge branch 'main' into lucene_snapshot_9_8	2023-10-02 13:56:12 +02:00
István Zoltán Szabó	9d01def3dc	[DOCS] Changes semantic search tutorials to use ELSER v2 and sparse_vector field type (#100021 ) * [DOCS] Changes semantic search tutorials to use ELSER v2 and sparse_vector field type. * [DOCS] More edits.	2023-09-29 09:24:36 +02:00
Benjamin Trent	92cea2797e	Add nested support for dense_vector fields and knn search (#99763 ) * Nested dense_vector support * Adjust nested support based on new lucene version * fixing after rebase * fixing some code * fixing tests adding transport version * spotless * [Automated] Update Lucene snapshot to 9.9.0-snapshot-b3e67403aaf * Adds new max_inner_product vector similarity function (#99527) Adds new max_inner_product vector similarity function. This differs from dot_product in the following ways: Doesn't require vectors to be normalized Scales the similarity between vectors differently to prevent negative scores * requiring top level filter to be parent filter * adding docs & fixing tests * adding and fixing docs * adding changlog * removing unnecessary file changes * removing unused imports * fixing test * maybe fix doc tests * continue tests in docs * fixing more tests * fixing tests --------- Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2023-09-28 11:38:04 -04:00
Matteo Piergiovanni	d9c15c526e	Add counters to _clusters response for all states (#99566 ) To help the user know what the possible cluster states are and to provide an accurate accounting, we added counters summarising `running`, `partial` and `failed` clusters to the `_clusters` section. Changes: - Now in the response is present the number of `running` clusters. - We split up `partial` and `successful` (before was summed up in the `successful` counter). - We now have a counter for `failed` clusters. - Now `total` is always equal to `running` + `skipped` + `failed` + `partial` + `successful`.	2023-09-28 09:28:45 +02:00
Ignacio Vera	4bc1afddda	Move Aggregator#buildTopLevel() to search worker thread. (#98715 ) This commit introduces an AggregatorCollector that contains a finish method which performs aggregation post-collection and builds the internal aggregation for this collector. This method is called on the worker thread at the end of the collection phase.	2023-09-19 09:46:51 +02:00
David Pilato	7064bc9e5c	Generated field is `ml.tokens` (#99049 ) The generated field name is `ml.tokens` and not `ml-tokens`.	2023-09-13 15:21:27 +02:00
István Zoltán Szabó	f5dc68abc6	[DOCS] Fine-tunes the reindexing step of the ELSER tutorial. (#99155 )	2023-09-04 11:04:58 +02:00
Michael Peterson	649821e992	Support cluster/details for CCS minimize_roundtrips=false (#98457 ) This commit tracks progress for each shard search by cluster alias using a new SearchProgressListener (CCSSingleCoordinatorSearchProgressListener). Both sync and async CCS searches use this new progress listener when minimize_roundtrips=false. Two of the SearchProgressListener methods had to be extended to allow tracking per-cluster took values (TransportSearchAction.SearchTimeProvider) and whether searches timed out (by passing in QuerySearchResult to the onQueryResult listener method). This commit brings parity between minimize_roundtrips=true and false to have the same _cluster/details sections in CCS search responses. Note that there are still a few differences between minimize_roundtrips=true and false. 1. The per-cluster took value for minimize_roundtrips=true is accurate, but for 'false' it is only measured at the granularity of each partial reduce, so the per cluster took time is overestimated in basically all cases. 2. For minimize_roundtrips=true, a skip_unavailable=false cluster that disconnects during the search or has all searches on all shards fail, will cause the entire search to fail. This is (still) not true for minimize_roundtrips=false. The search is only failed if the skip_unavailable=false cluster cannot be connected to at the start of the search. (This will likely be changed in a follow up ticket that implements fail-fast logic for in-progress searches that should fail due to a skip_unavailable=true cluster failing.) 3. The shard accounting for minimize_roundtrips=false is always accurate (total shard counts are known at the start of the search). For minimize_roundtrips=true, the shard accounting is only accurate per cluster unless all clusters have successful (or partially successful) searches. For clusters that have failures we do not have shard count info.	2023-08-31 12:56:20 -04:00
Liam Thompson	dfbec46c3d	[Docs] Add link to labs from semantic search overview (#98985 )	2023-08-30 10:54:24 +02:00
Liam Thompson	a3c96caa51	[DOCS] Add link to Elasticsearch labs ELSER Python notebook (#98983 ) * Add link to Elasticsearch labs ELSER Python notebook * Fix typos * Use {es} variable Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> --------- Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>	2023-08-29 15:26:00 +02:00
Abdon Pijpelink	1955bd8ad4	[DOCS] New docs for remote clusters using API key authentication (#98330 ) * New docs structure for remote clusters * Fix broken cross-book link errors * More broken cross-book link errors * Remove redirects for new pages * Link to generic remote cluster docs instead * Drop 'API' from the abbreviated title * Add 'Establish trust with a remote cluster' section * Restructure 'Establish trust' section into Prerequisite/local/remote instructions * Add 'Configure roles and users' section * Add 'Connect to a remote cluster' section * Move version compatibility to prerequisites * Fix test errors * Incorporate review feedback * Mention version 8.10 or later in the intro for API keys * Add license prerequisite	2023-08-24 12:30:03 +02:00
Kathleen DeRusso	8c12a7b7cd	Query rules docs clarification (#98605 ) * Query rules docs clarification * Update docs/reference/search/search-your-data/search-using-query-rules.asciidoc * Update docs/reference/search/search-your-data/search-using-query-rules.asciidoc	2023-08-17 11:11:49 -04:00
Craig Taverner	dfe9bdc45f	Simple grammar fix for MVT docs (#98591 )	2023-08-17 16:10:26 +02:00
Nick Chow	5de0a9013f	Documentation update that fixes a query rules code example (#98540 ) * Change example field in rule query guide * Change fuzzy to contains to get tests to work --------- Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co>	2023-08-16 15:14:32 -07:00
Carlos Delgado	c596f121b4	Synonyms Overview Documentation (#98202 )	2023-08-10 18:07:12 +02:00
Abdon Pijpelink	21ef4f3629	[DOCS] Update CCS compatibility matrix for 8.10 (#98341 )	2023-08-10 15:57:47 +02:00
Kathleen DeRusso	0437416c33	Tech debt: Add tests to documentation for query rules, search applications (#98266 ) * Add tests for query rules * More tests * Fix search app tests * Fix tests * Add teardown to tests * Add tests for list search apps call * Update test in get search application * Tweak stack trace * Make response match in test --------- Co-authored-by: carlosdelest <carlos.delgado@elastic.co>	2023-08-09 08:01:52 -04:00
Michael Peterson	169f7d1774	Add specific cluster error info, shard info and additional metadata for CCS when minimizing roundtrips (#97731 ) For CCS searches with ccs_minimize_roundtrips=true, when an error is returned, it is unclear which cluster caused the problem. This commit adds additional accounting and error information to the search response for each cluster involved in a cross-cluster search. The _clusters section of the SearchResponse has a new details section added with an entry for each cluster (remote and local). It includes status info, shard accounting counters and error information that are added incrementally as the search happens. The search on each cluster can be in one of 5 states: RUNNING SUCCESSFUL - all shards were successfully searched (successful or skipped) PARTIAL - some shard searches failed, but at least one succeeded and partial data has been returned SKIPPED - no shards were successfully searched (all failed or cluster unavailable) when skip_unavailable=true FAILED - no shards were successfully searched (all failed or cluster unavailable) when skip_unavailable=false A new SearchResponse.Cluster object has been added. Each TransportSearchAction.CCSActionListener (one for each cluster) has a reference to a separate Cluster instance and updates once it gets back information from its cluster. The SearchResponse.Clusters object only uses the new Cluster object for CCS minimize_roundtrips=true. For local-only searches and CCS minimize_roundtrips=false, it uses the current Clusters object as before. Follow on work will change CCS minimize_roundtrips=false to also use the new Cluster model and update state in the _cluster/details section. The Cluster objects are immutable, so a CAS operation is required to swap in new state to the map of Cluster objects held by the `SearchResponse.Clusters` class. This concurrency model is a little bit of overkill for the minimize_roundtrips=true use case, but it will be necessary for supporting minimize_roundtrips=false, since updates there will be done per shard, not per cluster.	2023-08-07 12:32:06 -04:00
Kathleen DeRusso	23e35d5687	[Query Rules] Add documentation for rule_query (#97667 ) * Add docs for rule query * Add test * Fix formatting in rule query dsl * Remove query string as required from rule query docs * PR feedback * Update with API changes * Expand and clarify 'search using query rules' doc * Clean up wording * Update put syntax * Fix examples after refactor * Update docs/reference/query-dsl/rule-query.asciidoc Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co> * PR feedback + update privilege * PR feedback * More PR feedback * Small correction --------- Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-08-02 15:56:06 -04:00
Abdon Pijpelink	48b3a85741	[DOCS] Update RRF tech preview statement (#97851 ) * [DOCS] Update RRF tech preview statement * Add 'rank' and 'sub_searches'	2023-07-24 13:55:06 +02:00
Abdon Pijpelink	40409bf8ca	[DOCS] Semantic search page (#97715 ) Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> Co-authored-by: David Roberts <dave.roberts@elastic.co>	2023-07-20 10:45:13 +02:00
István Zoltán Szabó	57fd6b84fb	[DOCS] Expands ELSER tutorial with optimization info (#97392 ) Co-authored-by: David Kyle <david.kyle@elastic.co>	2023-07-19 10:38:11 +02:00
Michael Peterson	eaa86796a7	Add completion_time time field to async_search get and status response (#97700 ) The completion_time is set as the start_time (already present) plus the 'took' time that is set in the SearchResponse object and only if the isRunning status == false since took is set even for in-progress searches. We use the 'took' field because it is based on relative time, not absolute wall clock time which can go backwards due to NTP issues. See the comments in TransportSearchAction about the SearchTimeProvider for details. Closes #88640	2023-07-17 09:13:15 -04:00
Mayya Sharipova	f8c626f792	Track max_score in collapse when requested (#97703 ) Before we used to track max_score in collapse when requested (track_scores=true) or when there is no sort in collapse (see PR#27122). But this feature was lost through refactoring and changes. This PR restores this feature. Closes #97653	2023-07-17 06:48:00 -04:00
Abdon Pijpelink	0f810b19e9	[DOCS] Clarify that dense vectors can be created with ES (#97636 ) * [DOCS] Clarify that dense vectors can be created with ES * Fix rendering issue * Break up long sentence	2023-07-13 14:04:32 +02:00
István Zoltán Szabó	9cd609f22c	[DOCS] Adds deployment_id as an option to query_vector_builder (#97576 )	2023-07-12 09:35:36 +02:00
Jack Conradson	f2b0434ee2	Mark rank and sub_searches as tech preview (#97573 ) rank and sub_searches are in tech preview. This adds the tech preview text that is required in the docs for these features.	2023-07-11 09:28:46 -07:00

1262 Commits