elasticsearch

Commit Graph

Author	SHA1	Message	Date
Iraklis Psaroudakis	aa083ce419	[CI] Mute reference/cluster/nodes-stats (#91399 ) relates #91081	2022-11-08 14:57:37 +02:00
Iraklis Psaroudakis	dcdf58721d	[CI] Mute reference/cluster/nodes-stats/line_2735 (#91380 ) relates #91081	2022-11-08 05:04:49 -05:00
Liam Thompson	cd6be58860	[DOCS] Add reference for ingest pipelines in Enterprise Search (#91357 )	2022-11-08 09:22:01 +01:00
Hendrik Muhs	1b556d75fa	mute another node stats test (#91346 ) muting another test part as it causes a lot of CI failures relates #91081	2022-11-07 06:07:09 -05:00
Lisa Cawley	99877382a0	[DOCS] Remove coming tag from release notes (#91330 )	2022-11-04 18:36:37 -07:00
Lisa Cawley	9e83084020	[DOCS] Clarify description of geo_results (#91237 )	2022-11-04 08:15:46 -07:00
David Kilfoyle	3295662697	[DOCS] Add time range info to TSDS docs (#91291 ) * [DOCS] Add time range info to TSDS docs * Fixup	2022-11-04 09:18:35 -04:00
Dimitris Athanasiou	4e67df8b05	[ML] Low priority trained model deployments (#91234 ) This adds a new parameter to the start trained model deployment API, namely `priority`. The available settings are `normal` and `low`. For normal priority deployments the allocations get distributed so that node processors are never oversubscribed. Low priority deployments allow users to test model functionality even if there are no node processors available. They are limited to 1 allocation with a single thread. In addition, the process is executed in low priority which limits the amount of CPU that can be used when the CPU is under pressure. The intention of this is to limit the impact of low priority deployments on normal priority deployments. When we rebalance model assignments we now: 1. compute a plan just for normal priority deployments 2. fix the resources used by normal deployments 3. compute a plan just for low priority deployments 4. merge the two plans Closes #91024	2022-11-04 14:22:30 +02:00
Hendrik Muhs	14b2d2d37e	[ML] frequent items filter (#91137 ) add a filter to the frequent items agg that filters documents from the analysis while still calculating support on the full set A filter is specified top-level in frequent_items: "frequent_items": { "filter": { "term": { "host.name.keyword": "i-12345" } }, ... The above filters documents that don't match, however still counts the docs when calculating support. That's in contrast to specifying a query at the top, in which case you find the same item sets, but don't know the importance given the full document set.	2022-11-03 13:58:40 +01:00
charliek17	4192c5b327	Update move-to-step.asciidoc (#91114 )	2022-11-03 08:55:24 +00:00
Valeriy Khakhutskyy	7c4186ddbc	[ML] Update API documentation for anomaly score explanation (#91177 ) This PR updates the API documentation to match the UI. Co-authored-by: lcawl <lcawley@elastic.co>	2022-11-01 21:43:33 +01:00
Lisa Cawley	2d30bbab21	[DOCS] Semantic search endpoint (#91210 )	2022-11-01 09:01:55 -07:00
Abdon Pijpelink	8abd39ab98	Fix typo in stop-tokenfilter.asciidoc (#91128 ) (#91207 ) Since ignore_case is set to true in our custom stop words filter, the matching will be case-insensitive. (cherry picked from commit `a03fba9d77`) Co-authored-by: Siniša Subašić <68671543+sinisuba@users.noreply.github.com>	2022-11-01 15:32:16 +01:00
David Kilfoyle	56397f5d4c	[Docs] Remove feature flag from downsampling page (#91228 )	2022-11-01 09:51:22 -04:00
Anthony McGlone	0249d1650f	[DOCS] Update the feature state example in the snapshot and restore docs (#90328 )	2022-11-01 10:17:29 +09:00
Lisa Cawley	f0c12cdeea	[DOCS] Fix typo in knn-search.asciidoc (#91206 )	2022-10-31 10:07:53 -07:00
Mary Gouseti	d55059afab	Mute reference/cluster/nodes-stats/line_2751 (#91174 )	2022-10-28 11:55:53 +02:00
Julie Tibshirani	1b249639f1	Remove experimental marking from kNN search (#91065 ) This commit removes the experimental tag from kNN search docs and makes some docs improvements: * Add a prominent warning about memory usage in the kNN search guide * Link to the performance tuning guide from the main guide * Clarify the memory requirements section in the tuning guide	2022-10-27 18:00:56 +02:00
Yang Wang	882fbe62b5	[Doc] Improve doc for certutil parameter applicability (#91124 ) The http command does not take most of the parameters. This PR ensures it is consistently documented for all parameters.	2022-10-27 09:38:56 +11:00
Frederic Dartayre	fe0036fdbf	Update threadpool.asciidoc (#90098 ) * Update threadpool.asciidoc Starting from 8.0 the value of the `node.processors` setting is bounded by the number of available processors https://github.com/elastic/elasticsearch/pull/44894 * Update docs/reference/modules/threadpool.asciidoc Co-authored-by: Adam Locke <adam.locke@elastic.co>	2022-10-26 14:04:39 -04:00
Craig Taverner	c19f642d94	Refine geo-point and geo-shape docs (#90913 ) * Refine geo-point and geo-shape docs While reviewing the docs for another issue, some deprecated references to prefix-trees were discovered, leading to interest in bringing the docs a little more up-to-date. * Update docs/reference/mapping/types/geo-point.asciidoc Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co> * Update docs/reference/mapping/types/geo-shape.asciidoc Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co> Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2022-10-26 12:21:34 +02:00
Hendrik Muhs	82a71f6ef6	[Transform] add a health section to transform stats (#90760 ) adds a health section to the transform stats endpoint and implements reporting assignment, indexing/search and persistence problems, together with an overall health state.	2022-10-25 09:01:21 +02:00
Flavio	83694c37a3	Update docker image (#90730 )	2022-10-24 15:52:36 -04:00
Stéphane Campinas	8c44ed1442	Fix itemized list (#90855 )	2022-10-24 15:14:17 -04:00
Przemysław Witek	95f484c4fd	[Transform] Expand the docs section regarding mappings deduction in transform's dest index (#91077 )	2022-10-24 13:43:22 +02:00
Christos Soulios	1f265eb725	[DOCS] Add release notes for 8.5.0 (#91063 ) Forward port PR (#91029) with release notes for version 8.5.0 - Add release notes for v8.5.0 after BC6 has been cut	2022-10-21 13:17:33 +03:00
Jack Conradson	f28ae4b288	Add support for indexing byte-sized knn vectors (#90774 ) This change adds an element_type as an optional mapping parameter for dense vector fields as described in #89784. This also adds a byte element_type for dense vector fields that supports storing dense vectors using only 8-bits per dimension. This is only supported when the mapping parameter index is set to true. The code follows a similar pattern to our NumberFieldMapper where we have an enum for ElementType, and it has methods that DenseVectorFieldType and DenseVectorMapper can delegate to to support each available type (just float and byte for now).	2022-10-20 14:45:58 -07:00
Iraklis Psaroudakis	0f4374f4fb	Explain disk headroom settings more in docs (#90763 ) Relates to #81406	2022-10-20 18:45:23 +03:00
Roberto Seldner	8e35a6a846	Update documentation with supported IANA numbers (#90531 ) Based on this: https://github.com/elastic/elasticsearch/blob/main/modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/CommunityIdProcessor.java#L440-L451	2022-10-19 08:23:11 -05:00
Leaf-Lin	14ef513f2c	[DOCS] Add CCR limitation (#87348 ) * Add CCR limitation closes https://github.com/elastic/elasticsearch/issues/86121 * Add restored index auto follow pattern restriction https://github.com/elastic/elasticsearch/issues/87055 * Moving content to existing CCR page + several changes * Remove sections to consolidate limitation information * Delete separate file * Remove restored indices from list of things that aren't replicated Co-authored-by: Adam Locke <adam.locke@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2022-10-17 16:05:29 -04:00
Lisa Cawley	2dd7732553	[DOCS] Add ML CPP PRs to release notes (#90961 )	2022-10-17 09:58:40 -07:00
Mary Gouseti	cfd23d512f	Disk indicator troubleshooting guides (#90504 )	2022-10-14 15:24:21 +02:00
Paramdeep Singh	34ff7a9d98	Consolidated Circuit Breaker documentation to include EQL and ML infer (#90809 ) Fixes #85851 Co-authored-by: Iraklis Psaroudakis <kingherc@gmail.com>	2022-10-14 14:33:52 +03:00
Przemyslaw Gomulka	aa922754af	Add known issues entry about date rounding bug (#90721 ) add entry to all affected versions relates #90187	2022-10-14 11:51:02 +02:00
Francisco Fernández Castaño	1a3032beb6	Keep track of average shard write load (#90768 ) This commit adds a new field, write_load, into the shard stats. This new stat exposes the average number of write threads used while indexing documents. Closes #90102	2022-10-13 16:34:45 +02:00
David Kyle	9e6a784aa5	[ML] Semantic search endpoint (#90450 ) Adds a {index}_semantic_search endpoint which first converts the query text into a dense vector using a NLP text embedding model then performs a knn search against an index containing dense vectors created with the same embedding model.	2022-10-13 13:17:30 +01:00
David Roberts	be006e2eee	[ML] Improve categorize_text docs (#90765 ) Adds more detail about the meaning of the results fields of the `categorize_text` aggregation, and advice about how to use these fields when searching for messages that match the categories. Followup to #90723	2022-10-13 10:46:53 +01:00
Julie Tibshirani	f4038b3f15	Add guide for tuning kNN search (#89782 ) This 'how to' guide explains performance considerations specific to kNN search. It takes inspiration from the 'tune for search speed' guide.	2022-10-12 14:53:53 -07:00
Nik Everett	82aeb478db	Synthetic `_source`: support `wildcard` field (#90196 ) This adds synthetic `_source` support for the `wildcard` field type.	2022-10-12 15:55:13 -04:00
David Kilfoyle	cad87c4d5a	[DOCS] Add Downsampling docs (#88571 ) This adds documentation for downsampling of time series indices.	2022-10-12 12:10:16 -04:00
Valeriy Khakhutskyy	95758e88a2	[ML] Explain anomaly score factors (#90675 ) This PR surfaces new information about the impact of the factors on the initial anomaly score in the anomaly record: - single bucket impact is determined by the deviation between actual and typical in the current bucket - multi-bucket impact is determined by the deviation between actual and typical in the past 12 buckets - anomaly characteristics are statistical properties of the current anomaly compared to the historical observations - high variance penalty is the reduction of anomaly score in the buckets with large confidence intervals. - incomplete bucket penalty is the reduction of anomaly score in the buckets with fewer samples than historically expected. Additionally, we compute lower- and upper-confidence bounds and the typical value for the anomaly records. This improves the explainability of the cases where the model plot is not activated with only a slight overhead in performance (1-2%).	2022-10-12 16:57:06 +02:00
Luca Cavanna	18942d5b11	Enhance nested depth tracking when parsing queries (#90425 ) When parsing queries on the coordinating node, there is currently no way to share state between the different parsing methods (`fromXContent`). The only query that supports a parse context is bool query, which uses the context to track nested depth of queries, added with #66204. Such nested depth tracking mechanism is not 100% accurate as it tracks bool queries only, while there's many more query types that can hold other queries hence potentially cause stack overflow when deeply nested. This change removes the parsing context that's specific to bool query, introduced with #66204, in favour of generalizing the nested depth tracking to all query types. The generic tracking is introduced by wrapping the parser and overriding the method that parses named objects through the xcontent registry. Another way would have been to require a context argument when parsing queries, which would mean adding a context argument to all the QueryBuilder#fromXContent static methods. That would be a breaking change for plugins that provide custom queries, hence I went for trying out a different approach. One aspect that this change requires and introduces is the distinction between parsing a top level query (which will wrap the parser, or it would create the context if we had one), as opposed to parsing an inner query, which goes ahead with the given parser and context. We already have this distinction as we have two different static methods in `AbstractQueryBuilder` but in practice only bool query makes the distinction being the only context-aware query. In addition to generalizing tracking nested depth when parsing queries, we should be able to adopt this same strategy to track queries usage as part of #90176. Given that the depth check is now more restrictive, as it counts all compound queries and not only bool, we have decided to raise the default limit to `30` to ensure that users are not going to hit the limit due to this change.	2022-10-12 15:15:06 +02:00
Albert Zaharovits	73cdc7b80a	DOC CCR Disaster recovery does not handle Security configuration (#85522 ) We do not support and don't plan to support disaster recovery arrangements where Security configuration is replicated between the production and the disaster recovery cluster because the cluster-local Security APIs assume exclusive write on the .security system index.	2022-10-12 13:53:53 +03:00
Ed Savage	f355787165	[ML] Allow overriding timestamp field to null in file structure finder (#90764 ) Use a magic value of "null" for the timestamp format override to indicate to the analysis that a timestamp is not expected in the input text. This should improve performance when analysing delimited, ndjson or xml formatted text files that don't contain timestamps. For semi-structured text files without timestamps the magic value indicates to treat the text as single line log messages. see #55219	2022-10-12 09:08:25 +01:00
Dimitris Athanasiou	16bfc550ea	[ML] Add api to update trained model deployment number_of_allocations (#90728 ) This commit adds a new API that users can use calling: ``` POST _ml/trained_models/{model_id}/deployment/_update { "number_of_allocations": 4 } ``` This allows a user to update the number of allocations for a deployment that is `started`. If the allocations are increased we rebalance and let the assignment planner find how to allocate the additional allocations. If the allocations are decreased we cannot use the assignment planner. Instead, we implement the reduction in a new class `AllocationReducer` that tries to reduce the allocations so that: 1. availability zone balance is maintained 2. assignments that can be completely stopped are preferred to release memory	2022-10-12 10:04:23 +03:00
David Roberts	bfccd20155	[ML] Add a regex to the output of the categorize_text aggregation (#90723 ) The new `regex` field in `categorize_text` output is created in the same way as the `regex` field that appears in the category definitions created by anomaly detection jobs that do categorization. It consists of the terms that occur in the same order for every message that matches the category, separated with a `.+?` wildcard. It therefore matches the category messages and enforces the order of the terms that occurred in the same order for all messages used to create the category. It is not recommended to use the regex as the primary mechanism for searching for the original documents that were categorized. Search using a regular expression is very slow. Instead the terms of the category should be used to search for matching documents, as a terms search can use the inverted index and hence be much faster. However, there may be situations where it is useful to use the `regex` field to test whether a small set of messages that have not been indexed match the category.	2022-10-10 11:41:16 +01:00
Andrei Dan	b55f5fd77b	Rename the fields reported under details by the disk indicator (#90717 ) Currently, we report the count of affected nodes and indices as part of the disk indicator using a leaky abstraction. Namely we use the status we assign to nodes internally based on their disk usage (red, yellow, green, unknown). However, these statuses don't have an explicit meaning outside the implementation details e.g. a red node would probably convey it's a node experiencing disk issues but not what kind. This proposes being explicit in what we return to our health API users e.g. ``` "details": { "indices_with_readonly_block": 2, "nodes_with_enough_disk_space": 0, "nodes_with_unknown_disk_status": 0, "nodes_over_high_watermark": 0, "nodes_over_flood_watermark": 2 } ```	2022-10-10 11:30:03 +01:00
Lisa Cawley	db2882cbb5	[DOCS] Add links to clear trained model deployment cache API (#90727 )	2022-10-06 10:10:55 -07:00
Brandon Morelli	ced1447db0	docs: update fleet/agent pipeline docs (#90659 ) * docs: update fleet/agent pipeline docs * Apply suggestions from code review Co-authored-by: Adam Locke <adam.locke@elastic.co> Co-authored-by: Adam Locke <adam.locke@elastic.co>	2022-10-05 13:06:58 -07:00
Jack Conradson	8b0d0716d1	Add profiling and documentation for dfs phase (#90536 ) Adds profiling statistics for the dfs phase, and adds documentation for both the dfs phase profiling and kNN profiling. Closes #89713	2022-10-05 09:54:36 -07:00


10038 Commits