This commit enables concurrent search execution in the DFS phase, which improves resource usage as well as the performance of kNN queries, which benefit from both concurrent rewrite and concurrent collection.
We will enable concurrent execution for the query phase in a subsequent commit. While this commit does not introduce parallelism for the query phase, it does introduce offloading of sequential computation to the newly introduced executor. This applies both when a single slice needs to be searched and when a specific request does not support concurrency (currently only the DFS phase supports concurrency, regardless of the request). The only case where sequential collection is not offloaded is when the request includes aggregations that don't support offloading: composite, nested and cardinality, as their post-collection method must be executed in the same thread as the collection, or we would trip a Lucene assertion that verifies that doc_values are pulled and consumed from the same thread.
## Technical details
This commit introduces a secondary executor, used exclusively to execute the concurrent bits of search. The search threads still coordinate the search (the search call originates from them), but the actual work is offloaded to the newly introduced executor.
We offload not only parallel execution but also sequential execution, to make the workload more predictable: it would be surprising to have bits of search executed in either of the two thread pools depending on the request. Mixing the two would also make it possible to suddenly run a higher number of heavy operations overall (some on the caller threads and some on the separate threads), which could overload the system as well as make sizing the thread pools more difficult.
Note that fetch, together with other actions, is still executed in the search thread pool. This commit does not make the search thread pool a merely coordinating thread pool overall; that is true only for the IndexSearcher#search operation itself, which is nevertheless a big portion of the different phases of search API execution.
Given that the searcher blocks waiting for all tasks to complete, we take a simple approach: we introduce a thread pool executor that has the same size as the existing search thread pool but relies on an unbounded queue. This simplifies the handling of the thread pool queue and of rejections. In fact, we want to guarantee that the secondary thread pool never rejects, and delegate queuing entirely to the search thread pool, which is the entry point for every search operation anyway. The principle behind this is that if you got a slot in the search thread pool, you should be able to complete your search, and rather quickly.
As part of this commit we are also introducing the ability to cancel tasks that have not started yet, so that if any task throws an exception, other tasks are prevented from starting needless computation.
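A minimal sketch of both ideas, using hypothetical names (`searchSlice` and the slice count are stand-ins): a fixed-size executor backed by an unbounded queue, so it never rejects, plus a shared flag that lets queued tasks skip their work once any task has failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ConcurrentSearchSketch {
    public static void main(String[] args) throws InterruptedException {
        // Secondary executor with the same size as the search thread pool, but
        // an unbounded queue: it never rejects, queuing stays in the search pool.
        int searchPoolSize = Runtime.getRuntime().availableProcessors();
        ExecutorService workers = new ThreadPoolExecutor(
            searchPoolSize, searchPoolSize, 0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>());

        // If any task fails, tasks that have not started yet do no needless work.
        AtomicBoolean failed = new AtomicBoolean();
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int slice = 0; slice < 8; slice++) {
            int s = slice;
            tasks.add(() -> {
                if (failed.get()) {
                    return null; // "cancelled" before it ever started
                }
                try {
                    searchSlice(s); // hypothetical per-slice search work
                    return null;
                } catch (Exception e) {
                    failed.set(true);
                    throw e;
                }
            });
        }
        workers.invokeAll(tasks); // the caller search thread blocks until done
        workers.shutdown();
    }

    static void searchSlice(int slice) {
        // stand-in for the IndexSearcher#search work on one slice
    }
}
```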
Relates to #80693
Relates to #90700
Today by default the `SEARCH_COORDINATION` pool is sized at half the
allocated processors, or five if there are more than ten CPUs. Yet, if
we scale up a node to have more than ten CPUs, we probably want to scale
up the number of search coordination threads to match. This commit
removes the limit of five threads.
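A sketch of the change under these assumptions (method names are hypothetical, and the exact rounding of the halving is an assumption):

```java
class SearchCoordinationSizing {
    // Before: half the allocated processors, capped at five.
    static int before(int allocatedProcessors) {
        return Math.min(5, (allocatedProcessors + 1) / 2);
    }

    // After: half the allocated processors, with the cap removed, so a node
    // with 32 CPUs gets 16 search coordination threads rather than 5.
    static int after(int allocatedProcessors) {
        return (allocatedProcessors + 1) / 2;
    }
}
```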
Discovery, like cluster membership, can also be affected by network-like
issues (e.g. GC/VM pauses, dropped packets and blocked threads) so this
commit duplicates the troubleshooting info across both places.
A completely idle `transport_worker` thread is reported as `0.0%` idle,
which is confusing. Moreover the docs on the network threading model do
not reflect the changes made in #90482. This commit fixes both of those
things.
Suggests calling `jstack` every 15s to ensure that at least one capture shows a stuck thread. Also adds a link to this guide to the list on the troubleshooting overview page.
Explains why you should remove `cluster.initial_master_nodes`, and
rewords some of the other sections a little for (subjectively) improved
readability.
* Fixes CORS headers needed by Elastic clients
Updates the default value for the `http.cors.allow-headers`
setting to include headers used by Elastic client libraries.
Also adds the `access-control-expose-headers` header to responses to
CORS requests so that clients can successfully perform their product
check.
In #92309 we aligned the sizes of the `search` and the `get` thread pools, but the docs still contain the prior `get` thread pool size. With this commit we align the docs too.
Relates #92309
We sometimes see a `ShardLockObtainFailedException` when a shard failed
to shut down as fast as we expected, often because a node left and
rejoined the cluster. Sometimes this is because it was held open by
ongoing scrolls or PITs, but other times it may be because the shutdown
process itself is too slow. With this commit we add the ability to
capture and log a thread dump at the time of the failure to give us more
information about where the shutdown process might be running slowly.
Relates #93226
If debug logging is enabled then the lag detector will capture and
report the hot threads of a lagging node. In some cases the resulting
log message can be very large, exceeding 10kiB, which means it is
truncated in most logging setups. The relevant thread(s) may be waiting
on I/O, which is not considered "hot" and therefore may not appear in
the first 10kiB.
This commit adjusts this logging mechanism to split the message into
chunks of size at most 2kiB (after compression and base64-encoding) to
ensure that the entire hot threads output can be faithfully
reconstructed from these logs.
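A sketch of the chunking step; gzip is an assumed stand-in for whatever compression is actually applied:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.zip.GZIPOutputStream;

class HotThreadsChunking {
    static final int CHUNK_SIZE = 2 * 1024; // at most 2kiB per log message

    static List<String> toChunks(String hotThreads) throws IOException {
        // Compress and base64-encode the full hot threads output...
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(hotThreads.getBytes(StandardCharsets.UTF_8));
        }
        String encoded = Base64.getEncoder().encodeToString(out.toByteArray());
        // ...then split it so no single message exceeds the logger's limits.
        // The reader concatenates the chunks, decodes and decompresses them to
        // faithfully reconstruct the original output.
        List<String> chunks = new ArrayList<>();
        for (int i = 0; i < encoded.length(); i += CHUNK_SIZE) {
            chunks.add(encoded.substring(i, Math.min(encoded.length(), i + CHUNK_SIZE)));
        }
        return chunks;
    }
}
```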
Closes #88126
Today to troubleshoot an unstable cluster we ask the users to parse the
rather complex `node-join` and `node-left` messages emitted by the
`MasterService`. These messages may refer to many nodes, may be
truncated, and are generally pretty hard to work with.
With this commit we start to emit a simplified log message about each
node added and removed. It also renames the respective executor classes:
- `JoinTaskExecutor` -> `NodeJoinExecutor`
- `NodeRemovalClusterStateTaskExecutor` -> `NodeLeftExecutor`
This brings their names in line with each other, and the messages that
they emit, whilst preserving the older `node-join` and `node-left`
terminology as reported by the `MasterService`.
Finally, it updates the troubleshooting docs to reflect these new and simplified log messages.
Relates #92741
* Update search-settings documentation to reflect the fact that the indices.query.bool.max_clause_count setting has been deprecated
* Fix indentation
* Replace Elasticsearch with {es}
* Add deprecation entry to release notes
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Currently the documentation on network threading suggests that we still
use a model where we have individual workers dedicated to server
sockets. That is no longer true and server sockets are assigned to
normal workers. This commit updates the documentation.
* Update threadpool.asciidoc
Starting from 8.0, the value of the `node.processors` setting is bounded by the number of available processors (https://github.com/elastic/elasticsearch/pull/44894).
* Update docs/reference/modules/threadpool.asciidoc
Co-authored-by: Adam Locke <adam.locke@elastic.co>
When parsing queries on the coordinating node, there is currently no way to share state between the different parsing methods (`fromXContent`). The only query that supports a parse context is the bool query, which uses the context to track the nested depth of queries, added with #66204. This nested depth tracking mechanism is not 100% accurate, as it tracks bool queries only, while there are many more query types that can hold other queries and hence potentially cause a stack overflow when deeply nested.
This change removes the parsing context that's specific to bool query, introduced with #66204, in favour of generalizing the nested depth tracking to all query types.
The generic tracking is introduced by wrapping the parser and overriding the method that parses named objects through the xcontent registry. Another way would have been to require a context argument when parsing queries, which would mean adding a context argument to all the QueryBuilder#fromXContent static methods. That would be a breaking change for plugins that provide custom queries, hence I opted for a different approach.
One aspect that this change requires and introduces is the distinction between parsing a top level query (which wraps the parser, or would create the context if we had one) and parsing an inner query, which proceeds with the given parser and context. We already have this distinction in the form of two different static methods in `AbstractQueryBuilder`, but in practice only the bool query made use of it, being the only context-aware query.
In addition to generalizing the tracking of nested depth when parsing queries, we should be able to adopt this same strategy to track query usage as part of #90176.
Given that the depth check is now more restrictive, as it counts all compound queries and not only bool, we have decided to raise the default limit to `30` to ensure that users are not going to hit the limit due to this change.
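A minimal sketch of the generalized tracking; the class and method names are hypothetical stand-ins for the wrapped parser and the overridden named-object parsing method:

```java
// A delegating parser increments a counter whenever a named query object is
// parsed through the registry, and fails once the maximum depth is exceeded.
class DepthTrackingParser {
    static final int MAX_NESTED_DEPTH = 30; // the new, raised default limit
    private int depth;

    <T> T parseNamedQuery(String queryName, QueryParser<T> parser) {
        if (++depth > MAX_NESTED_DEPTH) {
            throw new IllegalArgumentException("query [" + queryName
                + "] exceeded the maximum nested depth of " + MAX_NESTED_DEPTH);
        }
        try {
            return parser.parse(this); // inner queries re-enter through here
        } finally {
            depth--; // unwind on the way out of each nesting level
        }
    }

    interface QueryParser<T> {
        T parse(DepthTrackingParser parser);
    }
}
```

Because every query type is parsed through the same wrapped method, the depth count covers all compound queries, not just `bool`.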
Adds to the docs a note that the `100mb` default for
`http.max_content_length` is the recommended maximum, along with
suggestions for what to do when hitting this limit.
Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce the new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, add addition and subtraction operations for ByteSizeValue, as well as min.
This commit adds support for a floating point node.processors setting. This is useful when the nodes run in an environment where the CPU time assigned to the ES node process is limited (e.g. using cgroups). With this change, the system can size the thread pools accordingly, by rounding up the provided setting to the nearest integer.
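A sketch of the rounding, assuming thread pool sizes are derived from a whole number of allocated processors:

```java
class ProcessorsSketch {
    // node.processors may now be fractional, e.g. 1.5 CPUs under cgroups;
    // thread pools are sized from the value rounded up to a whole number.
    static int allocatedProcessors(double nodeProcessors) {
        return (int) Math.ceil(nodeProcessors); // 1.5 -> 2
    }
}
```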
The docs for `transport.ping_schedule` note that the transport client
defaults to a 5s ping schedule, but this is no longer relevant. This
commit drops this from the docs, and also moves the docs for this
setting further down the page to reflect its relative unimportance.
Today we say that voting-only nodes require a "low-latency" network.
This term has a specific meaning in some operating environments which is
different from our intended meaning. To avoid this confusion this commit
removes the absolute term "low-latency" in favour of describing the
requirements relative to the user's own performance goals.
Clean up network setting docs
- Add types for all params
- Remove mention of JDKs before 11
- Clarify some wording
Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
This change ensures that existing read_only_allow_delete blocks that are placed on indices when the flood_stage watermark threshold is exceeded are removed when disk threshold monitoring is disabled.
This is done by changing how InternalClusterInfoService behaves when
disabled. With this change, it will keep calling the registered
listeners periodically, but with an empty ClusterInfo.
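A sketch of the new behaviour with hypothetical names; the point is that listeners keep firing while the service is disabled:

```java
import java.util.List;
import java.util.function.Consumer;

class ClusterInfoServiceSketch {
    record ClusterInfo(boolean empty) {
        static final ClusterInfo EMPTY = new ClusterInfo(true);
    }

    private volatile boolean enabled = true;
    private final List<Consumer<ClusterInfo>> listeners;

    ClusterInfoServiceSketch(List<Consumer<ClusterInfo>> listeners) {
        this.listeners = listeners;
    }

    // Called periodically. Previously a disabled service skipped the refresh
    // entirely, so stale read_only_allow_delete blocks were never revisited.
    // Now the registered listeners still run, just with an empty ClusterInfo.
    void refresh() {
        ClusterInfo info = enabled ? gatherClusterInfo() : ClusterInfo.EMPTY;
        for (Consumer<ClusterInfo> listener : listeners) {
            listener.accept(info);
        }
    }

    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    ClusterInfo gatherClusterInfo() {
        return new ClusterInfo(false); // stand-in for real disk usage stats
    }
}
```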
Closes #86383
Our current default for the http.max_header_size setting is 8kb. This
is lower than the current default for Kibana (16kb in 8.x), and the ESS
proxy (1mb based on the Go http library default). To align with the
current convention of other Elastic components, this PR increases the
ES header size setting default to 16kb.
Closes #88501
* Convert disk watermarks to RelativeByteSizeValues
Similar to the existing watermark setting for the frozen tier.
Pre-requisite for PR 88639 that plans to introduce max headroom
settings for the disk watermarks, similar to the frozen tier max
headroom setting.
* Add changelog
* Revert 20gb to 20GB
* Make formatNoTrailingZerosPercent non static
* ByteSizeValue.MINUS_ONE
* Remove getMinimumTotalSizeForBelowWatermark
* Remove comment
* Fix minor stuff
* Make parsing of RelativeByteSizeValue faster
Mimics the older definitelyNotPercentage function
* Remove Locale from Strings.format
* More MINUS_ONE
* Adding discovery troubleshooting link
* Add tags to pull in discovery troubleshooting content
* Move discovery troubleshooting to separate page and add redirects
Co-authored-by: Adam Locke <adam.locke@elastic.co>
In #85074 we added docs on discovery troubleshooting that really only
talked about troubleshooting master elections. There's also the case
where the master is elected fine but some other node can't join it. This
commit adds troubleshooting docs about that too.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Fixes a few scalability issues around join validation:
- compresses the cluster state sent over the wire
- shares the serialized cluster state across multiple nodes
- forks the decompression/deserialization work off the transport thread
Relates #77466
Closes #83204
Ensures that every page of the docs that mentions `cluster.initial_master_nodes` also mentions that this setting must be removed after bootstrapping completes.
Today it's no longer true that by default nodes will auto-discover other
nodes on the same host and bootstrap them all into a cluster. This
commit fixes the docs on auto-bootstrapping to recognise this.
Today we don't really say anything about the requirements for the data
path in terms of correctness, and we specifically say to avoid NFS for
performance reasons. This isn't wholly accurate: some NFS
implementations work just fine. This commit documents a more balanced
position on local vs remote storage.
This moves the bulk of the upgrade information into the consolidated upgrade guide, but leaves the primary upgrade topic in place as a cross reference.
Relates to: https://github.com/elastic/stack-docs/pull/1970
Co-authored-by: gchaps <33642766+gchaps@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
(cherry picked from commit f6473d71f9)
Co-authored-by: debadair <debadair@elastic.co>
This commit updates the Operator-only functionality doc to
mention the operator only settings introduced in #82819.
It also adds an integration test for those operator only
settings that would have caught #83359.
As of 8.0, the compatibility window for cross-cluster search (CCS) to an earlier release will be one minor release. This updates the CCS docs and adds a related 8.0 breaking change.
Closes https://github.com/elastic/elasticsearch/issues/80782
* Adds a prerequisites section covering remote cluster config, node roles, and security.
* Moves existing content about remote cluster config to the prereqs.
* Updates the remote cluster docs to include information about eligible gateway nodes and tagging for gateway nodes.
Closes https://github.com/elastic/elasticsearch/issues/72001
Updates the remote clusters version compatibility table to include 7.17 and 8.x versions.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the same-shard allocation decider falls back to checking the
hostname if the node has no host address. In practice nodes will always
have an address so the fallback is dead code. This commit removes that
dead code.
Relates #80702 which will add the ability to distinguish nodes by
hostname regardless of whether they have an address or not, and #80767
which optimizes this area of code - this refactoring should make the
optimization simpler.
Today we increase the verbosity of discovery failures after 5 minutes
without a master. Unfortunately 5 minutes is a common orchestration
timeout, so if discovery is broken then we see nodes being shut down
just before they start to emit useful logs. This commit reduces the
default timeout to 3 minutes to address that.
We have a few leftover mentions of `zen` discovery, mostly for
historical/BwC reasons, which this commit removes.
Prior to this commit the default value for `discovery.type` was `zen`
but this was not written down anywhere or officially supported: the two
options were to set it to `single-node` or to omit it entirely. This
commit changes the default to `multi-node` and documents this.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Today we have a short note in one place in the docs saying not to touch
the contents of the data path. This commit expands the warning to
describe more precisely what is forbidden, and to give some more detail
of the consequences, and also duplicates the warning to the other
location that documents the `path.data` setting.
Deprecate the script context cache in favor of the general cache.
Users should use the following settings:
`script.max_compilations_rate` to set the max compilation rate
for user scripts such as filter scripts. Certain script contexts
that submit scripts outside of the control of the user are
exempted from this rate limit. Examples include runtime fields,
ingest and watcher.
`script.cache.max_size` to set the max size of the cache.
`script.cache.expire` to set the expiration time for entries in
the cache.
What's deprecated?
`script.max_compilations_rate: use-context`. This special
setting value was used to turn on the script context-specific caches.
`script.context.$CONTEXT.cache_max_size`, use `script.cache.max_size`
instead.
`script.context.$CONTEXT.cache_expire`, use `script.cache.expire`
instead.
`script.context.$CONTEXT.max_compilations_rate`, use
`script.max_compilations_rate` instead.
The default cache size was increased from `100` to `3000`, which
was approximately the max cache size when using context-specific caches.
The default compilation rate limit was increased from `75/5m` to
`150/5m` to account for increasing uses of scripts.
System script contexts can now opt-out of compilation rate limiting
using a flag rather than a sentinel rate limit value.
7.16: Script: Deprecate script context cache #79508
Refs: #62899
7.16: Script: Opt-out system contexts from script compilation rate limit #79459
Refs: #62899
Today we limit the max number of concurrent snapshot file restores
per recovery. This works well when the default
node_concurrent_recoveries is used (which is 2). When this limit is
increased, it is possible to exhaust the underlying repository
connection pool, affecting other workloads.
This commit adds a new setting
`indices.recovery.max_concurrent_snapshot_file_downloads_per_node` that
allows limiting the max number of snapshot file downloads per node
during recoveries. When a recovery starts on the target node, it tries
to acquire a permit that, when granted, allows it to download snapshot
files. This is communicated to the source node in the
StartRecoveryRequest. This is a rather conservative approach, since it
is possible that a recovery that gets a permit to use snapshot files
doesn't recover any snapshot file, while a concurrent recovery that
doesn't get a permit could have taken advantage of recovering from a
snapshot.
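A sketch of the permit scheme using a plain semaphore; the class name is hypothetical, and in the real change the outcome travels to the source node in the StartRecoveryRequest:

```java
import java.util.concurrent.Semaphore;

class SnapshotDownloadPermits {
    // Node-wide cap on concurrent snapshot-file downloads across recoveries.
    private final Semaphore permits;

    SnapshotDownloadPermits(int maxConcurrentSnapshotFileDownloadsPerNode) {
        this.permits = new Semaphore(maxConcurrentSnapshotFileDownloadsPerNode);
    }

    // Called when a recovery starts on the target node: if no permit is
    // available, the recovery falls back to copying from the primary rather
    // than risking exhaustion of the repository's connection pool.
    boolean tryAcquireSnapshotFileDownloads() {
        return permits.tryAcquire();
    }

    void release() {
        permits.release();
    }
}
```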
Closes #79044
Changes can-match from a shard-level to a node-level action. This helps avoid an explosion of shard-level can-match subrequests in clusters with many shards, which can cause stability issues. Also introduces a new search_coordination thread pool to send and handle the node-level can-match requests.
This PR changes uses of transient cluster settings to persistent cluster settings. It also deprecates the use of transient settings.
Relates to #49540
* A typo error
a space between 'E' and 'cluster...'
* Update example, fix headings, change notes
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Co-authored-by: Marwane Chahoud <marwane.chahoud@gmail.com>
* [DOCS] Fix default value for closed indices
#57953 introduced changes that added ESS icons to many Elasticsearch settings. As part of those changes, the default value for `cluster.indices.close.enable` was indicated as `false`, when it should be `true`. This PR updates the default value to `true`.
Closes #78877
* Update description
* Update note to remove outdated claims
The documentation indicates that `stack.templates.enabled` can be used in Elasticsearch Service, but it is not part of the settings allowlist in ESS. This PR makes the documentation match the state of the allowlist.
* Improve docs for pre-release version compatibility
Follow-up to #78317 clarifying a couple of points:
- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local or remote cluster
is a released build
* Remote cluster build date nit
The reference manual includes docs on version compatibility in various
places, but it's not clear that these docs only apply to released
versions and that the rules for pre-release versions are stricter than
folks expect. This commit adds some words to the docs for unreleased
versions which explains this subtlety.
* [DOCS] Update remote cluster docs
* Add files, rename files, write new stuff
* Plethora of changes
* Add test and update snippets
* Redirects, moved files, and test updates
* Moved file to x-pack for tests
* Remove older CCS page and add redirects
* Cleanup, link updates, and some rewrites
* Update image
* Incorporating user feedback and rewriting much of the remote clusters page
* More changes from review feedback
* Numerous updates, including request examples for CCS and Kibana
* More changes from review feedback
* Minor clarifications on security for remote clusters
* Incorporate review feedback
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Some review feedback and some editorial changes
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
We currently use the plaintext body of a shard request as the key to the
request cache. This has the disadvantage that very large requests can
quickly fill up the cache due to the size of their keys. With this commit,
we instead use a SHA-256 hash of the shard request as the cache key,
which uses a constant (and much smaller) number of bytes.
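A sketch of the key construction (the helper name is hypothetical):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class RequestCacheKeySketch {
    // Hash the serialized shard request instead of storing it verbatim: every
    // key is a fixed 32 bytes, however large the request body happens to be.
    static byte[] cacheKey(byte[] serializedShardRequest) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(serializedShardRequest);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 is always available", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = cacheKey("{\"query\":{\"match_all\":{}}}".getBytes(StandardCharsets.UTF_8));
        System.out.println(key.length); // 32
    }
}
```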
Today we expire the client stats for HTTP channels 5 minutes after they
close. It's possible to open a very large number of HTTP channels in 5
minutes, possibly inadvertently, and the stats for those channels can be
overwhelming.
This commit introduces a limit on the number of channels tracked by each
node which applies in addition to the age limit, and makes these limits
configurable via static settings. It drops the pruning of old stats when
starting to track a new channel and instead uses a queue to expire the
oldest stats when each channel closes if necessary to respect the count
limit; it only performs age-based expiry when retrieving the stats,
since the count limit now bounds the memory needed. Finally, it
fixes some missing synchronization and makes sure that we expose only immutable objects to the stats subsystem.
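A sketch of the count-based expiry with hypothetical types; the age-based expiry performed when retrieving the stats is omitted:

```java
import java.util.ArrayDeque;
import java.util.Queue;

class ClosedChannelStatsSketch {
    record HttpChannelStats(long channelId, long closedTimeMillis) {}

    private final int maxClosedChannels; // static setting, applied per node
    private final Queue<HttpChannelStats> closedChannels = new ArrayDeque<>();

    ClosedChannelStatsSketch(int maxClosedChannels) {
        this.maxClosedChannels = maxClosedChannels;
    }

    // On channel close: enqueue the stats and, if over the count limit, drop
    // the oldest entry. Nothing is pruned when channels open, and the count
    // limit alone bounds the memory needed between stats retrievals.
    synchronized void onChannelClosed(HttpChannelStats stats) {
        closedChannels.add(stats);
        if (closedChannels.size() > maxClosedChannels) {
            closedChannels.poll();
        }
    }
}
```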
This is related to #73497. Currently, we only use the configured
transport.compression_scheme setting when compressing a request or a
response. Additionally, the cluster.remote.*.compression_scheme
setting is ignored. This commit fixes this behavior by respecting the
per-cluster setting. Additionally, it resolves confusion around inbound
and outbound connections by always responding with the same scheme that
was received. This allows remote connections to have different schemes
than local connections.
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary.
Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard.
Relates #73496
In 7.15, we intend for the indexing_data compression level and the
compression scheme lz4 to no longer be experimental. This commit
updates the documentation to reflect this. Additionally, it adds
missing docs for the cluster.remote.*.transport.compression_scheme
setting.
Relates to #73497.
The special values `_global_`, `_site_`, `0.0.0.0` and so on may resolve
to multiple addresses, of which one is chosen to be the publish address.
This commit generalises the warning about reachability as applied to
DNS-resolved hostnames to also apply to these special values.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
This commit adds a new set of classes that compute a peer recovery plan
based on source files + target files + available snapshots. When
possible, it tries to maximize the number of files reused from a
snapshot. It uses repositories with the `use_for_peer_recovery` setting
set to true.
It adds a new recovery setting `indices.recovery.use_snapshots`
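A sketch of the plan computation with hypothetical types; file identity here is reduced to name plus checksum:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RecoveryPlanSketch {
    record FileInfo(String name, String checksum) {}
    record Plan(List<FileInfo> fromSnapshot, List<FileInfo> fromSource) {}

    // For every file missing on the target, prefer an identical file from the
    // snapshot; fall back to transferring it from the primary (source node).
    static Plan computePlan(List<FileInfo> missingOnTarget,
                            Map<String, FileInfo> snapshotFiles) {
        List<FileInfo> fromSnapshot = new ArrayList<>();
        List<FileInfo> fromSource = new ArrayList<>();
        for (FileInfo file : missingOnTarget) {
            FileInfo inSnapshot = snapshotFiles.get(file.name());
            if (inSnapshot != null && inSnapshot.checksum().equals(file.checksum())) {
                fromSnapshot.add(file); // maximize files reused from the snapshot
            } else {
                fromSource.add(file);
            }
        }
        return new Plan(fromSnapshot, fromSource);
    }
}
```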
Relates #73496
In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the limit has been bumped from 1024 to 4096.
The semantics will effectively change when we upgrade to Lucene 9; this PR is only about agreeing on a migration strategy and documenting this change.
To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>