elasticsearch

Commit Graph

Author	SHA1	Message	Date
Rene Groeschke	35ec6f348c	Introduce simple public yaml-rest-test plugin (#76554 ) This introduces a basic public yaml rest test plugin that is supposed to be used by external elasticsearch plugin authors. This is driven by #76215 - Rename yaml-rest-test to intern-yaml-rest-test - Use public yaml plugin in example plugins Co-authored-by: Mark Vieira <portugee@gmail.com>	2021-08-31 08:45:52 +02:00
Nikita Glashenko	1db17ada95	Fix wrong error upper bound when performing incremental reductions (#43874 ) When performing incremental reductions, 0 value of docCountError may mean that the error was not previously calculated, or that the error was indeed previously calculated and its value was 0. We end up rejecting true values set to 0 this way. This may lead to wrong upper bound of error in result. To fix it, this PR makes docCountError nullable. null values mean that error was not calculated yet. Fixes #40005 Co-authored-by: Igor Motov <igor@motovs.org> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2021-07-22 08:18:24 -10:00
Mayya Sharipova	aa76ebbfbe	Set max allowed size for stored async response (#74455 ) Add a dynamic transient cluster setting search.max_async_search_response_size that controls the maximum allowed size for a stored async search response. The default max size is 10Mb. An attempt to store an async search response larger than this size will result in error. Relates to #67594	2021-06-30 10:21:28 -04:00
Rene Groeschke	b79dd52c1b	Cleanup QA projects build scripts (#74428 ) Aiming for configuring less during the build, this removes non required configuration from qa build scripts that do not contain any sources. We also remove a few non required afterEvaluate hooks	2021-06-23 11:35:47 +02:00
Alexander Reelsen	1a8b890af5	Make PIT validation error actionable (#74224 ) Closes #74223	2021-06-19 15:30:53 -04:00
Rory Hunter	a5d2251064	Order imports when reformatting (#74059 ) Change the formatter config to sort / order imports, and reformat the codebase. We already had a config file for Eclipse users, so Spotless now uses that. The "Eclipse Code Formatter" plugin ought to be able to use this file as well for import ordering, but in my experiments the results were poor. Instead, use IntelliJ's `.editorconfig` support to configure import ordering. I've also added a config file for the formatter plugin. Other changes: * I've quietly enabled the `toggleOnOff` option for Spotless. It was already possible to disable formatting for sections using the markers for docs snippets, so enabling this option just accepts this reality and makes it possible via `formatter:off` and `formatter:on` without the restrictions around line length. It should still only be used as a very last resort and with good reason. * I've removed mention of the `paddedCell` option from the contributing guide, since I haven't had to use that option for a very long time. I moved the docs to the spotless config.	2021-06-16 09:22:22 +01:00
Nhat Nguyen	b3d36d5e03	Integrate circuit breaker in AsyncTaskIndexService (#73862 ) This change integrates the circuit breaker in AsyncTaskIndexService to make sure that we won't hit OOM when serializing a large response of an async search. Related to #67594 Supersedes #73638 Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2021-06-09 11:25:59 -04:00
Ryan Ernst	68817d7ca2	Rename o.e.common in libs/core to o.e.core (#73909 ) When libs/core was created, several classes were moved from server's o.e.common package, but they were not moved to a new package. Split packages need to go away long term, so that Elasticsearch can even think about modularization. This commit moves all the classes under o.e.common in core to o.e.core. relates #73784	2021-06-08 09:53:28 -07:00
Nhat Nguyen	44fc661835	Add point in time to HLRC (#72167 ) Closes #70593	2021-05-12 17:59:25 -04:00
Rene Groeschke	e609e07cfe	Remove internal build logic from public build tool plugins (#72470 ) Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic. This includes a refactoring of ElasticsearchDistribution to handle types better, so that we can differentiate between Elasticsearch distribution types supported in TestClustersPlugin and types only supported in internal plugins. It also introduces a set of internal versions of public plugins. As part of this we also generate the plugin descriptors now. As a follow-up, we can move these publicly used classes into an extra project (declared as an included build). We keep LoggedExec and VersionProperties effectively public, and add a workaround for RestTestBase.	2021-05-06 14:02:35 +02:00
Rene Groeschke	5bcd02cb4d	Restructure build tools java packages (#72030 ) Related to #71593 we move all build logic that is for elasticsearch build only into the org.elasticsearch.gradle.internal* packages This makes it clearer if build logic is considered to be used by external projects Ultimately we want to only expose TestCluster and PluginBuildPlugin logic to third party plugin authors. This is a very first step towards that direction.	2021-04-26 14:53:55 +02:00
Joe Gallo	2a0ec50d47	Apply REST API compatibility testing for the :x-pack plugins (#71302 )	2021-04-06 11:35:33 -04:00
Jason Tedor	32314493a2	Pass override settings when creating test cluster (#71203 ) Today when creating an internal test cluster, we allow the test to supply the node settings that are applied. The extension point to provide these settings has a single integer parameter, indicating the index (zero-based) of the node being constructed. This allows the test to make some decisions about the settings to return, but it is too simplistic. For example, imagine a test that wants to provide a setting, but some values for that setting are not valid on non-data nodes. Since the only information the test has about the node being constructed is its index, it does not have sufficient information to determine if the node being constructed is a non-data node or not, since this is done by the test framework externally by overriding the final settings with specific settings that dictate the roles of the node. This commit changes the test framework so that the test has information about what settings are going to be overridden by the test framework after the test provides its test-specific settings. This allows the test to make informed decisions about what values it can return to the test framework.	2021-04-02 10:20:36 -04:00
Mark Vieira	6339691fe3	Consolidate REST API specifications and publish under Apache 2.0 license (#70036 )	2021-03-26 16:20:14 -07:00
Nhat Nguyen	5bb440cdca	Move point in time to server (#70704 ) This change moves the implementation of point in time to the server package.	2021-03-24 14:29:20 -04:00
Joe Gallo	4cc4c2cc47	[REST Compatible API] Route refactoring (#69573 ) Related to #51816 Makes `Route`s `RestApiVersion` -aware (and `RestHandler`s `RestApiVersion` -agnostic). Refactors how `Route`s are constructed in the case of deprecation or replacement of routes.	2021-03-05 19:11:37 -05:00
jimczi	69309c1b73	Remove leftover of #67877 Closes #69313	2021-02-22 09:29:47 +01:00
Rene Groeschke	bdf229a148	Introduce Internal Test Artifact Plugin (#68766 ) This reduces the ceremony of declaring test artifacts for a project. It also solves an issue with usage of the deprecated testRuntime configuration that testArtifacts extendsFrom, which seems not required at all and would have broken with Gradle 7.0 anyhow. Test artifact resolution is now variant-aware, which allows us a more adequate compile and runtime classpath for the consuming projects. We also introduce a convention method in the elasticsearch build to declare test artifact dependencies in an easy way, close to how it's done by the Gradle built-in test fixture plugin. Furthermore, we cleaned up some inconsistent test dependency declarations when relying on a project and on its test artifacts.	2021-02-16 14:36:17 +01:00
Jim Ferenczi	f67185f746	Add a cluster privilege to cancel tasks and delete async searches (#68679 ) This change adds a new cluster privilege cancel_task that allows to: Cancel running tasks (_tasks/_cancel). Cancel and delete async searches. Today the 'manage' cluster privilege is required to cancel tasks and to delete async searches when security features are enabled. This new focused privilege allows to handle tasks and searches only. The change also adds the privilege to the internal 'kibana_system' and '_async_search' roles. They both need to be able to cancel tasks and delete async searches. Relates #67965	2021-02-16 10:56:17 +01:00
Mayya Sharipova	6521d2af27	Introduce eql search status API (#68065 ) Introduce eql search status API, that reports the status of eql stored or async search. GET _eql/search/status/<id> The API is restricted to the monitoring_user role. For a running eql search, a response has the following format: { "id" : <id>, "is_running" : true, "is_partial" : true, "start_time_in_millis" : 1611690235000, "expiration_time_in_millis" : 1611690295000 } For a completed eql search, a response has the following format: { "id" : <id>, "is_running" : false, "is_partial" : false, "expiration_time_in_millis" : 1611690295000, "completion_status" : 200 } Closes #66955	2021-02-11 09:30:13 -05:00
Jim Ferenczi	e7b4b01cd1	Revert "Extend async search keep alive (#67877 )" (#68855 ) This reverts commit `244fc958bf`.	2021-02-10 20:36:20 +01:00
Hendrik Muhs	54ed2e37d9	[Transform] implement retention policy to delete data from a transform (#67832 ) add a retention policy to transform to delete data that is considered outdated as part of a transform checkpoint. fixes #67916	2021-02-08 15:06:15 +01:00
Jim Ferenczi	ec48172084	Allow deletion of async searches with the manage privilege (#67965 ) This change allows users that did not initiate an async search to delete it if they have the cluster manage and manage-security privilege. It is equivalent to the cancellation of tasks through the task manager (same privilege required) and will allow users with the right permissions to cancel/delete async searches if they know the async execution id.	2021-02-08 12:05:27 +01:00
Rene Groeschke	5dfa6f46ac	Remove deprecated usage of default configuration (#68575 ) This has been deprecated in Gradle before but we haven't been warned. Gradle 7.0 will likely introduce a change in behaviour here, so we should fix the usage of this configuration upfront. See https://github.com/gradle/gradle/issues/16027 for further information about the change in Gradle 7.0	2021-02-07 12:08:02 +01:00
Mark Vieira	a92a647b9f	Update sources with new SSPL+Elastic-2.0 license headers As per the new licensing change for Elasticsearch and Kibana this commit moves existing Apache 2.0 licensed source code to the new dual license SSPL+Elastic license 2.0. In addition, existing x-pack code now uses the new version 2.0 of the Elastic license. Full changes include: - Updating LICENSE and NOTICE files throughout the code base, as well as those packaged in our published artifacts - Update IDE integration to now use the new license header on newly created source files - Remove references to the "OSS" distribution from our documentation - Update build time verification checks to no longer allow Apache 2.0 license header in Elasticsearch source code - Replace all existing Apache 2.0 license headers for non-xpack code with updated header (vendored code with Apache 2.0 headers obviously remains the same). - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.	2021-02-02 16:10:53 -08:00
Nik Everett	ffad8ed8a4	Drop rawtypes warning from async search (#68248 ) It looks like we don't need it and it helps us write code with "funny" assumptions without realizing it.	2021-02-01 08:12:20 -05:00
Jim Ferenczi	d1faab4cd3	Remove flaky test for async search (#67982 ) This change removes a test that tries to simulate failures when indexing the response of an async search request. The test is flaky and doesn't simulate the errors correctly so it was disabled. This change removes it entirely since it doesn't add any value. Closes #63948	2021-01-27 10:10:14 +01:00
Jim Ferenczi	cd2007ed5c	Async search keep alive validation (#67981 ) This commit changes the minimum value for the keep_alive option of async searches to 1s. Closes #67974	2021-01-27 09:04:06 +01:00
Nhat Nguyen	244fc958bf	Extend async search keep alive (#67877 ) There can be a race between two GET async search requests, and the one with a lower keep_alive parameter wins the race. This scenario is not desirable as we should retain the search result for all requests. This commit ensures the keep_alive is extended and never goes backward.	2021-01-25 17:46:11 -05:00
Rory Hunter	1a05a5ac24	Introduce deprecation categories (#67443 ) Closes #64824. Introduce the concept of categories to deprecation logging. Every location where we log a deprecation message must now include a deprecation category.	2021-01-18 16:16:54 +00:00
Julie Tibshirani	5852fbedf5	Rename QueryShardContext -> SearchExecutionContext. (#67490 ) We decided to rename `QueryShardContext` to clarify that it supports all parts of search request execution. Before there was confusion over whether it should only be used for building queries, or maybe only used in the query phase. This PR also updates the javadocs. Closes #64740.	2021-01-14 09:11:59 -08:00
Ioannis Kakavas	bd873698bc	Ensure CI is run in FIPS 140 approved only mode (#64024 ) We were depending on BouncyCastle FIPS's own mechanics to set itself in approved only mode since we run with the Security Manager enabled. The check during startup seems to happen before we set our restrictive SecurityManager though in org.elasticsearch.bootstrap.Elasticsearch, and this means that BCFIPS would not be in approved only mode, unless explicitly configured so. This commit sets the appropriate JVM property to explicitly set BCFIPS in approved only mode in CI and adds tests to ensure that we will be running with BCFIPS in approved only mode when we expect to. It also sets xpack.security.fips_mode.enabled to true for all test clusters used in fips mode and sets the distribution to the default one. It adds a password to the elasticsearch keystore for all test clusters that run in fips mode. Moreover, it changes a few unit tests where we would use bcrypt even in FIPS 140 mode. These would still pass since we are bundling our own bcrypt implementation, but are now changed to use FIPS 140 approved algorithms instead for better coverage. It also addresses a number of tests that would fail in approved only mode. Mainly: Tests that use PBKDF2 with a password less than 112 bits (14char). We elected to change the passwords used everywhere to be at least 14 characters long instead of mandating the use of pbkdf2_stretch because both pbkdf2 and pbkdf2_stretch are supported and allowed in fips mode and it makes sense to test with both. We could possibly figure out the password algorithm used for each test and adjust password length accordingly only for pbkdf2 but there is little value in that. It's good practice to use strong passwords so if our docs and tests use longer passwords, then it's for the best. The approach is brittle as there is no guarantee that the next test that will be added won't use a short password, so we add some testing documentation too. This leaves us with a possible coverage gap since we do support passwords as short as 6 characters but we only test with > 14 chars but the validation itself was not tested even before. Tests can be added in a followup, outside of fips related context. Tests that use a PKCS12 keystore and were not already muted. Tests that depend on running test clusters with a basic license or using the OSS distribution as FIPS 140 support is not available in either of these. Finally, it adds some information around FIPS 140 testing in our testing documentation reference so that developers can hopefully keep in mind fips 140 related intricacies when writing/changing docs.	2020-12-23 21:00:49 +02:00
Albert Zaharovits	5921ead045	Store and use only internal security headers (#66365 ) For async searches (EQL included) the client's request headers were erroneously stored in the .tasks index. This might expose the requesting client's HTTP Authorization header. This PR fixes that by employing the usual approach to store only the security-internal headers, which carry the authentication result, instead of the original Authorization header, which is commonly utilized to redo authentication for scheduled tasks.	2020-12-17 20:51:08 +02:00
Rene Groeschke	defaa93902	Avoid tasks materialized during configuration phase (#65922 ) * Avoid tasks materialized during configuration phase * Fix RestTestFromSnippet testRoot setup	2020-12-12 16:14:17 +01:00
Armin Braun	80468992da	Fix AbstractClient#execute Listener Leak (#65415 ) This was observed in #65405 due to trying to locally execute a task whose parent was already cancelled but is a general issue. We should not throw from APIs that consume a listener, as this may leak the listener, as it did in this case. Rather than fixing the specific case of #65405 this fixes the abstract client overall to avoid other such leaks. Closes #65405	2020-12-01 00:48:34 +01:00
Nik Everett	c227554080	Remove SearchContext from constructing aggregations (#64953 ) This replaces the `SearchContext` passed to the ctor of `Aggregation`s with `AggregationContext`. It ends up adding a fairly large number of methods to `AggregationContext` but in exchange it shows a path to removing a few methods from `SearchContext`. That seems nice! It also gives us an accurate inventory of "all of the stuff" that aggregations use to build and run.	2020-11-30 13:19:44 -05:00
Jim Ferenczi	6d22901eec	Do not skip not available shard exception in search response (#64337 ) Today search responses do not report failures for shards that were not available for the search. So if one shard is not assigned on a search over 5 shards, the search response will report: ``` "_shards": { "total": 5, "successful": 4, "skipped": 0, "failed": 0 } ``` If all shards are unassigned, we report a generic search phase exception with no cause. It's easy to spot that `successful` is less than `total` in the response but not reporting the failure is misleading for users. This change removes the special handling of not available shards exception in search responses and treats them as any other failure that could occur on a shard. These exceptions will count in the `failed` section and will be reported in detail in the `shard_failures` section. If all shards are unavailable, the search API will now return 404 NOT_FOUND as an indication that the search failed because it couldn't find any of the resources. Closes #47700	2020-11-24 08:48:58 +01:00
Armin Braun	8b39992bf8	Cleanup TransportRequestOptions Usage (#65248 ) A class with 2 fields does not need a builder, especially when in many cases the builder result is just equivalent to the `EMPTY` singleton to begin with. Removed the builder and simplified related code accordingly.	2020-11-19 13:12:53 +01:00
Ryan Ernst	329771d3ed	Rename test module (#65020 ) This commit renames the test module used by async-search so that it does not trip the check on test modules published with snapshot builds. A more robust approach will be added in the future so this is not a module at all, but for now this fixes CI from breaking on release tests. closes #64910	2020-11-12 14:06:21 -08:00
Rene Groeschke	810e7ff6b0	Move tasks in build scripts to task avoidance api (#64046 ) - Some trivial cleanup on build scripts - Change task referencing in build scripts to use task avoidance api where replacement is trivial.	2020-11-12 12:04:15 +01:00
Daniel Mitterdorfer	19b55640dd	Mute AsyncSearchActionIT.testRetryVersionConflict (#64917 ) Relates #63948	2020-11-11 13:56:03 +01:00
Ryan Ernst	f5598475fa	Treat all xpack as modules (#64832 ) When rest tests install plugins, they currently treat xpack as plugins instead of modules. This was a side effect of splitting the rest test plugin out from PluginBuildPlugin. However, that means xpack modules are being run through the elasticsearch plugin installer, which is not guaranteed to work. This commit reworks rest tests to treat anything under xpack as a module. A side effect of this is that QA plugins under xpack get treated as modules. This is ok because they are just for testing, we don't need to validate them in the same way we do actual plugins. Additionally, this commit relaxes the expectation that modules are only added for the integ test distribution. There is already a check to not overwrite an existing module, so this wasn't a useful optimization anyways.	2020-11-09 15:58:54 -08:00
Mayya Sharipova	074f7d2e8a	Async search status (#62947 ) Introduce async search status API GET /_async_search/status/<id> The API is restricted to the monitoring_user role. For a running async search, the response is: ```js { "id" : <id>, "is_running" : true, "is_partial" : true, "start_time_in_millis" : 1583945890986, "expiration_time_in_millis" : 1584377890986, "_shards" : { "total" : 562, "successful" : 188, "skipped" : 0, "failed" : 0 } } ``` For a completed async search, an additional `completion_status` field is added. ```js { "id" : <id>, "is_running" : false, "is_partial" : false, "start_time_in_millis" : 1583945890986, "expiration_time_in_millis" : 1584377890986, "_shards" : { "total" : 562, "successful" : 562, "skipped" : 0, "failed" : 0 }, "completion_status" : 200 } ``` Closes #57537	2020-11-03 11:35:28 -05:00
Nhat Nguyen	4388449626	Ignore cancellation error when search is cancelled (#64240 ) Since #63520, we will cancel a search that hits shard failures and does not accept partial results. However, that change can return the wrong HTTP code for bad requests (from 4xx to 5xx) due to the cancellation. Relates #63520 Closes #64012 Closes #63702	2020-10-27 19:43:02 -04:00
Jim Ferenczi	2b4bde45b6	Async search should retry updates on version conflict (#63652 ) The _async_search APIs can throw a version conflict exception when the internal response is updated concurrently. That can happen if the final response is written while the user extends the expiration time. That scenario should be rare but it happened in Kibana for several users so this change ensures that updates are retried at least 5 times. That should resolve the transient errors for Kibana. This change also preserves the version conflict exception in case the retry didn't work instead of returning a confusing 404. This commit also ensures that we don't delete the response if the search was cancelled internally and not deleted explicitly by the user. Closes #63213	2020-10-16 08:37:23 +02:00
Nhat Nguyen	b40e82f494	Mute testSearchPhaseFailureNoCause Tracked at #63702	2020-10-14 15:33:28 -04:00
Nik Everett	4aaffc6a3d	Consider query when optimizing date rounding (#63403 ) Before this change we inspected the index when optimizing `date_histogram` aggregations, precalculating the divisions for the buckets for the entire range of dates on the index so long as there aren't a ton of these buckets. This works very well when you query all of the dates in the index which is quite common - after all, folks frequently want to query a week of data and have daily indices. But it doesn't work as well when the index is much larger than the query. This is quite common when dumping data into ES just to investigate it but less common in the traditional time series use case. But even there it still happens, it is just less impactful. Consider the default query produced by Kibana's Discover app: a range of 15 minutes and an interval of 30 seconds. This optimization saves something like 3 to 12 nanoseconds per document, so that 15 minutes would have to have hundreds of millions of documents for it to be impactful. Anyway, this commit takes the query into account when precalculating the buckets. Mostly this is good when you have "dirty data". Imagine loading 80 billion docs in an index to investigate them. Most of them have dates around 2015 and 2016 but some have dates in 1970 and others have dates in 2030. These outlier dates are "dirty" "garbage". Well, without this change a `date_histogram` across many of these docs is significantly slowed down because we don't precalculate the range due to the outliers. That's just rude! So this change takes the query into account. The bulk of the code change here is plumbing the query into place. It turns out that it's a ton of plumbing, so instead of just adding a `Query` member in hundreds of args replace `QueryShardContext` with a new `AggregationContext` which does two things: 1. Has the top level `Query`. 2. Exposes just the parts of `QueryShardContext` that we actually need to run aggregation. This lets us simplify a few tests now and will let us simplify many, many tests later.	2020-10-12 13:11:44 -04:00
Gordon Brown	91f4b58bf7	Deprecate REST access to System Indices (#60945 ) This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns. Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail and a few specific APIs which will continue to allow access to system indices by default: - `GET _cluster/health` - `GET {index}/_recovery` - `GET _cluster/allocation/explain` - `GET _cluster/state` - `POST _cluster/reroute` - `GET {index}/_stats` - `GET {index}/_segments` - `GET {index}/_shard_stores` - `GET _cat/[indices,aliases,health,recovery,shards,segments]` Deprecation warnings for accessing system indices take the form: ``` this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default ```	2020-10-06 11:13:48 -06:00
Nhat Nguyen	1fe287d647	Fix testRestartAfterCompletion (#63211 ) We need to complete the search before closing the iterator, which internally closes the point in time; otherwise, the search will fail with a missing context error. Closes #62451	2020-10-02 17:31:12 -04:00
Jim Ferenczi	fbed2a1709	Request-level circuit breaker support on coordinating nodes (#62223 ) This commit allows the coordinating node to account for the memory used to perform partial and final reduce of aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save and reduce the results of shard aggregations in the request circuit breaker. Before any partial or final reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException is thrown if it exceeds the maximum memory allowed in this breaker. This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced. This estimation can be completely off for some aggregations but it is corrected with the real size after the reduce completes. If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations and replace the estimation with the serialized size of the newly reduced result. As a follow up we could trigger partial reduces based on the memory accounted in the circuit breaker instead of relying on a static number of shard responses. A simpler follow up that could be done in the meantime is to [reduce the default batch reduce size](https://github.com/elastic/elasticsearch/issues/51857) of blocking search requests to a more sane number. Closes #37182	2020-09-24 14:02:49 +02:00

138 Commits