elasticsearch

Commit Graph

Author	SHA1	Message	Date
Przemyslaw Gomulka	71e05838a6	[Rest Api Compatibility] Enable tests after types and cat api fixed (#75179 ) Some tests are fixed after typed api is available with compatible api. Also cat api returning text fixed some tests relates #51816	2021-07-14 08:37:38 +02:00
Armin Braun	8947c1e980	Save Memory on Large Repository Metadata Blob Writes (#74313 ) This PR adds a new API for doing streaming serialization writes to a repository to enable repository metadata of arbitrary size and at bounded memory during writing. The existing write-APIs require knowledge of the eventual blob size beforehand. This forced us to materialize the serialized blob in memory before writing, costing a lot of memory in case of e.g. very large `RepositoryData` (and limiting us to `2G` max blob size). With this PR the requirement to fully materialize the serialized metadata goes away and the memory overhead becomes completely bounded by the outbound buffer size of the repository implementation. As we move to larger repositories this makes master node stability a lot more predictable since writing out `RepositoryData` does not take as much memory any longer (same applies to shard level metadata), enables aggregating multiple metadata blobs into a single larger blob without massive overhead and removes the 2G size limit on `RepositoryData`.	2021-06-29 11:29:55 +02:00
Armin Braun	cbf48e0633	Flatten Get Snapshots Response (#74451 ) This PR returns the get snapshots API to the 7.x format (and transport client behavior) and enhances it for requests that ask for multiple repositories. The changes for requests that target multiple repositories are: * Add `repository` field to `SnapshotInfo` and REST response * Add `failures` map alongside `snapshots` list instead of returning just an exception response as done for single repo requests * Pagination now works across repositories instead of being per repository for multi-repository requests closes #69108 closes #43462	2021-06-24 16:58:33 +02:00
Ryan Ernst	68817d7ca2	Rename o.e.common in libs/core to o.e.core (#73909 ) When libs/core was created, several classes were moved from server's o.e.common package, but they were not moved to a new package. Split packages need to go away long term, so that Elasticsearch can even think about modularization. This commit moves all the classes under o.e.common in core to o.e.core. relates #73784	2021-06-08 09:53:28 -07:00
Armin Braun	52e7b926a9	Make Large Bulk Snapshot Deletes more Memory Efficient (#72788 ) Use an iterator instead of a list when passing around what to delete. In the case of very large deletes the iterator is much smaller than the actual list of files to delete (since we save all the prefixes which adds up if the individual shard folders contain lots of deletes). Also this commit as a side-effect adjusts a few spots in logging where the log messages could be catastrophic in size when trace logging is activated.	2021-05-10 13:40:57 +02:00
Armin Braun	bef9dab643	Cleanup BlobPath Class (#72860 ) There should be a singleton for the empty version of this. All the copying to `String[]` or use as an iterator make no sense either when we can just use the list outright.	2021-05-10 00:10:39 +02:00
Francisco Fernández Castaño	e6894960f4	Include URLHttpClientIOException on URLBlobContainerRetriesTests testReadBlobWithReadTimeouts (#71318 ) In some scenarios where the read timeout is too tight it's possible that the http request times out before the response headers have been received, in that case an URLHttpClientIOException is thrown. This commit adds that exception type to the expected set of read timeout exceptions. Closes #70931	2021-04-06 14:58:57 +02:00
Jake Landis	279fde375e	Apply REST API compatibility testing for the :modules (#71137 )	2021-04-02 11:20:54 -05:00
Mark Vieira	6339691fe3	Consolidate REST API specifications and publish under Apache 2.0 license (#70036 )	2021-03-26 16:20:14 -07:00
Francisco Fernández Castaño	3f8a9256ea	Add searchable snapshots integration tests for URL repositories (#70709 ) Relates #69521	2021-03-26 15:23:44 +01:00
Francisco Fernández Castaño	b1c4cb4451	Take into account range start to compute the current stream end on url repositories. (#70509 ) Closes #70310	2021-03-18 15:44:03 +01:00
David Kyle	05637bc713	Mute URLBlobContainerRetriesTests::testReadRangeBlobWithRetries (#70374 )	2021-03-15 11:41:55 +00:00
Francisco Fernández Castaño	ae5308c638	Add support for range reads and retries to URL repositories (#69521 )	2021-03-08 13:14:12 +01:00
David Turner	d3e0a571eb	URL repos and searchable snapshots don't mix (#69197 ) Provides docs and a better error message regarding using URL repositories with searchable snapshots. Relates #68918	2021-02-18 17:50:50 +00:00
Mark Vieira	a92a647b9f	Update sources with new SSPL+Elastic-2.0 license headers As per the new licensing change for Elasticsearch and Kibana this commit moves existing Apache 2.0 licensed source code to the new dual license SSPL+Elastic license 2.0. In addition, existing x-pack code now uses the new version 2.0 of the Elastic license. Full changes include: - Updating LICENSE and NOTICE files throughout the code base, as well as those packaged in our published artifacts - Update IDE integration to now use the new license header on newly created source files - Remove references to the "OSS" distribution from our documentation - Update build time verification checks to no longer allow Apache 2.0 license header in Elasticsearch source code - Replace all existing Apache 2.0 license headers for non-xpack code with updated header (vendored code with Apache 2.0 headers obviously remains the same). - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.	2021-02-02 16:10:53 -08:00
Rene Groeschke	ca96612245	Remove debugging printlns from build scripts	2021-01-18 15:19:19 +01:00
Rene Groeschke	f83d545b81	Port UrlFixture to test fixture plugin (#67169 ) - Port UrlFixture to test fixture plugin - Avoid exposing PID and PORT for http fixture when not required - Make AbstractHttpFixture work inside and outside docker - Check directories when running UrlFixture	2021-01-18 14:59:18 +01:00
Armin Braun	3819fcb582	Add Ability to Write a BytesReference to BlobContainer (#66501 ) Except when writing actual segment files to the blob store we always write `BytesReference` instead of a stream. Only having the stream API available forces needless copies on us. I fixed the straight-forward needless copying for HDFS and FS repos in this PR, we could do similar fixes for GCS and Azure as well and thus significantly reduce the peak memory use of these writes on master nodes in particular.	2020-12-17 17:42:29 +01:00
Rene Groeschke	defaa93902	Avoid tasks materialized during configuration phase (#65922 ) * Avoid tasks materialized during configuration phase * Fix RestTestFromSnippet testRoot setup	2020-12-12 16:14:17 +01:00
Rene Groeschke	0911d04467	Make AntFixture handling task provider api compliant (#65832 ) This tweaks the AntFixture handling to make it compliant with the task avoidance api. Tasks of type StandaloneRestTestTask are now generally finalised by using the typed ant stop task which allows us to remove error-prone dependsOn overrides in StandaloneRestTestTask. As a result we also ported more task definitions in the build to task avoidance api. The next work item regarding AntFixture handling is porting AntFixture to a plain Gradle task; removing the Groovy AntBuilder will allow us to port more build logic from Groovy to Java, but that is out of the scope of this PR.	2020-12-08 13:07:36 +01:00
Rene Groeschke	810e7ff6b0	Move tasks in build scripts to task avoidance api (#64046 ) - Some trivial cleanup on build scripts - Change task referencing in build scripts to use task avoidance api where replacement is trivial.	2020-11-12 12:04:15 +01:00
Armin Braun	c419cd3251	Use Pooled Byte Arrays in BlobStoreRepository Serialization (#63461 ) Many of the metadata blobs we handle in the changed spots can grow in size up to `O(1M)`. Not using recycled bytes when working with them causes significant spikes in memory use for larger repositories.	2020-10-12 10:27:52 +02:00
Jake Landis	7dd57c9415	Introduce javaRestTest source set/task and convert modules (#59939 ) Introduce a javaRestTest source set and task to complement the yamlRestTest. javaRestTest differs such that the code is sourced from Java and may have different dependencies and setup requirements for the test clusters. This also allows the tests to run in parallel in different cluster instances to prevent any cross test contamination between the two types of tests. As part of this PR, all :modules no longer use the integTest task. The tests are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest. Since only :modules (and :rest-api-spec) have been converted to yamlRestTest we can now disable the integTest task if either yamlRestTest or javaRestTest have been applied. Once all projects are converted, we can delete the integTest task. related: #56841 related: #59444	2020-07-21 17:17:17 -05:00
Jake Landis	ddd882b835	Convert modules to use yamlRestTest (#59089 ) This commit moves the modules REST tests to the newly introduced yamlRestTest source set. A few tests have also been re-named to include the correct IT suffix. Without changing the names, the testing conventions task would fail, since the YAML tests are no longer present to pacify the convention. These tests have moved to the internalClusterTest source set. related: #56841	2020-07-13 11:32:42 -05:00
Armin Braun	5da804b865	Add Check for Metadata Existence in BlobStoreRepository (#59141 ) In order to ensure that we do not write a broken piece of `RepositoryData` because the physical repository generation was moved ahead more than one step by erroneous concurrent writing to a repository we must check whether or not the current assumed repository generation exists in the repository physically. Without this check we run the risk of writing on top of stale cached repository data. Relates #56911	2020-07-08 13:16:58 +02:00
Jake Landis	333a5d8cdf	Create plugin for yamlTest task (#56841 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip, for which the testing has been moved to :rest-api-spec since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). Which source set is the target for the test resources depends on whether this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred). As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests under a yamlRestTest folder (instead of test)	2020-07-06 12:13:01 -05:00
Yannick Welsch	118521d022	Account for recovery throttling when restoring snapshot (#58658 ) Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account (i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to configure throttling in a single place. The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to `40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`. This means that no behavioral change will be observed by clusters where the recovery and restore settings were not adapted. Relates https://github.com/elastic/elasticsearch/issues/57023 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-30 13:08:21 +02:00
Tanguy Leroux	34e253558d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:19:38 +02:00
Jason Tedor	95a7eed9aa	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 15:52:01 -04:00
Jake Landis	afc2383b72	Optimize which Rest resources are used by the Rest tests. (#53299 ) This should help with Gradle's incremental compile such that projects only depend upon the resources they use. related #52114	2020-03-18 09:09:29 -05:00
Marios Trivyzas	e42c4d1b0b	[Tests] Update skip version for YAML tests (#52324 ) Update skip versions upper boundary to match the release or intended release version of the feature/fix. Relates to #52310	2020-02-13 20:03:48 +01:00
Armin Braun	1e4d775bfc	Remove Unused Single Delete in BlobStoreRepository (#50024 ) * Remove Unused Single Delete in BlobStoreRepository There are no more production uses of the non-bulk delete or the delete that throws on missing so this commit removes both these methods. Only the bulk delete logic remains. Where the bulk delete was derived from single deletes, the single delete code was inlined into the bulk delete method. Where single delete was used in tests it was replaced by bulk deleting.	2019-12-12 10:12:03 +01:00
Armin Braun	459d8edcc0	Make BlobStoreRepository Aware of ClusterState (#49639 ) This is a preliminary to #49060. It does not introduce any substantial behavior change to how the blob store repository operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation (create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple. This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the repository operates in #49060	2019-11-29 10:14:53 +01:00
Rory Hunter	3a3e5f6176	Apply 2-space indent to all gradle scripts (#48849 ) Closes #48724. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-13 10:14:04 +00:00
Mark Vieira	af6af346f7	Introduce type-safe and consistent pattern for handling build globals (#48778 ) This commit introduces a consistent, and type-safe manner for handling global build parameters throughout our build logic. Primarily this replaces the existing usages of extra properties with static accessors. It also introduces an explicit API for initialization and mutation of any such parameters, as well as better error handling for uninitialized or eager access of parameter values. Closes #42042	2019-11-01 09:54:22 -07:00
Alan Woodward	8b9bed9ea6	Remove type parameter from ESIntegTestCase sugar methods (#48451 ) ESIntegTestCase has a number of sugar index methods that prepare and execute index requests. Now that index requests no longer use types, we can remove the type parameter from each of these. To prevent issues when re-compiling against the test framework, the method `index(String index, String id, Object... source)` is renamed to `indexDoc` so that e.g. `index(index, type, id, field1, field2, field3)` does not get re-interpreted as an index request with an id of `type`.	2019-10-25 10:29:32 +01:00
Mark Vieira	184cc4d8ed	Repository plugin test cacheability fixes (#46572 )	2019-09-11 08:24:32 -07:00
Jason Tedor	268881ecc9	Remove node settings from blob store repositories (#45991 ) This commit starts from the simple premise that the use of node settings in blob store repositories is a mistake. Here we see that the node settings are used to get default settings for store and restore throttle rates. Yet, since there are not any node settings registered to this effect, there can never be a default setting to fall back to there, and so we always end up falling back to the default rate. Since this was the only use of node settings in blob store repository, we move them. From this, several places fall out where we were chaining settings through only to get them to the blob store repository, so we clean these up as well. That leaves us with the changeset in this commit.	2019-08-26 16:10:25 -04:00
Armin Braun	df01766c15	Repository Cleanup Endpoint (#43900 ) * Snapshot cleanup functionality via transport/REST endpoint. * Added all the infrastructure for this with the HLRC and node client * Made use of it in tests and resolved relevant TODO * Added new `Custom` CS element that tracks the cleanup logic. Kept it similar to the delete and in progress classes and gave it some (for now) redundant way of handling multiple cleanups but only allow one * Use the exact same mechanism used by deletes to have the combination of CS entry and increment in repository state ID provide some concurrency safety (the initial approach of just an entry in the CS was not enough, we must increment the repository state ID to be safe against concurrent modifications, otherwise we run the risk of "cleaning up" blobs that just got created without noticing) * Isolated the logic to the transport action class as much as I could. It's not ideal, but we don't need to keep any state and do the same for other repository operations (like getting the detailed snapshot shard status)	2019-08-21 12:02:44 +02:00
Armin Braun	4b8fd4e76f	Remove blobExists Method from BlobContainer (#44472 ) * We only use this method in one place in production code and can replace that with a read -> remove it to simplify the interface * Keep it as an implementation detail in the Azure repository	2019-07-17 09:15:22 +02:00
Andrey Ershov	680d6edc0b	Get snapshots support for multiple repositories (#42090 ) This commit adds multiple repositories support to get snapshots request. If some repository throws an exception this method does not fail fast instead, it returns results for all repositories. This PR is opened in favour of #41799, because we decided to change the response format in a non-BwC manner. It makes sense to read a discussion of the aforementioned PR. This is the continuation of work done here #15151.	2019-06-19 16:04:13 +03:00
Armin Braun	9a4ffa3d7e	Recursive Delete on BlobContainer (#43281 ) This is a prerequisite of #42189: * Add directory delete method to blob container specific to each implementation: * Some notes on the implementations: * AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered which allows us to simply delete in the same order that we receive the blobs from the listing request. For AWS this simply required listing without the delimiter setting (so we get a deep listing) and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is, that even for very large numbers of blobs the memory requirements are now capped nicely since we go page by page when deleting. * For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block. * HDFS and FS are trivial since we have directory delete methods available for them * Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)	2019-06-18 13:28:22 +02:00
Armin Braun	2f637d42f1	Add Ability to List Child Containers to BlobContainer (#42653 ) * Add Ability to List Child Containers to BlobContainer * This is a prerequisite of #42189	2019-06-05 17:03:14 +02:00
Armin Braun	074da02f44	Dry up BlobStoreRepository#basePath Implementations (#42578 ) * This method is just a getter in every implementation => moved the field and concrete getter to the base class to simplify implementations	2019-05-27 19:29:51 +02:00
Armin Braun	70eb812f83	Remove Delete Method from BlobStore (#41619 ) * Remove Delete Method from BlobStore * The delete method on the blob store was used almost nowhere and just duplicates the delete method on the blob containers * The fact that it provided for some recursive delete logic (that did not behave the same way on all implementations) was not used and not properly tested either	2019-05-09 17:12:34 +02:00
Armin Braun	8a07522ed5	Async Snapshot Repository Deletes (#40144 ) Motivated by slow snapshot deletes reported in e.g. #39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete. * Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread pool * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable. * See https://github.com/elastic/elasticsearch/pull/39656#issuecomment-470492106 * Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore). * By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in #39657	2019-04-05 06:56:46 +02:00
Alpar Torok	4434491c1e	convert modules to use testclusters (#40804 ) * convert modules to use testclusters * Eliminate PluginPropertiesTask and move logic in plugin where it belongs	2019-04-04 11:41:38 +03:00
Henning Andersen	ac7ec99bea	Unify blob store compress setting (#39346 ) Blob store compression was all implemented generally, except reading the setting for it. Moved the setting to BlobStoreRepository to unify this. Also removed deprecated env setting 'repositories.fs.compress'. This is a follow up on #39073	2019-02-28 12:34:13 +01:00
Henning Andersen	f5fc163228	Blob store compression fix (#39073 ) Blob store compression was not enabled for some of the files in snapshots due to constructor accessing sub-class fields. Fixed to instead accept compress field as constructor param. Also fixed chunk size validation to work. Deprecated repositories.fs.compress setting as well to be able to unify in a future commit.	2019-02-20 08:27:07 +01:00
Colin Goodheart-Smithe	21e392e95e	Removes typed calls from YAML REST tests (#37611 ) This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since these are specifically testing typed API behaviour. The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.	2019-01-30 16:32:58 +00:00

79 Commits