Commit Graph

2419 Commits

Author SHA1 Message Date
Armin Braun 6059f8e8da
Ensure SAS Tokens in Test Use Minimal Permissions (#46112)
* Ensure SAS Tokens in Test Use Minimal Permissions

There were some issues with the Azure implementation requiring
permissions to list all containers due to a container-exists
check. This was caught in CI this time, but going forward we
should ensure that CI is executed using a token that does not
allow listing containers.

Relates #43288
2019-09-11 19:51:48 +02:00
Mark Vieira 184cc4d8ed
Repository plugin test cacheability fixes (#46572) 2019-09-11 08:24:32 -07:00
Luca Cavanna c512214cdc
Update http-core and http-client dependencies (#46549)
* Update http-core and http-client dependencies

Relates to #45808
Closes #45577

* update shas dependencies

* update test
2019-09-11 13:42:32 +02:00
Tanguy Leroux 62a516de88
Mutualize code in cloud-based repository integration tests (#46483)
This commit factors out some common code between the cloud-based 
repository integration tests that were recently improved.

Relates #46376
2019-09-09 15:51:24 +02:00
Tanguy Leroux 3cef830ff6
Inject random server errors in AzureBlobStoreRepositoryTests (#46371)
This commit modifies the HTTP server used in 
AzureBlobStoreRepositoryTests so that it randomly returns 
server errors for any type of request executed by the Azure client.
2019-09-09 09:59:01 +02:00
Tanguy Leroux 3d4c0c6a87
Inject random server errors in GoogleCloudStorageBlobStoreRepositoryTests (#46376)
This commit modifies the HTTP server used in 
GoogleCloudStorageBlobStoreRepositoryTests so that it randomly 
returns server errors. The test does not inject server errors for the 
following types of request: batch request, resumable upload request.
2019-09-09 09:57:43 +02:00
David Turner 23155532d8
Add support for OneZoneInfrequentAccess storage (#46436)
The `repository-s3` plugin has supported a storage class of `onezone_ia` since
the SDK upgrade in #30723, but we do not test or document this fact. This
commit adds this storage class to the docs and adds a test to ensure that the
documented storage classes are all accepted by S3 too.

Fixes #30474
2019-09-09 07:54:02 +01:00
Tanguy Leroux 0321073ae6
Fix usage of randomIntBetween() in testWriteBlobWithRetries (#46380)
This commit fixes the usage of randomIntBetween() in the test
testWriteBlobWithRetries, when the test generates a random array
of a single byte.
2019-09-06 09:10:06 +02:00
Tanguy Leroux 4726e1e6b3
Replace mocked client in GCSBlobStoreRepositoryTests by HTTP server (#46255)
This commit removes the usage of MockGoogleCloudStoragePlugin in
GoogleCloudStorageBlobStoreRepositoryTests and replaces it with an
HttpServer that emulates the Storage service. This allows the repository
tests to use the real Google client under the hood and will allow
us to test the behavior of the snapshot/restore feature for GCS repositories
by simulating random server-side internal errors.

The HTTP server used to emulate the Storage service is intentionally simple 
and minimal to keep things understandable and maintainable. Testing full 
client options on the server side (like authentication, chunked encoding 
etc) remains the responsibility of the GoogleCloudStorageFixture.
2019-09-05 09:25:23 +02:00
Tanguy Leroux dfc74c096b
Add repository integration tests for Azure (#46263)
Similarly to what had been done for S3 (#46081) and GCS (#46255) 
this commit adds repository integration tests for Azure, based on an 
internal HTTP server instead of mocks.
2019-09-05 09:22:42 +02:00
Tanguy Leroux 69abc64413
Disable request throttling in S3BlobStoreRepositoryTests (#46226)
When high values are randomly picked - for example the number
of indices to snapshot or the number of snapshots to create - the tests
in S3BlobStoreRepositoryTests can generate a large number of requests to
the internal S3 server.

In order to test the retry logic of the S3 client, the internal server is
designed to randomly return server errors. When many requests are made,
it is possible that the S3 client exhausts its maximum number of
successive retries for a given request. Such a request then fails once
the max retries count is reached and makes the test fail too.

Closes #46217
Closes #46218
Closes #46219
2019-09-02 16:41:18 +02:00
Henning Andersen dd487a0ab9
Mute 2 tests in S3BlobStoreRepositoryTests (#46221)
Muted testSnapshotAndRestore and testMultipleSnapshotAndRollback

Relates #46218 and #46219
2019-09-02 10:37:32 +02:00
Tanguy Leroux bf838f6608
Inject random errors in S3BlobStoreRepositoryTests (#46125)
This commit modifies the HTTP server used in S3BlobStoreRepositoryTests
so that it randomly returns server errors for any type of request executed by
the SDK client. It is now possible to verify that the repository tests
complete successfully even if one or more errors were returned by the S3
service in response to a blob upload, a blob deletion or an object listing
request, etc.

Because injecting errors forces the SDK client to retry requests, the test
limits the number of errors sent in response to each request to 3 retries.
2019-08-30 10:00:32 +02:00
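The capping described above (never inject more errors for one request than the client will retry) can be sketched as follows; this is a hypothetical Python illustration, not the actual Java test server:

```python
import random

class ErrorInjector:
    """Decide whether to answer a request with an injected server error.

    Sketch of the idea described above: each distinct request is failed at
    most `max_errors` times, so a client that retries at least
    `max_errors + 1` times always ends up succeeding.
    """

    def __init__(self, max_errors=3, error_probability=0.5, seed=None):
        self.max_errors = max_errors
        self.error_probability = error_probability
        self.rng = random.Random(seed)
        self.errors_sent = {}  # request key -> number of errors already sent

    def handle(self, request_key):
        """Return an HTTP status code: 500 (injected error) or 200 (success)."""
        sent = self.errors_sent.get(request_key, 0)
        if sent < self.max_errors and self.rng.random() < self.error_probability:
            self.errors_sent[request_key] = sent + 1
            return 500
        return 200
```

Even with an error probability of 1.0, a client that retries more than `max_errors` times always reaches a 200 response.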
Jason Tedor 2bcfe1a2d8
Remove insecure settings (#46147)
This commit removes the oxymoron of insecure secure settings from the
code base. In particular, we remove the ability to set the access_key
and secret_key for S3 repositories inside the repository definition (in
the cluster state). Instead, these settings now must be in the
keystore. Thus, it also removes some leniency where these settings could
be placed in the elasticsearch.yml, would not be rejected there, but
would not be consumed for any purpose.
2019-08-29 21:25:20 -04:00
Tanguy Leroux 7dbf2df949
Replace MockAmazonS3 usage in S3BlobStoreRepositoryTests by a HTTP server (#46081)
This commit removes the usage of MockAmazonS3 in S3BlobStoreRepositoryTests
and replaces it with an HttpServer that emulates the S3 service. This allows the
repository tests to use the real Amazon S3 client under the hood and will
allow us to test the behavior of the snapshot/restore feature for S3 repositories by
simulating random server-side internal errors.

The HTTP server used to emulate the S3 service is intentionally simple and minimal 
to keep things understandable and maintainable. Testing full client options on the 
server side (like authentication, chunked encoding etc) remains the responsibility 
of the AmazonS3Fixture.
2019-08-29 13:15:48 +02:00
Armin Braun f7fedc3090
Upgrade to Azure SDK 8.4.0 (#46094)
* Upgrading to 8.4.0 here, which brings bulk deletes to be used in a follow-up PR
2019-08-29 10:22:16 +02:00
Tanguy Leroux 1dcc4ed3eb
Few clean ups in ESBlobStoreRepositoryIntegTestCase (#46068) 2019-08-28 16:13:31 +02:00
Jason Tedor 268881ecc9
Remove node settings from blob store repositories (#45991)
This commit starts from the simple premise that the use of node settings
in blob store repositories is a mistake. Here we see that the node
settings are used to get default settings for store and restore throttle
rates. Yet, since there are not any node settings registered to this
effect, there can never be a default setting to fall back to there, and
so we always end up falling back to the default rate. Since this was the
only use of node settings in blob store repository, we move them. From
this, several places fall out where we were chaining settings through
only to get them to the blob store repository, so we clean these up as
well. That leaves us with the changeset in this commit.
2019-08-26 16:10:25 -04:00
Tanguy Leroux a526d9c54a
Refactor RepositoryCredentialsTests (#45919)
This commit refactors the S3 credentials tests in
RepositoryCredentialsTests so that they now use a single
node (ESSingleNodeTestCase) to test how secure/insecure
credentials override each other. Using a single node
makes it much easier to understand what each test is actually
testing and IMO better reflects how things are initialized.

It also allows folding into this class the test
testInsecureRepositoryCredentials, which was wrongly located
in S3BlobStoreRepositoryTests. By moving this test away, the
S3BlobStoreRepositoryTests class does not need the
allow_insecure_settings option anymore and thus can be
executed as part of the usual gradle test task.
2019-08-26 14:53:31 +02:00
Tanguy Leroux 9dc6f0d7d9
Allow partial request body reads in AWS S3 retries tests (#45847)
This commit changes the tests added in #45383 so that the fixture that 
emulates the S3 service now sometimes consumes all the request body 
before sending an error, sometimes consumes only a part of the request 
body and sometimes consumes nothing. The idea here is to beef up
the tests that write blobs, because the client's retry logic relies on
marking and resetting the blob's input stream.

This pull request also changes testWriteBlobWithRetries() so that it
(rarely) tests with a large blob (up to 1MB), which is more than the client's
default read limit on input streams (131KB).

Finally, it optimizes the ZeroInputStream so that it is a bit more efficient
(it now works using an internal buffer and System.arraycopy() primitives).
2019-08-23 13:38:52 +02:00
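The buffer-based optimization mentioned above can be sketched in Python (the real ZeroInputStream is Java; slice assignment here stands in for System.arraycopy()):

```python
import io

class ZeroInputStream(io.RawIOBase):
    """Stream of `length` zero bytes, served from one preallocated buffer.

    Instead of producing output byte by byte, each read copies a slice of a
    fixed all-zero buffer into the caller's buffer.
    """

    BUFFER = bytes(8192)  # shared all-zero buffer

    def __init__(self, length):
        self.length = length
        self.position = 0

    def readable(self):
        return True

    def readinto(self, b):
        remaining = self.length - self.position
        if remaining <= 0:
            return 0  # end of stream
        count = min(len(b), remaining, len(self.BUFFER))
        b[:count] = self.BUFFER[:count]  # bulk copy, no per-byte loop
        self.position += count
        return count
```

A stream like this lets tests feed arbitrarily large "blobs" to an upload path without allocating the whole payload up front.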
Jason Tedor 7cb26efbb8
Remove binary file accidentally committed
🤦‍♀️
2019-08-22 17:34:38 -04:00
Jason Tedor d05101b9e5
Add node.processors setting in favor of processors (#45855)
This commit namespaces the existing processors setting under the "node"
namespace. In doing so, we deprecate the existing processors setting in
favor of node.processors.
2019-08-22 17:19:40 -04:00
Tanguy Leroux 8010dd070b
Add tests to check that requests are retried when writing/reading blobs on S3 (#45383)
This commit adds tests to verify the behavior of the S3BlobContainer and
its underlying AWS SDK client when the remote S3 service responds with
errors or does not respond at all. The expected behavior is that requests are
retried multiple times before the client gives up and the S3BlobContainer
bubbles up an exception.

The test verifies the behavior of BlobContainer.writeBlob() and
BlobContainer.readBlob(). In the case of S3, writing a blob can be executed
as a single upload or using multipart requests; the test checks both scenarios
by writing a small then a large blob.
2019-08-22 11:27:54 +02:00
Armin Braun df01766c15
Repository Cleanup Endpoint (#43900)
* Snapshot cleanup functionality via transport/REST endpoint.
* Added all the infrastructure for this with the HLRC and node client
* Made use of it in tests and resolved relevant TODO
* Added new `Custom` CS element that tracks the cleanup logic.
Kept it similar to the delete and in-progress classes and gave it
a (for now) redundant way of handling multiple cleanups while only allowing one
* Use the exact same mechanism used by deletes to have the combination
of CS entry and increment in repository state ID provide some
concurrency safety (the initial approach of just an entry in the CS
was not enough, we must increment the repository state ID to be safe
against concurrent modifications, otherwise we run the risk of "cleaning up"
blobs that just got created without noticing)
* Isolated the logic to the transport action class as much as I could.
It's not ideal, but we don't need to keep any state and do the same
for other repository operations
(like getting the detailed snapshot shard status)
2019-08-21 12:02:44 +02:00
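The repository-state-id safety mechanism described in the bullets above can be sketched as a toy model (hypothetical names; the real implementation lives in the Java cluster-state machinery):

```python
class Repository:
    """Toy model of the repository state id check.

    A cleanup captures the state id when it starts; every modification
    (including a finished cleanup) bumps the id. If the id changed while the
    cleanup ran, the cleanup aborts instead of "cleaning up" blobs that were
    created concurrently.
    """

    def __init__(self):
        self.state_id = 0
        self.blobs = set()

    def add_blob(self, name):
        self.blobs.add(name)
        self.state_id += 1  # every modification bumps the state id

    def cleanup(self, expected_state_id, stale_blobs):
        if self.state_id != expected_state_id:
            return False  # concurrent modification detected: abort
        self.blobs -= stale_blobs
        self.state_id += 1
        return True
```

A cleanup that raced with a concurrent write sees a changed state id and aborts, so a freshly created blob is never deleted by mistake.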
Jim Ferenczi d66a307599
Add support for inlined user dictionary in the Kuromoji plugin (#45489)
This change adds a new option called user_dictionary_rules to
Kuromoji's tokenizer. It can be used to set additional tokenization rules
to the Japanese tokenizer directly in the settings (instead of using a file).
This commit also adds a check that no rules are duplicated since this is not allowed
in the UserDictionary.

Closes #25343
2019-08-20 15:06:01 +02:00
Igor Motov 9f5611f820
Ingest Attachment: Upgrade tika to v1.22 (#45575)
Upgrades:
Apache Tika: 1.19.1 -> 1.22.
pdfbox : 2.0.12 -> 2.0.16
poi : 4.0.0 -> 4.0.1
2019-08-19 18:16:32 -04:00
Karel Minarik 9166311622 Update the schema for the REST API specification (#42346)
* Update the REST API specification

This patch updates the REST API specification in JSON files to better encode deprecated entities,
to improve specification of URL paths, and to open up the schema for future extensions.

Notably, it changes the `paths` from a list of strings to a list of objects, where each
particular object encodes all the information for this particular path: the `parts` and the `methods`.

Among the benefits of this approach is, e.g., encoding the difference between using the `PUT` and `POST`
methods in the Index API, to either use a specific document ID, or let Elasticsearch generate one.

Also `documentation` becomes an object that supports a `url` and also a `description`, which is a
new field.

* Adapt YAML runner to new REST API specification format

The logic for choosing the path to use when running tests has been
simplified, as a consequence of the path parts being listed under each
path in the spec. The special case for create and index has been removed.

Also the parsing code has been hardened so that errors are thrown earlier
when the structure of the spec differs from what is expected, and the
error messages should be more helpful.
2019-08-15 17:15:30 +02:00
Yogesh Gaikwad b44c0281e6
Refactor cluster privileges and cluster permission (#45265)
The current implementations make it difficult to
add new privileges (example: a cluster privilege which is
more than cluster action-based and not exposed to the security
administrator). At a high level, we would like a cluster privilege
to be either:
- a named cluster privilege
  This corresponds to the `cluster` field from the role descriptor
- or a configurable cluster privilege
  This corresponds to the `global` field from the role descriptor and
allows a security administrator to configure them.

Some of the responsibilities, like the merging of action-based cluster privileges,
are now pushed down to the cluster permission level. How to implement the predicate
(using an Automaton) is now enforced by the cluster permission.

`ClusterPermission` helps in enforcing the cluster level access either by
performing checks against cluster action and optionally against a request.
It is a collection of one or more permission checks where if any of the checks
allow access then the permission allows access to a cluster action.

Implementations of cluster privilege must be able to provide information
regarding the predicates to the cluster permission so that they can be enforced.
This is done by making implementations of cluster privilege aware of the
cluster permission builder and providing a way to specify how the permission is
to be built for a given privilege.

This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`.
`ConfigurableClusterPrivilege` is a renderable cluster privilege exposed
as a `global` field in role descriptor.

Other than this there is a requirement where we would want to know if a cluster
permission is implied by another cluster-permission (`has-privileges`).
This is helpful in addressing queries related to privileges for a user.
This is not simply a matter of checking cluster permissions, since we do not
have access to runtime information (like the request object).
This refactoring does not try to address those scenarios.

Relates #44048
2019-08-12 13:09:34 +10:00
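The "collection of checks where any single check may allow access" model described above can be sketched like this; plain predicates stand in for the Automaton-based implementation of the real Java code, and the privilege names are made up for illustration:

```python
class ClusterPermission:
    """Sketch of the model above: a collection of permission checks, where
    the permission allows a cluster action if any one check allows it.
    """

    def __init__(self):
        self.checks = []  # list of (action_predicate, request_predicate)

    def add(self, action_predicate, request_predicate=lambda request: True):
        self.checks.append((action_predicate, request_predicate))
        return self  # builder style: each privilege contributes its checks

    def check(self, action, request=None):
        return any(allows_action(action) and allows_request(request)
                   for allows_action, allows_request in self.checks)

# A named privilege contributes an action-based check; a configurable one
# can additionally inspect the request (both examples are hypothetical).
permission = (ClusterPermission()
              .add(lambda a: a.startswith("cluster:monitor/"))
              .add(lambda a: a == "cluster:admin/snapshot/delete",
                   lambda r: r is not None and r.get("repository") == "backups"))
```

The builder-style `add` mirrors how each cluster privilege tells the permission builder which checks to contribute.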
Armin Braun 18f5690ce7
Remove Settings from BaseRestRequest Constructor (#45418)
* Resolving the todo, cleaning up the unused `settings` parameter
* Cleaning up some other minor dead code in affected classes
2019-08-11 21:47:21 +02:00
Armin Braun fb508dba4a
Upgrade to Netty 4.1.38 (#45132)
* A number of fixes to buffer handling landed in 4.1.37 and 4.1.38, so we should stay up to date
2019-08-09 01:58:53 +02:00
Tim Brooks e0f9d61bec
Disable netty direct buffer pooling by default (#44837)
Elasticsearch does not grant Netty reflection access to get Unsafe. The
only mechanism that currently exists to free direct buffers in a timely
manner is to use Unsafe. This leads to the occasional scenario, under
heavy network load, that direct byte buffers can slowly build up without
being freed.

This commit disables Netty direct buffer pooling and moves to a strategy
of using a single thread-local direct buffer for interfacing with sockets.
This will reduce the memory usage from networking. Elasticsearch
currently derives very little value from direct buffer usage (TLS,
compression, Lucene, Elasticsearch handling, etc all use heap bytes). So
this seems like the correct trade-off until that changes.
2019-08-08 16:54:01 -04:00
Armin Braun 5c5f782b89
Add Assertion to Ensure Retries in S3BlobContainer (#45224)
* We need a `markSupported` input stream to retry uploads
* Relates #45153
2019-08-06 11:47:13 +02:00
Yannick Welsch 245cb348d3
Add per-socket keepalive options (#44055)
Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see
https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings.
By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like
to explore whether we can enable them by default, in particular to force keepalive configurations
that are better tuned for running ES.
2019-08-05 16:09:11 +02:00
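The per-socket keepalive configuration described above can be sketched in Python; the JDK 11 equivalents are the ExtendedSocketOptions TCP_KEEPIDLE / TCP_KEEPINTERVAL / TCP_KEEPCOUNT, and the constants below are Linux names that may be absent on other platforms (hence the guards):

```python
import socket

def configure_keepalive(sock, idle=300, interval=60, count=3):
    """Enable TCP keepalive and, where the platform exposes them, tune the
    per-socket probe timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):   # seconds idle before first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):    # failed probes before drop
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

Setting these per socket, rather than relying on system-wide kernel defaults, is what the JDK 11 support makes possible.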
Armin Braun 570e406e91
Stop Passing Around REST Request in Multiple Spots (#44949)
* Stop Passing Around REST Request in Multiple Spots

* Motivated by #44564
  * We are currently passing the REST request object around to a large number of places. This works, since we simply copy the full request content before we handle the request itself, but it is needlessly hard on GC and heap.
  * This PR removes a number of spots where the request is passed around needlessly. There are many more spots to optimize in follow-ups to this, but this one would already enable bypassing the request copying for some error paths in a follow up.
2019-08-01 20:19:25 +02:00
Mark Vieira 635b55a861
Add missing dependency on plugin bundle task (#45103) 2019-08-01 10:47:06 -07:00
Andrey Ershov 63361622e3
Snapshot tool: GCS orphaned files cleanup (#45076)
Extending snapshot-tool with GCS cleanup command.
Refactoring main/test to support both S3 and GCS.
Snapshot-tool was first introduced here #44551.
2019-08-01 15:38:26 +02:00
Andrey Ershov d3230c9239
Move thirdPartyTest from :repository-gcs to :qa:google-cloud-storage (#45056)
Currently, thirdPartyTest task for GCS lives in repository-gcs project.
However, it depends a lot on :qa:google-cloud-storage project and
should live there instead.
2019-08-01 12:07:24 +02:00
Tim Brooks f39e8e5dcf
Move nio channel initialization to event loop (#43780)
Currently in the transport-nio work we connect and bind channels on
a thread before the channel is registered with a selector. Additionally,
it is at this point that we set all the socket options. This commit
moves these operations onto the event-loop after the channel has been
registered with a selector. It attempts to set the socket options for a
non-server channel at registration time. If that fails, it will attempt
to set the options after the channel is connected. This should fix
#41071.
2019-07-30 12:41:51 -04:00
Armin Braun 236ceed3e9
S3 3rd Party Test Goal (#44799)
* Create S3 Third Party Test Task that Covers the S3 CLI Tool
* Adjust snapshot cli test tool tests to work with real S3
  * Build adjustment
  * Clean up repo path before testing
* Dedup the logic for asserting path contents by using the correct utility method here that somehow became unused
2019-07-30 14:27:07 +02:00
Armin Braun 7ee8d150e9
Release Pooled Buffers Earlier for HTTP Requests (#44952)
* We should release the buffers right after copying, and not only after we have done all the request handling on the copy
* Relates #44564
2019-07-30 07:41:31 +02:00
Andrey Ershov 80cb5df977
Support fixture in repository-gcs:thirdPartyTest and fix GCS fixture (#44885)
It turns out that today :plugins:repository-gcs:thirdPartyTest can only
run against real GCS.
Moreover, thirdPartyTest is not a part of check, so these tests are not
run by the intake build.
This commit addresses the issue and makes repository-gcs:thirdPartyTest
work with both fixture and real GCS.
To do that, besides adjusting the build and the test itself, I had to make
changes to the fixture, because previously it was ignoring
BlobListOption.currentDirectory() in the list call.
2019-07-29 14:12:39 +02:00
Ignacio Vera b8ef6127f2
Upgrade to Lucene 8.2.0 release (#44859) 2019-07-26 05:57:02 +02:00
Ioannis Kakavas 3b7b025690
Allow parsing the value of java.version sysprop (#44017)
We often start testing with early access versions of new Java
versions and these have caused minor issues in our tests
(e.g. #43141) because the version string that the JVM reports
cannot be parsed, as it ends with the string -ea.

This commit changes how we parse and compare Java versions to
allow correct parsing and comparison of the output of the java.version
system property, which might include an additional alphanumeric
part after the version numbers
(see [JEP 223](https://openjdk.java.net/jeps/223)). In short, it
handles a version number part, like before, but additionally a
PRE part that matches ([a-zA-Z0-9]+).

It also changes a number of tests that would attempt to parse
java.specification.version in order to get the full version
of Java. java.specification.version only contains the major
version and is thus inappropriate when trying to compare against
a version that might contain a minor, patch or an early access
part. We now parse java.version, which can be consistently
parsed.

Resolves #43141
2019-07-22 20:13:32 +03:00
Jason Tedor 56d47d09a0
Remove debugging logging statements from Azure tests
This commit removes some unneeded debugging logging statements from the
Azure storage tests.

Relates #44672
2019-07-22 16:54:16 +09:00
Jason Tedor 35d4a9d9a4
Use debug logging instead for Azure tests (#44672)
These Azure tests have hard println statements which means we always see
these messages during configuration. Yet, they are unnecessary most of
the time. This commit changes them to use debug logging.
2019-07-22 00:44:46 -07:00
Jinhu Wu 6d70276af1 Add disable_chunked_encoding Setting to S3 Repo (#44052)
* Add disable_chunked_encoding setting to S3 repo plugin to support S3 implementations that don't support chunked encoding
2019-07-18 14:39:16 +02:00
maarab7 aff21c431e Fix parameter value for calling data.advanceExact (#44205)
While the code works perfectly well for a single segment, it returns the wrong values for multiple segments. E.g. if we have 500 docs in one segment and we want to get doc id = 280, then data.advanceExact(topDocs.scoreDocs[i].doc) works fine. If we have two segments, say, with the first segment having docs 1-200 and the second segment having docs 201-500, then 280 is fetched from the second segment but is actually 480. Subtracting the docBase (280-200) takes us to the correct document, which is 80 in the second segment and actually 280.
2019-07-18 10:51:27 +02:00
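The docBase arithmetic explained above can be sketched as a small helper (a hypothetical, 0-based illustration; in Lucene the per-segment base comes from LeafReaderContext.docBase):

```python
def to_segment_doc(global_doc, doc_bases):
    """Map a global doc id to (segment index, segment-local doc id).

    `doc_bases` lists the starting global doc id of each segment, e.g.
    [0, 200] for a first segment of 200 docs followed by a second one.
    """
    # The owning segment is the last one whose base is <= the global doc id.
    segment = max(i for i, base in enumerate(doc_bases) if base <= global_doc)
    return segment, global_doc - doc_bases[segment]
```

For the example in the commit message, global doc 280 with bases [0, 200] lands in the second segment as local doc 80.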
Yogesh Gaikwad 83912ed95a
skip repository-hdfs integTest in case of fips jvm (#44319)
The repository-hdfs runners need to be disabled in FIPS mode.

Testing done for all the tasks, dynamically created and static (integTest, integTestHa, integSecureTest, integSecureHaTest)
2019-07-18 15:07:33 +10:00
Jason Tedor 7de0919b96
Introduce test issue logging (#44477)
Today we have an annotation for controlling logging levels in
tests. This annotation serves two purposes: one is to control the
logging level used in tests, when such control is needed to impact and
assert the behavior of loggers in tests. The other use is when a test is
failing and additional logging is needed. This commit separates these
two concerns into separate annotations.

The primary motivation for this is that we have a history of leaving
behind the annotation for the purpose of investigating test failures
long after the test failure is resolved. The accumulation of these stale
logging annotations has led to excessive disk consumption. Having
recently cleaned this up, we would like to avoid falling into this state
again. To do this, we are adding a link to the test failure under
investigation to the annotation when used for the purpose of
investigating test failures. We will add tooling to inspect these
annotations, in the same way that we have tooling on awaits fix
annotations. This will enable us to report on the use of these
annotations, and report when stale uses of the annotation exist.
2019-07-18 05:26:01 +09:00
Armin Braun 5cda8709b6
Remove Minio Host Hack in S3 Repository Build (#44491)
* Resolving the todo to clean this hackiness up
2019-07-17 16:36:23 +02:00