Commit Graph

2661 Commits

Author SHA1 Message Date
Armin Braun 6b7f4c6bec
Formalize and Streamline Buffer Sizes used by Repositories (#59771)
Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient.
By the same token, the use of stream copying with the default 8k buffer size  for blob writes was inefficient as well.

We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`.

This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs.

This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HFDS repositories.
2020-07-22 16:46:37 +02:00
Nik Everett 98698f569d
Drop some params from IndexFieldData.Builder (#59934)
We never used the `IndexSettings` parameter and we only used the
`MappedFieldType` parameter to get the name of the field which we
already know everywhere where we build the `IFD.Builder`. This allows us
to drop a fair bit of ceremony from a couple of tests.
2020-07-21 08:29:58 -04:00
Ignacio Vera 15bc79da19
upgrade to lucene-8.6.0 release (#59596) 2020-07-15 11:58:28 +02:00
Armin Braun 60e0b4641c
Remove Artificially Low Chunk Size Limits from GCS + Azure Blob Stores (#59279)
Removing these limits as they cause unnecessarily many object in the blob stores.
We do not have to worry about BwC of this change since we do not support any 3rd party
implementations of Azure or GCS.
Also, since there is no valid reason to set a different than the default maximum chunk size at this
point, removing the documentation (which was incorrect in the case of Azure to begin with) for the setting
from the docs.

Closes #56018
2020-07-14 21:34:53 +02:00
Armin Braun d333dacb4a
Enable Fully Concurrent Snapshot Operations (#56911)
Enables fully concurrent snapshot operations:
* Snapshot create- and delete operations can be started in any order
* Delete operations wait for snapshot finalization to finish, are batched as much as possible to improve efficiency and once enqueued in the cluster state prevent new snapshots from starting on data nodes until executed
   * We could be even more concurrent here in a follow-up by interleaving deletes and snapshots on a per-shard level. I decided not to do this for now since it seemed not worth the added complexity yet. Due to batching+deduplicating of deletes the pain of having a delete stuck behind a long -running snapshot seemed manageable (dropped client connections + resulting retries don't cause issues due to deduplication of delete jobs, batching of deletes allows enqueuing more and more deletes even if a snapshot blocks for a long time that will all be executed in essentially constant time (due to bulk snapshot deletion, deleting multiple snapshots is mostly about as fast as deleting a single one))
* Snapshot creation is completely concurrent across shards, but per shard snapshots are linearized for each repository as are snapshot finalizations

See updated JavaDoc and added test cases for more details and illustration on the functionality.

Some notes:

The queuing of snapshot finalizations and deletes and the related locking/synchronization is a little awkward in this version but can be much simplified with some refactoring.  The problem is that snapshot finalizations resolve their listeners on the `SNAPSHOT` pool while deletes resolve the listener on the master update thread. With some refactoring both of these could be moved to the master update thread, effectively removing the need for any synchronization around the `SnapshotService` state. I didn't do this refactoring here because it's a fairly large change and not necessary for the functionality but plan to do so in a follow-up.

This change allows for completely removing any trickery around synchronizing deletes and snapshots from SLM and 100% does away with SLM errors from collisions between deletes and snapshots.

Snapshotting a single index in parallel to a long running full backup will execute without having to wait for the long running backup as required by the ILM/SLM use case of moving indices to "snapshot tier". Finalizations are linearized but ordered according to which snapshot saw all of its shards complete first
2020-07-10 15:19:08 +02:00
Alan Woodward 62f51eb9ae
MappedFieldType no longer requires equals/hashCode/clone (#59212)
With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer
have any cause to compare MappedFieldType instances. This means that we can remove all equals
and hashCode implementations, and in addition we no longer need the clone implementations which
were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes,
which will be particularly useful for the runtime fields project.
2020-07-09 21:01:29 +01:00
Armin Braun 5da804b865
Add Check for Metadata Existence in BlobStoreRepository (#59141)
In order to ensure that we do not write a broken piece of `RepositoryData`
because the phyiscal repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository we must check whether or not
the current assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.

Relates #56911
2020-07-08 13:16:58 +02:00
Rene Groeschke ef6eb3af3c
Fix dependency related deprecations (#58892) 2020-07-07 11:29:26 +02:00
Ignacio Vera 155c9d15ea
upgrade to lucene-8.6.0-snapshot-6a715e2ecc3 (#59091) 2020-07-07 10:50:53 +02:00
Jake Landis 333a5d8cdf
Create plugin for yamlTest task (#56841)
This commit creates a new Gradle plugin to provide a separate task name
and source set for running YAML based REST tests. The only project
converted to use the new plugin in this PR is distribution/archives/integ-test-zip.
For which the testing has been moved to :rest-api-spec since it makes the most
sense and it avoids a small but awkward change to the distribution plugin.

The remaining cases in modules, plugins, and x-pack will be handled in followups.

This plugin is distinctly different from the plugin introduced in #55896 since
the YAML REST tests are intended to be black box tests over HTTP. As such they
should not (by default) have access to the classpath for that which they are testing.

The YAML based REST tests will be moved to separate source sets (yamlRestTest).
The which source is the target for the test resources is dependent on if this
new plugin is applied. If it is not applied, it will default to the test source
set.

Further, this introduces a breaking change for plugin developers that
use the YAML testing framework. They will now need to either use the new source set
and matching task, or configure the rest resources to use the old "test" source set that
matches the old integTest task. (The former should be preferred).

As part of this change (which is also breaking for plugin developers) the
rest resources plugin has been removed from the build plugin and now requires
either explicit application or application via the new YAML REST test plugin.

Plugin developers should be able to fix the breaking changes to the YAML tests
by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests
under a yamlRestTest folder (instead of test)
2020-07-06 12:13:01 -05:00
Tim Brooks 3190c3cf93
Use `getPortRange` in http server tests (#58794)
Currently we are leaving the settings to default port range in the nio
and netty4 http server test. This has recently led to tests failing due
to what appears to be a port conflict with other processes. This commit
modifies these tests to use the test case helper method to generate port
ranges.

Fixes #58433 and #58296.
2020-07-02 13:08:04 -06:00
Armin Braun 99be035f2c
Simplify Repository.finalizeSnapshot Signature (#58834)
Many of the parameters we pass into this method were only used to
build the `SnapshotInfo` instance to write.
This change simplifies the signature. Also, it seems less error prone to build
`SnapshotInfo` in `SnapshotsService` isntead of relying on the fact that each repository
implementation will build the correct `SnapshotInfo`.
2020-07-02 15:38:53 +02:00
Alan Woodward 3944066e99
Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58639)
Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on
the generic MappedFieldType.
2020-07-01 13:16:02 +01:00
Yannick Welsch 118521d022
Account for recovery throttling when restoring snapshot (#58658)
Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account
(i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository
setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a
per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to
configure throttling in a single place.

The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to
`40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change
will be observed by clusters where the recovery and restore settings were not adapted.

Relates https://github.com/elastic/elasticsearch/issues/57023

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-06-30 13:08:21 +02:00
Rene Groeschke 9526c7a4b3
Replace compile configuration usage with api (#58451)
- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
  - required as java library will by default not have build jar file
  - jar file is now explicit input of the task and gradle will ensure its properly build
2020-06-30 09:37:09 +02:00
Przemyslaw Gomulka 9bef31ccd3
Do not create two loggers for DeprecationLogger (#58435)
DeprecationLogger's constructor should not create two loggers. It was
taking parent logger instance, changing its name with a .deprecation
prefix and creating a new logger.
Most of the time parent logger was not needed. It was causing Log4j to
unnecessarily cache the unused parent logger instance.
2020-06-29 13:38:21 +02:00
Tim Brooks 089dcaf0b5
Add error logging when http test fails (#58462)
Netty4HttpServerTransportTests has started to fail intermittently. It
seems like unexpected successful responses are being received when the
test is simulating errors. This commit adds logging to the test to
provide additional information when there is an unexpected success. It
also adds the logging to the nio http test.
2020-06-24 09:40:42 -06:00
Alan Woodward 57316e26af
Add text search information to MappedFieldType (#58230)
Now that MappedFieldType no longer extends lucene's FieldType, we need to have a
way of getting the index information about a field necessary for building text queries,
building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo
abstraction that holds this information, and a getTextSearchInfo() method to
MappedFieldType to make it available. Field types that do not support text search can
just return null here.

This allows us to remove the MapperService.getLuceneFieldType() shim method.
2020-06-23 13:37:49 +01:00
Alan Woodward 708f6bf879
Add serialization test for FieldMappers when include_defaults=true (#58235)
Fixes a bug in TextFieldMapper serialization when index is false, and adds a
base-class test to ensure that all field mappers are tested against all variations
with defaults both included and excluded.

Fixes #58188
2020-06-18 14:34:06 +01:00
Alan Woodward 09ff747fe7
Remove Settings parameter from FieldMapper base class (#58237)
This is currently used to set the indexVersionCreated parameter on FieldMapper.
However, this parameter is only actually used by two implementations, and clutters
the API considerably. We should just remove it, and use it directly in the
implementations that require it.
2020-06-18 12:39:48 +01:00
Rene Groeschke 5f9d1f1d7c
Unify dependency licenses task configuration (#58116)
- Remove duplicate dependency configuration
- Use task avoidance api accross the build
- Remove redundant licensesCheck config
2020-06-17 18:27:16 +02:00
Tal Levy 69a6a18d8d
Add optional description parameter to ingest processors. (#57906)
This commit adds an optional field, `description`, to all ingest processors
so that users can explain the purpose of the specific processor instance.

Closes #56000.
2020-06-15 14:08:29 -07:00
Alan Woodward 3b696828ad
MappedFieldType should not extend FieldType (#57666)
MappedFieldType is a combination of two concerns:

* an extension of lucene's FieldType, defining how a field should be indexed
* a set of query factory methods, defining how a field should be searched

We want to break these two concerns apart. This commit is a first step to doing this, breaking
the inheritance relationship between MappedFieldType and FieldType. MappedFieldType 
instead has a series of boolean flags defining whether or not the field is searchable or 
aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining 
how indexing should be done.

Relates to #56814
2020-06-15 17:47:15 +01:00
Rene Groeschke 680ea07f7f
Remove deprecated usage of testCompile configuration (#57921)
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
2020-06-12 13:34:53 +02:00
Alan Woodward e19a82d762
Update to lucene snapshot e7c625430ed (#57981)
Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.
2020-06-11 14:36:31 +01:00
Armin Braun 37ab35156b
Deduplicate Index Metadata in BlobStore (#50278)
This PR introduces two new fields in to `RepositoryData` (index-N) to track the blob name of `IndexMetaData` blobs and their content via setting generations and uuids. This is used to deduplicate the `IndexMetaData` blobs (`meta-{uuid}.dat` in the indices folders under `/indices` so that new metadata for an index is only written to the repository during a snapshot if that same metadata can't be found in another snapshot.
This saves one write per index in the common case of unchanged metadata thus saving cost and making snapshot finalization drastically faster if many indices are being snapshotted at the same time.

The implementation is mostly analogous to that for shard generations in #46250 and piggy backs on the BwC mechanism introduced in that PR (which means this PR needs adjustments if it doesn't go into `7.6`). 

Relates to #45736 as it improves the efficiency of snapshotting unchanged indices
Relates to #49800 as it has the potential of loading the index metadata for multiple snapshots of the same index concurrently much more efficient speeding up future concurrent snapshot delete
2020-06-05 19:16:41 +02:00
Jun Ohtani 9d5409a9c2
Expose discard_compound_token option to kuromoji_tokenizer (#57421)
This commit exposes the new Lucene option `discard_compound_token` to the Elasticsearch Kuromoji plugin.
2020-06-05 15:33:31 +02:00
Tanguy Leroux 34e253558d
Remove more //NORELEASE (#57517)
We agreed on removing the following //NORELEASE tags.
2020-06-05 15:19:38 +02:00
Mark Tozzi 0a23487e73
IndexFieldData should hold the ValuesSourceType (#57373) 2020-06-02 09:54:53 -04:00
Tanguy Leroux d7b31a8a35
Use 3rd party task to run integration tests on external service (#56587) 2020-06-02 09:40:37 +02:00
Mark Vieira 627ef279fd
Include vendored code notices in distribution notice files (#57017) 2020-06-01 15:23:41 -07:00
Przemyslaw Gomulka 4d6dc51c72
Header warning logging refactoring (#55941)
Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding a response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog).
Introducing A ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed.

relates #55699
relates #52369
2020-06-01 15:44:01 +02:00
Armin Braun 6f79750793
Fix GCS Mock Behavior for Missing Bucket (#57283)
* Fix GCS Mock Behavior for Missing Bucket

We were throwing a 500 instead of a 404 for a missing bucket.
This would make yaml tests needlessly wait for multiple seconds, retrying
the 500 response with backoff, in the test checking behavior for missing buckets.
2020-05-28 21:01:28 +02:00
Francisco Fernández Castaño 007ab1b846
Track PUT/PUT_BLOCK operations on AzureBlobStore. (#56936) 2020-05-25 12:53:17 +02:00
Armin Braun 2151fbf7d7
Release HTTP Request Body Earlier (#57094)
We don't need to hold on to the request body past the beginning of sending
the response. There is no need to keep a reference to it until after the response
has been sent fully and we can eagerly release it here.
Note, this can be optimized further to release the contents even earlier but for now
this is an easy increment to saving some memory on the IO pool.
2020-05-25 12:02:47 +02:00
Armin Braun 444e1e155d
Remove Needless Context Switches on Loading RepositoryData (#56935)
We don't need to switch to the generic or snapshot pool for loading
cached repository data (i.e. most of the time in normal operation).

This makes `executeConsistentStateUpdate` less heavy if it has to retry
and lowers the chance of having to retry in the first place.
Also, this change allowed simplifying a few other spots in the codebase
where we would fork off to another pool just to load repository data.
2020-05-25 11:20:17 +02:00
Armin Braun b82a16eb38
Fix GCS Repository YAML Test Build (#57073)
A few relatively obvious issues here:

* We cannot run the different IT runs (large blob setting one and normal integ run) concurrently
* We need to set the dependency tasks up correctly for the large blob run so that it works in isolation
* We can't use the `localAddress` for the location header of the resumable upload
(this breaks in YAML tests because GCS is using a loopback port forward for the initial request and the
local address will be chosen as the actual Docker container host)

Closes #57026
2020-05-25 10:11:58 +02:00
Armin Braun 2a8b578746
Serialize Outbound Messages on IO Threads (#56961)
Almost every outbound message is serialized to buffers of 16k pagesize.
We were serializing these messages off the IO loop (and retaining the concrete message
instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the
channel is ready.
1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average.
2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically.

With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable.
Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC.

This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain).
I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once.

Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.
2020-05-22 20:06:30 +02:00
markharwood df93987a75
Update Lucene snapshot to 8.6.0-snapshot-9d6c738ffce (#56988)
Update of Lucene snapshot and dealing with API changes
2020-05-21 09:18:35 +01:00
Alan Woodward f82d74b501
Move merge compatibility logic from MappedFieldType to FieldMapper (#56915)
Merging logic is currently split between FieldMapper, with its merge() method, and
MappedFieldType, which checks for merging compatibility. The compatibility checks
are called from a third class, MappingMergeValidator. This makes it difficult to reason
about what is or is not compatible in updates, and even what is in fact updateable - we
have a number of tests that check compatibility on changes in mapping configuration
that are not in fact possible.

This commit refactors the compatibility logic so that it all sits on FieldMapper, and
makes it called at merge time. It adds a new FieldMapperTestCase base class that
FieldMapper tests can extend, and moves the compatibility testing machinery from
FieldTypeTestCase to here.

Relates to #56814
2020-05-20 09:32:08 +01:00
Rene Groeschke 731b282c9f
Improvement usage of gradle task avoidance api (#56627)
Use gradle task avoidance api wherever it is possible as a drop in replacement in the es build
2020-05-19 20:01:49 +02:00
Francisco Fernández Castaño 46b754831f
Track GET/LIST Azure Storage API calls (#56773)
Adds tracking for the API calls performed by the Azure Storage
underlying SDK. A new interface (RequestMetricCollector) has been
introduced into the Azure plugin that allows collecting metrics per
request easily, it just need to be injected in during the client
creation and it would be hooked into the OperationContext.
2020-05-19 11:42:43 +02:00
Tim Brooks 7501e061cf
Create HttpRequest earlier in pipeline (#56393)
Elasticsearch requires that a HttpRequest abstraction be implemented
by http modules before server processing. This abstraction controls when
underlying resources are released. This commit moves this abstraction to
be created immediately after content aggregation. This change will
enable follow-up work including moving Cors logic into the server
package and tracking bytes as they are aggregated from the network
level.
2020-05-18 09:06:24 -06:00
Francisco Fernández Castaño 490e9c8d2a
Track upload requests on S3 repositories (#56826)
Add tracking for regular and multipart uploads.
Regular uploads are categorized as PUT.
Multi part uploads are categorized as POST.
The number of documents created for the test #testRequestStats
have been increased so all upload methods are exercised.
2020-05-18 13:46:39 +02:00
Francisco Fernández Castaño 79a69cb676
Track multipart/resumable uploads GCS API calls (#56821)
Add tracking for multipart and resumable uploads for GoogleCloudStorage.
For resumable uploads only the last request is taken into account for
billing, so that's the only request that's tracked.
2020-05-18 11:41:55 +02:00
Armin Braun d3996358b4
Shorter Path in Netty ByteBuf Unwrap (#56740)
In most cases we are seeing a `PooledHeapByteBuf` here now. No need to
redundantly create an new `ByteBuffer` and single element array for it
here when we can just directly unwrap its internal `byte[]`.
2020-05-16 10:27:34 +02:00
Alan Woodward 0cc2345f98
Simplify generics on Mapper.Builder (#56747)
Mapper.Builder currently has some complex generics on it to allow fluent builder
construction. However, the second parameter, a return type from the build() method,
is unnecessary, as we can use covariant return types. This commit removes this second
generic parameter.
2020-05-15 12:06:39 +01:00
Francisco Fernández Castaño aaddeb8d46
Move azure client logic from AzureStorageService to AzureBlobStore (#56782) 2020-05-15 09:57:15 +02:00
Ryan Ernst c0ee68b0a0
Move publishing configuration to a separate plugin (#56727)
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
2020-05-14 18:56:59 -07:00
Mark Vieira f9847f3b71
Enforce strict license distribution requirements (#56642)
This commit tightens certain dependency license checks in our build.
Firstly, the build will not fail if it cannot accurately identify the
type of license in one of our LICENSE.txt files. Secondly, dependencies
for licenses identified as requiring source redistribution will fail if
a corresponding SOURCES.txt file does not exist. This file should
include a hyperlink to a source artifact for the given dependency to be
used for redistribution during the release process.
2020-05-14 13:25:32 -07:00