This commit removes `integTest` task from all es-plugins.
Most relevant projects have been converted to use yamlRestTest, javaRestTest,
or internalClusterTest in prior PRs.
A few projects needed to be adjusted to allow complete removal of this task
* x-pack/plugin - converted to use yamlRestTest and javaRestTest
* plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task
* qa/die-with-dignity - convert to javaRestTest
* x-pack/qa/security-example-spi-extension - convert to javaRestTest
* multiple projects - remove the integTest.enabled = false (yay!)
related: #61802
related: #60630
related: #59444
related: #59089
related: #56841
related: #59939
related: #55896
Kibana often highlights *everything* like this:
```
POST /_search
{
"query": ...,
"size": 500,
"highlight": {
"fields": {
"*": { ... }
}
}
}
```
This can get slow when there are hundreds of mapped fields. I tested
this locally and unscientifically and it took a request from 20ms to
150ms when there are 100 fields. I've seen clusters with 2000 fields
where simple search go from 500ms to 1500ms just by turning on this sort
of highlighting. Even when the query is just a `range` that and the
fields are all numbers and stuff so it won't highlight anything.
This speeds up the `unified` highlighter in this case in a few ways:
1. Build the highlighting infrastructure once field rather than once pre
document per field. This cuts out a *ton* of work analyzing the query
over and over and over again.
2. Bail out of the highlighter before loading values if we can't produce
any results.
Combined these take that local 150ms case down to 65ms. This is unlikely
to be really useful when there are only a few fetched docs and only a
few fields, but we often end up having many fields with many fetched
docs.
Currently we duplicate our specialized cors logic in all transport
plugins. This is unnecessary as it could be implemented in a single
place. This commit moves the logic to server. Additionally it fixes a
but where we are incorrectly closing http channels on early Cors
responses.
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.
In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.
There are currently half a dozen ways to add plugins and modules for
test clusters to use. All of them require the calling project to peek
into the plugin or module they want to use to grab its bundlePlugin
task, and then both depend on that task, as well as extract the archive
path the task will produce. This creates cross project dependencies that
are difficult to detect, and if the dependent plugin/module has not yet
been configured, the build will fail because the task does not yet
exist.
This commit makes the plugin and module methods for testclusters
symmetetric, and simply adding a file provider directly, or a project
path that will produce the plugin/module zip. Internally this new
variant uses normal configuration/dependencies across projects to get
the zip artifact. It also has the added benefit of no longer needing the
caller to add to the test task a dependsOn for bundlePlugin task.
FetchSubPhase has two 'execute' methods, one which takes all hits to be examined,
and one which takes a single HitContext. It's not obvious which one should be implemented
by a given sub-phase, or if implementing both is a possibility; nor is it obvious that we first
run the hitExecute methods of all subphases, and then subsequently call all the
hitsExecute methods.
This commit reworks FetchSubPhase to replace these two variants with a processor class,
`FetchSubPhaseProcessor`, that is returned from a single `getProcessor` method. This
processor class has two methods, `setNextReader()` and `process`. FetchPhase collects
processors from all its subphases (if a subphase does not need to execute on the current
search context, it can return `null` from `getProcessor`). It then sorts its hits by docid, and
groups them by lucene leaf reader. For each reader group, it calls `setNextReader()` on
all non-null processors, and then passes each doc id to `process()`.
Implementations of fetch sub phases can divide their concerns into per-request, per-reader
and per-document sections, and no longer need to worry about sorting docs or dealing with
reader slices.
FetchSubPhase now provides a FetchSubPhaseExecutor that exposes two methods, setNextReader(LeafReaderContext) and execute(HitContext). The parent FetchPhase collects all these executors together (if a phase should not be executed, then it returns null here); then it sorts hits, and groups them by reader; for each reader it calls setNextReader, and then execute for each hit in turn. Individual sub phases no longer need to concern themselves with sorting docs or keeping track of readers; global structures can be built in getExecutor(SearchContext), per-reader structures in setNextReader and per-doc in execute.
It looks like it is possible for a request to throw an exception early
before any API interaciton has happened. This can lead to the request count
map containing a `null` for the request count key.
The assertion is not correct and we should not NPE here
(as that might also hide the original exception since we are running this code in
a `finally` block from within the S3 SDK).
Closes#61670
Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not.
To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method.
As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors.
With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition.
Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch.
This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>.
Co-authored-by: Nik Everett <nik9000@gmail.com>
Relates to #59332
Added case insensitive support for regex queries.
Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master.
This can be removed when 8.7 Lucene is released.
Closes#59235
The recursive data.path FilePermission check is an extremely hot
codepath in Elasticsearch. Unfortunately the FilePermission check in
Java is extremely allocation heavy. As it iterates through different
file permissions, it allocates byte arrays for each Path component that
must be compared. This PR improves the situation by adding the recursive
data.path FilePermission it its own PermissionsCollection object which
is checked first.
Before when a value was copied to a field through a parent field or `copy_to`,
we parsed it using the `FieldMapper` from the source field. Instead we should
parse it using the target `FieldMapper`. This ensures that we apply the
appropriate mapping type and options to the copied value.
To implement the fix cleanly, this PR refactors the value parsing strategy. Now
instead of looking up values directly, field mappers produce a helper object
`ValueFetcher`. The value fetchers are responsible for almost all aspects of
fetching, including looking up the right paths in the _source.
The PR is fairly big but each commit can be reviewed individually.
Fixes#61033.
Same as https://github.com/elastic/elasticsearch/pull/43288 for GCS.
We don't need to do the bucket exists check before using the repo, that just needlessly
increases the necessary permissions for using the GCS repository.
We have various ways of copying between two streams and handling thread-local
buffers throughout the codebase. This commit unifies a number of them and
removes buffer allocations in many spots.
* Merge test runner task into RestIntegTest
* Reorganizing Standalone runner and RestIntegTest task
* Rework general test task configuration and extension
- Replace immediate task creations by using task avoidance api
- One step closer to #56610
- Still many tasks are created during configuration phase. Tackled in separate steps
The `SourceLookup` class provides access to the _source for a particular
document, specified through `SourceLookup#setSegmentAndDocument`. Previously
the search context contained a single `SourceLookup` that was shared between
different fetch subphases. It was hard to reason about its state: is
`SourceLookup` set to the expected document? Is the _source already loaded and
available?
Instead of using a global source lookup, the fetch hit context now provides
access to a lookup that is set to load from the hit document.
This refactor closes#31000, since the same `SourceLookup` is no longer shared
between the 'fetch _source phase' and script execution.
In #60297 we added some tests related to logging from the transport
layer, but these tests failed occasionally since the cluster
was kept alive between test invocations but the logging framework
expected it only to be used for a single test. With this commit we
reduce the scope of the internal test cluster to `TEST` to solve this
problem.
Closes#60321.
For OSS plugins that being with repository-*, integTest
task is now a no-op and all of the tests are now executed via a test,
yamlRestTest, javaRestTest, or internalClusterTest.
related: #56841
related: #59444
For OSS plugins that begin with discovery-*, the integTest
task is now a no-op and all of the tests are now executed via a test,
yamlRestTest, javaRestTest, or internalClusterTest.
related: #56841
related: #59444
For all OSS plugins (except repository-* and discovery-*) integTest
task is now a no-op and all of the tests are now executed via a test,
yamlRestTest, javaRestTest, or internalClusterTest.
This commit does NOT convert the discovery-* and repository-* since they
are bit more complex then the rest of tests and this PR is large enough.
Those plugins will be addressed in a future PR(s).
This commit also fixes a minor issue that did not copy the rest api
for projects that only had YAML TEST tests.
related: #56841
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.
This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.
Addresses #49028 and #55363.
keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are
killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their
configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of
distributed systems such as Elasticsearch.
This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations
that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5
minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings
unless they are deemed to be set too high by providing better out-of-the-box defaults.
Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient.
By the same token, the use of stream copying with the default 8k buffer size for blob writes was inefficient as well.
We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`.
This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs.
This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HFDS repositories.
We never used the `IndexSettings` parameter and we only used the
`MappedFieldType` parameter to get the name of the field which we
already know everywhere where we build the `IFD.Builder`. This allows us
to drop a fair bit of ceremony from a couple of tests.
Removing these limits as they cause unnecessarily many object in the blob stores.
We do not have to worry about BwC of this change since we do not support any 3rd party
implementations of Azure or GCS.
Also, since there is no valid reason to set a different than the default maximum chunk size at this
point, removing the documentation (which was incorrect in the case of Azure to begin with) for the setting
from the docs.
Closes#56018
Enables fully concurrent snapshot operations:
* Snapshot create- and delete operations can be started in any order
* Delete operations wait for snapshot finalization to finish, are batched as much as possible to improve efficiency and once enqueued in the cluster state prevent new snapshots from starting on data nodes until executed
* We could be even more concurrent here in a follow-up by interleaving deletes and snapshots on a per-shard level. I decided not to do this for now since it seemed not worth the added complexity yet. Due to batching+deduplicating of deletes the pain of having a delete stuck behind a long -running snapshot seemed manageable (dropped client connections + resulting retries don't cause issues due to deduplication of delete jobs, batching of deletes allows enqueuing more and more deletes even if a snapshot blocks for a long time that will all be executed in essentially constant time (due to bulk snapshot deletion, deleting multiple snapshots is mostly about as fast as deleting a single one))
* Snapshot creation is completely concurrent across shards, but per shard snapshots are linearized for each repository as are snapshot finalizations
See updated JavaDoc and added test cases for more details and illustration on the functionality.
Some notes:
The queuing of snapshot finalizations and deletes and the related locking/synchronization is a little awkward in this version but can be much simplified with some refactoring. The problem is that snapshot finalizations resolve their listeners on the `SNAPSHOT` pool while deletes resolve the listener on the master update thread. With some refactoring both of these could be moved to the master update thread, effectively removing the need for any synchronization around the `SnapshotService` state. I didn't do this refactoring here because it's a fairly large change and not necessary for the functionality but plan to do so in a follow-up.
This change allows for completely removing any trickery around synchronizing deletes and snapshots from SLM and 100% does away with SLM errors from collisions between deletes and snapshots.
Snapshotting a single index in parallel to a long running full backup will execute without having to wait for the long running backup as required by the ILM/SLM use case of moving indices to "snapshot tier". Finalizations are linearized but ordered according to which snapshot saw all of its shards complete first
With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer
have any cause to compare MappedFieldType instances. This means that we can remove all equals
and hashCode implementations, and in addition we no longer need the clone implementations which
were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes,
which will be particularly useful for the runtime fields project.
In order to ensure that we do not write a broken piece of `RepositoryData`
because the phyiscal repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository we must check whether or not
the current assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.
Relates #56911
This commit creates a new Gradle plugin to provide a separate task name
and source set for running YAML based REST tests. The only project
converted to use the new plugin in this PR is distribution/archives/integ-test-zip.
For which the testing has been moved to :rest-api-spec since it makes the most
sense and it avoids a small but awkward change to the distribution plugin.
The remaining cases in modules, plugins, and x-pack will be handled in followups.
This plugin is distinctly different from the plugin introduced in #55896 since
the YAML REST tests are intended to be black box tests over HTTP. As such they
should not (by default) have access to the classpath for that which they are testing.
The YAML based REST tests will be moved to separate source sets (yamlRestTest).
The which source is the target for the test resources is dependent on if this
new plugin is applied. If it is not applied, it will default to the test source
set.
Further, this introduces a breaking change for plugin developers that
use the YAML testing framework. They will now need to either use the new source set
and matching task, or configure the rest resources to use the old "test" source set that
matches the old integTest task. (The former should be preferred).
As part of this change (which is also breaking for plugin developers) the
rest resources plugin has been removed from the build plugin and now requires
either explicit application or application via the new YAML REST test plugin.
Plugin developers should be able to fix the breaking changes to the YAML tests
by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests
under a yamlRestTest folder (instead of test)
Currently we are leaving the settings to default port range in the nio
and netty4 http server test. This has recently led to tests failing due
to what appears to be a port conflict with other processes. This commit
modifies these tests to use the test case helper method to generate port
ranges.
Fixes#58433 and #58296.
Many of the parameters we pass into this method were only used to
build the `SnapshotInfo` instance to write.
This change simplifies the signature. Also, it seems less error prone to build
`SnapshotInfo` in `SnapshotsService` isntead of relying on the fact that each repository
implementation will build the correct `SnapshotInfo`.
Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account
(i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository
setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a
per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to
configure throttling in a single place.
The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to
`40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change
will be observed by clusters where the recovery and restore settings were not adapted.
Relates https://github.com/elastic/elasticsearch/issues/57023
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
- required as java library will by default not have build jar file
- jar file is now explicit input of the task and gradle will ensure its properly build
DeprecationLogger's constructor should not create two loggers. It was
taking parent logger instance, changing its name with a .deprecation
prefix and creating a new logger.
Most of the time parent logger was not needed. It was causing Log4j to
unnecessarily cache the unused parent logger instance.
Netty4HttpServerTransportTests has started to fail intermittently. It
seems like unexpected successful responses are being received when the
test is simulating errors. This commit adds logging to the test to
provide additional information when there is an unexpected success. It
also adds the logging to the nio http test.
Now that MappedFieldType no longer extends lucene's FieldType, we need to have a
way of getting the index information about a field necessary for building text queries,
building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo
abstraction that holds this information, and a getTextSearchInfo() method to
MappedFieldType to make it available. Field types that do not support text search can
just return null here.
This allows us to remove the MapperService.getLuceneFieldType() shim method.
Fixes a bug in TextFieldMapper serialization when index is false, and adds a
base-class test to ensure that all field mappers are tested against all variations
with defaults both included and excluded.
Fixes#58188
This is currently used to set the indexVersionCreated parameter on FieldMapper.
However, this parameter is only actually used by two implementations, and clutters
the API considerably. We should just remove it, and use it directly in the
implementations that require it.
This commit adds an optional field, `description`, to all ingest processors
so that users can explain the purpose of the specific processor instance.
Closes#56000.
MappedFieldType is a combination of two concerns:
* an extension of lucene's FieldType, defining how a field should be indexed
* a set of query factory methods, defining how a field should be searched
We want to break these two concerns apart. This commit is a first step to doing this, breaking
the inheritance relationship between MappedFieldType and FieldType. MappedFieldType
instead has a series of boolean flags defining whether or not the field is searchable or
aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining
how indexing should be done.
Relates to #56814
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
This PR introduces two new fields in to `RepositoryData` (index-N) to track the blob name of `IndexMetaData` blobs and their content via setting generations and uuids. This is used to deduplicate the `IndexMetaData` blobs (`meta-{uuid}.dat` in the indices folders under `/indices` so that new metadata for an index is only written to the repository during a snapshot if that same metadata can't be found in another snapshot.
This saves one write per index in the common case of unchanged metadata thus saving cost and making snapshot finalization drastically faster if many indices are being snapshotted at the same time.
The implementation is mostly analogous to that for shard generations in #46250 and piggy backs on the BwC mechanism introduced in that PR (which means this PR needs adjustments if it doesn't go into `7.6`).
Relates to #45736 as it improves the efficiency of snapshotting unchanged indices
Relates to #49800 as it has the potential of loading the index metadata for multiple snapshots of the same index concurrently much more efficient speeding up future concurrent snapshot delete
Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding a response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog).
Introducing A ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed.
relates #55699
relates #52369
* Fix GCS Mock Behavior for Missing Bucket
We were throwing a 500 instead of a 404 for a missing bucket.
This would make yaml tests needlessly wait for multiple seconds, retrying
the 500 response with backoff, in the test checking behavior for missing buckets.
We don't need to hold on to the request body past the beginning of sending
the response. There is no need to keep a reference to it until after the response
has been sent fully and we can eagerly release it here.
Note, this can be optimized further to release the contents even earlier but for now
this is an easy increment to saving some memory on the IO pool.
We don't need to switch to the generic or snapshot pool for loading
cached repository data (i.e. most of the time in normal operation).
This makes `executeConsistentStateUpdate` less heavy if it has to retry
and lowers the chance of having to retry in the first place.
Also, this change allowed simplifying a few other spots in the codebase
where we would fork off to another pool just to load repository data.
A few relatively obvious issues here:
* We cannot run the different IT runs (large blob setting one and normal integ run) concurrently
* We need to set the dependency tasks up correctly for the large blob run so that it works in isolation
* We can't use the `localAddress` for the location header of the resumable upload
(this breaks in YAML tests because GCS is using a loopback port forward for the initial request and the
local address will be chosen as the actual Docker container host)
Closes#57026
Almost every outbound message is serialized to buffers of 16k pagesize.
We were serializing these messages off the IO loop (and retaining the concrete message
instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the
channel is ready.
1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average.
2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically.
With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable.
Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC.
This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain).
I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once.
Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.
Merging logic is currently split between FieldMapper, with its merge() method, and
MappedFieldType, which checks for merging compatibility. The compatibility checks
are called from a third class, MappingMergeValidator. This makes it difficult to reason
about what is or is not compatible in updates, and even what is in fact updateable - we
have a number of tests that check compatibility on changes in mapping configuration
that are not in fact possible.
This commit refactors the compatibility logic so that it all sits on FieldMapper, and
makes it called at merge time. It adds a new FieldMapperTestCase base class that
FieldMapper tests can extend, and moves the compatibility testing machinery from
FieldTypeTestCase to here.
Relates to #56814
Adds tracking for the API calls performed by the Azure Storage
underlying SDK. A new interface (RequestMetricCollector) has been
introduced into the Azure plugin that allows collecting metrics per
request easily, it just need to be injected in during the client
creation and it would be hooked into the OperationContext.
Elasticsearch requires that a HttpRequest abstraction be implemented
by http modules before server processing. This abstraction controls when
underlying resources are released. This commit moves this abstraction to
be created immediately after content aggregation. This change will
enable follow-up work including moving Cors logic into the server
package and tracking bytes as they are aggregated from the network
level.
Add tracking for regular and multipart uploads.
Regular uploads are categorized as PUT.
Multi part uploads are categorized as POST.
The number of documents created for the test #testRequestStats
have been increased so all upload methods are exercised.
Add tracking for multipart and resumable uploads for GoogleCloudStorage.
For resumable uploads only the last request is taken into account for
billing, so that's the only request that's tracked.
In most cases we are seeing a `PooledHeapByteBuf` here now. No need to
redundantly create an new `ByteBuffer` and single element array for it
here when we can just directly unwrap its internal `byte[]`.
Mapper.Builder currently has some complex generics on it to allow fluent builder
construction. However, the second parameter, a return type from the build() method,
is unnecessary, as we can use covariant return types. This commit removes this second
generic parameter.
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
This commit tightens certain dependency license checks in our build.
Firstly, the build will not fail if it cannot accurately identify the
type of license in one of our LICENSE.txt files. Secondly, dependencies
for licenses identified as requiring source redistribution will fail if
a corresponding SOURCES.txt file does not exist. This file should
include a hyperlink to a source artifact for the given dependency to be
used for redistribution during the release process.
We never do any file IO or other blocking work on the transport threads
so no tangible benefit can be derived from using more threads than CPUs
for IO.
There are however significant downsides to using more threads than necessary
with Netty in particular. Since we use the default setting for
`io.netty.allocator.useCacheForAllThreads` which is `true` we end up
using up to `16MB` of thread local buffer cache for each transport thread.
Meaning we potentially waste CPUs * 16MB of heap for unnecessary IO threads in addition to obvious inefficiencies of artificially adding extra context switches.
Adds tracking for the API calls performed by the GoogleCloudStorage
underlying SDK. It hooks an HttpResponseInterceptor to the SDK
transport layer and does http request filtering based on the URI
paths that we are interested to track. Unfortunately we cannot hook
a wrapper into the ServiceRPC interface since we're using different
levels of abstraction to implement retries during reads
(GoogleCloudStorageRetryingInputStream).
Fixes the fact that repository metadata with the same settings still results in
multiple settings instances being cached as well as leaking settings on closing
a repository.
Closes#56702
This merges the code for the `significant_terms` agg into the package
for the code for the `terms` agg. They are *super* entangled already,
this mostly just admits that to ourselves.
Precondition for the terms work in #56487
Two spots that allow for some optimization:
* We are often creating a composite reference of just a single item in
the transport layer => special cased via static constructor to make sure we never do that
* Also removed the pointless case of an empty composite bytes ref
* `ByteBufferReference` is practically always created from a heap buffer these days so there
is no point of dealing with all the bounds checks and extra references to sliced buffers from that
and we can just use the underlying array directly
A recent AWS SDK upgrade has introduced a new source of spurious `WARN` logs
when the security manager prevents access to the user's home directory and
therefore to their shared client configuration. This is actually the behaviour
we want, and it's harmless and handled by the SDK as if the profile config
doesn't exist, so this log message is unnecessary noise. This commit suppresses
this noisy logging by default.
Relates #20313Closes#56333
Another Jackson release is available. There are some CVEs addressed,
none of which impact us, but since we can now bump Jackson easily, let
us move along with the train to avoid the false positives from security
scanners.
Moving from `5s` to `10s` here because of #56095.
This adds `10s` to the overall runtime of the test which should be
a reasonable tradeoff for stability.
Closes#56095
`FieldMapper#parseCreateField` accepts the parse context, plus a list of fields
as an output parameter. These fields are immediately added to the document
through `ParseContext#doc()`.
This commit simplifies the signature by removing the list of fields, and having
the mappers add the fields directly to `ParseContext#doc()`. I think this is
nicer for implementors, because previously fields could be added either through
the list, or the context (through `add`, `addWithKey`, etc.)
Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise.
This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.
This change folds the removal of the in-progress snapshot entry
into setting the safe repository generation. Outside of removing
an unnecessary cluster state update, this also has the advantage
of removing a somewhat inconsistent cluster state where the safe
repository generation points at `RepositoryData` that contains a
finished snapshot while it is still in-progress in the cluster
state, making it easier to reason about the state machine of
upcoming concurrent snapshot operations.
To read from GCS repositories we're currently using Google SDK's official BlobReadChannel,
which issues a new request every 2MB (default chunk size for BlobReadChannel) using range
requests, and fully downloads the chunk before exposing it to the returned InputStream. This
means that the SDK issues an awfully high number of requests to download large blobs.
Increasing the chunk size is not an option, as that will mean that an awfully high amount of
heap memory will be consumed by the download process.
The Google SDK does not provide the right abstractions for a streaming download. This PR
uses the lower-level primitives of the SDK to implement a streaming download, similar to what
S3's SDK does.
Also closes#55505
* Fix Path Style Access Setting Priority
Fixing obvious bug in handling path style access if it's the only setting overridden by the
repository settings.
Closes#55407
Adds ranged read support for GCS repositories in order to enable searchable snapshot support
for GCS.
As part of this PR, I've extracted some of the test infrastructure to make sure that
GoogleCloudStorageBlobContainerRetriesTests and S3BlobContainerRetriesTests are covering
similar test (as I saw those diverging in what they cover)
* Fix security manager bug writing large blobs to GCS
This commit addresses a security manager permissions issue writing large
blobs (on the resumable upload path) to GCS. The underlying issue here
is that we need to wrap the close and write calls on the channel. It is
not enough to do this:
SocketAccess.doPrivilegedVoidIOException(
() -> Streams.copy(
inputStream,
Channels.newOutputStream(client().writer(blobInfo, writeOptions))));
This reason that this is not enough is because Streams#copy will be in
the stacktrace and it is not granted the security manager permissions
needed to close or write this channel. We only grant those permissions
to classes loaded in the plugin classloader, and Streams#copy is from
the parent classloader. This is why we must wrap the close and write
calls as privileged, to truncate the Streams#copy call out of the
stacktrace.
The reason that this issue is not caught in testing is because the size
of data that we use in testing is too small to trigger the large blob
resumable upload path. Therefore, we address this by adding a system
property to control the threshold, which we can then set in tests to
exercise this code path. Prior to rewriting the writeBlobResumable
method to wrap the close and write calls as privileged, with this
additional test, we are able to reproduce the security manager
permissions issue. After adding the wrapping, this test now passes.
* Fix forbidden APIs issue
* Remove leftover debugging
We believe there's no longer a need to be able to disable basic-license
features completely using the "xpack.*.enabled" settings. If users don't
want to use those features, they simply don't need to use them. Having
such features always available lets us build more complex features that
assume basic-license features are present.
This commit deprecates settings of the form "xpack.*.enabled" for
basic-license features, excluding "security", which is a special case.
It also removes deprecated settings from integration tests and unit
tests where they're not directly relevant; e.g. monitoring and ILM are
no longer disabled in many integration tests.
Closes#53137. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages
can potentially be deduplicated.
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our foribdden apis
usages, in preparation for upgrading to 3.0 when it is released.
Upgrade to lucene 8.5.1 release that contains a bug fix for a bug that might introduce index corruption when deleting data from an index that was previously shrunk.
Currently there is a clear mechanism to stub sending a request through
the transport. However, this is limited to testing exceptions on the
sender side. This commit reworks our transport related testing
infrastructure to allow stubbing request handling on the receiving side.
Provides basic repository-level stats that will allow us to get some insight into how many
requests are actually being made by the underlying SDK. Currently only tracks GET and LIST
calls for S3 repositories. Most of the code is unfortunately boiler plate to add a new endpoint
that will help us better understand some of the low-level dynamics of searchable snapshots.
The use of available processors, the terminology, and the settings
around it have evolved over time. This commit cleans up some places in
the codes and in the docs to adjust to the current terminology.
The ranges in HTTP headers are using inclusive values for start and end of the range.
The math we used was off in so far that start equals end for the range resulted in length `0`
instead of the correct value of `1`.
Closes#54981Closes#54995
This change converts the module and plugin parameters
for testClusters to be lazy. Meaning that the values
are not resolved until they are actually used. This
removes the requirement to use project.afterEvaluate to
be able to resolve the bundle artifact.
Note - this does not completely remove the need for afterEvaluate
since it is still needed for the custom resource extension.
We can be a little more efficient when aborting a snapshot. Since we know the new repository
data after finalizing the aborted snapshot when can pass it down to the snapshot completion listeners.
This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.
This commit moves the action name validation and circuit breaking into
the InboundAggregator. This work is valuable because it lays the
groundwork for incrementally circuit breaking as data is received.
This PR includes the follow behavioral change:
Handshakes contribute to circuit breaking, but cannot be broken. They
currently do not contribute nor are they broken.
This commit updates the link to the JDK 14 compiler bug that we have
found. At the time that we committed the workaround, we had a submission
ID, but not yet the public bug URL. This commit adds the public bug URL.
This commit merges the searchable-snapshots feature branch into master.
See #54803 for the complete list of squashed commits.
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
This commit workarounds a bug in the JDK 14 compiler. It is choking on a
method reference, so we substitute a lambda expression instead. The JDK
bug ID is 9064309.
Guava was removed from Elasticsearch many years ago, but remnants of it
remain due to transitive dependencies. When a dependency pulls guava
into the compile classpath, devs can inadvertently begin using methods
from guava without realizing it. This commit moves guava to a runtime
dependency in the modules that it is needed.
Note that one special case is the html sanitizer in watcher. The third
party dep uses guava in the PolicyFactory class signature. However, only
calling a method on the PolicyFactory actually causes the class to be
loaded, a reference alone does not trigger compilation to look at the
class implementation. There we utilize a MethodHandle for invoking the
relevant method at runtime, where guava will continue to exist.
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams,
the abstraction needs to be made more flexible, because currently it really can be only
an alias or an index.
* Renamed `AliasOrIndex` to `IndexAbstraction`.
* Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is.
* Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum.
* Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface.
* Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`.
* Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method.
Relates to #53100
Currently all of our transport protocol decoding and aggregation occurs
in the individual transport modules. This means that each implementation
(test, netty, nio) must implement this logic. Additionally, it means
that the entire message has been read from the network before the server
package receives it.
This commit creates a pipeline in server which can be passed arbitrary
bytes to handle. Internally, the pipeline will decode, decompress, and
aggregate the messages. Additionally, this allows us to run many
megabytes of bytes through the pipeline in tests to ensure that the
logic works.
This work will enable future work:
Circuit breaking or backoff logic based on message type and byte
in the content aggregator.
Sharing bytes with the application layer using the ref counted
releasable network bytes.
Improved network monitoring based specifically on channels.
Finally, this fixes the bug where we do not circuit break on the correct
message size when compression is enabled.
We are not using the Apache HTTP client backed http transport
with the GCS repo. Same as with the app engine type transport
we can save ourselves the dependency on the http client here
and ignore the missing classes.
Elasticsearch has a number of different BytesReference implementations.
These implementations can all implement the interface in different ways
with subtly different behavior and performance characteristics. On the
other-hand, the JVM only represents bytes as an array or a direct byte
buffer. This commit deletes the specialized Netty implementations and
moves to using a generic ByteBuffer reference type. This will allow us
to focus on standardizing performance and behave around a smaller number
of implementations that can be used by all components in Elasticsearch.
Failure #52906 does not happen in master and is limited to the `7.x` branch
so it wasn't unmuted when the `7.x` fix for this landed => unmuting it here.
This change adds the `nori_number` token filter.
It also adds a `discard_punctuation` option in nori_tokenizer that should be used in conjunction with the new filter.
Upgrading AWS SDK to v1.11.749.
Required building clients inside privileged contexts because some class loading that requires privileges now happens there and working around a new SDK bug in the S3 client builder.
Closes#53191
Upgrading to 8.6.2 in #53865 broke running against HTTPs endpoints (and hence real azure)
because the https url connection needs the newly added permission to work.
The lower end of the timeout range of 100ms is prone to time out
on CI before the mock REST server gets to sending a response that
is not supposed to be a timeout.
Using 1-3s here should make this safe at the cost of randomly making
this test take a few seconds.
Closes#53506
Re-applies the change from #53523 along with test fixes.
closes#53626closes#53624closes#53622closes#53625
Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
This commit upgrades the jackson-databind depdendency to
2.8.11.6. Additionally, we revert a previous change that put
ingest-geoip on the version of jackson-databind from the version
properties file. This is because upgrading ingest-geoip to a later
version of jackson-databind also requires an upgrade to the geoip2
dependency which is currently blocked. Therefore, if we can get to a
point where we otherwise upgrade our Jackson dependencies, we do not
want ingest-geoip to automatically come along with it.
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
Upgrading the GCS SDK to the most recent version.
Adjusting (i.e. improving) the REST mock accordingly.
This should significantly boost performance by pulling in
https://github.com/googleapis/java-core/issues/86 in some cases.
The countdown didn't work well here because it only returns `true` once the countdown reaches `0`
but can on subsequent executions return `false` again if a countdown at `0` is counted down again,
leading to more than the expected number of simulated failures.
Closes#52607
This adds a test to AggregatorTestCase that allows us to programmatically
verify that an aggregator supports or does not support a particular
field type. It fetches the list of registered field type parsers,
creates a MappedFieldType from the parser and then attempts to run
a basic agg against the field.
A supplied list of supported VSTypes are then compared against the
output (success or exception) and suceeds or fails the test accordingly.
Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com>
* Skip fields that are not aggregatable
Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode).
`RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO.
The benefits of this move are:
* Much faster repository status API calls
* listing all snapshot names becomes instant
* Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data
* Additional cloud cost savings
* Better resiliency, saving another spot where an IO issue could break the snapshot
* We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.
* Add Blob Download Retries to GCS Repository
Exactly as #46589 (and kept as close to it as possible code wise so we can dry things up in a follow-up potentially) but for GCS.
Closes#52319
Transport the version to use for a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere.
Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.
- Queries that need to do linear scans to identify matches:
- Script queries
- Queries that have a high up-front cost:
- Fuzzy queries
- Regexp queries
- Prefix queries (without index_prefixes enabled
- Wildcard queries
- Range queries on text and keyword fields
- Joining queries
- HasParent queries
- HasChild queries
- ParentId queries
- Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
- Script score queries
- Percolate queries
Closes: #29050
Move EC2 discovery tests to using the mock REST API introduced in
https://github.com/elastic/elasticsearch/pull/50550 instead of mocking
the AWS SDK classes manually.
Move the trivial remaining AWS SDK mocks to the single test suit that
was using them.
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.
This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.
Closes#51622
Co-authored-by: Jason Tedor <jason@tedor.me>
Our FIPS 140 testing depends on setting the appropriate java policy
in order to configure the JVM in FIPS mode. Some tests (
discovery-ec2 and ccr qa ) also needed to set a custom policy file
to grant a specific permission, which overwrote the FIPS related
policy and tests would fail. This change ensures that when a
custom policy needs to be set in these tests, the permissions that
are necessary for FIPS are also set.
Resolves: #51685, #52034
- Enable SunJGSS provider for Kerberos tests
- Handle the fact that in the decrypt method in KeyStoreWrapper might
not throw immediately when the GCM cipher is from BouncyCastle FIPS
and we end up with a DataInputStream that has reached it's end.
- Disable tests, jarHell, testingConventions for ingest attachment
plugin. We don't support this plugin (and document this) in FIPS
mode.
- Don't attempt to install ingest-attachment in smoke-test-plugins
Now that the FIPS 140 security provider is simply a test dependency
we don't need the thirdPartyAudit exceptions, but plugin-cli and
transport-netty4 do need jarHell disabled as they use the non fips
BouncyCastle security provider as a test dependency too.
Remove lucene-spatial module
Keep previous behaviour of FieldHighlighter::highlightOffsetsEnums
LUCENE-9093 modified how FieldHighlighter breaks texts into passages,
which doesn't work well with elasticsearch custome BoundedBreakIteratorScanner.
This commit for now keeps previous behaviour of FieldHighlighter.
For small uploads (that can still be up to 5MB!) we needlessly
reading the `InputStream` into a BAOS which entailed allocating
the `byte[]` for the stream contents twice (because to `toByteArray` on the BAOS copies).
Also, for resumeable uploads we were needlessly wrapping the output channel and running each individual write in its own privileged context when we could just wrap the whole upload in a single privileged context.
Relates #51593
This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) `byte[]` allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles`
run.
Closes#51446Closes#50754
In e34d7fd we decided to use the default distribution for all tests
when running in a FIPS 140 JVM. This is strictly not necessary in
master, as the problem we originally attempted to fix is that we
set `xpack.security.ssl.diagnose.trust` when running in FIPS 140
JVMs and OSS didn't know of that setting. In master, we do not
need to set `xpack.security.ssl.diagnose.trust` so we don't need
to _always_ run with default distribution.
It is the job of the http server transport to release the request in the handler
but the mock fails to do so since we never override `incomingRequest`.
This addresses a very old TODO comment in MetadataFieldMapper.TypeParser; passing
in a previously constructed field mapper here was a hack in order to provide access to
prebuilt analyzers for the AllFieldType; this has now been removed, so we can remove
the parameter from this method signature.
Reverts #50960
This commit has been causing test failures during upgrade tests: specifically, an upgraded
node becomes master and sends a cluster state update to a 7.x node; this node sees that the
mapping version of its .tasks index is the same as the master, so asserts that the serialized
mappings are the same; however, because the master has rewritten the mapping to use
_docinstead oftasks`, we get an assertion failure. The logical fix is for the master to
increment its mapping version when it rewrites the mapping, but there isn't a simple way to
do that currently.
This reverts commit 774bfb5e22.
Add cool down period after snapshot finalization and delete to prevent eventually consistent AWS S3 from corrupting shard level metadata as long as the repository is using the old format metadata on the shard level.
This moves the testing of custom significance heuristic plugins from an
`ESIntegTestCase` to an example plugin. This is *much* more "real" and
can be used as an example for anyone that needs to actually build such a
plugin. The old test had testing concerns and the example all jumbled
together.
The existing package private visibility of the serialization ctor isn't enough when copying
example code to oder packages or projects when bootstrapping a new project from it.
This commit begins the process of removing types from the document parsing
infrastructure. Initially, we just ignore the user-supplied type after it has been
removed from the mapping json structure, and always supply _doc as the name
of the root parser.
The production code change is very small here, and most of the changeset
consists of alterations to Mapper test code that was passing in non-standard
type names and checking serialization.
Relates to #41059
This commit removes the type parameter from `CreateIndexRequest.mapping(type, object...)`,
and the associated delegating method on `CreateIndexRequestBuilder`. To make migration
simpler, the method on `CreateIndexRequest` is renamed to `simpleMapping`, and
on `CreateIndexRequestBuilder` to `setMapping`; this should help the compiler catch all
necessary changes on upgrades.
Relates to #41059
It's impossible to tell why #50754 fails without this change.
We're failing to close the `exchange` somewhere and there is no
write timeout in the GCS SDK (something to look into separately)
only a read timeout on the socket so if we're failing on an assertion without
reading the full request body (at least into the read-buffer) we're locking up
waiting forever on `write0`.
This change ensure the `exchange` is closed in the tests where we could lock up
on a write and logs the failure so we can find out what broke #50754.
This solves half of the problem in #46813 by moving the S3
tests to using the shared minio fixture so we at least have
some non-3rd-party, constantly running coverage on these tests.
This continues the removal of type parameters from CreateIndexRequest.mapping
methods started in #50419. Here the removed methods are almost entirely in test
code, with the exception of a change to TransformIndex in the transform plugin.
Relates to #41059
Follow up to #50550. Cache empty nodes lists (`fetchDynamicNodes` will return an empty list in case of failure)
now that the plugin properly retries requests to AWS EC2 APIs.
Use the default retry condition instead of never retrying in the discovery plugin causing hot retries upstream and add a test that verifies retrying works.
Closes#50462
Avoid backwards incompatible changes for 8.x and 7.6 by removing type
restriction on compile and Factory. Factories may optionally implement
ScriptFactory. If so, then they can indicate determinism and thus
cacheability.
Relates: #49466
When a third party test failed, it potentially left some snapshots
in the repository. In case of tests running against an external
service like Azure, the remaining snapshots can fail the future
test executions are they are not supposed to exist.
Similarly to what has been done for S3 and GCS, this commit
cleans up remaining snapshots before the test execution.
Closes#50304
Unfortunately bulk delete exceptions don't show the individual delete
errors when a bulk delete fails when you log them outright so I added this work-around
to get the individual details to get useful logging.
* Remove BlobContainer Tests against Mocks
Removing all these weird mocks as asked for by #30424.
All these tests are now part of real repository ITs and otherwise left unchanged if they had
independent tests that didn't call the `createBlobStore` method previously.
The HDFS tests also get added coverage as a side-effect because they did not have an implementation
of the abstract repository ITs.
Closes#30424
* Remove Unused Single Delete in BlobStoreRepository
There are no more production uses of the non-bulk delete or the delete that throws
on missing so this commit removes both these methods.
Only the bulk delete logic remains. Where the bulk delete was derived from single deletes,
the single delete code was inlined into the bulk delete method.
Where single delete was used in tests it was replaced by bulk deleting.
Batch deletes get a response for every delete request, not just those that actually hit an existing blob.
The fact that we only responded for existing blobs leads to a degenerate response that throws a parse exception if a batch delete only contains non-existant blobs.
Today settings can declare dependencies on another setting. This
declaration is implemented so that if the declared setting is not set
when the declaring setting is, settings validation fails. Yet, in some
cases we want not only that the setting is set, but that it also has a
specific value. For example, with the monitoring exporter settings, if
xpack.monitoring.exporters.my_exporter.host is set, we not only want
that xpack.monitoring.exporters.my_exporter.type is set, but that it is
also set to local. This commit extends the settings infrastructure so
that this declaration is possible. The use of this in the monitoring
exporter settings will be implemented in a follow-up.
In order to cache script results in the query shard cache, we need to
check if scripts are deterministic. This change adds a default method
to the script factories, `isResultDeterministic() -> false` which is
used by the `QueryShardContext`.
Script results were never cached and that does not change here. Future
changes will implement this method based on whether the results of the
scripts are deterministic or not and therefore cacheable.
Refs: #49466
Our docs specifically mention that CBOR is supported when ingesting attachments. However this is not tested anywhere.
This adds a test, that uses specifically CBOR format in its IndexRequest and another one that behaves like CBOR in the ingest attachment unit tests.
Adds `GET /_script_language` to support Kibana dynamic scripting
language selection.
Response contains whether `inline` and/or `stored` scripts are
enabled as determined by the `script.allowed_types` settings.
For each scripting language registered, such as `painless`,
`expression`, `mustache` or custom, available contexts for the language
are included as determined by the `script.allowed_contexts` setting.
Response format:
```
{
"types_allowed": [
"inline",
"stored"
],
"language_contexts": [
{
"language": "expression",
"contexts": [
"aggregation_selector",
"aggs"
...
]
},
{
"language": "painless",
"contexts": [
"aggregation_selector",
"aggs",
"aggs_combine",
...
]
}
...
]
}
```
Fixes: #49463
* Copying the request is not necessary here. We can simply release it once the response has been generated and a lot of `Unpooled` allocations that way
* Relates #32228
* I think the issue that preventet that PR that PR from being merged was solved by #39634 that moved the bulk index marker search to ByteBuf bulk access so the composite buffer shouldn't require many additional bounds checks (I'd argue the bounds checks we add, we save when copying the composite buffer)
* I couldn't neccessarily reproduce much of a speedup from this change, but I could reproduce a very measureable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
This is a preliminary to #49060.
It does not introduce any substantial behavior change to how the blob store repository
operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency
by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation
(create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple.
This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the
repository operates in #49060
The test was slightly modified with #49166, the two test documents in
`testNormalization` look like they should mirror the document id in the "id"
field in order for it to work as a tie breaker.
Closes#49654
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.
Closes#43599
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
The annotated text mapper has a field type that currently extends StringFieldType,
which means that all the positional-related query factory methods need to be copied
over from TextFieldType. In addition, MappedFieldType.intervals() hasn't been
overridden, so you can't use intervals queries with annotated text - a major drawback,
since one of the purposes of annotated text is to be able to run positional queries against
annotations.
This commit changes the annotated text field type to extend TextFieldType instead,
adding tests to ensure that position queries work correctly.
Closes#49289
Same as #49518 pretty much but for GCS.
Fixing a few more spots where input stream can get closed
without being fully drained and adding assertions to make sure
it's always drained.
Moved the no-close stream wrapper to production code utilities since
there's a number of spots in production code where it's also useful
(will reuse it there in a follow-up).
- Don't install ingest-attachment and don't run the related docs
tests, since ingest-attachment is not supported in FIPS 140 JVMs
- Move copying extra jars and extra config files earlier on in the
node configuration so that elasticsearch-keystore and
elasticsearch-plugin that run before the node starts have all files
(policy, properties, jars) available.
- BCJSSE needs a certificate to be explicitly added in a keystore
as a trustedcerty entry, it's not enough for it to be in privatekeyentry
for it to be trusted
- Set the value for BuildParams.inFipsJvm configuration time
Fixing a few small issues found in this code:
1. We weren't reading the request headers but the response headers when checking for blob existence in the mocked single upload path
2. Error code can never be `null` removed the dead code that resulted
3. In the logging wrapper we weren't checking for `Throwable` so any failing assertions in the http mock would not show up since they
run on a thread managed by the mock http server
This commit fixes the server side logic of "List Objects" operations
of Azure and S3 fixtures. Until today, the fixtures were returning a "
flat" view of stored objects and were not correctly handling the
delimiter parameter. This causes some objects listing to be wrongly
interpreted by the snapshot deletion logic in Elasticsearch which
relies on the ability to list child containers of BlobContainer (#42653)
to correctly delete stale indices.
As a consequence, the blobs were not correctly deleted from the
emulated storage service and stayed in heap until they got garbage
collected, causing CI failures like #48978.
This commit fixes the server side logic of Azure and S3 fixture when
listing objects so that it now return correct common blob prefixes as
expected by the snapshot deletion process. It also adds an after-test
check to ensure that tests leave the repository empty (besides the
root index files).
Closes#48978
Similarly to what has been done for Azure (#48636) and GCS (#48762),
this committ removes the existing Ant fixture that emulates a S3 storage
service in favor of multiple docker-compose based fixtures.
The goals here are multiple: be able to reuse a s3-fixture outside of the
repository-s3 plugin; allow parallel execution of integration tests; removes
the existing AmazonS3Fixture that has evolved in a weird beast in
dedicated, more maintainable fixtures.
The server side logic that emulates S3 mostly comes from the latest
HttpHandler made for S3 blob store repository tests, with additional
features extracted from the (now removed) AmazonS3Fixture:
authentication checks, session token checks and improved response
errors. Chunked upload request support for S3 object has been added
too.
The server side logic of all tests now reside in a single S3HttpHandler class.
Whereas AmazonS3Fixture contained logic for basic tests, session token
tests, EC2 tests or ECS tests, the S3 fixtures are now dedicated to each
kind of test. Fixtures are inheriting from each other, making things easier
to maintain.
External API requests can no longer include types, so we have no need to pass this
information along in internal PutMappingClusterStateUpdateRequest objects, or on
PutMappingRequests used by the internal node client.
Relates to #41059
Closes#48724. Update `.editorconfig` to make the Java settings the default
for all files, and then apply a 2-space indent to all `*.gradle` files.
Then reformat all the files.
Similarly to what has be done for Azure in #48636, this commit
adds a new :test:fixtures:gcs-fixture project which provides two
docker-compose based fixtures that emulate a Google Cloud
Storage service.
Some code has been extracted from existing tests and placed
into this new project so that it can be easily reused in other
projects.
We're not doing any blocking connects in production code anymore
so we can move these helpers to test code only.
Also, we were only tracking the proper closing of blockingly opened
connections on the mock transport service but didn't check those
created via the non-blocking API which is fixed here too.
This commit introduces a consistent, and type-safe manner for handling
global build parameters through out our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces and explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.
Closes#42042
This commit adds a new :test:fixtures:azure-fixture project which
provides a docker-compose based container that runs a AzureHttpFixture
Java class that emulates an Azure Storage service.
The logic to emulate the service is extracted from existing tests and
placed in AzureHttpHandler into the new project so that it can be
easily reused. The :plugins:repository-azure project is an example
of such utilization.
The AzureHttpFixture fixture is just a wrapper around AzureHttpHandler
and is now executed within the docker container.
The :plugins:repository-azure:qa:microsoft-azure project uses the new
test fixture and the existing AzureStorageFixture has been removed.
In repository integration tests, we drain the HTTP request body before
returning a response. Before this change this operation was done using
Streams.readFully() which uses a 8kb buffer to read the input stream, it
now uses a 1kb for the same operation. This should reduce the allocations
made during the tests and speed them up a bit on CI.
Co-authored-by: Armin Braun <me@obrown.io>
In #47176 we changed the internal HTTP server that emulates
the Azure Storage service so that it includes a response body
for injected errors. This fixed most of the issues reported in
#47120 but sadly I missed to map one error to its Azure
equivalent, and it triggered some CI failures today.
Closes#47120
As types are no longer used in index requests, we can remove the type parameter
from `prepareIndex` methods in the `Client` interface. However, just changing the signature
of `prepareIndex(index, type, id)` to `prepareIndex(index, id)` risks confusion when
upgrading with the previous (now removed) `prepareIndex(index, type)` method -
just changing the dependency version of java code would end up silently changing the
semantics of the method call. Instead we should just remove this method entirely, and
replace it by calling `prepareIndex(index).setId(id)`
ESIntegTestCase has a number of sugar index methods that prepare and execute
index requests. Now that index requests no longer use types, we can remove the
type parameter from each of these.
To prevent issues when re-compiling against the test framework, the method
`index(String index, String id, Object... source)` is renamed to `indexDoc`
so that e.g. `index(index, type, id, field1, field2, field3)` does not get re-interpreted
as an index request with an id of `type`.
Types are no longer used in IndexRequest, DeleteRequest or UpdateRequest;
this commit removes them from the prepareX(index, type) methods on Client, as well as removing setType() and deprecated constructors on XRequestBuilder objects.
Note that Client.prepareIndex(index, type, id) is not affected by this PR and will be removed
in a followup.
Relates to #41059
This commit changes the test so that each node use a specific
service account and private key. It also changes how unique
request ids are generated for refresh token request using the
token itself, so that error count will be specific per node (each
node should execute a single refresh token request as tokens
are valid for 1 hour).
BytesReference is currently an abstract class which is extended by
various implementations. This makes it very difficult to use the
delegation pattern. The implication of this is that our releasable
BytesReference is a PagedBytesReference type and cannot be used as a
generic releasable bytes reference that delegates to any reference type.
This commit makes BytesReference an interface and introduces an
AbstractBytesReference for common functionality.
This commit removes the types filter from the GetMappings API, which is no longer
useful seeing as we can only have a single mapping type per index. It also changes
the structure of GetMappingsResponse and GetIndexResponse to remove the extra
nesting of mappings below the no-longer-relevant type name, and removes the types
methods from the equivalent request classes.
Relates to #41059
We were incorrectly handling `IOExceptions` thrown by
the `InputStream` side of the upload operation, resulting
in a `ClassCastException` as we expected to never get
`IOException` from the Azure SDK code but we do in practice.
This PR also sets an assertion on `markSupported` for the
streams used by the SDK as adding the test for this scenario
revealed that the SDK client would retry uploads for
non-mark-supporting streams on `IOException` in the `InputStream`.
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.