Commit Graph

3207 Commits

Author SHA1 Message Date
Rory Hunter 123a06f588
Enable compiler warnings in x-pack security (#75473)
Part of #40366.
2021-07-21 19:58:39 +01:00
Rory Hunter cd4446732e
Re-enable compiler warnings in :test:framework (#75449)
Part of #40366.
2021-07-21 13:47:50 +01:00
Benjamin Trent 9b88db7f4c
Significant terms test refactor for extendability (#75452)
The original PR #75264 made some test mistakes

NXY Significant term heuristics have additional values that need to be set when testing
basicScore properties.

Additionally, the previous refactor kept the abstract test class in a package that other plugins
don't have access to.

closes #75442, #75561
2021-07-21 08:12:55 -04:00
Martijn van Groningen 3dde09a7b4
Add filter support to data stream aliases (#74784)
This allows specifying a query as filter on data stream alias,
which will then always be applied when searching via this alias.

Relates #66163
2021-07-20 11:21:27 +02:00
Alan Woodward cf575f4766
Make NestedObjectMapper its own class (#74410)
Nested objects are implemented via a Nested class directly on object mappers,
even though nested and non-nested objects have quite different semantics. In
addition, most call-sites that need to get an object mapper in fact need a nested
object mapper. To make it clearer that nested and object mappers are different
beasts with different implementations and different requirements, we should
split them into different classes.
2021-07-19 09:44:48 +01:00
Armin Braun 3bd2672813
Fix Snapshot Out of Order Finalization Repo Corruption (#75362)
* Fix up shard generations in `SnapshotsInProgress` during snapshot finalization (don't do it earlier because it's a really heavy computation and we have a ton of places where it would have to run).
* Adjust finalization queue to be able to work with changing snapshot entries after they've been enqueued for finalization
* Still one remaining bug left after this (see TODO about leaking generations) that I don't feel confident in fixing for `7.13.4` due to the complexity of a fix and how minor the blob leak is (+ it's cleaned up just fine during snapshot deletes)

Closes #75336
2021-07-16 14:51:12 +02:00
Yannick Welsch db814f403b
Track Lucene field usage (#74227)
Adds a field usage API that reports shard-level statistics about which Lucene fields have been accessed, and which
parts of the Lucene data structures have been accessed.

Field usage statistics are automatically captured when queries are running on a cluster. A shard-level search request
that accesses a given field, even if multiple times during that request, is counted as a single use.
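
A minimal sketch of the counting rule described above, not the Elasticsearch implementation. The class and method names are hypothetical and exist only for illustration:

```python
from collections import Counter


class FieldUsageTracker:
    """Hypothetical per-shard tracker illustrating the counting rule."""

    def __init__(self):
        # field name -> number of shard-level requests that touched it
        self.usage = Counter()

    def record_request(self, accessed_fields):
        # Deduplicate within one shard-level request: repeated accesses to
        # the same field during that request count as a single use.
        for field in set(accessed_fields):
            self.usage[field] += 1


tracker = FieldUsageTracker()
tracker.record_request(["title", "title", "body"])  # "title" counted once
tracker.record_request(["title"])
```

After these two requests the tracker would hold a count of 2 for `title` and 1 for `body`.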
2021-07-14 13:21:11 +02:00
David Turner f8185e5702
Fix testPreferCopyWithHighestMatchingOperations (#75170)
In #74081 this test failed with a `NoNodeAvailableException` within the
`indexRandom()` call immediately after stopping a node. This could
happen if the `node-left` event wasn't fully applied before calling
`indexRandom()` with an empty list of docs but with `forceRefresh` set
to true: since there are no docs, the replica wouldn't be marked as stale,
so the final refresh would detect the missing node, failing its
`assertNoFailures` wrapper.

This commit avoids calling `indexRandom()` with no docs in this
location. It also enhances `assertNoFailures` to report the details of
each failure, rather than just the summary.

Closes #74081
2021-07-13 16:25:26 +01:00
Ignacio Vera caa7e1678b
Fix AbstractSimpleTransportTestCase#testFailToSend (#75211)
Adjust test after changing how exception is wrapped.
2021-07-12 10:37:21 +02:00
Ignacio Vera b9b1af12fb
Mute AbstractSimpleTransportTestCase#testToFail (#75210)
2021-07-12 08:23:49 +02:00
Ignacio Vera ebe8a9b58f
Alias field does not work with geo_shape query (#74895)
Fixes the resolution of the field name in the geo_shape query.
2021-07-08 07:19:38 +02:00
Nik Everett 23f95246c5
Update test assertion library for better errors (#74863)
This updates the `mapmatcher` test assertion library that we use to pick
up a fix for error messages when you expect a `Map` or a `List` but get
*nothing*. Now it says something sensible like:

```
  key: expected a map but was <missing>
```

instead of the confusing

```
  key: expected a map containing
    <the stuff the map was expected to contain> but was missing
```

Relates to #74721

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-07-07 10:22:38 -04:00
Tim Brooks d3eb540fe4
Ensure replica requests are marked as index_data (#75008)
This is related to #73497. Currently replica requests are wrapped in a
concrete replica shard request. This leads to the transport layer not
properly identifying them as replica index_data requests and not
compressing them properly. This commit resolves this bug.
2021-07-06 15:31:47 -06:00
Luca Cavanna c6641bf00c
Rename ParseContext to DocumentParserContext (#74963)
ParseContext is used to parse documents. It was easily confused with ParserContext (now renamed to MappingParserContext) which is instead used to parse mappings.

To remove any confusion, this commit renames ParseContext to DocumentParserContext and adapts its subclasses accordingly.
2021-07-06 09:15:59 -04:00
Luca Cavanna 2700cc802c
Simplify ParseContext (#74831)
We currently have one ParseContext class, which is used to parse incoming documents, not to be confused with the former ParserContext (now renamed to MappingParserContext) which is instead used to parse mappings.

There are a few implementations of ParseContext, but mostly the InternalParseContext one is used. There is also a FilterParseContext that delegates to a given context for all methods except the ones it explicitly overrides.

This commit attempts to simplify ParseContext by extracting its InternalParseContext implementation and moving it to where it's used, within DocumentParser, and making it private, so that the super-class can be used. This makes it possible to hide some implementation details that only InternalParseContext knows about on nested documents and the way they are stored in Lucene.

Also, we are introducing separate test implementations in place of reusing InternalParseContext in tests too.

Additionally FilterParseContext can be greatly simplified by relying on a copy constructor, that makes it so that it does not have to override every single method to delegate to the provided context, at least for the behaviour that can't be overridden (final methods).
2021-07-06 13:26:10 +02:00
Armin Braun d41dfc9903
Upgrade GCS SDK to 1.117.1 (#74938)
We're behind the upgrade schedule by quite a bit here, upgrading to the latest version
and adjusting our test fixture accordingly.
2021-07-05 21:57:30 +02:00
David Turner 0254dc3503
Warn on possible master service starvation (#74820)
Today the master service processes pending tasks in priority order. If
high-priority tasks arrive too frequently then low-priority tasks are
starved of access to the master service and are not executed. This can
cause certain tasks to appear to be stuck due to apparently-unrelated
overloads elsewhere.

With this commit we measure the interval between times when the pending
task queue is empty; if this interval exceeds a configurable threshold
then we log a warning.
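
The measurement described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the master service code: the class name, method name, and explicit `now` argument are all hypothetical.

```python
import logging

logger = logging.getLogger("master_service_sketch")


class StarvationWatcher:
    """Hypothetical helper: tracks how long the pending queue has been nonempty."""

    def __init__(self, warn_threshold_seconds):
        self.warn_threshold_seconds = warn_threshold_seconds
        self.last_empty_at = None

    def on_queue_checked(self, pending_task_count, now):
        """Call after each processed task; returns True if a warning was logged."""
        if self.last_empty_at is None:
            # Assumption: treat the first observation as an "empty" point.
            self.last_empty_at = now
        if pending_task_count == 0:
            self.last_empty_at = now
            return False
        nonempty_for = now - self.last_empty_at
        if nonempty_for > self.warn_threshold_seconds:
            # A real implementation would likely rate-limit this warning.
            logger.warning(
                "pending task queue has been nonempty for %.1fs; "
                "low-priority tasks may be starved",
                nonempty_for,
            )
            return True
        return False
```

Passing the clock value in explicitly keeps the sketch deterministic, which is convenient when testing the threshold logic.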
2021-07-05 14:24:21 +01:00
David Turner 588fe5c5f1
Move DeterministicTaskQueue to appropriate package (#74901)
`o.e.c.coordination.DeterministicTaskQueue` is today used in various
places, not just for tests of the cluster coordination subsystem. It's
also a bit of a pain to construct, requiring a nonempty `Settings` and a
`Random` even though essentially everyone passes in the same values.
This commit moves this class to the more generic `o.e.c.util.concurrent`
package, adds some Javadoc, and makes it easier to construct.
2021-07-05 12:33:02 +01:00
Ignacio Vera 0f30c79e4b
Remove legacy geo code from AbstractGeometryQueryBuilder classes (#74741)
Removes references to the legacy ShapeParser and ShapeBuilder in AbstractGeometryQueryBuilder classes
in favour of Geometry and GeometryParser.
2021-07-05 07:31:55 +02:00
Yannick Welsch 90e663b344
Always use DirectoryReader for realtime get from translog (#74722)
Reading from translog during a realtime get requires special handling in some higher level components, e.g.
ShardGetService, where we're doing a bunch of tricks to extract other stored fields from the source. Another issue with
the current approach relates to #74227 where we introduce a new "field usage tracking" directory wrapper that's always
applied, and we want to make sure that we can still quickly do realtime gets from translog without creating an in-memory
index of the document, even when this directory wrapper exists.

This PR introduces a directory reader that contains a single translog indexing operation. This can be used during a
realtime get to access documents that haven't been refreshed yet. In the normal case, all information relevant to resolve
the realtime get is mocked out to provide fast access to _id and _source. When more values are requested (e.g.
access to other stored fields), this reader will index the document into an in-memory Lucene segment that is
created on-demand.

Relates #64504
2021-07-01 12:53:14 +02:00
Armin Braun eca57352b5
Remove Outdated GetSnapshots BwC Handling in Rest Tests (#74734)
The group-by-repository response format is gone as a result of #74451
=> cleaning up its last remnants from REST tests.
2021-06-30 13:12:39 +02:00
Tim Brooks 293d490ded
Add additional transport compression options (#74587)
This commit is related to #73497. It adds two new settings. The first setting
is transport.compression_scheme. This setting allows the user to
configure LZ4 or DEFLATE as the transport compression. Additionally, it
modifies transport.compress to support the value indexing_data. When
this setting is set to indexing_data only messages which are primarily
composed of raw source data will be compressed. These are bulk, operations
recovery, and shard changes messages.
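
As a rough illustration of how the three values of `transport.compress` interact, here is a sketch of the per-message decision. The function name and the `is_indexing_data` flag are hypothetical; `true` and `false` are assumed to be the values the setting accepted before this change.

```python
def should_compress(compress_setting, is_indexing_data):
    """Decide whether one transport message is compressed.

    compress_setting: "true", "false", or "indexing_data".
    is_indexing_data: True for messages made up mostly of raw source data
    (bulk, operations recovery, shard changes, per the commit text).
    """
    if compress_setting == "true":
        return True
    if compress_setting == "indexing_data":
        return is_indexing_data
    if compress_setting == "false":
        return False
    raise ValueError("unsupported value: " + repr(compress_setting))
```

The compression scheme (LZ4 or DEFLATE) is a separate choice and only matters once this decision returns true.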
2021-06-29 12:14:47 -06:00
Armin Braun 8947c1e980
Save Memory on Large Repository Metadata Blob Writes (#74313)
This PR adds a new API for doing streaming serialization writes to a repository to enable repository metadata of arbitrary size and at bounded memory during writing. 
The existing write-APIs require knowledge of the eventual blob size beforehand. This forced us to materialize the serialized blob in memory before writing, costing a lot of memory in case of e.g. very large `RepositoryData` (and limiting us to `2G` max blob size).
With this PR the requirement to fully materialize the serialized metadata goes away and the memory overhead becomes completely bounded by the outbound buffer size of the repository implementation. 

As we move to larger repositories this makes master node stability a lot more predictable since writing out `RepositoryData` does not take as much memory any longer (same applies to shard level metadata), enables aggregating multiple metadata blobs into a single larger blob without massive overhead and removes the 2G size limit on `RepositoryData`.
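
The idea of writing without knowing the final blob size can be sketched as below. This is not the repository API added by the PR; `write_streaming` and `upload_part` are hypothetical names, and the tiny buffer size is chosen only to make the behaviour visible.

```python
def write_streaming(records, upload_part, buffer_size=8):
    """Serialize records to a sink without materializing the whole blob.

    records: any iterable of bytes, possibly lazy and of unknown total size.
    upload_part: callable that receives each full buffer (hypothetical sink).
    Memory use is bounded by buffer_size plus the largest single record.
    """
    buffer = bytearray()
    for record in records:
        buffer += record
        # Flush full buffers as soon as they are available.
        while len(buffer) >= buffer_size:
            upload_part(bytes(buffer[:buffer_size]))
            del buffer[:buffer_size]
    if buffer:
        upload_part(bytes(buffer))  # final, possibly short, part


parts = []
write_streaming((chunk for chunk in [b"abc", b"defgh", b"ij"]), parts.append, buffer_size=4)
```

Because the input is consumed lazily, the writer never needs the total length up front, which is the property the existing write APIs lacked.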
2021-06-29 11:29:55 +02:00
Ignacio Vera 21794f82e0
remove dependency on legacy geo code from test (#74609)
ShapeBuilders are moved to use Geometry classes.
2021-06-29 07:37:34 +02:00
Christos Soulios df941367df
Add dimension mapping parameter (#74450)
Added the dimension parameter to the following field types:

    keyword
    ip
    Numeric field types (integer, long, byte, short)

The dimension parameter is of type boolean (default: false) and is used 
to mark that a field is a time series dimension field.

Relates to #74014
2021-06-24 20:16:27 +03:00
Armin Braun cbf48e0633
Flatten Get Snapshots Response (#74451)
This PR returns the get snapshots API to the 7.x format (and transport client behavior) and enhances it for requests that ask for multiple repositories.
The changes for requests that target multiple repositories are:
* Add `repository` field to `SnapshotInfo` and REST response
* Add `failures` map alongside `snapshots` list instead of returning just an exception response as done for single repo requests
* Pagination now works across repositories instead of being per repository for multi-repository requests

closes #69108
closes #43462
2021-06-24 16:58:33 +02:00
Luca Cavanna 7cedc3ec3a
Make Document a top-level class (#74472)
There is no reason for Document to be an inner class of ParseContext, especially as it is public and accessed directly from many different places.

This commit takes it out to its own top-level class file, which has the advantage of simplifying ParseContext which could use some love too.
2021-06-24 10:56:30 +02:00
Przemyslaw Gomulka c6c662def4
RestController not using thread context directly from thread pool (#74293)
At the moment thread context is passed via dispatchRequest but in some
places thread context is fetched directly from thread pool.
This is not a problem in production, because thread pool is initialized
with the same thread context as the one passed to dispatchRequest via
AbstractHttpServerTransport.
It might be harder to understand though and might cause problems in
testing in smaller units.
2021-06-23 15:01:14 +02:00
Alan Woodward 4b069c217e
Remove references to SpanBoostQuery (#74432)
SpanBoostQuery will be removed in lucene 9.0. It is currently a no-op anyway,
unless it appears at the top level of a span query tree, in which case it is
equivalent to a standard BoostQuery. This commit removes references to
SpanBoostQuery from elasticsearch SpanQueryBuilders, replacing it with
BoostQuery where appropriate.

It also adds a new, breaking, check to field_masking_span to ensure that
its inner query does not have a boost set on it, bringing it into line with all
other span queries that wrap inner spans.
2021-06-23 13:15:54 +01:00
Luca Cavanna 0d0e403258
Move and rename ParserContext (#74402)
ParserContext is an inner class of Mapper.TypeParser but is used outside of the context of parsing mappers, for instance also to parse runtime fields. Its purpose is to be used to parse mappings in general, and its name is confusing as it differs ever so slightly from ParseContext which is used for parsing incoming documents.

This commit moves ParserContext to be a top-level class, and renames it to MappingParserContext.
2021-06-23 09:28:56 +02:00
Martijn van Groningen 4d84f11ef3
Add meta field to deprecation issue definition. (#74085)
This will allow components to add custom metadata to deprecation issues.
This makes extracting additional details about deprecations more robust,
otherwise these details need to be parsed from the deprecation message field.

Adjusted the ml model snapshot deprecation to use custom metadata, and
included the job id and snapshot id as custom metadata.

Closes #73089
2021-06-22 12:05:16 +02:00
Armin Braun 269718ff10
Enhance Tests around SnapshotInfo UserMetadata (#74362)
We barely test the correct handling of user metadata directly.
With upcoming changes to how `SnapshotInfo` is stored it would be nice
to have better test coverage. This PR adds randomized coverage of serializing
user metadata to a large number of tests that all use the shared infrastructure
that is adjusted here.
2021-06-21 19:41:01 +02:00
Nik Everett 8904ffe2be
Add extra profiling information to terms agg (#73636)
I was helping some folks debug an issue with the terms agg and noticed
that we didn't always have the `total_buckets` debug information. I also
noticed that we can't tell how many buckets we build, so I added that
too as `built_buckets`.

Finally, I noticed that when we're using segment ords we count segments
without any values as "multi-valued". We can do better there and count
them as no-valued. That will, mostly, just improve the profiling. When
we collect from global ords we have no way to tell how many values are
on the segment so segments without any values will, sadly, in this case
still be miscounted as multi-valued.
2021-06-21 10:10:41 -04:00
Armin Braun c1e9590a69
Pagination and Sorting for Get Snapshots API (#73952)
Pagination and sorting for the get snapshots API, built on top of the current implementation to enable work that needs this API for testing. A follow-up will leverage the changes to make things more efficient via pagination.

Relates https://github.com/elastic/elasticsearch/pull/73570 which does part of the under-the-hood changes required to efficiently implement this API on the repository layer.
2021-06-17 09:00:11 +02:00
Rory Hunter a5d2251064
Order imports when reformatting (#74059)
Change the formatter config to sort / order imports, and reformat the
codebase. We already had a config file for Eclipse users, so Spotless now
uses that.

The "Eclipse Code Formatter" plugin ought to be able to use this file as
well for import ordering, but in my experiments the results were poor.
Instead, use IntelliJ's `.editorconfig` support to configure import
ordering.

I've also added a config file for the formatter plugin.

Other changes:
   * I've quietly enabled the `toggleOnOff` option for Spotless. It was
     already possible to disable formatting for sections using the markers
     for docs snippets, so enabling this option just accepts this reality
     and makes it possible via `formatter:off` and `formatter:on` without
     the restrictions around line length. It should still only be used as
     a very last resort and with good reason.
   * I've removed mention of the `paddedCell` option from the contributing
     guide, since I haven't had to use that option for a very long time. I
     moved the docs to the spotless config.
2021-06-16 09:22:22 +01:00
Armin Braun dbb626abbb
Add Bulk Fetch SnapshotInfo API to Repository (#73570)
This PR refactors the `Repository` API for fetching `SnapshotInfo` to enable implementations to optimize for bulk fetching multiple `SnapshotInfo` at once. This is a requirement for making use of a more efficient repository format that does not require loading individual blobs per snapshot to fetch a snapshot listing. Also, by enabling consuming `SnapshotInfo` as they are fetched on the snapshot meta thread this allows for more memory-efficient snapshot listing.
Also, this commit makes use of the new API to make the snapshot status API run a little more parallel if fetching multiple snapshots (though there are additional improvements that are possible and useful here as far as fetching shard level metadata in parallel).
2021-06-14 19:17:47 +02:00
Alan Woodward 6d52cd6b2a
Revert "Make NestedObjectMapper it's own class (#73058)" (#74069)
This commit contains a bug when merging deeply-nested mappers which
was causing errors downstream. Reverting to unblock while the bug
is fixed.

This reverts commit 29ee4202a2.
2021-06-14 15:33:42 +01:00
David Turner 68ae79240e
Log at DEBUG only on disconnect during cancellation (#74042)
If a `NodeDisconnectedException` happens when sending a ban for a task
then today we log a message at `INFO` or `WARN` indicating that the ban
failed, but we don't indicate why. The message also uses a default
`toString()` for an inner class which is unhelpful.

Ban failures during disconnections are benign and somewhat expected, and
task cancellation respects disconnections anyway (#65443). There's not
much the user can do about these messages either, and they can be
confusing and draw attention away from the real problem.

With this commit we log the failure messages at `DEBUG` on
disconnections, and include the exception details. We also include the
exception message for other kinds of failures, and we fix up a few cases
where a useless default `toString()` implementation was used in log
messages.

Slightly relates #72968 in that these messages tend to obscure a
connectivity issue.
2021-06-14 06:58:58 +01:00
Armin Braun 4fa99f58e2
Remove S3 Eventual Consistency Related Tests (#74015)
S3 list, update etc. are consistent now => no need to have these tests around any longer.
2021-06-10 20:50:28 +02:00
Armin Braun 5249540a5c
Simplify Blobstore Consistency Check in Tests (#73992)
With work to make repo APIs more async incoming in #73570
we need a non-blocking way to run this check. This adds that async
check and removes the need to manually pass executors around as well.
2021-06-10 16:12:26 +02:00
David Turner a2c1d31d82
Fix split package with voting-only nodes (#73965)
Moves the implementation of voting-only nodes to the
`o.e.c.c.votingonly` package.
2021-06-10 13:47:34 +01:00
Tanguy Leroux 559c4e6ef4
Apply spotless formatting to more sub-projects (#73989)
2021-06-10 11:24:44 +02:00
Armin Braun 38fa7b72f6
Dry up HTTP Smoke Tests around Snapshots (#73962)
Drying up a few spots of code duplication with these tests. Partly to
reduce the size of PR #73952 that makes use of the smoke test infrastructure.
2021-06-10 09:43:46 +02:00
Przemyslaw Gomulka 4598b46e7f
[Rest Api Compatibility] Typed endpoint for multiget api (#73878)
Retrofits the typed API for the multi-get API, which was removed in #46587

relates #51816
2021-06-09 15:19:56 +02:00
Ryan Ernst ab1a2e4a84
Add precommit task for detecting split packages (#73784)
Modularization of the JDK has been ongoing for several years. Recently
in Java 16 the JDK began enforcing module boundaries by default. While
Elasticsearch does not yet use the module system directly, there are
some side effects even for those projects not modularized (eg #73517).
Before we can even begin to think about how to modularize, we must
Prepare The Way by enforcing packages only exist in a single jar file,
since the module system does not allow packages to coexist in multiple
modules.

This commit adds a precommit check to the build which detects split
packages. The expectation is that we will add the existing split
packages to the ignore list so that any new classes will not exacerbate
the problem, and the work to clean up these split packages can be
parallelized.

relates #73525
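
A split-package check of the kind described above can be sketched in a few lines. This is not the precommit task itself; the function name, the input shape, and the jar and class names in the usage example are illustrative assumptions.

```python
from collections import defaultdict


def find_split_packages(jar_contents, ignored=frozenset()):
    """Return packages that appear in more than one jar.

    jar_contents: mapping of jar name -> iterable of fully qualified class names.
    ignored: packages on the ignore list (known, pre-existing splits).
    """
    jars_by_package = defaultdict(set)
    for jar, class_names in jar_contents.items():
        for class_name in class_names:
            # Everything before the last dot is the package name.
            package, _, _ = class_name.rpartition(".")
            jars_by_package[package].add(jar)
    return {
        package: sorted(jars)
        for package, jars in jars_by_package.items()
        if len(jars) > 1 and package not in ignored
    }


splits = find_split_packages({
    "server.jar": ["org.elasticsearch.bootstrap.Elasticsearch", "org.elasticsearch.node.Node"],
    "core.jar": ["org.elasticsearch.bootstrap.JarHell"],
})
```

Here `org.elasticsearch.bootstrap` would be reported because it is present in both jars, while passing it in `ignored` would suppress the report.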
2021-06-08 15:04:23 -07:00
Ryan Ernst 63012c8a40
Move ParseField to o.e.c.xcontent (#73923)
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.

relates #73784
2021-06-08 13:32:14 -07:00
Ryan Ernst 68817d7ca2
Rename o.e.common in libs/core to o.e.core (#73909)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
2021-06-08 09:53:28 -07:00
Luca Cavanna 849b66d441
Dismax query tests to not rely on order (#73844)
This commit updates some of the multi match and query string unit tests to not rely on order when checking dismax query subclauses.

This is related to how fields are returned from FieldTypeLookup#getMatchingFieldsNames: it returns a Set<String>, so we can't rely on its ordering.
2021-06-08 09:39:00 +02:00
Ryan Ernst 64054de1ac
Rename bootstrap package in core jar (#73788)
The org.elasticsearch.bootstrap package exists in server with classes
for starting up Elasticsearch. The elasticsearch-core jar has a handful
of classes that were split out from there, namely java version parsing
and jarhell. This commit moves those classes to a new
org.elasticsearch.jdk package so as to not split the server owned
bootstrap package.

relates #73784
2021-06-07 08:14:44 -07:00
Ryan Ernst e81323e147
Move plugin classloader to its own package (#73786)
The plugin classloader exists in its own jar file for legacy reasons,
and while it should go away in the future, it currently duplicates the
package name of the rest of the plugin classes. This commit moves the
classloader into its own unique package.

relates #73784
2021-06-07 08:12:24 -07:00
Alan Woodward 29ee4202a2
Make NestedObjectMapper it's own class (#73058)
Nested objects are implemented via a Nested class directly on object mappers,
even though nested and non-nested objects have quite different semantics. In
addition, most call-sites that need to get an object mapper in fact need a nested
object mapper. To make it clearer that nested and object mappers are different
beasts with different implementations and different requirements, we should
split them into different classes.
2021-06-07 14:36:44 +01:00
Tanguy Leroux 0061823c30
Ignore 404-Not Found exceptions when cleaning up resources after tests (#73753)
We're doing some clean up logic to delete indices, data streams, 
auto-follow patterns or searchable snapshot indices in some test 
classes after a test case is executed. Today we either fail or log 
a warning if the clean up failed but I think we should simply 
ignore the 404 - Not Found response exception, like we do in 
other places for regular indices.

Note that this change applies only:
- when cleaning up searchable snapshots indices in ESRestTestCase
- when cleaning up indices, data streams and auto-follow pattern in AutoFollowIT
2021-06-04 12:45:51 +02:00
Przemyslaw Gomulka aba2282511
Change year max digits for strict_date_optional_time and date_optional_time (#73034)
We changed the default joda behaviour in strict_date_optional_time to
max 4 digits in a year. Java.time implementation should behave the same way.
At the same time, date_optional_time should have 9 digits for the year part.

closes #52396
closes #72191
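
To make the two limits concrete, here is a simplified sketch that checks only the length of the year part. It is not the java.time formatter; the function and constant names are hypothetical, and the rest of the date is deliberately not validated.

```python
STRICT_MAX_YEAR_DIGITS = 4   # strict_date_optional_time, per the commit text
LENIENT_MAX_YEAR_DIGITS = 9  # date_optional_time, per the commit text


def year_within_limit(date_text, max_year_digits):
    """Check that the leading year has at most max_year_digits digits."""
    # Drop a leading sign, then take everything before the first "-" or "T".
    year = date_text.lstrip("-").split("-", 1)[0].split("T", 1)[0]
    return year.isdigit() and len(year) <= max_year_digits
```

With these limits a five-digit year such as `12345-01-01` would be rejected under the strict format but accepted under the lenient one.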
2021-06-04 09:35:07 +02:00
William Brafford 1c295a92d8
Add threadpool for critical operations on system indices (#72625)
* Add new thread pool for critical operations
* Split critical thread pool into read and write
* Add POJO to hold thread pool names
* Add tests for critical thread pools
* Add thread pools to data streams
* Update settings for security plugin
* Retrieve ExecutorSelector from SystemIndices where possible
* Use a singleton ExecutorSelector
2021-06-03 12:07:37 -04:00
Luca Cavanna 05ca9cf876
Remove getMatchingFieldTypes method (#73655)
FieldTypeLookup and MappingLookup expose the getMatchingFieldTypes method to look up matching field type by a string pattern. We have migrated ExistsQueryBuilder to instead rely on getMatchingFieldNames, hence we can go ahead and remove the remaining usages and the method itself.

The remaining usages are to find specific field types from the mappings, specifically to eagerly load global ordinals and for the join field type. These are operations that are performed only once when loading the mappings, and may be refactored to work differently in the future. For now, we remove getMatchingFieldTypes and, for the two mentioned scenarios, call getMatchingFieldNames(*) and then getFieldType for each of the returned field names. This is a bit wasteful but performance can be sacrificed for these scenarios in favour of less code to maintain.
2021-06-03 10:01:22 +02:00
Mark Vieira 0cdb748242
Improve error message when rest api specs are missing from classpath (#73640) 2021-06-02 09:05:14 -07:00
Przemyslaw Gomulka 6d34a38cb1
Fix EnsureNoWarning assertion (#73647)
The EnsureNoWarnings method should assert that there are no warnings other
than the allowed "predefined" warnings in the filteredWarnings() method

bug introduced in #71207
2021-06-02 17:55:14 +02:00
Nik Everett 4b5aebe8b0
Add setting to disable aggs optimization (#73620)
Sometimes our fancy "run this agg as a Query" optimizations end up
slower than running the aggregation in the old way. We know that and use
heuristics to disable the optimization in that case. But it turns out
that the process of running the heuristics itself can be slow, depending
on the query. Worse, changing the heuristics requires an upgrade, which
means waiting. If the heuristics make a terrible choice folks need a
quick way out. This adds such a way: a cluster level setting that
contains a list of queries that are considered "too expensive" to try
and optimize. If the top level query contains any of those queries we'll
disable the "run as Query" optimization.

The default for this setting is wildcard and term-in-set queries, which
is fairly conservative. There are certainly wildcard and term-in-set
queries that the optimization works well with, but there are other queries
of that type that it works very badly with. So we're being careful.

Better, you can modify this setting in a running cluster to disable the
optimization if we find a new type of query that doesn't work well.

Closes #73426
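
The check described above amounts to walking the query tree and looking for any listed query type. The sketch below models queries as plain dicts with a `type` and optional `children`; that shape, the function names, and the type labels `wildcard` and `term_in_set` are assumptions made for illustration, not the Elasticsearch classes.

```python
def contains_expensive_query(query, too_expensive):
    """Return True if the query or any nested sub-query is of a listed type."""
    if query["type"] in too_expensive:
        return True
    return any(
        contains_expensive_query(child, too_expensive)
        for child in query.get("children", [])
    )


def can_run_as_query(top_level_query, too_expensive=frozenset({"wildcard", "term_in_set"})):
    """The "run as Query" optimization is skipped when an expensive query is present."""
    return not contains_expensive_query(top_level_query, too_expensive)


query = {"type": "bool", "children": [{"type": "term"}, {"type": "wildcard"}]}
```

For the example `query`, the nested wildcard clause would disable the optimization; passing an empty set for `too_expensive` would allow it again, mirroring how the setting can be changed on a running cluster.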
2021-06-02 09:12:54 -04:00
Tanguy Leroux 4927b6917d
Delete mounted indices after test case in ESRestTestCase (#73650)
This commit adds some clean up logic to ESRestTestCase so 
that searchable snapshots indices are deleted after test case 
executions, before the snapshot and repositories are wiped out.

Backport of #73555
2021-06-02 15:06:44 +02:00
Lee Hinman 3d80e77ffa
Add `data-streams-mappings` to isXPackTemplate method (#73633)
This template was added in #64978, however, there can be some test failures if we try to remove
built-in templates. It was missing from the list and now needs to be added back.
2021-06-01 16:25:56 -06:00
Martijn van Groningen afc17bdb74
Add support for is_write_index flag to data stream aliases. (#73462)
This allows indexing documents into a data stream alias.
The ingestion is then forwarded to the write index of the data stream
that is marked as write data stream.
The `is_write_index` parameter can be used to indicate what the write data stream is,
when updating / adding a data stream alias.

Relates to #66163
2021-05-31 15:08:39 +02:00
Nik Everett 6b991c574a
Test: Use hamcrest for MatchAssertion (#72928)
Ever since I wrote `NotEqualsMessageBuilder` I've thought to myself
"if this were a hamcrest matcher we could use it everywhere and get
nicer error messages." A few weeks ago I finally built a work-alike
hamcrest matcher that I think produces better error messages. This plugs
that matcher into the `MatchAssertion` used by our yaml and docs tests.
2021-05-24 14:14:12 -04:00
Nhat Nguyen 1764e8ba15
Upgrade to Lucene-8.9.0-SNAPSHOT-efdc43fee18 (#73130)
Upgrades to Lucene-8.9 snapshot which includes:

- LUCENE-9507: Custom order for leaves (/cc @mayya-sharipova)
- LUCENE-9935: Enable bulk merge for stored fields with index sort
2021-05-17 09:37:20 -04:00
Alan Woodward 3bd594ebe8
Replace simpleMatchToFullName (#72674)
MappingLookup has a method simpleMatchToFullName that attempts
to return all field names that match a given pattern; if no field names match,
then it returns a single-valued collection containing just the pattern that
was originally passed in. This is a fairly confusing semantic.

This PR replaces simpleMatchToFullName with two new methods:

* getMatchingFieldNames(), which returns a set of all mapped field names
  that match a pattern. Calling getFieldType() with a name returned by
  this method is guaranteed to return a non-null MappedFieldType
* getMatchingFieldTypes(), which returns a collection of all MappedFieldTypes
  in a mapping that match the passed-in pattern.

This allows us to clean up several call-sites because we know that
MappedFieldTypes returned from these calls will never be null. It also
simplifies object field exists query construction.
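
The difference between the old and new semantics can be illustrated with a small sketch. These are Python stand-ins, not the MappingLookup methods, and `fnmatchcase` is used as an approximation of the simple wildcard matching (it also accepts `?` and `[]`, which the original may not).

```python
from fnmatch import fnmatchcase


def simple_match_to_full_name(field_names, pattern):
    """Old semantic: fall back to the pattern itself when nothing matches."""
    matches = [name for name in field_names if fnmatchcase(name, pattern)]
    return matches if matches else [pattern]


def get_matching_field_names(field_names, pattern):
    """New semantic: only names of mapped fields, possibly an empty set."""
    return {name for name in field_names if fnmatchcase(name, pattern)}


fields = ["user.name", "user.id", "message"]
old = simple_match_to_full_name(fields, "geo.*")  # the unmapped pattern leaks through
new = get_matching_field_names(fields, "geo.*")   # empty, so nothing to null-check
```

With the old behaviour a caller has to handle a returned name that maps to no field type; with the new one every returned name is known to be mapped.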
2021-05-13 11:35:23 +01:00
Armin Braun 3dff3a48af
Allow some Repository Settings to be Updated Dynamically (#72543)
This commit serves two purposes. For one, we need the ability to dynamically
update a repository setting for the encrypted repository work.

Also, this allows dynamically updating repository rate limits while snapshots are
in progress. This has often been an issue in the past where a long running snapshot
made progress over a long period of time already but is going too slowly with the
current rate limit. This left no good options, either throw away the existing
partly done snapshot's work and recreate the repo with a higher rate limit to speed
things up or wait for a long time with the current rate limit.
With this change the rate limit can simply be increased while a snapshot or restore
is running and will take effect immediately.
2021-05-11 19:56:00 +02:00
Martijn van Groningen 6689b8bf1c
Add basic alias support for data streams (#72613)
Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added to or removed
from a data stream, the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases that refer to indices. In security, privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
the alias refers to (regardless of whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. For more details, see:
https://github.com/elastic/elasticsearch/issues/66163#issuecomment-824709767

Relates to #66163
2021-05-11 09:51:05 +02:00
Armin Braun 52e7b926a9
Make Large Bulk Snapshot Deletes more Memory Efficient (#72788)
Use an iterator instead of a list when passing around what to delete.
In the case of very large deletes the iterator is much smaller than
the actual list of files to delete (since we save all the prefixes
which adds up if the individual shard folders contain lots of deletes).
As a side effect, this commit also adjusts a few spots in logging where the
log messages could be catastrophic in size when trace logging is activated.
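
A minimal sketch of the idea, assuming the blobs to delete are grouped by
shard folder prefix. The class `LazyDeletePathsSketch`, the method name and the
example paths are hypothetical and not taken from the repository code:

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: yield full blob paths on demand instead of
// materializing one large list of prefix + name strings up front.
final class LazyDeletePathsSketch {
    static Iterator<String> pathsToDelete(Map<String, List<String>> blobsByShardPrefix) {
        return blobsByShardPrefix.entrySet().stream()
            // Each full path string is only built when the iterator reaches it.
            .flatMap(entry -> entry.getValue().stream().map(name -> entry.getKey() + name))
            .iterator();
    }
}
```

Only the short blob names and one prefix per shard folder stay in memory; the
concatenated paths are created one at a time as the consumer advances.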
2021-05-10 13:40:57 +02:00
Armin Braun bef9dab643
Cleanup BlobPath Class (#72860)
There should be a singleton for the empty version of this.
All the copying to `String[]` or use as an iterator makes
no sense either when we can just use the list outright.
2021-05-10 00:10:39 +02:00
Jason Tedor 8b4b2f9534
Remove bootstrap.system_call_filter setting (#72848)
This commit removes the bootstrap.system_call_filter setting, as
starting in Elasticsearch 8.0.0 we are going to require that system call
filters be installed and that this is not user configurable. Note that
while we force bootstrap to attempt to install system call filters, we
only enforce that they are installed via a bootstrap check in production
environments. We can consider changing this behavior, but leave that for
future consideration and thus a potential follow-up change.
2021-05-07 18:46:27 -04:00
Gordon Brown 1d85cb6481
Improve cleanup of Node Shutdown in tests (#72772)
Makes the following changes:
 - The node shutdown feature flag isn't set on the test runner, only the
   cluster JVMs, so we can't use it to check here. Instead, the cleanup
   now infers whether it's enabled from the shape of the first
   GET `_nodes/shutdown` response.
 - Now uses `adminClient()` instead of `client()`
 - Removes the unnecessary `instanceof` check, which was *not* due to parsing,
   but the fact that `nodes` is indeed a map if the feature flag isn't enabled.
2021-05-06 10:15:00 -06:00
Rene Groeschke e609e07cfe
Remove internal build logic from public build tool plugins (#72470)
Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic

This includes a refactoring of ElasticsearchDistribution to handle types
better, so that we can differentiate between Elasticsearch distribution
types supported in TestClustersPlugin and types only supported in
internal plugins.

It also introduces a set of internal versions of public plugins.

As part of this we also generate the plugin descriptors now.

As a follow-up, we can move these publicly used classes into
an extra project (declared as an included build).

We keep LoggedExec and VersionProperties effectively public, plus a workaround for RestTestBase.
2021-05-06 14:02:35 +02:00
Jim Ferenczi 051bbb2238
Fix early termination of search request with sort optimization (#72683)
The query phase applies an optimization when sorting by a numeric field.
This optimization doesn't handle early termination correctly when `timeout`
and/or `terminate_after` are used. An IllegalArgumentException is thrown at the shard
level when the timeout is reached.
This commit fixes the bug: early termination exceptions are now correctly caught
and the result is computed from the documents that the shard was able to collect
before the termination.

Closes #72661
2021-05-06 09:47:47 +02:00
Jim Ferenczi eb8d7e2aaf
Add a test module to simulate errors and warnings in search requests (#71674)
This change adds a test module called `error-query` that exposes a
query builder to simulate errors and warnings on shard search requests.
The query accepts a list of indices and shard ids where errors or warnings
should be reported:
```
POST test*/_search
{
    "query": {
        "error_query": {
            "indices": [
                {
                    "name": "test_exception",
                    "shard_ids": [1],
                    "error_type": "exception",
                    "message": "boom"
                },
                {
                    "name": "test_warn*",
                    "error_type": "warning",
                    "message": "Watch out!"
                }
            ]
        }
    }
}
```

The `error_type` can be set to `exception` or `warning` and the `name` accepts
simple patterns, aliases and fully qualified index names if the search targets remote shards.

This module is published only within snapshots like the other test modules.

Relates #70784
2021-05-06 09:42:08 +02:00
Armin Braun 0220dfb3fe
Dry up Hashing BytesReference (#72443)
Dries up the efficient way to hash a bytes reference and makes use
of it in a few other spots that were needlessly copying all bytes in
the bytes reference for hashing.
2021-05-06 06:32:52 +02:00
Gordon Brown 9ce7a5a80b
Clean up Node Shutdown metadata in test cleanup (#72726)
This commit ensures that node shutdown metadata is cleaned up between
tests, as it causes unrelated tests to fail if a test leaves node
shutdown metadata in place.
2021-05-05 10:44:57 -06:00
Nhat Nguyen 80a5f3ac0d
Remove TombstoneDocSupplier from EngineConfig (#72593)
With #72251, we can create delete and noop tombstones directly.

Relates #72251
2021-05-05 12:00:37 -04:00
Armin Braun 70f1e8c33d
Make GetSnapshotsAction Cancellable (#72644)
If this runs needlessly for large repositories (especially in timeout/retry situations)
it's a significant memory+cpu hit => made it cancellable like we recently did for many
other endpoints.
2021-05-04 18:05:31 +02:00
Luca Cavanna 52b0d8ea37
Remove DocumentMapperForType (#72616)
DocumentMapperForType is used to create a document mapper when no mappings exist for an index and we are indexing the first document in it. This is only to cover for the edge case of empty docs, without any fields to dynamically map, being indexed, as we need to ensure that any index with at least one document in it has some mappings.

We can replace using DocumentMapperForType with the same logic that MapperService#documentMapperWithAutoCreate includes. This also helps clean up the only case where we create a DocumentMapper from its public constructor, which can be removed and replaced by a more targeted static method.
2021-05-04 11:56:50 +02:00
Luca Cavanna b92b9d1c94
Replace some DocumentMapper usages with MappingLookup (#72400)
We recently replaced some usages of DocumentMapper with MappingLookup in the search layer, as document mapper is mutable which can cause issues. In order to do that, MappingLookup grew and became quite similar to DocumentMapper in what it does and holds.

In many cases it makes sense to use MappingLookup instead of DocumentMapper, and we may even be able to remove DocumentMapper entirely in favour of MappingLookup in the long run.

This commit replaces some of its straight-forward usages.
2021-05-03 09:42:37 +02:00
Armin Braun 6778020301
Use Leak Tracking Infrastructure in MockPageCacheRecycler (#72477)
The leak tracking can be run for every test while the existing solution would only work
with a very limited set of tests giving us no coverage on pages that weren't acquired through
the mock transport service.
2021-04-30 21:52:20 +02:00
Luca Cavanna cdf1fc3394
Consolidate and clarify MappingLookup semantics (#72557)
MappingLookup has been introduced to expose a snapshot of the mappings to the search layer. We have been using it more and more over time as it is convenient and always non null.

This commit documents some of its semantics and makes it easier to trace when it is created with limited functionality (without a document parser, index settings and index analyzers).
2021-04-30 16:58:47 +02:00
Alan Woodward 7d812b9f78
Handle tombstone building entirely within ParsedDocument (#72251)
DocumentMapper contains some complicated logic to load
metadata fields so that it can build tombstone documents.
However, we only actually need three metadata mappers for
this purpose, and they are all stateless so this logic is
unnecessary. This commit adds two new static methods to
ParsedDocument to build no-op and delete tombstones,
and removes some ceremony elsewhere.
2021-04-30 12:20:53 +01:00
Ignacio Vera 4fff3788f3
Disallow creating geo_shape mappings with deprecated parameters (#70850)
With the introduction of BKD-based geo shape indexing in #32039, the prefix tree indexing method has 
been deprecated. From 8.0.0, it will not be allowed to create new mappings using deprecated parameters.
2021-04-30 11:08:58 +02:00
Alan Woodward 009f23e7a9
Explicitly say if stored fields aren't supported in MapperTestCase (#72474)
MapperTestCase has a check that if a field mapper supports stored fields,
those stored fields are available to index time scripts. Many of our mappers
do not support stored fields, and we try and catch this with an assumeFalse
so that those mappers do not run this test. However, this test is fragile - it
does not work for mappers created with an index version below 8.0, and it
misses mappers that always store their values, e.g. match_only_text.

This commit adds a new supportsStoredField method to MapperTestCase,
and overrides it for those mappers that do not support storing values. It
also adds a minimalStoredMapping method that defaults to the minimal
mapping plus a store parameter, which is overridden by match_only_text
because storing is not configurable and always available on this mapper.
2021-04-30 08:59:56 +01:00
Ryan Ernst 6a7298e555
Make NodeEnvironment.nodeDataPaths singular (#72432)
This commit renames the nodeDataPaths method to be singular and return a
single Path instead of an array. This is done in isolation from other
NodeEnvironment methods to make it reviewable.

relates #71205
2021-04-29 14:40:26 -07:00
Francisco Fernández Castaño 4e9f9ec64c
Add support for Rest XPackUsage task cancellation (#72304)
2021-04-28 18:16:31 +02:00
David Turner f72fa49749
Fix S3HttpHandler chunked-encoding handling (#72378)
The `S3HttpHandler` reads the contents of the uploaded blob, but if the
upload used chunked encoding then the reader would skip one or more
`\r\n` sequences if they appeared at the start of a chunk.

This commit reworks the reader to be stricter about its interpretation
of chunks, and removes some indirection via streams since we can work
pretty much entirely on the underlying `BytesReference` instead.
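
A strict interpretation of chunk framing can be sketched as follows. This is a
generic decoder for HTTP chunked framing working on a plain `byte[]`, not the
actual `S3HttpHandler` code; the class name `ChunkedDecoderSketch` is
hypothetical, and chunk extensions and trailers are simply ignored here:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: decode "<hex size>[;ext]\r\n<data>\r\n ... 0\r\n\r\n".
final class ChunkedDecoderSketch {
    static byte[] decode(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int lineEnd = indexOfCrlf(body, pos);
            if (lineEnd < 0) {
                throw new IllegalArgumentException("missing chunk header");
            }
            String header = new String(body, pos, lineEnd - pos, StandardCharsets.US_ASCII);
            int semi = header.indexOf(';'); // drop any chunk extension
            String sizeText = semi >= 0 ? header.substring(0, semi) : header;
            int size = Integer.parseInt(sizeText.trim(), 16);
            pos = lineEnd + 2;
            if (size == 0) {
                return out.toByteArray(); // last chunk; trailers are ignored
            }
            if (size < 0 || pos + size + 2 > body.length) {
                throw new IllegalArgumentException("truncated or invalid chunk");
            }
            // Copy exactly `size` bytes, even if the data starts with \r\n.
            out.write(body, pos, size);
            pos += size;
            if (body[pos] != '\r' || body[pos + 1] != '\n') {
                throw new IllegalArgumentException("chunk not terminated by CRLF");
            }
            pos += 2;
        }
    }

    private static int indexOfCrlf(byte[] bytes, int from) {
        for (int i = from; i + 1 < bytes.length; i++) {
            if (bytes[i] == '\r' && bytes[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }
}
```

Because the data length comes only from the declared chunk size, a `\r\n`
sequence at the start of a chunk is preserved as payload rather than skipped.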

Closes #72358
2021-04-28 15:13:48 +01:00
David Turner 01aad86d04
Remove spurious docker volume from S3 fixture (#72388)
2021-04-28 15:11:31 +01:00
Ryan Ernst b1eab79f4c
Make Environment.dataFiles singular (#72327)
The path.data setting is now a singular string, but the method
dataFiles() that gives access to the path is still an array. This commit
renames the method and makes the return type a single Path.

relates #71205
2021-04-27 19:48:29 -07:00
Nik Everett 5f281ceedd
Prevent `date_histogram` from OOMing (#72081)
This prevents the `date_histogram` from running out of memory allocating
empty buckets when you set the interval to something tiny like `seconds`
and aggregate over a very wide date range. Without this change we'd
allocate memory very quickly and throw an out of memory error, taking
down the node. With it we instead throw the standard "too many buckets"
error.
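
The general idea of checking the bucket count before allocating can be sketched
like this. The class `BucketLimitSketch`, its method and the way the limit is
passed in are hypothetical; the real aggregation code is structured differently:

```java
// Hypothetical sketch: compute how many empty buckets a range would need
// and fail fast instead of allocating them.
final class BucketLimitSketch {
    static long checkedBucketCount(long minMillis, long maxMillis, long intervalMillis, long maxBuckets) {
        if (intervalMillis <= 0 || maxMillis < minMillis) {
            throw new IllegalArgumentException("invalid range or interval");
        }
        // subtractExact guards against overflow on extreme date ranges.
        long buckets = Math.subtractExact(maxMillis, minMillis) / intervalMillis + 1;
        if (buckets > maxBuckets) {
            throw new IllegalStateException(
                "too many buckets: " + buckets + " exceeds the limit of " + maxBuckets);
        }
        return buckets;
    }
}
```

A one-year range at a one-second interval needs more than 31 million buckets,
so a check like this rejects the request before any bucket is created.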

Relates to #71758
2021-04-27 14:41:52 -04:00
Ryan Ernst d933ecd26c
Convert path.data to String setting instead of List (#72282)
Since multiple data path support has been removed, the Setting no longer
needs to support multiple values. This commit converts the
PATH_DATA_SETTING to a String setting from List<String>.

relates #71205
2021-04-27 08:29:12 -07:00
bellengao eaa59fbc41
Enhance and add more tests for ResizeRequest (#68502)
2021-04-26 15:01:47 -04:00
Armin Braun 47c77160ef
Fix ListenableFuture Resolving Listeners under Mutex (#71943) (#72087)
We shouldn't loop over the listeners under the mutex in `done` since in most use-cases we use `DirectExecutorService`
with this class.
Also, no need to create an `AbstractRunnable` for direct execution. We use this listener on the hot path in authentication
making this a worthwhile optimization I think.
Lastly, no need to clear and thus loop over `listeners`, the list is not used again after the `done` call returns anyway
so no point in retaining it at all (especially when in a number of use cases we add listeners only after the `done` call
so we can also save the instantiation by making the field non-final).
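
The pattern described above can be sketched roughly as follows. The class
`ListenableResultSketch` is hypothetical and much simpler than
`ListenableFuture` (no executors, no failure handling); it only shows listeners
being detached under the mutex and invoked outside of it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not the real ListenableFuture.
final class ListenableResultSketch<T> {
    private final Object mutex = new Object();
    private List<Consumer<T>> listeners; // non-final: created lazily, dropped on completion
    private boolean done;
    private T value;

    void addListener(Consumer<T> listener) {
        synchronized (mutex) {
            if (!done) {
                if (listeners == null) {
                    listeners = new ArrayList<>();
                }
                listeners.add(listener);
                return;
            }
        }
        // Already completed: notify directly, outside the mutex.
        listener.accept(value);
    }

    void complete(T result) {
        List<Consumer<T>> toNotify;
        synchronized (mutex) {
            if (done) {
                return;
            }
            done = true;
            value = result;
            toNotify = listeners;
            listeners = null; // not used again, so do not retain it
        }
        if (toNotify != null) {
            // Listeners run without holding the mutex.
            for (Consumer<T> listener : toNotify) {
                listener.accept(result);
            }
        }
    }
}
```

Listeners added after completion never touch the list, so in that case no list
is allocated at all.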
2021-04-26 19:33:34 +02:00
Alan Woodward 2560798488
Remove MapperService.simpleMatchToFullName() (#72244)
This just delegates to mappingLookup().simpleMatchToFullName(), and
is only called in two places.
2021-04-26 16:54:23 +01:00
Rene Groeschke 5bcd02cb4d
Restructure build tools java packages (#72030)
Related to #71593, we move all build logic that is for the Elasticsearch build only into
the org.elasticsearch.gradle.internal* packages.

This makes it clearer whether build logic is meant to be used by external projects.
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.

This is a first step in that direction.
2021-04-26 14:53:55 +02:00
Ryan Ernst 6aa0735177
Fail when using multiple data paths (#72184)
This commit converts the deprecation messages for multiple data paths
into errors. It effectively removes support for multiple data paths.

relates #71205
2021-04-24 15:45:27 -07:00
Igor Motov 50d0ebb50e
Fix close_to assertion (#72187)
Fixes the assertion to actually assert and adds a test to check that it actually does that.
2021-04-23 15:58:07 -10:00
Julie Tibshirani fdf254335f
Remove more references to query_and_fetch. (#71988)
This search type was deleted several releases ago.
2021-04-23 09:19:57 -07:00
Nik Everett 39fee5e908
Fix composite early termination on sorted (#72101)
I broke composite early termination when reworking aggregations'
contract for `getLeafCollector` around early termination in #70320. We
didn't see it in our tests because we weren't properly emulating the
aggregation collection stage. This fixes early termination by adhering
to the new contract and adds more tests.

Closes #72078

Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
2021-04-22 14:32:26 -04:00
Armin Braun ede947fdd8
Refactor Repository#snapshotShard (#72083)
Create a class for holding the large number of arguments to this method
and to dry up resource handling across snapshot shard service and the
source-only repository.
2021-04-22 16:42:31 +02:00