Commit Graph

3207 Commits

Author SHA1 Message Date
Jason Tedor c7130ebb9b
Move die with dignity to be a test module (#77136)
This commit moves the die with dignity tests to be a test module. The
purpose of this is so the _die_with_dignity endpoint is available in
snapshot builds, for the purpose of enabling testing orchestration logic
that manages what happens to a node after it dies with an
OutOfMemoryError.
2021-09-03 12:34:01 -04:00
Alan Woodward 385b97f92b
Choose postings format from FieldMapper instead of MappedFieldType (#77234)
Currently we configure per-field postings formats by asking the MapperService
for the MappedFieldType of the field in question, and then checking to see if it
is a completion field. If no MappedFieldType is available, we emit a warning.
However, MappedFieldTypes are for search fields only, and so we end up emitting
warnings for hidden sub-fields that have no corresponding field type, such as
prefix or phrase accelerator fields on text mappers.

This commit reworks things so that the MappingLookup is responsible for defining
per-field postings formats, and will detect CompletionFieldMappers at build time.
All fields that are not mapped to a completion field will just get the default postings
format. This also means that we no longer need a logger instance on CodecService.

Fixes #77183
2021-09-03 14:54:43 +01:00
Mayya Sharipova f18b9d5ac8
Add segment sorter for data streams (#75195)
It is beneficial to sort the segments within a data stream's index
in descending order of their max timestamp, so
that the most recent (in terms of timestamp) segments
come first.

This speeds up queries sorted by @timestamp descending,
which are the most common type of query for data streams,
as we are mostly concerned with recent data.
This patch addresses this for writable indices.

The segment sorter is different from index sorting.
An index sorter is concerned only with the order of docs
within an individual segment (not with how the segments are organized),
while the segment sorter is only used during search and lets
docs collection start with the "right" segment,
so we can terminate the collection faster.

This PR adds a property `isDataStreamIndex` to IndexShard that
shows whether a shard is part of a data stream.
2021-09-03 09:42:48 -04:00
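To illustrate the ordering described in #75195, here is a minimal sketch of visiting segments newest-first. The `SegmentSketch` and `SegmentOrderSketch` types are hypothetical stand-ins invented for this example; they are not Lucene or Elasticsearch classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical per-segment summary: a name plus the largest @timestamp it contains.
final class SegmentSketch {
    final String name;
    final long maxTimestamp;

    SegmentSketch(String name, long maxTimestamp) {
        this.name = name;
        this.maxTimestamp = maxTimestamp;
    }
}

final class SegmentOrderSketch {
    // Segments with the most recent data sort first.
    static final Comparator<SegmentSketch> NEWEST_FIRST =
        Comparator.comparingLong((SegmentSketch s) -> s.maxTimestamp).reversed();

    // Returns segment names in the order a search would visit them.
    static List<String> visitOrder(List<SegmentSketch> segments) {
        List<SegmentSketch> sorted = new ArrayList<>(segments);
        sorted.sort(NEWEST_FIRST);
        List<String> names = new ArrayList<>();
        for (SegmentSketch segment : sorted) {
            names.add(segment.name);
        }
        return names;
    }
}
```

With this order, a top-N query sorted by descending timestamp can fill its result queue from the first segments and may skip or cut short the older ones.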
Oleg Smirnov f9b38406f0
Prevent unnecessary boxing, improve code clarity. (#76808)
Use BooleanSupplier, OptionalInt, Consumer and Predicate in places
where more generic types were used.
2021-09-03 13:45:29 +02:00
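As an illustration of the kind of change in #76808, the sketch below contrasts a generic functional type with its primitive specialization. `BoxingSketch` and its methods are hypothetical and not taken from the Elasticsearch code base.

```java
import java.util.OptionalInt;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical example class used only to illustrate the boxing difference.
final class BoxingSketch {

    // Generic form: get() yields a boxed Boolean that is unboxed on every call.
    static int countWhileBoxed(Supplier<Boolean> keepGoing, int limit) {
        int count = 0;
        while (count < limit && keepGoing.get()) {
            count++;
        }
        return count;
    }

    // Primitive specialization: getAsBoolean() returns a plain boolean.
    static int countWhilePrimitive(BooleanSupplier keepGoing, int limit) {
        int count = 0;
        while (count < limit && keepGoing.getAsBoolean()) {
            count++;
        }
        return count;
    }

    // OptionalInt carries an int directly, where Optional<Integer> would hold a boxed Integer.
    static OptionalInt firstNegativeIndex(int[] values) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0) {
                return OptionalInt.of(i);
            }
        }
        return OptionalInt.empty();
    }
}
```

Both counting methods behave identically; the primitive variant also states its intent more directly in the signature.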
David Turner bfcc93a042
Anonymize AbstractRefCounted (#77208)
Today `AbstractRefCounted` has a `name` field which is only used to
construct the exception message when calling `incRef()` after it's been
closed. This isn't really necessary, the stack trace will identify the
reference in question and give loads more useful detail besides. It's
also slightly irksome to have to name every single implementation.

This commit drops the name and the constructor parameter, and also
introduces a handy factory method for use when there's no extra state
needed and you just want to run a method or lambda when all references
are released.
2021-09-03 07:59:44 +01:00
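The shape described in #77208 (no name, plus a factory that runs a lambda when the last reference is released) might look roughly like the following. `RefCountedSketch` is a hypothetical, simplified stand-in written for this illustration, not the actual `AbstractRefCounted`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified ref-counted base class without a name field.
abstract class RefCountedSketch {
    // A new instance starts with one reference, held by its creator.
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Invoked once, when the last reference is released.
    protected abstract void closeInternal();

    // Factory for the common case: no extra state, just run something on release.
    static RefCountedSketch of(Runnable onClose) {
        return new RefCountedSketch() {
            @Override
            protected void closeInternal() {
                onClose.run();
            }
        };
    }

    final void incRef() {
        // No name in the message: the exception's stack trace identifies the caller.
        if (tryIncRef() == false) {
            throw new IllegalStateException("already closed, can't increment ref count");
        }
    }

    final boolean tryIncRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                return false;
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Returns true if this call released the last reference.
    final boolean decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released more references than were acquired");
        }
        if (remaining == 0) {
            closeInternal();
            return true;
        }
        return false;
    }
}
```

A caller that only needs a cleanup hook can then write `RefCountedSketch.of(() -> resource.close())` instead of declaring a named subclass.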
Ignacio Vera 062276f84d
Refactor GeoBoundingBoxQuery integration tests (#77103)
This commit splits the test so that the legacy geo_shape field is tested separately.
2021-09-03 08:51:52 +02:00
Ignacio Vera 68e44bd834
Mute test LegacyGeoShapeIT#testBulk (#77175) 2021-09-02 13:18:51 +02:00
Armin Braun 0920e21445
Implement Sort By Repository Name in Get Snapshots API (#77049)
This one is the last sort column not yet implemented but used by Kibana.
2021-09-01 13:01:58 +02:00
Ignacio Vera 07715438b5
Refactor of GeoShape integration tests (#77052)
This commit joins GeoFilterIT and GeoShapeIntegrationIT into one test case called GeoShapeIntegTestCase 
which is moved into the test framework.
2021-09-01 07:21:15 +02:00
David Turner ead0020497
Tidy up ClusterApplierService (#76837)
This commit cleans up some cruft left over from older versions of the
`ClusterApplierService`:

- `UpdateTask` doesn't need to implement lots of interfaces and give
  access to its internals, it can just pass appropriate arguments to
  `runTasks()`.
- No need for the `runOnApplierThread` override with a default priority,
  just have callers be explicit about the priority.
- `submitStateUpdateTask` takes a config which never has a timeout, may
  as well just pass the priority and remove the dead code
- `SafeClusterApplyListener` doesn't need to be a
  `ClusterApplyListener`, may as well just be an `ActionListener<Void>`.
- No implementations of `ClusterApplyListener` care about the source
  argument, may as well drop it.
- Adds assertions to prevent `ClusterApplyListener` implementations from
  throwing exceptions since we just swallow them.
- No need to override getting the current time in the
  `ClusterApplierService`, we can control this from the `ThreadPool`.
2021-08-31 17:35:32 +01:00
Armin Braun f89eda5f9d
Fix Snapshot BwC Version Randomization Behavior (#77057)
The randomization of the repo version often wasn't used because of the repository cache.
Force re-creating the repository every time we manually mess with the versions.
2021-08-31 15:28:57 +02:00
Rene Groeschke 35ec6f348c
Introduce simple public yaml-rest-test plugin (#76554)
This introduces a basic public yaml rest test plugin intended for use by external
Elasticsearch plugin authors. This is driven by #76215

- Rename yaml-rest-test to intern-yaml-rest-test
- Use public yaml plugin in example plugins

Co-authored-by: Mark Vieira <portugee@gmail.com>
2021-08-31 08:45:52 +02:00
Armin Braun 48f3784a6d
Add Sort By Shard Count and Failed Shard Count to Get Snapshots API (#77011)
It's in the title. As requested by the Kibana team, adding these two additional sort columns.

relates #74350
2021-08-30 13:39:51 +02:00
Armin Braun 706ccbd8b5
Remove Needless Sleeps on Node Configuration Changes in Internal Cluster Tests (#76884)
I noticed this recently when trying to reproduce a test failure. We do a lot of sleeping
when validating that the cluster has formed if that process happens to be slow (which it tends to be
due to disk interaction on node starts and such). By reusing the approach for waiting on a
cluster state we rarely if ever need to enter the busy assert loop and can remove all these sleeps,
shaving off a few seconds here and there from running internal cluster tests.
2021-08-25 05:39:33 +02:00
David Turner 4a17847b85
Add timing stats to publication process (#76771)
This commit introduces into the node stats API various statistics to
track the time that the elected master spends in various phases of the
cluster state publication process.

Relates #76625
2021-08-23 17:38:32 +01:00
David Turner dfe877c3a8
Introduce `ClusterStatePublicationEvent` (#76723)
Today we use `ClusterChangedEvent` to represent a committed change to
the cluster state while it's being applied, and also to represent the
proposed change while it's being published. These are quite different
usages in practice, so this commit separates them by introducing a
`ClusterStatePublicationEvent` to represent the change to be published.

Relates #76625 in that we will be able to use the new
`ClusterStatePublicationEvent` to track various stats about the
publication as it progresses, but which don't make sense on a
`ClusterChangedEvent`.
2021-08-19 21:00:56 +01:00
Gordon Brown 0100e1148a
Remove Node Shutdown API feature flag (#76588)
* Remove Node Shutdown API feature flag

This PR removes the Node Shutdown API feature flag.

The Node Shutdown API will now always be available.

* Check if xpack is enabled in cleanup

When I removed the feature flag, I assumed that we would always have the
Node Shutdown APIs, but that turns out not to be the case if xpack isn't
enabled. Previously, this case happened to be caught by the logic that handled
the feature flag not being enabled.

This commit adds the check we always should have had.

* Also check version before trying cleanup
2021-08-18 14:35:02 -04:00
Rory Hunter d01efa4fd6
Changes to keep Checkstyle happy after reformatting (#76464)
* Reformatting to keep Checkstyle happy after formatting

* Configure spotless everywhere, and disable the tasks if necessary

* Add XContentBuilder helpers, fix test

* Tweaks

* Add a TODO

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-08-18 07:15:55 -04:00
Gordon Brown 58f66cf04a
Delay shard reassignment from nodes which are known to be restarting (#75606)
This PR makes the delayed allocation infrastructure aware of registered node shutdowns, so that reallocation of shards will be further delayed for nodes which are known to be restarting.

To make this more configurable, the Node Shutdown APIs now support an `allocation_delay` parameter, which defaults to 5 minutes. For example:
```
PUT /_nodes/USpTGYaBSIKbgSUJR2Z9lg/shutdown
{
  "type": "restart",
  "reason": "Demonstrating how the node shutdown API works",
  "allocation_delay": "20m"
}
```

This request delays reallocation of the shards assigned to that node by 20 minutes. Note that this delay will only be used if it's *longer* than the index-level allocation delay, set via `index.unassigned.node_left.delayed_timeout`.

The `allocation_delay` parameter is only valid for `restart`-type shutdown registrations, and the request will be rejected if it's used with another shutdown type.
2021-08-16 15:59:50 -06:00
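The precedence rule in #75606 amounts to taking the longer of the two delays. A minimal sketch follows, assuming hypothetical names (`AllocationDelaySketch`, `effectiveDelay`) that do not come from the Elasticsearch source.

```java
import java.time.Duration;

// Hypothetical helper illustrating the "longer delay wins" rule.
final class AllocationDelaySketch {

    // indexLevelDelay stands for index.unassigned.node_left.delayed_timeout;
    // shutdownDelay stands for the allocation_delay of a registered restart, or null if none.
    static Duration effectiveDelay(Duration indexLevelDelay, Duration shutdownDelay) {
        if (shutdownDelay == null) {
            return indexLevelDelay;
        }
        // The shutdown delay is only used when it is longer than the index-level delay.
        return shutdownDelay.compareTo(indexLevelDelay) > 0 ? shutdownDelay : indexLevelDelay;
    }
}
```

For instance, with an index-level delay of one minute and an `allocation_delay` of `20m`, the sketch returns twenty minutes; with an index-level delay of thirty minutes it returns thirty.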
Henning Andersen 51d524bb2c
Add recovery from snapshot to tests (#76535)
Randomly enable recovery from snapshot in searchable snapshot and
snapshot tests to verify that recovering from a snapshot does not break other
features (those should not care about the flag).

Relates #76237
2021-08-15 19:32:30 +02:00
Tim Brooks e6fd459a6e
Respond with same compression scheme received (#76372)
This is related to #73497. Currently, we only use the configured
transport.compression_scheme setting when compressing a request or a
response. Additionally, the cluster.remote.*.compression_scheme
setting is ignored. This commit fixes this behavior by respecting the
per-cluster setting. Additionally, it resolves confusion around inbound
and outbound connections by always responding with the same scheme that
was received. This allows remote connections to have different schemes
than local connections.
2021-08-13 13:29:22 -06:00
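The behaviour described in #76372 can be sketched as two small decisions: which scheme to use for an outbound request, and which to use for a response. The class, enum, and method names below are hypothetical, and the fallback from the per-cluster setting to the node-wide one is an assumption made for this illustration.

```java
import java.util.Map;

// Hypothetical sketch of choosing a compression scheme per connection.
final class CompressionSketch {
    enum Scheme { DEFLATE, LZ4 }

    // Outbound request: a per-remote-cluster setting, if present, wins over the node-wide default.
    // A null clusterAlias stands for a connection within the local cluster.
    static Scheme schemeForRequest(String clusterAlias, Map<String, Scheme> perClusterSettings, Scheme nodeDefault) {
        Scheme configured = clusterAlias == null ? null : perClusterSettings.get(clusterAlias);
        return configured != null ? configured : nodeDefault;
    }

    // Response: always echo the scheme the request arrived with, whatever is configured locally.
    static Scheme schemeForResponse(Scheme receivedScheme) {
        return receivedScheme;
    }
}
```

Echoing the received scheme means the responding node never has to work out which of its own settings applies to an inbound connection.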
Francisco Fernández Castaño a6aa5998aa
Add third party integration tests for snapshot based recoveries (#76489)
This commit adds third party integration tests for snapshot based
recoveries in S3, Azure and GCS.

Relates #73496
2021-08-13 15:11:28 +02:00
Francisco Fernández Castaño 2ebe5cd075
Add peer recoveries using snapshot files when possible (#76237)
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary. 

Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard.

Relates #73496
2021-08-13 10:42:16 +02:00
Armin Braun 1f080e3aa5
Fix Snapshot State Machine Issues around Failed Clones (#76419)
With recent fixes it is never correct to simply remove a snapshot from the cluster state without
updating other snapshot entries if an entry contains any successful shards, due to possible dependencies.
This change reproduces two issues resulting from simply removing a snapshot without regard for other queued
operations and fixes them by having all removals of snapshots from the cluster state go through the same
code path.
Also, this change moves the tracking of a snapshot as "ending" up a few lines to fix an assertion about finishing
snapshots that forces them to be in this collection.
2021-08-12 20:41:44 +02:00
Armin Braun b01cd8ff87
Randomize BlobContainer Path in Retries Tests (#76303)
Follow up to #76273 adding some randomization across all retries tests.
2021-08-11 22:31:04 +02:00
Jack Conradson a48c7369a2
Add Fields API to aggregation scripts and field scripts (#76325)
This change updates the aggregation script, map script for aggregations, and field scripts to extend 
DocBasedScript to give them access to the new fields api.
2021-08-11 08:32:36 -07:00
Tim Brooks a91be8282d
Precompile regex for PatternSeenEventExpectation (#76332)
Currently we compile the regex for each log string that we are
analyzing. This is extremely inefficient and may contribute to
instability seen in the logging IT. This commit compiles the regex a
single time.
2021-08-11 07:04:58 -06:00
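The fix in #76332 follows a common Java pattern: compile the regex once and reuse the `Pattern` for every message. The class below is a hypothetical, simplified expectation written for illustration; it is not the actual `PatternSeenEventExpectation`.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified log expectation that compiles its regex a single time.
final class PatternExpectationSketch {
    private final Pattern pattern; // compiled once, in the constructor
    private boolean seen;

    PatternExpectationSketch(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Called for each log message; reuses the compiled pattern instead of recompiling.
    void match(String logMessage) {
        if (pattern.matcher(logMessage).matches()) {
            seen = true;
        }
    }

    boolean seen() {
        return seen;
    }
}
```

By contrast, calling `logMessage.matches(regex)` or `Pattern.matches(regex, logMessage)` inside `match` would compile the expression again for every message.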
Armin Braun da668f9cb0
Refactor Snapshot Finalization Method (#76005)
This refactors the signature of snapshot finalization. For one, it allows removing
a TODO about depending on a mutable `SnapshotInfo`, which was not great. More
importantly, it sets up a follow-up in which state can be shared between the
cluster state update at the end of finalization and the subsequent old-shard-generation
cleanup, so that we can resolve another open TODO about leaking shard generation files
in some cases.
2021-08-10 16:41:21 +02:00
Rory Hunter 128a7e7744
Fix compiler warnings in :server - part 3 (#76024)
Part of #40366. Fix a number of javac issues when linting is enforced in `server/`.
2021-08-10 15:05:55 +01:00
Luca Cavanna 32d2f60f8a
Emit multiple fields from a runtime field script (#75108)
We have recently introduced support for grok and dissect in the runtime fields
Painless context, which allows splitting a field into multiple fields. However, each runtime
field can only emit values for a single field. This commit introduces support for emitting
multiple fields from the same script.

The API call to define a runtime field that emits multiple fields is the following:

```
PUT localhost:9200/logs/_mappings
{
    "runtime" : {
      "log" : {
        "type" : "composite",
        "script" : "emit(grok(\"%{COMMONAPACHELOG}\").extract(doc[\"message.keyword\"].value))",
        "fields" : {
            "clientip" : {
                "type" : "ip"
            },
            "response" : {
                "type" : "long"
            }
        }
      }
    }
}
```

The script context for this new field type accepts two emit signatures:

* `emit(String, Object)`
* `emit(Map)`

Sub-fields need to be declared under fields in order to be discoverable through 
the field_caps API and accessible through the search API. 

Multiple fields are emitted by returning multiple MappedFieldTypes
from RuntimeField#asMappedFieldTypes. The sub-fields are instances of the
runtime fields that are already supported, with a little tweak to adapt the script
defined by their parent to an artificial script factory for each of the sub-fields
that makes its corresponding sub-field accessible. This approach allows reusing
all of the existing runtime fields code for the sub-fields.

The runtime section has been flat so far, as it has not supported objects until now.
That stays the same, meaning that runtime fields can have dots in their names.
However, the ability to emit multiple fields introduces two ways to create
the same field, so we have to make sure that a runtime field with
a certain name cannot be defined twice, which is why the following mappings are
rejected with the error `Found two runtime fields with same name [log.response]`:

```
PUT localhost:9200/logs/_mappings
{
    "runtime" : {
        "log.response" : {
            "type" : "keyword"
        },
        "log" : {
            "type" : "composite",
            "script" : "emit(\"response\", grok(\"%{COMMONAPACHELOG}\").extract(doc[\"message.keyword\"].value)?.response)",
            "fields" : {
                "response" : {
                    "type" : "long"
                }
            }
        }
    }
}
```

Closes #68203
2021-08-10 13:07:53 +01:00
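The duplicate-name check described in #75108 can be sketched by flattening each composite field into `parent.subfield` names and rejecting any name seen twice. `RuntimeFieldNamesSketch` and its input shape are hypothetical simplifications; only the error message text is taken from the commit message.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of detecting a runtime field that is defined twice.
final class RuntimeFieldNamesSketch {

    // runtimeSection maps each runtime field name to its declared sub-field names
    // (an empty list for a plain, non-composite field).
    static Set<String> collectNames(Map<String, List<String>> runtimeSection) {
        Set<String> names = new HashSet<>();
        for (Map.Entry<String, List<String>> entry : runtimeSection.entrySet()) {
            if (entry.getValue().isEmpty()) {
                addUnique(names, entry.getKey());
            } else {
                // A composite field contributes one full name per sub-field.
                for (String subField : entry.getValue()) {
                    addUnique(names, entry.getKey() + "." + subField);
                }
            }
        }
        return names;
    }

    private static void addUnique(Set<String> names, String name) {
        if (names.add(name) == false) {
            throw new IllegalArgumentException("Found two runtime fields with same name [" + name + "]");
        }
    }
}
```

Passing a section that contains both a plain `log.response` field and a composite `log` with a `response` sub-field makes the sketch throw, mirroring the rejected mapping in the commit message.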
Armin Braun 46ebb2298c
Fix S3 Streaming Writes Ignoring Relative Paths for Large Writes (#76273)
It's in the title: we were not accounting for relative paths at all
here and were only saved by the fact that we mostly short-circuit to
non-streaming writes.
Extended testing to catch this case for S3; a follow-up will
extend it to the other implementations as well.
2021-08-10 13:36:25 +02:00
Armin Braun 873fbf7b65
Fix Leaking Http Channel Objects when Http Client Stats are Disabled (#76257)
We have to remove the channel from the internal collection of channels when stats are disabled.

Closes #76183
2021-08-10 12:39:12 +02:00
David Turner 56f33ce133
Track cancellable tasks by parent ID (#76186)
Today when cancelling a task with its descendants we perform a linear
scan through all the tasks looking for the few that have the right
parent ID. With potentially hundreds of thousands of tasks this takes
quite some time, particularly if there are many tasks to cancel.

This commit introduces a second map that tracks the tasks by their
parent ID so that it's super-cheap to find the descendants that need to
be cancelled.

Closes #75316
2021-08-09 16:11:02 +01:00
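The second map introduced by #76186 can be pictured as an index from parent ID to child tasks, kept in step with the main task map. The sketch below uses plain `long` IDs and a hypothetical `TaskTrackerSketch` class; the real task manager is more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: tasks indexed both by their own ID and by their parent's ID.
final class TaskTrackerSketch {
    private final Map<Long, Long> parentByTaskId = new HashMap<>();
    private final Map<Long, List<Long>> childrenByParentId = new HashMap<>();

    synchronized void register(long taskId, long parentId) {
        parentByTaskId.put(taskId, parentId);
        childrenByParentId.computeIfAbsent(parentId, k -> new ArrayList<>()).add(taskId);
    }

    synchronized void unregister(long taskId) {
        Long parentId = parentByTaskId.remove(taskId);
        if (parentId == null) {
            return; // unknown task, nothing to clean up
        }
        List<Long> siblings = childrenByParentId.get(parentId);
        siblings.remove(Long.valueOf(taskId)); // remove by value, not by index
        if (siblings.isEmpty()) {
            childrenByParentId.remove(parentId); // avoid leaking empty lists
        }
    }

    // One map lookup instead of a linear scan over every registered task.
    synchronized List<Long> childrenOf(long parentId) {
        return new ArrayList<>(childrenByParentId.getOrDefault(parentId, Collections.emptyList()));
    }
}
```

The cost is a little extra bookkeeping on registration and removal, in exchange for a lookup whose cost no longer grows with the total number of tasks.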
Francisco Fernández Castaño 3c8b9a6f2e
Add peer recovery planners that take into account available snapshots (#75840)
This commit adds a new set of classes that compute a peer
recovery plan based on source files + target files + available
snapshots. When possible, the plan maximizes the number of
files used from a snapshot. It uses repositories with the `use_for_peer_recovery`
setting set to true.

It also adds a new recovery setting, `indices.recovery.use_snapshots`.

Relates #73496
2021-08-09 14:03:12 +02:00
Ioannis Kakavas fed790e4e4
Set xpack.security.enabled to true for all licenses (#72300)
This change sets the default value for `xpack.security.enabled` to true
for all licenses. As such, the value of the setting is read directly
from the node's settings and not from XPackLicenseState, which
no longer needs to keep track of it depending on potential license changes.
2021-08-09 09:36:01 +03:00
Stuart Tettemer bb9c91fc57
Script: Fields API for Filter context (#76119) 2021-08-04 14:29:43 -05:00
Stuart Tettemer 6c02a6c657
Script: Fields API for Sort and Score scripts (#75863)
Adds minimal fields API support to sort and score scripts.

Example: `field('myfield').getValue(123)` where `123` is the default if the field has no values.

Refs: #61388
2021-08-04 10:11:12 -05:00
Armin Braun f62618c5ae
Ensure Node Shutdown Waits for Running Restores to Complete (#76070)
We must wait for ongoing restores to complete before shutting down the repositories
service. Otherwise we may leak file descriptors because tasks for releasing the store
are submitted to the `SNAPSHOT` or some searchable snapshot pools that quietly accept
but never reject/fail tasks after shutdown.

same as #46178 where we had the same bug in recoveries

closes #75686
2021-08-04 15:33:38 +02:00
David Turner 4441b66c26
Replace String shard gen with ShardGeneration (#75927)
Today we use Strings for lots of different things when manipulating
snapshots; one crucial such thing is a shard generation. We're not very
consistent about naming the variables containing these things, and have
other kinds of generation in use, so it takes extra effort to track
shard generations through the code. This commit introduces a
`ShardGeneration` class to encapsulate just those strings that are used
as shard generations.
2021-08-04 12:00:33 +01:00
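The idea behind #75927 is the familiar "wrap a raw string in a small value type" refactoring. A minimal sketch follows, assuming a hypothetical `ShardGenerationSketch` class and a hypothetical `index-` blob name prefix; neither is taken from the actual `ShardGeneration` implementation.

```java
import java.util.Objects;

// Hypothetical value type wrapping the string that identifies a shard generation.
final class ShardGenerationSketch {
    private final String rawGeneration;

    ShardGenerationSketch(String rawGeneration) {
        this.rawGeneration = Objects.requireNonNull(rawGeneration);
    }

    // The only place where the wrapper is turned back into a plain string.
    String toBlobNamePart() {
        return rawGeneration;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ShardGenerationSketch
            && rawGeneration.equals(((ShardGenerationSketch) other).rawGeneration);
    }

    @Override
    public int hashCode() {
        return rawGeneration.hashCode();
    }

    @Override
    public String toString() {
        return "ShardGenerationSketch[" + rawGeneration + "]";
    }
}

final class ShardBlobNamesSketch {
    // The parameter type documents what kind of generation is expected;
    // an arbitrary String can no longer be passed by mistake.
    static String indexBlobName(ShardGenerationSketch generation) {
        return "index-" + generation.toBlobNamePart();
    }
}
```

Once method signatures take the wrapper type, the compiler tracks shard generations through the code, which is the effort the commit message says previously fell to the reader.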
Lee Hinman a76ee40d5b
Flip node shutdown feature flag to default to true on snapshot builds (#75962)
* Flip node shutdown feature flag to default to true on snapshot builds

It previously defaulted to false. The setting can still only be set to 'true' on a
non-release (snapshot) build of Elasticsearch.

Relates to #70338

* Handle case where operator privileges are enabled
2021-08-02 13:15:36 -04:00
Adrien Grand d15445e0f3
Remove usage of RAM accounting of segments (#75674)
This is a pre-requisite for the upgrade to Lucene 9, which removes the ability to estimate RAM usage of segments.
2021-07-29 08:36:09 +02:00
Przemyslaw Gomulka c96139d006
[Rest Api Compatibility] Deprecate the use of synced flush (#75372)
Synced flush is going to be replaced by flush. This commit allows the synced_flush API only in v7 compatibility mode.
Worth noting: sync_id is gone and won't be available in v7 responses from indices.stats.

relates removal pr #50882
relates #51816
2021-07-28 14:17:49 +02:00
Rory Hunter 944b3f3b56
Fix compiler warnings in :server - part 1 (#75708)
Part of #40366. Fix a number of javac issues when linting is enforced in `server/`.
2021-07-27 19:40:40 +01:00
Mark Vieira 9d14bc91d7
Set netty available processors system property for tests globally (#75699) 2021-07-27 11:21:42 -07:00
Armin Braun b22180e3c6
Enhance Shard Level Metadata check in BlobStoreTestUtil (#75737)
Adds a check that shard level index metadata actually contains the snapshots
it's supposed to contain. This would have caught a number of recent bugs.
2021-07-27 18:07:52 +02:00
Armin Braun f1ba7c4d5d
Fix Concurrent Snapshot Repository Corruption from Operations Queued after Failing Operations (#75733)
The node executing a shard level operation would in many cases communicate `null` for the shard state update,
leading to follow-up operations incorrectly assuming an empty shard snapshot directory and starting from scratch.

closes #75598
2021-07-27 16:13:21 +02:00
David Turner 3f77adcc66
Include reason in cancellation exceptions (#75332)
Today when a task is cancelled we record the reason for the cancellation
but this information is very rarely exposed to users. This commit
centralises the construction of the `TaskCancellationException` and
includes the reason in the exception message.

Closes #74825
2021-07-27 11:08:09 +01:00
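Centralising the exception construction, as #75332 describes, might look like the sketch below: the task records the reason and is the single place that builds the exception. All names here and the exact message format are hypothetical and chosen for illustration only.

```java
// Hypothetical exception type standing in for TaskCancellationException.
final class CancelledSketchException extends RuntimeException {
    CancelledSketchException(String message) {
        super(message);
    }
}

// Hypothetical cancellable task that records why it was cancelled.
final class CancellableTaskSketch {
    private volatile String reason; // null until the task is cancelled

    void cancel(String reason) {
        this.reason = reason;
    }

    boolean isCancelled() {
        return reason != null;
    }

    // Single place where the exception is built, so the reason is always included.
    void ensureNotCancelled() {
        String currentReason = reason;
        if (currentReason != null) {
            throw new CancelledSketchException("task cancelled [" + currentReason + "]");
        }
    }
}
```

Callers check `ensureNotCancelled()` at convenient points instead of constructing their own exceptions, so every message carries the recorded reason.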
David Turner 98504ea258
Drop ReceiveTimeoutTransportException stack trace (#75671)
We only create a `ReceiveTimeoutTransportException` in one place, the
timeout handler for the corresponding transport request, so the stack
trace contains no useful information and just adds noise if ever it is
logged. With this commit we drop the stack trace from these exceptions.
2021-07-26 09:24:55 +01:00
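One standard way to drop the stack trace from an exception, as #75671 does, is to override `fillInStackTrace()` so that nothing is captured. The class below is a hypothetical stand-in for illustration; whether the actual commit uses exactly this mechanism is an assumption.

```java
// Hypothetical timeout exception that skips stack trace capture.
final class TimeoutSketchException extends RuntimeException {

    TimeoutSketchException(String message) {
        super(message);
    }

    // The Throwable constructor calls this method; returning early means no frames
    // are recorded, which also makes the exception cheaper to create.
    @Override
    public Throwable fillInStackTrace() {
        return this;
    }
}
```

When such an exception is logged, only its class name and message appear, without the frames of the timeout handler that created it.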
Alan Woodward ffcaffc29f
Handle runtime subfields when shadowing dynamic mappings (#75595)
In #75454 we changed our dynamic shadowing logic to check that an unmapped
field was truly shadowed by a runtime field before returning no-op mappers. However,
this does not handle the case where the runtime field can have multiple subfields, as
will be true for the upcoming composite field type. We instead need to check that
the field in question would not be shadowed by any field type returned by any
runtime field.

This commit abstracts this logic into a new isShadowed() method on
DocumentParserContext, which uses a set of runtime field type names built from
the mapping lookup at construction time. It also simplifies the no-op mapper
slightly by making it a singleton object, as we don't need to preserve field names
here.
2021-07-22 13:15:07 +01:00
Francisco Fernández Castaño 90148fe31e
Add the ability to fetch the latest successful shard snapshot (#75080)
This commit adds a new master transport action TransportGetShardSnapshotAction
that allows getting the last successful snapshot for a particular
shard in a set of repositories. It deals with the different
implementation details around BwC for repositories.

Relates #73496
2021-07-22 13:55:32 +02:00