Commit Graph

3207 Commits

Author SHA1 Message Date
Alan Woodward 29ee4202a2
Make NestedObjectMapper its own class (#73058)
Nested objects are implemented via a Nested class directly on object mappers,
even though nested and non-nested objects have quite different semantics. In
addition, most call-sites that need to get an object mapper in fact need a nested
object mapper. To make it clearer that nested and object mappers are different
beasts with different implementations and different requirements, we should
split them into different classes.
2021-06-07 14:36:44 +01:00
Tanguy Leroux 0061823c30
Ignore 404-Not Found exceptions when cleaning up resources after tests (#73753)
We're doing some clean up logic to delete indices, data streams, 
auto-follow patterns or searchable snapshot indices in some test 
classes after a test case is executed. Today we either fail or log 
a warning if the clean up failed but I think we should simply 
ignore the 404 - Not Found response exception, like we do in 
other places for regular indices.

Note that this change applies only:
- when cleaning up searchable snapshots indices in ESRestTestCase
- when cleaning up indices, data streams and auto-follow pattern in AutoFollowIT
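
The cleanup behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in ESRestTestCase: `ResponseError` and `delete_ignoring_missing` are hypothetical names, and the `delete` callable stands in for whatever client call removes the resource.

```python
class ResponseError(Exception):
    """Hypothetical stand-in for a REST client's error type."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def delete_ignoring_missing(delete, name):
    """Call delete(name); treat a 404 as 'already gone' and return False."""
    try:
        delete(name)
        return True
    except ResponseError as e:
        if e.status == 404:
            return False  # resource already absent: nothing to clean up
        raise  # any other failure is still reported to the caller
```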
2021-06-04 12:45:51 +02:00
Przemyslaw Gomulka aba2282511
Change year max digits for strict_date_optional_time and date_optional_time (#73034)
We changed the default joda behaviour in strict_date_optional_time to
max 4 digits in a year. The java.time implementation should behave the same way.
At the same time, date_optional_time should allow 9 digits for the year part.

closes #52396
closes #72191
2021-06-04 09:35:07 +02:00
William Brafford 1c295a92d8
Add threadpool for critical operations on system indices (#72625)
* Add new thread pool for critical operations
* Split critical thread pool into read and write
* Add POJO to hold thread pool names
* Add tests for critical thread pools
* Add thread pools to data streams
* Update settings for security plugin
* Retrieve ExecutorSelector from SystemIndices where possible
* Use a singleton ExecutorSelector
2021-06-03 12:07:37 -04:00
Luca Cavanna 05ca9cf876
Remove getMatchingFieldTypes method (#73655)
FieldTypeLookup and MappingLookup expose the getMatchingFieldTypes method to look up matching field type by a string pattern. We have migrated ExistsQueryBuilder to instead rely on getMatchingFieldNames, hence we can go ahead and remove the remaining usages and the method itself.

The remaining usages are to find specific field types from the mappings, specifically to eagerly load global ordinals and for the join field type. These are operations that are performed only once when loading the mappings, and may be refactored to work differently in the future. For now, we remove getMatchingFieldTypes and rather call for the two mentioned scenarios getMatchingFieldNames(*) and then getFieldType for each of the returned field name. This is a bit wasteful but performance can be sacrificed for these scenarios in favour of less code to maintain.
2021-06-03 10:01:22 +02:00
Mark Vieira 0cdb748242
Improve error message when rest api specs are missing from classpath (#73640)
2021-06-02 09:05:14 -07:00
Przemyslaw Gomulka 6d34a38cb1
Fix EnsureNoWarning assertion (#73647)
The EnsureNoWarnings method should assert that there are no warnings other
than the allowed "predefined" warnings in the filteredWarnings() method

bug introduced in #71207
2021-06-02 17:55:14 +02:00
Nik Everett 4b5aebe8b0
Add setting to disable aggs optimization (#73620)
Sometimes our fancy "run this agg as a Query" optimizations end up
slower than running the aggregation in the old way. We know that and use
heuristics to disable the optimization in that case. But it turns out
that the process of running the heuristics itself can be slow, depending
on the query. Worse, changing the heuristics requires an upgrade, which
means waiting. If the heuristics make a terrible choice folks need a
quick way out. This adds such a way: a cluster level setting that
contains a list of queries that are considered "too expensive" to try
and optimize. If the top level query contains any of those queries we'll
disable the "run as Query" optimization.

The default for this setting is wildcard and term-in-set queries, which
is fairly conservative. There are certainly wildcard and term-in-set
queries that the optimization works well with, but there are other queries
of that type that it works very badly with. So we're being careful.

Better, you can modify this setting in a running cluster to disable the
optimization if we find a new type of query that doesn't work well.
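
As a rough sketch of the check described above (not the actual implementation): the `Query` tree, the kind names `wildcard` and `term_in_set`, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical query tree node: a kind plus nested sub-queries."""
    kind: str
    children: list = field(default_factory=list)


def contains_expensive(query, expensive):
    """True if the query or any nested sub-query has an expensive kind."""
    if query.kind in expensive:
        return True
    return any(contains_expensive(c, expensive) for c in query.children)


def should_optimize(query, expensive=frozenset({"wildcard", "term_in_set"})):
    """Skip the 'run as Query' optimization when an expensive query is present."""
    return not contains_expensive(query, expensive)
```

Because the set of expensive kinds is just data, it can be changed at runtime without touching the heuristics themselves, which is the escape hatch the message describes.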

Closes #73426
2021-06-02 09:12:54 -04:00
Tanguy Leroux 4927b6917d
Delete mounted indices after test case in ESRestTestCase (#73650)
This commit adds some clean up logic to ESRestTestCase so 
that searchable snapshots indices are deleted after test case 
executions, before the snapshot and repositories are wiped out.

Backport of #73555
2021-06-02 15:06:44 +02:00
Lee Hinman 3d80e77ffa
Add `data-streams-mappings` to isXPackTemplate method (#73633)
This template was added in #64978, however, there can be some test failures if we try to remove
built-in templates. It was missing from the list and now needs to be added back.
2021-06-01 16:25:56 -06:00
Martijn van Groningen afc17bdb74
Add support for is_write_index flag to data stream aliases. (#73462)
This allows indexing documents into a data stream alias.
The ingestion is then forwarded to the write index of the data stream
that is marked as the write data stream.
The `is_write_index` parameter can be used to indicate what the write data stream is,
when updating / adding a data stream alias.

Relates to #66163
2021-05-31 15:08:39 +02:00
Nik Everett 6b991c574a
Test: Use hamcrest for MatchAssertion (#72928)
Ever since I wrote `NotEqualsMessageBuilder` I've thought to myself
"if this were a hamcrest matcher we could use it everywhere and get
nicer error messages." A few weeks ago I finally built a work-alike
hamcrest matcher that I think produces better error messages. This plugs
that matcher into the `MatchAssertion` used by our yaml and docs tests.
2021-05-24 14:14:12 -04:00
Nhat Nguyen 1764e8ba15
Upgrade to Lucene-8.9.0-SNAPSHOT-efdc43fee18 (#73130)
Upgrades to Lucene-8.9 snapshot which includes:

- LUCENE-9507: Custom order for leaves (/cc @mayya-sharipova)
- LUCENE-9935: Enable bulk merge for stored fields with index sort
2021-05-17 09:37:20 -04:00
Alan Woodward 3bd594ebe8
Replace simpleMatchToFullName (#72674)
MappingLookup has a method simpleMatchToFullName that attempts
to return all field names that match a given pattern; if no patterns match,
then it returns a single-valued collection containing just the pattern that
was originally passed in. This is a fairly confusing semantic.

This PR replaces simpleMatchToFullName with two new methods:

* getMatchingFieldNames(), which returns a set of all mapped field names
  that match a pattern. Calling getFieldType() with a name returned by
  this method is guaranteed to return a non-null MappedFieldType
* getMatchingFieldTypes(), which returns a collection of all MappedFieldTypes
  in a mapping that match the passed-in pattern.

This allows us to clean up several call-sites because we know that
MappedFieldTypes returned from these calls will never be null. It also
simplifies object field exists query construction.
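
The difference between the old and new semantics can be illustrated with a small sketch. It works on plain lists of field names rather than a real MappingLookup, the function names are Python renderings chosen for illustration, and `fnmatchcase` is used as a stand-in for Elasticsearch's simple pattern matching (it accepts more syntax than `*`).

```python
from fnmatch import fnmatchcase


def simple_match_to_full_name(field_names, pattern):
    """Old semantic: fall back to the pattern itself when nothing matches."""
    matches = {f for f in field_names if fnmatchcase(f, pattern)}
    return matches or {pattern}


def get_matching_field_names(field_names, pattern):
    """New semantic: only mapped field names, possibly an empty set."""
    return {f for f in field_names if fnmatchcase(f, pattern)}
```

With the old behaviour a caller cannot tell a real field called `missing*` from an unmatched pattern; with the new one every returned name is known to be mapped.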
2021-05-13 11:35:23 +01:00
Armin Braun 3dff3a48af
Allow some Repository Settings to be Updated Dynamically (#72543)
This commit serves two purposes. For one, we need the ability to dynamically
update a repository setting for the encrypted repository work.

Also, this allows dynamically updating repository rate limits while snapshots are
in progress. This has often been an issue in the past where a long running snapshot
made progress over a long period of time already but is going too slowly with the
current rate limit. This left no good options, either throw away the existing
partly done snapshot's work and recreate the repo with a higher rate limit to speed
things up or wait for a long time with the current rate limit.
With this change the rate limit can simply be increased while a snapshot or restore
is running and will take effect immediately.
2021-05-11 19:56:00 +02:00
Martijn van Groningen 6689b8bf1c
Add basic alias support for data streams (#72613)
Aliases to data streams can be defined via the existing update aliases api.
Aliases can either only refer to data streams or to indices (not both).
Also the existing get aliases api has been modified to support returning
aliases that refer to data streams.

Aliases for data streams are stored separately from data streams and
refer to data streams by name and not to the backing indices of
a data stream. This means that when backing indices are added or removed
from a data stream, the data stream alias doesn't need to be
updated.

The authorization model for aliases that refer to data streams is the
same as for aliases that refer to indices. In security, privileges can
be defined on aliases, indices and data streams. When a privilege is
granted on an alias then access is also granted on the indices that
an alias refers to (regardless of whether privileges are granted or denied
on the actual indices). The same will apply for aliases that refer
to data streams. See for more details:
https://github.com/elastic/elasticsearch/issues/66163#issuecomment-824709767

Relates to #66163
2021-05-11 09:51:05 +02:00
Armin Braun 52e7b926a9
Make Large Bulk Snapshot Deletes more Memory Efficient (#72788)
Use an iterator instead of a list when passing around what to delete.
In the case of very large deletes the iterator is much smaller than
the actual list of files to delete (since we save all the prefixes
which adds up if the individual shard folders contain lots of deletes).
Also this commit as a side-effect adjusts a few spots in logging where the
log messages could be catastrophic in size when trace logging is activated.
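
A minimal sketch of the idea, assuming blobs are grouped by a shared shard prefix; `blobs_to_delete`, `delete_in_batches` and the batch size are illustrative names and values, not the repository API.

```python
def blobs_to_delete(shard_prefix_to_names):
    """Lazily yield full blob paths; the full list is never materialized."""
    for prefix, names in shard_prefix_to_names.items():
        for name in names:
            yield prefix + name  # the path string exists only while in use


def delete_in_batches(paths, delete_batch, batch_size=1000):
    """Consume the iterator, holding at most batch_size paths in memory."""
    batch = []
    for path in paths:
        batch.append(path)
        if len(batch) == batch_size:
            delete_batch(batch)
            batch = []
    if batch:
        delete_batch(batch)
```

Only the prefixes and short names are held for the whole operation; the repeated prefix is attached one path at a time.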
2021-05-10 13:40:57 +02:00
Armin Braun bef9dab643
Cleanup BlobPath Class (#72860)
There should be a singleton for the empty version of this.
All the copying to `String[]` or use as an iterator make
no sense either when we can just use the list outright.
2021-05-10 00:10:39 +02:00
Jason Tedor 8b4b2f9534
Remove bootstrap.system_call_filter setting (#72848)
This commit removes the bootstrap.system_call_filter setting, as
starting in Elasticsearch 8.0.0 we are going to require that system call
filters be installed and that this is not user configurable. Note that
while we force bootstrap to attempt to install system call filters, we
only enforce that they are installed via a bootstrap check in production
environments. We can consider changing this behavior, but leave that for
future consideration and thus a potential follow-up change.
2021-05-07 18:46:27 -04:00
Gordon Brown 1d85cb6481
Improve cleanup of Node Shutdown in tests (#72772)
Makes the following changes:
 - The node shutdown feature flag isn't set on the test runner, only the
   cluster JVMs, so we can't use it to check here. Instead, the cleanup
   now infers whether it's enabled from the shape of the first
   GET `_nodes/shutdown` response.
 - Now uses `adminClient()` instead of `client()`
 - Removes the unnecessary `instanceof` check, which was *not* due to parsing,
   but the fact that `nodes` is indeed a map if the feature flag isn't enabled.
2021-05-06 10:15:00 -06:00
Rene Groeschke e609e07cfe
Remove internal build logic from public build tool plugins (#72470)
Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic

This includes a refactoring of ElasticsearchDistribution to handle types
better in a way we can differentiate between supported Elasticsearch
Distribution types supported in TestClustersPlugin and types only supported
in internal plugins.

It also introduces a set of internal versions of public plugins.

As part of this we also generate the plugin descriptors now.

As a follow-up, we can move these publicly used classes into
an extra project (declared as included build)

We keep LoggedExec and VersionProperties effectively public, with a workaround for RestTestBase.
2021-05-06 14:02:35 +02:00
Jim Ferenczi 051bbb2238
Fix early termination of search request with sort optimization (#72683)
The query phase applies an optimization when sorting by a numeric field.
This optimization doesn't handle early termination correctly when `timeout`
and/or `terminate_after` are used. An IAE exception is thrown at the shard
level when the timeout is reached.
This commit fixes the bug, early terminated exceptions are correctly caught
and the result is computed from the documents that the shard was able to collect
before the termination.

Closes #72661
2021-05-06 09:47:47 +02:00
Jim Ferenczi eb8d7e2aaf
Add a test module to simulate errors and warnings in search requests (#71674)
This change adds a test module called `error-query` that exposes a
query builder to simulate errors and warnings on shard search request.
The query accepts a list of indices and shard ids where errors or warnings
should be reported:
```
POST test*/_search
{
    "query": {
        "error_query": {
            "indices": [
                {
                    "name": "test_exception",
                    "shard_ids": [1],
                    "error_type": "exception",
                    "message": "boom"
                },
                {
                    "name": "test_warn*",
                    "error_type": "warning",
                    "message": "Watch out!"
                }
            ]
        }
    }
}
```

The `error_type` can be set to `exception` or `warning` and the `name` accepts
simple patterns, aliases and fully qualified index name if the search targets remote shards.

This module is published only within snapshots like the other test modules.

Relates #70784
2021-05-06 09:42:08 +02:00
Armin Braun 0220dfb3fe
Dry up Hashing BytesReference (#72443)
Dries up the efficient way to hash a bytes reference and makes use
of it in a few other spots that were needlessly copying all bytes in
the bytes reference for hashing.
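
The general technique can be sketched as follows, assuming the bytes reference is made up of several underlying slices. SHA-256 and the name `hash_slices` are assumptions for illustration; the message does not say which hash is used.

```python
import hashlib


def hash_slices(slices):
    """Feed each slice to the digest in turn, without concatenating first."""
    digest = hashlib.sha256()
    for chunk in slices:
        digest.update(chunk)  # no intermediate copy of the whole payload
    return digest.hexdigest()
```

The result is identical to hashing the concatenated bytes, but no single contiguous copy is ever allocated.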
2021-05-06 06:32:52 +02:00
Gordon Brown 9ce7a5a80b
Clean up Node Shutdown metadata in test cleanup (#72726)
This commit ensures that node shutdown metadata is cleaned up between
tests, as it causes unrelated tests to fail if a test leaves node
shutdown metadata in place.
2021-05-05 10:44:57 -06:00
Nhat Nguyen 80a5f3ac0d
Remove TombstoneDocSupplier from EngineConfig (#72593)
With #72251, we can create delete and noop tombstones directly.

Relates #72251
2021-05-05 12:00:37 -04:00
Armin Braun 70f1e8c33d
Make GetSnapshotsAction Cancellable (#72644)
If this runs needlessly for large repositories (especially in timeout/retry situations)
it's a significant memory+cpu hit => made it cancellable like we recently did for many
other endpoints.
2021-05-04 18:05:31 +02:00
Luca Cavanna 52b0d8ea37
Remove DocumentMapperForType (#72616)
DocumentMapperForType is used to create a document mapper when no mappings exists for an index and we are indexing the first document in it. This is only to cover for the edge case of empty docs, without any fields to dynamically map, being indexed, as we need to ensure that any index with at least one document in it has some mappings.

We can replace using DocumentMapperForType with the same logic that MapperService#documentMapperWithAutoCreate includes. This also helps clean up the only case where we create a DocumentMapper from its public constructor, which can be removed and replaced by a more targeted static method.
2021-05-04 11:56:50 +02:00
Luca Cavanna b92b9d1c94
Replace some DocumentMapper usages with MappingLookup (#72400)
We recently replaced some usages of DocumentMapper with MappingLookup in the search layer, as document mapper is mutable which can cause issues. In order to do that, MappingLookup grew and became quite similar to DocumentMapper in what it does and holds.

In many cases it makes sense to use MappingLookup instead of DocumentMapper, and we may even be able to remove DocumentMapper entirely in favour of MappingLookup in the long run.

This commit replaces some of its straight-forward usages.
2021-05-03 09:42:37 +02:00
Armin Braun 6778020301
Use Leak Tracking Infrastructure in MockPageCacheRecycler (#72477)
The leak tracking can be run for every test while the existing solution would only work
with a very limited set of tests giving us no coverage on pages that weren't acquired through
the mock transport service.
2021-04-30 21:52:20 +02:00
Luca Cavanna cdf1fc3394
Consolidate and clarify MappingLookup semantics (#72557)
MappingLookup has been introduced to expose a snapshot of the mappings to the search layer. We have been using it more and more over time as it is convenient and always non null.

This commit documents some of its semantics and makes it easier to trace when it is created with limited functionalities (without a document parser, index settings and index analyzers).
2021-04-30 16:58:47 +02:00
Alan Woodward 7d812b9f78
Handle tombstone building entirely within ParsedDocument (#72251)
DocumentMapper contains some complicated logic to load
metadata fields so that it can build tombstone documents.
However, we only actually need three metadata mappers for
this purpose, and they are all stateless so this logic is
unnecessary. This commit adds two new static methods to
ParsedDocument to build no-op and delete tombstones,
and removes some ceremony elsewhere.
2021-04-30 12:20:53 +01:00
Ignacio Vera 4fff3788f3
Disallow creating geo_shape mappings with deprecated parameters (#70850)
With the introduction of BKD-based geo shape indexing in #32039, the prefix tree indexing method has 
been deprecated. From 8.0.0, it will not be allowed to create new mappings using deprecated parameters.
2021-04-30 11:08:58 +02:00
Alan Woodward 009f23e7a9
Explicitly say if stored fields aren't supported in MapperTestCase (#72474)
MapperTestCase has a check that if a field mapper supports stored fields,
those stored fields are available to index time scripts. Many of our mappers
do not support stored fields, and we try and catch this with an assumeFalse
so that those mappers do not run this test. However, this test is fragile - it
does not work for mappers created with an index version below 8.0, and it
misses mappers that always store their values, e.g. match_only_text.

This commit adds a new supportsStoredField method to MapperTestCase,
and overrides it for those mappers that do not support storing values. It
also adds a minimalStoredMapping method that defaults to the minimal
mapping plus a store parameter, which is overridden by match_only_text
because storing is not configurable and always available on this mapper.
2021-04-30 08:59:56 +01:00
Ryan Ernst 6a7298e555
Make NodeEnvironment.nodeDataPaths singular (#72432)
This commit renames the nodeDataPaths method to be singular and return a
single Path instead of an array. This is done in isolation from other
NodeEnvironment methods to make it reviewable.

relates #71205
2021-04-29 14:40:26 -07:00
Francisco Fernández Castaño 4e9f9ec64c
Add support for Rest XPackUsage task cancellation (#72304)
2021-04-28 18:16:31 +02:00
David Turner f72fa49749
Fix S3HttpHandler chunked-encoding handling (#72378)
The `S3HttpHandler` reads the contents of the uploaded blob, but if the
upload used chunked encoding then the reader would skip one or more
`\r\n` sequences if they appeared at the start of a chunk.

This commit reworks the reader to be stricter about its interpretation
of chunks, and removes some indirection via streams since we can work
pretty much entirely on the underlying `BytesReference` instead.
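
To illustrate what a strict interpretation of chunks means, here is a sketch of a decoder for standard HTTP chunked encoding. It is not the `S3HttpHandler` code; `decode_chunked` is a hypothetical name, chunk extensions are simply dropped, and trailers are ignored.

```python
def decode_chunked(body: bytes) -> bytes:
    """Strictly decode an HTTP chunked body; raise ValueError if malformed."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.find(b"\r\n", pos)
        if eol < 0:
            raise ValueError("missing CRLF after chunk size")
        size_field = body[pos:eol].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field, 16)
        if size < 0:
            raise ValueError("negative chunk size")
        pos = eol + 2
        if size == 0:
            return bytes(out)  # last chunk; trailers are ignored in this sketch
        chunk = body[pos:pos + size]
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        out += chunk  # copied verbatim, even if it starts with CRLF
        pos += size
        if body[pos:pos + 2] != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
        pos += 2
```

Because the chunk length alone decides where the data ends, a `\r\n` at the start of a chunk is treated as payload rather than skipped as a separator.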

Closes #72358
2021-04-28 15:13:48 +01:00
David Turner 01aad86d04
Remove spurious docker volume from S3 fixture (#72388)
2021-04-28 15:11:31 +01:00
Ryan Ernst b1eab79f4c
Make Environment.dataFiles singular (#72327)
The path.data setting is now a singular string, but the method
dataFiles() that gives access to the path is still an array. This commit
renames the method and makes the return type a single Path.

relates #71205
2021-04-27 19:48:29 -07:00
Nik Everett 5f281ceedd
Prevent `date_histogram` from OOMing (#72081)
This prevents the `date_histogram` from running out of memory allocating
empty buckets when you set the interval to something tiny like `seconds`
and aggregate over a very wide date range. Without this change we'd
allocate memory very quickly and throw an out of memory error, taking
down the node. With it we instead throw the standard "too many buckets"
error.
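
One way to get this behaviour is to count the buckets a range would need before allocating any of them. The sketch below assumes fixed-width intervals in milliseconds; `TooManyBuckets`, `check_bucket_count` and the `max_buckets` parameter are illustrative names, and the commit may implement the check differently.

```python
class TooManyBuckets(Exception):
    """Hypothetical stand-in for the standard "too many buckets" error."""


def check_bucket_count(min_ms, max_ms, interval_ms, max_buckets):
    """Count the buckets a range needs before allocating any of them."""
    needed = (max_ms - min_ms) // interval_ms + 1
    if needed > max_buckets:
        raise TooManyBuckets(f"{needed} buckets exceeds limit {max_buckets}")
    return needed
```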

Relates to #71758
2021-04-27 14:41:52 -04:00
Ryan Ernst d933ecd26c
Convert path.data to String setting instead of List (#72282)
Since multiple data path support has been removed, the Setting no longer
needs to support multiple values. This commit converts the
PATH_DATA_SETTING to a String setting from List<String>.

relates #71205
2021-04-27 08:29:12 -07:00
bellengao eaa59fbc41
Enhance and add more tests for ResizeRequest (#68502)
2021-04-26 15:01:47 -04:00
Armin Braun 47c77160ef
Fix ListenableFuture Resolving Listeners under Mutex (#71943) (#72087)
We shouldn't loop over the listeners under the mutex in `done` since in most use-cases we used `DirectExecutorService`
with this class.
Also, no need to create an `AbstractRunnable` for direct execution. We use this listener on the hot path in authentication
making this a worthwhile optimization I think.
Lastly, no need to clear and thus loop over `listeners`, the list is not used again after the `done` call returns anyway
so no point in retaining it at all (especially when in a number of use cases we add listeners only after the `done` call
so we can also save the instantiation by making the field non-final).
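
The pattern described above, sketched in Python rather than Java: state changes happen under the mutex, while the listeners themselves run after it is released. `ListenableResult` is a hypothetical class written for illustration, not a port of `ListenableFuture`.

```python
import threading


class ListenableResult:
    def __init__(self):
        self._lock = threading.Lock()
        self._listeners = []  # dropped (set to None) once done
        self._done = False
        self._value = None

    def add_listener(self, listener):
        with self._lock:
            if not self._done:
                self._listeners.append(listener)
                return
        listener(self._value)  # already done: notify directly, lock released

    def done(self, value):
        with self._lock:
            if self._done:
                return
            self._done = True
            self._value = value
            # Take the list and drop the reference instead of clearing it.
            pending, self._listeners = self._listeners, None
        for listener in pending:  # callbacks run without holding the lock
            listener(value)
```

A listener that calls back into `add_listener` would deadlock on a non-reentrant lock if callbacks ran under the mutex; here it simply runs.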
2021-04-26 19:33:34 +02:00
Alan Woodward 2560798488
Remove MapperService.simpleMatchToFullname() (#72244)
This just delegates to mappingLookup().simpleMatchToFullName(), and
is only called in two places.
2021-04-26 16:54:23 +01:00
Rene Groeschke 5bcd02cb4d
Restructure build tools java packages (#72030)
Related to #71593 we move all build logic that is for elasticsearch build only into
the org.elasticsearch.gradle.internal* packages

This makes it clearer if build logic is considered to be used by external projects
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.

This is a very first step towards that direction.
2021-04-26 14:53:55 +02:00
Ryan Ernst 6aa0735177
Fail when using multiple data paths (#72184)
This commit converts the deprecation messages for multiple data paths
into errors. It effectively removes support for multiple data paths.

relates #71205
2021-04-24 15:45:27 -07:00
Igor Motov 50d0ebb50e
Fix close_to assertion (#72187)
Fixes the assertion to actually assert and adds a test to check that it actually does that.
2021-04-23 15:58:07 -10:00
Julie Tibshirani fdf254335f
Remove more references to query_and_fetch. (#71988)
This search type was deleted several releases ago.
2021-04-23 09:19:57 -07:00
Nik Everett 39fee5e908
Fix composite early termination on sorted (#72101)
I broke composite early termination when reworking aggregations'
contract for `getLeafCollector` around early termination in #70320. We
didn't see it in our tests because we weren't properly emulating the
aggregation collection stage. This fixes early termination by adhering
to the new contract and adds more tests.

Closes #72078

Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
2021-04-22 14:32:26 -04:00
Armin Braun ede947fdd8
Refactor Repository#snapshotShard (#72083)
Create a class for holding the large number of arguments to this method
and to dry up resource handling across snapshot shard service and the
source-only repository.
2021-04-22 16:42:31 +02:00
Luca Cavanna 1d514c53cb
Remove MapperService#parse method (#72080)
We have recently split DocumentMapper creation from parsing Mapping. There was one method leftover that exposed parsing mapping into DocumentMapper, which is generally not needed. Either you only need to parse into a Mapping instance, which is more lightweight, or like in some tests you need to apply a mapping update for which you merge new mappings and get the resulting document mapper. This commit addresses this and removes the method.
2021-04-22 16:08:34 +02:00
David Turner aa5d1948ba
Introduce RepositoryData.SnapshotDetails (#71826)
Today we track a couple of values for each snapshot in the top-level
`RepositoryData` blob: the overall snapshot state and the version of
Elasticsearch which took the snapshot. In the blob these values are
fields of the corresponding snapshot object, but in code they're kept in
independent maps. In the near future we would like to track some more
values in the same way, but adding a new field for every tracked value
is a little ugly. This commit refactors these values into a single
object, `SnapshotDetails`, so that it can be more easily extended in
future.
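
The shape of this refactoring, sketched with a Python dataclass: two parallel maps become one map of value objects. The field names, the example values and `merge_details` are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotDetails:
    """One value object per snapshot; new tracked values become new fields."""
    state: str
    version: str


def merge_details(states, versions):
    """Combine two independent per-snapshot maps into a single map."""
    return {sid: SnapshotDetails(states[sid], versions[sid]) for sid in states}
```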
2021-04-22 14:48:39 +01:00
Armin Braun 5f69ee3fbb
Ensure GCS Repository Metadata Blob Writes are Atomic (#72051)
In the corner case of uploading a large (>5MB) metadata blob we did not set content validation
requirement on the upload request (we automatically have it for smaller requests that are not resumable
uploads). This change sets the relevant request option to enforce a MD5 hash check when writing
`BytesReference` to GCS (as is the case with all but data blob writes)

closes #72018
2021-04-22 11:27:53 +02:00
Adrien Grand 83113ec8d3
Add `match_only_text`, a space-efficient variant of `text`. (#66172)
This adds a new `match_only_text` field, which indexes the same data as a `text`
field that has `index_options: docs` and `norms: false` and uses the `_source`
for positional queries like `match_phrase`. Unlike `text`, this field doesn't
support scoring.
2021-04-22 08:41:47 +02:00
Alan Woodward 72f9c4c122
Add null-field checks to shape field mappers (#71999)
#71696 introduced a regression to the various shape field mappers,
where they would no longer handle null values. This commit fixes
that regression and adds a testNullValues method to MapperTestCase
to ensure that all field mappers correctly handle nulls.

Fixes #71874
2021-04-21 15:54:22 +01:00
Hendrik Muhs a96cad4137
re-factor top-metrics to use a generic multi value output interface (#71903)
re-factor top-metrics to return generic(non-numeric) multi value
aggregation results
2021-04-21 16:12:11 +02:00
Jay Modi a7dbb31765
Add Fleet action results system data stream (#71667)
This commit adds support for system data streams and also the first use
of a system data stream with the fleet action results data stream. A
system data stream is one that is used to store system data that users
should not interact with directly. Elasticsearch will manage these data
streams. REST API access is available for external system data streams
so that other stack components can store system data within a system
data stream. System data streams will not use the system index read and
write threadpools.
2021-04-20 13:33:12 -06:00
Nik Everett 6e0e6255a5
Remove some extra reproduce info (#71706)
This drops a few properties from the reproduction info printed when a
test fails because it is implied by the build:
* `tests.security.manager`
* `tests.rest.suite`
* `tests.rest.blacklist`

The two `tests.rest` properties are set by the build *and* duplicate the
`--test` output!

Closes #71290
2021-04-20 08:34:47 -04:00
Henning Andersen 9d6ce2c8d6
Frozen autoscaling decider based on storage pct (#71756)
The frozen tier partially downloads shards only. This commit
introduces an autoscaling decider that scales the total storage
on the tier according to a configurable percentage relative to
the total data set size.
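
The arithmetic behind such a decider can be sketched as below. The function name and the integer-percentage parameter are hypothetical; the real decider's setting name and rounding are not given in the message.

```python
def required_frozen_storage(total_data_set_bytes, percentage):
    """Storage the tier should have: a percentage of the total data set size."""
    return total_data_set_bytes * percentage // 100
```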
2021-04-20 14:09:07 +02:00
Luca Cavanna d8057bfe71
Rename on_script_error options to fail or continue (#71841)
As we started thinking about applying on_script_error to runtime fields, to handle script errors at search time, we would like to use the same parameter that was recently introduced for indexed fields. We decided that continue or fail gives a better indication of the behaviour compared to the current ignore or reject which is too specific to indexing documents.

This commit applies such rename.
2021-04-20 09:59:42 +02:00
Henning Andersen 4312bf31c9
Add force single data path option for integ tests (#71868)
Some functionality will no longer work with multiple data paths and in
order to run integration tests for that, we need the capability to
force a single data path for those tests.

Relates #71844
2021-04-20 08:18:28 +02:00
David Turner c8fb9aad40
Track index details in SnapshotInfo (#71754)
This commit adds some per-index statistics to the `SnapshotInfo` blob:

- number of shards
- total size in bytes
- maximum number of segments per shard

It also exposes these statistics in the get snapshot API.
2021-04-19 14:57:32 +01:00
Przemyslaw Gomulka 3ef5e4c6e7
[Rest Compatible Api] include_type_name parameter (#70966)
This commit allows using the include_type_name parameter with the compatible rest api.
The support for include_type_name was previously removed in #48632

relates #51816
types removal meta issue #54160
2021-04-19 15:21:24 +02:00
Dan Hermann eb345b2a8f
Deprecate legacy index template API endpoints (#71309)
2021-04-16 08:07:28 -05:00
Igor Motov 02eef40a45
Tests: add support for close_to assertion (#71590)
Adds support for close_to assertion to yaml tests. The assertion can be called
the following way:
```
  - close_to:   { get.fields._routing: { value: 5.1, error: 0.00001 } }
```
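
The check this assertion performs can be sketched as follows. Treating the bound as inclusive is an assumption, as is the function name.

```python
def close_to(actual, value, error):
    """True when actual lies within +/- error of the expected value."""
    return abs(actual - value) <= error
```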
Closes #71303
2021-04-15 17:11:37 -10:00
Julie Tibshirani a18a65565a
Fix SearchReplicaSelectionIT failures (#71507)
This PR makes sure MockSearchService collects ARS statistics. Before, if we
randomly chose to use MockSearchService then ARS information would be missing
and the test would fail.

Also makes the following fixes:
* Removes a test workaround for the bug #71022, which is now fixed.
* Handle the case where nodes have same rank, to prevent random failures.
2021-04-15 08:31:37 -07:00
Nik Everett 1d69985dc9
Speed up terms agg when not force merged (#71241)
This speeds up the `terms` aggregation when it can't take the fancy
`filters` path, there is more than one segment, and any of those
segments have only a single value for the field. These three things are
super common.

Here are the performance change numbers:
```
|        50th percentile latency | date-histo-string-terms-via-global-ords | 3414.02 | 2632.01 | -782.015 | ms |
|        90th percentile latency | date-histo-string-terms-via-global-ords | 3470.91 | 2756.88 | -714.031 | ms |
|       100th percentile latency | date-histo-string-terms-via-global-ords | 3620.89 | 2875.79 | -745.102 | ms |
|   50th percentile service time | date-histo-string-terms-via-global-ords | 3410.15 | 2628.87 | -781.275 | ms |
|   90th percentile service time | date-histo-string-terms-via-global-ords | 3467.36 | 2752.43 | -714.933 | ms |   20%!!!!
|  100th percentile service time | date-histo-string-terms-via-global-ords | 3617.71 | 2871.63 | -746.083 | ms |
```

This works by hooking global ordinals into `DocValues.unwrapSingleton`.
Without this you could unwrap singletons *if* the segment's ordinals
aligned exactly with the global ordinals. If they didn't we'd return a
doc values iterator that you can't unwrap. Even if the segment ordinals
were singletons.

That speeds up the terms aggregator because we have a fast path we can
take if we have singletons. It was previously only working if we had a
single segment. Or if the segment's ordinals lined up exactly. Which,
for low cardinality fields is fairly common. So they might not benefit
from this quite as much as high cardinality fields.

Closes #71086
2021-04-15 08:27:28 -04:00
Nik Everett 2d6f8d1e0c
Add integration tests for filters (#69439)
Revamps the integration tests for the `filter` agg to be more clear and
builds integration tests for the `filters` agg. Both of these
integration tests are fairly basic but they do assert that the aggs
work.
2021-04-14 16:54:23 -04:00
Julie Tibshirani 318bf14126
Introduce `combined_fields` query (#71213)
This PR introduces a new query called `combined_fields` for searching multiple
text fields. It takes a term-centric view, first analyzing the query string
into individual terms, then searching for each term in any of the fields as though
they were one combined field. It is based on Lucene's `CombinedFieldQuery`,
which takes a principled approach to scoring based on the BM25F formula.

This query provides an alternative to the `cross_fields` `multi_match` mode. It
has simpler behavior and a more robust approach to scoring.

Addresses #41106.
2021-04-14 13:33:19 -07:00
Jake Landis aaf1bb6400
Fix 4 path segment Window's REST test blacklist/repo line (#71660)
Normally there are only 3 parts to a YAML REST test
`api/name/test section name` where `api` is sourced 
from the filesystem, a relative path from the root of 
the tests. `name` is the filename of the test minus the `.yml` 
and the `test section name` is from inside the `.yml` file.

Some tests use multiple directories to represent the `api`
for example `foo/bar/10_basic/My test Name` where foo/bar is the 
relative path from the root of the tests. All works fine in both 
*nix and Windows. Except for when you need to reference that `api`
(aka path from root) under Windows. Under Windows that relative path 
uses backslashes to represent the `api`.  This means that under Windows
you need to use `foo\bar/10_basic/My test Name` to reproduce or execute a test. 
Additionally, due to how the regex matching is done for blacklisting tests
the backslash will never match, so it is not possible to 
blacklist a 4+ path YAML REST test for Windows. 

This commit simply ensures that the API part is always represented as a 
forward slash. This commit also removes a prior naive attempt to blacklist
on Windows. 
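
The normalization described above can be sketched roughly as follows. This is a minimal Python illustration rather than the project's actual Java change; the function name `normalize_test_path` and the sample blacklist pattern are hypothetical.

```python
import re


def normalize_test_path(api: str, name: str, section: str) -> str:
    """Build a test identifier whose api part always uses forward slashes."""
    # The api part is a relative filesystem path, so on Windows it can
    # contain backslashes; rewrite them before joining the three parts.
    return "/".join([api.replace("\\", "/"), name, section])


# Hypothetical blacklist entry, written with forward slashes only.
blacklist = re.compile(r"foo/bar/10_basic/My test Name")

windows_style = normalize_test_path("foo\\bar", "10_basic", "My test Name")
unix_style = normalize_test_path("foo/bar", "10_basic", "My test Name")
```

With the api part normalized, the same blacklist pattern matches the identifier regardless of the platform that produced it.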

closes #71475
2021-04-14 14:21:59 -05:00
Jason Tedor 6823b8eb5e
Remove the ability for plugins to add roles (#71527)
This commit removes the ability for plugins to add roles. Roles are
fairly tightly coupled with the behavior of the system (as evidenced by
the fact that some roles from the default distribution leaked behavior
into the OSS distribution). We previously had this plugin extension
point so that we could support a difference in the set of roles between
the OSS and default distributions. We no longer need to maintain that
differentiation, and can therefore remove this plugin extension
point. This was technical debt that we were willing to accept to allow
the default distribution to have additional roles, but now we no longer
need to be encumbered with that technical debt.
2021-04-13 22:53:05 -04:00
Alan Woodward 78c79134b9
Forbid setting copy_to on scripted field mappers (#71621)
copy_to is currently implemented at document parse time, and does not
work with values generated from index-time scripts. We may want to add
this functionality in future, but for now this commit ensures that we throw
an exception if copy_to and script are both set on a field mapper.
2021-04-13 20:37:17 +01:00
Lyudmila Fokina 3b0b7941ae
Warn users if security is implicitly disabled (#70114)
* Warn users if security is implicitly disabled

Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it can also lead to unintended insecure
 clusters.
 This change introduces clear warnings when security features are
 implicitly disabled.
 - a warning header in each REST response if security is implicitly
 disabled;
 - a log message during cluster boot.
2021-04-13 18:33:41 +02:00
Nik Everett 57e6c78a52
Fix profiled global agg (#71575)
This fixes the `global` aggregator when `profile` is enabled. It does so
by removing all of the special case handling for `global` aggs in
`AggregationPhase` and having the global aggregator itself perform the
scoped collection using the same trick that we use in filter-by-filter
mode of the `filters` aggregation.

Closes #71098
2021-04-13 08:36:51 -04:00
Nik Everett 3583ba0eb5
Tests for runtime field queries with fbf aggs (#71503)
This adds a few tests for runtime field queries applied to
"filter-by-filter" style aggregations. We expect to still be able to
use filter-by-filter aggregations to speed up collection when the top
level query is a runtime field. You'd think that filter-by-filter would
be slow when the top level query is slow, like it is with runtime
fields, but we only run filter-by-filter when we can translate each
aggregation bucket into a quick query. So long as the results of those
queries don't "overlap" we shouldn't end up running the slower top level
query more times than we would during regular collection.

This also adds some javadoc to that effect to the two places where we
choose between filter-by-filter and a "native" aggregation
implementation.
2021-04-12 15:25:10 -04:00
Alan Woodward 5e11709693
Add scripts to keyword field mapper (#71555)
This commit adds script and on_script_error parameters to
keyword field mappers, allowing you to define index-time scripts
for keyword fields.
2021-04-12 16:46:02 +01:00
Tanguy Leroux 8a0beceeec
Centralize Lucene files extensions in one place (#71416)
Elasticsearch enumerates Lucene files extensions for various 
purposes: grouping files in segment stats under a description, 
mapping files in memory through HybridDirectory or adjusting 
the caching strategy for Lucene files in searchable snapshots.

But when a new extension is handled somewhere(let's say, 
added to the list of files to mmap) it is easy to forget to add it 
in other places. This commit is an attempt to centralize in a 
single place all known Lucene files extensions in Elasticsearch.
2021-04-12 15:58:32 +02:00
Alan Woodward 08aa65d061
Disallow multifields on mappers with index-time scripts (#71558)
Multifields are built at the same time as their parent fields, using
a positioned xcontent parser to read information. Fields with index
time scripts are built entirely differently, and it does not make sense
to combine the two.

This commit adds a base test to MapperScriptTestCase that ensures
a field mapper defined with both multifields and a script parameter
throws a parse error.
2021-04-12 14:27:10 +01:00
Luca Cavanna 1469e18c98
Add support for script parameter to boolean field mapper (#71454)
Relates to #68984
2021-04-12 10:04:12 +02:00
Jason Tedor 60808e92c1
Move voting only role to server (#71473)
This commit moves the voting only role to server, as part of the effort
to remove the ability for plugins to add roles.
2021-04-09 10:13:53 -04:00
Nhat Nguyen 5c9969250d
Allow specify dynamic templates in bulk request (#69948)
This change allows users to specify dynamic templates in a bulk request.

```
PUT myindex
{
  "mappings": {
    "dynamic_templates": [{
      "time_histograms": {
        "mapping": {
          "type": "histogram",
          "meta": {
            "unit": "s"
          }
        }
      }
    }]
  }
}
```

```
POST myindex/_bulk
{ "index": { "dynamic_templates": { "response_times": "time_histograms" } } }
{ "@timestamp": "2020-08-12", "response_times": { "values": [1, 10], "counts": [5, 1] }}
```

Closes #61939
2021-04-08 12:44:36 -04:00
Przemko Robakowski 44a2ae4893
Add GeoIP CLI integration test (#71381)
This change adds an additional test to GeoIpDownloaderIT which tests that artifacts produced by the GeoIP CLI tool can be consumed by a cluster the same way as those from our original service.
It does so by running the tool from a fixture which then simply serves the generated files (this is exactly the way users are supposed to use the tool as well).

Relates to #68920
2021-04-08 12:49:29 +02:00
Alan Woodward af3f0e5069
Add MapperScriptTestCase (#71322)
When we added scripts to long and double mapped fields, we added tests
for the general scripting infrastructure, and also specific tests for those two
field types. This commit extracts those type-specific tests out into a new base
test class that we can use when adding scripts to more field mappers.
2021-04-08 11:55:08 +02:00
Nhat Nguyen bd124399c4
Ensure search contexts are released after tests (#71427)
These assertions are introduced in #71354
2021-04-07 14:08:24 -04:00
Yannick Welsch 801c50985c
Use default application credentials for GCS repositories (#71239)
Adds support for "Default Application Credentials" for GCS repositories, making it easier to set up a repository on GCP,
as all relevant information to connect to the repository is retrieved from the environment, not necessitating complicated
keystore setups.
2021-04-06 15:16:00 +02:00
Francisco Fernández Castaño e6894960f4
Include URLHttpClientIOException on URLBlobContainerRetriesTests testReadBlobWithReadTimeouts (#71318)
In some scenarios where the read timeout is too tight it's possible
that the http request times out before the response headers have
been received, in that case an URLHttpClientIOException is thrown.
This commit adds that exception type to the expected set of read timeout
exceptions.

Closes #70931
2021-04-06 14:58:57 +02:00
Christoph Büscher a07d876a93
Avoid duplicate values in MapperTestCase#testFetchMany (#71068)
The test currently generates a list of random values and checks whether
retrieval of these values via doc values is equivalent to fetching them with a
value fetcher from source. If the random value array contains a duplicate value,
we will only get one back via doc values, but fetching from source will return
both, which is a case we should probably avoid in this test.
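
The mismatch can be illustrated with a deliberately simplified model. This is a Python sketch, not the test's Java code; both helper functions are hypothetical stand-ins, and the source-side result is sorted here only to make the comparison order-independent.

```python
def fetch_from_doc_values(values):
    # Simplified model: doc values hand back each distinct value once.
    return sorted(set(values))


def fetch_from_source(values):
    # Simplified model: the source still holds every value that was sent,
    # duplicates included.
    return sorted(values)


with_duplicate = ["b", "a", "b"]
without_duplicate = ["b", "a", "c"]

mismatch = fetch_from_doc_values(with_duplicate) != fetch_from_source(with_duplicate)
agreement = fetch_from_doc_values(without_duplicate) == fetch_from_source(without_duplicate)
```

Generating only distinct random values keeps the two retrieval paths comparable, which is the approach the commit title describes.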

Closes #71053
2021-04-06 10:54:35 +02:00
Ryan Ernst 6cf4eb7273
Deprecate multiple path.data entries (#71207)
This commit adds a node level deprecation log message when multiple
data paths are specified.

relates #71205
2021-04-02 14:55:36 -07:00
Jason Tedor 32314493a2
Pass override settings when creating test cluster (#71203)
Today when creating an internal test cluster, we allow the test to
supply the node settings that are applied. The extension point to
provide these settings has a single integer parameter, indicating the
index (zero-based) of the node being constructed. This allows the test
to make some decisions about the settings to return, but it is too
simplistic. For example, imagine a test that wants to provide a setting,
but some values for that setting are not valid on non-data nodes. Since
the only information the test has about the node being constructed is
its index, it does not have sufficient information to determine if the
node being constructed is a non-data node or not, since this is done by
the test framework externally by overriding the final settings with
specific settings that dictate the roles of the node. This commit changes
the test framework so that the test has information about what settings
are going to be overridden by the test framework after the test provides
its test-specific settings. This allows the test to make informed
decisions about what values it can return to the test framework.
2021-04-02 10:20:36 -04:00
Yash Jipkate 60f4d22722
Change default value of `action.destructive_requires_name` to True. (#66908)
This PR sets the default value of `action.destructive_requires_name`
to `true`. Fixes #61074. Additionally, we set this value explicitly in
test classes that rely on wildcard deletions to clear test state.
2021-03-31 15:59:57 -04:00
Jason Tedor e119ac60d4
Move data tier roles to server (#71084)
This commit moves the data tier roles to server. It is no longer
necessary to separate these roles from server as we no longer build
distributions that would not contain these roles. Moving these roles
will simplify many things. This is deliberately the smallest possible
commit that moves these roles. Other aspects related to the data tiers
can move in separate, also small, commits.
2021-03-31 15:13:02 -04:00
Przemko Robakowski 61fe14565a
Add tool for preparing local GeoIp database service (#71018)
Air-gapped environments can't simply use the GeoIp database service provided by Infra, so they have to either use a proxy or recreate a similar service themselves.
This PR adds a tool to make this process easier. The basic workflow is:

1. download databases from the MaxMind site to a single directory (either .mmdb files or gzipped tarballs with a .tgz suffix)
2. run the tool with `$ES_PATH/bin/elasticsearch-geoip -s directory/to/use [-t target/directory]`
3. serve static files from that directory (for example with `docker run -v directory/to/use:/usr/share/nginx/html:ro nginx`)
4. use the server above as the endpoint for GeoIpDownloader (`geoip.downloader.endpoint` setting)
5. to update the databases, simply put new files in the directory and run the tool again

This change also adds support for relative paths in the overview json because the cli tool doesn't know about the address it would be served under.

Relates to #68920
2021-03-31 12:30:21 +02:00
Alan Woodward 1653f2fe91
Add script parameter to long and double field mappers (#69531)
This commit adds a script parameter to long and double fields that makes
it possible to calculate a value for these fields at index time. It uses the same
script context as the equivalent runtime fields, and allows for multiple index-time
scripted fields to cross-refer while still checking for indirection loops.
2021-03-31 11:14:11 +01:00
Nhat Nguyen 1edc7c6849 Mute testFetchMany
Tracked at #71053
2021-03-30 17:47:22 -04:00
Dan Hermann 2c6ba92d46
Improve data stream rollover and simplify cluster metadata validation for data streams (#70934) 2021-03-29 07:36:44 -05:00
Alan Woodward c475fd9e8a
Move runtime fields classes into common packages (#70965)
Runtime fields currently live in their own java package. This is really
a leftover from when they were in their own module; now that they are
in core they should instead live in the common packages for classes of
their kind.

This commit makes the following moves:
org.elasticsearch.runtimefields.mapper => org.elasticsearch.index.mapper
org.elasticsearch.runtimefields.fielddata => org.elasticsearch.index.fielddata
org.elasticsearch.runtimefields.query => org.elasticsearch.search.runtime

The XFieldScript fields are moved out of the `mapper` package into 
org.elasticsearch.scripts, and the `PARSE_FROM_SOURCE` default scripts
are moved from these Script classes directly into the field type classes that
use them.
2021-03-29 12:02:01 +01:00
Przemko Robakowski b025f51ece
Add support for .tgz files in GeoIpDownloader (#70725)
We have to ship COPYRIGHT.txt and LICENSE.txt files alongside .mmdb files for legal compliance. Infra will pack these in single .tgz (gzipped tar) archive provided by GeoIP databases service.
This change adds support for that format to GeoIpDownloader and DatabaseRegistry
2021-03-29 12:46:27 +02:00
Ignacio Vera a35563aaaf
Fix infinite loop when polygonizing a circle with centre on the pole (#70875)
This PR prevents the algorithm from running on circles that contain a pole.
2021-03-29 07:36:29 +02:00
Mark Vieira 6339691fe3
Consolidate REST API specifications and publish under Apache 2.0 license (#70036) 2021-03-26 16:20:14 -07:00
Francisco Fernández Castaño 3f8a9256ea
Add searchable snapshots integration tests for URL repositories (#70709)
Relates #69521
2021-03-26 15:23:44 +01:00
William Brafford 35af0bb47b
Don't use filesystem concat for resource paths in schema validation tests (#70596)
We use the `getDataPath` method to convert from a resource
location to a filesystem path in anticipation of eventually moving
the json files to a top-level directory. However, we were constructing
the resource locations using a filesystem concatenation, which,
on Windows, put backslashes in the path instead of slashes.
We will use a simple string concatenation to fix the Windows tests.
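
The difference between the two ways of joining can be shown with a small Python sketch. It uses `pathlib.PureWindowsPath` to emulate Windows path semantics on any platform; the helper names and the sample resource names are hypothetical and not taken from the test code.

```python
from pathlib import PureWindowsPath


def resource_via_filesystem_concat(base: str, name: str) -> str:
    # Joining through Windows filesystem semantics yields backslashes,
    # which are not valid separators in a resource location.
    return str(PureWindowsPath(base) / name)


def resource_via_string_concat(base: str, name: str) -> str:
    # Plain string concatenation keeps the forward slash on every platform.
    return base + "/" + name


broken = resource_via_filesystem_concat("schemas", "example.schema.json")
fixed = resource_via_string_concat("schemas", "example.schema.json")
```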
2021-03-26 10:13:18 -04:00
Dan Hermann 5077017034
Fix failing FullClusterRestartIT.testDataStreams test (#70845) 2021-03-25 06:53:48 -05:00
Nik Everett 91c700bd99
Super randomized tests for fetch fields API (#70278)
We've had a few bugs in the fields API where it doesn't behave like we'd
expect. Typically this happens because it isn't obvious what we expect. So
we'll try and use randomized testing to ferret out what we want. This adds
a test for most field types that asserts that `fields` works similarly
to `docvalues_fields`. We expect this to be true for most fields.

It does so by forcing all subclasses of `MapperTestCase` to define a
method that makes random values. It declares a few other hooks that
subclasses can override to further randomize the test.

We skip the test for a few field types that don't have doc values:
* `annotated_text`
* `completion`
* `search_as_you_type`
* `text`
We should come up with some way to test these without doc values, even
if it isn't as nice. But that is a problem for another time, I think.

We skip the test for a few more types just because I wanted to cut this
PR in half so we could get to reviewing it earlier. We'll get to those
in a follow up change.

I've filed a few bugs for things that are inconsistent with
`docvalues_fields`. Typically that means that we have to limit the
random values that we generate to those that *do* round trip properly.
2021-03-24 14:16:27 -04:00
Ignacio Vera afde502c14
Make sure forbidPrivateIndexSettings is kept during an internal cluster full restart (#70823) 2021-03-24 18:31:51 +01:00
Dan Hermann 7c3ebe220f
Remove obsolete BWC checks for data streams (#70777) 2021-03-24 07:22:40 -05:00
Tanguy Leroux efa6aea168
Prevent snapshot backed indices to be followed using CCR (#70580)
Today nothing prevents CCR's auto-follow patterns to pick 
up snapshot backed indices on a remote cluster. This can 
lead to various errors on the follower cluster that are not 
obvious to troubleshoot for a user (ex: multiple engine 
factories provided).

This commit adds verifications to CCR to make it fail faster 
when a user tries to follow an index that is backed by a 
snapshot, providing a more obvious error message.
2021-03-24 10:58:31 +01:00
Przemyslaw Gomulka e942873bd5
[REST Compatible API] Typed endpoints for Index and Get APIs (#69131)
The types removal effort has removed the type from Index API in #47671 and from Get API in #46587
This commit allows the use of 'typed' endpoints for both the Index and Get APIs

relates compatible types-removal meta issue #54160
2021-03-23 10:59:21 +01:00
Alan Woodward 7dd96ef68a
Add ScriptCompiler interface (#70657)
TypeParser.ParserContext exposes a ScriptService allowing type parsers to
compile scripts from runtime fields (and in future, index-time scripts). However,
ScriptService itself is a fairly heavyweight object, and you can only add script
parsers to it via the Plugin mechanism.

To make testing easier, this commit extracts a ScriptCompiler interface and makes
that available from ParserContext instead.
2021-03-23 08:58:00 +00:00
Benjamin Trent 86e1e25663
[ML][Transform] muting testSchema test on windows (#70634)
test mute for: #70532
2021-03-22 09:29:44 -04:00
Varsha Muzumdar 6cc0670bab
Enable compiler warnings in xpack core plugin project (#66899)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-03-22 11:37:44 +00:00
Martijn van Groningen ef9b1bf1ee
Made small modifications to ESRestTestCase. (#70531)
* Don't try to invoke delete component/index templates APIs if there are no templates to delete.
* Don't delete deprecation templates by marking these as xpack templates.

Relates to #69973
2021-03-18 09:40:46 +01:00
Armin Braun a0a7127c18
Create Unique Slice Names in EsIndexInputTestCase (#70490)
Low-effort fix for #70482 that makes sure that no two slices get created with the
same name but different byte ranges in a single test run, which led to conflicts with
the assumption that cfs file ranges are fixed for given names for the frozen index input.

closes #70482
2021-03-18 09:28:57 +01:00
Hendrik Muhs f1b89fad5b
add test framework for json schema validation of rest spec body's (#69902)
Rest API specs define the APIs used at the rest level, however these specs
only define the endpoint and the parameters. We miss definitions for the
body, especially when it comes to rich bodies like they are used in ML. 

This change introduces an abstract testcase for json schema validation. This
allows developers to validate any object that is serializable to JSON - using
the `ToXContent` - to be tested against a json schema. You can use it for REST
input and outputs, but also for internal objects(documents) and 
`ToXContentFragments`.

As the overall goal is to ensure it validates properly, the testcase enforces
strictness. A schema file must spec all properties. This will ensure that once
a schema test has been added, it won't go out of sync. Every change to the
pojo enforces a schema update as otherwise the test would fail.

Schemas can load sub-schemas from extra files. That way you can re-use schemas
e.g. in hierarchies or re-use a schema for similar but not same interfaces.
2021-03-17 08:30:40 +01:00
Martijn van Groningen 4cf8bf9136
Enable bwc tests after backporting #70094 (#70445) 2021-03-16 11:37:27 +01:00
Nik Everett 17d56d361a
Fix percentiles agg in slow log after transport (forward port of #70318) (#70412)
If you send `percentiles` or `percentiles_ranks` over the transport
client or over cross cluster search it breaks internal components
subtly. They mostly work so we hadn't yet noticed the break. But if you
send the request to the slow log then it'll fail to log. This fixes the
subtle internal break.
2021-03-15 16:24:47 -04:00
Martijn van Groningen 715eb90fea
Support specifying multiple templates names in delete component template api (#70314)
Add support to delete component templates api to specify multiple template
names separated by a comma.
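
As a rough illustration of the request shape, the comma-separated form could be built like this. This is a Python sketch under assumptions: the helper name is hypothetical, and the decision to reject an empty list is a choice made for this example, not something the commit specifies.

```python
from urllib.parse import quote


def delete_component_templates_path(names):
    """Return the path for one DELETE request covering all given templates."""
    if not names:
        # Nothing to delete, so there is no request to build.
        raise ValueError("at least one template name is required")
    # Percent-encode each name, then join the names with commas.
    return "/_component_template/" + ",".join(quote(n, safe="") for n in names)


path = delete_component_templates_path(["logs-settings", "logs-mappings"])
```

A single request of this form replaces one request per template, and therefore one cluster state update per template.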

Change the cleanup template logic for REST tests to remove all component templates via a single delete component template request. This optimizes the cleanup logic: after each REST test we delete all templates, so deleting them via a single API call (and thus a single cluster state update) saves a lot of time considering the number of REST tests.

Older versions don't support component / composable index templates
and/or data streams. Yet the test base class tries to remove objects
after each test, which adds a significant number of lines to the
log files (which slows the tests down). The ESRestTestCase will
now check whether all nodes have a specific version and then decide
whether data streams and component / composable index templates will
be deleted.

Also ensured that the logstash-index-template and security-index-template
aren't deleted between tests, these templates are builtin templates that
ES will install if missing. So if tests remove these templates between tests
then ES will add these templates back almost immediately. This causes
many log lines and a lot of cluster state updates, which slow tests down.

Relates to #69973

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
2021-03-15 13:08:49 +01:00
Martijn van Groningen 4f323b714d
Re-enable bwc tests after backporting #70094 to 7.x branch. (#70304) 2021-03-11 13:04:21 +01:00
Martijn van Groningen 36044ddb11
Support specifying multiple templates names in delete composable index template api (#70094)
Add support to delete composable index templates api to specify multiple template
names separated by a comma.

Change the cleanup template logic for REST tests to remove all composable index templates via a single delete composable index template request. This optimizes the cleanup logic: after each REST test we delete all templates, so deleting them via a single API call (and thus a single cluster state update) saves a lot of time considering the number of REST tests.

If this pr is accepted then I will do the same change for the delete component template api.

Relates to #69973
2021-03-11 10:52:28 +01:00
Jim Ferenczi ff50da5a77
Remove the _parent_join metadata field (#70143)
This commit removes the metadata field _parent_join
that was needed to ensure that only one join field is used in a mapping.
It is replaced with a validation at the field level.
This change also fixes a [bug](https://github.com/elastic/kibana/issues/92960) in the handling of parent join fields in _field_caps.
This metadata field throws an unexpected exception in [7.11](https://github.com/elastic/elasticsearch/pull/63878)
when checking if the field is aggregatable.
That's now fixed since this unused field has been removed.
2021-03-10 09:19:30 +01:00
Przemyslaw Gomulka 9ad9c781de
Add compatible logging when parsing a compatible field (#69539)
#68808 introduced the possibility to declare fields that are only available for parsing when a compatible API is used.

This commit replaces the deprecation log with compatible logging when a 'compatible only' field is used. It also includes a refactoring of LoggingDeprecationHandler method names.

relates #51816
2021-03-09 12:29:40 +01:00
Nik Everett f2e19c1e98
Patch up fetching dates from source (#70040)
This fixes an issue that `fields` has with dates sent in the
`###.######` format.

If you send a date in the format `#####.######` we'll parse the bit
before the decimal place as the number of milliseconds since epoch and
we'll parse the bit after the decimal as the number of nanoseconds since
the start of that millisecond. This works and is convenient for some
folks. Sadly, the code that backs the `fields` API for dates doesn't work
with the string format in this case - it works with a `double`. `double`
is bad for two reasons:
1. Its default string representation is scientific notation and our
   parsers don't know how to deal with that.
2. It loses precision relative to the string representation. `double`
   only has 52 bits of mantissa which can precisely store the number of
   nanoseconds until about 6am on April 15th, 1970. After that it starts
   to lose precision.

This fixes the first issue, getting us the correct string
representation in a "quick and dirty" way. It just converts the `double`
back to a string. But we still lose precision. Fixing that would require
a larger change.....
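
The precision limit can be checked with a short Python sketch, since Python's `float` is the same IEEE 754 double. The sample timestamp string is made up for illustration and does not come from the commit.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# 52 stored mantissa bits plus the implicit leading bit: every integer up to
# 2**53 is exactly representable, but not every integer beyond it.
LIMIT_NANOS = 2 ** 53

# Just past the limit, neighbouring nanosecond counts collapse onto one double.
collapses = float(LIMIT_NANOS) == float(LIMIT_NANOS + 1)

# The instant at which that many nanoseconds have elapsed since the epoch
# (datetime only resolves microseconds, so the count is truncated).
limit_instant = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    microseconds=LIMIT_NANOS // 1000
)

# A "millis.nanos" value keeps all of its digits as a string but not as a double.
text = "1615236873000.123456"
lost_precision = Decimal(float(text)) != Decimal(text)
```

Here `limit_instant` falls just before 6am UTC on April 15th, 1970, which matches the date quoted in the second point above.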
2021-03-08 15:54:33 -05:00
Armin Braun 5697643c3e
Remove Test-Only waitForProcessedOpsToComplete from LocalCheckpointTracker (#70070)
We only used this method in tests and it's somewhat needless to have a potentially
slow `notifyAll` in the hot path for assertions only when we can just busy assert in tests instead.

closes #69963
2021-03-08 16:33:10 +01:00
Martijn van Groningen 172ca20ef4
Improve template cleanup in ESRestTestCase (#70052)
Before this change, upon wiping the cluster, we would get a list of all legacy and index component templates. For each template we would first attempt to delete it as a legacy template and, if that returned a 404, then remove it as a composable index template. In the worst case this means that we would make twice as many delete requests for templates as necessary.

This change first gets all composable index templates (if they exist and if the cluster supports them) and then deletes these composable index templates. After this it separately gets a list of all legacy templates and then deletes those legacy templates.

Relates to #69973
2021-03-08 12:27:25 +01:00
David Turner 60d53c0206
Stop double-starting transport service in tests (#70056)
Today in tests we often use a utility method that creates and starts a
transport service, and then we start it again in the tests anyway. This
commit removes this unnecessary code and asserts that we only ever call
`TransportService#acceptIncomingRequests` once.
2021-03-08 11:04:43 +00:00
Armin Braun b1446fd8d2
Expose Backing Byte Array for ReleasableBytesReference (#69660)
Unwrapping the `byte[]` in a `BytesArray` can be generalized
so that releasable bytes references can make use of the optimizations
that use the bytes directly as well.

Relates #67502 but already saves a few cycles here and there around
`BytesRequest` (publish and recovery actions)
2021-03-06 14:10:41 +01:00
Yannick Welsch 96e412f771
Cancel searches earlier (#69795)
Search cancellation currently does not work well in the context of searchable snapshot shards, as it requires search
tasks to fully enter the query phase (i.e. start execution on the node, loading up the searcher, which means loading up
the index on FrozenEngine and doing some actual work) to detect cancellation, which can take a while in the frozen tier,
blocking on file downloads.
2021-03-05 14:28:20 +01:00
Tanguy Leroux b3fd16db9a
Exclude partially cached .cfs file from SearchableSnapshotDirectoryStatsTests (#70006)
Since #69861 CFS files read from FrozenIndexInput create 
dedicated frozen shared cache files when they are sliced. 
This does not play well with some tests that use the 
randomReadAndSlice to read files: this method can create 
overlapping slice/clone reads operations which makes it 
difficult to assert anything about CFS files with partial cache.

This commit prevents the tests from generating a .cfs file name 
when the partial cache is randomly picked up. As a follow 
up we should rework those tests to make them more realistic 
with the new behavior.

Closes #70000
2021-03-05 14:20:13 +01:00
Tanguy Leroux 0cf97f7460
Use blob store cache for Lucene compound files (#69861)
The blob store cache is used to cache a variable length of the 
beginning of Lucene files in the .snapshot-blob-cache system 
index. This is useful to speed up Lucene directory opening 
during shard recovery and to limit the number of bytes 
downloaded from the blob store when a searchable snapshot 
shard must be rebuilt.

This commit adds support for compound files segment (.cfs) 
when they are partially cached (ie, Storage.SHARED_CACHE) 
so that the files they are composed of can also be cached in 
the blob store cache index.

Co-authored-by: Yannick Welsch yannick@welsch.lu
2021-03-04 19:02:23 +01:00
David Turner 864ff66f68
Unique names for bulk processor scheduler threads (#69432)
Today every `BulkProcessor` creates two scheduler threads, both called
`[node-name][scheduler][T#1]`, which is also the name of the main
scheduler thread for the node. The duplicated thread names make it
harder to interpret a thread dump.

This commit makes the names of these threads distinct.

Closes #68470
2021-03-04 12:05:24 +00:00
Armin Braun e622b2c718
Make PrimaryReplicaResyncer Fork to Generic Pool (#69949)
Reading ops from the translog snapshot must not run on the transport thread.
When sending more than one batch of ops the listener (and thus `run`) would be
invoked on the transport thread for all but the first batch of ops.
=> Forking to the generic pool like we do for sending ops during recovery.
2021-03-04 09:51:16 +01:00
Joe Gallo 638735bbb9
Rename RestApiCompatibleVersion to RestApiVersion (#69897) 2021-03-03 12:17:48 -05:00
Przemko Robakowski 02dbe33780
Update GeoIP database service URL (#69862)
This change updates GeoIP database service URL to the new https://geoip.elastic.co/v1/database and removes (now optional) key/UUID parameter.
It also fixes geoip-fixture to provide 3 different test databases (City, Country and ASN).
It also unmutes GeoIpDownloaderIT.testGeoIpDatabasesDownload with additional logging and increased timeouts which tries to address #69594
2021-03-03 14:02:34 +01:00
Luca Cavanna 63d6c94830
Move MapperRegistry to index.mapper package (#69805)
MapperRegistry is the only class under the indices.mapper package. It fits better under index.mapper together with MapperService and friends, hence this commit moves it there and removes the indices.mapper package.
2021-03-02 16:20:06 +01:00
Francisco Fernández Castaño a8190790c6
Add integration tests for repository analyser test kit (#69316)
Relates #67247
2021-03-02 11:14:29 +01:00
Mark Vieira 3144354826
Update Docker image used by minio test fixture to support Arm (#69743) 2021-03-01 15:16:38 -08:00
Jake Landis 13915bc8c1
Add support for regex in REST test warnings and allowed_warnings (#69501)
This commit adds support for two new REST test features.
warnings_regex and allowed_warnings_regex.

This is a near mirror of the warnings and allowed_warnings
features, where the test can be instructed to allow
or require HTTP warnings. The difference with these new features
is that they allow the match to be based on a regular expression.
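
As a rough illustration of regex-based matching, the sketch below filters out warnings covered by an allowed pattern. The class is hypothetical, and full-string matching via `Matcher#matches` is an assumption rather than a statement about the test runner:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; not the REST test runner's actual code.
final class WarningRegexSketch {
    // Returns the warnings that no allowed regular expression matches.
    static List<String> unexpected(List<String> actualWarnings, List<String> allowedRegexes) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : allowedRegexes) {
            patterns.add(Pattern.compile(regex));
        }
        List<String> unexpected = new ArrayList<>();
        for (String warning : actualWarnings) {
            boolean allowed = false;
            for (Pattern pattern : patterns) {
                // Assumption: the whole warning must match, not just a substring.
                if (pattern.matcher(warning).matches()) {
                    allowed = true;
                    break;
                }
            }
            if (allowed == false) {
                unexpected.add(warning);
            }
        }
        return unexpected;
    }
}
```

A pattern can then cover warnings whose text varies between runs, such as ones containing generated index names.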
2021-03-01 15:40:39 -06:00
Jay Modi 1487a5a991
Introduce system index types including external (#68919)
This commit introduces system index types that will be used to
differentiate behavior. Previously system indices were all treated the
same regardless of whether they belonged to Elasticsearch, a stack
component, or one of our solutions. Upon further discussion and
analysis this decision was not in the best interest of the various
teams and instead a new type of system index was needed. These system
indices will be referred to as external system indices. Within external
system indices, an option exists for these indices to be managed by
Elasticsearch or to be managed by the external product.

In order to represent this within Elasticsearch, each system index will
have a type and this type will be used to control behavior.

Closes #67383
2021-03-01 10:38:53 -07:00
David Turner 257a21630e
Fix ensureGreen() timeout in REST tests (#69704)
In 2a04118e88 we moved `ensureGreen()`
from `IndexingIT` to `ESRestTestCase`, including its `70s` timeout. This
timeout makes sense in the context of an `AbstractRollingTestCase` which
has a client timeout of `90s` (#26781) but general-purpose REST tests
only have a `60s` client timeout, so if `ensureGreen()` fails then it
fails with a `SocketTimeoutException`, bypassing the useful exception
handling that logs the cluster state at time of failure.

This commit reduces the `ensureGreen()` timeout for most tests, leaving
it at `70s` only for `AbstractRollingTestCase`.
2021-03-01 16:13:27 +00:00
Armin Braun bb77ab46e0
Stop Ignoring Exceptions on Close in Network Code (#69665)
We should not quietly ignore and suppress exceptions when releasing
network resources in these spots.

Co-authored-by: David Turner <david.turner@elastic.co>
2021-03-01 14:38:18 +01:00
Henning Andersen c4e1074e68
Close search contexts on reassigned shard (#68539)
If a shard is reassigned to a node, but it has open searches (could be
scrolls even), the current behavior is to throw a
ShardLockObtainFailedException. This commit changes the behavior to
close the search contexts, likely failing some of the searches. The
sentiment is to prefer restoring availability over trying to complete
those searches. A situation where this can happen is when master(s) are
restarted, which is likely to cause similar search issues anyway.
2021-02-27 21:24:36 +01:00
Nik Everett 4ffdad36d4
Speed up terms agg when alone (#69377)
This speeds up the `terms` agg in a very specific case:
1. It has no child aggregations
2. It has no parent aggregations
3. There are no deleted documents
4. You are not using document level security
5. There is no top level query
6. The field has global ordinals
7. There are less than one thousand distinct terms

That is a lot of restrictions! But the speed up is pretty substantial because
in those cases we can serve the entire aggregation using metadata that
Lucene precomputes while it builds the index. In one of our real rally tracks
we get a 92% speed improvement, but the index isn't *that* big:

```
| 90th percentile service time | keyword-terms-low-cardinality |     446.031 |     36.7677 | -409.263 |     ms |
```

In a rally track with a larger index I ran some tests by hand and the
aggregation went from 2200ms to 8ms.

Even though there are 7 restrictions on this, I expect it to come into
play enough to matter. Restriction 6 just means you are aggregating on
a `keyword` field. Or an `ip`. And it's fairly common for `keyword`s to
have less than a thousand distinct values. Certainly not everywhere, but
some places.

I expect "cold tier" indices are very very likely not to have deleted
documents at all. And the optimization works segment by segment - so
it'll save some time on each segment without deleted documents. But more
time if the entire index doesn't have any.

The optimization builds on #68871 which translates `terms` aggregations
against low cardinality fields with global ordinals into a `filters`
aggregation. This teaches the `filters` aggregation to recognize when
it can get its results from the index metadata. Rather, it creates the
infrastructure to make that fairly simple and applies it in the case of
the queries generated by the terms aggregation.
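
A toy model of why the shortcut needs a segment without deleted documents. The `Segment` class is a hypothetical stand-in, not a Lucene type, and the other restrictions are left out:

```java
import java.util.List;
import java.util.Map;

// Toy model for illustration; Segment is hypothetical and not a Lucene class.
final class TermCountSketch {
    static final class Segment {
        final Map<String, Integer> docFreq; // per-term counts precomputed at index time
        final int deletedDocs;              // deletions are not reflected in docFreq
        final List<String> liveValues;      // one field value per live document

        Segment(Map<String, Integer> docFreq, int deletedDocs, List<String> liveValues) {
            this.docFreq = docFreq;
            this.deletedDocs = deletedDocs;
            this.liveValues = liveValues;
        }
    }

    static int count(Segment segment, String term) {
        if (segment.deletedDocs == 0) {
            // Fast path: the precomputed metadata is exact, so no documents are visited.
            return segment.docFreq.getOrDefault(term, 0);
        }
        // Slow path: the metadata still counts deleted documents, so visit live ones.
        int count = 0;
        for (String value : segment.liveValues) {
            if (value.equals(term)) {
                count++;
            }
        }
        return count;
    }
}
```

The fast path costs one lookup per term regardless of segment size, which is consistent with the large speed ups reported above.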
2021-02-25 14:15:57 -05:00
Mark Vieira 6bc4b5612e
Update test fixture to avoid writing to /etc/hosts file (#69583) 2021-02-25 09:59:02 -08:00
Nik Everett a76c32d25a
Test: Clean up a stray NPE (#69566)
When we test the REST actions we assumed that they'd produce a result
but one of the mocking/verification mechanisms didn't. This forces it to
produce a result. It uses some generics dancing to force the calling
code to mock things with the appropriate type, even though it's only a
compile-time guarantee. So long as callers aren't rude it's safe.
2021-02-25 08:19:48 -05:00
Armin Braun c2370ffde5
Add Leak Tracking Infrastructure (#67688)
This commit adds leak tracking infrastructure that enables assertions
about the state of objects at GC time (simplified version of what Netty
uses to track `ByteBuf` instances).
This commit uses the infrastructure to improve the quality of leak
checks for page recycling in the mock nio transport (the logic in
`org.elasticsearch.common.util.MockPageCacheRecycler#ensureAllPagesAreReleased`
does not run for all tests and tracks too little information to allow for debugging
what caused a specific leak in most cases due to the lack of an equivalent of the added
`#touch` logic).
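
A rough sketch of the general technique, not the infrastructure added by this commit: it uses `java.lang.ref.Cleaner` as a stand-in mechanism, and every class and method name is hypothetical.

```java
import java.lang.ref.Cleaner;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch built on java.lang.ref.Cleaner; not the code from this commit.
final class LeakTrackerSketch {
    private static final Cleaner CLEANER = Cleaner.create();
    static final List<String> REPORTED = new CopyOnWriteArrayList<>();

    // Must not hold a reference to the tracked object, or it would never be collected.
    static final class Leak implements Runnable {
        private final AtomicBoolean closed = new AtomicBoolean();
        private final Queue<String> touches = new ConcurrentLinkedQueue<>();

        // Records a hint about where the tracked object was last used.
        void touch(String hint) {
            touches.add(hint);
        }

        // Marks the tracked object as properly released; returns false if already closed.
        boolean close() {
            return closed.compareAndSet(false, true);
        }

        // Invoked by the Cleaner once the tracked object becomes unreachable.
        @Override
        public void run() {
            if (closed.get() == false) {
                REPORTED.add("LEAK: " + String.join(" -> ", touches));
            }
        }
    }

    static Leak track(Object resource) {
        Leak leak = new Leak();
        CLEANER.register(resource, leak);
        return leak;
    }
}
```

The recorded hints are what make a report actionable: they show the path the object took before it was dropped without being released.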

Co-authored-by: David Turner <david.turner@elastic.co>
2021-02-25 11:41:48 +01:00
David Turner 61e6734217
Make recovery APIs cancellable (#69177)
Relates #55550
2021-02-25 09:24:03 +00:00
David Turner 4c300f7347
Fix testTracerLog (#69546)
Makes sure that we assert every expected message in this test, because
if we don't then we might shut the appender down too early.

Reverts #68267
Closes #66630
2021-02-24 16:29:22 +00:00
Przemko Robakowski 6e6d5a29ee
Add ToS query parameter to GeoIP downloader (#69495)
This change adds a query parameter confirming that we accept the ToS of the GeoIP database service provided by Infra.
It also changes the integration test to use a lower timeout when using the local fixture.

Relates to #68920
2021-02-24 10:54:54 +01:00
Mark Vieira 728dce0e2d Remove unnecessary `throws` declaration from javadoc 2021-02-23 16:22:38 -08:00
Hendrik Muhs d0ea206e30
[CI] wait for initializing shards on teardown in ESSingleNodeTestCase (#69186)
ensure shards aren't initializing at test teardown, so indexes that are initializing are not missed
for deletion.

fixes #69057
2021-02-23 20:34:46 +01:00
Przemko Robakowski 2ba3e929e7
GeoIP database downloader (#68424)
This change adds a component that downloads new GeoIP databases from the infra service.
New databases are downloaded in chunks and stored in the .geoip_databases index.
Downloads are verified against the MD5 checksum provided by the server.
The current state of all stored databases is kept in the cluster state, in the persistent task state.
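
Verifying a chunked download against an MD5 checksum could look roughly like this. The class and method names are hypothetical, and the sketch holds chunks in memory rather than reading them from an index:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch; not the downloader's actual code.
final class ChunkedMd5Sketch {
    // Feeds the chunks to the digest in order and returns the lowercase hex MD5.
    static String md5Hex(List<byte[]> chunks) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        for (byte[] chunk : chunks) {
            digest.update(chunk);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Compares the computed digest with the checksum reported by the server.
    static boolean matches(List<byte[]> chunks, String expectedMd5Hex) {
        return md5Hex(chunks).equalsIgnoreCase(expectedMd5Hex);
    }
}
```

Because the digest is updated incrementally, the whole database never has to be assembled in one buffer to be verified.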

Relates to #68920
2021-02-23 19:41:18 +01:00