Commit Graph

4649 Commits

Author SHA1 Message Date
Mayya Sharipova 2ec1f6b4c4
Fix testIndexHasDuplicateData tests (#49786)
testIndexHasDuplicateData tests were failing occasionally,
due to approximate calculation of BKDReader.estimatePointCount,
where if the node is Leaf, the number of points in it
was (maxPointsInLeafNode + 1) / 2.
As DEFAULT_MAX_POINTS_IN_LEAF_NODE = 1024, for small indexes
used in tests, the estimation could be really off.

This rewrites the tests to set the max points in a leaf node to
a small value, so that the estimate stays under control in tests.

Closes #49703
2020-03-19 13:09:50 -04:00
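To make the arithmetic in #49786 concrete, here is a minimal sketch in Python, not the Lucene code. The helper name `estimate_leaf_points` is hypothetical, integer division is assumed, and the leaf size of 4 is an illustrative value rather than the one the rewritten tests use.

```python
DEFAULT_MAX_POINTS_IN_LEAF_NODE = 1024  # value quoted in the commit message


def estimate_leaf_points(max_points_in_leaf_node: int) -> int:
    """Per-leaf estimate from the commit message: (maxPointsInLeafNode + 1) / 2.

    Integer division is an assumption made for this sketch.
    """
    return (max_points_in_leaf_node + 1) // 2


# A tiny test index may hold only a handful of points in a leaf, yet the
# estimate depends only on the configured leaf size, not on the real count.
actual_points = 3  # illustrative value
default_estimate = estimate_leaf_points(DEFAULT_MAX_POINTS_IN_LEAF_NODE)
small_estimate = estimate_leaf_points(4)  # illustrative small leaf size

print("default leaf size:", default_estimate, "estimated vs", actual_points, "actual")
print("small leaf size:", small_estimate, "estimated vs", actual_points, "actual")
```

With the default leaf size the estimate is far from the handful of points a small test index actually holds, which is why shrinking the leaf size makes the tests predictable.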
David Turner c1dc5238da
Apply cluster states in system context (#53785)
Today cluster states are sometimes (rarely) applied in the default context
rather than system context, which means that any appliers which capture their
contexts cannot do things like remote transport actions when security is
enabled.

There are at least two ways that we end up applying the cluster state in the
default context:

1. locally applying a cluster state that indicates that the master has failed
2. the elected master times out while waiting for a response from another node

This commit ensures that cluster states are always applied in the system
context.

Mitigates #53751
2020-03-19 14:13:52 +00:00
Ignacio Vera 6eb698bc6d
Add support for distance queries on geo_shape queries (#53466)
With the upgrade to Lucene 8.5, LatLonShape field has support for distance queries. This change implements this new feature and removes the limitation.
2020-03-19 14:16:08 +01:00
Alan Woodward c6cdd3a4c2
Revert "Report parser name and location in XContent deprecation warnings (#53752)"
This reverts commit 7636930ceb.

There is some randomization in the YAML test suite which means we can't check
for exact XContentLocation in the deprecation warning headers.
2020-03-19 12:29:42 +00:00
Alan Woodward 7636930ceb
Report parser name and location in XContent deprecation warnings (#53752)
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. However, the warnings themselves don't 
currently have any context; they simply say that a deprecated field has been used, 
but not where in the input it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.
2020-03-19 11:26:04 +00:00
Alan Woodward 5c8cd16ab3
TermsLookup uses ObjectParser for x-content parsing (#53733)
This commit refactors the fromXContent method in TermsLookup to use an
ObjectParser and adds an explicit parsing test.

Related to #53731
2020-03-19 10:21:43 +00:00
Przemyslaw Gomulka 1dac8dffc6
Fix NPE when logging null values in JSON (#53715)
A null routing value in the slow log was causing the ESLogMessage.asJson
method to throw a NullPointerException. This is fixed in ESLogMessage,
and the key is no longer passed at all when its value is null.
This only happens in 8 because of the previous refactoring in #46702
2020-03-19 08:48:24 +01:00
Jim Ferenczi a2b428f5b4
Disable distributed sort optimization on scroll requests (#53759)
This commit disables the sort optimization added in #51852 for scroll requests.
Scroll queries keep a state per shard so we cannot modify the request on
the first round (submit).
This bug was introduced in non-released versions, which is why this PR
is marked as a non-issue.
2020-03-19 08:10:34 +01:00
Jason Tedor ca7a135e08
Improve performance of shards limits decider (#53577)
On clusters with a large number of shards, the shards limits allocation
decider can exhibit poor performance leading to timeouts applying
cluster state updates. This occurs because for every shard, we do a loop
to count the number of shards on the node, and the number of shards for
the index of the shard. This is roughly quadratic in the number of
shards. This loop is not necessary, since we already have an O(1) method
to count the number of non-relocating shards on a node, and with this
commit we add some infrastructure to RoutingNode to make counting the
number of shards per index O(1).
2020-03-18 20:57:50 -04:00
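The idea behind #53577 can be sketched as follows. This is a minimal Python illustration, not the Elasticsearch implementation: `RoutingNodeSketch` and its method names are hypothetical. It keeps a per-index counter that is updated whenever a shard is added or removed, so the per-index count becomes a dictionary lookup instead of a scan over every shard on the node.

```python
from collections import defaultdict


class RoutingNodeSketch:
    """Hypothetical stand-in for a node's shard bookkeeping."""

    def __init__(self):
        self._shards = []                   # (index_name, shard_id) pairs
        self._per_index = defaultdict(int)  # index_name -> shards on this node

    def add(self, index_name, shard_id):
        self._shards.append((index_name, shard_id))
        self._per_index[index_name] += 1

    def remove(self, index_name, shard_id):
        self._shards.remove((index_name, shard_id))
        self._per_index[index_name] -= 1
        if self._per_index[index_name] == 0:
            del self._per_index[index_name]

    def count_for_index_scan(self, index_name):
        # Loop-based counting: cost grows with the number of shards on the node.
        return sum(1 for name, _ in self._shards if name == index_name)

    def count_for_index(self, index_name):
        # Counter-based counting: a single dictionary lookup.
        return self._per_index.get(index_name, 0)


node = RoutingNodeSketch()
for shard_id in range(3):
    node.add("logs", shard_id)
node.add("metrics", 0)
node.remove("logs", 1)

print(node.count_for_index("logs"), node.count_for_index_scan("logs"))
```

Calling the scan once per shard is what makes the loop-based approach roughly quadratic; the counter trades a little bookkeeping on add and remove for constant-time reads.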
Andy Bristol 73d2addaa2
supported field type tests for max agg (#53701)
Adds test hooks for testing supported ValuesSource types for the max
aggregation
2020-03-18 13:19:13 -07:00
Stuart Tettemer 070ea7eff1
Scripting: Per-context script cache, default off (#52855)
* Adds per context settings:
  `script.context.${CONTEXT}.cache_max_size` ~
  `script.cache.max_size`

  `script.context.${CONTEXT}.cache_expire` ~
  `script.cache.expire`

  `script.context.${CONTEXT}.max_compilations_rate` ~
  `script.max_compilations_rate`

* Context cache is used if:
  `script.max_compilations_rate=use-context`.  This
  value is dynamically updatable, so users can
  switch back to the general cache if desired.

* Settings for context caches take the first value 
  that applies:
  1) Context specific settings if set, eg
     `script.context.ingest.cache_max_size`
  2) Correlated general setting, if it is set to a non-default
     value, eg `script.cache.max_size`
  3) Context default

The reason for 2's inclusion is to allow an easy
transition for users who've customized their general
cache settings.

Using the general cache settings for the context caches
results in higher effective settings, since they are 
multiplied across the number of contexts.  So a general
cache max size of 200 will become 200 * # of contexts.
However, this behavior will avoid users snapping to a
value that is too low for them.


Refs: #50152
2020-03-18 12:31:30 -06:00
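The precedence rules listed in #52855 can be sketched as a small resolution function. This is a minimal Python illustration, not the Elasticsearch code: the function name, its parameters, and all numeric values below are hypothetical and chosen only for the example.

```python
def resolve_context_setting(context_value, general_value, general_default, context_default):
    """Return the first value that applies, in the order given in the commit message.

    1) the context-specific setting, if set
    2) the correlated general setting, if set to a non-default value
    3) the context default
    """
    if context_value is not None:
        return context_value
    if general_value is not None and general_value != general_default:
        return general_value
    return context_default


# Illustrative numbers only; the real defaults are not stated in the commit message.
GENERAL_DEFAULT = 100
CONTEXT_DEFAULT = 50

explicit = resolve_context_setting(300, 200, GENERAL_DEFAULT, CONTEXT_DEFAULT)
inherited = resolve_context_setting(None, 200, GENERAL_DEFAULT, CONTEXT_DEFAULT)
fallback = resolve_context_setting(None, GENERAL_DEFAULT, GENERAL_DEFAULT, CONTEXT_DEFAULT)
print(explicit, inherited, fallback)

# Rule 2 applies per context, so a customized general max size of 200
# yields an effective total of 200 * number of contexts.
number_of_contexts = 4  # illustrative
print(inherited * number_of_contexts)
```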
Dominic Page d1cbdfb753
Geo shape query vs geo point (#52382)
Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle

see: #48928

Co-Authored-By:  @iverase
2020-03-18 17:03:52 +01:00
Alan Woodward e23311ce51
Make it possible to deprecate all variants of a ParseField with no replacement (#53722)
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
2020-03-18 14:15:30 +00:00
Marios Trivyzas 6b5fc35e44
Increase step between checks for cancellation (#53712)
The introduction of the ExitableDirectoryReader showed an increase in
latencies for range queries using point values.

Check for cancellation every 1024 docs instead of every 15 to lower
the impact of the check on query performance.

Follows: #52822
Fixes: #53496
2020-03-18 14:52:03 +01:00
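The trade-off in #53712 can be illustrated with a counter that only consults the cancellation flag every N visited documents. This is a minimal Python sketch, not the ExitableDirectoryReader code: the class and function names are hypothetical, and the document count of 10,000 is an illustrative value.

```python
class CancellationChecker:
    """Calls the (possibly expensive) cancellation check once every `interval` docs."""

    def __init__(self, is_cancelled, interval):
        self._is_cancelled = is_cancelled
        self._interval = interval
        self._visited = 0
        self.checks = 0  # how many times the cancellation check actually ran

    def on_doc(self):
        self._visited += 1
        if self._visited % self._interval == 0:
            self.checks += 1
            if self._is_cancelled():
                raise RuntimeError("task cancelled")


def count_checks(num_docs, interval):
    checker = CancellationChecker(lambda: False, interval)
    for _ in range(num_docs):
        checker.on_doc()
    return checker.checks


checks_every_15 = count_checks(10_000, 15)
checks_every_1024 = count_checks(10_000, 1024)
print(checks_every_15, "checks vs", checks_every_1024, "checks")
```

A larger interval means far fewer checks on the hot path, at the cost of a slightly longer delay before a cancelled query notices the cancellation.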
Jim Ferenczi de9e44fb68
Adapt serialization version checks in ShardSearchRequest (#53660)
This change adapts the serialization checks to 7.7.0 in order to cope with #53659.
Note that this commit also disables the bwc tests temporarily in order to be able to
merge #53659 first.

Relates #51852
2020-03-18 14:14:54 +01:00
Tanguy Leroux e1096b9457
Restore off-heap loading for term dictionary in ReadOnlyEngine (#53713)
This is a partial restore of #43158, following the decision taken in #51247

Closes #51247
2020-03-18 13:23:45 +01:00
Alan Woodward 795a92707f
Remove deprecation warning when doc scripts refer to '_type' field (#53605)
We currently emit a warning in 8x when a script refers to the _type field for
a document. However, in 8x this field no longer exists, so the deprecation
warning is not required.

Relates to #41059
2020-03-18 11:50:36 +00:00
Tianlun Li fe5092ae24
Deprecate delaying state recovery for master nodes (#53646)
It is useful to be able to delay state recovery until enough data nodes have
joined the cluster, since this gives the shard allocator a decent opportunity
to re-use as much existing data as possible. However we also have the option to
delay state recovery until a certain number of master-eligible nodes have
joined, and this is unnecessary: we require a majority of master-eligible nodes
for state recovery, and there is no advantage in waiting for more.

This commit deprecates the unnecessary settings in preparation for their
removal.

Relates #51806
2020-03-18 10:03:21 +00:00
Ryan Ernst d63cda1bcb
Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642)
Re-applies the change from #53523 along with test fixes.

closes #53626
closes #53624
closes #53622
closes #53625

Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
2020-03-17 10:26:35 -07:00
Lee Hinman 263e525e49
Add REST API for ComponentTemplate CRUD (#53558)
This adds the Put/Get/DeleteComponentTemplate APIs that allow inserting, retrieving, and removing
ComponentTemplateMetadata into the cluster state metadata.

These APIs are currently only available behind a feature flag system property -
`es.itv2_feature_flag_registered`.

Relates to #53101

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-17 10:55:07 -06:00
Andy Bristol a295161697
add tests to SumAggregatorTests (#53568)
This adds tests for supported ValuesSourceTypes, unmapped fields,
scripting, and the missing param. The tests for unmapped fields and
scripting are migrated from the SumIT integration test
2020-03-17 09:44:29 -07:00
Alan Woodward 3e607d9e93
Rename AtomicFieldData to LeafFieldData (#53554)
This conforms with lucene's LeafReader naming convention, and
matches other per-segment structures in elasticsearch.
2020-03-17 12:25:51 +00:00
Jim Ferenczi ff94792e41
Shortcut query phase using the results of other shards (#51852)
This commit, built on top of #51708, allows shard search requests to be modified based on information collected on other shards. It is intended to speed up sorted queries on time-based indices.

For queries that are only interested in the top documents, this change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard.
For queries that mix top documents and aggregations, this change will reset the size of the top documents to 0 instead of rewriting to match none.
This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
2020-03-17 10:54:44 +01:00
Jason Tedor 41e3b4aa90
Invoke response handler on failure to send (#53631)
Today it can happen that a transport message fails to send (for example,
because a transport interceptor rejects the request). In this case, the
response handler is never invoked, which can lead to necessary cleanups
not being performed. There are two ways to handle this. One is to expect
every callsite that sends a message to try/catch these exceptions and
handle them appropriately. The other is merely to invoke the response
handler to handle the exception, which is already equipped to handle
transport exceptions.
2020-03-16 21:27:02 -04:00
Jason Tedor 87dc720dac
Update server name serialization version
This commit updates the serialization version for the server name on the
proxy mode info, used in the remote info API.
2020-03-16 21:21:31 -04:00
Jason Tedor 2abf40a6b6
Add server name to remote info API (#53634)
This commit adds the configured server_name to the proxy mode info so
that it can be exposed in the remote info API.
2020-03-16 21:07:55 -04:00
Zachary Tong 84a59f8447
Add scripting, supported-type tests to ValueCount (#53500)
Also adds a few small notes to the documentation regarding potentially
unintuitive behavior
2020-03-16 15:15:25 -04:00
Nik Everett 4d81edb625
Stop using round-tripped PipelineAggregators (#53423)
This begins to clean up how `PipelineAggregator`s are executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
   serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
   `PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
   really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.

This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregation tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.

Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
2020-03-16 14:51:54 -04:00
Nik Everett 3b7843d774
Fix sorting agg buckets by doc_count (#53617)
I broke sorting aggregations by `doc_count` in #51271 by mixing up true
and false. This flips that comparison and adds a few tests to double
check that we don't do this again.
2020-03-16 14:41:56 -04:00
Mayya Sharipova 01eee1a97f
Highlighters skip ignored keyword values (#53408)
Keyword field values with length more than ignore_above are not
indexed. But highlighters still were retrieving these values
from _source and were trying to highlight them. This sometimes led to
errors if a field length exceeded max_analyzed_offset. It is also
wrong behaviour to attempt to highlight something that was
ignored during indexing.

This PR checks if a keyword value was ignored because of its length,
and if yes, skips highlighting it.

Closes #43800
2020-03-16 06:49:37 -04:00
markharwood a2a4756736
New wildcard field optimised for wildcard queries (#49993)
Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values.  Also supports aggregations and sorting.
2020-03-16 09:54:10 +00:00
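The two-phase matching described in #49993 can be sketched as below. This is a minimal Python illustration, not the Elasticsearch implementation: all function names are hypothetical, `fnmatchcase` from the standard library stands in for the automaton-based verification, and patterns are assumed to contain only the `*` and `?` wildcards.

```python
import re
from fnmatch import fnmatchcase


def ngrams(text, n=3):
    """All substrings of length n; empty when the text is shorter than n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def required_ngrams(pattern, n=3):
    """Ngrams that every match must contain, taken from the literal runs of the pattern."""
    required = set()
    for literal in re.split(r"[*?]", pattern):
        required |= ngrams(literal, n)
    return required


def wildcard_search(values, pattern):
    needed = required_ngrams(pattern)
    # Phase 1: cheap approximation. Keep values containing every required ngram.
    candidates = [v for v in values if needed <= ngrams(v)]
    # Phase 2: expensive verification against the full original value.
    return [v for v in candidates if fnmatchcase(v, pattern)]


values = ["elasticsearch", "elastic", "search", "lucene"]
print(wildcard_search(values, "elas*rch"))
# "elasticsearch" contains both "rch" and "las", so it passes phase 1,
# but the ngrams appear in the wrong order and phase 2 rejects it.
print(wildcard_search(values, "*rch*las*"))
```

The second query shows why the verification step is needed: the ngram check alone ignores the order and position of the literal parts.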
David Turner bd580527b2
Do not log no-op reconnections at DEBUG (#53469)
Today the NodeConnectionsService emits a DEBUG-level log message each time it
calls TransportService#connectToNode, which happens for every node in the
cluster every ten seconds, and also at every cluster state update. That's a lot
of log messages. Most of these calls are no-ops and can be ignored, but if the
call was not a no-op then it may be worth investigating further. Since the logs
do not distinguish the interesting and uninteresting cases, they are not
useful.

This commit distinguishes the two cases and pushes the noisy logging for the
common no-op case down to TRACE level, leaving only useful and actionable
information in the DEBUG-level logs.
2020-03-16 08:54:51 +00:00
Mark Vieira 060b4eed59
Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53523)"
This reverts commit 7bc75f48

Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-03-15 18:10:14 -07:00
Jason Tedor fa6d515893
Remove extra code in allocation commands parsing (#53579)
This commit removes some code that is duplicated in the parsing of
allocation commands in the cluster reroute API.
2020-03-14 18:11:31 -04:00
Jason Tedor 7bc75f48d8
Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53523)
This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2
dependency to 2.13.1.
2020-03-14 10:22:29 -04:00
Tim Brooks 8ccdaa3a35
Align remote info api with new settings (#53441)
Currently the remote info api has added a number of possible fields
(proxy, num_socket_connections, etc) that are available in proxy mode.
These fields are not aligned with what the settings are named. This
commit modifies this API to align with the settings.
2020-03-13 15:01:01 -06:00
Andy Bristol f6d3784b49
migrate tests from MissingIT to agg tests (#53448)
Move the remaining tests for the missing aggregation into its
AggregatorTestCase out of its integration test and remove the IT
2020-03-13 12:28:17 -07:00
Gordon Brown d7bbc9df1d
Allow _cat indices & aliases to use indices options (#53248)
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.

Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
2020-03-13 11:57:00 -06:00
Marios Trivyzas 1fc3fe3d32
Fix Term Vectors with artificial docs and keyword fields (#53504)
Previously, Term Vectors API was returning empty results for
artificial documents with keyword fields. Checking only for `string()`
on `IndexableField` is not enough, since for `KeywordFieldType`
`binaryValue()` must be used instead.

Fixes #53494
2020-03-13 16:17:30 +01:00
William Brafford 7a182948f6
Use snake case for nodes stats/info metric names (#53446)
The REST API uses "thread_pool" as the name of the thread pool metric.
If we use this name internally when we serialize nodes stats and info
requests, we won't need to do any fancy logic to check for and switch
out "threadPool", which was the previous internal name.
2020-03-13 06:41:37 -04:00
Jim Ferenczi 37c739c048
Fix pre-sorting of shards in the can_match phase (#53397)
This commit fixes a bug on sorted queries with a primary sort field
that uses different types in the requested indices. In this scenario
the returned min/max values to sort the shards are not comparable so
we should avoid the sorting rather than throwing an obscure exception.
2020-03-13 01:27:26 +01:00
Lee Hinman b825f6461f
Update minimal supported version for ComponentTemplateMetadata (#53515)
This can be merged once #53489 is merged.
2020-03-12 16:28:29 -06:00
Christoph Büscher facd525b0a
Mask wildcard query special characters on keyword queries (#53127)
Wildcard queries on keyword fields get normalized, however this normalization
step should exclude the two special characters * and ? in order to keep the
wildcard query itself intact.

Closes #46300
2020-03-12 19:11:21 +01:00
Przemyslaw Gomulka 5ccb3b675e
Ignore isJoda flag from 7x nodes (#53481)
When upgrading from 7.7+, ES will send out a flag indicating whether a pattern is joda style. This is only used to support joda-style indices in 7.x; in 8 we no longer support this. All indices in 8 should use java-style patterns, hence we can ignore this flag. Similarly, when writing from v8 to v7.7+ we should always send a false flag.

relates #52555
relates #53478
closes #53477
2020-03-12 18:09:15 +01:00
Nhat Nguyen a8d89fd0a5
Fix concurrent requests race over scroll context limit (#53449)
Concurrent search scroll requests can lead to more scroll contexts than the limit.
2020-03-12 11:51:16 -04:00
Lee Hinman ae55e06253
Add ComponentTemplate to MetaData (#53290)
This adds a `ComponentTemplate` data structure that will be used as part of #53101 (Index Templates
v2) to the `MetaData` class. Currently there are no APIs for interacting with this class, so it will
always be an empty map (other than in tests). This infrastructure will be built upon to add APIs in
a subsequent commit.

A `ComponentTemplate` is made up of a `Template`, a version, and a MetaData.Custom class. The
`Template` contains similar information to an `IndexTemplateMetaData` object: settings, mappings,
and alias configuration.
2020-03-12 09:27:25 -06:00
Nik Everett 8410356c5b
Preserve metric types in top_metrics (#53288)
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
2020-03-11 16:44:08 -04:00
Andy Bristol 4095df443b
aggregator and yaml tests for missing agg (#53214)
Tests for unmapped fields, the missing parameter, scripting, and correct
ValuesSource types in MissingAggregatorTests. Basic yaml tests for the 
missing agg

For #42949
2020-03-11 13:23:38 -07:00
Jim Ferenczi ab66529021
Fix sporadic failures in AsyncSearchAsyncTests (#53375)
Shard group failure callbacks should be executed before incrementing
the total operations. This is required to ensure that we don't notify
a shard group failure **after** the completion callback.

This change ensures that we set the isRunning flag to `false`
when storing the initial response of an async search request.
2020-03-11 17:14:15 +01:00
Nhat Nguyen cd6d0b70f7
Adjust wire version for search context id
Relates #53143
2020-03-11 11:47:29 -04:00