Commit Graph

138 Commits

Author SHA1 Message Date
Nhat Nguyen f120064734
Make keep alive optional in PointInTimeBuilder (#62720)
Remove the keepAlive parameter from the constructor of PointInTimeBuilder
as it's optional.
2020-09-22 18:51:24 -04:00
Nhat Nguyen afa0cf2b8a
Increase keep alive of point in time in async search tests (#62593)
Async search tests can take more than one minute due to the excessive trace logs. 
And the point in time in the tests can be expired the midway.

Closes #62451
2020-09-18 07:47:45 -04:00
Alan Woodward 351284e85b
Move SearchLookup into FetchContext (#62549)
FetchSubPhase#getProcessor currently takes a SearchLookup parameter. This
however is only needed by a couple of subphases, and will almost certainly change in
future as we want to simplify how fetch phases retrieve values for individual hits.

To future-proof against further signature changes, this commit moves the SearchLookup
reference into FetchContext instead.
2020-09-17 17:31:39 +01:00
Nik Everett 9a127adb4b
Implement fields fetch for runtime fields (#61995)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 15:57:26 -04:00
Nhat Nguyen 1d0e50e556 Ensure to release async search iterator in tests
We need to close an async search response iterator to release
the related point in time if the test uses pit.
2020-09-12 12:05:46 -04:00
Nhat Nguyen 161ec69f77
Support point in time cross cluster search (#61827)
This commit integrates point in time into cross cluster search.

Relates #61062
Closes #61790
2020-09-09 20:55:44 -04:00
Luca Cavanna 654f8baabc
Async search: don't track fetch failures (#62111)
Fetch failures are currently tracked byy AsyncSearchTask like ordinary shard failures. Though they should be treated differently or they end up causing weird scenarios like total=num_shards and successful=num_shards as the query phase ran fine yet the failed count would reflect the number of shards where fetch failed.

Given that partial results only include aggs for now and are complete even if fetch fails, we can ignore fetch failures in async search, as they will be anyways included in the response. They are in fact either received as a failure when all shards fail during fetch, or as part of the final response when only some shards fail during fetch.
2020-09-09 10:17:41 +02:00
Jake Landis 1367bd0c92
Remove integTest task from PluginBuildPlugin (#61879)
This commit removes `integTest` task from all es-plugins.  
Most relevant projects have been converted to use yamlRestTest, javaRestTest, 
or internalClusterTest in prior PRs. 

A few projects needed to be adjusted to allow complete removal of this task
* x-pack/plugin - converted to use yamlRestTest and javaRestTest 
* plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task
* qa/die-with-dignity - convert to javaRestTest
* x-pack/qa/security-example-spi-extension - convert to javaRestTest
* multiple projects - remove the integTest.enabled = false (yay!)

related: #61802
related: #60630
related: #59444
related: #59089
related: #56841
related: #59939
related: #55896
2020-09-08 16:41:54 -05:00
Luca Cavanna 06c85d3c4e
Print out search request as part of async search task description (#62057)
Currently, the async search task is the task that will be running through the whole execution of an async search. While the submit async search task prints out the search as part of its description, async search task doesn't while it should.

With this commit we address that while also making sure that the description highlights that the task is originated from an async search.

Also, we streamline the way the description is printed out by SearchTask so that it does not get forgotten in the future.
2020-09-08 16:13:46 +02:00
David Kyle b08f121e65
Mute AsyncSearchActionIT tests (#62037)
For #61790
2020-09-07 10:11:38 +01:00
Jim Ferenczi 49ae2bb56a
Improve reduction of terms aggregations (#61779)
* Improve reduction of terms aggregations

Today, the terms aggregation reduces multiple aggregations at once using a map
to group same buckets together. This operation can be costly since it requires
to lookup every bucket in a global map with no particular order.
This commit changes how term buckets are sorted by shards and partial reduces in
order to be able to reduce results using a merge-sort strategy.
For bwc, results are merged with the legacy code if any of the aggregations use
a different sort (if it was returned by a node in prior versions).

Relates #51857
2020-09-04 15:08:32 +02:00
Jake Landis 51d0bcacdf
Convert first 1/2 x-pack plugins from integTest to [yaml | java]RestTest or internalClusterTest (#60630)
For 1/2 the plugins in x-pack, the integTest
task is now a no-op and all of the tests are now executed via a test,
yamlRestTest, javaRestTest, or internalClusterTest.

This includes the following projects:
async-search, autoscaling, ccr, enrich, eql, frozen-indicies,
data-streams, graph, ilm, mapper-constant-keyword, mapper-flattened, ml

A few of the more specialized qa projects within these plugins
have not been changed with this PR due to additional complexity which should
be addressed separately.

A follow up PR will address the remaining x-pack plugins (this PR is big enough as-is).

related: #61802
related: #56841
related: #59939
related: #55896
2020-09-02 09:22:48 -05:00
Ioannis Kakavas 00430cbfa3
Mute AsyncSearchActionIT test (#61851)
see #61850
2020-09-02 16:15:19 +03:00
Nhat Nguyen 71afd226af
Support point in time in async_search (#61560)
This commit integrates point in time into async search and 
ensures that it works correctly with security enabled.

Relates #61062
2020-08-26 15:40:00 -04:00
Rory Hunter f0aa24e9cc
Implement deprecation logging using log4j (#61474)
Part of #46106. Simplify the implementation of deprecation logging by
relying of log4j more completely, and implementing additional behaviour
through custom appenders and filters.
2020-08-25 09:49:33 +01:00
Jake Landis 86952d78f4
Cleanup xpack build.gradle (#60554)
This commit does three things:
* Removes all Copyright/license headers for the build.gradle files under x-pack. (implicit Apache license)
* Removes evaluationDependsOn(xpackModule('core')) from build.gradle files under x-pack
* Removes a place holder test in favor of disabling the test task (in the async plugin)
2020-08-03 10:15:12 -05:00
Albert Zaharovits d31808de80
Fix DLS/FLS permission for the submit async search action (#59693)
The submit async search action should not populate the thread context
DLS/FLS permission set, because it is not currently authorised as an "indices request"
and hence the permission set that it builds is incomplete and it overrides the
DLS/FLS permission set of the actual spawned search request (which is built correctly).
2020-07-20 08:44:54 +03:00
Nik Everett ea5df51b91
Improve cardinality measure used to build aggs (#56533)
This makes a `parentCardinality` available to every `Aggregator`'s ctor
so it can make intelligent choices about how it collects bucket values.
This replaces `collectsFromSingleBucket` and is similar to it but:
1. It supports `NONE`, `ONE`, and `MANY` values and is generally
   extensible if we decide we can use more precise counts.
2. It is more accurate. `collectsFromSingleBucket` assumed that all
   sub-aggregations live under multi-bucket aggregations. This is
   normally true but `parentCardinality` is properly carried forward
   for single bucket aggregations like `filter` and for multi-bucket
   aggregations configured in single-bucket for like `range` with a
   single range.

While I was touching every aggregation I renamed `doCreateInternal` to
`createMapped` because that seemed like a much better name and it was
right there, next to the change I was already making.

Relates to #56487
2020-07-06 18:31:08 -04:00
Jake Landis 333a5d8cdf
Create plugin for yamlTest task (#56841)
This commit creates a new Gradle plugin to provide a separate task name
and source set for running YAML based REST tests. The only project
converted to use the new plugin in this PR is distribution/archives/integ-test-zip.
For which the testing has been moved to :rest-api-spec since it makes the most
sense and it avoids a small but awkward change to the distribution plugin.

The remaining cases in modules, plugins, and x-pack will be handled in followups.

This plugin is distinctly different from the plugin introduced in #55896 since
the YAML REST tests are intended to be black box tests over HTTP. As such they
should not (by default) have access to the classpath for that which they are testing.

The YAML based REST tests will be moved to separate source sets (yamlRestTest).
The which source is the target for the test resources is dependent on if this
new plugin is applied. If it is not applied, it will default to the test source
set.

Further, this introduces a breaking change for plugin developers that
use the YAML testing framework. They will now need to either use the new source set
and matching task, or configure the rest resources to use the old "test" source set that
matches the old integTest task. (The former should be preferred).

As part of this change (which is also breaking for plugin developers) the
rest resources plugin has been removed from the build plugin and now requires
either explicit application or application via the new YAML REST test plugin.

Plugin developers should be able to fix the breaking changes to the YAML tests
by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests
under a yamlRestTest folder (instead of test)
2020-07-06 12:13:01 -05:00
Luca Cavanna 4366360895
Improve error handling in async search code (#57925)
- The exception that we caught when failing to schedule a thread was incorrect.
- We may have failures when reducing the response before returning it, which were not handled correctly and may have caused get or submit async search task to not be properly unregistered from the task manager
- when the completion listener onFailure method is invoked, the search task has to be unregistered. Not doing so may cause the search task to be stuck in the task manager although it has completed.

Closes #58995
2020-07-03 14:58:46 +02:00
Rene Groeschke 9526c7a4b3
Replace compile configuration usage with api (#58451)
- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
  - required as java library will by default not have build jar file
  - jar file is now explicit input of the task and gradle will ensure its properly build
2020-06-30 09:37:09 +02:00
Przemyslaw Gomulka 9bef31ccd3
Do not create two loggers for DeprecationLogger (#58435)
DeprecationLogger's constructor should not create two loggers. It was
taking parent logger instance, changing its name with a .deprecation
prefix and creating a new logger.
Most of the time parent logger was not needed. It was causing Log4j to
unnecessarily cache the unused parent logger instance.
2020-06-29 13:38:21 +02:00
Jim Ferenczi 19ec636d66
Submit _async search task should cancel children on cancellation (#58332)
This change allows the submit async search task to cancel children
and removes the manual indirection that cancels the search task when the submit
task is cancelled. This is now handled by the task cancellation, which can cancel
grand-children since #54757.
2020-06-24 09:00:25 +02:00
Jim Ferenczi 1e5ba7bd78
Handle failures with no explicit cause in async search (#58319)
This commit fixes an AOOBE in the handling of fatal
failures in _async_search. If the underlying cause is not found,
this change uses the root failure.

Closes #58311
2020-06-18 17:58:23 +02:00
Rene Groeschke 5f9d1f1d7c
Unify dependency licenses task configuration (#58116)
- Remove duplicate dependency configuration
- Use task avoidance api accross the build
- Remove redundant licensesCheck config
2020-06-17 18:27:16 +02:00
Igor Motov 95afb2ef52 Merge remote-tracking branch 'elastic/master' into feature/async-eql 2020-06-12 09:32:47 -04:00
Rene Groeschke 680ea07f7f
Remove deprecated usage of testCompile configuration (#57921)
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
2020-06-12 13:34:53 +02:00
Igor Motov 509c6749a7
Move async task maintenance service to core plugin (#57700)
The async task task maintenance service is used by both async search plugin
as well as EQL plugin. So it needs to reside in the core.

Relates to #49638
2020-06-11 11:12:34 -04:00
Igor Motov 8072646c88 Merge remote-tracking branch 'elastic/master' into feature/async-eql 2020-06-10 10:39:41 -04:00
Luca Cavanna 8afd9a730f
Add description to submit and get async search, as well as cancel tasks (#57745)
This makes it easier to debug where such tasks come from in case they are returned from the get tasks API.

Also renamed the last occurrence of waitForCompletion to waitForCompletionTimeout in get async search request.
2020-06-08 11:09:53 +02:00
Luca Cavanna 79aec53c46
Specify reason whenever async search gets cancelled (#57761)
This allows to trace where the cancel tasks request came from given that it may be triggered for multiple reasons.
2020-06-08 10:21:54 +02:00
Igor Motov d197a85ee5 Merge remote-tracking branch 'elastic/master' into feature/async-eql 2020-06-04 15:50:40 -04:00
Igor Motov 5a95b0f812
EQL: Adds delete async EQL search result action (#57258)
Adds support for deleting async EQL search results to EQL search API.

Relates to #49638
2020-06-04 15:48:06 -04:00
Przemyslaw Gomulka 4d6dc51c72
Header warning logging refactoring (#55941)
Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding a response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog).
Introducing A ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed.

relates #55699
relates #52369
2020-06-01 15:44:01 +02:00
Igor Motov e99981978d
EQL: Adds get async EQL search result action (#56852)
Adds support for retrieving async EQL search result s to eql search API.

Relates to #49638
2020-05-27 10:27:17 -04:00
Igor Motov a301eab85b Merge remote-tracking branch 'elastic/master' into feature/async-eql 2020-05-27 08:55:02 -04:00
Jim Ferenczi f6eddac7ee
Stop async search maintenance service on restart (#56982)
This change ensures that we stop the maintenance service on all nodes
when a data node is restarted. This ensures that we don't send
update_by_query requests on the node that is restarted.
This commit also raises the log level to trace for some packages
in order to investigate the failures to acquire a shard lock
after a restart.

Relates #56765
2020-05-26 09:29:56 +02:00
Igor Motov c4960d5a51 Merge remote-tracking branch 'elastic/master' into feature/async-eql 2020-05-18 11:01:49 -04:00
David Kyle f15057a856
Muse AsyncSearchActionIT (#56897)
For #56765
2020-05-18 13:34:07 +01:00
Igor Motov dd2ac8ea04 Merge remote-tracking branch 'elastic/master' into feature/async-eql 2020-05-15 15:32:55 -04:00
Jim Ferenczi 39a2dec558
Stop/Start async search maintenance service in tests(#56673)
This change ensures that the maintenance service that is responsible for deleting the expired response is stopped between each test. This is needed since we check that no search context are in-flight after each test method.

Fixes #55988
2020-05-14 11:45:36 +02:00
Igor Motov fcebd4fd02
EQL: Adds an ability to start an asynchronous EQL search (#56631)
Adds support for async searches to eql search API. This commit is limited to
only submitting search API requests and doesn't provide APIs to get results
nor delete the results. These functions will be added in follow up PRs.

Relates to #49638
2020-05-13 09:50:15 -04:00
Ryan Ernst 92fcbd3a27
Migrate remaining ESIntegTestCases to internalClusterTest (#56479)
This commit migrates the ESIntegTestCase tests in x-pack to the
internalClusterTest source set.
2020-05-11 19:01:34 -07:00
Jim Ferenczi cb70ce7468
Fix spurious failures in AsyncSearchIntegTestCase (#56026)
Async search integration tests are subject to random failures when:
  * The test index has more than one replica.
  * The request cache is used.
  * Some shards are empty.
  * The maintenance service starts a garbage collection when node is closing.

They are also slow because the test index is created/populated on each
test method.

This change refactors these integration tests in order to:
  * Create the index once for the entire test suite.
  * Fix the usage of the request cache and replicas.
  * Ensures that all shards have at least one document.
  * Increase the delay of the maintenance service garbage collection.

Closes #55895
Closes #55988
2020-05-11 14:55:04 +02:00
Luca Cavanna 9ffd006ca0
Async Search: correct shards counting (#55758)
Async search allows users to retrieve partial results for a running search. For partial results, the number of successful shards does not include the skipped shards, while the response returned to users should.

Also, we recently had a bug where async search would miss tracking shard failures, which would have been caught if we had assertions in place that verified that whenever we get the last response, the number of failures included in it is the same as the failures that were tracked through the listener notifications.
2020-05-06 12:05:10 +02:00
Luca Cavanna e09425c4b0
Consolidate DelayableWriteable (#55932)
This commit includes a number of minor improvements around `DelayableWriteable`: javadocs were expanded and reworded, `get` was renamed to `expand` and `DelayableWriteable` no longer implements `Supplier`. Also a couple of methods are now private instead of package private.
2020-04-30 16:28:47 +02:00
jimczi 7c2ec580ee Revert "Mute failing tests in AsyncSearchActionIT"
This reverts commit 6e6a10f516.
2020-04-29 22:22:25 +02:00
Mark Vieira 6e6a10f516
Mute failing tests in AsyncSearchActionIT 2020-04-29 10:58:47 -07:00
Jim Ferenczi 2907f10701
Fix AsyncSearchActionIT#testTermsAggregation (#55924)
This commit fixes the initialization of total hits
in the async search response.

Relates #55683
Closes #55920
2020-04-29 14:55:42 +02:00
David Roberts 379394792d Muting AsyncSearchActionIT.testTermsAggregation
Due to https://github.com/elastic/elasticsearch/issues/55920
2020-04-29 11:46:50 +01:00
Nik Everett 55874c94e4
Save memory in on aggs in async search (#55683)
This replaces a reference to the result of partially reducing
aggregations that async search keeps with a reference to the serialized
form of the result of the partial reduction which we need to keep
anyway.
2020-04-28 12:54:29 -04:00
Jim Ferenczi db288a29ec
Ignore closed exception on refresh pending location listener (#55799)
This newly added listener should catch closed exceptions when
accessing the internal engine.

Closes #55792
2020-04-27 15:05:47 +02:00
Przemysław Witek 25cea2f06e
Mute failing tests (#55794) 2020-04-27 11:53:59 +02:00
Jim Ferenczi 7f58f4cd00
AsyncSearchMaintenanceService should stop when closing a node (#55651)
This change turns the AsyncSearchMaintenanceService into an
AbstractLifecycleComponent and ensures that the service is
stopped when a node is closing.

Closes #55646
2020-04-24 09:37:58 +02:00
jimczi 5c9b639233 Fix AsyncSearchTaskTests#testWithFetchFailures
Fix usage of a possible invalid random range [1, 0].

Relates #55688
2020-04-24 00:43:49 +02:00
Jim Ferenczi f07059059d
Fix (de)serialization of async search failures (#55688)
The (de)serialization code of the async search response
cannot handle exceptions that extend ElasticsearchException (e.g. ScriptException).
This commit fixes this bug by serializing the error with the more generic
StreamInput#writeException.
2020-04-23 23:00:34 +02:00
Igor Motov 3bf2df6037
Make AsyncSearchIndexService reusable (#55598)
EQL will require very similar functionality to async search. This PR refactors
AsyncSearchIndexService to make it reusable for EQL.

Supersedes #55119
Relates to #49638
2020-04-23 14:23:48 -04:00
Jim Ferenczi 77a1afc501
Fix expiration time in async search response (#55435)
This change ensures that we return the latest expiration time
when retrieving the response from the index.
This commit also fixes a bug that stops the garbage collection of saved responses if the async search index is deleted.
2020-04-21 13:34:05 +02:00
Rory Hunter 8638d08ebf
Always use deprecateAndMaybeLog for deprecation warnings (#55115)
Closes #53137. Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...), with an appropriate key, so that all messages
can potentially be deduplicated.
2020-04-16 16:19:45 +01:00
David Turner 6e98af385a
Add RepositoriesService to createComponents() args (#54814)
Today we pass the `RepositoriesService` to the searchable snapshots plugin
during the initialization of the `RepositoryModule`, forcing the plugin to be a
`RepositoryPlugin` even though it does not implement any repositories.

After discussion we decided it best for now to pass this in via
`Plugin#createComponents` instead, pending some future work in which plugins
can depend on services more dynamically.
2020-04-16 15:40:28 +01:00
Luca Cavanna 1b694776ed
Async search: create internal index only before storing initial response (#54619)
We currently create the .async-search index if necessary before performing any action (index, update or delete). Truth is that this is needed only before storing the initial response. The other operations are either update or delete, which will anyways not find the document to update/delete even if the index gets created when missing. This also caused `testCancellation` failures as we were trying to delete the document twice from the .async-search index, once from `TransportDeleteAsyncSearchAction` and once as a consequence of the search task being completed. The latter may be called after the test is completed, but before the cluster is shut down and causing problems to the after test checks, for instance if it happens after all the indices have been cleaned up. It is totally fine to try to delete a response that is no longer found, but not quite so if such call will also trigger an index creation.

With this commit we remove all the calls to createIndexIfNecessary from the update/delete operation, and we leave one call only from storeInitialResponse which is where the index is expected to be created.

Closes #54180
2020-04-10 18:19:14 +02:00
Nik Everett b8564b0159
Remove pipline aggs from agg result tree (backport of #54716)
This removes pipeline aggregators from the aggregation result tree
except for a single field used for backwards compatibility with pre-7.8
versions of Elasticsearch. That field isn't populated unless we are
serializing to pre-7.8 Elasticsearch. So, good news! We no longer build
pipeline aggregators on the data node. Most of the time.
2020-04-07 14:13:39 -04:00
Jim Ferenczi f3f1b6993c
Preserve final response headers in asynchronous search (#54349)
This change adds the response headers of the original search request
in the stored response in order to be able to restore them when retrieving a result
from the async-search index. It also ensures that response headers are preserved for
users that retrieve a final response on a running search task.
Partial response can eventually return response headers too but this change only ensures
that they are present when the response if final.

Relates #33936
2020-04-07 08:29:21 +02:00
Jim Ferenczi 9e24558f0b
Fix transport serialization of AsyncSearchUser (#54761)
This change ensures that the AsyncSearchUser is correctly (de)serialized when
an action executed by this user is sent to a remote node internally (via transport client).
2020-04-07 08:24:56 +02:00
Luca Cavanna 02772ca55b
[TEST] rename AsyncSearchActionTests to IT and move it out of unit tests (#54520)
`AsyncSearchActionTests` currently fails quite often. That is since the introduction of `RestSubmitAsyncSearchActionTests` which indirectly manipulates the channels being tracked in `RestCancellableNodeClient`. There are channels left in the map after `RestSubmitAsyncSearchActionTests`  is run, and later `AsyncSearchActionTests` checks that there are no channels in the map which makes each test method fail. This is particularily hard to reproduce as the order in which tests are run appears to be platform dependent.

The test cluster assertion that there are no channels in the map only makes sense in the context of internal cluster tests, while there may be collisions with unit tests that register http channels as part of their testing.

This can be solved by renaming `AsyncSearchActionTests` to `AsyncSearchActionIT`. This way it won't be run as part of unit tests but rather within another JVM where the number of channels is `0` and such assumption holds, because there are no expected manual manipulation of the channels.

Relates to #54180
2020-04-01 11:21:42 +02:00
Jason Tedor 95a7eed9aa
Rename MetaData to Metadata in all of the places (#54519)
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
2020-03-31 15:52:01 -04:00
Nik Everett a0f7c4a6a4
Clean up how pipeline aggs check for multi-bucket (#54161)
Pipeline aggregations like `stats_bucket`, `sum_bucket`, and
`percentiles_bucket` only operate on buckets that have multiple buckets.
This adds support for those aggregations to `geo_distance`, `ip_range`,
`auto_date_histogram`, and `rare_terms`.

This all happened because we used a marker interface to mark compatible
aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget
to implement the interface.

This replaces the marker interface with an abstract method in
`AggregationBuilder`, `bucketCardinality` which makes you return `NONE`,
`ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At
this point `ONE` and `NONE` amount to about the same thing, but I
suspect that'll be a useful distinction when validating bucket sorts.

Closes #53215
2020-03-28 11:47:01 -04:00
David Turner 1d3a8de100 AwaitsFix for #54180 2020-03-26 15:31:15 +00:00
Christoph Büscher 4ae258a27b
HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200)
Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and
requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no
longer needed since we now only send the parameters along with the rest request
that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct
defaults are set on the client side. This change removes setting and sending
these defaults where possible, leaving only the overwrite of batchedReduceSize
with a default value of 5, since the default used in the vanilla SearchRequest
is 512. However, we don't need to send this value along as a request parameter
if its the default since the correct one will be set on the receiving end if no
value is specified.
Also adding tests for RestSubmitAsyncSearchAction that check the correct
defaults are set when parameters are missing on the server side.
2020-03-26 14:00:28 +01:00
Yannick Welsch 0a34b71f3c
Schedule commands in current thread context (#54187)
Changes ThreadPool's schedule method to run the schedule task in the context of the thread
that scheduled the task.

This is the more sensible default for this method, and eliminates a range of bugs where the
current thread context is mistakenly dropped.

Closes #17143
2020-03-26 09:46:36 +01:00
Luca Cavanna 1c482141ee
Async search: rename REST parameters (#54198)
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search.
Also it renames clean_on_completion to keep_on_completion and turns around its behaviour.

Closes #54069
2020-03-26 09:40:05 +01:00
Luca Cavanna 8c29035635
Async search: prevent users from overriding pre_filter_shard_size (#54088)
Submit async search forces pre_filter_shard_size for the underlying search that it creates.
With this commit we also prevent users from overriding such default as part of request validation.
2020-03-24 17:04:38 +01:00
Luca Cavanna c5a3295475
Async search response: output start and expiration time as time fields (#54084)
This commits makes start_time and expiration_time time fields, so that their date variant will be printed out when human readable output is requested.
2020-03-24 17:01:07 +01:00
Jim Ferenczi 68f42979f7
Improve async search's tasks cancellation (#53799)
This commit adds an explicit cancellation of the search task if
the initial async search submit task is cancelled (connection closed by the user).
This was previously done through the cancellation of the parent task but we don't
handle grand-children cancellation yet so we have to manually cancel the search task
in order to ensure that shard actions are cancelled too.
This change can be considered as a workaround until #50990 is fixed.
2020-03-24 12:31:07 +01:00
Christoph Büscher 3ceb60becf
Add async_search get and delete APIs to HLRC (#53828)
This commit adds the "_async_searhc" get and delete APIs to the
AsyncSearchClient in the High Level Rest Client.

Relates to #49091
2020-03-23 15:01:35 +01:00
Luca Cavanna b1f4f32137
Async Search: replicas to auto expand from 0 to 1 (#53964)
This way single node clusters that are green don't go yellow once async search is used, while
all the others still have one replica.
2020-03-23 13:42:35 +01:00
Luca Cavanna 1af04175a1
Async search: remove version from response (#53960)
The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).

That said this commit clarifies this in the docs and removes the version field from the async search response
2020-03-23 13:42:10 +01:00
Nik Everett 4d81edb625
Stop using round-tripped PipelineAggregators (#53423)
This begins to clean up how `PipelineAggregator`s and executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
   serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
   `PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
   really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.

This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregtion tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.

Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
2020-03-16 14:51:54 -04:00
Luca Cavanna 548fd9494b
Move async search yaml tests to x-pack yaml test folder (#53537)
The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too.
2020-03-13 15:37:29 +01:00
Jim Ferenczi 88d05d1e33
Fix race condition when deleting an async search (#53513)
Deleting an async search id can throw a ResourceNotFoundException even if the query was successfully cancelled.
We delete the stored response automatically if the query is cancelled so that creates a race with the delete action that also ensures that the task is removed. This change ensures that we ignore missing async search ids in the async search index if they were successfully cancelled.

Relates #53360
Relates #49931
2020-03-12 22:25:04 +01:00
jimczi eba2b03941 Fix sporadic failures in AsyncSearchActionTests (take 2)
This change removes the need to always get a new version when iterating
on an async search. This is needed since we cannot guarantee that shards will
be queried exactly in order.

Relates #53360
2020-03-11 18:17:00 +01:00
Jim Ferenczi ab66529021
Fix sporadic failures in AsyncSearchAsyncTests (#53375)
Shard group failure callbacks should be executed before incrementing
the total operations. This is required to ensure that we don't notify
a shard group failure **after** the completion callback.

This change ensures that we set the isRunning flag to `false`
when storing the initial response of an async search request.
2020-03-11 17:14:15 +01:00
Luca Cavanna 99513c0e7a
Refine SearchProgressListener internal API (#53373)
The following cumulative improvements have been made:
- rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce`
- add unit test for `SearchShard`
- on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private
- Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class.
- rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members
- make `SearchShard` and `SearchShardTask` classes final
2020-03-11 14:29:13 +01:00
Luca Cavanna 92113c3d8e
Submit async search to work only with POST (#53368)
Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method.
2020-03-11 10:17:02 +01:00
William Brafford 55f7246feb
Mute AsyncSearchActionTests (#53361) 2020-03-10 15:46:57 -04:00
jimczi b9529d0f79 Revert "Fix spurious failures in AsyncSearchActionTests"
This reverts commit d46b2af478.
2020-03-10 18:52:53 +01:00
jimczi d46b2af478 Fix spurious failures in AsyncSearchActionTests
AsyncSearchActionTests#testCleanupOnFailure fails sporadically in CI but not locally.
This commit switches the tests into a SuiteScopeTestCase that creates internal states
once on static members in order to make the tests more reproducible.

Relates #49931
2020-03-10 18:41:17 +01:00
Jim Ferenczi 146b2a85b4
Add new x-pack endpoints to track the progress of a search asynchronously (#49931)
### High level view

This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.

````
# Submit an _async_search and waits up to 100ms for a final response
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
  "aggs": {
    "date_histogram": {
      "field": "@timestamp",
      "fixed_interval": "1h"
    }
  }
}
````

If after 100ms the final response is not available, a `partial_response` is included in the body:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 1,
  "is_running": true,
  "is_partial": true,
  "response": {
   "_shards": {
       "total": 100,
       "successful": 5,
       "failed": 0
    },
    "total_hits": {
      "value": 1653433,
      "relation": "eq"
    },
    "aggs": {
      ...
    }
  }
}
````

The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue to monitor the progress of the search users can call the get `_async_search` API like the following:

````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````

That returns a new response that can contain the same partial response than the previous call if the search didn't progress, in such case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 10,
  "is_running": false,
  "response": {
     "is_partial": false,
     ...
  }
}
````

## Persistency

Asynchronous search are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search, the example above raises the default value for the search to `10d`. 
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result.

Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed. 

## Resiliency

Asynchronous search are not persistent, if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However final responses and failures are persisted in a system index that allows
to retrieve a response even if the task finishes.

````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````

The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that completes within the initial `wait_for_completion`.

## Security

The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.

Relates #49091

Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
2020-03-10 16:33:15 +01:00