300 Commits

Author SHA1 Message Date
David Roberts 4bf6a778b7
[Transform] Don't search if all indices are unchanged between checkpoints (#77204)
When every index that a transform is configured to search has
remained completely unchanged between checkpoints the transform
should not do a search at all.

Following #75839 there was a problem where the scenario of all
indices being unchanged between checkpoints could cause an empty
list of indices to be searched, which Elasticsearch treats as
meaning _all_ indices. This change should prevent that happening
in future.

Fixes #77137
2021-09-03 12:42:01 +01:00
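The guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual transform code: `SearchGuard` and `shouldSearch` are hypothetical names. The only behaviour taken from the commit message is that an empty index list must never be sent as a search, because Elasticsearch treats it as meaning all indices.

```java
import java.util.List;

// Hypothetical sketch; class and method names are assumptions, not Elasticsearch APIs.
final class SearchGuard {

    // Returns true only when at least one index changed between checkpoints.
    // An empty list must not reach the search layer, because an empty
    // index list is interpreted as "all indices".
    static boolean shouldSearch(List<String> changedIndices) {
        return changedIndices != null && !changedIndices.isEmpty();
    }
}
```

A caller would check `shouldSearch(...)` first and skip the search step entirely when it returns false, rather than building a request with no indices.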
Hendrik Muhs 4974a7cd7b
[Transform] Reduce indexes to query based on checkpoints (#75839)
Continuous transforms reduce the amount of data to query by detecting what has changed
since the last checkpoint. This information is used to inject queries that narrow the scope.
Previously, the query was sent to all configured indices. This change reduces the indexes to call
using checkpoint information. The number of network calls goes down, which in addition to improving
performance reduces the probability of a failure.

This change mainly helps transforms of type latest; pivot transforms require additional
changes planned for later.
2021-08-26 09:08:10 +02:00
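One way to picture the index reduction is to compare the per-index checkpoint values of two consecutive checkpoints and keep only the indices whose values differ. The sketch below assumes a checkpoint is represented as a map from index name to an array of per-shard values. That representation, and the names `ChangedIndices` and `changedSince`, are assumptions made for illustration and are not taken from the commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the checkpoint representation is an assumption.
final class ChangedIndices {

    // Returns the indices whose per-shard checkpoint values differ from the
    // previous checkpoint, or that did not exist in the previous checkpoint.
    static List<String> changedSince(Map<String, long[]> last, Map<String, long[]> next) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, long[]> entry : next.entrySet()) {
            long[] previous = last.get(entry.getKey());
            if (previous == null || !Arrays.equals(previous, entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```

The result can be empty when nothing changed, so a caller has to handle that case explicitly instead of passing the empty list on to a search.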
Hendrik Muhs deac0030bf
[Test] TransformIndexerStateTests testStopAtCheckpoint fix listener count (#76880)
change check for open listeners, avoiding failures due to execution timing

fixes #76555
2021-08-25 08:04:33 +02:00
Przemysław Witek 676d4de3de
[Transform] Implement the ability to preview the existing transform (#76697) 2021-08-24 14:41:49 +02:00
Przemysław Witek ec07e4213e
[Transform] Rename interim_results to align_checkpoints (#76609) 2021-08-18 13:58:50 +02:00
Przemysław Witek f9d30adf6f
[Transform] Align transform checkpoint range with date_histogram interval for better performance (#74004) 2021-08-16 17:50:11 +02:00
Nik Everett e305a6bed7
Name `BulkItemResponse` ctors (#76439)
* Name `BulkItemResponse` ctors

`BulkItemResponse` can contain either a success or failure. This
replaces the two constructors used to build either case with named
static methods. So instead of
```
return new BulkItemResponse(0, OpType.CREATE, createResponse);
return new BulkItemResponse(0, OpType.CREATE, failure);
```
you now use
```
return BulkItemResponse.success(0, OpType.CREATE, createResponse);
return BulkItemResponse.failure(0, OpType.CREATE, failure);
```

This makes it marginally easier to read code building these things - you
don't have to know the type of the parameter to know if it's a failure
or success.

* Consistent

* Mock response
2021-08-12 14:41:26 -04:00
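The named-factory pattern this commit applies can be shown on a simplified stand-in class. `ItemResult` below is hypothetical and much smaller than `BulkItemResponse`; it only illustrates how a private constructor plus two named static methods makes the success or failure case explicit at the call site.

```java
import java.util.Objects;

// Hypothetical stand-in for illustration; not the real BulkItemResponse.
final class ItemResult {
    private final int id;
    private final String response;   // set only for a success
    private final Exception failure; // set only for a failure

    // Private constructor: callers must go through the named factories.
    private ItemResult(int id, String response, Exception failure) {
        this.id = id;
        this.response = response;
        this.failure = failure;
    }

    static ItemResult success(int id, String response) {
        return new ItemResult(id, Objects.requireNonNull(response), null);
    }

    static ItemResult failure(int id, Exception failure) {
        return new ItemResult(id, null, Objects.requireNonNull(failure));
    }

    boolean isFailed() {
        return failure != null;
    }

    int id() {
        return id;
    }
}
```

With this shape, `ItemResult.success(0, "created")` and `ItemResult.failure(0, new RuntimeException("boom"))` read unambiguously without knowing the parameter types.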
Benjamin Trent aaeecd7729
[ML][Transform] fixing testFailureCounterIsResetOnSuccess test failure #76397 (#76417)
There are two possible race conditions that were not
previously handled in this test.

- Since the syncconfig was null, it may be that the
  transform actually gets set to stopping/stopped
  and it's unable to kick off another indexing pass
- It may also be that the indexer thread is still
  finishing up work when the second execution is
  requested, so it returns false.

Adding a sync config and assertBusy handles these
cases. Ran 1k+ times locally with this change
and it never failed. Without it, ~10 runs failed.

closes https://github.com/elastic/elasticsearch/issues/76397
2021-08-12 10:01:49 -04:00
Benjamin Trent 70d91d1d38
[ML][Transform] reset failure count when a transform aggregation page is handled successfully (#76355)
Failure count should not only be reset at checkpoints. Checkpoints could have many pages of data. Consequently, we should reset the failure count once we handle a single composite aggregation page.

This way, the transform won't mark itself as failed erroneously when it has actually succeeded searches + indexing results within the same checkpoint.

closes #76074
2021-08-11 10:59:19 -04:00
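The change in reset behaviour can be sketched with a small counter. `FailureTracker`, its method names, and the `maxFailures` limit are hypothetical. The only point taken from the commit message is that the count is reset after each successfully handled page rather than only at a checkpoint.

```java
// Hypothetical sketch; names and the limit semantics are assumptions.
final class FailureTracker {
    private final int maxFailures;
    private int failureCount;

    FailureTracker(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    // Records a failed search or indexing step.
    // Returns true when the transform should be marked as failed.
    boolean onFailure() {
        failureCount++;
        return failureCount > maxFailures;
    }

    // Called after every successfully handled composite aggregation page,
    // so failures in earlier pages of the same checkpoint do not accumulate.
    void onPageHandled() {
        failureCount = 0;
    }

    int failureCount() {
        return failureCount;
    }
}
```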
Ioannis Kakavas fed790e4e4
Set xpack.security.enabled to true for all licenses (#72300)
This change sets the default value for `xpack.security.enabled` to true
for all licenses. As such the value of the settings is read directly 
from the node's settings and not from XPackLicenseState which 
doesn't need to keep track of it depending on potential license changes
any more.
2021-08-09 09:36:01 +03:00
Hendrik Muhs 4f22f437ee
[Transform] fix potential deadlock when using stop_at_checkpoint (#76034)
When _stop gets called with stop_at_checkpoint=true and, at the same time, a transform gets triggered internally or externally, a race condition could lead to a deadlock of 5s. The change fixes the situation where 2 lock objects could lock one another.

fixes #75846
2021-08-06 16:07:49 +02:00
Hendrik Muhs cc07145a2e
[CI][Transform] fix GroupByOptimizerTests randomization failure part 2 (#76009)
avoid clashing field name by using a unique prefix

fixes #75957
2021-08-03 14:20:58 +02:00
Hendrik Muhs a3ec2ee318
allow prefixing field names in random object creation. Fix test failure (#75928)
caused by clashing field names.

fixes #75845
2021-08-02 07:01:53 -04:00
Hendrik Muhs fb0846a23d
[Transform][Rollup] remove unnecessary list indirection (#75459)
Remove an unnecessary indirection and refactor progress tracking. Both rollup and transform
process documents as a stream; however, the AsyncTwoPhaseIndexer takes a List of index
requests. This change removes the unnecessary temporary container and makes upcoming
transform enhancements easier.
2021-07-28 16:34:17 +02:00
Hendrik Muhs 4e8301e8b7
[Transform] Fix transform fails when getting field mappings (#75694)
move field mapping retrieval from task startup into indexer

fixes #75693
2021-07-27 11:50:26 +02:00
Hendrik Muhs fc37cfc8b9
[Transform] fix listener for search context missing exception (#75615)
Fix an unreleased regression introduced in #74984. When a pit search context disappeared, the listener was called twice and the transform failed.
2021-07-22 11:00:28 +02:00
Hendrik Muhs 383fbd8e07
[Transform] Optimize composite agg execution using ordered groupings (#75424)
Automatically reorder group_by for composite aggs, ensuring date histogram
group by comes first. The order is only changed for execution, the provided
config remains unchanged.

In case of 2 group_by's of the same order type, the configuration order is
respected. Script and runtime field based group_by's are penalized.
2021-07-20 09:19:28 +02:00
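The reordering rule can be illustrated with a stable sort over a copy of the configured group_by list. Everything named here (`GroupByOrdering`, `GroupBy`, `rank`) is hypothetical, and the numeric ranking is an assumption. The commit message only states that date histograms come first, that ties keep configuration order, and that script and runtime field based group_by's are penalized.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; names and the ranking scheme are assumptions.
final class GroupByOrdering {

    static final class GroupBy {
        final String name;
        final boolean dateHistogram;
        final boolean scriptOrRuntimeField;

        GroupBy(String name, boolean dateHistogram, boolean scriptOrRuntimeField) {
            this.name = name;
            this.dateHistogram = dateHistogram;
            this.scriptOrRuntimeField = scriptOrRuntimeField;
        }
    }

    // Lower rank executes earlier. Assumed scheme: date histograms first,
    // anything based on a script or runtime field pushed towards the end.
    static int rank(GroupBy groupBy) {
        int rank = groupBy.dateHistogram ? 0 : 1;
        return groupBy.scriptOrRuntimeField ? rank + 2 : rank;
    }

    // Sorts a copy, so the provided configuration stays unchanged.
    // List.sort is stable, so equal ranks keep their configured order.
    static List<GroupBy> executionOrder(List<GroupBy> configured) {
        List<GroupBy> ordered = new ArrayList<>(configured);
        ordered.sort(Comparator.comparingInt(GroupByOrdering::rank));
        return ordered;
    }
}
```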
Hendrik Muhs 15a3b3541d
[Transform] improve performance by using point in time API for search (#74984)
Use point in time API for every checkpoint in transform. Using point in time reduces pressure
on the source indexes, e.g. fewer refreshes. If pit isn't supported (e.g. when searching
remote clusters), it falls back to ordinary search requests as before.

closes #73481
2021-07-14 12:00:49 +02:00
Przemysław Witek 232921f62a
Add logging of shard failures (#75275) 2021-07-14 11:00:11 +02:00
Rene Groeschke 1db75f75ca
Polish and cleanup test related plugins (#74673)
- avoid eagerly created test cluster
- remove duplicate superfluous configuration
- resolve system properties via provider factory
- move common test configuration / setup into rest test base plugin
2021-07-05 10:34:36 +02:00
Rene Groeschke d8e4e48a3b
Avoid creating unused test tasks (#74644)
In line with the overall theme of configuring and adding less to the build instead of just disabling it later,
we're replacing standalone-test with standalone-rest, which avoids creating the
unused test tasks.

The standalone rest test plugin and the other rest test plugins behave a little differently in how source sets and test tasks are wired.

The standalone rest test plugin assumes that all RestTestTasks are using the same sourceSet (test). The yaml and java rest test plugins use one dedicated sourceSet per test task.

In the long run we will probably migrate standalone-rest-test usages to one of the other plugins and deprecate standalone-rest-test.
2021-06-30 14:11:25 +02:00
Rene Groeschke 54738fc067
Avoid unused sourcesets and tasks in javaRestTest plugin (#74492)
We only need the javaRestTest sourceSet and can avoid main and test sourceSet by
just using the newly introduced ElasticsearchJavaBase instead of the ElasticsearchJava plugin
2021-06-28 09:54:59 +02:00
Rene Groeschke b79dd52c1b
Cleanup QA projects build scripts (#74428)
Aiming to configure less during the build,
this removes non-required configuration from qa build scripts that do not
contain any sources. We also remove a few non-required afterEvaluate hooks.
2021-06-23 11:35:47 +02:00
Hendrik Muhs d51b995f3a
[Transform] optimize histogram group_by change detection (#74031)
implement a simple change optimization for histograms using min and max aggregations. The
optimization is not applied if the range cutoff would be too small compared to the overall
range from previous checkpoints. At least 20% must be cut compared to former checkpoints.

fixes #63801
2021-06-14 16:44:36 +02:00
Przemysław Witek 602fb449b1
[Transform] Small cleanup in transform indexer code (#73891) 2021-06-10 08:32:26 +02:00
Ryan Ernst 63012c8a40
Move ParseField to o.e.c.xcontent (#73923)
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.

relates #73784
2021-06-08 13:32:14 -07:00
Ryan Ernst 68817d7ca2
Rename o.e.common in libs/core to o.e.core (#73909)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
2021-06-08 09:53:28 -07:00
William Brafford 953059c35b
Resolve concrete associated indices when resetting features (#73017)
When associated index patterns contained wildcards and
action.destructive_requires_name was set to true, feature resets were
failing. In order to avoid this, we want to resolve associated index
names, then pass the concrete index names to the delete request.

For normal system indices, the SystemIndexDescriptor provides a method
that searches cluster metadata for indices that match the system
index pattern. This commit introduces an AssociatedIndexDescriptor that
provides the same mechanism. Although we could use an
IndexNameExpressionResolver for the associated indices, it makes sense
to me to keep things consistent across the various feature index pattern
collections.

This change has the effect of allowing the same regex-like syntax that
system index patterns can use, rather than just wildcards, in associated
index patterns.
2021-06-02 15:46:55 -04:00
Przemysław Witek 7e3f098dcf
[Transform] Revamp transform config and query validation code (#72526) 2021-05-26 13:49:54 +02:00
David Roberts 0216cf065b
[ML] Switch ML internal index templates to composable templates (#73232)
Legacy index templates are deprecated but ML was still using
them for its hidden indices.

This PR switches the legacy ML index templates to use the new
composable index template framework.

The composable index templates get installed once the master
node is on a version that understands them.  For templates
that need to be up-to-date in mixed version clusters where the
master might still be on a version that doesn't understand
composable index templates we still ship the legacy template
too, and install this if required in the mixed version cluster.
(The notifications index template falls into this category.)
This makes a couple of places in the code a little messy, as
the new style template definitions don't contain a dummy _doc
level (where the type used to be), but the legacy template
definitions do - hopefully we can tidy this up in master once
8.0 is released.

There is one more change of note in this PR that is not
strictly related to switching to composable templates, but
which was shown up during the testing.  We used to wait for
all templates to be installed by the master node before running
tests in mixed version clusters.  I do not believe we should
have been doing this, as other upgrade orchestration systems,
e.g. Cloud, will not be doing this.  Our production code needs
to install templates and/or mappings before any operation that
requires them if there's a chance that the elected master won't
have done this in time.

Fixes #65437
2021-05-24 11:13:24 +01:00
Przemysław Witek 51181d5d52
[Transform] Improve error message when user lacks privilege in _preview endpoint (#72002) 2021-05-12 15:06:03 +02:00
Hendrik Muhs ec92261293
[Transform] test with minimal privileges (#72816)
limit the privileges to the minimal required ones

relates #72715
2021-05-10 13:01:07 +02:00
Rene Groeschke e609e07cfe
Remove internal build logic from public build tool plugins (#72470)
Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic

This includes a refactoring of ElasticsearchDistribution to handle types
better, so that we can differentiate between the Elasticsearch
distribution types supported in TestClustersPlugin and types only supported
in internal plugins.

It also introduces a set of internal versions of public plugins.

As part of this we also generate the plugin descriptors now.

As a follow-up, we can move these publicly used classes into
an extra project (declared as an included build).

We keep LoggedExec and VersionProperties effectively public and add a workaround for RestTestBase.
2021-05-06 14:02:35 +02:00
Hendrik Muhs ca63643525
[Transform] fix 2 issues with index sort in integration test (#72742)
fix 2 corner cases in test setup: unsigned_long is not supported as index sort; do not overlay a runtime field with index sort

fixes #72733
relates #72692
2021-05-05 11:58:53 +02:00
Ignacio Vera 41241f69a2
mute TransformContinuousIT#testContinousEvents (#72734) 2021-05-05 09:24:11 +02:00
Hendrik Muhs 98db34907d
[Transform] unmute continuous transform testing on sorted indexes (#72692)
unmute continuous transform testing on sorted indexes. This extra
test randomness had been disabled because it triggered Lucene assertions.
The upstream issue seems to have been fixed.
2021-05-04 17:55:57 +02:00
Hendrik Muhs 125726a0f2
[Transform] Fix rolling upgrade regression introduced in #72533 (#72666)
#72533 introduced a regression, causing transforms to time out/fail.
With this change the transform only waits for 1 active shard (the primary), as waiting for all can block during
a rolling upgrade.

fixes #72617
relates #72533
2021-05-04 14:25:16 +02:00
Hendrik Muhs 918be1d501
[Transform] avoid transform failure during rolling upgrade (#72533)
ensure shards are searchable after creation of a new internal index version

fixes #72525
2021-04-30 15:28:28 +02:00
Lee Hinman 0f50800ecb
Don't assign persistent tasks to nodes shutting down (#72260)
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of
nodes (candidates) that are not currently shutting down.

It does not yet cancel tasks that may already be running on the nodes that are shut down; that will
be added in a subsequent request.

Relates to #70338
2021-04-28 14:00:57 -06:00
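The candidate selection can be sketched as a simple filter. `CandidateNodes` and its use of plain node id strings are assumptions made for illustration. The real `PersistentTasksClusterService` works on cluster state objects that are not reproduced here.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch; node ids as strings are an assumption.
final class CandidateNodes {

    // Keeps only nodes that are not currently marked as shutting down.
    static List<String> candidates(List<String> allNodeIds, Set<String> shuttingDownNodeIds) {
        return allNodeIds.stream()
            .filter(nodeId -> !shuttingDownNodeIds.contains(nodeId))
            .collect(Collectors.toList());
    }
}
```

Task assignment would then pick from the returned candidates only. Tasks already running on a node that is shutting down are not touched by this filter, which matches the limitation stated in the commit message.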
William Brafford fc7c06d8a1
Make feature reset API response more informative (#71240)
Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.

I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.

Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
2021-04-27 13:47:10 -04:00
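The partial-failure check can be sketched with an enum and a count. `ResetStatus` and `isPartialFailure` are hypothetical names. The sketch covers only the case the commit message describes, a 207 when some but not all reset actions fail, and makes no claim about what is returned when every action fails.

```java
import java.util.List;

// Hypothetical sketch; names are assumptions, not the actual transport action.
final class ResetStatus {

    enum Status { SUCCESS, FAILURE }

    // HTTP 207 Multi-Status
    static final int MULTI_STATUS = 207;

    // True when at least one, but not every, reset action failed.
    static boolean isPartialFailure(List<Status> statuses) {
        long failures = statuses.stream().filter(status -> status == Status.FAILURE).count();
        return failures > 0 && failures < statuses.size();
    }
}
```

Comparing enum constants replaces parsing information out of a status string, which is the problem the commit set out to remove.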
Hendrik Muhs 7fff5df7a3
[Transform] add support for top metrics (#71850)
add support for the stats and top metrics aggregations in transform. With this change it becomes
easier to add more multi-value aggregations to transform

Limitations:
 - only the 1st element of top_metrics gets consumed by transform[*].
 - all values of stats will be mapped to double if mapping deduction is used, including count,
   sum, min, max

fixes #52236
relates #51925
2021-04-27 13:45:53 +02:00
Przemysław Witek f992e47763
[Transform] Make transform _stats work again, even if there are no transform nodes (#72221) 2021-04-27 09:47:20 +02:00
Rene Groeschke 5bcd02cb4d
Restructure build tools java packages (#72030)
Related to #71593 we move all build logic that is for elasticsearch build only into
the org.elasticsearch.gradle.internal* packages

This makes it clearer if build logic is considered to be used by external projects
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.

This is a very first step towards that direction.
2021-04-26 14:53:55 +02:00
Ayushman Singh Chauhan 4169587115
DOC: Fix spelling (#72179)
DOC: Fix spelling
2021-04-23 16:59:51 -04:00
Benjamin Trent 5e3d54b908
[ML] [Transform] use feature reset API for transform test cleanup (#72044)
This moves all transform cleanup logic to use the feature reset API.
2021-04-22 13:11:16 -04:00
Hendrik Muhs f3c175cc60
[Transform] enhance geobounds test for sparse data case (#72023)
use sparse data for geobounds agg, verifies the fix of #71874, adding debug
logging of index requests sent by transform
2021-04-22 12:57:54 +02:00
Hendrik Muhs c7fb400b3f
[Transform] remove deprecated endpoint from tests (#71891)
remove the use of the deprecated _data_frame endpoint from rest test cases, because it can prevent detecting problems in PR builds

relates #71792
2021-04-20 10:35:23 +02:00
Przemysław Witek 319548c80c
Redirect transform actions to transform&remote_cluster_client node when needed (#70125) 2021-04-19 17:22:09 +02:00
Przemysław Witek 749120ed5b
[Transform] Add telemetry support for transform features (#71607) 2021-04-19 14:38:50 +02:00
Dan Hermann c831d887cb
Fix failure in TransformPivotRestSpecialCasesIT::testIndexTemplateMappingClash (#71778) 2021-04-19 07:34:12 -05:00