Remove an unnecessary indirection and refactor progress tracking. Both rollup and transform
process documents as a stream; however, the AsyncTwoPhaseIndexer takes a List of index
requests. This change removes the unnecessary temporary container and makes upcoming
transform enhancements easier.
Automatically reorder group_by for composite aggs, ensuring date histogram
group by comes first. The order is only changed for execution; the provided
config remains unchanged.
In case of 2 group_by's of the same order type, the configuration order is
respected. Script and runtime field based group_by's are penalized.
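The ordering rule above can be sketched as a stable sort. This is a minimal illustration and not the transform code: `GroupByOrderSketch`, `Kind`, `GroupBy`, and the numeric penalty are hypothetical names and values. Only the rule itself (date histogram first, ties keep the configured order, script and runtime field sources are penalized) comes from the description.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class GroupByOrderSketch {
    // Hypothetical group_by kinds; the description only states that date histogram goes first.
    enum Kind { DATE_HISTOGRAM, TERMS, HISTOGRAM }

    // Hypothetical stand-in for one group_by entry of a transform config.
    static final class GroupBy {
        final String name;
        final Kind kind;
        final boolean scriptOrRuntimeField;

        GroupBy(String name, Kind kind, boolean scriptOrRuntimeField) {
            this.name = name;
            this.kind = kind;
            this.scriptOrRuntimeField = scriptOrRuntimeField;
        }

        // Lower values run first. The penalty of 10 is an assumed value for illustration.
        int priority() {
            int base = kind == Kind.DATE_HISTOGRAM ? 0 : 1;
            return scriptOrRuntimeField ? base + 10 : base;
        }
    }

    // Returns a reordered copy for execution; the configured list is left untouched.
    static List<GroupBy> executionOrder(List<GroupBy> configured) {
        List<GroupBy> ordered = new ArrayList<>(configured);
        // List.sort is stable, so entries with equal priority keep their configured order.
        ordered.sort(Comparator.comparingInt(GroupBy::priority));
        return ordered;
    }
}
```

Sorting a copy mirrors the statement that the provided config remains unchanged.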
Use point in time API for every checkpoint in transform. Using point in time reduces pressure
on the source indexes, e.g. fewer refreshes. If pit isn't supported (e.g. when searching
remote clusters), it falls back to ordinary search requests as before.
closes #73481
- avoid eagerly created test cluster
- remove duplicate superfluous configuration
- resolve system properties via provider factory
- move common test configuration / setup into rest test base plugin
With the overall theme of trying to configure and add less to the build instead of just disabling it later,
we're replacing standalone-test with standalone-rest tasks, which avoids creating the
unused test tasks.
The standalone rest test plugin and the other rest test plugins behave a little differently in how source sets and test tasks are wired.
The standalone rest test plugin assumes that all RestTestTasks are using the same sourceSet (test). The yaml and java rest test plugins use one dedicated sourceSet per test task.
In the long run we will probably migrate standalone-rest-test usages to one of the other plugins and deprecate standalone-rest-test.
We only need the javaRestTest sourceSet and can avoid the main and test sourceSets by
just using the newly introduced ElasticsearchJavaBase instead of the ElasticsearchJava plugin.
Aiming to configure less during the build,
this removes non-required configuration from qa build scripts that do not
contain any sources. We also remove a few non-required afterEvaluate hooks.
implement a simple change optimization for histograms using min and max aggregations. The
optimization is not applied if the range cutoff would be too small compared to the overall
range from previous checkpoints. At least 20% must be cut compared to former checkpoints.
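The cutoff check can be sketched as follows. This is a minimal illustration under stated assumptions: the class, method, and parameter names are hypothetical, and the way the cut is measured (narrowed range relative to the previous overall range) is an assumed reading of the 20% rule, not the actual implementation.

```java
final class HistogramChangeOptimizationSketch {
    // From the description: at least 20% must be cut compared to former checkpoints.
    static final double MIN_CUT_FRACTION = 0.2;

    // previousMin/previousMax: overall range known from former checkpoints (assumed inputs).
    // changedMin/changedMax: results of the min and max aggregations over changed documents.
    static boolean shouldApply(double previousMin, double previousMax, double changedMin, double changedMax) {
        double previousRange = previousMax - previousMin;
        if (previousRange <= 0) {
            return false; // nothing can be cut from an empty or degenerate range
        }
        double narrowedRange = changedMax - changedMin;
        // Apply the optimization only if the narrowed range drops at least 20% of the previous range.
        return narrowedRange <= (1.0 - MIN_CUT_FRACTION) * previousRange;
    }
}
```

For example, with a previous range of 0 to 100, a changed range of 90 to 100 cuts 90% and qualifies, while a changed range of 10 to 100 cuts only 10% and does not.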
fixes #63801
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.
relates #73784
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.
relates #73784
When associated index patterns contained wildcards and
action.destructive_requires_name was set to true, feature resets were
failing. In order to avoid this, we want to resolve associated index
names, then pass the concrete index names to the delete request.
For normal system indices, the SystemIndexDescriptor provides a method
that searches cluster metadata for indices that match the system
index pattern. This commit introduces an AssociatedIndexDescriptor that
provides the same mechanism. Although we could use an
IndexNameExpressionResolver for the associated indices, it makes sense
to me to keep things consistent across the various feature index pattern
collections.
This change has the effect of allowing the same regex-like syntax that
system index patterns can use, rather than just wildcards, in associated
index patterns.
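The idea of resolving a pattern to concrete index names before deleting can be sketched like this. It is a simplified illustration: `AssociatedIndexResolutionSketch` and `matchingIndexNames` are hypothetical names, the matcher only understands `*`, and the real descriptor syntax is richer than what is shown here.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class AssociatedIndexResolutionSketch {
    // Returns the concrete index names that match the pattern, so that a delete
    // request never has to carry a wildcard expression.
    static List<String> matchingIndexNames(String indexPattern, List<String> indexNamesInMetadata) {
        // Translate "*" to ".*" and quote everything else literally (simplification).
        String[] literals = indexPattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        Pattern compiled = Pattern.compile(regex.toString());
        return indexNamesInMetadata.stream()
            .filter(name -> compiled.matcher(name).matches())
            .sorted()
            .collect(Collectors.toList());
    }
}
```

Because the result contains only concrete names, a delete built from it is not affected by action.destructive_requires_name.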
Legacy index templates are deprecated but ML was still using
them for its hidden indices.
This PR switches the legacy ML index templates to use the new
composable index template framework.
The composable index templates get installed once the master
node is on a version that understands them. For templates
that need to be up-to-date in mixed version clusters where the
master might still be on a version that doesn't understand
composable index templates we still ship the legacy template
too, and install this if required in the mixed version cluster.
(The notifications index template falls into this category.)
This makes a couple of places in the code a little messy, as
the new style template definitions don't contain a dummy _doc
level (where the type used to be), but the legacy template
definitions do - hopefully we can tidy this up in master once
8.0 is released.
There is one more change of note in this PR that is not
strictly related to switching to composable templates, but
which was shown up during the testing. We used to wait for
all templates to be installed by the master node before running
tests in mixed version clusters. I do not believe we should
have been doing this, as other upgrade orchestration systems,
e.g. Cloud, will not be doing this. Our production code needs
to install templates and/or mappings before any operation that
requires them if there's a chance that the elected master won't
have done this in time.
Fixes #65437
Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic
This includes a refactoring of ElasticsearchDistribution to handle types
better, so that we can differentiate between Elasticsearch distribution
types supported in TestClustersPlugin and types only supported
in internal plugins.
It also introduces a set of internal versions of public plugins.
As part of this we also generate the plugin descriptors now.
As a follow-up, we can actually move these publicly used classes into
an extra project (declared as an included build).
We keep LoggedExec and VersionProperties effectively public and add a workaround for RestTestBase.
unmute continuous transform testing on sorted indexes. This extra
test randomness had been disabled because it triggered Lucene assertions.
The upstream issue seems to have been fixed.
#72533 introduced a regression, causing transforms to time out/fail.
With this change, transform only waits for 1 active shard (the primary), as waiting for all can block during
a rolling upgrade.
fixes #72617
relates #72533
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of
nodes (candidates) that are not currently shutting down.
It does not yet cancel tasks that may already be running on nodes that are shutting down; that will
be added in a subsequent request.
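The candidate selection can be sketched as a simple filter. This is an illustration only: `ShutdownAwareCandidatesSketch` and its parameters are hypothetical and use plain node ids instead of the real cluster state types.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class ShutdownAwareCandidatesSketch {
    // Keeps only the nodes that are not currently shutting down, preserving input order.
    static List<String> candidateNodes(List<String> allNodeIds, Set<String> shuttingDownNodeIds) {
        return allNodeIds.stream()
            .filter(nodeId -> !shuttingDownNodeIds.contains(nodeId))
            .collect(Collectors.toList());
    }
}
```

A task assignment routine would then choose only among the returned candidates.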
Relates to #70338
Previously, the ResetFeatureStateStatus object captured its status in a
String, which meant that if we wanted to know if something succeeded or
failed, we'd have to parse information out of the string. This isn't a
good way of doing things.
I've introduced a SUCCESS/FAILURE enum for status constants, and added a
check for failures in the transport action. We return a 207 if some but not all
reset actions fail, and for every failure, we also return information about the
exception or error that caused it.
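A minimal sketch of the status enum and the partial failure check, assuming hypothetical class and field names (`ResetStatusSketch`, `FeatureResult`); it is not the actual ResetFeatureStateStatus class. It only covers the case stated above: a 207 when some but not all reset actions fail.

```java
import java.util.List;

final class ResetStatusSketch {
    enum Status { SUCCESS, FAILURE }

    // Hypothetical per-feature result: a status constant plus the failure cause, if any.
    static final class FeatureResult {
        final String featureName;
        final Status status;
        final String failureReason; // null on success

        FeatureResult(String featureName, Status status, String failureReason) {
            this.featureName = featureName;
            this.status = status;
            this.failureReason = failureReason;
        }
    }

    // True when some but not all resets failed, which is the case that maps to a 207 response.
    static boolean isMultiStatus(List<FeatureResult> results) {
        long failures = results.stream().filter(r -> r.status == Status.FAILURE).count();
        return failures > 0 && failures < results.size();
    }
}
```

Checking an enum constant replaces parsing success or failure out of a string.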
Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
add support for the stats and top metrics aggregations in transform. With this change it becomes
easier to add more multi value aggregations to transform.
Limitations:
- only the 1st element of top_metrics gets consumed by transform[*].
- all values of stats will be mapped to double if mapping deduction is used, including count,
sum, min, max
fixes #52236
relates #51925
Related to #71593, we move all build logic that is for the elasticsearch build only into
the org.elasticsearch.gradle.internal* packages.
This makes it clearer whether build logic is intended to be used by external projects.
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.
This is a very first step towards that direction.
If a machine learning job is killed while it is attempting to open, there is a race condition that may cause it to not close.
This is most evident during the `reset_feature` API call. The reset feature API will kill the jobs, then call close quickly to wait for the persistent tasks to complete.
But, if this is called while a job is attempting to be assigned to a node, there is a window where the process continues to start even though we attempted to kill and close it.
This commit locks the process context on `kill`, and sets the job to `closing`. This way if the process context is already locked (to start), we won't try to kill it until it is fully started.
Setting the job to `closing` allows the starting process to exit early if the `kill` command has already been completed (before the communicator was created).
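The locking scheme can be sketched as follows. This is a simplified model and not the ML code: `ProcessContextSketch`, its states, and its methods are hypothetical, and a boolean flag stands in for the native process.

```java
import java.util.concurrent.locks.ReentrantLock;

final class ProcessContextSketch {
    enum State { OPENING, RUNNING, CLOSING }

    private final ReentrantLock lock = new ReentrantLock();
    private State state = State.OPENING;
    private boolean processRunning;

    // Called by the thread that opens the job. Returns false if a kill already
    // set the job to CLOSING, so the start exits early.
    boolean startProcess() {
        lock.lock();
        try {
            if (state == State.CLOSING) {
                return false;
            }
            processRunning = true; // stands in for launching the process
            state = State.RUNNING;
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Takes the same lock, so a kill waits until an in-flight start has finished.
    void kill() {
        lock.lock();
        try {
            state = State.CLOSING;
            processRunning = false; // stands in for killing the process
        } finally {
            lock.unlock();
        }
    }

    boolean isProcessRunning() {
        lock.lock();
        try {
            return processRunning;
        } finally {
            lock.unlock();
        }
    }
}
```

Whichever of the two calls wins the lock, no process is left running after `kill` returns.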
closes https://github.com/elastic/elasticsearch/issues/71646
* Warn users if security is implicitly disabled
Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it can also lead to unintentionally
insecure clusters.
This change introduces clear warnings when security features are
implicitly disabled:
- a warning header in each REST response if security is implicitly
disabled;
- a log message during cluster boot.
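The condition for emitting these warnings can be sketched as a small predicate. This is an assumption-laden illustration: `ImplicitSecuritySketch`, the `Optional` based parameter, and the license type strings are hypothetical and only model "Basic or Trial license and no explicit setting".

```java
import java.util.Optional;

final class ImplicitSecuritySketch {
    // explicitSecurityEnabled is empty when the configuration file does not set the option at all.
    static boolean implicitlyDisabled(Optional<Boolean> explicitSecurityEnabled, String licenseType) {
        boolean basicOrTrial = "basic".equals(licenseType) || "trial".equals(licenseType);
        // An explicit true or false is a deliberate choice, so no warning is needed.
        return basicOrTrial && !explicitSecurityEnabled.isPresent();
    }
}
```

An explicit `false` therefore does not trigger the warning; only a missing setting does.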
Now that we have a feature reset API, we should use
this for cleaning up in between tests instead of running
lots of bespoke cleanup code.
During testing of this change we found we need to
delete custom cluster state as part of the reset process,
so this PR also implements that.
Additionally we no longer assign persistent tasks
during feature reset.
- Update gradle wrapper to gradle 7.0
- Remove deprecated usages to make build 7.0 compatible
- Fix excludes in docs snippet tasks (See https://github.com/gradle/gradle/issues/16160 for details)
- Fix deprecation warnings in 7.0
- Add explicit dependencies that have been missed
- Make extract native licenses tasks output dir more explicit
- Use a snapshot of the ospackage plugin that includes a fix for 7.0 already
- fix test runtime classpath setup in repository-hdfs
- Make task dependency explicit to fix further deprecation warnings
- Remove manual check for http repo usages that has been deprecated in gradle 7.0
- Update spock to latest 2.0 milestone required for groovy 3
This commit moves the machine learning roles to server. We no longer
need to maintain these roles outside of server since we only produce a
single distribution, the default distribution, which includes all
roles. Therefore we can simplify the plugin architecture by removing the
plugin extension point for roles. This is one step in that, by moving
the machine learning roles to server.
This change exposes for each field in the _field_caps response if the field is a metadata field.
This is needed for consumers of this API that want to filter these fields. Currently ML keeps a static list
and QL checks that the family type starts with `_`. In order to ease the addition of new metadata fields, this
change reworks the strategy in this solution and now only checks for the new flag.
Note that the new flag is also applied at the coordinator level on a best-effort basis, so that the logic also covers older nodes
in a mixed-version cluster.
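How a consumer might filter on the new flag can be sketched as below. The types are hypothetical stand-ins (`MetadataFieldFilterSketch`, `FieldCaps`), not the real field capabilities classes.

```java
import java.util.List;
import java.util.stream.Collectors;

final class MetadataFieldFilterSketch {
    // Hypothetical stand-in for one field entry of a _field_caps response.
    static final class FieldCaps {
        final String name;
        final boolean metadataField;

        FieldCaps(String name, boolean metadataField) {
            this.name = name;
            this.metadataField = metadataField;
        }
    }

    // Relies only on the flag; no static list of names and no check for a leading underscore.
    static List<String> nonMetadataFieldNames(List<FieldCaps> fields) {
        return fields.stream()
            .filter(field -> !field.metadataField)
            .map(field -> field.name)
            .collect(Collectors.toList());
    }
}
```

With this approach a newly added metadata field is filtered out without any change on the consumer side.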
shouldStopAtCheckpoint tells transform to stop at the next checkpoint. If
this API is called while a checkpoint is finishing, it can cause a race condition
in state persistence. This is similar to #69551, but this time in a different
place.
With this change, _stop?shouldStopAtCheckpoint=true does not call doSaveState
if the indexer is shutting down. It still ensures the job stops after the indexer has
shut down. Apart from that, the change fixes a logging problem and adds error
handling in case of a timeout during _stop?shouldStopAtCheckpoint=true. Some
logic has been moved from the task to the indexer.
fixes #70416
When we disable access to system indices, plugins will still need
a way to erase their state. The obvious and most pressing use
case for this is in tests, which need to be able to clean up the
state of a cluster in between groups of tests.
* Use a HandledTransportAction for reset action
My initial cut used a TransportMasterNodeAction, which requires code
that carefully manipulates cluster state. At least for the first cut and
testing, it seems like it will be much easier to use a client within a
HandledTransportAction, which effectively makes the
TransportResetFeatureStateAction a class that dispatches other transport
actions to do the real work.
* Clean up code by using a GroupedActionListener
* ML feature state cleaner
* Implement Transform feature state reset
* Change _features/reset path to _features/_reset
Out of an abundance of caution, I think the "reset" part of this path
should have a leading underscore, so that if there's ever a reason to
implement "GET _features/<feature_id>" we won't have to worry about
distinguishing "reset" from a feature name.
Co-authored-by: Gordon Brown <gordon.brown@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
fix a race condition in the test: the indexer thread might still be in the
process of shutting down, when the test thread triggers it again.
relates #69551
fixes #70297