* Mixed cluster tests with string NodeInfo version
- Move version based feature comparison to a common, deprecated method (to be replaced with real features)
- Use string comparison against old cluster version to partition new/old cluster nodes
- Creates a new StackTemplateRegistry that uses the new names
- The new registry only respects stack.templates.enabled for index templates
- Renames the old registry to LegacyStackTemplateRegistry
- Component templates are not duplicated but registered under two different names
- Documents the new naming convention
- Index templates are not renamed, at least for now, as there are some challenges with it
See 7fd0423 for more details.
Various things related to snapshots appear in debug logs but have no
useful string representation, which makes it hard to follow the process.
This commit adds some missing string representations.
This adds more tests for some of the `MV_` functions and updates their
docs now that the railroad diagram and table generated by the tests
cover all of the types.
The tests in DeprecationHttpIT are affecting each other. This adds a check that the index is actually deleted between each test, which should stop the regular CI failures we see in DeprecationHttpIT.
Adds a test framework that validates instruments are registered before they are called and are not double registered.
Also records all invocations of Instruments and allows test authors to add validation to instruments.
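The checks described above can be sketched roughly as follows. This is a minimal illustration only: `RecordingInstruments` and its methods are hypothetical names, not the framework's actual API, and instruments are reduced to plain names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical test double, not the real test framework classes.
final class RecordingInstruments {
    private final Set<String> registered = new HashSet<>();
    private final List<String> invocations = new ArrayList<>();

    // Registering the same instrument name twice is treated as a test failure.
    void register(String name) {
        if (!registered.add(name)) {
            throw new IllegalStateException("instrument [" + name + "] registered twice");
        }
    }

    // Calling an instrument that was never registered is treated as a test failure.
    void increment(String name) {
        if (!registered.contains(name)) {
            throw new IllegalStateException("instrument [" + name + "] called before registration");
        }
        invocations.add(name);
    }

    // Every recorded call, in order, so that test authors can add their own validation.
    List<String> invocations() {
        return List.copyOf(invocations);
    }
}
```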
We have recently introduced a 6 hour cache for executables (previously
the lifetime in the cache was unbounded) in the host agent. With this commit
we align the look-back so it matches the client-side cache lifetime.
- allow mv_expand to push down limit and project past it
- accept a limit after mv_expand when there is also a second limit before the mv_expand
- adds a default TopN for cases when there is only a sort at Lucene level
- adds OrderBy node type to the exceptions for duplicating the limit after mv_expand
- Removes the registration of the inner actions via `getActions()`.
- Replaces the outer action's `ActionType` subclass using `localOnly()`.
- Collapses each outer `Action` class with the inner `TransportAction`.
- Tightens up some unnecessary `public` visibility.
Closes #101198
Today we rely on an `isRunning` check to check for task cancellation,
but since #82685 we can actively record the failure arising from the
cancellation using a `CancellationListener`.
Closes #101197
* Compatible version parsing in YAML tests
* Propagate exception in case of non-semantic version where one is expected
* Removed the stripping of SNAPSHOT (no longer needed)
While dot expansion is disabled when parsing percolator queries at index
time, as that would interfere with query parsing, we still use a wrapper parser
that is conservative about what methods it supports, assuming that
document parsing needs nextToken and not much more. Turns out that when
parsing queries instead, we need to support all the XContentParser
methods including map, list etc.
This commit adds a test for script score query parsing through document
parsing via percolator field mapper, and removes the limitations in the
wrapper parser when dot expansion is disabled.
We resample data randomly if required. So far we have initialized the
random number generator based on the hash code of the request with the
intent of providing a random resampling that is still stable if the same
request is issued multiple times. However, the hash code was not stable
in a cluster because a query may use Lucene's `BytesRef` class to store
values (such as the upper and lower bound of a date range). That class
uses a murmur hash for its hash code. The murmur hash is initialized
from `org.apache.lucene.util.StringHelper#GOOD_FAST_HASH_SEED` which
intentionally varies across JVM instances. Consequently, the hash code
of `BytesRef` (and ultimately the request's hash code) varies depending
on which node in the cluster handles a request.
With this commit we instead rely on the string representation of a
query, which is stable across instances and node restarts, to initialize
the random number generator. This provides randomness across requests
but also a consistent result for identical requests. Converting the
query builder to its string representation adds around 1ms of overhead.
Given that typical response times are in the range of single digit
seconds, we deem this overhead acceptable.
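The idea of seeding from the string form can be sketched as below. This is an illustration under assumptions, not the code in this commit: `StableSeed` is a hypothetical helper, and using `String#hashCode` (whose result is defined by the Java specification and does not depend on a per-JVM seed) is one possible way to turn the string into a seed.

```java
import java.util.Random;

// Hypothetical helper for illustration; not the actual Elasticsearch implementation.
final class StableSeed {
    private StableSeed() {}

    // Derive the seed from the query's string representation.
    // String#hashCode is computed from the characters only, so equal strings
    // give the same seed on every node and after every restart.
    static long seedFor(String queryAsString) {
        return queryAsString.hashCode();
    }

    // Identical requests get identically seeded generators, so the same
    // resampling decisions are made wherever the request is handled.
    static Random randomFor(String queryAsString) {
        return new Random(seedFor(queryAsString));
    }
}
```

Two requests with the same query string draw the same random sequence, while different queries will usually get different seeds.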
The anti-contention delay in the S3 repository's compare-and-exchange
operation is hard-coded at 1 second today, but sometimes we encounter a
repository that needs much longer to perform a compare-and-exchange
operation when under contention. With this commit we make the
anti-contention delay configurable.
It appears that task cancellation is executed before the settings update has
even started in testClusterSettingsUpdateNotAcknowledged. This change uses
a longer timeout to improve the probability of blocking.
The node client type is a remnant of the transport client. This commit
cleans up some test reads and an unnecessary override of the setting. It
was already not read anywhere in production. Now it is only registered
in order to provide validation. In the future it should be deprecated
and removed.
This splits out the registry and the service, which makes testing easier and removes much of the delegation from the old `APMMeter` to `Instruments` (now renamed `APMMeterRegistry`).
APMMeterService takes care of the lifecycle and APMMeterRegistry holds the instruments.
Just found that we have a lot of inconsistency and needless verbosity
here in tests. We can just use `assertAcked` in a couple spots
to save `.get`, `.actionGet` etc., especially with the signature
change I added here.
Today, using the painless execute API with a TSDB index can fail with a `_id must be unset or set to [cn4exTOUtxytuLkQAAABeRnR_mY] but was [_id] because [test_index] is in time_series mode` error.
This change addresses this.
The painless execute API shouldn't use a static _id, but should
let the TsidExtractingIdFieldMapper generate it.
Otherwise validation in TsidExtractingIdFieldMapper fails.
Closes #101072
With this commit we remove the `auto_configure` privilege for the Fleet
service account that targets profiling-related indices. This privilege
was needed to automatically create indices and data streams in the past
but as this is managed by the Elasticsearch plugin, there is no need to
grant this privilege to Fleet-managed components.
With this commit we increase the look-back time interval from 3 hours to
4 hours by default. This look-back time interval is applied to determine
the correct K/V indices to query around a rollover. As the new index may
not have all data immediately after a rollover, we also need to query
the old index. Clients may cache data for up to 3 hours but to avoid
unlucky timing we add a bit of slack and increase the time interval to 4
hours.
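One way to picture the look-back is the sketch below. It is an assumption-laden illustration: `KvIndexSelector`, its parameters, and the exact rule (query the old index as long as the rollover happened less than one look-back interval ago) are hypothetical and may differ from how the plugin applies the interval.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical sketch; the names and the selection rule are assumptions.
final class KvIndexSelector {
    // Default from the commit message: 3 hours of client caching plus slack.
    static final Duration DEFAULT_LOOK_BACK = Duration.ofHours(4);

    private KvIndexSelector() {}

    // Shortly after a rollover the new index may not hold all data yet,
    // so the old index is queried as well until the look-back has elapsed.
    static List<String> indicesToQuery(String oldIndex, String newIndex, Instant rolloverTime, Instant now, Duration lookBack) {
        boolean withinLookBack = now.isBefore(rolloverTime.plus(lookBack));
        return withinLookBack ? List.of(newIndex, oldIndex) : List.of(newIndex);
    }
}
```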
Today repository analysis verifies that a register behaves correctly
under contention, retrying until successful, but it turns out that some
repository implementations cannot even perform uncontended register
writes correctly, which may cause endless retries in the contended case.
This commit adds another repository analyser which verifies that
uncontended register writes work correctly on the first attempt.
Before we check the number of active tasks on the prewarming executor,
we need to verify that all the tasks have actually been submitted.
Otherwise, we have a race in which the number of active tasks can be lower
than the number of submitted tasks.
Fixes #99124
---------
Co-authored-by: David Turner <david.turner@elastic.co>
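The race in the prewarming executor check above can be illustrated with plain `java.util.concurrent` classes. This is a generic sketch, not the test changed in this commit: `SubmittedBeforeActiveCheck` is a hypothetical name, and to stay deterministic it waits until every task has started running, which is slightly stronger than waiting for submission alone.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of gating an active-count check on a latch.
final class SubmittedBeforeActiveCheck {
    private SubmittedBeforeActiveCheck() {}

    // Submits `tasks` blocking tasks from another thread and reads the active
    // count only once all of them are known to be running.
    static int activeAfterAllStarted(int tasks) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(tasks, tasks, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(tasks);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Submission happens asynchronously, as it might in the code under test.
            Thread submitter = new Thread(() -> {
                for (int i = 0; i < tasks; i++) {
                    executor.execute(() -> {
                        started.countDown();
                        awaitQuietly(release);
                    });
                }
            });
            submitter.start();
            // Without this wait, getActiveCount() could be read while some tasks
            // are still waiting to be handed to the executor.
            if (!started.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not start in time");
            }
            return executor.getActiveCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the blocked tasks finish
            executor.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```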
Replaces the transport-level timeout with an overall timeout on the
whole repository analysis task to ensure that all child tasks terminate
promptly.
Relates #66992
Closes #101182
Follow-up to #100966.
Add more assertion overloads that consume a requestBuilder as in the
other PRs and start using `assertHitCount` in more places that were
duplicating what it does. Also add a shortcut for
`client().prepareSearch()` to integ tests and bulk-replace some usages
of this pattern to keep these changes from blowing up the test code line
count further.