The node client type is a remnant of the transport client. This commit
cleans up some test reads and an unnecessary override of the setting. It
was already not read anywhere in production. Now it is only registered
in order to provide validation. In the future it should be deprecated
and removed.
Fix for #97334 where an incorrect feature name was provided.
Correct more instances of synonyms_feature_flag_enabled for synonyms_api_feature_flag_enabled
Closes #96641, #97177
For snapshot builds we automatically enable all feature flags,
but for release builds they need to be explicitly added to
test clusters for tests.
This PR does so for the synonyms feature.
Closes #96641, #97177
We want to allow overriding the info (GET /) API in serverless, so this commit moves RestMainAction and its transport classes into a module that has a REST plugin.
The main endpoint is often used in testing to verify that a cluster is ready, so this commit also adds a testing dependency on main to many modules.
relates #95422
Document parsing methods currently throw MapperParsingException. This
isn't very helpful, as it doesn't contain any information about where the parse
error happened - it is designed for parsing mappings, which are realised into
java maps before being examined. This commit introduces a new exception
specifically for document parsing that extends XContentException, so that
it reports the current position of the parser as part of its error message.
Fixes #85083
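As a rough illustration of the idea (not the actual Java implementation), the sketch below wraps a parser error in an exception that carries the parser position in its message. `DocumentParsingError` and `parse_document` are hypothetical names, and Python's `json` module stands in for XContent.

```python
import json


class DocumentParsingError(Exception):
    """Hypothetical stand-in for a document parsing exception with a position."""

    def __init__(self, line, column, message):
        # Put the position first so it is visible in every error message.
        super().__init__(f"[{line}:{column}] {message}")
        self.line = line
        self.column = column


def parse_document(text):
    """Parse a JSON document, reporting where parsing failed."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # json already tracks line/column; re-raise with the position attached.
        raise DocumentParsingError(e.lineno, e.colno, e.msg) from e
```

A caller can then surface `line` and `column` directly instead of a bare "failed to parse" message.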
This commit adds a new test framework for configuring and orchestrating
test clusters for both Java and YAML REST testing. This will eventually
replace the existing "test-clusters" Gradle plugin and the build-time
cluster orchestration.
Loading of stored fields is currently handled directly in FetchPhase, with
some fairly complex logic examining various bits of the FetchContext to work
out what fields need to be loaded. This is further complicated by synthetic
source, which may have its own stored field requirements.
This commit tries to separate out these concerns a little by adding a new
StoredFieldsSpec record that holds information about which stored fields
need to be loaded. Each FetchSubPhaseProcessor can now report a
StoredFieldsSpec detailing what its requirements are, and these specs can
be merged together, along with requirements from a SourceLoader, to
determine up-front what fields should be loaded by the StoredFieldLoader.
The stored fields themselves are added into the SearchHit by a new
StoredFieldsPhase, which handles alias resolution and value post-
processing. The logic to determine when source should be loaded and
when not, based on the presence of script fields or stored fields, is
moved into FetchContext, which highlights some inconsistencies that
can be fixed in follow-up commits.
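A minimal sketch of how such specs might be merged, in Python rather than Java. The field names `requires_source` and `required_fields` are assumptions made for illustration; the commit message only says that the record describes which stored fields need to be loaded and that specs can be merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFieldsSpec:
    # Both field names are hypothetical; they are not taken from the commit.
    requires_source: bool = False
    required_fields: frozenset = frozenset()

    def merge(self, other):
        # The merged spec needs everything either side needs.
        return StoredFieldsSpec(
            self.requires_source or other.requires_source,
            self.required_fields | other.required_fields,
        )


def merge_all(specs):
    """Fold the specs reported by each sub-phase into one up-front requirement."""
    merged = StoredFieldsSpec()
    for spec in specs:
        merged = merged.merge(spec)
    return merged
```

Because merging is a plain union, the order in which sub-phases report their specs does not matter.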
Run the aggregations v7 compat tests against the aggregations
module and *not* the `rest-api-spec` module. This allows us to drop
`rest-api-spec`'s dependency on the aggregations module and keep it
"just the server" which is nice.
There are a few side effects here that are ok:
1. We run all aggregations REST tests in the aggregations module.
Even the ones in `rest-api-spec`. This means we run them twice. We
plan to move all of the aggregations REST tests into the aggregations
module anyway.
2. We now bundle the REST tests in the aggregations module into the
tests that the clients run for their verification step. This should
keep our clients from losing coverage.
This commit introduces a new aggregation module
and moves the `adjacency_matrix` to this new module.
The new module name is `aggregations`.
The new module will use the `org.elasticsearch.aggregations.bucket` package for all bucket aggregations.
Relates to #90283
If a docvalues field matches multiple field patterns, then ES will
return the value of that doc-values field multiple times. Like fetching
fields from source, we should deduplicate the matching doc-values
fields.
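The deduplication can be sketched as follows. This is an illustrative Python version, not the Elasticsearch code; `resolve_docvalue_fields` is a hypothetical helper and shell-style wildcards from `fnmatch` stand in for field patterns.

```python
from fnmatch import fnmatchcase


def resolve_docvalue_fields(patterns, available_fields):
    """Return each matching field once, even if several patterns match it."""
    seen = set()
    resolved = []
    for pattern in patterns:
        for field in available_fields:
            if fnmatchcase(field, pattern) and field not in seen:
                seen.add(field)
                resolved.append(field)
    return resolved
```

With patterns `price*` and `*_total`, a field named `price_total` matches both but is returned only once.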
This adds logic to generate Gradle dependency graphs and upload them to Snyk.
We implemented a Snyk plugin directly on top of the REST API because
the existing Snyk Gradle plugin delegates to the Snyk command line tool, and the command line tool
uses custom Gradle logic by injecting an init file that
a) uses deprecated build logic, which we definitely want to avoid, and
b) uses Gradle APIs we avoid, like eager task creation.
Shipping this as an internal Gradle plugin gives us the most flexibility. As we only want to monitor
production code for now, we apply this plugin as part of the elasticsearch.build plugin;
that usage has so far been the de facto indicator of whether a project is considered a "production" project
that ends up in our distribution or public Maven repositories. This isn't ideal yet, and we will revisit
the distinction between production and non-production code / projects in a separate effort.
As part of this effort we added the elasticsearch.build plugin to more projects that actually end up
in the distribution. To unblock us, we have for now disabled a few check tasks that started failing once elasticsearch.build was applied.
Addresses #87620
Back when we introduced the fields parameter to the search API, it could only fetch values from _source, hence
the corresponding sub-fetch phase fails early whenever _source is disabled. Today though runtime fields can
be retrieved from a separate value fetcher that reads from fielddata, and metadata fields can be retrieved
from stored fields. These two scenarios currently throw an unnecessary error whenever _source is disabled.
This commit removes the check for disabled _source, so that runtime fields and metadata fields can be retrieved even when _source is disabled. Fields that need to be loaded from _source are simply skipped whenever _source is disabled, similar to when a field is not found in _source.
Closes #87072
Mostly this is just removing boosts, but it also simplifies span_not
slightly. It also removes the boost parameter from term, as that
shares an implementation with span_term.
Relates to #76515
Move more of the strategy decision making for which Cardinality collector to use into the factory, and also the ValuesSourceRegistry. Lays the groundwork for future improvements to Cardinality.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Ensure projects with only yaml, java or cluster tests also apply precommit checks.
We now only apply testingconventions for projects with an existing src test folder,
as the TestingConventionsTask is incompatible with projects that have no test sourceSet.
This is a prerequisite for porting more projects away from StandaloneRestTestPlugin
and RestTestPlugin in favor of yaml, java or cluster tests with dedicated sourceSets.
We also fix deprecation warnings for the forbiddenPattern and filePermissions tasks about
implicit, undeclared dependencies on resources tasks.
This adds a new sampling aggregation that performs a background sampling over all documents in an index.
The syntax is as follows:
```
{
"aggregations": {
"sampling": {
"random_sampler": {
"probability": 0.1
},
"aggs": {
"price_percentiles": {
"percentiles": {
"field": "taxful_total_price"
}
}
}
}
}
}
```
This aggregation provides fast random sampling over the entire document set in order to speed up costly aggregations.
Testing this over a variety of aggregations and data sets, the median speedup when sampling at `0.001` over millions of documents is around 70x.
Relative error rate does rely on the size of the data and the aggregation kind. Here are some typically expected numbers when sampling over 10s of millions of documents. `p` is the configured probability and `n` is the number of documents matched by your provided filter query.
This is a reincarnation of #53200
This commit adds a new `random_sampler` aggregation for randomly including documents in the collected result.
API format is
```js
{
"aggs": {
"sampler": {
"random_sampler": {
"probability": 0.001, //the probability that a doc is included
"seed": 42 // Optional seed for consistent results
},
"aggs": {
"mean": {
"avg": {
"field": "value"
}
}
}
}
}
}
```
The sampling skips `n` documents where `n` is a random sample drawn from an optimized geometric distribution where the probability of success is the provided `probability`. Additionally, each shard queried will have a separate random stream (even when the seed is provided). One may consider `probability` as "percentage of documents matched", but that comparison is not exact as there is variability in the number of documents considered.
Performance is greatly improved for many metrics and on larger datasets this improvement can be immense.
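To make the skipping scheme concrete, here is a small Python sketch. It uses plain inverse-transform sampling of a geometric distribution, which is an assumption made for illustration and not the optimized variant the aggregation uses; `sample_doc_ids` is a hypothetical helper.

```python
import math
import random


def sample_doc_ids(num_docs, probability, seed=None):
    """Pick doc ids by jumping ahead a geometrically distributed number of docs."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    if probability == 1.0:
        return list(range(num_docs))
    rng = random.Random(seed)
    log_q = math.log1p(-probability)  # log(1 - p), always negative here
    selected = []
    doc = -1
    while True:
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        skip = int(math.log(u) / log_q)  # number of docs skipped before the next hit
        doc += skip + 1
        if doc >= num_docs:
            return selected
        selected.append(doc)
```

The work done is proportional to the number of documents selected rather than the number of documents scanned, which is where the speedup comes from.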
Allows searching on number field types (long, short, int, float, double, byte, half_float) when those fields are not
indexed (index: false) but just doc values are enabled.
This enables searches on archive data, which has access to doc values but not index structures. When combined with
searchable snapshots, it allows downloading only data for a given (doc value) field to quickly filter down to a select set
of documents.
Note to reviewers:
I have split isSearchable into two separate methods isIndexed and isSearchable on MappedFieldType. The former one is
about whether actual indexing data structures have been used (postings or points), and the latter one on whether you
can run queries on the given field (e.g. used by field caps). For number field types, queries are now allowed whenever
points are available or when doc values are available (i.e. searchability is expanded).
Relates #81210 and #52728
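The split between the two methods can be summarised with a small Python sketch. The class below is a simplified stand-in for a number field type, not the real `MappedFieldType`; the attribute names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberFieldTypeSketch:
    indexed: bool         # points were written for this field
    has_doc_values: bool  # doc values were written for this field

    def is_indexed(self):
        # True only when actual index structures exist.
        return self.indexed

    def is_searchable(self):
        # Queries are allowed with either points or doc values.
        return self.indexed or self.has_doc_values
```

A field mapped with `index: false` but with doc values enabled is therefore searchable without being indexed.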
DocumentParser parses documents by following their object hierarchy, and
using a parallel hierarchy of ObjectMappers to work out how to map leaf fields.
Field names that contain dots complicate this, meaning that many methods
need to reverse-engineer the object hierarchy to check that the current parent
object mapper is the correct one; this is particularly complex when objects
are being created dynamically.
To simplify this logic, this commit introduces a DotExpandingXContentParser,
which wraps another XContentParser and re-interprets any field name containing
dots as a series of objects. So for example, `"foo.bar.baz":{ ... }` is represented
as `"foo":{"bar":{"baz":{...}}}`. DocumentParser uses this to automatically
expand all field names containing dots when parsing the source.
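The re-interpretation can be illustrated on already-parsed data. The sketch below is an assumption-laden Python version that works on dicts rather than on a token stream as the real parser wrapper does, and it does not handle conflicts between a leaf value and an object at the same path.

```python
def _merge(dst, src):
    """Recursively merge src into dst, combining nested objects."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            _merge(dst[key], value)
        else:
            dst[key] = value


def expand_dots(obj):
    """Rewrite every dotted field name as a chain of nested objects."""
    if isinstance(obj, list):
        return [expand_dots(item) for item in obj]
    if not isinstance(obj, dict):
        return obj
    out = {}
    for key, value in obj.items():
        *parents, leaf = key.split(".")
        target = out
        for part in parents:
            # Walk (or create) the intermediate objects for each path segment.
            target = target.setdefault(part, {})
        _merge(target, {leaf: expand_dots(value)})
    return out
```

After expansion, downstream code only ever sees a plain object hierarchy and no longer needs to reverse-engineer it from field names.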
This PR changes uses of transient cluster settings to
persistent cluster settings.
The PR also deprecates the transient settings usage.
Relates to #49540
This fixes a bug where the range aggregation always treats the range end points as doubles, even if the field value doesn't have enough resolution to fill a double. This was creating issues where the range would have a "more precise" approximation of an unrepresentable number than the field, causing the value to fall in the wrong bucket.
Note 1: This does not resolve the case where we have a long value that is not precisely representable as a double. Since the wire format sends the range bounds as doubles, by the time we get to where this fix is operating, we've already lost the precision to act on a big long. Fixing that problem will require a wire format change, and I'm not convinced it's worth it right now.
Note 2: This is probably still broken for ScaledFloats, since they don't implement NumberFieldType.
Resolves #77033
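The precision mismatch is easy to reproduce outside Elasticsearch. The Python sketch below assumes a single-precision float field and rounds the range bounds to the same precision before comparing; the helper names are hypothetical and this is not the actual fix.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def in_range_naive(value, lower, upper):
    # Bounds are kept as doubles, so they can be "more precise" than the field.
    return lower <= value < upper


def in_range_fixed(value, lower, upper):
    # Bounds are reduced to the field's precision before comparing.
    return to_float32(lower) <= value < to_float32(upper)


stored = to_float32(0.7)  # what a float field actually holds for 0.7
```

Here `stored` is slightly below the double `0.7`, so the naive check puts it outside a range starting at `0.7`, while the fixed check puts it inside.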
- Use file property and conventions to avoid afterEvaluate hook
- Simplify root build script
- One little step closer to configuration cache compliance
If the _nodes/stats API received a level=shards request parameter, then the response would have two "shards" fields,
which would cause problems with json parsers. This commit renames the "shards" field that currently only contains
"total_count" to "shard_stats".
Relates #78311, #75433
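The sketch below shows why a duplicated key is a problem for clients. The response bodies are simplified, hypothetical shapes rather than the real `_nodes/stats` output, and `find_duplicate_keys` is an illustrative helper.

```python
import json


def find_duplicate_keys(text):
    """Return the keys that appear more than once within the same JSON object."""
    duplicates = []

    def hook(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


# Hypothetical, heavily simplified response bodies.
before = '{"indices": {"shards": {"total_count": 3}, "shards": {"my-index": []}}}'
after = '{"indices": {"shard_stats": {"total_count": 3}, "shards": {"my-index": []}}}'
```

Python's `json` module silently keeps the last duplicate, so with the `before` body the `total_count` value is lost; other parsers may reject the document outright.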
Prior to this change, the only way to express that a compatible REST test
should be skipped was via the blacklist of the ES test runner. While this
works for ES, it requires any consumers of the compatible REST tests copy/paste
the list of tests that should not be executed.
This commit introduces 2 new transforms for the compatible REST tests:
- skipTest: adds a skip section to the named test with a given reason
- skipTestsByFilePattern: adds a skip section to the global setup of any test file that matches the file pattern, and includes the given reason
All uses of the blacklist have been replaced by the new skip transforms.
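A rough model of the two transforms, written in Python over plain dicts. The in-memory layout of a test file and the exact shape of the skip section (`version: all` plus a reason) are assumptions for illustration; the real transforms operate on the YAML test files in the build.

```python
from fnmatch import fnmatchcase


def _skip_section(reason):
    # Assumed shape of a skip section; not taken from the commit.
    return {"skip": {"version": "all", "reason": reason}}


def skip_test(test_file, test_name, reason):
    """Add a skip section to one named test."""
    test_file[test_name].insert(0, _skip_section(reason))


def skip_tests_by_file_pattern(files, pattern, reason):
    """Add a skip section to the global setup of every matching test file."""
    for path, test_file in files.items():
        if fnmatchcase(path, pattern):
            test_file.setdefault("setup", []).insert(0, _skip_section(reason))
```

Because the skip ends up inside the transformed test itself, any consumer of the compatible tests inherits it without copying a separate exclusion list.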
* Do not create unused testCluster
This avoids creating test clusters that are not required during the build.
We use lazy configuration here on testClusters and only instantiate them as they're
* Do not fail on run task (debug)
* Create more test clusters lazily
* Make more test cluster lazy
* Avoid creating unused testcluster
* Fix PluginBuildPlugin
* Fix disabling geo db download
* Fix cluster setup in repository-multi-version
* Polishing
* Fix issue with erratic Groovy logic
* Fix bwc tests
* Fix more bwcTests
* Fix more bwc tests
* Fix more bwc tests
* Fix more bwc tests
* Fix typo
* Minor polishing
* Fix rolling upgrade tests
* Fix cluster config in sql qa mixedcluster project
* Fix more bwc tests
* Clean up before review
* Document test cluster usage
* API polishing after review
provide useCluster(Provider) method to TestClusterAware
Ideally we take this a step further and realize those test clusters only on use.
But out of scope of this PR.
* Allow gradle provider as value for nonSystemProperties
* Some simplification on test configuration
* Fix typo in rest test config
* Fix more typos
* Fix another typo
* Fix more typos
This adds a setting to enable time series mode that is hidden by the
`es.index_mode_feature_flag_registered` feature flag. All it does right
now is make sure you haven't configured index sorting and partitioning.
This gives us a place to "hang" all of our further work on time series
mode.
Time series mode will entirely take over index sorting. We don't believe
you'll ever be able to configure sorting yourself when you are in time
series mode. We don't expect time series mode to support index
partitioning when we first build it but we'd like to get there.
This commit updates two task names:
```
yamlRestCompatTest -> yamlRestTestV7CompatTest
transformV7RestTests -> yamlRestTestV7CompatTransform
```
`7` is the N-1 version and is calculated, so that when `8` is
the N-1 version the task names will be `yamlRestTestV8CompatTest` and
`yamlRestTestV8CompatTransform`.
The motivation for `yamlRestCompatTest -> yamlRestTestV7CompatTest` is that
many projects have configured `yamlRestCompatTest`
but that configuration is specific to the N-1 version. For example,
if we blacklist tests when running compatibility with v7, we don't also
want to blacklist those tests when running compatibility with v8.
By introducing a version-specific identifier in the name, the task will not
even exist after the version is bumped, which forces us to (correctly) remove
the version-specific condition.
The motivation for `transformV7RestTests -> yamlRestTestV7CompatTransform`
is to provide more consistent naming.
The idea behind the naming is that the main task people
are likely familiar with is
`yamlRestTest`, so we use that as a base:
`yamlRestTestV7CompatTest` to run the version-specific compat tests
`yamlRestTestV7CompatTransform` to run the version-specific transformations for the compat tests
CI should be unaffected since we introduced a lifecycle task
name `checkRestCompat` which is what CI should be configured to use.
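The naming rule can be expressed in a few lines. This Python helper is purely illustrative (the build logic itself is not written in Python), and `compat_task_names` is a hypothetical name.

```python
def compat_task_names(current_major):
    """Derive the compat task names from the current major version N."""
    previous = current_major - 1  # compatibility targets the N-1 version
    return (
        f"yamlRestTestV{previous}CompatTest",
        f"yamlRestTestV{previous}CompatTransform",
    )
```

When the current major version is bumped, the old task names simply stop existing, so any build script still referring to them fails instead of silently applying stale configuration.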
This commit allows the compatible REST API tests to execute on Windows.
They were previously excluded from Windows due to a command line limit
when defining a very large exclusion list. That exclusion list is much
smaller now and they will now execute properly on Windows.
Also, an empty exclusion list has been removed from the build config.
This introduces a basic public yaml rest test plugin that is intended for use by external
elasticsearch plugin authors. This is driven by #76215
- Rename yaml-rest-test to intern-yaml-rest-test
- Use public yaml plugin in example plugins
Co-authored-by: Mark Vieira <portugee@gmail.com>
v7compatibilityNotSupportedTests was introduced to make it easier to
track tests that have been identified as not needing compatible changes
and those that still need to be checked.
We have checked all tests now and the separate list is no longer needed.
relates #51816
relates #73912
This change introduces a Service Account for Kibana to use when
authenticating to Elasticsearch. The Service Account, with the
kibana service name under the elastic namespace,
uses the same RoleDescriptor as the existing kibana_system
built-in user and is its functional equivalent when it comes to
AuthZ.
Previously removed in #42654. The query and the parameter won't work under REST API compatibility; instead, an exception is returned with a message advising that plain match/multi_match is sufficient.
relates #51816