181 Commits

Author SHA1 Message Date
Mark Vieira a92a647b9f Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

 - Updating LICENSE and NOTICE files throughout the code base, as well
   as those packaged in our published artifacts
 - Update IDE integration to now use the new license header on newly
   created source files
 - Remove references to the "OSS" distribution from our documentation
 - Update build time verification checks to no longer allow Apache 2.0
   license header in Elasticsearch source code
 - Replace all existing Apache 2.0 license headers for non-xpack code
   with updated header (vendored code with Apache 2.0 headers obviously
   remains the same).
 - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 16:10:53 -08:00
Rory Hunter ad1f876daa
Replace NOT operator with explicit `false` check (#67817)
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
2021-01-26 14:47:09 +00:00
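To illustrate the rule described in #67817, here is a minimal sketch of the two styles side by side. The class and method names are hypothetical and are not taken from the Elasticsearch source.

```java
import java.util.List;

final class ExplicitFalseCheck {
    // Style being replaced: the `!` is easy to miss when scanning a condition.
    static boolean hasItemsWithNot(List<String> items) {
        return !items.isEmpty();
    }

    // Preferred style: compare explicitly against `false`.
    static boolean hasItems(List<String> items) {
        return items.isEmpty() == false;
    }

    public static void main(String[] args) {
        System.out.println(hasItems(List.of("a")));
        System.out.println(hasItems(List.of()));
    }
}
```

Both methods return the same result; only the readability of the negation differs.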
Julie Tibshirani 5852fbedf5
Rename QueryShardContext -> SearchExecutionContext. (#67490)
We decided to rename `QueryShardContext` to clarify that it supports all parts
of search request execution. Before there was confusion over whether it should
only be used for building queries, or maybe only used in the query phase. This
PR also updates the javadocs.

Closes #64740.
2021-01-14 09:11:59 -08:00
Mark Tozzi e26c9bbd52
Rename BYTES ValuesSourceType to reflect intended usage (#66762) 2020-12-30 12:39:17 -05:00
Julie Tibshirani d0683141f4
Ensure all query builder tests consider older versions. (#66401)
This PR removes outdated overrides in some tests that prevent them from testing
older index versions. Also removes an old comment + logic from
AggregatorFactoriesTests.
2020-12-16 09:19:26 -08:00
Nik Everett 7b3c6f2a0c
Further clean up in AggregatorTestCase (#66395)
Drops `AggregatorTestCase#mapperServiceMock` because it is getting in
the way of other work I'm doing for runtime fields. It was only
overridden to test the `parent` and `child` aggregation to add the
`MappedFieldType`s for join fields in the backdoor. Those aggregations
can just as easily add those fields in the normal method calls.
2020-12-16 11:56:04 -05:00
Armin Braun 06a31a0aca
Add List Append Utility Method (#65576)
(list -> copy -> add one -> wrap immutable) is a pretty common pattern in CS
updates and tests => added a shortcut for it here and used it in easily identifiable
spots.
2020-12-01 02:47:21 +01:00
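The pattern named in #65576 (list -> copy -> add one -> wrap immutable) could look roughly like the sketch below. The class name and the `appendToCopy` method name are assumptions made for illustration; the commit message does not state the utility's actual name or signature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ListAppendSketch {
    // Hypothetical helper: copies the list, appends one element, and
    // returns an unmodifiable view of the copy. The input is not touched.
    static <T> List<T> appendToCopy(List<T> list, T element) {
        List<T> copy = new ArrayList<>(list.size() + 1);
        copy.addAll(list);
        copy.add(element);
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node_1", "node_2");
        System.out.println(appendToCopy(nodes, "node_3"));
    }
}
```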
Nik Everett c227554080
Remove SearchContext from constructing aggregations (#64953)
This replaces the `SearchContext` passed to the ctor of `Aggregation`s
with `AggregationContext`. It ends up adding a fairly large number of
methods to `AggregationContext` but in exchange it shows a path to
removing a few methods from `SearchContext`. That seems nice!

It also gives us an accurate inventory of "all of the stuff" that
aggregations use to build and run.
2020-11-30 13:19:44 -05:00
Julie Tibshirani f4a462d05e
Simplify how source is passed to fetch subphases. (#65292)
This PR simplifies how the document source is passed to each fetch subphase. A summary of the strategy:
* For each document, we try to eagerly load the source and store it on `HitContext`. Most subphases that access source, like source filtering and highlighting, use `HitContext`. For nested hits, we filter the parent source and also store this source on `HitContext`.
* Only for non-nested documents, we also store the loaded source on `QueryShardContext#lookup`. This allows subphases that access source through `SearchLookup` to use the pre-loaded source when possible. This is now a common occurrence, since runtime fields are supported in the 'fields' option and may soon be supported in highlighting.

There is no longer a special `SearchLookup` just for the fetch phase. This was not necessary and was mostly caused by a misunderstanding of how `QueryShardContext` should be used.

Addresses #62511.
2020-11-20 14:09:41 -08:00
Alan Woodward 0fd70ae383
Remove Mapper.BuilderContext (#64625)
Mapper.BuilderContext is a simple wrapper around two objects, some
IndexSettings and a ContentPath. The IndexSettings are the same as
those provided in the ParserContext, so we can simplify things here
by removing them and just passing ContentPath directly to
Mapper.Builder#build()
2020-11-05 10:48:39 +00:00
Luca Cavanna f1e9aec8dc
Replace more MapperService usages in favour of QueryShardContext (#64584)
This commit replaces most of the leftover direct access to MapperService from SearchContext and FetchContext with accessing QueryShardContext instead, which wraps the MapperService and exposes a subset of its functionality needed when executing the different phases of search
2020-11-04 15:49:38 +01:00
Alan Woodward f010269ab7
Move index analyzer management to FieldMapper/MapperService (#63937)
Index-time analyzers are currently specified on the MappedFieldType. This
has a number of unfortunate consequences; for example, field mappers that
index data into implementation sub-fields, such as prefix or phrase
accelerators on text fields, need to expose these sub-fields as MappedFieldTypes,
which means that they then appear in field caps, are externally searchable,
etc. It also adds index-time logic to a class that should only be concerned
with search-time behaviour.

This commit removes references to the index analyzer from MappedFieldType.
Instead, FieldMappers that use the terms index can pass either a single analyzer
or a Map of fields to analyzers to their super constructor, which are then
exposed via a new FieldMapper#indexAnalyzers() method; all index-time analysis 
is mediated through the delegating analyzer wrapper on MapperService. 
In a follow-up, this will make it possible to register multiple field analyzers from 
a single FieldMapper, removing the need for 'hidden' mapper implementations 
on text field, parent joins, and elsewhere.
2020-11-04 13:53:09 +00:00
Luca Cavanna 344ad33a16
Remove ValueFetcher dependency from MapperService (#64524)
The signature of MappedFieldType#valueFetcher requires MapperService as an argument which is unfortunate as that is one of the reasons why FetchContext exposes the whole MapperService.

Such use of MapperService can be replaced with exposing the QueryShardContext which encapsulates the MapperService.
2020-11-04 12:08:34 +01:00
Alan Woodward a5168572d5
Collapse ParametrizedFieldMapper into FieldMapper (#64365)
Now that all our FieldMapper implementations extend ParametrizedFieldMapper,
we can collapse the two classes together, and remove a load of cruft from
FieldMapper that is unused. In particular:

* we no longer need the lucene FieldType field on FieldMapper
* we no longer use clone() for merging, so we can remove it from all impls
* the serialization code in FieldMapper that assumes we're looking at text fields can go
2020-11-02 15:07:52 +00:00
Nik Everett 3af540b50d
Remove aggregation's postCollect phase (#64016)
After #63811 it became clear to me that `postCollect` is kind of
dangerous and not all that useful. So this removes it.

The trouble with `postCollect` is that it all happened right after we
finished calling `collect` on the `LeafBucketCollectors` but before we
built the aggregation results. But in #63811 we found out that we can't
call `postCollect` on the children of `parent` or `child` aggregators
until we know *which* aggregation results we're building.

So this removes `postCollect` and moves all of the things we did at
post-collect phase into `buildAggregations` or into hooks called in
those methods.
2020-10-28 17:33:27 -04:00
Nik Everett d2043a4b12 Add more tests for parent/child aggs
I broke the `parent` and `child` agg something fierce in #57892 and
fixed it in #63811. This adds more tests for that fix mimicking other
reported failures.
2020-10-28 16:06:02 -04:00
Luca Cavanna 2186b75af9
Reduce usages of SearchContext#mapperService (#64250)
We recently removed getMapperService from QueryShardContext in an attempt to avoid consumers depending on the whole MapperService. SearchContext still has that problem, although it is easier to solve as it can delegate to QueryShardContext for the most part, which is what this commit does for most of the existing usages.
2020-10-28 09:55:52 +01:00
Nik Everett 7feb19a74f
Make sure non-collecting aggs include sub-aggs (#64214)
Now that we're consistently using `can_match` to filter which shards we
run on we can get this confusing case:
1. You have a search with, say, a range and a sub-agg.
2. That search has a query that `can_match` can recognize will match no
   docs. On *any* shard.
3. So we dutifully run it on a single shard so it can produce the
   "empty" aggs.
4. The shard we pick happens to not have the target of the range mapped.
5. This kicks in the special range aggregator that doesn't collect any
   documents.
6. Before this commit, that range aggregator *also* never produced any
   sub-aggs.

So, without this change, it was quite possible for a search that
happened to match no documents to "throw away" the sub-aggs of a range
and a few other aggs.

We've had this problem for a long, long time but it is more confusing
now because `can_match` is really kicking in and causing us to see cases
where it looks like you are targeting a lot of shards but you really are
only targeting a couple. It used to be that to get the "no sub-aggs"
behavior you had to explicitly target only shards that didn't map the
target field of the `range` agg. And, like, in that case it isn't too
bad because you targeted a sort of degenerate shard. But now that
`can_match` is doing its thing you can end up with the confusing steps
above. It took me several hours to track down what was happening, even
though I know how the individual pieces of all of this work. It took four
hours to figure out how they fit together in this case....

Anyway! This replaces all the aggregator implementations that throw out
the sub-aggregators with ones that keep them. I think this'll be less
confusing in the future.

Closes #64142
2020-10-27 15:45:24 -04:00
Nik Everett 6ef0e5f5e8
Limit blast radius of SearchContext in aggs (#64068)
This takes away access to the `SearchContext` from all subclasses of
`Aggregator`. Now they have access to three things:
* BigArrays
* The top level Query
* The IndexSearcher

These are used by a whole bunch of aggs.

This is a useful change because `SearchContext` is very large and
difficult to mock in tests and difficult to reason about in general.
Limiting what aggs can use when they are being collected helps with
this.

We still pass `SearchContext` to `AggregatorBase`'s ctor so the thing is
still around. But we can remove that access in a follow up.
2020-10-27 09:12:58 -04:00
Nik Everett 769e30dd88
Fix broken parent and child aggregator (#63811)
In #57892 I broke *some* sub-aggregations inside of the `parent` and
`child` aggregator, specifically any sub-aggregations that do work in
the `postCollect` phase. This fixes it by delaying the post collect
phase of aggs under `parent` and `child` until `beforeBuildingBuckets`
because, well, we haven't done *any* collection until after that phase.
2020-10-19 10:54:09 -04:00
Alan Woodward b79e6ae8f7
Convert parent-join mappers to parametrized form (#63878)
This converts the three parent-join mapper implementations to parametrized
form; MetaJoinFieldMapper and ParentIdFieldMapper have no builders or
merging logic as they are always created directly by the ParentJoinFieldMapper.

Relates to #62988
2020-10-19 15:37:47 +01:00
Alan Woodward 70d88ef62d
Rework parent-join to not require access to DocumentMapper (#63738)
Parent joins work using a cluster of field mappers: the join field itself;
a set of subfields that allow multiple relationships between parents and
children to be defined; and a metadata field that acts to only allow a
single join field per index to be defined. The various queries and
aggregations that use this infrastructure retrieve the join field mapper
via a static method and then build themselves by pulling individual
relationship mappers from this main mapper.

Using mappers rather than MappedFieldTypes means that we need to
expose DocumentMapper at search time, which is something we are
trying to avoid. This commit refactors things so that the join relations
are encapsulated in a Joiner object, which lives instead on the
MappedFieldType associated with the metadata join field. Rather than
using the ParentJoinFieldMapper and connected ParentIdFieldMappers,
we can now build queries and aggregations using this Joiner object,
retrieved via the QueryShardContext or AggregationContext using
a static helper method on Joiner itself.
2020-10-19 12:17:48 +01:00
Luca Cavanna d126afb2c2
Remove direct dependency between ParserContext and MapperService (#63741)
ParserContext only needs some small portions of MapperService, and certainly does not need to expose MapperService through its current getter method.

With this change we address this by keeping references to the needed components rather than the whole MapperService
2020-10-15 17:45:53 +02:00
Alan Woodward 8b98af24b4
Remove generics from Mapper.Builder (#63623)
We simplified the generics on Mapper.Builder in #56747, but stopped short
of removing them entirely because they were still used in various places in
the code. Now that most field mappers have been converted to parametrized
form, these generics are no longer useful. There are very few places where
a fluent Builder pattern is used, almost all in tests, and these can all be
replaced with simple casts; in exchange, we remove lots of visual cruft and
clean up a number of warnings.
2020-10-13 17:24:10 +01:00
Nik Everett 4aaffc6a3d
Consider query when optimizing date rounding (#63403)
Before this change we inspected the index when optimizing
`date_histogram` aggregations, precalculating the divisions for the
buckets for the entire range of dates on the index so long as there
aren't a ton of these buckets. This works very well when you query all
of the dates in the index which is quite common - after all, folks
frequently want to query a week of data and have daily indices.

But it doesn't work as well when the index is much larger than the
query. This is quite common when dumping data into ES just to
investigate it but less common in the traditional time series use case.
But even there it still happens, it is just less impactful. Consider
the default query produced by Kibana's Discover app: a range of 15
minutes and a interval of 30 seconds. This optimization saves something
like 3 to 12 nanoseconds per document, so that 15 minutes would have to
have hundreds of millions of documents for it to be impactful.

Anyway, this commit takes the query into account when precalculating the
buckets. Mostly this is good when you have "dirty data". Imagine
loading 80 billion docs in an index to investigate them. Most of them
have dates around 2015 and 2016 but some have dates in 1970 and
others have dates in 2030. These outlier dates are "dirty" "garbage".
Well, without this change a `date_histogram` across many of these docs
is significantly slowed down because we don't precalculate the range due
to the outliers. That's just rude! So this change takes the query into
account.

The bulk of the code change here is plumbing the query into place. It
turns out that it's a *ton* of plumbing, so instead of just adding a
`Query` member in hundreds of args we replace `QueryShardContext` with a
new `AggregationContext` which does two things:
1. Has the top level `Query`.
2. Exposes just the parts of `QueryShardContext` that we actually need
   to run aggregation. This lets us simplify a few tests now and will
   let us simplify many, many tests later.
2020-10-12 13:11:44 -04:00
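The idea behind #63403 can be sketched as intersecting the index's date bounds with the query's date bounds before deciding whether precalculating bucket boundaries is worthwhile. This is a simplified illustration only: it assumes fixed-width intervals in milliseconds, and the class name, method name, and bucket threshold are all hypothetical rather than taken from the real rounding code.

```java
final class RoundingRangeSketch {
    // Hypothetical cap on how many bucket boundaries are worth precomputing.
    static final long MAX_PRECOMPUTED_BUCKETS = 128;

    // Decide whether to precompute, using only the dates that the query
    // can actually reach inside the index.
    static boolean shouldPrecompute(long indexMin, long indexMax, long queryMin, long queryMax, long intervalMillis) {
        long min = Math.max(indexMin, queryMin);
        long max = Math.min(indexMax, queryMax);
        if (min > max) {
            return false; // the query cannot match any indexed date
        }
        long buckets = (max - min) / intervalMillis + 1;
        return buckets <= MAX_PRECOMPUTED_BUCKETS;
    }

    public static void main(String[] args) {
        long indexMin = 0L;                    // "dirty" outlier near 1970
        long indexMax = 1_900_000_000_000L;    // "dirty" outlier near 2030
        long queryMin = 1_450_000_000_000L;    // start of a 15 minute window in 2015
        long queryMax = queryMin + 15 * 60 * 1000L;
        long interval = 30 * 1000L;            // 30 second buckets

        // Index bounds alone: far too many buckets because of the outliers.
        System.out.println(shouldPrecompute(indexMin, indexMax, Long.MIN_VALUE, Long.MAX_VALUE, interval));
        // Index bounds narrowed by the query: a few dozen buckets.
        System.out.println(shouldPrecompute(indexMin, indexMax, queryMin, queryMax, interval));
    }
}
```

With the outliers included, the index-wide range spans tens of millions of 30 second buckets, while the query-restricted range spans about thirty, which is the situation the commit message describes.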
Julie Tibshirani 8c56bbc3e6
Add factory methods for common value fetchers. (#63438)
This PR adds factory methods for the most common implementations:
* `SourceValueFetcher.identity` to pass through the source value untouched.
* `SourceValueFetcher.toString` to simply convert the source value to a string.
2020-10-08 11:58:36 -07:00
Julie Tibshirani cc09b6b6a0
Make array value parsing flag more robust. (#63354)
When constructing a value fetcher, the 'parsesArrayValue' flag must match
`FieldMapper#parsesArrayValue`. However there is nothing in code or tests to
help enforce this.

This PR reworks the value fetcher constructors so that `parsesArrayValue` is
'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must
explicitly set it to true and ensure the behavior is covered by tests.

Follow-up to #62974.
2020-10-06 14:42:03 -07:00
Alan Woodward ce649d07d7
Move FieldMapper#valueFetcher to MappedFieldType (#62974)
For runtime fields, we will want to do all search-time interaction with
a field definition via a MappedFieldType, rather than a FieldMapper, to
avoid interfering with the logic of document parsing. Currently, fetching
values for runtime scripts and for building top hits responses need to
call a method on FieldMapper. This commit moves this method to
MappedFieldType, incidentally simplifying the current call sites and freeing
us up to implement runtime fields as pure MappedFieldType objects.
2020-10-04 10:47:04 +01:00
Luca Cavanna daade44174
Share same existsQuery impl throughout mappers (#57607)
Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers.

There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they have neither norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available.

This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method.

At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.
2020-09-23 08:58:09 +02:00
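The fallback order described in #57607 (doc_values first, then norms, then a term query against `_field_names`) can be sketched as a small decision function. The enum and method below are hypothetical stand-ins; the real `existsQuery` returns a Lucene query rather than an enum value.

```java
final class ExistsQuerySketch {
    // Which data structure an exists query would consult.
    enum Strategy { DOC_VALUES, NORMS, FIELD_NAMES }

    // Pick the strategy from the data structures the field type has available.
    static Strategy existsStrategy(boolean hasDocValues, boolean hasNorms) {
        if (hasDocValues) {
            return Strategy.DOC_VALUES;
        }
        if (hasNorms) {
            return Strategy.NORMS;
        }
        // Neither is available: fall back to the _field_names meta field.
        return Strategy.FIELD_NAMES;
    }

    public static void main(String[] args) {
        System.out.println(existsStrategy(true, true));
        System.out.println(existsStrategy(false, true));
        System.out.println(existsStrategy(false, false));
    }
}
```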
Luca Cavanna 3a9b65733c
Move stored flag from TextSearchInfo to MappedFieldType (#62717) 2020-09-22 15:41:24 +02:00
Nik Everett 9a127adb4b
Implement fields fetch for runtime fields (#61995)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 15:57:26 -04:00
Julie Tibshirani f29c743a47
Support the 'fields' option in inner_hits and top_hits. (#62259)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 10:08:58 -07:00
Nik Everett 8c37d05fdf
Support longs in BitArray (#61867)
We frequently use `long`s with `BitArray` in aggs and right now we have
to assert that the `long` fits in an `int`. This adds support for `long`
to `BitArray` so we don't need those assertions.
2020-09-02 13:12:34 -04:00
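A bit set indexed by `long`, as motivated in #61867, might be sketched as follows. This is not the real `BitArray`: the class name is hypothetical, and a plain `long[]` is used as the backing store for brevity, which caps the usable index range well below what a paged array could support.

```java
import java.util.Arrays;

final class LongBitArraySketch {
    // Each word holds 64 bits; the array grows on demand.
    private long[] words = new long[1];

    void set(long index) {
        int word = wordIndex(index);
        if (word >= words.length) {
            words = Arrays.copyOf(words, Math.max(word + 1, words.length * 2));
        }
        // For a long left operand, Java uses only the low 6 bits of the shift count.
        words[word] |= 1L << index;
    }

    boolean get(long index) {
        int word = wordIndex(index);
        return word < words.length && (words[word] & (1L << index)) != 0;
    }

    private static int wordIndex(long index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative: " + index);
        }
        // Throws ArithmeticException if the word index does not fit in an int,
        // a limit of this array-backed sketch.
        return Math.toIntExact(index >>> 6);
    }

    public static void main(String[] args) {
        LongBitArraySketch bits = new LongBitArraySketch();
        bits.set(100_000L);
        System.out.println(bits.get(100_000L));
        System.out.println(bits.get(99_999L));
    }
}
```

Callers can pass a `long` bucket ordinal directly instead of asserting that it fits in an `int` first.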
Luca Cavanna 462e25f9bb
Pass SearchLookup supplier through to fielddataBuilder (#61430)
Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not.

To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method.

As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors.

With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition.

Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch.

This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>.

Co-authored-by: Nik Everett <nik9000@gmail.com>

Relates to #59332
2020-08-26 20:19:21 +02:00
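The protection described in #61430 combines cycle detection with a limit on reference depth. The sketch below shows one way to do both by tracking the chain of fields currently being resolved. It is a simplification: it walks a static dependency map rather than lazily resolved lookups, and the class name, method names, and the depth limit of 5 are assumptions chosen for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FieldCycleSketch {
    // Hypothetical limit on how deep a chain of field references may go.
    static final int MAX_DEPTH = 5;

    // dependencies maps a field name to the fields it reads.
    static void checkField(String field, Map<String, List<String>> dependencies) {
        visit(field, dependencies, new LinkedHashSet<>());
    }

    private static void visit(String field, Map<String, List<String>> dependencies, Set<String> chain) {
        // A field already on the current chain means the definition is cyclic.
        if (chain.add(field) == false) {
            throw new IllegalArgumentException(
                "Cyclic dependency detected: " + String.join(" -> ", chain) + " -> " + field);
        }
        if (chain.size() > MAX_DEPTH) {
            throw new IllegalArgumentException(
                "Field reference depth exceeds " + MAX_DEPTH + ": " + String.join(" -> ", chain));
        }
        for (String next : dependencies.getOrDefault(field, List.of())) {
            visit(next, dependencies, chain);
        }
        // Leaving this field: it is no longer part of the active chain.
        chain.remove(field);
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependencies = Map.of("a", List.of("b"), "b", List.of("a"));
        try {
            checkField("a", dependencies);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing with a descriptive exception in both cases avoids the stack overflow that an unbounded chain of lookups would otherwise cause.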
Nhat Nguyen 879279c9b4
Introduce point in time APIs in x-pack basic (#61062)
This commit introduces a new API that manages point-in-times in x-pack 
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.

A point in time must be opened before being used in search requests. The 
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes an `id`, which should be
passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
            "id":  "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
            "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` has
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
2020-08-24 20:24:35 -04:00
Julie Tibshirani 5457b34343
Correct how field retrieval handles multifields and copy_to. (#61309)
Before when a value was copied to a field through a parent field or `copy_to`,
we parsed it using the `FieldMapper` from the source field. Instead we should
parse it using the target `FieldMapper`. This ensures that we apply the
appropriate mapping type and options to the copied value.

To implement the fix cleanly, this PR refactors the value parsing strategy. Now
instead of looking up values directly, field mappers produce a helper object
`ValueFetcher`. The value fetchers are responsible for almost all aspects of
fetching, including looking up the right paths in the _source.

The PR is fairly big but each commit can be reviewed individually.

Fixes #61033.
2020-08-19 16:50:27 -07:00
Mark Tozzi e3c9ece1e0
Remove a bunch of type boilerplate from Aggs (#60852) 2020-08-12 09:30:40 -04:00
Nik Everett dc1b9690a7
Fix the parent join aggregator test case (#60991)
The test was putting parent and child documents into different segments
which is unrealistic and was causing errors.

Closes #60980
2020-08-11 17:52:57 -04:00
Nhat Nguyen 766177ffcb Mute ChildrenToParentAggregatorTests
Tracked at #60980
2020-08-11 12:56:04 -04:00
Jim Ferenczi 5de0ed9432
Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60683)
This commit removes the ability to test the top level result of an aggregator
before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search
are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test
the final output (the one sent to the end user) rather than an intermediary result
that could be different.
This change also removes spurious commits triggered on top of a random index writer.
These commits slow down the tests and are redundant with the commits that the
random index writer performs.
2020-08-06 14:08:26 +02:00
Alan Woodward d6fc439fef
Move mapper validation to the mappers themselves (#60072)
Currently, validation of mappers (checking that cross-references are correct, limits on
field name lengths and object depths, multiple definitions, etc) is performed by the
MapperService. This means that any mapper-specific validation, for example that done
on the CompletionFieldMapper, needs to be called specifically from core server code,
and so we can't add validation to mappers that live in plugins.

This commit reworks the validation framework so that mapper-specific validation is
done on the Mapper itself. Mapper gets a new `validate(MappingLookup)`
method (already present on `MetadataFieldMapper` and now pulled up to the parent
interface), which is called from a new `DocumentMapper.validate()` method. All
the validation code currently living on `MapperService` moves either to individual
mapper implementations (FieldAliasMapper, CompletionFieldMapper) or into
`MappingLookup`, an altered `DocumentFieldMappers` which now knows about 
object fields and can check for duplicate definitions, or into DocumentMapper 
which handles soft limit checks.
2020-08-04 12:19:47 +01:00
Julie Tibshirani 7b64410286
Avoid reloading _source for every inner hit. (#60494)
Previously if an inner_hits block required _source, we would reload and parse
the root document's source for every hit. This PR adds a shared SourceLookup to
the inner hits context that allows inner hits to reuse parsed source if it's
already available. This matches our approach for sharing the root document ID.

Relates to #32818.
2020-08-03 15:31:09 -07:00
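The reuse described in #60494 amounts to parsing the root document's source once and sharing the parsed result across inner hits. A minimal sketch of that idea follows; the class is a hypothetical stand-in and does not reproduce the real `SourceLookup` API.

```java
import java.util.Map;
import java.util.function.Supplier;

final class SharedSourceSketch {
    private final Supplier<Map<String, Object>> loader;
    private Map<String, Object> parsed;
    // Counts how often the (expensive) load and parse step actually ran.
    int loads = 0;

    SharedSourceSketch(Supplier<Map<String, Object>> loader) {
        this.loader = loader;
    }

    // Returns the parsed source, loading it only on first access.
    Map<String, Object> source() {
        if (parsed == null) {
            parsed = loader.get();
            loads++;
        }
        return parsed;
    }

    public static void main(String[] args) {
        SharedSourceSketch shared = new SharedSourceSketch(() -> Map.of("title", "elasticsearch"));
        // Three inner hits consult the same shared lookup.
        for (int hit = 0; hit < 3; hit++) {
            shared.source();
        }
        System.out.println(shared.loads);
    }
}
```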
Julie Tibshirani 8a89d95372
Add search `fields` parameter to support high-level field retrieval. (#60100)
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.

Addresses #49028 and #55363.
2020-07-27 13:25:55 -07:00
Jake Landis 7dd57c9415
Introduce javaRestTest source set/task and convert modules (#59939)
Introduce a javaRestTest source set and task to complement the yamlRestTest.
javaRestTest differs such that the code is sourced from Java and may have
different dependencies and setup requirements for the test clusters. This also
allows the tests to run in parallel in different cluster instances to prevent any
cross test contamination between the two types of tests.

As part of this PR, all :modules projects stop using the integTest task. The tests
are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest.
Since only :modules (and :rest-api-spec) have been converted to yamlRestTest
we can now disable the integTest task if either yamlRestTest or javaRestTest have
been applied. Once all projects are converted, we can delete the integTest task.

related: #56841
related: #59444
2020-07-21 17:17:17 -05:00
Nik Everett 98698f569d
Drop some params from IndexFieldData.Builder (#59934)
We never used the `IndexSettings` parameter and we only used the
`MappedFieldType` parameter to get the name of the field which we
already know everywhere where we build the `IFD.Builder`. This allows us
to drop a fair bit of ceremony from a couple of tests.
2020-07-21 08:29:58 -04:00
Jake Landis ddd882b835
Convert modules to use yamlRestTest (#59089)
This commit moves the modules REST tests to the
newly introduced yamlRestTest source set. A few
tests have also been re-named to include the correct
IT suffix. Without changing the names, the testing
conventions task would fail, since the YAML tests
are no longer present to pacify the convention.
These tests have moved to the internalClusterTest
source set.

related: #56841
2020-07-13 11:32:42 -05:00
Alan Woodward 62f51eb9ae
MappedFieldType no longer requires equals/hashCode/clone (#59212)
With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer
have any cause to compare MappedFieldType instances. This means that we can remove all equals
and hashCode implementations, and in addition we no longer need the clone implementations which
were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes,
which will be particularly useful for the runtime fields project.
2020-07-09 21:01:29 +01:00
Nik Everett ea5df51b91
Improve cardinality measure used to build aggs (#56533)
This makes a `parentCardinality` available to every `Aggregator`'s ctor
so it can make intelligent choices about how it collects bucket values.
This replaces `collectsFromSingleBucket` and is similar to it but:
1. It supports `NONE`, `ONE`, and `MANY` values and is generally
   extensible if we decide we can use more precise counts.
2. It is more accurate. `collectsFromSingleBucket` assumed that all
   sub-aggregations live under multi-bucket aggregations. This is
   normally true but `parentCardinality` is properly carried forward
   for single bucket aggregations like `filter` and for multi-bucket
   aggregations configured in single-bucket form, like `range` with a
   single range.

While I was touching every aggregation I renamed `doCreateInternal` to
`createMapped` because that seemed like a much better name and it was
right there, next to the change I was already making.

Relates to #56487
2020-07-06 18:31:08 -04:00
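The three-valued cardinality from #56533 can be pictured as an enum that each aggregator passes down to its children, adjusted by how many buckets it can produce. The sketch below is an assumption-laden illustration: the enum name and the `multiply` method are hypothetical and are not quoted from the commit.

```java
final class CardinalitySketch {
    // Upper bound on how many owning buckets an aggregator collects into.
    enum Cardinality {
        NONE, ONE, MANY;

        // Bound for children of an aggregator that produces at most
        // bucketCount buckets per owning bucket.
        Cardinality multiply(int bucketCount) {
            if (this == NONE || bucketCount == 0) {
                return NONE;
            }
            if (this == ONE && bucketCount == 1) {
                return ONE;
            }
            return MANY;
        }
    }

    public static void main(String[] args) {
        // A top-level `filter` agg has one bucket, so its children still see ONE.
        System.out.println(Cardinality.ONE.multiply(1));
        // A `range` agg with three ranges gives its children MANY.
        System.out.println(Cardinality.ONE.multiply(3));
    }
}
```

This captures the point made in item 2 above: a single-bucket parent, or a `range` with a single range, carries `ONE` forward instead of forcing children to assume many buckets.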
Jake Landis 333a5d8cdf
Create plugin for yamlTest task (#56841)
This commit creates a new Gradle plugin to provide a separate task name
and source set for running YAML based REST tests. The only project
converted to use the new plugin in this PR is distribution/archives/integ-test-zip,
whose testing has been moved to :rest-api-spec since it makes the most
sense and it avoids a small but awkward change to the distribution plugin.

The remaining cases in modules, plugins, and x-pack will be handled in followups.

This plugin is distinctly different from the plugin introduced in #55896 since
the YAML REST tests are intended to be black box tests over HTTP. As such they
should not (by default) have access to the classpath for that which they are testing.

The YAML based REST tests will be moved to separate source sets (yamlRestTest).
Which source set is the target for the test resources depends on whether this
new plugin is applied. If it is not applied, it will default to the test source
set.

Further, this introduces a breaking change for plugin developers that
use the YAML testing framework. They will now need to either use the new source set
and matching task, or configure the rest resources to use the old "test" source set that
matches the old integTest task. (The former should be preferred).

As part of this change (which is also breaking for plugin developers) the
rest resources plugin has been removed from the build plugin and now requires
either explicit application or application via the new YAML REST test plugin.

Plugin developers should be able to fix the breaking changes to the YAML tests
by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests
under a yamlRestTest folder (instead of test)
2020-07-06 12:13:01 -05:00
ParthPunkster afb6f9c81f
Fix bug in parent and child aggregators when parent field not defined (#57089)
Adding null check for ParentJoinFieldMapper in ChildrenAggregationBuilder.joinFieldResolveConfig

Closes #42997
2020-07-06 09:54:11 -04:00