ParseContext is used to parse documents. It was easily confused with ParserContext (now renamed to MappingParserContext) which is instead used to parse mappings.
To remove any confusion, this commit renames ParseContext to DocumentParserContext and adapts its subclasses accordingly.
Change the formatter config to sort / order imports, and reformat the
codebase. We already had a config file for Eclipse users, so Spotless now
uses that.
The "Eclipse Code Formatter" plugin ought to be able to use this file as
well for import ordering, but in my experiments the results were poor.
Instead, use IntelliJ's `.editorconfig` support to configure import
ordering.
I've also added a config file for the formatter plugin.
Other changes:
* I've quietly enabled the `toggleOnOff` option for Spotless. It was
already possible to disable formatting for sections using the markers
for docs snippets, so enabling this option just accepts this reality
and makes it possible via `formatter:off` and `formatter:on` without
the restrictions around line length. It should still only be used as
a very last resort and with good reason.
* I've removed mention of the `paddedCell` option from the contributing
guide, since I haven't had to use that option for a very long time. I
moved the docs to the spotless config.
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.
relates #73784
Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic
This includes a refactoring of ElasticsearchDistribution to handle types
better in a way we can differentiate between supported Elasticsearch
Distribution types supported in TestCkustersPlugin and types only supported
in internal plugins.
It also introduces a set of internal versions of public plugins.
As part of this we also generate the plugin descriptors now.
As a follow up on this we can actually move these public used classes into
an extra project (declared as included build)
We keep LoggedExec and VersionProperties effectively public And workaround for RestTestBase
MapperTestCase has a check that if a field mapper supports stored fields,
those stored fields are available to index time scripts. Many of our mappers
do not support stored fields, and we try and catch this with an assumeFalse
so that those mappers do not run this test. However, this test is fragile - it
does not work for mappers created with an index version below 8.0, and it
misses mappers that always store their values, e.g. match_only_text.
This commit adds a new supportsStoredField method to MapperTestCase,
and overrides it for those mappers that do not support storing values. It
also adds a minimalStoredMapping method that defaults to the minimal
mapping plus a store parameter, which is overridden by match_only_text
because storing is not configurable and always available on this mapper.
The majority of field mappers read a single value from their positioned
XContentParser, and do not need to call nextToken. There is a general
assumption that the same holds for any multifields defined on them, and
so the XContentParser is passed down to their multifields builder as-is.
This assumption does not hold for mappers that accept json objects,
and so we have a second mechanism for passing values around called
'external values', where a mapper can set a specific value on its context
and child mappers can then check for these external values before reading
from xcontent. The disadvantage of this is that every field mapper now
needs to check its context for external values. Because the values are
defined by their java class, we can also know that in the vast majority of
cases this functionality is unused. We have only two mappers that actually
make use of this, CompletionFieldMapper and GeoPointFieldMapper.
This commit removes external values entirely, and replaces it with the ability
to pass a modified XContentParser to multifields. FieldMappers can just check
the parser attached to their context for data and don't need to worry about
multiple sources.
Plugins implementing field mappers will need to take the removal of external
values into account. Implementations that are passing structured objects
as external values should instead use ParseContext.switchParser and
wrap the objects using MapXContentParser.wrapObject().
GeoPointFieldMapper passes on a fake parser that just wraps its input data
formatted as a geohash; CompletionFieldMapper has a slightly more complicated
parser that in general wraps its metadata, but if textOrNull() is called without
the parser being advanced just returns its text input.
Relates to #56063
The test currently generates a list of random values and checks whether
retrieval of these values via doc values is equivallent to fetching them with a
value fetcher from source. If the random value array contains a duplicate value,
we will only get one back via doc values, but fetching from source will return
both, which is a case we should probably avoid in this test.
Closes#71053
We've had a few bugs in the fields API where is doesn't behave like we'd
expect. Typically this happens because it isn't obvious what we expct. So
we'll try and use randomized testing to ferret out what we want. This adds
a test for most field types that asserts that `fields` works similarly
to `docvalues_fields`. We expect this to be true for most fields.
It does so by forcing all subclasses of `MapperTestCase` to define a
method that makes random values. It declares a few other hooks that
subclasses can override to further randomize the test.
We skip the test for a few field types that don't have doc values:
* `annotated_text`
* `completion`
* `search_as_you_type`
* `text`
We should come up with some way to test these without doc values, even
if it isn't as nice. But that is a problem for another time, I think.
We skip the test for a few more types just because I wanted to cut this
PR in half so we could get to reviewing it earlier. We'll get to those
in a follow up change.
I've filed a few bugs for things that are inconsistent with
`docvalues_fields`. Typically that means that we have to limit the
random values that we generate to those that *do* round trip properly.
This reduces the ceremony declaring test artifacts for a project.
It also solves an issue with usage of deprecated testRuntime that
testArtifacts extendsFrom which seems not required at all and would have
broke with Gradle 7.0 anyhow
Test artifact resolution is now variant aware which allows us a more adequate
compile and runtime classpath for the consuming projects.
We also Introduce a convention method in the elasticsearch build to declare
test artifact dependencies in an easy way close to how its done by the gradle build in
test fixture plugin.
Furthermore we cleaned up some inconsistent test dependencies declarations when
relying on a project and on its test artifacts
This has been deprecated in gradle before but we havnt been warned.
Gradle 7.0 will likely introduce a change in behaviour here that we
should fix the usage of this configuration upfront.
See https://github.com/gradle/gradle/issues/16027 for further information
about the change in Gradle 7.0
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:
- Updating LICENSE and NOTICE files throughout the code base, as well
as those packaged in our published artifacts
- Update IDE integration to now use the new license header on newly
created source files
- Remove references to the "OSS" distribution from our documentation
- Update build time verification checks to no longer allow Apache 2.0
license header in Elasticsearch source code
- Replace all existing Apache 2.0 license headers for non-xpack code
with updated header (vendored code with Apache 2.0 headers obviously
remains the same).
- Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
Part 5.
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.
We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
We decided to rename `QueryShardContext` to clarify that it supports all parts
of search request execution. Before there was confusion over whether it should
only be used for building queries, or maybe only used in the query phase. This
PR also updates the javadocs.
Closes#64740.
This PR simplifies how the document source is passed to each fetch subphase. A summary of the strategy:
* For each document, we try to eagerly load the source and store it on `HitContext`. Most subphases that access source, like source filtering and highlighting, use `HitContext`. For nested hits, we filter the parent source and also store this source on `HitContext`.
* Only for non-nested documents, we also store the loaded source on `QueryShardContext#lookup`. This allows subphases that access source through `SearchLookup` to use the pre-loaded source when possible. This is now a common occurrence, since runtime fields are supported in the 'fields' option and may soon be supported in highlighting.
There is no longer a special `SearchLookup` just for the fetch phase. This was not necessary and was mostly caused by a misunderstanding of how `QueryShardContext` should be used.
Addresses #62511.
Mapper.BuilderContext is a simple wrapper around two objects, some
IndexSettings and a ContentPath. The IndexSettings are the same as
those provided in the ParserContext, so we can simplify things here
by removing them and just passing ContentPath directly to
Mapper.Builder#build()
Index-time analyzers are currently specified on the MappedFieldType. This
has a number of unfortunate consequences; for example, field mappers that
index data into implementation sub-fields, such as prefix or phrase
accelerators on text fields, need to expose these sub-fields as MappedFieldTypes,
which means that they then appear in field caps, are externally searchable,
etc. It also adds index-time logic to a class that should only be concerned
with search-time behaviour.
This commit removes references to the index analyzer from MappedFieldType.
Instead, FieldMappers that use the terms index can pass either a single analyzer
or a Map of fields to analyzers to their super constructor, which are then
exposed via a new FieldMapper#indexAnalyzers() method; all index-time analysis
is mediated through the delegating analyzer wrapper on MapperService.
In a follow-up, this will make it possible to register multiple field analyzers from
a single FieldMapper, removing the need for 'hidden' mapper implementations
on text field, parent joins, and elsewhere.
The signature of MappedFieldType#valueFetcher requires MapperService as an argument which is unfortunate as that is one of the reasons why FetchContext exposes the whole MapperService.
Such use of MapperService can be replaced with exposing the QueryShardContext which encapsulates the MapperService.
Now that all our FieldMapper implementations extend ParametrizedFieldMapper,
we can collapse the two classes together, and remove a load of cruft from
FieldMapper that is unused. In particular:
* we no longer need the lucene FieldType field on FieldMapper
* we no longer use clone() for merging, so we can remove it from all impls
* the serialization code in FieldMapper that assumes we're looking at text fields can go
Some supported field types don't support term queries, and throw exception in their termQuery method. That exception is either an IllegalArgumentException or a QueryShardException. There is logic in MatchQuery that skips the field or not depending on the exception that is thrown.
Also, such field types should hold a TextSearchInfo.NONE while that is not always the case.
With this commit we make the following changes:
- streamline using TextSearchInfo.NONE in all field types that don't support text queries
- standardize the exception being thrown when a field type does not support term queries to be IllegalArgumentException. Note that this is not a breaking change as both exceptions previously returned translated to 400 status code.
- Adapt the MatchQuery logic to skip fields that don't support term queries. There is no need to call termQuery passing an empty string and catch exceptions potentially thrown. We can rather check the TextSearchInfo which tells already whether the field supports text queries or not.
- add a test method to MapperTestCase that verifies the consistency of a field type by verifying that it is not searchable whenever it uses TextSearchInfo.NONE, while it is otherwise. This is what triggered all of the above changes.
This PR adds factory methods for the most common implementations:
* `SourceValueFetcher.identity` to pass through the source value untouched.
* `SourceValueFetcher.toString` to simply convert the source value to a string.
When constructing a value fetcher, the 'parsesArrayValue' flag must match
`FieldMapper#parsesArrayValue`. However there is nothing in code or tests to
help enforce this.
This PR reworks the value fetcher constructors so that `parsesArrayValue` is
'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must
explicitly set it to true and ensure the behavior is covered by tests.
Follow-up to #62974.
For runtime fields, we will want to do all search-time interaction with
a field definition via a MappedFieldType, rather than a FieldMapper, to
avoid interfering with the logic of document parsing. Currently, fetching
values for runtime scripts and for building top hits responses need to
call a method on FieldMapper. This commit moves this method to
MappedFieldType, incidentally simplifying the current call sites and freeing
us up to implement runtime fields as pure MappedFieldType objects.
This commit adds a mechanism to MapperTestCase that allows implementing
test classes to check that their parameters can be updated, or throw conflict
errors as advertised. Child classes override the registerParameters method
and tell the passed-in UpdateChecker class about their parameters. Simple
conflicts can be checked, using the existing minimal mappings as a base to
compare against, or alternatively a particular initial mapping can be provided
to check edge cases (eg, norms can be updated from true to false, but not
vice versa). Updates are registered with a predicate that checks that the update
has in fact been applied to the resulting FieldMapper.
Fixes#61631
To better align the plugin naming with other mapper plugins under x-pack (e.g.
mapper-flattened) this PR changes the plugin name and the containing directory
to "mapper-version"