elasticsearch

Commit Graph

Author	SHA1	Message	Date
Armin Braun	096b8ccc26	Fix TextFieldMapper Retaining a Reference to its Builder (#77251 ) Fixes the text field mapper and the analyzers class, which also retained heavyweight parameter references. Makes `TextFieldMapper` take hundreds of bytes rather than multiple kB per instance. closes #73845	2021-09-03 18:44:11 +02:00
Armin Braun	38faeefd85	Fix MatchOnlyTextFieldMapper Retaining a Reference to its Builder (#77201 ) Just like #77131 but for the `MatchOnlyTextFieldMapper`. Also, cleaned up a few other minor things in it to make the constructor code for this class easier to follow.	2021-09-03 10:43:40 +02:00
Nik Everett	429beba517	Centralize doc values checking (#77089 ) This adds two utility methods to validate the parameters to the `docValueFormat` method and replaces a pile of copy and pasted code with calls to them. They just emit a standard error message if any unsupported parameters are provided.	2021-09-01 09:06:24 -04:00
Christos Soulios	707dd497e4	Add multiple validators to Parameters (#77073 ) This PR implements support for multiple validators to a FieldMapper.Parameter. The Parameter#setValidator method was replaced by Parameter#addValidator that can be called multiple times to add validation to a parameter. All validators of a parameter will be executed in the same order as they have been added and if any of them fails all validation will fail.	2021-08-31 21:28:14 +03:00
Rene Groeschke	35ec6f348c	Introduce simple public yaml-rest-test plugin (#76554 ) This introduces a basic public yaml rest test plugin that is supposed to be used by external elasticsearch plugin authors. This is driven by #76215 - Rename yaml-rest-test to intern-yaml-rest-test - Use public yaml plugin in example plugins Co-authored-by: Mark Vieira <portugee@gmail.com>	2021-08-31 08:45:52 +02:00
Luca Cavanna	c6641bf00c	Rename ParseContext to DocumentParserContext (#74963 ) ParseContext is used to parse documents. It was easily confused with ParserContext (now renamed to MappingParserContext) which is instead used to parse mappings. To remove any confusion, this commit renames ParseContext to DocumentParserContext and adapts its subclasses accordingly.	2021-07-06 09:15:59 -04:00
Christos Soulios	df941367df	Add dimension mapping parameter (#74450 ) Added the dimension parameter to the following field types: keyword, ip, and the numeric field types (integer, long, byte, short). The dimension parameter is of type boolean (default: false) and is used to mark that a field is a time series dimension field. Relates to #74014	2021-06-24 20:16:27 +03:00
Luca Cavanna	7cedc3ec3a	Make Document a top-level class (#74472 ) There is no reason for Document to be an inner class of ParseContext, especially as it is public and accessed directly from many different places. This commit takes it out to its own top-level class file, which has the advantage of simplifying ParseContext which could use some love too.	2021-06-24 10:56:30 +02:00
Adrien Grand	a90ca8dfae	Prevent Lucene from automatically flushing in SourceIntervalsSourceTests. (#73990 ) Closes #72130	2021-06-10 13:44:28 +02:00
Ryan Ernst	ab1a2e4a84	Add precommit task for detecting split packages (#73784 ) Modularization of the JDK has been ongoing for several years. Recently in Java 16 the JDK began enforcing module boundaries by default. While Elasticsearch does not yet use the module system directly, there are some side effects even for those projects not modularized (eg #73517). Before we can even begin to think about how to modularize, we must Prepare The Way by enforcing packages only exist in a single jar file, since the module system does not allow packages to coexist in multiple modules. This commit adds a precommit check to the build which detects split packages. The expectation is that we will add the existing split packages to the ignore list so that any new classes will not exacerbate the problem, and the work to cleanup these split packages can be parallelized. relates #73525	2021-06-08 15:04:23 -07:00
Ryan Ernst	63012c8a40	Move ParseField to o.e.c.xcontent (#73923 ) ParseField is part of the x-content lib, yet it doesn't exist under the same root package as the rest of the lib. This commit moves the class to the appropriate package. relates #73784	2021-06-08 13:32:14 -07:00
Alan Woodward	009f23e7a9	Explicitly say if stored fields aren't supported in MapperTestCase (#72474 ) MapperTestCase has a check that if a field mapper supports stored fields, those stored fields are available to index time scripts. Many of our mappers do not support stored fields, and we try and catch this with an assumeFalse so that those mappers do not run this test. However, this test is fragile - it does not work for mappers created with an index version below 8.0, and it misses mappers that always store their values, e.g. match_only_text. This commit adds a new supportsStoredField method to MapperTestCase, and overrides it for those mappers that do not support storing values. It also adds a minimalStoredMapping method that defaults to the minimal mapping plus a store parameter, which is overridden by match_only_text because storing is not configurable and always available on this mapper.	2021-04-30 08:59:56 +01:00
Adrien Grand	1f74b2072f	Enable mixed-version cluster tests for `match_only_text`. (#72102 ) These tests can be enabled now that the new field type has been backported to 7.14.	2021-04-29 16:41:39 +02:00
Alan Woodward	b27eaa38dc	Remove 'external values', and replace with swapped out XContentParsers (#72203 ) The majority of field mappers read a single value from their positioned XContentParser, and do not need to call nextToken. There is a general assumption that the same holds for any multifields defined on them, and so the XContentParser is passed down to their multifields builder as-is. This assumption does not hold for mappers that accept json objects, and so we have a second mechanism for passing values around called 'external values', where a mapper can set a specific value on its context and child mappers can then check for these external values before reading from xcontent. The disadvantage of this is that every field mapper now needs to check its context for external values. Because the values are defined by their java class, we can also know that in the vast majority of cases this functionality is unused. We have only two mappers that actually make use of this, CompletionFieldMapper and GeoPointFieldMapper. This commit removes external values entirely, and replaces it with the ability to pass a modified XContentParser to multifields. FieldMappers can just check the parser attached to their context for data and don't need to worry about multiple sources. Plugins implementing field mappers will need to take the removal of external values into account. Implementations that are passing structured objects as external values should instead use ParseContext.switchParser and wrap the objects using MapXContentParser.wrapObject(). GeoPointFieldMapper passes on a fake parser that just wraps its input data formatted as a geohash; CompletionFieldMapper has a slightly more complicated parser that in general wraps its metadata, but if textOrNull() is called without the parser being advanced just returns its text input. Relates to #56063	2021-04-29 09:17:18 +01:00
Alan Woodward	e002aa809b	Make FieldNamesFieldMapper responsible for adding its own doc fields (#71929 ) The FieldNamesFieldMapper is a metadata mapper defining a field that can be used for exists queries if a mapper does not use doc values or norms. Currently, data is added to it via a special method on FieldMapper that pulls the metadata mapper from a mapping lookup, checks to see if it is enabled, and then adds the relevant value to a lucene document. This is one of only two places that pulls a metadata mapper from the MappingLookup, and it would be nice to remove this method. This commit refactors field name handling by instead storing the names of fields to index in the fieldnames field in a set on the ParseContext, and then building the field itself in FieldNamesFieldMapper.postParse(). This means that all of the responsibility for enabling indexing, etc, is handled within the metadata mapper itself.	2021-04-27 16:03:46 +01:00
Adrien Grand	314574026a	Make sure there are no merges. Closes #72130	2021-04-23 14:17:46 +02:00
Luca Cavanna	1d514c53cb	Remove MapperService#parse method (#72080 ) We have recently split DocumentMapper creation from parsing Mapping. There was one method leftover that exposed parsing mapping into DocumentMapper, which is generally not needed. Either you only need to parse into a Mapping instance, which is more lightweight, or like in some tests you need to apply a mapping update for which you merge new mappings and get the resulting document mapper. This commit addresses this and removes the method.	2021-04-22 16:08:34 +02:00
Adrien Grand	83113ec8d3	Add `match_only_text`, a space-efficient variant of `text`. (#66172 ) This adds a new `match_only_text` field, which indexes the same data as a `text` field that has `index_options: docs` and `norms: false` and uses the `_source` for positional queries like `match_phrase`. Unlike `text`, this field doesn't support scoring.	2021-04-22 08:41:47 +02:00
Alan Woodward	72f9c4c122	Add null-field checks to shape field mappers (#71999 ) #71696 introduced a regression to the various shape field mappers, where they would no longer handle null values. This commit fixes that regression and adds a testNullValues method to MapperTestCase to ensure that all field mappers correctly handle nulls. Fixes #71874	2021-04-21 15:54:22 +01:00
Luca Cavanna	1469e18c98	Add support for script parameter to boolean field mapper (#71454 ) Relates to #68984	2021-04-12 10:04:12 +02:00
Jake Landis	279fde375e	Apply REST API compatibility testing for the :modules (#71137 )	2021-04-02 11:20:54 -05:00
Alan Woodward	1653f2fe91	Add script parameter to long and double field mappers (#69531 ) This commit adds a script parameter to long and double fields that makes it possible to calculate a value for these fields at index time. It uses the same script context as the equivalent runtime fields, and allows for multiple index-time scripted fields to cross-refer while still checking for indirection loops.	2021-03-31 11:14:11 +01:00
Mark Vieira	6339691fe3	Consolidate REST API specifications and publish under Apache 2.0 license (#70036 )	2021-03-26 16:20:14 -07:00
Nik Everett	91c700bd99	Super randomized tests for fetch fields API (#70278 ) We've had a few bugs in the fields API where it doesn't behave like we'd expect. Typically this happens because it isn't obvious what we expect. So we'll try and use randomized testing to ferret out what we want. This adds a test for most field types that asserts that `fields` works similarly to `docvalues_fields`. We expect this to be true for most fields. It does so by forcing all subclasses of `MapperTestCase` to define a method that makes random values. It declares a few other hooks that subclasses can override to further randomize the test. We skip the test for a few field types that don't have doc values: * `annotated_text` * `completion` * `search_as_you_type` * `text` We should come up with some way to test these without doc values, even if it isn't as nice. But that is a problem for another time, I think. We skip the test for a few more types just because I wanted to cut this PR in half so we could get to reviewing it earlier. We'll get to those in a follow up change. I've filed a few bugs for things that are inconsistent with `docvalues_fields`. Typically that means that we have to limit the random values that we generate to those that do round trip properly.	2021-03-24 14:16:27 -04:00
Mayya Sharipova	1de0b616eb	Add positive_score_impact to rank_features type (#69994 ) rank_features field type misses positive_score_impact parameter that rank_feature type has. This adds this parameter. Closes #68619	2021-03-10 14:55:54 -05:00
Alan Woodward	8fba6e4a6d	Handle ignored fields directly in SourceValueFetcher (#68738 ) Currently, the value fetcher framework handles ignored fields by reading the stored values of the _ignored metadata field, and passing these through on calls to fetchValues(). However, this means that if a document has multiple values indexed for a field, and one malformed value, then the fields API will ignore everything, including the valid values, and return an empty list for this document. If a document source contains a malformed value, then it must have been ignored at index time. Therefore, we can safely assume that if we get an exception parsing values from source at fetch time, they were also ignored at index time and they can be skipped. This commit moves this exception handling directly into SourceValueFetcher and ArraySourceValueFetcher, removing the need to inspect the _ignored metadata and fixing the case of mixed valid and invalid values.	2021-02-16 15:19:15 +00:00
Alan Woodward	dbff7bea37	Rename DocValueFetcher.Leaf to FormattedDocValues (#68818 ) Also moves it to a top-level interface in fielddata. It is not only used by DocValueFetcher any more, and Leaf does not really describe what it does or what it provides.	2021-02-15 10:03:25 +00:00
Christoph Büscher	e2d5183af0	Return structured nested data in ‘fields’ API At the moment, the ‘fields’ API handles nested fields the same way it handles non-nested object arrays: it just returns them in a flat list. However, the relationship between nested fields is something we should try to preserve, since this is the main purpose of mapping something as “nested” instead of just using an object. This PR changes this by returning grouped field values that are inside a nested object according to the nested object they initially appear in. Any further object structures inside a nested object are again returned as a flattened list. Fields inside nested fields don’t appear in the flattened response outside of the nested path any more. The grouping of fields inside nested objects is applied recursively if nested mappings are defined inside another nested mapping. Closes #63709	2021-02-05 11:05:03 +01:00
Mark Vieira	a92a647b9f	Update sources with new SSPL+Elastic-2.0 license headers As per the new licensing change for Elasticsearch and Kibana this commit moves existing Apache 2.0 licensed source code to the new dual license SSPL+Elastic license 2.0. In addition, existing x-pack code now uses the new version 2.0 of the Elastic license. Full changes include: - Updating LICENSE and NOTICE files throughout the code base, as well as those packaged in our published artifacts - Update IDE integration to now use the new license header on newly created source files - Remove references to the "OSS" distribution from our documentation - Update build time verification checks to no longer allow Apache 2.0 license header in Elasticsearch source code - Replace all existing Apache 2.0 license headers for non-xpack code with updated header (vendored code with Apache 2.0 headers obviously remains the same). - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.	2021-02-02 16:10:53 -08:00
Mayya Sharipova	76482210b8	Add linear function to rank_feature query (#67438 ) This adds a linear function to the set of functions available for rank_feature query Closes #49859	2021-01-18 11:44:13 -05:00
gf2121	92f85981a7	Avoid duplicate serialization for TermsQueryBuilder (#67223 ) Avoid duplicate serialization for TermsQuery.	2021-01-18 09:04:29 +01:00
Julie Tibshirani	5852fbedf5	Rename QueryShardContext -> SearchExecutionContext. (#67490 ) We decided to rename `QueryShardContext` to clarify that it supports all parts of search request execution. Before there was confusion over whether it should only be used for building queries, or maybe only used in the query phase. This PR also updates the javadocs. Closes #64740.	2021-01-14 09:11:59 -08:00
Jim Ferenczi	6d1f43c6d2	Fix search_as_you_type field with term_vector (#66432 ) This commit fixes a bug in the search_as_you_type field that was introduced during the refactoring of the field mapper. The prefix field that is used internally by the search_as_you_type mapper doesn't need term vector even if they are activated on the main field. So this commit ensures that we don't copy the options from the main field when we create the prefix sub-field. Closes #66407	2020-12-16 17:04:51 +01:00
Jake Landis	c35d7c0f5a	Convert module/mappers-extra to an internal cluster test (#65971 ) modules/mappers-extra should be an internal cluster, not a javaRestTest. This test will work correctly until you try to modify the javaRestTest test cluster. Then it will treat the javaRestTest as an external cluster to its own test cluster, potentially causing issues with tests.	2020-12-08 07:59:27 -06:00
Alan Woodward	1a8ce8716d	Restore use of default search and search_quote analyzers (#65491 ) In the refactoring of TextFieldMapper, we lost the ability to define a default search or search_quote analyzer in index settings. This commit restores that ability, and adds some more comprehensive testing. Fixes #65434	2020-11-26 16:57:45 +00:00
Julie Tibshirani	f4a462d05e	Simplify how source is passed to fetch subphases. (#65292 ) This PR simplifies how the document source is passed to each fetch subphase. A summary of the strategy: * For each document, we try to eagerly load the source and store it on `HitContext`. Most subphases that access source, like source filtering and highlighting, use `HitContext`. For nested hits, we filter the parent source and also store this source on `HitContext`. * Only for non-nested documents, we also store the loaded source on `QueryShardContext#lookup`. This allows subphases that access source through `SearchLookup` to use the pre-loaded source when possible. This is now a common occurrence, since runtime fields are supported in the 'fields' option and may soon be supported in highlighting. There is no longer a special `SearchLookup` just for the fetch phase. This was not necessary and was mostly caused by a misunderstanding of how `QueryShardContext` should be used. Addresses #62511.	2020-11-20 14:09:41 -08:00
Alan Woodward	9543d3b432	Fix UOE when building exists query for nested search-as-you-type field (#64630 ) PrefixFieldType can use the default existsQuery() implementation. Fixes #64609	2020-11-05 12:08:39 +00:00
Alan Woodward	0fd70ae383	Remove Mapper.BuilderContext (#64625 ) Mapper.BuilderContext is a simple wrapper around two objects, some IndexSettings and a ContentPath. The IndexSettings are the same as those provided in the ParserContext, so we can simplify things here by removing them and just passing ContentPath directly to Mapper.Builder#build()	2020-11-05 10:48:39 +00:00
Alan Woodward	f010269ab7	Move index analyzer management to FieldMapper/MapperService (#63937 ) Index-time analyzers are currently specified on the MappedFieldType. This has a number of unfortunate consequences; for example, field mappers that index data into implementation sub-fields, such as prefix or phrase accelerators on text fields, need to expose these sub-fields as MappedFieldTypes, which means that they then appear in field caps, are externally searchable, etc. It also adds index-time logic to a class that should only be concerned with search-time behaviour. This commit removes references to the index analyzer from MappedFieldType. Instead, FieldMappers that use the terms index can pass either a single analyzer or a Map of fields to analyzers to their super constructor, which are then exposed via a new FieldMapper#indexAnalyzers() method; all index-time analysis is mediated through the delegating analyzer wrapper on MapperService. In a follow-up, this will make it possible to register multiple field analyzers from a single FieldMapper, removing the need for 'hidden' mapper implementations on text field, parent joins, and elsewhere.	2020-11-04 13:53:09 +00:00
Luca Cavanna	344ad33a16	Remove ValueFetcher dependency from MapperService (#64524 ) The signature of MappedFieldType#valueFetcher requires MapperService as an argument, which is unfortunate as that is one of the reasons why FetchContext exposes the whole MapperService. Such use of MapperService can be replaced with exposing the QueryShardContext which encapsulates the MapperService.	2020-11-04 12:08:34 +01:00
Alan Woodward	a5168572d5	Collapse ParametrizedFieldMapper into FieldMapper (#64365 ) Now that all our FieldMapper implementations extend ParametrizedFieldMapper, we can collapse the two classes together, and remove a load of cruft from FieldMapper that is unused. In particular: * we no longer need the lucene FieldType field on FieldMapper * we no longer use clone() for merging, so we can remove it from all impls * the serialization code in FieldMapper that assumes we're looking at text fields can go	2020-11-02 15:07:52 +00:00
Alan Woodward	4191c72baf	Distinguish between simple matches with and without the terms index (#63945 ) We currently use TextSearchInfo to let query parsers know when a field will support match queries. Some field types (numeric, constant, range) can produce simple match queries that don't use the terms index, and it is useful to distinguish between these fields on the one hand and keyword/text-type fields on the other. In particular, the SignificantTextAggregation only works on fields that have indexed terms, but there is currently no efficient way to see this at search time and so the factory falls back on checking to see if an index analyzer has been defined, with the result that some nonsensical field types are permitted. This commit adds a new static TextSearchInfo implementation called SIMPLE_MATCH_WITHOUT_TERMS that can be returned by field types with no corresponding terms index. It changes significant text to check for this rather than for the presence of an index analyzer. This is a breaking change, in that the significant text agg will now throw an error up-front if you try and apply it to a numeric field, whereas before you would get an empty result.	2020-10-27 12:07:51 +00:00
Luca Cavanna	b96f26eba2	Remove documentMapperParser method from MapperService (#63938 ) MapperService allows to retrieve its internal DocumentMapperParser instance. Such method is only used in tests, and always to parse mappings which is already exposed by MapperService through a specific parse method. This commit removes the getter for DocumentMapperParser from MapperService in favour of calling MapperService#parse	2020-10-20 20:11:29 +02:00
Alan Woodward	8b98af24b4	Remove generics from Mapper.Builder (#63623 ) We simplified the generics on Mapper.Builder in #56747, but stopped short of removing them entirely because they were still used in various places in the code. Now that most field mappers have been converted to parametrized form, these generics are no longer useful. There are very few places where a fluent Builder pattern is used, almost all in tests, and these can all be replaced with simple casts; in exchange, we remove lots of visual cruft and clean up a number of warnings.	2020-10-13 17:24:10 +01:00
Luca Cavanna	f491422e1e	Ensure field types consistency on supporting text queries (#63487 ) Some supported field types don't support term queries, and throw exception in their termQuery method. That exception is either an IllegalArgumentException or a QueryShardException. There is logic in MatchQuery that skips the field or not depending on the exception that is thrown. Also, such field types should hold a TextSearchInfo.NONE while that is not always the case. With this commit we make the following changes: - streamline using TextSearchInfo.NONE in all field types that don't support text queries - standardize the exception being thrown when a field type does not support term queries to be IllegalArgumentException. Note that this is not a breaking change as both exceptions previously returned translated to 400 status code. - Adapt the MatchQuery logic to skip fields that don't support term queries. There is no need to call termQuery passing an empty string and catch exceptions potentially thrown. We can rather check the TextSearchInfo which tells already whether the field supports text queries or not. - add a test method to MapperTestCase that verifies the consistency of a field type by verifying that it is not searchable whenever it uses TextSearchInfo.NONE, while it is otherwise. This is what triggered all of the above changes.	2020-10-13 11:05:43 +02:00
Julie Tibshirani	62857b49d1	Add support for missing value fetchers. (#63515 ) This PR implements value fetching for the following field types: * `text` phrase and prefix subfields * `search_as_you_type`, plus its subfields * `token_count`, which is implemented by fetching doc values Supporting these types helps ensure that retrieving all fields through `"fields": ["*"]` doesn't fail because of unsupported value fetchers.	2020-10-12 13:57:29 -07:00
Julie Tibshirani	8c56bbc3e6	Add factory methods for common value fetchers. (#63438 ) This PR adds factory methods for the most common implementations: * `SourceValueFetcher.identity` to pass through the source value untouched. * `SourceValueFetcher.toString` to simply convert the source value to a string.	2020-10-08 11:58:36 -07:00
Luca Cavanna	95582da9a5	Rename QueryShardContext#fieldMapper to getFieldType (#63399 ) Given that we have a class called `FieldMapper` and that the `fieldMapper` method exposed by `QueryShardContext` actually allows to get a `MappedFieldType` given its name, this commit renames such method to `getFieldType`	2020-10-07 16:11:53 +02:00
Julie Tibshirani	cc09b6b6a0	Make array value parsing flag more robust. (#63354 ) When constructing a value fetcher, the 'parsesArrayValue' flag must match `FieldMapper#parsesArrayValue`. However there is nothing in code or tests to help enforce this. This PR reworks the value fetcher constructors so that `parsesArrayValue` is 'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must explicitly set it to true and ensure the behavior is covered by tests. Follow-up to #62974.	2020-10-06 14:42:03 -07:00
Luca Cavanna	ac93ca1819	Remove MapperService argument from IndexFieldData.Builder#build (#63197 ) MapperService carries a lot of weight and is only used to determine if loading of field data for the id field is enabled, which can be done in a different way. There was another usage that recently went away with the removal of `TypeFieldMapper`.	2020-10-05 11:45:31 +02:00

169 Commits