elasticsearch

Commit Graph

Author	SHA1	Message	Date
Alan Woodward	ce649d07d7	Move FieldMapper#valueFetcher to MappedFieldType (#62974 ) For runtime fields, we will want to do all search-time interaction with a field definition via a MappedFieldType, rather than a FieldMapper, to avoid interfering with the logic of document parsing. Currently, fetching values for runtime scripts and for building top hits responses need to call a method on FieldMapper. This commit moves this method to MappedFieldType, incidentally simplifying the current call sites and freeing us up to implement runtime fields as pure MappedFieldType objects.	2020-10-04 10:47:04 +01:00
Alan Woodward	be3357310a	Convert all FieldMappers in mapper-extras to parametrized form (#62938 ) This converts RankFeatureFieldMapper, RankFeaturesFieldMapper, SearchAsYouTypeFieldMapper and TokenCountFieldMapper to parametrized forms. It also adds a TextParams utility class to core containing functions that help declare text parameters - mainly shared between SearchAsYouTypeFieldMapper and KeywordFieldMapper at the moment, but it will come in handy when we convert TextFieldMapper and friends. Relates to #62988	2020-09-29 18:14:28 +01:00
Alan Woodward	118fa77a31	Add parameter update and conflict tests to MapperTestCase (#62828 ) This commit adds a mechanism to MapperTestCase that allows implementing test classes to check that their parameters can be updated, or throw conflict errors as advertised. Child classes override the registerParameters method and tell the passed-in UpdateChecker class about their parameters. Simple conflicts can be checked, using the existing minimal mappings as a base to compare against, or alternatively a particular initial mapping can be provided to check edge cases (eg, norms can be updated from true to false, but not vice versa). Updates are registered with a predicate that checks that the update has in fact been applied to the resulting FieldMapper. Fixes #61631	2020-09-24 19:39:44 +01:00
Alan Woodward	b1d6d42a68	Remove mapping boost parameter entirely (#62639 ) Follow up to #62623, this commit removes support in 8x for index-time boosts. There is no longer a boost field on MappedFieldType. Indexes created in 8x and after will throw exceptions if a boost parameter is included in mappings, and indexes created in 7x will emit warnings.	2020-09-23 14:28:59 +01:00
Luca Cavanna	daade44174	Share same existsQuery impl throughout mappers (#57607 ) Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers. There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available. This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method. At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.	2020-09-23 08:58:09 +02:00
Luca Cavanna	3a9b65733c	Move stored flag from TextSearchInfo to MappedFieldType (#62717 )	2020-09-22 15:41:24 +02:00
Luca Cavanna	d669cb500f	Dense vector field type minor fixes (#62631 ) The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.	2020-09-19 00:27:17 +02:00
markharwood	fe9145fa5e	Search - add case insensitive flag for "term" family of queries (#61596 ) Adds case insensitive flag for term, prefix, and wildcard queries Closes #61546	2020-09-18 17:17:08 +01:00
Christos Soulios	55294e5c42	Allow metadata fields in the _source (#61590 ) * Configurable metadata field mappers in the _source * Changes to support metadata fields in _source Added test testDocumentContainsAllowedMetadataField() * Merged DocumentParserTests from master Fixed broken tests * Handle non string values * Allow metadata fields to parse values/objects/arrays/null * Removed MetadataFieldMapper.isAllowedInSource() method Delegated this functionality to MetadataFieldMapper.parse() * Fixed bug that caused tests to break * Cleanup parsing for existing metadata fields * Cleanup parsing for existing metadata fields * Remove doParse() method * Fix broken test * Lookup metadata mapper by name Instead of linear scan	2020-09-18 09:45:32 +03:00
Nik Everett	8a9028c169	Fix docvalue fetch for scaled floats (#62425 ) In #61995 I moved the `docvalue_field` fetch code into a place where I could share it with the fancy new `fields` fetch API. Specifically, runtime fields can use all that doc values code now. But I broke `scaled_floats` by switching how they are fetched from `double` to `string`. This adds the override you need to switch them back.	2020-09-15 20:34:54 -04:00
Nik Everett	9a127adb4b	Implement fields fetch for runtime fields (#61995 ) This implements the `fields` API in `_search` for runtime fields using doc values. Most of that implementation is stolen from the `docvalue_fields` fetch sub-phase, just moved into the same API that the `fields` API uses. At this point the `docvalue_fields` fetch phase looks like a special case of the `fields` API. While I was at it I moved the "which doc values sub-implementation should I use for fetching?" question from a bunch of `instanceof`s to a method on `LeafFieldData` so we can be much more flexible with what is returned and we're not forced to extend certain classes just to make the fetch phase happy. Relates to #59332	2020-09-15 15:57:26 -04:00
Adrien Grand	39bde05040	Upgrade to lucene-8.7.0-snapshot-cdfdc1e0851. (#62334 ) Upgrade to a new Lucene snapshot that (at least partially) addresses the indexing rate regression when index sorting is enabled.	2020-09-15 14:19:42 +02:00
Alan Woodward	3269d1b486	Add specific test for serializing all mapping parameter values (#61844 ) This commit adds a test to MapperTestCase that explicitly checks that a mapper can serialize all its default values, and that this serialization can then be re-parsed. Note that the test is disabled for non-parametrized mappers as their serialization may in some cases output parameters that are not accepted. Gradually moving all mappers to parametrized form will address this. The commit also contains a fix to keyword mappers, which were not correctly serializing the similarity parameter; this partially addresses #61563. It also enables `null` as a value for `null_value` on `scaled_float`, as a follow-up to #61798	2020-09-02 20:03:36 +01:00
Luca Cavanna	462e25f9bb	Pass SearchLookup supplier through to fielddataBuilder (#61430 ) Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not. To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method. As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors. With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition. Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch. This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>. Co-authored-by: Nik Everett <nik9000@gmail.com> Relates to #59332	2020-08-26 20:19:21 +02:00
Nik Everett	4d37d145aa	Migrate some more mapper test cases (#61507 ) Migrate some more mapper test cases from `ESSingleNodeTestCase` to `MapperTestCase`.	2020-08-25 11:04:40 -04:00
Alan Woodward	dbd4fd0254	Convert NumberFieldMapper to parametrized form (#61092 ) In addition, this commit converts ScaledFloatFieldMapper as it was relying on a number of static values taken from NumberFieldMapper that had changed or been removed.	2020-08-20 14:58:23 +01:00
Julie Tibshirani	5457b34343	Correct how field retrieval handles multifields and copy_to. (#61309 ) Before when a value was copied to a field through a parent field or `copy_to`, we parsed it using the `FieldMapper` from the source field. Instead we should parse it using the target `FieldMapper`. This ensures that we apply the appropriate mapping type and options to the copied value. To implement the fix cleanly, this PR refactors the value parsing strategy. Now instead of looking up values directly, field mappers produce a helper object `ValueFetcher`. The value fetchers are responsible for almost all aspects of fetching, including looking up the right paths in the _source. The PR is fairly big but each commit can be reviewed individually. Fixes #61033.	2020-08-19 16:50:27 -07:00
Nik Everett	622ac75297	Migrate some field mapper tests to ESTestCase (#61301 ) This switches a few tests for field mappers from `ESSingleNodeTestCase` to `ESTestCase` because, in general, we prefer to avoid `ESSingleNodeTestCase` when we can because it is slow and "big". "Big" here means that it pulls in an entire node, making it difficult to reason about what you are testing.	2020-08-19 11:56:55 -04:00
Alan Woodward	3a81b11073	Make MetadataFieldMapper extend ParametrizedFieldMapper (#59847 ) This commit cuts over all metadata field mappers to parametrized format.	2020-08-10 17:21:42 +01:00
Julie Tibshirani	f3403faf12	Remove IndexFieldData#clear since it is unused. (#60475 ) This method was never called. It also seemed tricky that calling a method on `IndexFieldData` could clear the contents of a shared cache.	2020-07-30 13:53:59 -07:00
Julie Tibshirani	8a89d95372	Add search `fields` parameter to support high-level field retrieval. (#60100 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-27 13:25:55 -07:00
Jake Landis	7dd57c9415	Introduce javaRestTest source set/task and convert modules (#59939 ) Introduce a javaRestTest source set and task to complement the yamlRestTest. javaRestTest differs in that the code is sourced from Java and may have different dependencies and setup requirements for the test clusters. This also allows the tests to run in parallel in different cluster instances to prevent any cross test contamination between the two types of tests. Also included in this PR: all :modules no longer use the integTest task. The tests are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest. Since only :modules (and :rest-api-spec) have been converted to yamlRestTest we can now disable the integTest task if either yamlRestTest or javaRestTest have been applied. Once all projects are converted, we can delete the integTest task. related: #56841 related: #59444	2020-07-21 17:17:17 -05:00
Nik Everett	98698f569d	Drop some params from IndexFieldData.Builder (#59934 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 08:29:58 -04:00
Nik Everett	6130ecc173	Small cleanup for IndexFieldData (#59724 ) This drops `IndexComponent` from `IndexFieldData` because it wasn't doing anything other than forcing us to perform a bunch of ceremony to build them.	2020-07-17 11:15:17 -04:00
Jake Landis	ddd882b835	Convert modules to use yamlRestTest (#59089 ) This commit moves the modules REST tests to the newly introduced yamlRestTest source set. A few tests have also been renamed to include the correct IT suffix. Without changing the names, the testing conventions task would fail, since the YAML tests are no longer present to pacify the convention. These tests have moved to the internalClusterTest source set. related: #56841	2020-07-13 11:32:42 -05:00
Alan Woodward	62f51eb9ae	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:01:29 +01:00
Jake Landis	333a5d8cdf	Create plugin for yamlTest task (#56841 ) This commit creates a new Gradle plugin to provide a separate task name and source set for running YAML based REST tests. The only project converted to use the new plugin in this PR is distribution/archives/integ-test-zip, for which the testing has been moved to :rest-api-spec, since it makes the most sense and it avoids a small but awkward change to the distribution plugin. The remaining cases in modules, plugins, and x-pack will be handled in followups. This plugin is distinctly different from the plugin introduced in #55896 since the YAML REST tests are intended to be black box tests over HTTP. As such they should not (by default) have access to the classpath for that which they are testing. The YAML based REST tests will be moved to separate source sets (yamlRestTest). Which source set is the target for the test resources depends on whether this new plugin is applied. If it is not applied, it will default to the test source set. Further, this introduces a breaking change for plugin developers that use the YAML testing framework. They will now need to either use the new source set and matching task, or configure the rest resources to use the old "test" source set that matches the old integTest task. (The former should be preferred.) As part of this change (which is also breaking for plugin developers) the rest resources plugin has been removed from the build plugin and now requires either explicit application or application via the new YAML REST test plugin. Plugin developers should be able to fix the breaking changes to the YAML tests by adding `apply plugin: 'elasticsearch.yaml-rest-test'` and moving the YAML tests under a yamlRestTest folder (instead of test).	2020-07-06 12:13:01 -05:00
Alan Woodward	3944066e99	Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58639 ) Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on the generic MappedFieldType.	2020-07-01 13:16:02 +01:00
Alan Woodward	83ce7a9691	Move MappedFieldType.similarity() to TextSearchInfo (#58439 ) Similarities only apply to a few text-based field types, but are currently set directly on the base MappedFieldType class. This commit moves similarity information into TextSearchInfo, and removes any mentions of it from MappedFieldType or FieldMapper. It was previously possible to include a similarity parameter on a number of field types that would then ignore this information. To make it obvious that this has no effect, setting this parameter on non-text field types now issues a deprecation warning.	2020-06-24 09:54:56 +01:00
Alan Woodward	57316e26af	Add text search information to MappedFieldType (#58230 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 13:37:49 +01:00
Alan Woodward	708f6bf879	Add serialization test for FieldMappers when include_defaults=true (#58235 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 14:34:06 +01:00
Alan Woodward	09ff747fe7	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:39:48 +01:00
Alan Woodward	3b696828ad	MappedFieldType should not extend FieldType (#57666 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-15 17:47:15 +01:00
Mayya Sharipova	b68bd78a53	Refactor how to determine if a field is metafield (#57378 ) Previously, to determine if a field is a meta-field, the static method MapperService isMetadataField was used. This method was using an outdated static list of meta-fields. This PR instead changes this method to an instance method that is also aware of meta-fields in all registered plugins. Related #38373, #41656 Closes #24422	2020-06-05 09:57:43 -04:00
Mark Tozzi	0a23487e73	IndexFieldData should hold the ValuesSourceType (#57373 )	2020-06-02 09:54:53 -04:00
Nik Everett	7382f446f7	Fix casting of scaled_float in sorts (#57207 ) Previously we'd get a `ClassCastException` when you tried to use `numeric_type` on `scaled_float`. Oops! This cleans up the CCE and moves some code around so the casting actually works.	2020-05-29 17:00:19 -04:00
Alan Woodward	fed71fbd66	Remove Mapper.updateFieldType() (#56986 ) When we had multiple mapping types, an update to a field in one type had to be propagated to the same field in all other types. This was done using the Mapper.updateFieldType() method, called at the end of a merge. However, now that we only have a single type per index, this method is unnecessary and can be removed. Relates to #41059	2020-05-26 13:06:13 +01:00
Alan Woodward	f82d74b501	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:32:08 +01:00
Alan Woodward	0cc2345f98	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:06:39 +01:00
Julie Tibshirani	7b34e22890	Use index sort range query when possible. (#56657 ) This PR proposes to use `IndexSortSortedNumericDocValuesRangeQuery` when possible to speed up certain range queries. Points-based queries are already very efficient, the only time this query makes a difference is when the range matches a large number of documents. Some notes: * The optimization is only applied for fields of type `date`, `integer`, and `long`. I found that the query implementation isn't yet suited for `double` or `float` types (I will follow up with a Lucene issue). * Before applying the query, we check that the index is sorted on the query field. This isn't strictly necessary, since the query itself checks this as part of its execution. But it seemed nice to avoid wrapping the query unnecessarily -- it makes debugging easier, like when reading search profile results. Below are benchmark results on the http-logs dataset. The following ranges were run against the `logs-241998` index: range-small (897633930, 897655999]: ~2M docs range-medium (897623930, 897655999]: ~5M docs range-large (897259801, 897503930]: ~21M docs ``` \| 50th percentile service time \| range-small \| 11.0228 \| 8.19478 \| -2.82797 \| ms \| \| 95th percentile service time \| range-small \| 11.8153 \| 9.06257 \| -2.75274 \| ms \| \| 50th percentile service time \| range-medium \| 22.8912 \| 7.23264 \| -15.6585 \| ms \| \| 95th percentile service time \| range-medium \| 25.0957 \| 7.93246 \| -17.1632 \| ms \| \| 50th percentile service time \| range-large \| 39.7224 \| 6.34589 \| -33.3765 \| ms \| \| 95th percentile service time \| range-large \| 43.9104 \| 7.06604 \| -36.8444 \| ms \| ``` Relates to #48665.	2020-05-13 11:34:54 -07:00
Mark Tozzi	954afd94fe	Clean up DocValuesIndexFieldData (#56372 )	2020-05-13 10:09:38 -04:00
Julie Tibshirani	7a5d18ddc3	Simplify signature of FieldMapper#parseCreateField. (#56066 ) `FieldMapper#parseCreateField` accepts the parse context, plus a list of fields as an output parameter. These fields are immediately added to the document through `ParseContext#doc()`. This commit simplifies the signature by removing the list of fields, and having the mappers add the fields directly to `ParseContext#doc()`. I think this is nicer for implementors, because previously fields could be added either through the list, or the context (through `add`, `addWithKey`, etc.)	2020-05-04 11:18:34 -07:00
William Brafford	38cd668ad0	Remove deprecated third-party methods from tests (#55255 ) I've noticed that a lot of our tests are using deprecated static methods from the Hamcrest matchers. While this is not a big deal in any objective sense, it seems like a small good thing to reduce compilation warnings and be ready for a new release of the matcher library if we need to upgrade. I've also switched a few other methods in tests that have drop-in replacements.	2020-04-15 16:31:51 -04:00
Tal Levy	cf9603c6fd	Create new `geo` module and migrate geo_shape registration (#53562 ) This commit introduces a new `geo` module that is intended to contain all the geo-spatial-specific features in server. As a first step, the responsibility of registering the geo_shape field mapper is moved to this module. Co-authored-by: Nicholas Knize <nknize@gmail.com>	2020-04-07 12:27:29 -07:00
Christoph Büscher	f5759bb209	Rename field name constants in AbstractBuilderTestCase (#53234 ) Some field name constants were not updated when we moved from "string" to "text" and "keyword" fields. Renaming them makes it easier and faster to know which field type is used in tests subclassing this base test case.	2020-04-03 16:00:46 +02:00
Jason Tedor	95a7eed9aa	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 15:52:01 -04:00
Mark Tozzi	a90c1de874	Add ValuesSource Registry and associated logic (#54281 ) * Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638) * ValuesSourceRegistry Prototype (#48758) * Remove generics from ValuesSource related classes (#49606) * fix percentile aggregation tests (#50712) * Basic thread safety for ValuesSourceRegistry (#50340) * Remove target value type from ValuesSourceAggregationBuilder (#49943) * Cleanup default values source type (#50992) * CoreValuesSourceType no longer implements Writable (#51276) * Remove generics & hard coded ValuesSource references from Matrix Stats (#51131) * Put values source types on fields (#51503) * Remove VST Any (#51539) * Rewire terms agg to use new VS registry (#51182) Also adds some basic AggTestCases for untested code paths (and boilerplate for future tests once the IT are converted over) * Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337) * Wire Percentiles aggregator into new VS framework (#51639) This required a bit of a refactor to percentiles itself. Before, the Builder would switch on the chosen algo to generate an algo-specific factory. This doesn't work (or at least, would be difficult) in the new VS framework. This refactor consolidates both factories together and introduces a PercentilesConfig object to act as a standardized way to pass algo-specific parameters through the factory. This object is then used when deciding which kind of aggregator to create Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will be moved in a subsequent PR. * Remove generics and target value type from MultiVSAB (#51647) * fix checkstyle after merge (#52008) * Plumb ValuesSourceRegistry through to QuerySearchContext (#51710) * Convert RareTerms to new VS registry (#52166) * Wire up Value Count (#52225) * Wire up Max & Min aggregations (#52219) * ValuesSource refactoring: Wire up Sum aggregation (#52571) * ValuesSource refactoring: Wire up SigTerms aggregation (#52590) * Soft immutability for VSConfig (#52729) * Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734) Also fixes Percentiles which was incorrectly specified to only accept numeric, but in fact also accepts Boolean and Date (because those are numeric on master - thanks `testSupportedFieldTypes` for catching it!) * VS refactoring: Wire up stats aggregation (#52891) * ValuesSource refactoring: Wire up string_stats aggregation (#52875) * VS refactoring: Wire up median (MAD) aggregation (#52945) * fix valuesourcetype issue with constant_keyword field (#53041) this commit implements `getValuesSourceType` for the ConstantKeyword field type. master was merged into feature/extensible-values-source introducing a new field type that was not implementing `getValuesSourceType`. * ValuesSource refactoring: Wire up Avg aggregation (#52752) * Wire PercentileRanks aggregator into new VS framework (#51693) * Add a VSConfig resolver for aggregations not using the registry (#53038) * Vs refactor wire up ranges and date ranges (#52918) * Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034) This commit updates the geo_bounds aggregation to depend on registering itself in the ValuesSourceRegistry relates #42949. * VS refactoring: convert Boxplot to new registry (#53132) * Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037) This commit updates the geo_grid aggregations to depend on registering itself in the ValuesSourceRegistry relates to the values-source refactoring meta issue #42949. * Wire-up geo_centroid agg to ValuesSourceRegistry (#53040) This commit updates the geo_centroid aggregation to depend on registering itself in the ValuesSourceRegistry. relates to the values-source refactoring meta issue #42949. * Fix type tests for Missing aggregation (#53501) * ValuesSource Refactor: move histo VSType into XPack module (#53298) - Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core. - This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept - Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs * Wire up DateHistogram to the ValuesSourceRegistry (#53484) * Vs refactor parser cleanup (#53198) Co-authored-by: Zachary Tong <polyfractal@elastic.co> Co-authored-by: Zachary Tong <zach@elastic.co> Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com> Co-authored-by: Tal Levy <JubBoy333@gmail.com>	2020-03-26 15:01:07 -04:00
Jake Landis	afc2383b72	Optimize which Rest resources are used by the Rest tests. (#53299 ) This should help with Gradle's incremental compile such that projects only depend upon the resources they use. related #52114	2020-03-18 09:09:29 -05:00
Alan Woodward	3e607d9e93	Rename AtomicFieldData to LeafFieldData (#53554 ) This conforms with lucene's LeafReader naming convention, and matches other per-segment structures in elasticsearch.	2020-03-17 12:25:51 +00:00
Nik Everett	f4223b6a8f	Add size support to `top_metrics` (#52662 ) This adds support for returning the top "n" metrics instead of just the very top. Relates to #51813	2020-02-27 11:14:57 -05:00
Zachary Tong	f05b831e43	Comprehensively test supported/unsupported field type:agg combinations (#52493 ) This adds a test to AggregatorTestCase that allows us to programmatically verify that an aggregator supports or does not support a particular field type. It fetches the list of registered field type parsers, creates a MappedFieldType from the parser and then attempts to run a basic agg against the field. A supplied list of supported VSTypes is then compared against the output (success or exception) and succeeds or fails the test accordingly. Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com> * Skip fields that are not aggregatable	2020-02-20 14:08:25 -05:00
markharwood	cbd224d070	Upgrade Lucene 8.5 to latest snapshot (#52520 ) Upgrade Lucene 8.5 to latest snapshot	2020-02-20 10:34:41 +00:00
Nik Everett	5b2266601b	Implement top_metrics agg (#51155 ) The `top_metrics` agg is kind of like `top_hits` but it only works on doc values so it should be faster. At this point it is fairly limited in that it only supports a single, numeric sort and a single, numeric metric. And it only fetches the "very topest" document worth of metric. We plan to support returning a configurable number of top metrics, requesting more than one metric and more than one sort. And, eventually, non-numeric sorts and metrics. The trick is doing those things fairly efficiently. Co-Authored by: Zachary Tong <zach@elastic.co>	2020-02-14 07:13:52 -05:00
Marios Trivyzas	a8b39ed842	Add a cluster setting to disallow expensive queries (#51385 ) Add a new cluster setting `search.allow_expensive_queries` which by default is `true`. If set to `false`, certain queries that have usually slow performance cannot be executed and an error message is returned. - Queries that need to do linear scans to identify matches: - Script queries - Queries that have a high up-front cost: - Fuzzy queries - Regexp queries - Prefix queries (without index_prefixes enabled) - Wildcard queries - Range queries on text and keyword fields - Joining queries - HasParent queries - HasChild queries - ParentId queries - Nested queries - Queries on deprecated 6.x geo shapes (using PrefixTree implementation) - Queries that may have a high per-document cost: - Script score queries - Percolate queries Closes: #29050	2020-02-12 18:06:04 +01:00
Julie Tibshirani	e0b3ea0416	Rename MapperService#fullName to fieldType. (#52025 ) The new name more accurately describes what the method returns.	2020-02-07 10:16:53 -08:00
Alan Woodward	573c7ddab1	Remove fieldMapper parameter from MetadataFieldMapper.TypeParser#getDefault() (#51219 ) This addresses a very old TODO comment in MetadataFieldMapper.TypeParser; passing in a previously constructed field mapper here was a hack in order to provide access to prebuilt analyzers for the AllFieldType; this has now been removed, so we can remove the parameter from this method signature.	2020-01-21 09:18:05 +00:00
Alan Woodward	3d79624843	Revert "Don't use user-supplied type when building DocumentMapper (#50960 )" (#51214 ) Reverts #50960 This commit has been causing test failures during upgrade tests: specifically, an upgraded node becomes master and sends a cluster state update to a 7.x node; this node sees that the mapping version of its .tasks index is the same as the master, so asserts that the serialized mappings are the same; however, because the master has rewritten the mapping to use `_doc` instead of `tasks`, we get an assertion failure. The logical fix is for the master to increment its mapping version when it rewrites the mapping, but there isn't a simple way to do that currently. This reverts commit `774bfb5e22`.	2020-01-20 11:14:49 +00:00
Alan Woodward	774bfb5e22	Don't use user-supplied type when building DocumentMapper (#50960 ) This commit begins the process of removing types from the document parsing infrastructure. Initially, we just ignore the user-supplied type after it has been removed from the mapping json structure, and always supply _doc as the name of the root parser. The production code change is very small here, and most of the changeset consists of alterations to Mapper test code that was passing in non-standard type names and checking serialization. Relates to #41059	2020-01-14 15:15:19 +00:00
Alan Woodward	807a4fb996	Remove type parameter from PutMappingRequest.buildFromSimplifiedDef() (#50844 ) Mappings built by this method should all be wrapped with _doc, so there's no need to pass the type any more. This also renames the method to simpleMapping, in line with CreateIndexRequest, to help migration by causing compilation errors; and changes the signature to take a String... rather than an Object.... Relates to #41059	2020-01-10 13:29:19 +00:00
Alan Woodward	a59b065091	Remove type parameter from `CreateIndexRequest.mapping(type, XContentBuilder)` (#50586 ) This continues the removal of type parameters from CreateIndexRequest.mapping methods started in #50419. Here the removed methods are almost entirely in test code, with the exception of a change to TransformIndex in the transform plugin. Relates to #41059	2020-01-08 09:18:31 +00:00
Nik Everett	4c1f1b2aca	Declare remaining parsers `final` (#50571 ) We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which are final. This is probably the right way to declare them because in practice we never mutate them after they are built. And we certainly don't change the static reference. Anyway, this adds `final` to these parsers. I found the non-final parsers with this: ``` diff \ <(find . -type f -name '*.java' -exec grep -iHe 'static.*PARSER\s*=' {} \+ | sort) \ <(find . -type f -name '*.java' -exec grep -iHe 'static.*final.*PARSER\s*=' {} \+ | sort) \ 2>&1 | grep '^<' ```	2020-01-03 10:47:51 -05:00
Adrien Grand	2d627ba757	Add per-field metadata. (#49419 ) This PR adds per-field metadata that can be set in the mappings and is later returned by the field capabilities API. This metadata is completely opaque to Elasticsearch but may be used by tools that index data in Elasticsearch to communicate metadata about fields with tools that then search this data. A typical example that has been requested in the past is the ability to attach a unit to a numeric field. In order to not bloat the cluster state, Elasticsearch requires that this metadata be small: - keys can't be longer than 20 chars, - values can only be numbers or strings of no more than 50 chars - no inner arrays or objects, - the metadata can't have more than 5 keys in total. Given that metadata is opaque to Elasticsearch, field capabilities don't try to do anything smart when merging metadata about multiple indices, the union of all field metadatas is returned. Here is how the meta might look in mappings: ```json { "properties": { "latency": { "type": "long", "meta": { "unit": "ms" } } } } ``` And then in the field capabilities response: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms" ] } } } } ``` When there are no conflicts, values are arrays of size 1, but when there are conflicts, Elasticsearch includes all unique values in this array, without giving ways to know which index has which metadata value: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms", "ns" ] } } } } ``` Closes #33267	2019-12-18 17:27:38 +01:00
Rory Hunter	3a3e5f6176	Apply 2-space indent to all gradle scripts (#48849 ) Closes #48724. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-13 10:14:04 +00:00
Alan Woodward	750c6d8bb1	Remove Client.prepareIndex(index, type, id) method (#48443 ) As types are no longer used in index requests, we can remove the type parameter from `prepareIndex` methods in the `Client` interface. However, just changing the signature of `prepareIndex(index, type, id)` to `prepareIndex(index, id)` risks confusion when upgrading with the previous (now removed) `prepareIndex(index, type)` method - just changing the dependency version of java code would end up silently changing the semantics of the method call. Instead we should just remove this method entirely, and replace it by calling `prepareIndex(index).setId(id)`	2019-10-25 11:09:52 +01:00
Alan Woodward	6531369f11	Don't persist type information to translog (#47229 ) We no longer need to store type information in the translog, given that an index can only have a single type. Relates to #41059	2019-10-15 09:05:29 +01:00
Alan Woodward	566e1b7d33	Remove type field from DocWriteRequest and associated Response objects (#47671 ) This commit removes the type field from index, update and delete requests, and their associated responses. Relates to #41059	2019-10-11 10:23:55 +01:00
Jim Ferenczi	50ad8029ff	Fix highlighting of overlapping terms in the unified highlighter (#47227 ) The passage formatter that the unified highlighter uses doesn't handle terms with overlapping offsets. For tokenizers that provide multiple segmentations of the same terms (edge ngram for instance) the formatter should select the largest span in order to highlight the term only once. This change implements this logic.	2019-10-02 15:23:39 +02:00
Jim Ferenczi	4414fccb27	Replace SearchContext with QueryShardContext in query builder tests (#46978 ) This commit replaces the SearchContext used in AbstractQueryTestCase with a QueryShardContext in order to reduce the visibility of search contexts. Relates #46523	2019-09-23 19:37:15 +02:00
Alan Woodward	7c90801aff	Remove types from Get/MultiGet (#46587 ) This commit removes types from the ShardGetService, and propagates this API change up through the Transport and Rest actions for Get and MultiGet Relates to #41059	2019-09-20 14:22:57 +01:00
Jim Ferenczi	38f9e52c3e	Add mapper-extras and the RankFeatureQuery in the hlrc (#43713 ) This change adds the support for the RankFeatureQuery in the HLRC by providing an extra dependency on mapper-extras-client. It also removes the dependency on lang-painless in mapper-extras which is not needed anymore since the move of the vector field into a dedicated module. Closes #43634	2019-08-14 09:52:49 +02:00
Julie Tibshirani	46c2d7224d	Ensure field caps doesn't error on rank feature fields. (#44370 ) The contract for MappedFieldType#fielddataBuilder is to throw an IllegalArgumentException if fielddata is not supported. The rank feature mappers were instead throwing an UnsupportedOperationException, which caused MappedFieldType#isAggregatable to fail.	2019-07-16 14:43:04 -07:00
Mayya Sharipova	952ddf247a	Move dense_vector and sparse_vector to module (#43280 )	2019-06-18 08:15:46 -04:00
Mayya Sharipova	6e60945d62	BWC tests - move vector distance functions to 7.3	2019-06-14 12:49:16 -04:00
Colin Goodheart-Smithe	3f10cea87a	Removes types from SearchRequest and QueryShardContext (#42112 )	2019-05-29 08:50:30 +01:00
Jason Tedor	434efd1664	Add version 7.2.0 constant to master branch This commit adds the 7.2.0 constant to the master branch, and bumps the BWC logic accordingly.	2019-05-01 13:54:45 -04:00
Jim Ferenczi	501c2a7ec4	Fix search_as_you_type's sub-fields to pick their names from the full path of the root field (#41541 ) The subfields of the search_as_you_type are prefixed with the name of their root field. However they should use the full path of the root field rather than just the name since these fields can appear in a multi-`fields` definition or under an object field. Since this field type is not released yet, this should be considered as a non-issue.	2019-04-26 10:18:48 +02:00
Nhat Nguyen	8b0a74f11c	Clean up outdated skip statements in yaml tests (#41165 ) These skip statements become no-ops in 8.0 for we don't support a mixed cluster between 6.x and 8.0. Relates #41164	2019-04-18 14:19:31 -04:00
Alpar Torok	4434491c1e	convert modules to use testclusters (#40804 ) * convert modules to use testclusters * Eliminate PluginPropertiesTask and move logic in plugin where it belongs	2019-04-04 11:41:38 +03:00
Jim Ferenczi	ae569a286d	Fix merging of search_as_you_type field mapper (#40593 ) The merge of the `search_as_you_type` field mapper uses the wrong prefix field and does not update the underlying field types.	2019-03-29 09:01:36 +01:00
Andy Bristol	d0acf6285c	lower bwc skip for search as you type (#40599 )	2019-03-28 16:06:06 -07:00
Jeff Hajewski	0dd2fdfef2	Update max dims for vectors to 1024. (#40597 )	2019-03-28 17:07:03 -04:00
Julie Tibshirani	8c256d2c23	Fix an off-by-one error in the vector field dimension limit. (#40489 ) Previously only vectors up to 499 dimensions were accepted, whereas the stated limit is 500.	2019-03-27 11:13:51 -07:00
Andy Bristol	6bba9fc83b	search as you type fieldmapper (#35600 ) Adds the search_as_you_type field type that acts like a text field optimized for as-you-type search completion. It creates a couple of subfields that analyze the indexed terms as shingles, against which full terms are queried, and a prefix subfield that analyzes terms as the largest shingle size used and edge-ngrams, against which partial terms are queried. Adds a match_bool_prefix query type that creates a boolean clause of a term query for each term except the last, for which a boolean clause with a prefix query is created. The match_bool_prefix query is the recommended way of querying a search as you type field, which will boil down to term queries for each shingle of the input text on the appropriate shingle field, and the final (possibly partial) term as a term query on the prefix field. This field type also supports phrase and phrase prefix queries however	2019-03-27 10:03:30 -07:00
Mayya Sharipova	256b1cbc28	Fix the test failure in dense and sparse vectors (#39313 ) Create index with only 1 shard to ensure an expected request failure Closes #39218	2019-02-22 10:59:45 -05:00
Tal Levy	135361743b	mute dense/sparse special case failing yaml tests accompanying awaitsfix issue #39218.	2019-02-20 17:04:25 -08:00
Mayya Sharipova	3260fd1fc8	Distance measures for dense and sparse vectors (#37947 ) * Distance measures for dense and sparse vectors Introduce painless functions of cosineSimilarity and dotProduct distance measures for dense and sparse vector fields. ```js { "query": { "script_score": { "query": { "match_all": {} }, "script": { "source": "cosineSimilarity(params.queryVector, doc['my_dense_vector'].value)", "params": { "queryVector": [4, 3.4, -1.2] } } } } } ``` ```js { "query": { "script_score": { "query": { "match_all": {} }, "script": { "source": "cosineSimilaritySparse(params.queryVector, doc['my_sparse_vector'].value)", "params": { "queryVector": {"2": -0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0} } } } } } ``` Closes #31615	2019-02-20 07:01:17 -05:00
Julie Tibshirani	c2e9d13ebd	Default include_type_name to false in the yml test harness. (#38058 ) This PR removes the temporary change we made to the yml test harness in #37285 to automatically set `include_type_name` to `true` in index creation requests if it's not already specified. This is possible now that the vast majority of index creation requests were updated to be typeless in #37611. A few additional tests also needed updating here. Additionally, this PR updates the test harness to set `include_type_name` to `false` in index creation requests when communicating with 6.x nodes. This mirrors the logic added in #37611 to allow for typeless document write requests in test set-up code. With this update in place, we can remove many references to `include_type_name: false` from the yml tests.	2019-02-01 11:44:13 -08:00
Colin Goodheart-Smithe	21e392e95e	Removes typed calls from YAML REST tests (#37611 ) This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since these are specifically testing typed API behaviour. The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.	2019-01-30 16:32:58 +00:00
Mayya Sharipova	a30ce6a00a	Rename feature, feature_vector and feature_query (#37794 ) Renaming as follows: feature -> rank_feature feature_vector -> rank_features feature query -> rank_feature query Renaming is done to distinguish from other vector types. Closes #36723	2019-01-24 19:18:48 -05:00
Alexander Reelsen	daa2ec8a60	Switch mapping/aggregations over to java time (#36363 ) This commit moves the aggregation and mapping code from joda time to java time. This includes field mappers, root object mappers, aggregations with date histograms, query builders and a lot of changes within tests. The cut-over to java time is a requirement so that we can support nanoseconds properly in a future field mapper. Relates #27330	2019-01-23 10:40:05 +01:00
Armin Braun	860a8a7b23	Improve Precision for scaled_float (#37169 ) * Use `toString` and `BigDecimal` parsing to get intuitive behaviour for `scaled_float` as discussed in #32570 * Closes #32570	2019-01-11 08:07:55 +01:00
Nhat Nguyen	7580d9d925	Make SourceToParse immutable (#36971 ) Today the routing of a SourceToParse is assigned in a separate step after the object is created. We can easily forget to set the routing. With this commit, the routing must be provided in the constructor of SourceToParse. Relates #36921	2018-12-24 14:06:50 -05:00
Mayya Sharipova	b5d532f9e3	Vector field (#33022 ) 1. Dense vector PUT dindex { "mappings": { "_doc": { "properties": { "my_vector": { "type": "dense_vector" }, "my_text" : { "type" : "keyword" } } } } } PUT dindex/_doc/1 { "my_text" : "text1", "my_vector" : [ 0.5, 10, 6 ] } 2. Sparse vector PUT sindex { "mappings": { "_doc": { "properties": { "my_vector": { "type": "sparse_vector" }, "my_text" : { "type" : "keyword" } } } } } PUT sindex/_doc/1 { "my_text" : "text1", "my_vector" : {"1": 0.5, "99": -0.5, "5": 1} }	2018-12-12 21:20:53 -05:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Nick Knize	a5e1f4d3a2	Upgrade to lucene-8.0.0-snapshot-31d7dfe6b1 (#35224 )	2018-11-06 11:55:23 +01:00
Julie Tibshirani	78df00ff24	Simplify the return type of FieldMapper#parse. (#32654 )	2018-09-04 01:15:19 +00:00
Jim Ferenczi	8e5f281b27	AbstractQueryTestCase should run without type less often (#28936 ) This commit changes the randomization to always create an index with a type. It also adds a way to create a query shard context that maps to an index with no type registered in order to explicitly test cases where there is no type.	2018-07-26 20:29:05 +02:00
Nick Peihl	ac63408655	Add region ISO code to GeoIP Ingest plugin (#31669 )	2018-07-20 11:23:29 -07:00
Tanguy Leroux	bf58660482	Remove all unused imports and fix CRLF (#31207 ) The X-Pack opening and the recent other refactorings left a lot of unused imports in the codebase. This commit removes them all.	2018-06-11 15:12:12 +02:00
Julie Tibshirani	8f607071b6	Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (#31018 )	2018-06-08 09:24:09 -07:00

169 Commits