elasticsearch

Commit Graph

Author	SHA1	Message	Date
Ignacio Vera	8a9f4fed55	Remove explicit SearchResponse references from LegacyGeo, Aggregations and parent-join modules (#101250 )	2023-10-24 17:46:25 +02:00
David Turner	9794c6e205	Use ESIntegTestCase#prepareSearch more (#101179 ) The refactoring in #101175 only covered all the one-arg call sites. This PR does the rest.	2023-10-20 18:33:00 +01:00
David Turner	1eda6ac74b	Extract ESIntegTestCase#prepareSearch (#101175 ) Relates #101172	2023-10-20 06:18:58 -04:00
Armin Braun	ca6295e582	Remove more explicit references to SearchResponse in tests (#101092 ) Remove `assertSearchResponse` which was just an alias for `assertNoFailures` and then cleanup many spots in the result by combining the hit count and no failure assertion into a single method. follow-up to #100966	2023-10-19 17:53:13 +02:00
Armin Braun	03ea4bbe6e	Remove more explicit references to SearchResponse in tests (#101052 ) Follow up to #100966 introducing new combined assertion `assertSearchHitsWithoutFailures` to combine no-failure, count, and id assertions into one block.	2023-10-18 20:27:52 +02:00
Armin Braun	dcaba064dd	Remove more explicit SearchResponse references from test code (#100985 ) Follow-up to #100966 adding more overrides to assertions that consume a request builder.	2023-10-18 07:20:01 +02:00
Armin Braun	bae6991fb3	Remove ~600 references to SearchResponse in tests (#100966 ) We'd like to make `SearchResponse` reference counted and pooled but there are around 6k instances of tests that create a `SearchResponse` local variable that would need to be released manually to avoid leaks in the tests. This does away with about 10% of these spots by adding an override for `assertHitCount` that handles the actual execution of the search request and its release automatically and making use of it in all spots where the `.get()` on the request builder could be inlined semi-automatically and in a straightforward fashion without other code changes.	2023-10-17 15:43:36 +02:00
Armin Braun	b7eafce32c	Make some practically static methods static (#97565 ) Another round of automated fixes to this, marking things that can be made static as static. Saves some JIT cycles but also turns some lambdas from capturing to non-capturing and makes the "utilityness" of some classes visible.	2023-10-06 23:37:07 +02:00
Mark Tozzi	6660503592	Aggs error codes part 1 (#99963 ) As part of our effort to increase the supportability of Elasticsearch, this PR changes many aggregations errors from being 500 class (which is the default for `AggregationExecutionException`) to 400 class (which is the default for `IllegalArgumentException`). All of these cases are errors which should not be retried, as the failures are directly related to the content of the request and/or state of the index. There are definitely more cases where we are returning an incorrect error code, but for this PR I focused on just changing the low hanging fruit.	2023-10-04 16:12:34 -04:00
Ignacio Vera	bcdd7d5f42	Set ParentAggregationBuilder not to support concurrent execution (#99809 )	2023-09-25 09:29:27 +02:00
Alan Woodward	4e1fb3fca5	Automatically disable `ignore_malformed` on datastream `@timestamp` fields (#99346 ) Data-stream mappings require a @timestamp field to be present and configured as a date with a specific set of parameters. The index-wide setting of ignore_malformed can cause problems here if it is set to true, because it needs to be false for the @timestamp field. This commit detects if a set of mappings is configured for a datastream by checking for the presence of a DataStreamTimestampFieldMapper metadata field, and passes that information on during Mapper construction as part of the MapperBuilderContext. DateFieldMapper.Builder now checks to see if it is specifically for a data stream timestamp field, and if it is, sets ignore_malformed to false. Relates to #96051	2023-09-13 15:02:22 +01:00
Ryan Ernst	19257125b1	Move transport version constants to TransportVersions (#97990 ) Constants for TransportVersion currently live alongside the class definition. This has been fine since there was only one set of constants. However, to support serverless, some constants will need to be defined elsewhere. This commit moves the existing constants to a new holder class, TransportVersions. It is almost entirely mechanical, using IntelliJ move members. The only non mechanical part was slightly shifting how CURRENT is found, defining a LATEST in TransportVersions that is automatically calculated (since we already have it, no need to manually define it).	2023-09-06 15:14:41 -04:00
Ignacio Vera	424a4c6d71	Hide IndexSearcher in AggregatorTestCase (#98924 ) Hide the creation of the index searcher from the implementers by changing the signature of AggregatorTestCase#searchAndReduce and AggregatorTestCase#createAggregationContext to take an IndexReader instead of an IndexSearcher.	2023-08-28 16:29:21 +08:00
Matteo Piergiovanni	e719057209	Explicit parsing object capabilities of FieldMappers (#98684 ) When the subobject property is set to false and we encounter an object while parsing we need a way to understand if its FieldMapper is able to parse an object. If that's the case we can provide the entire object to the FieldMapper otherwise its name becomes the part of the dotted field name of each internal value. This has been achieved by adding the `supportsParsingObject()` method to the `FieldMapper` class. This method defaults to `false` since the majority of FieldMappers do not support parsing objects and is overridden to return `true` by the ones that do support objects.	2023-08-22 10:16:59 +02:00
Armin Braun	63e64ae61b	Cleanup Stream usage in various spots (#97306 ) Lots of spots where we did weird things around streams like redundant stream creation, redundant collecting before adding all the collected elements to another collection or so, redundant streams for joining strings and using less efficient `Collectors.toList` and in a few cases also incorrectly relying on the result being mutable.	2023-07-03 14:24:57 +02:00
Simon Cooper	a873e26cf7	Convert IndexVersion.CURRENT to a method with a pluggable interface (#97132 )	2023-06-27 14:47:32 +01:00
Armin Braun	3f8ee82ef8	Use indices admin client shortcut in most integration tests (#96946 ) Replacing the remaining usages that I could automatically replace and a couple that I did by hand in this PR. Also, added the same shortcut to the single node tests to save some duplication there.	2023-06-20 13:32:59 +02:00
Simon Cooper	71c12262fb	Migrate index created version to IndexVersion (#96066 )	2023-06-14 09:43:31 +01:00
Luca Cavanna	e5768d9335	Upgrade Lucene to a 9.7.0 snapshot (#96433 ) Most relevant changes: - add api to allow concurrent query rewrite (apache/lucene#11840) - knn query rewrite (apache/lucene#12160) - Integrate the incubating Panama Vector API (apache/lucene#12311) As part of this commit I moved the ES codebase off of overriding or relying on the deprecated rewrite(IndexReader) method in favour of using rewrite(IndexSearcher) instead. For score functions, I went for not breaking existing plugins and create a new IndexSearcher whenever we rewrite a filter, otherwise we'd need to change the ScoreFunction#rewrite signature to take a searcher instead of a reader. Co-authored-by: ChrisHegarty <christopher.hegarty@elastic.co>	2023-05-31 10:17:10 +02:00
Ignacio Vera	c05181528a	Use DirectoryReader instead of IndexReader in AggregatorTestCase (#95876 )	2023-05-08 07:25:10 +02:00
Ignacio Vera	9bbea47899	use #newIndexSearcher in all AggregatorTestCase implementations (#95796 )	2023-05-04 11:12:26 +02:00
Armin Braun	c41bda9e3a	Dry up remaining verbose index setting building in tests (#95652 ) Last spots I could easily find via regex. Follow-up to #95569	2023-04-28 11:18:07 +02:00
Alan Woodward	093e36c875	Introduce DocumentParsingException (#92646 ) Document parsing methods currently throw MapperParsingException. This isn't very helpful, as it doesn't contain any information about where the parse error happened - it is designed for parsing mappings, which are realised into java maps before being examined. This commit introduces a new exception specifically for document parsing that extends XContentException, so that it reports the current position of the parser as part of its error message. Fixes #85083	2023-03-31 12:14:19 +01:00
Alan Woodward	131da70321	ValueFetchers now return a StoredFieldsSpec (#94820 ) This allows us to be more conservative about what needs to be loaded when using the fields API, and opens up the possibility of avoiding using stored fields or source altogether if we can use doc values to fetch values. This commit also uses this new information from ValueFetchers to more efficiently preload stored fields for the `fields` API, while still allowing the lazy loading of individual fields if they are asked for by scripts or runtime fields which cannot be introspected.	2023-03-30 10:46:43 +01:00
Adrien Grand	0c10cef668	Cut over from Field to StringField when applicable. (#94540 ) The most recent Lucene update made `StringField` more efficient than `Field` when indexing simple keywords. This PR cuts over remaining places where we use `Field` to index keywords to `StringField` instead.	2023-03-23 15:37:51 +01:00
Adrien Grand	b56c2df203	Upgrade to lucene-9.6.0-snapshot-f5d1e1c787c. (#94494 )	2023-03-16 16:49:54 +01:00
Armin Braun	2819b11523	Dry up setting index settings in internalClusterTests (#90204 ) We have this neat utility method for this, let's use it throughout to save hundreds of LoC and do the setting update in a consistent way throughout instead of using various variants.	2023-02-28 13:23:49 +01:00
Simon Cooper	4c46ccacaa	Migrate the remaining uses of Version to TransportVersion (#93384 ) Remove get/setVersion methods	2023-02-13 09:15:53 +00:00
Alan Woodward	c0a3bf7e60	Remove custom NoRewriteMatchNoDocsQuery (#93638 ) We added a special NoRewriteMatchNoDocsQuery to get around some aggressive rewriting that meant match phrase prefix queries wouldn't be correctly highlighted. Since lucene 9.5, however, the unified highlighter no longer rewrites queries against an empty searcher, and so this extra query is now unnecessary.	2023-02-10 09:22:55 +00:00
Adrien Grand	af8fccf4b4	Use a combined field to index terms and doc values on keyword fields. (#93579 ) Instead of indexing separately a `StringField` and a `SortedSetDocValuesField`, this commit switches to a single field that indexes both terms and doc values. On Lucene's nightly benchmarks on the NYC Taxis dataset, a similar change yielded a ~3% indexing throughput increase.	2023-02-08 14:16:43 +01:00
Simon Cooper	c513b2bcc6	Migrate VersionedWriteable & NamedDiff to TransportVersion take 2 (#93242 ) Re-apply "Migrate VersionedWriteable & NamedDiff to TransportVersion (#93076)" This reverts commit `48f96090dc`.	2023-01-26 09:49:08 +00:00
Simon Cooper	48f96090dc	Revert "Migrate VersionedWriteable & NamedDiff to TransportVersion (#93076 )" This reverts commit `bef85c66e7`.	2023-01-25 16:16:10 +00:00
Simon Cooper	bef85c66e7	Migrate VersionedWriteable & NamedDiff to TransportVersion (#93076 ) InferenceConfig is kept on Version, as that existed before VersionedNamedWriteable came along	2023-01-25 16:03:38 +00:00
Artem Prigoda	2bc7398754	Use `Strings.format` instead of `String.format(Locale.ROOT, ...)` in tests (#92106 ) Use locale-independent `Strings.format` method instead of `String.format(Locale.ROOT, ...)`. Inline `ESTestCase.forbidden` calls with `Strings.format` for consistency's sake. Add `Strings.format` alias in `common.Strings`	2023-01-03 19:28:27 +01:00
Mark Vieira	c2eda511de	Add JUnit rule based integration test cluster orchestration framework (#92379 ) This commit adds a new test framework for configuring and orchestrating test clusters for both Java and YAML REST testing. This will eventually replace the existing "test-clusters" Gradle plugin and the build-time cluster orchestration.	2022-12-21 15:33:46 -08:00
Dimitris Athanasiou	f7e0d477f6	Optimize composite agg with leading global ordinal value source (#92197 ) When queries are present in a search with a composite agg with a leading source that is of type `GlobalOrdinalValuesSource` there is an optimization we can do. In particular, once the composite queue is full, we know the range of ordinals we are interested in from the source. Thus, we can add a competitive iterator to the `LeafBucketCollector` that skips documents that are out of the competitive range. This commit adds that optimization. In a dataset I have experimented with that has ~31M docs I observed a 5x improvement in a simple search with a range query that matched ~28M docs and with `size = 5` over a keyword field whose cardinality was 200. Co-authored-by: Adrien Grand <jpountz@gmail.com>	2022-12-21 16:25:00 +00:00
Alan Woodward	547c8327b2	Allow FetchSubPhaseProcessors to report their required stored fields (#91269 ) Loading of stored fields is currently handled directly in FetchPhase, with some fairly complex logic examining various bits of the FetchContext to work out what fields need to be loaded. This is further complicated by synthetic source, which may have its own stored field requirements. This commit tries to separate out these concerns a little by adding a new StoredFieldsSpec record that holds information about which stored fields need to be loaded. Each FetchSubPhaseProcessor can now report a StoredFieldsSpec detailing what its requirements are, and these specs can be merged together, along with requirements from a SourceLoader, to determine up-front what fields should be loaded by the StoredFieldLoader. The stored fields themselves are added into the SearchHit by a new StoredFieldsPhase, which handles alias resolution and value post-processing. The logic to determine when source should be loaded and when not, based on the presence of script fields or stored fields, is moved into FetchContext, which highlights some inconsistencies that can be fixed in follow-up commits.	2022-11-10 08:40:22 +00:00
Alan Woodward	41ab45a5d9	Report synthetic source status in MapperBuilderContext (#91400 ) We currently work out whether or not a mapper should be storing additional values for synthetic source by looking at the DocumentParserContext. However, this value does not change for the lifetime of the mapper - it is defined by metadata on the root mapper and is immutable - and DocumentParserContext feels like the wrong place for this information as it holds context specific to the document being parsed. This commit moves synthetic source status information from DocumentParserContext to MapperBuilderContext instead. Mappers which need this information retrieve it at build time and hold it on final fields.	2022-11-08 14:55:16 +00:00
Luca Cavanna	18942d5b11	Enhance nested depth tracking when parsing queries (#90425 ) When parsing queries on the coordinating node, there is currently no way to share state between the different parsing methods (`fromXContent`). The only query that supports a parse context is bool query, which uses the context to track nested depth of queries, added with #66204. This nested depth tracking mechanism is not 100% accurate as it tracks bool queries only, while there are many more query types that can hold other queries and hence potentially cause a stack overflow when deeply nested. This change removes the parsing context that's specific to bool query, introduced with #66204, in favour of generalizing the nested depth tracking to all query types. The generic tracking is introduced by wrapping the parser and overriding the method that parses named objects through the xcontent registry. Another way would have been to require a context argument when parsing queries, which would mean adding a context argument to all the QueryBuilder#fromXContent static methods. That would be a breaking change for plugins that provide custom queries, hence I went for trying out a different approach. One aspect that this change requires and introduces is the distinction between parsing a top level query (which will wrap the parser, or it would create the context if we had one), as opposed to parsing an inner query, which goes ahead with the given parser and context. We already have this distinction as we have two different static methods in `AbstractQueryBuilder` but in practice only bool query makes the distinction being the only context-aware query. In addition to generalizing tracking nested depth when parsing queries, we should be able to adopt this same strategy to track queries usage as part of #90176. Given that the depth check is now more restrictive, as it counts all compound queries and not only bool, we have decided to raise the default limit to `30` to ensure that users are not going to hit the limit due to this change.	2022-10-12 15:15:06 +02:00
Alan Woodward	0013d46538	Extract Source interface from SourceLookup (#90762 ) SourceLookup combines a mutable lookup object that can be advanced to different documents with access to a document's source. This combination can make reasoning about where a Source comes from difficult, particularly in the FetchPhase where the source gets passed around a great deal. This commit extracts a Source interface from SourceLookup, giving read-only access to the source, and changes various FetchPhase interfaces to take this read-only view instead of a full lookup. You can now tell easily if a consumer of the source is going to try and move it to a different document. As part of this change we add a new docId parameter to various ValueFetcher methods, as previously this could be accessed via the SourceLookup.	2022-10-11 19:50:30 +01:00
Mark Tozzi	4a26dda50c	Use the AggTestConfig object in testCase (#90699 )	2022-10-06 13:33:57 -04:00
Rene Groeschke	43a0377735	Update forbiddenapis to 3.4 (#90624 ) Fix breaking changes to source validation after change in default jdk rule set	2022-10-06 16:52:06 +02:00
Mark Tozzi	df27efcae4	Minor Aggregations Test Cleanup (#90530 ) * remove unnecessary constructor from AggTestConfig * deprecate methods that we want to discourage individual tests from invoking Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2022-09-30 15:21:09 -04:00
Mark Tozzi	15932d5168	Refactor aggregator test case (#90149 ) Refactor `AggregatorTestCase` to eliminate many overloads of `searchAndReduce`. Introduce a parameter object, and default many common arguments.	2022-09-20 14:58:24 -04:00
Alan Woodward	aed64a6c76	Add a TSID global ordinal to TimeSeriesIndexSearcher (#90035 ) Rather than trying to compare BytesRefs in tsdb-related aggregations, it will be much quicker if we can use a search-global ordinal to detect when we have moved to a new TSID. This commit adds such an ordinal to the aggregation execution context.	2022-09-14 15:32:17 +01:00
Martijn van Groningen	5195cba24e	Fix ParentToChildrenAggregatorTests#testBestDeferringCollectorWithSubAggOfChildrenAggNeedingScores() test failures. (#90052 ) The failures reported in #90050 were caused by the fact that just a few docs were indexed and the string_field had in total just one value in the index. The second fix addresses casts in ValueSource.java line 285 that failed due to test wrapping of the index reader. A DirectoryReader is expected there, which is not the case if maybeWrap is true. Closes #90050	2022-09-14 18:01:38 +09:30
Martijn van Groningen	9056ff7bc4	Fail when rebuilding scorer in breadth_first mode and query context has changed (#89993 ) The children agg changes the query context, when BestBucketsDeferringCollector is rebuilding scores for the breadth first collect mode then this leads to erroneous situations: * A null scorer could be returned, because a segment had no matches. * Scores for incorrect docids could be reported. This commit adds checks for both cases and throws runtime errors with a more actionable error message. These erroneous situations could occur when top_hits is nested under children agg and terms agg with breadth_first execution mode. Possibly there are other cases too where this NPE would occur. Note that this NPE would actually only occur if parent and child docs are in separate segments, otherwise the scorer would report scores for different documents. This would trigger an assertion error in tests. Closes #37650	2022-09-13 08:53:39 +02:00
Nik Everett	79a89790e3	Synthetic source: load text from stored fields (#87480 ) Adds support for loading `text` and `keyword` fields that have `store: true`. We could likely load any stored fields, but I wanted to blaze the trail using something fairly useful.	2022-08-17 10:18:36 -04:00
Jack Conradson	5e0701f026	Add source fallback for keyword fields using operation (#88735 ) This change adds an operation parameter to FieldDataContext that allows us to specialize the field data that are returned from fielddataBuilder in MappedFieldType. Keyword, integer, and geo point field types now support source fallback where we build a doc values wrapper using source if doc values don't exist for this field under the operation SCRIPT. This allows us to have source fallback in scripting for the scripting fields API.	2022-07-28 10:34:05 -07:00
Alan Woodward	bc8ebbf540	Add FieldDataContext (#88779 ) MappedFieldType#fieldDataBuilder() currently takes two parameters, a fully qualified index name and a supplier for a SearchLookup. We expect to add more parameters here as we add support for loading fielddata from source. Rather than telescoping the parameter list, this commit instead introduces a new FieldDataContext carrier object which will allow us to add to these context parameters more easily.	2022-07-26 14:47:50 +01:00

291 Commits