There are no remote invocations of any actions derived from
`TransportNodesAction` so there is no need to register the top-level
action with the `TransportService`, and that means that all the code
related to de/serialization of the top-level request and response is
unused and can be removed.
Relates #100111
Relates #100878
Each per-index process during snapshot deletion takes some nonzero
amount of working memory to hold the relevant snapshot IDs, metadata
generations and so on, which we can keep under tighter limits and
release sooner if we limit the number of per-index processes running
concurrently. That's what this commit does.
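A minimal sketch of the idea, assuming a plain `Semaphore` to bound how many
per-index cleanup tasks run at once; the limit, class names and executor here
are illustrative rather than the actual repository code:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

class PerIndexDeletionThrottler {
    // Hypothetical cap: at most 5 per-index processes hold their working
    // memory (snapshot IDs, metadata generations, ...) at any one time.
    private final Semaphore slots = new Semaphore(5);
    private final ExecutorService executor = Executors.newCachedThreadPool();

    void deleteIndexMetadata(List<String> indices) {
        for (String index : indices) {
            executor.submit(() -> {
                try {
                    slots.acquire();              // wait for a free slot
                    try {
                        cleanUpIndex(index);      // working memory is only held here
                    } finally {
                        slots.release();          // free the slot (and the memory) promptly
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }

    private void cleanUpIndex(String index) {
        // ... resolve and delete the per-index snapshot metadata ...
    }
}
```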
Emitted metrics could not be indexed because the
`elasticsearch.metrics.s3.exceptions` field was both a long counter and a
parent object for a histogram, and the same path cannot be a numeric leaf
field and an object at the same time. This change renames the histogram to
avoid the conflict.
There's no need for the `fslike` repository: the thread-name check it
exists to suppress already permits execution on test threads, so it does
not need suppressing. This commit replaces it with a regular `fs`
repository and cleans up a couple of other nits.
Data nodes can fold a plan (typically for missing fields) to an empty,
local relation as a logical optimization. However, the context, such as
whether the output is an aggregation or not, gets lost, which is
problematic during physical execution since the upstream aggregation
expects the intermediate states while the folded local relation returns
the final ones.
Consider the query:

```
from index | where field is not null | stats c = count()
```
On shards where the field in the filter does not exist, the filter gets
nullified, which folds the whole _local_ plan to a LocalRelation
returning c as 0. However, the data node should return the intermediate
aggregation states (count and seen) - otherwise the query fails with an
internal error (NPE) since the channel expected by the exchange is not
found.
Fixes #100807
The build version is made up of a few parts in non-release builds. Both
the snapshot and pre-release qualifiers are appended to it. These
qualifiers used to be part of Version, but in 7.0 they were moved so that
they are found only in the build info. The Build class retains these
qualifiers through the compile-time ES version extracted from the server
jar at runtime.
Build.qualifiedVersion() is supposed to provide the fully qualified
version, including snapshot and pre-release qualifiers. Yet
Build.version() also includes this information; there has been no
distinction since the qualifier was moved to live only in the build info.
This commit separates the pre-release qualifier from the version. It
maintains bwc in talking to older nodes, passing the fully qualified
version there, but in current nodes splits out the pre-release qualifier
into a new member of Build.
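As a rough illustration of the distinction between the plain version and the
fully qualified one (a hypothetical sketch; the example version string and
class are made up, not the actual Build implementation):

```java
// Splits a fully qualified build version such as "8.12.0-alpha1-SNAPSHOT"
// into the plain version, the pre-release qualifier and the snapshot flag.
class QualifiedVersion {
    final String version;      // e.g. "8.12.0"
    final String qualifier;    // pre-release qualifier, e.g. "alpha1", or null
    final boolean snapshot;    // true for snapshot builds

    QualifiedVersion(String qualifiedVersion) {
        String v = qualifiedVersion;
        this.snapshot = v.endsWith("-SNAPSHOT");
        if (snapshot) {
            v = v.substring(0, v.length() - "-SNAPSHOT".length());
        }
        int dash = v.indexOf('-');
        this.qualifier = dash >= 0 ? v.substring(dash + 1) : null;
        this.version = dash >= 0 ? v.substring(0, dash) : v;
    }

    /** Re-assembles the fully qualified form that older nodes still expect. */
    String qualifiedVersion() {
        StringBuilder sb = new StringBuilder(version);
        if (qualifier != null) {
            sb.append('-').append(qualifier);
        }
        if (snapshot) {
            sb.append("-SNAPSHOT");
        }
        return sb.toString();
    }
}
```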
Direct access to the .enrich-* indices, which are restricted system
indices, should not be granted to users. Instead, ESQL enrich lookup
should access these indices using the enrich_origin on behalf of the
user. With this change, the enrich lookup checks for the monitor_enrich
cluster privilege before performing the actual lookup with the
enrich_origin.
Spin-off from #100724
In certain scenarios, a field can be mapped both as a primitive and as an
object, causing it to be marked as unsupported and losing any potential
subfields that might have been discovered before.
This commit preserves them so that subfields are not incorrectly reported
as missing.
Fixes #100869
The test fails due to out-of-order documents in the enrich index. This
can occur when replicas are initializing during indexing. To avoid this,
we just need to ensure there are no initializing shards before starting
indexing and disable shard relocations.
Closes #99807
It appears that some freshly generated tokens fail authn under
concurrency. This change increases the verbosity of the TokenService
logging in order to track down how exactly the token is not valid for
authn.
Related: https://github.com/elastic/elasticsearch/issues/85697
If ML serverless autoscaling fails to return a response within
the configured timeout period then the control plane autoscaler
will log an error. Too many of these errors will raise an alert,
therefore as much as possible should be done on the ML side to
_not_ time out.
Previously there were two possible causes of timeouts:
1. If a request for node stats from all ML nodes timed out
2. If a request to refresh the ML memory tracker timed out
The first case can happen if a node leaves the cluster at a bad
time and the message sent to it gets lost. The second case can
happen if searching the ML results indices for model size stats
documents is slow.
We can avoid timeouts in these two situations as follows:
1. There is no need to use the API to get the only value from
the node stats that the autoscaler needs to know - the total
amount of memory on each ML node is stored in a node attribute
on startup, so it is already available in cluster state
2. When we refresh the ML memory tracker we can just return stats
that instruct the autoscaler to do nothing until the refresh
is complete - this is functionally the same as timing out each
request, but without generating error messages (see the sketch
below)
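The second point could look roughly like the following sketch; every name here
is made up for illustration and not taken from the ML code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

class MlMemoryStatsProvider {

    /** Stats that tell the autoscaler to neither scale up nor scale down. */
    record AutoscalingStats(boolean wantsScaleUp, boolean wantsScaleDown) {
        static final AutoscalingStats DO_NOTHING = new AutoscalingStats(false, false);
    }

    private final AtomicBoolean refreshInProgress = new AtomicBoolean();

    AutoscalingStats currentStats() {
        if (refreshInProgress.get()) {
            // Functionally the same as timing the request out, but the control
            // plane sees a normal "do nothing" response instead of an error.
            return AutoscalingStats.DO_NOTHING;
        }
        return statsFromRefreshedMemoryTracker();
    }

    private AutoscalingStats statsFromRefreshedMemoryTracker() {
        // ... derive real stats from the refreshed ML memory tracker ...
        return new AutoscalingStats(true, false);
    }
}
```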
Yet another test affected by the fix for showing the synthetic source,
#98808. This can trigger an assert in older versions as the mapping they
produce (without synthetic source) doesn't match the one they may get
from the master, if the latter is in version 8.10+.
Fixes #100913
```
"node_failures": [
  {
    "type": "failed_node_exception",
    "reason": "Failed node [qpdSPb3yQkuDlsI9TH7a2g]",
    "node_id": "qpdSPb3yQkuDlsI9TH7a2g",
    "caused_by": {
      "type": "transport_serialization_exception",
      "reason": "Failed to deserialize response from handler",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Unknown NamedWriteable [org.elasticsearch.compute.operator.Operator$Status][topn]"
      }
    }
  }
]
```
I hit this error when trying to retrieve ESQL tasks. The issue is that we
forgot to register the NamedWriteable for the status of TopN.
We mostly run our tests with less than 1G of heap per JVM. This means that
we will use the unpooled Netty allocator in most tests, losing us a lot of
leak coverage in internal cluster tests (mostly for inbound buffers).
Unless otherwise specified by tests, we should force the use of our standard
allocator by default to get a higher chance of catching leaks in internalClusterTests
in particular.
If the cluster state is changing quickly while searches are starting
then these captured cluster states can consume substantial memory, and
we are only interested in two values here. This commit extracts the
two relevant values in the constructor, removing the cluster state
references entirely.
Closes #100120
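As a rough illustration of the fix above (the class and the two extracted
values are placeholders, not the real search code):

```java
class SearchShardContext {
    // Before: `private final ClusterState state;` kept the whole (potentially
    // large) cluster state alive for as long as this object existed.
    // After: copy out the two values we actually need and drop the reference.
    private final long clusterStateVersion;
    private final String localNodeId;

    SearchShardContext(ClusterStateView state) {
        this.clusterStateVersion = state.version();
        this.localNodeId = state.localNodeId();
        // `state` is not stored, so it can be garbage collected even while
        // the search is still running.
    }

    long clusterStateVersion() { return clusterStateVersion; }
    String localNodeId() { return localNodeId; }

    /** Stand-in for the large cluster state object. */
    interface ClusterStateView {
        long version();
        String localNodeId();
    }
}
```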
This commit adds the possibility to create runtime fields of type geo-shape. In order to create them, users can
define an emit function that takes either a GeoJSON object or a WKT string and internally creates a geometry object.
SourceConfirmedTextQuery uses a QueryVisitor to collect terms from
its inner query to build its internal SimScorer. It is important to hold
these terms in a consistent order so that, when the scores for each term
are summed, the order of summation is the same as it would be for the
inner query. This commit changes the call to `visit` to use a
LinkedHashSet to ensure that terms are iterated in the order in which
they are collected.
Fixes #98712
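The relevant pattern looks roughly like this, using Lucene's term-collecting
visitor (the surrounding SourceConfirmedTextQuery code is omitted):

```java
import java.util.LinkedHashSet;
import java.util.Set;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryVisitor;

class TermCollection {
    /**
     * Collects the terms of a query into an insertion-ordered set, so that
     * per-term scores are later summed in the same order as for the inner query.
     */
    static Set<Term> collectTermsInOrder(Query query) {
        Set<Term> terms = new LinkedHashSet<>();       // insertion-ordered, unlike a plain HashSet
        query.visit(QueryVisitor.termCollector(terms));
        return terms;
    }
}
```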
Currently, before performing operations that require the ML internal
indices be available we check whether their primary shards are active.
In stateless Elasticsearch we need to separately check whether the
indices are searchable, as search and indexing shards are separate.
Currently, before performing operations that require the transform
internal index be available we check whether its primary shard is
active.
In stateless Elasticsearch we need to separately check whether the
index is searchable, as search and indexing shards are separate.
This PR builds on top of #100464 to publish the s3 request count via the metrics API.
The metric is named `repositories.requests.count` and has
attributes/dimensions of
`{"repo_type": "s3", "repo_name": "xxx", "operation": "xxx", "purpose": "xxx"}`.
Closes: ES-6801
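Expressed with the OpenTelemetry metrics API (Elasticsearch has its own
metering abstraction, so treat this purely as a sketch of the metric's shape,
not the actual code), recording one request could look like:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.metrics.LongCounter;
import io.opentelemetry.api.metrics.Meter;

class S3RequestMetrics {
    private final Meter meter = GlobalOpenTelemetry.getMeter("es.repositories");
    private final LongCounter requestCount =
        meter.counterBuilder("repositories.requests.count").build();

    void onRequest(String repoName, String operation, String purpose) {
        // Attribute keys mirror the dimensions listed above.
        requestCount.add(1, Attributes.builder()
            .put("repo_type", "s3")
            .put("repo_name", repoName)
            .put("operation", operation)
            .put("purpose", purpose)
            .build());
    }
}
```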
We can't assert no leaked blobs here because today the first cleanup
leaves the original `RepositoryData` in place so the second cleanup is
not a no-op.
Relates #100718
Refactor testRerouteRecovery, pulling out testing of shard recovery
throttling into separate targeted tests. Now there are two additional
tests, one testing source node throttling, and another testing target
node throttling. Throttling both nodes at once leads to primarily the
source node registering throttling, while the target node mostly has
no cause to instigate throttling.
manage_enrich is a cluster privilege, not a built-in role, and it is
already documented as a cluster privilege.
This commit removes manage_enrich from the role documentation.
It also mentions the monitor_enrich privilege introduced in #99646.
Related: #85877
This commit fixes two things:
1) RotatableSecret#matches could throw a NullPointerException when the current secret is null but the prior secret is not.
2) RotatableSecret#checkExpired would not expire a prior secret when checking the same millisecond the prior secret was due to expire.
Both of these would cause intermittent test failures, the first based on randomization, the second based on timing.
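A rough sketch of the two fixes; the real RotatableSecret works with
SecureString and more state, so the names and types here are simplified:

```java
import java.time.Instant;

class RotatableSecretSketch {
    private String current;       // may be null if the secret has been cleared
    private String prior;         // previous secret, valid until priorExpiry
    private Instant priorExpiry;  // instant at which the prior secret stops matching

    /** Fix 1: do not dereference `current` when it is null but `prior` is set. */
    boolean matches(String candidate) {
        checkExpired(Instant.now());
        if (current != null && current.equals(candidate)) {
            return true;
        }
        return prior != null && prior.equals(candidate);
    }

    /** Fix 2: treat the expiry instant itself as expired, not only instants after it. */
    void checkExpired(Instant now) {
        if (prior != null && now.compareTo(priorExpiry) >= 0) { // a strict isAfter() check misses the expiry millisecond
            prior = null;
            priorExpiry = null;
        }
    }
}
```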
A task cancelled exception has REST status 400, which makes it
irrecoverable as far as transforms are concerned. This means that
a transform that suffers such an exception will fail without
doing any retries. This is bad, because a search can fail with
a task cancelled exception if one of its lower-level phases
suffers a circuit breaker exception. We want transforms to retry
in the event that there is temporarily not enough memory
for a search.