7402 Commits

Author SHA1 Message Date
Reza Torabi 310af09b6a
Add persian language stemmer (#99106) 2023-09-05 12:13:27 +01:00
Andrei Dan 6c3a46498a
Introduce the index.downsample.origin.name and index.downsample.origin.uuid settings (#99061)
This proposes introducing two new settings that we configure when
downsampling: `index.downsample.origin.name` and
`index.downsample.origin.uuid`.

These settings will carry the name and uuid of the first source index
we downsample (as the source.name and source.uuid settings behaved
before this PR).

This also changes the behaviour of `index.downsample.source.name` and
`index.downsample.source.uuid` to always reflect the source of the
current downsampling round.

If an index is downsampled once, both the source and origin settings
will have the same value.

However, for subsequent downsampling operations, a later downsampling
round will be configured with the actual index name and uuid that were
used as the source of the downsampling operation, e.g.
`downsample-350s-.ds-metrics-foo-2023.08.17-000001` will have the
`index.downsample.source.name` configured to
`downsample-1s-.ds-metrics-foo-2023.08.17-000001` and
`index.downsample.origin.name` configured to
`.ds-metrics-foo-2023.08.17-000001`.

Note that this will help us simplify things in the data stream lifecycle, as
with subsequent downsampling configuration we currently have to store
the true source of the downsampling operation in the downsample index
metadata
(https://github.com/elastic/elasticsearch/blob/main/modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/downsampling/ReplaceSourceWithDownsampleIndexTask.java#L173).

If you agree with this proposal, we'll have to manually backport it to 8.10
as well to make sure the `generateDownsampleIndexName` method is
consistent across releases:
https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/downsample/DownsampleConfig.java#L249

Note that the `index.provided_name` setting, which is set on indices when
rolling over, cannot act as a source (initially I thought we could use
it instead of introducing the new origin setting) because it
contains date-math patterns.
2023-09-04 11:08:00 -04:00
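To illustrate the source/origin distinction described in #99061, here is a minimal sketch of two downsampling rounds. It is not the Elasticsearch implementation: the class, the `index.name` bookkeeping key, and the `downsample-<interval>-<origin>` naming rule are assumptions inferred from the example index names in the commit message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only; names and the naming rule are hypothetical.
final class DownsampleOriginSketch {

    // A never-downsampled index: it carries no downsample settings.
    static Map<String, String> rawIndex(String name) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("index.name", name); // bookkeeping key used only by this sketch
        return settings;
    }

    // Produces the settings of the index created by downsampling `source`.
    static Map<String, String> downsample(Map<String, String> source, String interval) {
        String sourceName = source.get("index.name");
        // The origin is inherited when the source is itself a downsample index;
        // otherwise the source is the origin.
        String originName = source.getOrDefault("index.downsample.origin.name", sourceName);

        Map<String, String> target = new LinkedHashMap<>();
        target.put("index.name", "downsample-" + interval + "-" + originName);
        target.put("index.downsample.source.name", sourceName); // source of this round
        target.put("index.downsample.origin.name", originName); // first index ever downsampled
        return target;
    }
}
```

After one round the source and origin names are equal; after a second round only the source name changes, which matches the `downsample-350s-...` example in the message.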
David Turner 11b6db3dfe
Remove strings-based scheduleWithFixedDelay (#99166)
Relates #99051, #99027, #99131
2023-09-04 09:41:35 -04:00
David Turner 1e9c7f1d95
Align collection de/serialization API naming (#99150)
The `StreamOutput` and `StreamInput` APIs are designed so that code
which serializes objects to the transport protocol aligns closely with
the corresponding deserialization code. However today
`StreamOutput#writeCollection` pairs up with a variety of methods on
`StreamInput`, including `readList`, `readSet`, and so on. These methods
are not obviously compatible with `writeCollection` unless you look at
the implementation, and that makes verifying transport protocol code
harder than it needs to be.

This commit renames these methods to `readCollectionAsList`,
`readCollectionAsSet`, and so on, to clarify that they are compatible
with `writeCollection`.

Relates
https://github.com/elastic/elasticsearch/pull/98971#issuecomment-1697289815
2023-09-04 06:46:54 -04:00
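The naming alignment from #99150 can be pictured with a toy wire format. The sketch below does not use the real `StreamOutput`/`StreamInput` classes; it stands in `java.io` data streams and a plain `int` length prefix (both assumptions) to show why every `readCollectionAs*` method pairs with the same `writeCollection` call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for the transport serialization API; not Elasticsearch code.
final class CollectionWireSketch {

    // Writer side: element count, then each element.
    static void writeCollection(DataOutputStream out, Collection<String> values) throws IOException {
        out.writeInt(values.size());
        for (String value : values) {
            out.writeUTF(value);
        }
    }

    // Reader side: the same bytes, materialized as a List.
    static List<String> readCollectionAsList(DataInputStream in) throws IOException {
        int size = in.readInt();
        List<String> result = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            result.add(in.readUTF());
        }
        return result;
    }

    // Reader side: the same bytes, materialized as a Set (duplicates collapse).
    static Set<String> readCollectionAsSet(DataInputStream in) throws IOException {
        return new LinkedHashSet<>(readCollectionAsList(in));
    }

    private static DataInputStream serialize(Collection<String> values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeCollection(out, values);
        }
        return new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    }

    static List<String> roundTripAsList(Collection<String> values) {
        try {
            return readCollectionAsList(serialize(values));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Set<String> roundTripAsSet(Collection<String> values) {
        try {
            return readCollectionAsSet(serialize(values));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```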
David Turner 082f36578d
Remove string-based scheduleUnlessShuttingDown (#99131)
Relates #99051, #99027
2023-09-04 03:31:58 -04:00
Armin Braun dea8890c50
Dry up Writeable.writeTo lambda usage around StreamOutput (#99143)
No need to duplicate the lambda all over the place.
Follow-up to #99087
2023-09-03 10:35:22 +02:00
Armin Braun 97ebcea5c2
Dry up writing string collections via StreamOutput (#99087)
We can dry up and shorten a bunch of spots by using the overloads
we already have in the code. Also, added a new overload for writing a
map with string keys.
2023-09-02 11:09:43 +02:00
Salvatore Campagna 06d8fa0706
Concurrent thread access to shared doc values (#99007)
The doc values in the `GlobalOrdCardinalityAggregator` are shared
among multiple search threads, `search` and `search_worker`.
The search thread also runs the aggregation phase. When an
executor is used the 'search' thread is running `postCollection`, which
uses doc values, while other methods are executed by the `search_worker`
thread, using doc values too. As a result, doc values are concurrently
accessed by different threads. Using doc values concurrently from multiple
threads is not correct since multiple threads end up updating the doc values
state. This breaks access to doc values, resulting in different issues depending
on how threads end up being scheduled (prematurely exhausting doc values,
accessing incorrect documents as a result of trying to access docIds not
in the thread-owned leaf/segment, ...).

The solution here is to:
1. make sure we execute `postCollection` in the same thread as other
methods, which is `search` or `search_worker`.
2. make sure we do not call `postCollection` in case the `TimeSeriesIndexSearcher`
is used. In that case `postCollection` is called by `TimeSeriesIndexSearcher`.
2023-09-01 17:42:01 +02:00
Andrei Dan c536a42c4e
Move the logging of errors inside the ErrorRecordingActionListener (#99043)
Data stream lifecycle runs periodically and we don't want
to spam the logs with error logging on each run, unless
the type of error encountered for an index changes between
runs (we had this logic applied for some operations and
implemented ad-hoc).

This moves the logic that records the error and logs an
error message (if needed) into the ErrorRecordingActionListener.

Note that if listener.onFailure is called the ErrorRecordingActionListener
will handle the logging so no additional logging is needed.
2023-08-31 18:02:44 +01:00
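The "log only when the error changes" behaviour described in #99043 could look roughly like the sketch below. This is an assumption-laden illustration, not the actual `ErrorRecordingActionListener`: the class name, the in-memory map, and the choice to compare the exception type plus message are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical error store: records the latest error per index and logs it
// only when it differs from the one recorded on the previous run.
final class ErrorRecordingSketch {

    private final Map<String, String> lastErrorByIndex = new HashMap<>();
    private int loggedCount = 0;

    // Returns true when the error was new for this index and was therefore logged.
    boolean recordError(String index, Exception error) {
        String description = error.getClass().getSimpleName() + ": " + error.getMessage();
        String previous = lastErrorByIndex.put(index, description);
        boolean changed = description.equals(previous) == false;
        if (changed) {
            loggedCount++;
            System.err.println("lifecycle error for [" + index + "]: " + description);
        }
        return changed;
    }

    int loggedCount() {
        return loggedCount;
    }
}
```

Because the recording and the logging live in one place, a caller that invokes the failure path does not need any logging of its own.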
Keith Massey 2ac55b2f50
Avoiding using getDefaultBackingIndexName in DataStreamLifecycleServiceIT (#99062)
Avoiding using getDefaultBackingIndexName() in DataStreamLifecycleServiceIT
because it causes failures if the test runs across midnight.
2023-08-31 09:26:20 -05:00
Andrei Dan 79d829be66
[Tests] Use fewer DS generations and manually rollover DS (#99085) 2023-08-31 13:30:54 +01:00
Simon Cooper e1f353c2cf
Convert even more index created version to IndexVersion (#99088) 2023-08-31 13:08:26 +01:00
David Turner a20ee3f8f2
Migrate simple usages of ThreadPool#schedule (#99051)
In #99027 we deprecated the string-based version of
`ThreadPool#schedule`. This commit migrates all the simple usages of
this API to the new version.
2023-08-31 07:37:31 +01:00
Albert Zaharovits d3f799c879
Free up allocated buffers in Netty4HttpServerTransportTests (#99005)
Closes #98869
2023-08-30 11:03:25 +03:00
Iraklis Psaroudakis 9653da8d45
Fix testDeleteByQueryOnReadOnlyAllowDeleteIndex (#99002)
Previous PR #87841 introduced a change where disabling the disk
allocation decider makes a single check to remove existing blocks.

This created a race condition with the test that puts a block that it
expects to not be removed after disabling the disk allocation decider.

A simple fix is to execute the single check before putting the block.

Closes #98855
2023-08-29 20:03:07 +03:00
Benjamin Trent 4b5585e428
Fix tests after indexing dense vectors by default (#98946)
In 8.11, dense vectors will be indexed by default. However, various
tests that relied on the previous behavior were not updated.

This PR updates those tests.

Related: https://github.com/elastic/elasticsearch/pull/98268
2023-08-29 11:06:02 -04:00
Armin Braun e4de4021fc
Remove redundant writeList from StreamOutput (#98971)
Noticed this when benchmarking FieldCaps transport messages.
The `writeList` alias just adds more lines to the code and makes
profiling more annoying to read, let's remove it.
2023-08-29 16:00:59 +02:00
Francisco Fernández Castaño f6a2b5c9ef
Add bulk delete method to BlobStore interface and implementations (#98948) 2023-08-29 12:25:03 +02:00
Andrei Dan 01686a8093
Add downsampling parser support for PUT _lifecycle API (#98910) 2023-08-29 10:58:32 +01:00
Albert Zaharovits f85ba38ba2
RCS 2.0 Drop invalid connections to the fulfilling cluster node (#98814)
The RCS 2.0 channel only accepts requests that are
authenticated by cross cluster search type-of API Keys.

Co-authored-by: Yang Wang <ywangd@gmail.com>
2023-08-28 22:45:17 +03:00
Benjamin Trent d09cb767a9
Fix percolator query for stored queries that expand on wildcard field names (#98878)
An optimization introduced in:
https://github.com/elastic/elasticsearch/pull/81985 changed percolator
query behavior.

Users can specify a percolator query which expands fields based on a
wildcard pattern. Just one example is `simple_query_string`, which
allows field names like `"text_*"`. The user expects that this field
name will expand to relevant mapped fields (e.g. "text_foo"). However,
if there are no documents indexed in those fields at the time when the
percolator query is indexed, it doesn't expand to the relevant fields.

Additionally, at query time we may skip expanding fields and not match
the relevant mapped fields if they are considered "empty" (e.g. they have no
values in the shard). We should instead allow expansion by indicating
that the field may exist in the shard.

closes: https://github.com/elastic/elasticsearch/issues/98819
2023-08-28 09:19:28 -04:00
Ignacio Vera 424a4c6d71
Hide IndexSearcher in AggregatorTestCase (#98924)
Hide the creation of the index searcher from the implementers by changing the signature of 
AggregatorTestCase#searchAndReduce and AggregatorTestCase#createAggregationContext to take
an IndexReader instead of an IndexSearcher.
2023-08-28 16:29:21 +08:00
Andrei Dan b11d552f95
Initial data stream lifecycle support for downsampling (#98609)
This adds data stream lifecycle service implementation support
for downsampling.
Time series backing indices for a data stream with a lifecycle
that configures downsampling will be marked as read-only,
downsampled, removed from the data stream, replaced with the
corresponding downsample index, and deleted.

Multiple rounds can be configured for a data stream, and the
latest matching round will be the first one to be executed.
If one downsampling operation is in progress, we wait until it's
finished and then we start the next downsampling operation.
Note that in this scenario a data stream could have the following
backing indices:
```
[.ds-metrics-2023.08.22-000002, downsample-10s-.ds-metrics-2023.08.22-000001]
```

If this data stream has multiple rounds of downsampling configured,
the first generation index will subsequently be downsampled again
(and again).
2023-08-26 12:53:51 +01:00
Aleh Zasypkin 77f3924baa
Revert "Kibana system index does not allow user templates to affect it (#98696)" (#98888)
* Revert "Kibana system index does not allow user templates to affect it (#98696)"

This reverts commit 22393215e7.

* Update docs/changelog/98888.yaml
2023-08-25 19:36:19 +02:00
Andrei Dan 712a5ce145
Cap the DSL recorded errors to 1000 chars (#98841)
We've seen very large generated exceptions (>5MB). As the Data Stream
Lifecycle error store is in-memory this caps the recorded error message
for each index to 1000 chars.
2023-08-25 18:24:06 +01:00
Martijn van Groningen 0ba4e75a9c
Trim stored fields for _id field in tsdb (#97409)
And in the fetch phase synthesize _id on the fly.

The _id is composed out of a hash of routing fields, tsid and timestamp. These are all properties that can be retrieved from doc values and used to generate the _id on the fly.
2023-08-25 09:46:08 +07:00
Mary Gouseti b9b818e28e
Allow explain data stream lifecycle to accept a data stream. (#98811)
Currently the `GET target/_lifecycle/explain` API only works for
indices. In this PR we extend this behaviour to allow the target to be a
data stream so we can get the overview lifecycle status for all the
backing indices of a data stream.
2023-08-24 06:29:09 -04:00
Albert Zaharovits dc373f1cfc
Add request header size limit for RCS transport connections (#98692)
Add and enforce request header size limits for the transport port of RCS 2.0 connections.

Co-authored-by: Yang Wang <ywangd@gmail.com>
2023-08-23 13:08:56 +03:00
Rudolf Meijering 22393215e7
Kibana system index does not allow user templates to affect it (#98696)
* Kibana system index does not allow user templates to affect it

* Spotless and compilation fix

---------

Co-authored-by: Athena Brown <athena.brown@elastic.co>
2023-08-23 12:07:00 +02:00
Simon Cooper b67a9e1ec3
Move text references to index created version to IndexVersion (#98727) 2023-08-23 10:51:56 +01:00
David Turner e4af2bfe92
Add TTL on S3 CAS uploads (#98664)
Compare-and-swap operations on an S3 repository are implemented using
multipart uploads. Today, to try to avoid collisions, we refuse to
perform a compare-and-swap if there are other concurrent uploads in
progress. However, this means that a node which crashes partway through a
compare-and-swap will block all future register operations.

With this commit we introduce a time-to-live on S3 multipart uploads,
such that uploads older than the TTL now do not block future
compare-and-swap attempts.
2023-08-23 06:51:58 +01:00
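A rough sketch of the TTL rule from #98664 follows. It is only an illustration under stated assumptions: the class and method names are hypothetical, the TTL value is chosen by the caller, and the real implementation works against S3 multipart upload listings rather than a list of timestamps.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical model: an in-flight upload blocks a new compare-and-swap
// only while it is not older than the TTL.
final class CasUploadTtlSketch {

    // An upload is stale once more than `ttl` has elapsed since it was initiated.
    static boolean isStale(Instant initiated, Instant now, Duration ttl) {
        return Duration.between(initiated, now).compareTo(ttl) > 0;
    }

    // A compare-and-swap may proceed when every other upload is stale.
    static boolean canAttemptCompareAndSwap(List<Instant> otherUploads, Instant now, Duration ttl) {
        for (Instant initiated : otherUploads) {
            if (isStale(initiated, now, ttl) == false) {
                return false; // a recent upload may still belong to a live node
            }
        }
        return true;
    }
}
```

With this rule, an upload abandoned by a crashed node stops blocking register operations once it ages past the TTL.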
Keith Massey 852d4eb042
Simplifying testGeoIpDatabasesDownloadNoGeoipProcessors in order to avoid race condition in cleanup (#98756) 2023-08-22 16:50:13 -05:00
David Turner 27163303ef
Accept empty chunked responses (#98750)
Today we assert that every chunked response contains at least one chunk.
In practice it is sometimes useful to yield no chunks too. This commit
permits a zero-chunk chunked response, by converting it into a regular
unchunked response at the start.
2023-08-22 20:19:52 +01:00
Keith Massey 4a69793f1d
Changing GeoIpDownloaderIT pipeline creation methods to always create pipelines (#98751) 2023-08-22 13:37:52 -05:00
David Turner a8c4559492 Revert "Accept empty chunked responses (#98733)"
This reverts commit ae52a06a51.
2023-08-22 18:35:32 +01:00
David Turner ae52a06a51
Accept empty chunked responses (#98733)
Today we assert that every chunked response contains at least one chunk.
In practice it is sometimes useful to yield no chunks too. This commit
permits a zero-chunk chunked response, by converting it into a regular
unchunked response at the start.
2023-08-22 18:11:18 +01:00
Simon Cooper b0c0d80f03
Move conversion of minimum index compatibility to IndexVersion (#98685) 2023-08-22 16:23:09 +01:00
Matteo Piergiovanni e719057209
Explicit parsing object capabilities of FieldMappers (#98684)
When the `subobjects` property is set to false and we encounter an object
while parsing, we need a way to tell whether its FieldMapper is able to
parse an object. If it is, we can provide the entire object to
the FieldMapper; otherwise its name becomes part of the dotted field
name of each internal value.

This has been achieved by adding the `supportsParsingObject()` method
to the `FieldMapper` class. This method defaults to `false`, since the
majority of FieldMappers do not support parsing objects, and is
overridden to return `true` by the ones that do support objects.
2023-08-22 10:16:59 +02:00
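The default-plus-override pattern described in #98684 can be sketched as follows. Only the method name `supportsParsingObject()` comes from the commit message; the simplified `FieldMapper`, the `ObjectAwareMapper` subclass, and the `targetFields` helper are hypothetical stand-ins, not the real mapper classes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the real mapper hierarchy.
final class ObjectParsingSketch {

    static class FieldMapper {
        final String name;

        FieldMapper(String name) {
            this.name = name;
        }

        // Most mappers cannot consume a whole object, so the default is false.
        boolean supportsParsingObject() {
            return false;
        }
    }

    // A mapper that can consume a whole object overrides the default.
    static class ObjectAwareMapper extends FieldMapper {
        ObjectAwareMapper(String name) {
            super(name);
        }

        @Override
        boolean supportsParsingObject() {
            return true;
        }
    }

    // With subobjects disabled: either hand the whole object to the mapper,
    // or flatten it into dotted field names, one per inner key.
    static List<String> targetFields(FieldMapper mapper, List<String> innerKeys) {
        List<String> targets = new ArrayList<>();
        if (mapper.supportsParsingObject()) {
            targets.add(mapper.name);
        } else {
            for (String key : innerKeys) {
                targets.add(mapper.name + "." + key);
            }
        }
        return targets;
    }
}
```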
David Turner dadaaa8315 AwaitsFix for #98712 2023-08-22 09:11:21 +01:00
Christoph Büscher 207a995fce
Use newSearcher instead of new IndexSearcher in tests where possible (#98110)
This change swaps test code that directly creates IndexSearcher instances with LuceneTestCase#newSearcher calls
that have the advantage of randomly using concurrency and also randomly use assertion wrappers internally.
While this doesn't guarantee testing the concurrent code path, it should generally increase the likelihood of doing so.
2023-08-22 10:49:21 +07:00
Andrei Dan 01ed7de99f
GA the data stream lifecycle (#98644)
This makes the data stream lifecycle generally available. This will allow
data streams to take advantage of a native simplified and resilient
lifecycle implementation.
2023-08-21 17:28:54 +01:00
Simon Cooper 041e94d2b0
More migrations of Version for index created version to IndexVersion (#98495) 2023-08-21 16:47:58 +01:00
Simon Cooper 5271b9a9e9
Refactor some uses of randomIndexCompatibleVersion to use IndexVersion (#98032) 2023-08-21 10:09:55 +01:00
David Turner 63c825ad9a AwaitsFix for #98539 2023-08-21 08:53:27 +01:00
William Brafford 874fbbc2f0
Add mappings version number to SystemIndexDescriptor (#97934)
This PR adds a mappings version number to system index descriptor, which is intended to replace the use of Version to signal changes in the mappings for the descriptor. This value is required for managed system indices.

Previously, most of the system index descriptors automatically incremented their mapping version with each release, which meant that the mappings would stay up-to-date with any additive changes. Now, developers will need to increment the mapping version whenever there's a change.

* Add MappingsVersion inner class to SystemIndexDescriptor
* Add mappings version to metadata in all system index mappings
* Rename version meta key ('system' -> 'managed')
* Update mappings for ML indices if required
* Trigger ML index mappings updates based on new index mappings version.

---------

Co-authored-by: Ed Savage <ed.savage@elastic.co>
2023-08-17 00:03:12 -04:00
James Baiera 7d990d5a09
Allow custom geo ip database files to be downloaded (#97850)
This PR extends the assumptions we make about database file availability to all database file 
names instead of the default ones we host at Elastic. When creating a geo ip processor with 
a database name that is not recognized we unilaterally convert the processor to one that 
tags documents with a missing database message until the database file requested is 
downloaded or provided via the manual configuration route. This allows a pipeline to be 
created and for the download service to be started, potentially sourcing the needed files.

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2023-08-16 00:31:51 -04:00
tmgordeeva 171bcbb3e1
Mapped field types searchable with doc values (#97724)
* Mapped field types searchable with doc values

When using TSDB, time series metrics aren't indexed but do have doc values.
Field caps should report those fields as searchable.
2023-08-15 19:29:49 -07:00
Simon Cooper a830787b07
Bulk migration of Version.CURRENT for index created to IndexVersion.current() (#98490) 2023-08-15 13:47:27 +01:00
Keith Massey e9dc8e1d10
Not checking index.time_series.end_time in 150_tsdb's `created the data stream` test because it changes (#98458) 2023-08-14 14:26:36 -05:00
Keith Massey c9aa3b1b77
Fix assertions in GeoIpDownloaderIT.testGeoIpDatabasesDownloadNoGeoipProcessors() (#98413) 2023-08-11 17:07:17 -05:00
Mary Gouseti e71ea6e6d7
Add data stream lifecycle by default (#97823)
In this PR we enable all new data streams to be managed by the data
stream lifecycle by default. This is implemented by adding an empty
`lifecycle: {}` upon new data stream creation. 

Opting out is represented by the `enabled` flag:

```
{
  "lifecycle": {
    "enabled": false
  }
}
```

This change has the following implications for when an index is managed
and by which feature:

| Parent data stream lifecycle | ILM | `prefer_ilm` | Managed by |
|------------------------------|-----|--------------|------------|
| default | yes | true | ILM |
| default | yes | false | data stream lifecycle |
| default | no | true/false | data stream lifecycle |
| opt-out or missing | yes | true/false | ILM |
| opt-out or missing | no | true/false | unmanaged |

Data streams that have been created before the data stream lifecycle is
enabled will not have the default lifecycle.

Next steps:
- We need to document this when the feature becomes GA
(https://github.com/elastic/elasticsearch/issues/97973).
2023-08-11 06:28:37 -04:00
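The management rules listed in #97823 can be written as a small decision function. This is a sketch derived only from the table in the commit message; the enum, the method, and its boolean parameters are hypothetical names, with `lifecycleEnabled == false` covering both the "opt-out" and "missing" cases.

```java
// Hypothetical encoding of the "Managed by" table from the commit message.
final class ManagedBySketch {

    enum ManagedBy { ILM, DATA_STREAM_LIFECYCLE, UNMANAGED }

    static ManagedBy managedBy(boolean lifecycleEnabled, boolean hasIlmPolicy, boolean preferIlm) {
        if (lifecycleEnabled) {
            // With a (default) lifecycle, ILM wins only if a policy exists and prefer_ilm is true.
            return hasIlmPolicy && preferIlm ? ManagedBy.ILM : ManagedBy.DATA_STREAM_LIFECYCLE;
        }
        // Lifecycle opted out or missing: prefer_ilm no longer matters.
        return hasIlmPolicy ? ManagedBy.ILM : ManagedBy.UNMANAGED;
    }
}
```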
Armin Braun c203442c48
Remove some dead-code and duplication from o.e.transport (#98321)
Removing some unused things and deduplicating connection lookup failure
handling in TransportService.
2023-08-09 21:35:51 +02:00
Keith Massey 6d545e883b
Storing nodeId explicitly in TaskInfo (#98300)
This makes the node ID an explicit part of the TaskInfo object in the response of GetTaskAction. This way it can be filtered out of the response without having to alter the taskId.
2023-08-09 09:07:34 -05:00
Tim Vernum 6fa5c61cf2
Support operator users in LocalClusterSpec (#97340)
This adds a new parameter to LocalClusterSpecBuilder.user(..) to
indicate whether the user should be an operator or not.

Users added with the simplified user(username, password) method are
operators (and have the "_es_test_root" role).
2023-08-09 02:38:58 -04:00
Jim Ferenczi 28a504d7a1
Use the Weight#matches mode for highlighting by default (#96068)
This PR adapts the unified highlighter to use the Weight#matches mode by default when possible.
This has been the default mode in Lucene for some time now. For cases where the matches mode won't work (nested and parent-child queries),
the matches mode is disabled automatically.
I didn't expose an option to explicitly disable this mode because it should be seen as an internal implementation detail.
With this change, matches that span multiple terms are highlighted together (something that users have requested for years) and the clauses that don't match the document are ignored.
2023-08-09 10:44:38 +09:00
David Turner 1d3a3c8887
Implement describeTasks in UpdateTimeSeriesRangeService (#98198)
These tasks don't have a meaningful `toString()`, but also there's no
real need to list them out at all.
2023-08-07 06:56:18 +01:00
Carlos Delgado 8e64359fb1
Change cluster stats synonyms keys (#98126) 2023-08-04 17:28:12 +02:00
Yang Wang 93ba27697e
Collect additional object store stats for S3 (#98083)
This PR adds additional stats collections for S3, including Delete and
Abort. It also fixes an issue where ListNextBatchObject is not metered.
2023-08-03 06:31:04 -04:00
Alan Woodward 2d3b6b1afd
GlobalAggregator should call rewrite() before createWeight() (#98091)
Some queries need to be rewritten before they support Weights; TermsQuery
has recently changed to require this, for example. The GlobalAggregator was
not calling rewrite() on possible alias filters, meaning that an index alias with
a terms filter would cause an exception when combined with a global agg.

Fixes #98076
2023-08-01 15:22:50 +01:00
Przemyslaw Gomulka 999489ce04
Infrastructure to report upon document parsing (#97961)
In serverless we would like to report on (meter and bill for) document ingestion. The metering should be agnostic to the document format (the document structure should be normalised), hence we should allow creating XContentParsers that keep track of parsed fields and values.
There are 2 places where the parsing of an ingested document happens:
1. in the 'raw bulk' path, when a request is sent without pipelines
2. in the 'ingest service', when a request is sent with pipelines
(parsing can occur twice when dynamic mappings are calculated; this PR takes this into account and prevents double billing)
We also want to make sure that the metering logic is not unnecessarily executed when a document was already reported. That is, if a document was reported in IngestService, there is no point wrapping the XContentParser again.

This commit introduces `DocumentReporterPlugin`, an internal plugin that will be implemented in serverless. This plugin should return a `DocumentParsingObserver` supplier, which will create a `DocumentParsingObserver`. A DocumentParsingObserver is used to wrap an `XContentParser` with an implementation that keeps track of parsed fields and values (performs the metering) and allows sending that information, along with an index name, to a MeteringReporter.
2023-08-01 13:55:18 +02:00
Simon Cooper 3a34d4a15c
Move over most remaining uses of indexSettings(Version) to IndexVersion (#98004) 2023-07-28 09:27:04 +01:00
Carlos Delgado c0a99baef5
Add synonyms sets information to cluster stats (#97900) 2023-07-27 21:25:24 +02:00
Simon Cooper c78eef86fa
Refactor non-testcase uses of index created version with release version (#98007)
Change uses of Version to IndexVersion to specify the index created version
2023-07-27 15:31:19 +01:00
David Turner f4e3113ad0
Fork remote-cluster response handling (#97922)
Today all responses to remote-cluster requests are deserialized and
handled on the transport worker thread. Some of these responses can be
sizeable, so with this commit we add the facility for callers to specify
a different executor to handle this work. It also adjusts several
callers to use more appropriate threads, including:

- responses from CCR-related admin actions are handled on `ccr`
- responses from field caps actions are handled on `search_coordination`
2023-07-27 10:26:17 +01:00
David Turner f12070e9e9
Replace stream().map().iterator() with Iterators#map (#97964)
This is a common pattern, particularly in chunked XContent encoding, but
working with streams is surprisingly expensive. This commit replaces it
with a much simpler utility that works directly on iterators.

It also simplifies the cases where we avoid creating a stream by using
`Iterators.flatMap(..., x -> Iterators.single(...))`.
2023-07-27 04:04:10 -04:00
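The idea behind #97964 is that a mapping iterator needs no stream machinery at all. The sketch below is a minimal, self-written equivalent and not the actual `Iterators#map` source; its signature is an assumption.

```java
import java.util.Iterator;
import java.util.function.Function;

// Minimal mapping iterator: wraps the input and applies `fn` lazily on next().
final class IteratorMapSketch {

    static <T, R> Iterator<R> map(Iterator<? extends T> input, Function<? super T, ? extends R> fn) {
        return new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return input.hasNext();
            }

            @Override
            public R next() {
                return fn.apply(input.next());
            }
        };
    }
}
```

A call such as `IteratorMapSketch.map(list.iterator(), x -> x * 2)` replaces `list.stream().map(x -> x * 2).iterator()` and allocates a single wrapper object instead of a stream pipeline.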
Henning Andersen 86c4378121
Fix serverless scope for reindex and forcemerge (#97927)
Reindex should be available. Marked force merge internal (might become unavailable eventually).
2023-07-26 21:13:17 +02:00
Ryan Ernst 753bbc89b4
Allow more parts of Build to be filled in (#97824)
Build represents information about the running build of ES. It has
some methods that assume some structure to the existing version.
Additionally MainResponse contains Version just to be able to retrieve
the min compat versions.

This commit makes all of these bits of info part of the state of Build.
It also removes Version from MainResponse as it is no longer necessary.
2023-07-26 09:31:13 -07:00
Carlos Delgado 375991d974
Remove synonyms feature flag and related classes (#97962) 2023-07-26 18:13:06 +02:00
Simon Cooper 03e92b30df
Bulk migrate uses of settings(Version) to settings(IndexVersion) (#97528)
Mechanically change uses of Version.CURRENT to IndexVersion.current()
2023-07-25 12:25:56 +01:00
Ryan Ernst ca468f00b8
Fix main response to return build flavor (#97857)
Build flavor was added back in #97770, but the main endpoint response
is still hardcoded to "default". This commit updates the response to
return the flavor from the build info.
2023-07-20 15:15:18 -07:00
Mary Gouseti 1162ad6e20
Rename the DataStreamLifecycle isEnabled to isFeatureEnabled (#97816) 2023-07-19 21:07:33 +03:00
Ryan Ernst 18329b0c82
Add back build flavor in build info (#97770)
The Build class keeps information about the running build of
Elasticsearch. There was previously a build flavor, which has since been
narrowed to just return the "default" flavor. This commit adds flavor
back to Build so that a plugged in Build can include a change to flavor.
2023-07-19 11:38:19 -04:00
Mary Gouseti 430cde6909
Improve the use of DataStreamLifecycle.Builder (#97744)
With the upcoming expansion of the data stream lifecycle configuration
(for example, downsampling), it will be helpful to reduce the
number of constructors and mainly use the builder to initialise the
DataStreamLifecycle object.
2023-07-19 10:13:57 -04:00
Ryan Ernst 3f8f7182be
Remove index version and transport version from main endpoint response (#97675)
The index and transport versions are low level details of how a node
behaves in a cluster. They were recently added to the main endpoint
response, but they are too low level and should be moved to another
endpoint TBD.

This commit removes those versions from the main endpoint response. Due
to the fact lucene version is now derived from index version, this
commit also adds an explicit lucene version member to the main response.
2023-07-18 06:36:46 -07:00
Andrei Dan 22bc45a82f
Change default value for data stream lifecycle poll interval to 5 mins (#97583) 2023-07-17 06:52:45 -04:00
Carlos Delgado a5a7525e48
Synonyms - Prevent Synonym Set Delete when indices are using it (#97622) 2023-07-14 20:35:13 +02:00
Mayya Sharipova 321110ef52
Add synonyms feature for test clusters (#97658)
Fix for #97334 where incorrect feature name was provided.

Correct more instances of synonyms_feature_flag_enabled for synonyms_api_feature_flag_enabled

Closes #96641, #97177
2023-07-13 11:34:25 -04:00
Carlos Delgado 2412adf0fa
Fix synonyms API analyzers reloading (#97621) 2023-07-13 14:28:28 +02:00
Andrei Dan 0c043dca60
[Tests] Generate max 10 downsampling rounds (#97615) 2023-07-12 16:00:29 +01:00
eyalkoren 3d36b08d28
Fix `fields` API with `subobjects: false` (#97092) 2023-07-12 11:35:18 +03:00
Mary Gouseti 25fc81e851
Enable dlm feature flag in data stream tests (#97596)
When we moved the data stream lifecycle code from the `dlm` module to the
`data-streams` one
(https://github.com/elastic/elasticsearch/pull/97345/files#diff-ae6caffc7a2ab209609293083f2fef0e80c453c21556bb7a703cb7013a4bdda4),
we forgot to enable the feature flag for all the tests.

This PR fixes it and consequently fixes #97570

Closes: #97570
2023-07-12 04:11:53 -04:00
Andrei Dan c1d81011c5
Limit number of downsampling rounds to 10 (#97428) 2023-07-11 09:24:08 +01:00
Michael Peterson 6dd1841dbc
Allow users to run the painless execute API on a remote cluster shard (#97335)
Added a clusterAlias to the Painless execute Request object, so that index
expressions in the request of the form "myremote:myindex" will be parsed to
set clusterAlias to "myremote" and the index to "myindex".

If clusterAlias is null, then it is executed against a shard on the local cluster, as before.
If clusterAlias is non-null, then the SingleShardTransportAction is sent to the remote cluster,
where it will run the full request (doing remote coordination). Note that the new clusterAlias
field is not Writeable, so when the request is sent to the remote cluster, that cluster will only see the index
name, not the clusterAlias (which it wouldn't know how to handle correctly).

Added PainlessExecuteIT test that tests cross-cluster calls

Updated painless-execute-script end user docs to indicate support for cross-cluster executions
2023-07-10 12:27:00 -04:00
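The split of `"myremote:myindex"` into a cluster alias and an index name, as described in #97335, might be sketched as below. This is a deliberately naive illustration: the class and field names are hypothetical, it splits on the first `:` only, and it ignores the validation and special cases that real index-expression parsing has to handle.

```java
// Naive sketch of splitting "<clusterAlias>:<index>"; not the Elasticsearch parser.
final class RemoteIndexExpressionSketch {

    static final class Parsed {
        final String clusterAlias; // null means "run on the local cluster"
        final String index;

        Parsed(String clusterAlias, String index) {
            this.clusterAlias = clusterAlias;
            this.index = index;
        }
    }

    static Parsed parse(String expression) {
        int separator = expression.indexOf(':');
        if (separator < 0) {
            return new Parsed(null, expression); // no alias: local execution, as before
        }
        return new Parsed(expression.substring(0, separator), expression.substring(separator + 1));
    }
}
```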
Armin Braun c006d10572
Fix inefficient ActionListener.delegateFailure and ActionListener.wrap usage (#97374)
ActionListener.delegateFailure is not the optimal choice when the
response handler always invokes delegate.onResponse, it's just
needlessly verbose, error-prone and sometimes less efficient when
multiple map/delegation calls are chained. ActionListener.wrap is just a
less efficient version of ActionListener.delegateAndWrap when the error
handler is `listener::onFailure` -> fixup a couple of these as well.
2023-07-10 11:12:06 -04:00
Alan Woodward cacb0e2197
Fix mixed-cluster test failure for top_hits aggregation (#97515)
The bug that this test checks is fixed is present in 8.8, so a
mixed-cluster test using 8.8 will fail.
2023-07-10 09:12:13 -04:00
Alan Woodward a426d36a6f
Set new providers before building FetchSubPhaseProcessors (#97460)
The FetchPhase sets source and field providers on its SearchLookup instance
to ensure that we only have to load stored fields once per document. However,
these were being set after subprocessors were constructed, and the
FetchFieldsPhase subprocessor pulls a reference to the SearchLookup at
construction time. This meant that the FieldsFetcher was not using these
providers, instead using the standard source and fields lookup, in effect meaning
that fields access would load source and fields separately.

This would mostly just be an issue with performance; however, the top hits
aggregation calls the fetch phase multiple times, with the result that the second
and subsequent buckets in a top hits aggregation would see the source lookup
from the previous bucket when building their FetchFieldsPhase subprocessor.

This commit moves the call to SearchExecutionContext#setLookupProviders()
in FetchPhase#buildSearchHits() before the call to build fetch phase subprocessors,
ensuring that the fetch fields phase sees the correct SearchLookup.

Fixes #96284
2023-07-10 11:53:48 +01:00
Simon Cooper 532e346e4a
Some more migrations of Version to IndexVersion (#97363) 2023-07-10 11:40:32 +01:00
Jim Ferenczi bccf4eeed2
Add the ability to reload search analyzers during shard recovery (#97421)
This change adds the ability for reloadable search analyzers to adapt their loading based on
the index creation context. It is useful for reloadable search analyzers that need to load
expensive resources from indices or disk. In such cases they can defer loading the
resource until shard recovery and avoid blocking a master or a create index thread.
---------

Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
2023-07-07 17:19:47 +01:00
Simon Cooper 572716f7cc
Add IndexVersion to MainResponse (#97386) 2023-07-07 11:27:54 +01:00
Matt Culbreth d0faea3aaa
Change "rollver" to "rollover" in error message (#97436) 2023-07-06 20:31:32 -04:00
Keith Massey cdb52c67ae
Handling IOException from Windows in cleanup method of GeoIpDownloaderIT (#97433) 2023-07-06 16:32:07 -05:00
Ryan Ernst 568b292bde
Encapsulate current Build (#97292)
In order for build info to be pluggable for serverless, the current
build needs to be lazily determined. This commit moves the CURRENT
constant to a static method.
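
A minimal sketch of replacing a constant with a lazily resolved static method. `BuildInfo`, `setProvider` and the version strings are hypothetical names used for illustration only; the actual `Build` class may be structured differently.

```java
import java.util.function.Supplier;

// Hypothetical sketch: the current build is resolved on first access
// instead of being fixed in a static constant at class-initialization time.
final class BuildInfo {
    private static Supplier<BuildInfo> provider = () -> new BuildInfo("default");
    private static BuildInfo current;

    private final String version;

    BuildInfo(String version) {
        this.version = version;
    }

    String version() {
        return version;
    }

    // Lets an extension plug in its own build info before first use.
    static synchronized void setProvider(Supplier<BuildInfo> newProvider) {
        if (current != null) {
            throw new IllegalStateException("current build already resolved");
        }
        provider = newProvider;
    }

    // Replaces a former CURRENT constant; the value is computed once, lazily.
    static synchronized BuildInfo current() {
        if (current == null) {
            current = provider.get();
        }
        return current;
    }
}
```

Callers switch from reading a field to calling `current()`, which leaves room to decide the value after class initialization.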

relates #96861
2023-07-06 11:38:54 -07:00
eyalkoren 9ba40d474c
Fix timestamp as object at root level for APM Server (#97401) 2023-07-06 17:27:14 +03:00
Andrei Dan cbedfbaea0
Fix test flakiness with a deterministic scenario (#97415) 2023-07-06 12:02:29 +01:00
Martijn van Groningen 55561588f5
Fix mapping parsing logic to determine whether synthetic source is active. (#97355)
Take index mode into account during parsing of the mapping when
determining whether source is synthetic.

Fixes #97320
2023-07-06 06:42:14 -04:00
Mary Gouseti a432313ff3
Data stream lifecycle class names (#97381) 2023-07-05 12:28:32 +03:00
Simon Cooper 5728d15d15
Migrate geoshape tests to IndexVersion (#97365) 2023-07-05 09:07:10 +01:00
Rene Groeschke b8627079b4
Update Gradle Wrapper to 8.2 (#96686)
- Convention usage has been deprecated and was fixed in our build files
- Fix test dependencies and deprecation
2023-07-04 15:35:15 +02:00
Simon Cooper b6a4a7cded
Migrate preconfigured components and analyzer caches to IndexVersion (#97319) 2023-07-04 13:45:07 +01:00
Simon Cooper 55cf37cedd
Migrate Snapshot repository version to IndexVersion (#97226) 2023-07-04 11:42:46 +01:00
Andrei Dan d55ae09006
Remove redundant read of the data stream write index (#97277)
This code works on one cluster state version. We read the `DataStream`
from the same cluster state twice, so the write index will always be the
same (as we're not reading the `DataStream` from a new cluster state
version).

This PR drops a redundant (and confusing) reference to the same write index
of the data stream.
2023-07-04 06:37:10 -04:00
Mary Gouseti e978ab3f96
Move data stream lifecycle code to the data stream plugin (#97345) 2023-07-04 12:44:08 +03:00
Mary Gouseti b09279f568
Move data stream java rest tests from legacy to internal. (#97315) 2023-07-04 11:13:08 +03:00
Pablo Alcantar Morales c6c31f918a
fix flaky test (#97324) 2023-07-03 17:17:26 +02:00
Armin Braun 63e64ae61b
Cleanup Stream usage in various spots (#97306)
Cleans up lots of spots where we did weird things around streams: redundant stream creation, redundant collecting
before adding the collected elements to another collection, redundant streams for joining strings,
use of the less efficient `Collectors.toList`, and in a few cases incorrectly relying on the result being mutable.
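
The patterns listed above can be illustrated with a small sketch. The method names are made up for this example, and `Stream.toList()` assumes Java 16 or later; the actual replacements made in the commit may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class StreamCleanupDemo {
    // Before: a stream is created only to join strings.
    static String joinVerbose(List<String> names) {
        return names.stream().collect(Collectors.joining(","));
    }

    // After: String.join does the same without a stream.
    static String joinDirect(List<String> names) {
        return String.join(",", names);
    }

    // Before: collect into a temporary list, then copy it into the target.
    static List<Integer> lengthsVerbose(List<String> names) {
        List<Integer> result = new ArrayList<>();
        result.addAll(names.stream().map(String::length).collect(Collectors.toList()));
        return result;
    }

    // After: Stream.toList() returns an unmodifiable list directly,
    // so callers must not rely on being able to mutate the result.
    static List<Integer> lengthsDirect(List<String> names) {
        return names.stream().map(String::length).toList();
    }

    // Reports whether the list rejects modification.
    static boolean isUnmodifiable(List<Integer> list) {
        try {
            list.add(0);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```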
2023-07-03 14:24:57 +02:00
Armin Braun 81aee66925
Remove redundant Datastream.TimestampField class (#97298)
We can just inline all usage of this thing for now and keep the assertion
about the field name for parsing BwC.
-> saves some code and questions about the timestamp field
2023-07-03 10:39:41 +02:00
Armin Braun fdcecd9031
Fix Netty leak on event loop shutdown (#97301)
Follow up to #96856, turns out this wasn't safe to remove after all.
We can't go back to the previous solution though since that had double invocation issues
=> use notify-once for now until this is fixed in Netty.
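
A notify-once wrapper can be sketched as below. This is a generic illustration built on `java.util.function.Consumer`; the class name `NotifyOnce` is hypothetical and this is not the Netty or Elasticsearch code touched by the commit.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical wrapper that forwards at most one notification to its delegate.
class NotifyOnce<T> implements Consumer<T> {
    private final Consumer<T> delegate;
    private final AtomicBoolean done = new AtomicBoolean(false);

    NotifyOnce(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void accept(T value) {
        // Only the first caller wins the compare-and-set; later calls are dropped.
        if (done.compareAndSet(false, true)) {
            delegate.accept(value);
        }
    }

    // Calls the wrapper several times and reports how often the delegate ran.
    static int countInvocations(int calls) {
        int[] counter = new int[1];
        NotifyOnce<String> once = new NotifyOnce<>(v -> counter[0]++);
        for (int i = 0; i < calls; i++) {
            once.accept("done");
        }
        return counter[0];
    }
}
```

Wrapping a listener this way tolerates a double invocation without dropping the first one, which matches the trade-off described above.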
2023-07-02 11:30:30 +02:00
Mary Gouseti b1069e66db
Rename origin text from lifecycle to data_stream_lifecycle (#97230) 2023-06-29 17:38:23 +03:00
Mary Gouseti f87c2c7758
Introduce downsampling configuration for data stream lifecycle (#97041)
This PR introduces downsampling configuration to the data stream lifecycle. Keep in mind downsampling implementation will come in a follow up PR. Configuration looks like this:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": [
      {
        "after": "1d",
        "fixed_interval": "2h"
      },
      { "after": "15d", "fixed_interval": "1d" },
      { "after": "30d", "fixed_interval": "1w" }
    ]
  }
}
```
We will also support using `null` to unset downsampling configuration during template composition:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": null
  }
}
```
2023-06-29 16:41:17 +03:00
Simon Cooper 94df6f2a74
Fix possible NPE when transportversion is null in MainResponse (#97203) 2023-06-29 08:31:56 +01:00
Michael Peterson 84c295cbbd
Improve error message for search.check_ccs_compatibility=true mode flag (#97059)
The code to have the better error message is already present.
But the current tests are not set up in a way to reveal that.
This adjusts the testing FailBeforeCurrentQueryBuilder to demonstrate it.

To show this via manual testing, I changed the MatchAllQueryBuilder to return
a TransportVersion in the future from its getMinimalSupportedVersion method.

When I start elasticsearch with -E search.check_ccs_compatibility=true and
do a match_all query, the response is:

```
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "[class org.elasticsearch.action.search.SearchRequest] is not compatible with version 8500020 and the 'search.check_ccs_compatibility' setting is enabled. YY"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "[class org.elasticsearch.action.search.SearchRequest] is not compatible with version 8500020 and the 'search.check_ccs_compatibility' setting is enabled. YY",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "[match_all] was released first in version 8511132, failed compatibility check trying to send it to node with version 8500020"
    }
  },
  "status": 400
}
```

The underlying caused_by shows the exact query type (match_all) that was incompatible.
2023-06-28 11:27:20 -04:00
Simon Cooper 5486667d73
Convert snapshot version to IndexVersion (#96857) 2023-06-28 16:04:19 +01:00
Henning Andersen 145213ef39
Fix serverless scope for distrib APIs (#97175)
Mark APIs public/internal/unavailable as per prior discussion.
2023-06-28 15:09:18 +02:00
Volodymyr Krasnikov 6e7bc27622
Update Guava dependency version to 32.0.1-jre (#97033)
* Update Guava dependency version

* Update gradle verification metadata

* fix :repository-gcs:thirdPartyAudit

* fix :discovery-gce:thirdPartyAudit

* Use single sha256 hash in allowed list
2023-06-27 09:14:10 -07:00
Ievgen Degtiarenko d9b6c5ae29
Wire IndicesService to plugins (#97081)
This change exposes IndicesService to the plugins via Plugin#createComponents
2023-06-27 18:02:23 +02:00
Simon Cooper f554a6ab2d
Remove the DiscoveryNode constructor that takes a Version (#97136) 2023-06-27 16:41:45 +01:00
David Turner e9250e1c98
Remove unused SettingsUpgrader feature (#97127)
Plugins may supply a `SettingsUpgrader` which defines adjustments to
settings in an upgrade. The lifecycle of these adjustments is kinda
trappy: they run on a full-cluster-restart upgrade, but not in a rolling
upgrade, and they also apply upgrades to settings given in requests to
the cluster-settings update API.

In practice we have never needed these things very much, and all the
upgraders we've ever written involve versions which have been EOL for a
very long time. We have better ways to handle this kind of breaking
change today, so it'd be best to remove this feature.
2023-06-27 10:41:17 -04:00
Simon Cooper a873e26cf7
Convert IndexVersion.CURRENT to a method with a pluggable interface (#97132) 2023-06-27 14:47:32 +01:00
Armin Braun 1a34568a8b
Remove remaining redundant overrides (#97134)
Follow-up to #97130, removing all remaining redundant overrides outside
the server package.
2023-06-27 13:08:13 +02:00
Andrei Dan 3db25bdf63
Data stream lifecycle configures tail forcemerging (#97037)
This adds support for tail forcemerging in data stream lifecycle.
Before issuing the forcemerge (without the target segment specified) the
lifecycle service will aim to configure the floor segment merge policy
to `100mb` (configurable via a new cluster setting -
`data_streams.lifecycle.target.merge.policy.floor_segment`) and the merge
factor to `16` (configurable via a new cluster setting
`data_streams.lifecycle.target.merge.policy.merge_factor`).
2023-06-27 08:45:45 +01:00
David Turner 293e4e93b4
Remove unused TransportNodesAction#nodeResponseClass (#97091) 2023-06-26 05:57:00 -04:00
Ryan Ernst 16b45575c3
Fix Painless method lookup over unknown super interfaces (#97062)
In Java 21, List now extends SequencedCollection instead of extending
Collection directly. When resolving methods, Painless starts at the
defined type and iterates up through superclasses and interfaces.
Unfortunately, if a superinterface was not known, as is the case for
SequencedCollection since it is not in the allowed list of classes,
method resolution would give up. This commit adjusts the superinterface
iteration to continue traversing until the method is found or no more
superinterfaces are found.
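
The traversal idea can be sketched with plain reflection. This is not Painless code: `InterfaceLookup`, the nested `Base`/`Middle`/`Top` interfaces and the allowlist are hypothetical, and the sketch only walks interfaces, not superclasses.

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class InterfaceLookup {
    interface Base {
        int size();
    }

    // Plays the role of an interface that is missing from the allowlist.
    interface Middle extends Base {}

    interface Top extends Middle {}

    // Returns the first allowlisted type, breadth-first from start, that declares
    // the named method, or null. Types outside the allowlist are never returned,
    // but their superinterfaces are still visited.
    static Class<?> findDeclaringType(Class<?> start, Set<Class<?>> allowed, String methodName) {
        Deque<Class<?>> queue = new ArrayDeque<>();
        Set<Class<?>> seen = new HashSet<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            Class<?> type = queue.poll();
            if (!seen.add(type)) {
                continue;
            }
            if (allowed.contains(type) && declares(type, methodName)) {
                return type;
            }
            for (Class<?> parent : type.getInterfaces()) {
                queue.add(parent);
            }
        }
        return null;
    }

    private static boolean declares(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    // Looks up a method from Top with Middle deliberately left off the allowlist.
    static Class<?> lookupFromTop(String methodName) {
        Set<Class<?>> allowed = new HashSet<>();
        allowed.add(Top.class);
        allowed.add(Base.class);
        return findDeclaringType(Top.class, allowed, methodName);
    }
}
```

Giving up at `Middle` would miss `size()`; continuing past it reaches `Base`, which is on the allowlist.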

fixes #97022
2023-06-23 21:28:13 -04:00
Iraklis Psaroudakis 8ada557dae
Upgrade Netty to 4.1.94.Final (#97040)
Just staying up to date.
2023-06-23 19:50:35 +03:00
Salvatore Campagna 2798c49221
fix: incorrect date if now close to midnight (#97047)
If 'now' is between midnight and 2 a.m., going back 2 hours
(twoHoursAgo) results in a date that falls on the previous day. Here we
use a fixed instant to avoid the test failing depending on the value of
`now`.

We also use `twoHoursAgo` instead of `now` to generate the correct
filename even if, for the selected instant, this is not strictly
necessary (using `now` or `twoHoursAgo` results in the same day in the
filename).
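
The day rollover can be reproduced with `java.time`. The sketch below assumes dates are derived in UTC, which the message does not state, and `MidnightDemo` is a made-up name.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

class MidnightDemo {
    // Day (in UTC) of the instant two hours before the given one.
    static LocalDate dayTwoHoursBefore(Instant now) {
        Instant twoHoursAgo = now.minus(2, ChronoUnit.HOURS);
        return twoHoursAgo.atZone(ZoneOffset.UTC).toLocalDate();
    }

    // Day (in UTC) of the given instant.
    static LocalDate dayOf(Instant now) {
        return now.atZone(ZoneOffset.UTC).toLocalDate();
    }
}
```

With a fixed instant at 01:00 UTC the two helpers disagree by one day; at noon they agree, which is why a test based on the wall clock would only fail some of the time.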

Resolves #96672
2023-06-23 09:32:11 -04:00
Mayya Sharipova 11a3104a8c
Auto-reload analyzers for specific resource (#96986)
This PR adds a new optional parameter "resource" for ReloadAnalyzersRequest.
If used, only analyzers that use this specific "resource" will be reloaded.
This parameter is not documented and is for internal use only.

PR #96886 introduced auto-reload of analyzers on synonyms index change. The problem
was that reloading was applied broadly to all indices that contained reloadable
analyzers. This PR improves this, so that when a particular synonyms set changes,
only analyzers that use this synonyms set will be auto-reloaded. Note that shard
requests will still be sent to all index shards, as only on a shard can we
decide whether analyzers need to be reloaded.
2023-06-22 13:09:45 -04:00
Armin Braun dd7d381922
Dry up getting cluster admin client in tests (#96952)
Drying this up further and adding the same short-cut for single node
tests. Dealing with most of the spots that I could grab via automatic
refactorings.
2023-06-22 14:27:23 +02:00
Ryan Ernst e0ec5fc2ea
Fix serialization constant (#96981)
In #96900 a new transport version was added for additional information
in the main response. However, due to upstream conflicts, this transport
version had to be bumped a few times. In the process, the version
was bumped, but the condition was never updated. This commit fixes the
condition to use the version that was added.
2023-06-21 10:24:39 -04:00
Kostas Krikellas bd29706a84
Add tests for execution_hint in aggregations with percentiles. (#96927)
Setting `execution_hint` to `high_accuracy` leads to using a slower but
more accurate implementation for TDigest.

Add YAML tests to cover setting this param properly and parsing
errors.

Related to #95903
2023-06-21 08:29:24 -04:00
Mary Gouseti 3791323ff0
Update DLM security related code to use the new name (#96951) 2023-06-21 14:15:02 +03:00
Ryan Ernst 3d11f42888
Add transport version to main response (#96900)
The root endpoint for Elasticsearch contains diagnostic information
about what Elasticsearch is running. This commit adds TransportVersion
to this diagnostic information, alongside the existing version
information.
2023-06-20 16:36:04 -04:00
tmgordeeva 59c6621d24
Min score for time series (#96878)
* Min score for time series

Enables min score on time series aggregation.
2023-06-20 11:33:33 -07:00
Carlos Delgado 75729363e2
Synonyms API - PUT Synonym Rule request (#96865) 2023-06-20 18:45:41 +02:00
Felix Barnsteiner f69094f343
Trim field references in reroute processor (#96941) 2023-06-20 18:22:51 +02:00
tmgordeeva 0bd1803097
Error message for misconfigured TSDB index (#96956)
* Error message for misconfigured TSDB index

TSDB indices should always have synthetic source.

* Update docs/changelog/96956.yaml
2023-06-20 08:48:58 -07:00
Armin Braun 3f8ee82ef8
Use indices admin client shortcut in most integration tests (#96946)
Replacing the remaining usages that I could automatically replace
and a couple that I did by hand in this PR.
Also, added the same shortcut to the single node tests to save some
duplication there.
2023-06-20 13:32:59 +02:00
Mary Gouseti c8e676aea8
Rebranding user facing DLM references (#96866)
Part of bigger rebranding effort (#96875)
2023-06-20 14:32:24 +03:00
Mayya Sharipova b508ee7886
Auto-reload synonym analyzers on synonyms updates (#96886)
Synonym Management API project

On changes of synonyms in a synonym set, auto-reload analyzers.
Note that currently all updateable analyzers will be reloaded, even
those that are not relevant for a synonyms set being updated.
2023-06-19 10:50:02 -04:00
Kostas Krikellas cd3f84cdb3
Switch TDigestState to use HybridDigest by default (#96904)
* Initial import for TDigest forking.

* Fix MedianTest.

More work needed for TDigestPercentile*Tests and the TDigestTest (and
the rest of the tests) in the tdigest lib to pass.

* Fix Dist.

* Fix AVLTreeDigest.quantile to match Dist for uniform centroids.

* Update docs/changelog/96086.yaml

* Fix `MergingDigest.quantile` to match `Dist` on uniform distribution.

* Add merging to TDigestState.hashCode and .equals.

Remove wrong asserts from tests and MergingDigest.

* Fix style violations for tdigest library.

* Fix typo.

* Fix more style violations.

* Fix more style violations.

* Fix remaining style violations in tdigest library.

* Update results in docs based on the forked tdigest.

* Fix YAML tests in aggs module.

* Fix YAML tests in x-pack/plugin.

* Skip failing V7 compat tests in modules/aggregations.

* Fix TDigest library unittests.

Remove redundant serializing interfaces from the library.

* Remove YAML test versions for older releases.

These tests don't address compatibility issues in mixed cluster tests as
the latter contain a mix of older and newer nodes, so the output depends
on which node is picked as a data node since the forked TDigest library
is not backwards compatible (produces slightly different results).

* Fix test failures in docs and mixed cluster.

* Reduce buffer sizes in MergingDigest to avoid oom.

* Exclude more failing V7 compatibility tests.

* Update results for JdbcCsvSpecIT tests.

* Update results for JdbcDocCsvSpecIT tests.

* Revert unrelated change.

* More test fixes.

* Use version skips instead of blacklisting in mixed cluster tests.

* Switch TDigestState back to AVLTreeDigest.

* Update docs and tests with AVLTreeDigest output.

* Update flaky test.

* Remove dead code, esp around tracking of incoming data.

* Update docs/changelog/96086.yaml

* Delete docs/changelog/96086.yaml

* Remove explicit compression calls.

This was added to prevent concurrency tests from failing, but it leads
to reduced precision. Submit this to see if the concurrency tests are
still failing.

* Revert "Remove explicit compression calls."

This reverts commit 5352c96f65.

* Remove explicit compression calls to MedianAbsoluteDeviation input.

* Add unittests for AVL and merging digest accuracy.

* Fix spotless violations.

* Delete redundant tests and benchmarks.

* Fix spotless violation.

* Use the old implementation of AVLTreeDigest.

The latest library version is 50% slower and less accurate, as verified
by ComparisonTests.

* Update docs with latest percentile results.

* Update docs with latest percentile results.

* Remove repeated compression calls.

* Update more percentile results.

* Use approximate percentile values in integration tests.

This helps with mixed cluster tests, where some of the tests were
blocked.

* Fix expected percentile value in test.

* Revert in-place node updates in AVL tree.

Update quantile calculations between centroids and min/max values to
match v.3.2.

* Add SortingDigest and HybridDigest.

The SortingDigest tracks all samples in an ArrayList that
gets sorted for quantile calculations. This approach
provides perfectly accurate results and is the most
efficient implementation for up to millions of samples,
at the cost of bloated memory footprint.

The HybridDigest uses a SortingDigest for small sample
populations, then switches to a MergingDigest. This
approach combines the best performance and results for
small sample counts with very good performance and
acceptable accuracy for effectively unbounded sample
counts.

* Remove deps to the 3.2 library.

* Remove unused licenses for tdigest.

* Revert changes for SortingDigest and HybridDigest.

These will be submitted in a follow-up PR for enabling MergingDigest.

* Remove unused Histogram classes and unit tests.

Delete dead and commented out code, make the remaining tests run
reasonably fast. Remove unused annotations, esp. SuppressWarnings.

* Remove Comparison class, not used.

* Revert "Revert changes for SortingDigest and HybridDigest."

This reverts commit 2336b11598.

* Use HybridDigest as default tdigest implementation

Add SortingDigest as a simple structure for percentile calculations that
tracks all data points in a sorted array. This is a fast and perfectly
accurate solution that leads to bloated memory allocation.

Add HybridDigest that uses SortingDigest for small sample counts, then
switches to MergingDigest. This approach delivers extreme
performance and accuracy for small populations while scaling
indefinitely and maintaining acceptable performance and accuracy with
constant memory allocation (15kB by default).

Provide knobs to switch back to AVLTreeDigest, either per query or
through ClusterSettings.

* Small fixes.

* Add javadoc and tests.

* Add javadoc and tests.

* Remove special logic for singletons in the boundaries.

While this helps with the case where the digest contains only
singletons (perfect accuracy), it has a major problem
(non-monotonic quantile function) when the first singleton is followed
by a non-singleton centroid. It's preferable to revert to the old
version from 3.2; inaccuracies in a singleton-only digest should be
mitigated by using a sorted array for small sample counts.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Tentatively restore percentile rank expected results.

* Use cdf version from 3.2

Update Dist.cdf to use interpolation, use the same cdf
version in AVLTreeDigest and MergingDigest.

* Revert "Tentatively restore percentile rank expected results."

This reverts commit 7718dbba59.

* Revert remaining changes compared to main.

* Revert excluded V7 compat tests.

* Exclude V7 compat tests still failing.

* Exclude V7 compat tests still failing.

* Remove ClusterSettings tentatively.

* Initial import for TDigest forking.

* Fix MedianTest.

More work needed for TDigestPercentile*Tests and the TDigestTest (and
the rest of the tests) in the tdigest lib to pass.

* Fix Dist.

* Fix AVLTreeDigest.quantile to match Dist for uniform centroids.

* Update docs/changelog/96086.yaml

* Fix `MergingDigest.quantile` to match `Dist` on uniform distribution.

* Add merging to TDigestState.hashCode and .equals.

Remove wrong asserts from tests and MergingDigest.

* Fix style violations for tdigest library.

* Fix typo.

* Fix more style violations.

* Fix more style violations.

* Fix remaining style violations in tdigest library.

* Update results in docs based on the forked tdigest.

* Fix YAML tests in aggs module.

* Fix YAML tests in x-pack/plugin.

* Skip failing V7 compat tests in modules/aggregations.

* Fix TDigest library unittests.

Remove redundant serializing interfaces from the library.

* Remove YAML test versions for older releases.

These tests don't address compatibility issues in mixed cluster tests as
the latter contain a mix of older and newer nodes, so the output depends
on which node is picked as a data node since the forked TDigest library
is not backwards compatible (produces slightly different results).

* Fix test failures in docs and mixed cluster.

* Reduce buffer sizes in MergingDigest to avoid oom.

* Exclude more failing V7 compatibility tests.

* Update results for JdbcCsvSpecIT tests.

* Update results for JdbcDocCsvSpecIT tests.

* Revert unrelated change.

* More test fixes.

* Use version skips instead of blacklisting in mixed cluster tests.

* Switch TDigestState back to AVLTreeDigest.

* Update docs and tests with AVLTreeDigest output.

* Update flaky test.

* Remove dead code, esp around tracking of incoming data.

* Remove explicit compression calls.

This was added to prevent concurrency tests from failing, but it leads
to reduced precision. Submit this to see if the concurrency tests are
still failing.

* Update docs/changelog/96086.yaml

* Delete docs/changelog/96086.yaml

* Revert "Remove explicit compression calls."

This reverts commit 5352c96f65.

* Remove explicit compression calls to MedianAbsoluteDeviation input.

* Add unittests for AVL and merging digest accuracy.

* Fix spotless violations.

* Delete redundant tests and benchmarks.

* Fix spotless violation.

* Use the old implementation of AVLTreeDigest.

The latest library version is 50% slower and less accurate, as verified
by ComparisonTests.

* Update docs with latest percentile results.

* Update docs with latest percentile results.

* Remove repeated compression calls.

* Update more percentile results.

* Use approximate percentile values in integration tests.

This helps with mixed cluster tests, where some of the tests were
blocked.

* Fix expected percentile value in test.

* Revert in-place node updates in AVL tree.

Update quantile calculations between centroids and min/max values to
match v.3.2.

* Add SortingDigest and HybridDigest.

The SortingDigest tracks all samples in an ArrayList that
gets sorted for quantile calculations. This approach
provides perfectly accurate results and is the most
efficient implementation for up to millions of samples,
at the cost of bloated memory footprint.

The HybridDigest uses a SortingDigest for small sample
populations, then switches to a MergingDigest. This
approach combines the best performance and results for
small sample counts with very good performance and
acceptable accuracy for effectively unbounded sample
counts.

* Remove deps to the 3.2 library.

* Remove unused licenses for tdigest.

* Revert changes for SortingDigest and HybridDigest.

These will be submitted in a follow-up PR for enabling MergingDigest.

* Remove unused Histogram classes and unit tests.

Delete dead and commented out code, make the remaining tests run
reasonably fast. Remove unused annotations, esp. SuppressWarnings.

* Remove Comparison class, not used.

* Revert "Revert changes for SortingDigest and HybridDigest."

This reverts commit 2336b11598.

* Use HybridDigest as default tdigest implementation

Add SortingDigest as a simple structure for percentile calculations that
tracks all data points in a sorted array. This is a fast and perfectly
accurate solution that leads to bloated memory allocation.

Add HybridDigest that uses SortingDigest for small sample counts, then
switches to MergingDigest. This approach delivers extreme
performance and accuracy for small populations while scaling
indefinitely and maintaining acceptable performance and accuracy with
constant memory allocation (15kB by default).

Provide knobs to switch back to AVLTreeDigest, either per query or
through ClusterSettings.

* Add javadoc and tests.

* Remove ClusterSettings tentatively.

* Restore bySize function in TDigest and subclasses.

* Update Dist.cdf to match the rest.

Update tests.

* Revert outdated test changes.

* Revert outdated changes.

* Small fixes.

* Update docs/changelog/96794.yaml

* TDigestState uses MergingDigest by default.

* Make HybridDigest the default implementation.

* Update boxplot documentation.

* Use HybridDigest for real.

* Restore AVLTreeDigest as the default in TDigestState.

TDigest.createHybridDigest now returns the right type.
The switch in TDigestState will happen in a separate PR
as it requires many test updates.

* Use execution_hint in tdigest spec.

* Restore expected test values.

* Fix Dist.cdf for empty digest.

* Bump up TransportVersion.

* More test updates.

* Bump up TransportVersion for real.

* Restore V7 compat blacklisting.

* HybridDigest uses its final implementation during deserialization.

* Restore the right TransportVersion in TDigestState.read

* More test fixes.

* More test updates.

* Use TDigestExecutionHint instead of strings.

* Add link to TDigest javadoc.

* Spotless fix.

* Small fixes.

* Bump up TransportVersion.

* Bump up the TransportVersion, again.

* Update docs/changelog/96904.yaml

* Delete 96794.yaml

Delete existing changelog to get a new one.

* Restore previous changelog.

* Rename  96794.yaml to 96794.yaml

* Update breaking change notes in changelog.

* Remove mapping value from changelog.

* Set a valid breaking area.

* Use HybridDigest as default TDigest impl.

* Update docs/changelog/96904.yaml

* Use TDigestExecutionHint in MedianAbsoluteDeviationAggregator.

* Update changelog and comment in blacklisted V7 compat tests.

* Update breaking area in changelog.
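
The SortingDigest idea described in this message (keep every sample, sort on demand, read quantiles from the sorted data) can be sketched as follows. `SortingDigestSketch` is a hypothetical class written for illustration; the linear interpolation between neighbouring samples is an assumption and may not match the forked library's quantile definition.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: exact quantiles at the cost of storing every sample.
class SortingDigestSketch {
    private final List<Double> samples = new ArrayList<>();
    private boolean sorted = true;

    void add(double value) {
        samples.add(value);
        sorted = false;
    }

    int size() {
        return samples.size();
    }

    // Exact quantile for q in [0, 1], interpolating linearly between neighbours.
    double quantile(double q) {
        if (q < 0 || q > 1) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        if (samples.isEmpty()) {
            return Double.NaN;
        }
        if (!sorted) {
            // Sorting is deferred until a quantile is actually requested.
            Collections.sort(samples);
            sorted = true;
        }
        double index = q * (samples.size() - 1);
        int lower = (int) Math.floor(index);
        int upper = (int) Math.ceil(index);
        double fraction = index - lower;
        return samples.get(lower) + fraction * (samples.get(upper) - samples.get(lower));
    }

    // Convenience helper: median of the given values.
    static double medianOf(double... values) {
        SortingDigestSketch digest = new SortingDigestSketch();
        for (double v : values) {
            digest.add(v);
        }
        return digest.quantile(0.5);
    }
}
```

Memory grows linearly with the sample count, which is the footprint concern that motivates switching to a bounded-size digest once the population gets large.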
2023-06-19 17:19:18 +03:00
eyalkoren 09db4866eb
Adjust ECS dynamic templates to support `subobjects: false` (#96712) 2023-06-18 16:47:11 +03:00
Armin Braun 74a594d9a6
Remove redundant event loop shutdown check in Netty4TcpChannel (#96856)
This check isn't necessary any longer. We safely close all channels
before shutting down the event loop groups, so we cannot run into
the situation where a listener passed to `writeAndFlush` isn't completed.
We also don't have this check on the http channel sending.
-> remove the check that can cause a bug by double invoking the listener
closes #95759
2023-06-16 10:23:29 +02:00
Jack Conradson 52be52bcad
Add `sub_searches` to the search endpoint (#96224)
This change adds a new top-level element to the search endpoint called sub_searches. This top-level 
element allows for a list of additional searches where each "sub search" will have a query executed 
separately as part of ranking and later combined into a final single set of documents based on the 
ranking algorithm.
2023-06-15 20:53:07 -07:00
Aurélien FOUCRET dd1d157b47
Enable analytics geoip in behavioral analytics. (#96624)
* When using a managed pipeline GeoIpDownloader is triggered only when an index exists for the pipeline.

* When using a managed pipeline GeoIpDownloader is triggered only when an index exists for the pipeline.

* Adding the geoip processor back

* Adding tags to the events mapping.

* Fix a forbidden API call into tests.

* lint

* Adding integration tests for managed pipelines.

* lint

* Add a geoip_database_lazy_download param to pipelines and use it instead of managed.

* Fix an edge case: pipeline can be set after index is created.

* lint.

* Update docs/changelog/96624.yaml

* Update 96624.yaml

* Uses a processor setting (download_database_on_pipeline_creation) to decide database download strategy.

* Removing debug instruction.

* Improved documentation.

* Improved the way to check for referenced pipelines.

* Fixing an error in test.

* Improved integration tests.

* Lint.

* Fix failing tests.

* Fix failing tests (2).

* Adding javadoc.

* lint javadoc.

* Using a set instead of a list to store checked pipelines.
2023-06-15 23:42:10 +02:00
Armin Braun 795e07c021
Dry up some spots around map reading from StreamInput (#96853)
Mostly the keys we read are strings, so let's add an overload for that
to save some code and maybe help the compiler make better decisions.
Also, readMapOfLists can be greatly simplified: there is no point in duplicating
the map reading code here just to save one capturing lambda, as there's
no hot code that benefits from this.
2023-06-15 12:21:25 +02:00
Armin Braun 3fa0b8a835
Read immutable set (#96831)
Just like the readImmutableList method, adding this avoids some copying
and needless wrapping relative to the replaced usages. The few plain
usages of the mutable set read that were replaced are safe to replace
since their non-stream-input constructors use immutable set copies.
2023-06-15 05:02:35 -04:00
Mayya Sharipova 353f357f0b
Move reload analyzers to server (#96846)
This is needed for the Synonyms Management API project to be able
to auto-reload analyzers with synonyms on synonyms change.
2023-06-14 18:09:24 -04:00
Simon Cooper 71c12262fb
Migrate index created version to IndexVersion (#96066) 2023-06-14 09:43:31 +01:00
Przemyslaw Gomulka c6da231a5e
Change MainRequest constructor scope (#96810)
In order for the rest-root module to be reused, the scope of the constructor
has to be changed to public.
2023-06-14 09:57:59 +02:00
Ryan Ernst 164e97e2ca
Encapsulate TransportVersion.CURRENT (#96681)
This commit changes access to the latest TransportVersion constant to
use a static method instead of a public static field. By encapsulating
the field we will be able to (in a followup) lazily determine what the
latest is, outside of clinit.
2023-06-13 18:44:15 -04:00
Simon Cooper 17fd6372bc
Replace more uses of MapBuilder with JCL maps (#96649) 2023-06-13 10:04:13 +01:00
Simon Cooper c57fb3a77d
Refactor some uses of DiscoveryNodeUtils.create to the builder (#96560) 2023-06-13 10:02:20 +01:00
Kostas Krikellas 67211be81d
Fork TDigest library (#96086)
* Initial import for TDigest forking.

* Fix MedianTest.

More work needed for TDigestPercentile*Tests and the TDigestTest (and
the rest of the tests) in the tdigest lib to pass.

* Fix Dist.

* Fix AVLTreeDigest.quantile to match Dist for uniform centroids.

* Update docs/changelog/96086.yaml

* Fix `MergingDigest.quantile` to match `Dist` on uniform distribution.

* Add merging to TDigestState.hashCode and .equals.

Remove wrong asserts from tests and MergingDigest.

* Fix style violations for tdigest library.

* Fix typo.

* Fix more style violations.

* Fix more style violations.

* Fix remaining style violations in tdigest library.

* Update results in docs based on the forked tdigest.

* Fix YAML tests in aggs module.

* Fix YAML tests in x-pack/plugin.

* Skip failing V7 compat tests in modules/aggregations.

* Fix TDigest library unittests.

Remove redundant serializing interfaces from the library.

* Remove YAML test versions for older releases.

These tests don't address compatibility issues in mixed cluster tests as
the latter contain a mix of older and newer nodes, so the output depends
on which node is picked as a data node since the forked TDigest library
is not backwards compatible (produces slightly different results).

* Fix test failures in docs and mixed cluster.

* Reduce buffer sizes in MergingDigest to avoid oom.

* Exclude more failing V7 compatibility tests.

* Update results for JdbcCsvSpecIT tests.

* Update results for JdbcDocCsvSpecIT tests.

* Revert unrelated change.

* More test fixes.

* Use version skips instead of blacklisting in mixed cluster tests.

* Switch TDigestState back to AVLTreeDigest.

* Update docs and tests with AVLTreeDigest output.

* Update flaky test.

* Remove dead code, esp around tracking of incoming data.

* Update docs/changelog/96086.yaml

* Delete docs/changelog/96086.yaml

* Remove explicit compression calls.

This was added to prevent concurrency tests from failing, but it leads
to reduced precision. Submit this to see if the concurrency tests are
still failing.

* Revert "Remove explicit compression calls."

This reverts commit 5352c96f65.

* Remove explicit compression calls to MedianAbsoluteDeviation input.

* Add unittests for AVL and merging digest accuracy.

* Fix spotless violations.

* Delete redundant tests and benchmarks.

* Fix spotless violation.

* Use the old implementation of AVLTreeDigest.

The latest library version is 50% slower and less accurate, as verified
by ComparisonTests.

* Update docs with latest percentile results.

* Update docs with latest percentile results.

* Remove repeated compression calls.

* Update more percentile results.

* Use approximate percentile values in integration tests.

This helps with mixed cluster tests, where some of the tests were
blocked.

* Fix expected percentile value in test.

* Revert in-place node updates in AVL tree.

Update quantile calculations between centroids and min/max values to
match v.3.2.

* Add SortingDigest and HybridDigest.

The SortingDigest tracks all samples in an ArrayList that
gets sorted for quantile calculations. This approach
provides perfectly accurate results and is the most
efficient implementation for up to millions of samples,
at the cost of bloated memory footprint.

The HybridDigest uses a SortingDigest for small sample
populations, then switches to a MergingDigest. This
approach combines the best performance and results for
small sample counts with very good performance and
acceptable accuracy for effectively unbounded sample
counts.

* Remove deps to the 3.2 library.

* Remove unused licenses for tdigest.

* Revert changes for SortingDigest and HybridDigest.

These will be submitted in a follow-up PR for enabling MergingDigest.

* Remove unused Histogram classes and unit tests.

Delete dead and commented out code, make the remaining tests run
reasonably fast. Remove unused annotations, esp. SuppressWarnings.

* Remove Comparison class, not used.

* Small fixes.

* Add javadoc and tests.

* Remove special logic for singletons in the boundaries.

While this helps with the case where the digest contains only
singletons (perfect accuracy), it has a major problem
(non-monotonic quantile function) when the first singleton is followed
by a non-singleton centroid. It's preferable to revert to the old
version from 3.2; inaccuracies in a singleton-only digest should be
mitigated by using a sorted array for small sample counts.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Tentatively restore percentile rank expected results.

* Use cdf version from 3.2

Update Dist.cdf to use interpolation, use the same cdf
version in AVLTreeDigest and MergingDigest.

* Revert "Tentatively restore percentile rank expected results."

This reverts commit 7718dbba59.

* Revert remaining changes compared to main.

* Revert excluded V7 compat tests.

* Exclude V7 compat tests still failing.

* Exclude V7 compat tests still failing.

* Restore bySize function in TDigest and subclasses.
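
The SortingDigest idea described in the list above (keep every sample, sort on demand, read quantiles off the sorted list) can be sketched as follows. This is a minimal illustration, not the code from the PR: the class name `SortingDigestSketch`, the lazy-sort flag, and the linear interpolation between neighbouring samples are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: exact quantiles from a sorted list of all samples.
public class SortingDigestSketch {
    private final List<Double> samples = new ArrayList<>();
    private boolean sorted = true;

    // Every sample is retained, which is why memory grows with the sample count.
    public void add(double x) {
        samples.add(x);
        sorted = false;
    }

    // Sorts lazily, then interpolates linearly between the two nearest samples
    // (the interpolation rule is an assumption of this sketch).
    public double quantile(double q) {
        if (samples.isEmpty()) {
            return Double.NaN;
        }
        if (sorted == false) {
            Collections.sort(samples);
            sorted = true;
        }
        double index = q * (samples.size() - 1);
        int lo = (int) Math.floor(index);
        int hi = (int) Math.ceil(index);
        double fraction = index - lo;
        return samples.get(lo) + fraction * (samples.get(hi) - samples.get(lo));
    }

    // Convenience helper for trying the sketch out.
    public static double quantileOf(double q, double... values) {
        SortingDigestSketch digest = new SortingDigestSketch();
        for (double v : values) {
            digest.add(v);
        }
        return digest.quantile(q);
    }
}
```

A hybrid variant, as the message describes it, would start with a structure like this and switch to a merging digest once the sample count passes some threshold; that switch is not shown here.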
2023-06-13 11:43:54 +03:00
Keith Massey 65fc5e5696
Fixing GeoIpDownloaderStatsAction$NodeResponse serialization by defensively copying inputs (#96777) 2023-06-12 14:58:48 -05:00
Alan Woodward d927d1a9a7
Upgrade to new lucene snapshot 9.7.0-snapshot-41cd1f7a88c (#96741)
Notable changes:

* more efficient backwards reads in NIOFSDirectory
* faster merging when using soft deletes
* workaround security manager when using vector API
2023-06-12 09:13:58 +01:00
Marantidis Kiriakos a8cf4d6006
Add support for pattern replace filter in normalizers (#96588)
This change adds support for using `pattern_replace` token filters in custom normalizers.

Closes #83005
2023-06-10 00:32:39 +02:00
Martijn van Groningen 055d4e88d4
Fix GetDataStreamsTransportActionTests#testGetTimeSeriesMixedDataStream test. (#96720)
The name of backing indices was based on fixed dates.

Closes #96672
2023-06-09 10:55:30 +02:00
Mayya Sharipova 8b628517a9
Load synonyms from system index for analyzers (#96674)
- Create a new "synonyms_set" option for the synonym set filter that
specifies which synonyms set to load from the system ".synonyms" index
- On index creation, load the synonyms set from the index when this option is set
- If the synonyms set doesn't exist, the index creation request still succeeds, but
shards are not allocated, so the cluster state will be red.

Note: this is a temporary solution, as:
- No check is done on master node, as fake synonyms are provided
- On shard on index creation we use a blocking operation in the cluster applier thread
2023-06-08 11:07:47 -04:00
Martijn van Groningen 0408f26ce8
Mute GetDataStreamsTransportActionTests#testGetTimeSeriesMixedDataStream test (#96673)
Relates to #96672
2023-06-07 15:23:24 -04:00
Martijn van Groningen e13091d813
The get data stream api incorrectly prints warning log for upgraded tsdb data streams (#96606)
Invoking the get data stream api on an upgraded tsdb data stream (which has normal and tsdb indices) can result in a warning being logged incorrectly. This warning is logged as part of computing the temporal ranges of a tsdb data stream. It indicates that tsdb backing indices are overlapping, but this isn't true. For normal indices, this api picked -9999-01-01T00:00:00Z as start time and 9999-12-31T23:59:59.999Z as end time. This will overlap with any tsdb backing index. Instead, the normal indices should be skipped when computing the temporal ranges.

The following can incorrectly be logged for upgraded tsdb data streams: [instance-0000000074] previous backing index [-9999-01-01T00:00:00Z/9999-12-31T23:59:59.999Z] range is colliding with current backing index range [-9999-01-01T00:00:00Z/9999-12-31T23:59:59.999Z] This change addresses this. Note that in tests an assertion would trip before the log gets printed.
2023-06-07 20:19:49 +02:00
David Turner c5e519dcd8
Remove unnecessary `!= false` idioms (#96654) 2023-06-07 10:12:51 -04:00
eyalkoren 183c0a5da2
[Logs+] Adding ECS dynamic templates (#96171) 2023-06-07 16:31:20 +03:00
Andrei Dan c632108aa1
Remove feature flag checks from DLM serialisation (#96583)
Remove feature flag checks from DLM serialisation.
2023-06-07 13:13:28 +01:00
Felix Barnsteiner 2e81bcc2ee
Support dotted field notations in the reroute processor (#96243) 2023-06-06 21:36:47 +02:00
Armin Braun 414eda7b80
Cheaper ActionListener.wrap when error handler is the listener (#96575)
Motivated by looking into allocations of listeners in detail for shared cache benchmarking.
Wrapping a listener and using `listener::onFailure` as the failure callback means that we
have a reference to the listener from both the failure and the response handler.
If we use the approach used by the `.delegate*` methods, we can often save allocating
a response handler lambda or at least make the response handler cheaper.
We also save allocating the failure handler lambda.
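
A rough sketch of the delegating approach this message refers to, using a hypothetical `Listener` interface rather than the real `ActionListener`: failures are forwarded straight to the wrapped listener, and the response handler receives that listener as an argument, so it does not have to capture it. All names here (`DelegatingListenerSketch`, `Listener`, `delegateFailure`, `demo`) are made up for illustration.

```java
import java.util.function.BiConsumer;

public class DelegatingListenerSketch {
    /** Hypothetical two-callback listener, standing in for an ActionListener-like type. */
    interface Listener<T> {
        void onResponse(T value);

        void onFailure(Exception e);
    }

    /**
     * Failures go straight to the delegate, so no separate failure lambda is needed.
     * The response handler is handed the delegate instead of capturing it.
     */
    static <T, R> Listener<T> delegateFailure(Listener<R> delegate, BiConsumer<Listener<R>, T> handler) {
        return new Listener<T>() {
            @Override
            public void onResponse(T value) {
                handler.accept(delegate, value);
            }

            @Override
            public void onFailure(Exception e) {
                delegate.onFailure(e);
            }
        };
    }

    /** Sends one response and one failure through the wrapper and returns what the delegate saw. */
    static String demo() {
        StringBuilder log = new StringBuilder();
        Listener<String> delegate = new Listener<String>() {
            @Override
            public void onResponse(String value) {
                log.append("response:").append(value).append(';');
            }

            @Override
            public void onFailure(Exception e) {
                log.append("failure:").append(e.getMessage()).append(';');
            }
        };
        // The lambda uses only its parameters, so it captures nothing.
        Listener<Integer> wrapped = delegateFailure(delegate, (l, count) -> l.onResponse("count=" + count));
        wrapped.onResponse(42);
        wrapped.onFailure(new IllegalStateException("boom"));
        return log.toString();
    }
}
```

Because the response lambda in `demo` captures no variables, a JVM can typically reuse a single instance of it, which is the kind of saving the message describes.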
2023-06-06 11:42:39 +02:00
Keith Massey cc0777535b
Changing DataLifecycleServiceIT.testAutomaticForceMerge to use a RequestHandlingBehavior to record forcemerge (#96584) 2023-06-05 17:21:42 -05:00
Mark Tozzi eb0246147a
Use explicit mapping instead of dynamic in Matrix Stats Rest Test (#96531) 2023-06-02 13:37:39 -04:00
Martijn van Groningen 0a81be5698
Fix range query coordinator rewrite issue (#96524)
If the coordinator rewrite can't find the requested field, we
should assume that the query matches. So
`MappedFieldType.Relation.INTERSECTS` should be returned instead of
`MappedFieldType.Relation.DISJOINT`.

Closes #96487 and #96501

(marking as non-issue, this is a bug in unreleased code)
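
The rule above can be illustrated with a small sketch. It uses a local `Relation` enum modelled on the names in the message and a hypothetical map from field name to `{min, max}` bounds; none of it is the actual Elasticsearch code.

```java
import java.util.Map;

public class CoordinatorRewriteSketch {
    // Local stand-in for the relation values named in the commit message.
    enum Relation { WITHIN, INTERSECTS, DISJOINT }

    /**
     * knownBounds maps a field name to {min, max} (a hypothetical structure for this sketch).
     * A missing field means we know nothing, so the shard must not be skipped.
     */
    static Relation relation(Map<String, long[]> knownBounds, String field, long from, long to) {
        long[] bounds = knownBounds.get(field);
        if (bounds == null) {
            return Relation.INTERSECTS; // unknown field: assume the query may match
        }
        long min = bounds[0];
        long max = bounds[1];
        if (to < min || from > max) {
            return Relation.DISJOINT; // query range lies entirely outside the data
        }
        if (from <= min && to >= max) {
            return Relation.WITHIN; // all data lies inside the query range
        }
        return Relation.INTERSECTS;
    }
}
```

Returning `DISJOINT` for the missing-field case would let the coordinator skip shards that might in fact hold matching documents, which is why the conservative answer is `INTERSECTS`.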
2023-06-02 12:21:27 -04:00
David Turner a3179b28fc AwaitsFix for #96501 2023-06-02 12:22:22 +01:00
Mary Gouseti 8363e8c7f5
Allow for the data lifecycle to be explicitly nullified (#95979)
This PR allows a user of an index template to explicitly reset to `null` either the whole data lifecycle config or just the retention of a component template and overwrite what previous component templates have set.
2023-06-01 18:04:51 +03:00
Simon Cooper ff9c0aab37
Rename TestDiscoveryNode to DiscoveryNodeUtils (#96491) 2023-06-01 10:23:56 -04:00
Salvatore Campagna 4837e85af0
Mute testRetrievingHits (#96488) 2023-06-01 15:00:05 +02:00
Simon Cooper ef1b610019
Refactor some uses of TestDiscoveryNode.create to the builder (#96109) 2023-06-01 12:11:52 +01:00
Simon Cooper e8c382f895
Refactor some uses of TestDiscoveryNode.create to the builder (#96108) 2023-06-01 11:24:47 +01:00
Simon Cooper f49e7f78ee
Add helper functions allowing lambdas to be used to modify junit matchers (#95078)
Add junit matchers allowing other matchers to be transformed using a
function. This can make checking properties of lists/arrays a lot more
fluent using nested matchers, rather than declaratively checking
individual items. These replace the custom `ElasticsearchMatchers` with
more generic ones.

As a basic example, it allows you to turn this:

    assertThat(list.get(0).getName(), equalTo("foo"));
    assertThat(list.get(1).getName(), equalTo("bar"));
    assertThat(list.get(2).getName(), equalTo("quux"));

into this:

    assertThat(list, transformedItems(Item::getName, contains("foo", "bar", "quux")));

Doing this 'properly' without these helpers requires defining your own
matchers, which is very cumbersome.

I've applied the new methods to `ElasticsearchAssertions` and a few
other classes to show various use cases.
2023-06-01 06:10:03 -04:00
Simon Cooper 32265c5a80
Move some more constructors over to TestDiscoveryNode (#96107) 2023-06-01 10:22:31 +01:00
Chris Hegarty 1cc1d12432
Fix lossy-conversions lint warnings (#96398)
JDK 20 added a new javac lint warning for possible lossy conversion in compound assignments - because of implicit type casts, e.g.
warning: [lossy-conversions] implicit cast from int to byte in compound assignment is possibly lossy

The change resolves all such warnings, by either widening the type of the left-hand operand, or explicitly casting the type of the right-hand operand.
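
For illustration, a small self-contained example of the pattern and the two fixes mentioned above. The class and method names are invented for this sketch and do not come from the change itself.

```java
public class LossyConversionDemo {
    // Compound assignment hides a narrowing cast: `b += i` behaves like `b = (byte) (b + i)`.
    // With JDK 20+, `javac -Xlint:lossy-conversions` warns on this line.
    static byte implicitNarrowing(byte b, int i) {
        b += i;
        return b;
    }

    // Fix 1: widen the left-hand operand so nothing is narrowed.
    static int widenedLeft(int total, int i) {
        total += i;
        return total;
    }

    // Fix 2: cast the right-hand operand explicitly, so the truncation is visible in the source.
    static byte explicitCast(byte b, int i) {
        b += (byte) i;
        return b;
    }

    public static void main(String[] args) {
        System.out.println(implicitNarrowing((byte) 100, 100)); // -56, silently truncated
        System.out.println(widenedLeft(100, 100));              // 200
        System.out.println(explicitCast((byte) 100, 100));      // -56, but the cast is explicit
    }
}
```

Note that the explicit cast does not change the result; it only makes the possible loss of information visible in the source.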
2023-05-31 22:16:10 +01:00
Nikolaj Volgushev e6031d870b
Port DLM permissions test to internal REST test style (#96434)
Porting the DLM permissions REST test to the new style of cluster tests.
This has the usual nice perks, and also allows us to remove the separate
`qa/with-security` package. 

No functional or test logic changes. 

I didn't suggest this as part of the PR review for
https://github.com/elastic/elasticsearch/pull/95512 so as not to block
that PR further, and also because I wasn't sure about the overhead of
making this change (it did end up taking some battling with gradle).
2023-05-31 07:36:30 -04:00
Carlos Delgado 39b7b5eb56
Synonym Mgmnt API: PUT request (#95895) 2023-05-31 10:48:56 +02:00
Luca Cavanna e5768d9335
Upgrade Lucene to a 9.7.0 snapshot (#96433)
Most relevant changes:

- add api to allow concurrent query rewrite (GITHUB-11838 Add api to allow concurrent query rewrite apache/lucene#11840)
- knn query rewrite (Concurrent rewrite for KnnVectorQuery apache/lucene#12160)
- Integrate the incubating Panama Vector API (Integrate the Incubating Panama Vector API  apache/lucene#12311)

As part of this commit I moved the ES codebase off of overriding or relying on the deprecated rewrite(IndexReader) method in favour of using rewrite(IndexSearcher) instead. For score functions, I went for not breaking existing plugins and create a new IndexSearcher whenever we rewrite a filter, otherwise we'd need to change the ScoreFunction#rewrite signature to take a searcher instead of a reader.

Co-authored-by: ChrisHegarty <christopher.hegarty@elastic.co>
2023-05-31 10:17:10 +02:00
Mayya Sharipova 433ce88852
Rank_feature field null_value test and small edits (#96392)
Correct and add more tests for adding null_value parameter for the
rank_feature field.

Relates to #95811, closes #95149
2023-05-30 07:33:40 -04:00
Keith Massey 860ccfda72
Ignoring DataLifecycleServiceIT.testAutomaticForceMerge (#96395)
Ignoring a failing test. Relates to #96084
2023-05-26 18:31:50 -04:00
Keith Massey 9aead842e9
Fixing DataLifecycleServiceIT.testAutomaticForceMerge() using a StubbableTransport.SendRequestBehavior instead of relying on forcemerge behavior (#96393)
It turns out that it is even harder than I had thought to detect that
forcemerge has run. So I have changed
DataLifecycleServiceIT.testAutomaticForceMerge() to use a custom
StubbableTransport.SendRequestBehavior to detect that a ForceMergeAction
has been sent, rather than checking whether the number of segments has
changed. Closes #96084
2023-05-26 15:25:23 -04:00
Keith Massey 322805858f
Adding manage_dlm privilege (#95512)
This adds a new index privilege called `manage_dlm`. The `manage_dlm`
privilege grants permission to perform all DLM actions on an index, including
put and delete. It also adds the ability to call DLM get and explain to the
existing `view_index_metadata` index privilege.
2023-05-26 11:16:41 -04:00
Pooya Salehi 193e9201d7
Use dynamic port range in testLoadsProxySettings (#96378)
Closes https://github.com/elastic/elasticsearch/issues/95821
2023-05-26 09:27:06 -04:00
Pooya Salehi 48c65472da
remove unnecessary routing entry null checks (#96341)
`IndexShard.routingEntry()` doesn't return null anymore, and therefore
these checks are not needed.
2023-05-26 07:37:37 -04:00
Nikolaj Volgushev 41d46426d1
Dedicated internal user for DLM runtime tasks (#96253)
This PR introduces a new internal user `_dlm`. `_dlm` will run all DLM
runtime tasks, which include index deletion (due to retention),
rollover, and force merges.  

Technically, there is a BWC concern here: in a mixed cluster with nodes
on 8.8 and 8.9, 8.8 nodes would not be able to deserialize the `_dlm`
internal user (as we're only adding it in 8.9). However, DLM is still
behind a feature flag so I don't think we need to address this.
2023-05-26 05:19:57 -04:00
Pablo Alcantar Morales 244da063ca
Add `ingest` information to the cluster info endpoint (#96328) 2023-05-26 09:28:12 +02:00
Tim Brooks ac829edc55
Enable skip methods on retrying inputstreams (#96337)
Currently we have a number of input streams that specifically override
the skip() method disabling the ability to skip bytes. In each case the
skip implementation works as we have properly implemented the
read(byte[]) methods used to discard bytes. However, we appear to have
disabled it as it would be possible to retry from the end of a skip if
there is a failure in the middle. At this time, that optimization is not
really necessary. However, we sporadically use skip, so it would be nice
for the IS to support the method. This commit enables the super.skip()
and adds a comment about future optimizations.
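
A minimal sketch of why delegating to `super.skip()` is enough: the default `java.io.InputStream.skip` reads into a scratch buffer and discards the bytes, so it works for any stream whose `read(byte[], int, int)` is implemented correctly. `WrappingStream` below is a hypothetical stand-in for one of the retrying streams, not the actual class.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class SkippableStreamSketch {
    /** Hypothetical wrapper stream that only implements the read methods itself. */
    static class WrappingStream extends InputStream {
        private final InputStream delegate;

        WrappingStream(InputStream delegate) {
            this.delegate = delegate;
        }

        @Override
        public int read() throws IOException {
            return delegate.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return delegate.read(b, off, len);
        }

        @Override
        public long skip(long n) throws IOException {
            // The inherited implementation discards bytes by calling read(byte[], int, int).
            return super.skip(n);
        }
    }

    /** Skips toSkip bytes, then returns the next byte (or -1 at end of stream). */
    static int readAfterSkip(byte[] data, long toSkip) {
        try (InputStream in = new WrappingStream(new ByteArrayInputStream(data))) {
            in.skip(toSkip);
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```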
2023-05-25 10:11:27 -06:00
Martijn van Groningen e7aef532bc
Expand start and end time to nanoseconds during coordinator rewrite when needed (#96035)
Expand index.time_series.start_time and end_time to nanoseconds when
creating the coordinator rewrite context, if the timestamp field's
resolution is set to nanoseconds.
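
As a rough illustration of the conversion, assuming the start and end times are held as epoch milliseconds (the `Resolution` enum and `expand` helper are hypothetical names for this sketch):

```java
import java.util.concurrent.TimeUnit;

public class TimeRangeSketch {
    // Hypothetical stand-in for the timestamp field's resolution.
    enum Resolution { MILLISECONDS, NANOSECONDS }

    /** Returns the bound in the unit the timestamp field uses, given epoch milliseconds. */
    static long expand(long epochMillis, Resolution resolution) {
        if (resolution == Resolution.NANOSECONDS) {
            return TimeUnit.MILLISECONDS.toNanos(epochMillis);
        }
        return epochMillis;
    }
}
```

Without such a conversion, a millisecond bound compared against nanosecond timestamps would be off by a factor of one million.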

Closes #96030
2023-05-24 04:29:49 -04:00
Marantidis Kiriakos e44edcebf0
Add null_value for rank_feature field
Closes #95149
2023-05-23 12:52:27 -04:00
Keith Massey ae5a2c6cb5
Fixing expectations in DataLifecycleServiceIT when there is a single node (#96228)
If an integration test brings up a 1-node cluster, and if DLM runs
forcemerge, then forcemerge will report an error. This PR updates 2
DataLifecycleServiceIT tests to account for that, similar to #96047.
Closes #96070
2023-05-23 10:23:01 -04:00
Keith Massey b8f170db95
Fixing DataLifecycleServiceTests.testUpdateForceMergeCompleteTask (#96260) 2023-05-23 08:14:54 -05:00
eyalkoren 7d57731291
[Logs+] Automatically parse JSON log events into top-level fields (#96083) 2023-05-23 06:21:56 +03:00
David Turner 0e58137c23
Cancellable node-level task in TransportTasksAction (#96248)
The node-level task is always cancellable, so we can expose it as such
to implementations. Also we can use a `ChannelActionListener` rather
than a hand-crafted equivalent.
2023-05-22 17:02:21 +01:00
Keith Massey 8ad9ec1eaa
Fixing DataLifecycleServiceIT.testAutomaticForceMerge (#96226) 2023-05-22 10:31:33 -05:00
Simon Cooper 84a85901ac
Change Version.luceneVersion to a method (#96244) 2023-05-22 14:54:54 +01:00
Pablo Alcantar Morales 8e07f63151
Convert `IngestStats` & internal objects to records (#96217)
Co-authored-by: Joe Gallo <joe.gallo@elastic.co>
2023-05-22 07:41:30 +02:00
Joe Gallo d4b37bb75e
Ingest geoip processor code cleanups (#96208) 2023-05-18 14:27:37 -04:00
Albert Zaharovits 669c50cbc7
ES APM traces for HTTP requests include authn duration (#96205)
Following the changes in #95112, which relocated the calls
into the AuthenticationService that authenticate HTTP
requests, the authentication duration no longer fell
between the Tracer#startTrace and
Tracer#stopTrace calls. Consequently, the span records
didn't cover the authentication duration any longer.

This PR remedies that by changing the Tracer
implementation, i.e. APMTracer, to look for the trace start
time instant in the transient thread context and use that
when starting traces (overriding the now default).
The trace start time is set in the thread context when
the request-wise thread context is first populated
(with HTTP request headers).
2023-05-18 19:57:10 +03:00
Joe Gallo eb95e1f504
Small optimization of the looping in ingest's remove processor (#96202) 2023-05-17 16:37:51 -04:00
Mark Vieira 9502d39c96
Refactor GCS test fixture to remove docker dependency (#94755) 2023-05-17 11:05:26 -07:00