Commit Graph

54 Commits

Author SHA1 Message Date
Ryan Ernst 47c1d99ae0
Add settings registration for Java modules through SPI (#98857)
Currently plugins register settings through Plugin.getSettings. For
easier breakdown of the codebase, it would be nice to allow arbitrary
Java modules to register settings. This commit adds an internal
SettingsExtension SPI which acts just like Plugin.getSettings but from a
purely static context.
2023-08-25 07:00:49 -07:00
Tim Vernum 3093c40b8b
Make RestController pluggable (#98187)
This commit changes the ActionModule to allow the RestController to be
provided by an internal plugin.

It renames `RestInterceptorActionPlugin` to `RestServerActionPlugin`
and adds a new `getRestController` method to it.

There may be multiple RestServerActionPlugins installed on a node, but
only 1 may provide a Rest Wrapper (getRestHandlerInterceptor) and only 1
may provide a RestController (getRestController).
2023-08-08 01:29:24 -04:00
Przemyslaw Gomulka 0aed016215
Fix qualified export for serverless metering (#98106)
The module name of serverless metering is org.elasticsearch.metering;
previously an incorrect name, co.elastic.metering, was used.
2023-08-01 17:33:29 +02:00
Przemyslaw Gomulka 999489ce04
Infrastructure to report upon document parsing (#97961)
In serverless we would like to report (meter and bill) upon document ingestion. The metering should be agnostic to the document format (the document structure should be normalised), hence we should allow creating XContentParsers which keep track of parsed fields and values.
There are 2 places where the parsing of the ingested document happens:
1. in the 'raw bulk' path, when a request is sent without pipelines
2. in the 'ingest service', when a request is sent with pipelines
(parsing can occur twice when dynamic mappings are calculated; this PR takes this into account and prevents double billing)
We also want to make sure that the metering logic is not unnecessarily executed when a document was already reported. That is, if a document was reported in IngestService, there is no point wrapping the XContentParser again.

This commit introduces a `DocumentReporterPlugin`, an internal plugin that will be implemented in serverless. This plugin should return a `DocumentParsingObserver` supplier, which will create a `DocumentParsingObserver`. A `DocumentParsingObserver` is used to wrap an `XContentParser` with an implementation that keeps track of parsed fields and values (performs the metering) and allows sending that information along with an index name to a MeteringReporter.
2023-08-01 13:55:18 +02:00
Ryan Ernst cc1904add6
Expose build flavor again in nodes info (#98021)
The nodes info returns some information from the build. However, the
flavor is still hardcoded to default, even though flavor was added back
to Build. This commit exposes the build flavor again in the nodes
info response. It also fixes the build extension to be accessible in
serverless.
2023-07-28 06:08:23 -07:00
Ryan Ernst 57d5fbd639
Make build info pluggable internally (#97768)
This commit makes the Build.current() pluggable. This is only available
for internal builds.
2023-07-19 06:02:57 -07:00
Mary Gouseti a432313ff3
Data stream lifecycle class names (#97381) 2023-07-05 12:28:32 +03:00
Mary Gouseti f87c2c7758
Introduce downsampling configuration for data stream lifecycle (#97041)
This PR introduces downsampling configuration to the data stream lifecycle. Keep in mind that the downsampling implementation will come in a follow-up PR. The configuration looks like this:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": [
      {
        "after": "1d",
        "fixed_interval": "2h"
      },
      { "after": "15d", "fixed_interval": "1d" },
      { "after": "30d", "fixed_interval": "1w" }
    ]
  }
}
```
We will also support using `null` to unset downsampling configuration during template composition:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": null
  }
}
```
2023-06-29 16:41:17 +03:00
Przemyslaw Gomulka 31e20d9239
Revert "Add JUL bridge (#96683)" (#96832)
This reverts commit 2bdf1bc0d6.
2023-06-14 14:37:53 +02:00
Ryan Ernst 2bdf1bc0d6
Add JUL bridge (#96683)
This commit adds the Log4j JUL bridge so that messages using JUL are
more nicely converted to log4j messages. Currently these messages are
captured via the stdout logging stream. This commit also adds a log4j
filter to replace the logging stream filtering mechanism used to quiet
some Lucene log messages that may be confusing to users.

closes #94613
2023-06-13 19:31:05 -04:00
Kostas Krikellas 67211be81d
Fork TDigest library (#96086)
* Initial import for TDigest forking.

* Fix MedianTest.

More work needed for TDigestPercentile*Tests and the TDigestTest (and
the rest of the tests) in the tdigest lib to pass.

* Fix Dist.

* Fix AVLTreeDigest.quantile to match Dist for uniform centroids.

* Update docs/changelog/96086.yaml

* Fix `MergingDigest.quantile` to match `Dist` on uniform distribution.

* Add merging to TDigestState.hashCode and .equals.

Remove wrong asserts from tests and MergingDigest.

* Fix style violations for tdigest library.

* Fix typo.

* Fix more style violations.

* Fix more style violations.

* Fix remaining style violations in tdigest library.

* Update results in docs based on the forked tdigest.

* Fix YAML tests in aggs module.

* Fix YAML tests in x-pack/plugin.

* Skip failing V7 compat tests in modules/aggregations.

* Fix TDigest library unittests.

Remove redundant serializing interfaces from the library.

* Remove YAML test versions for older releases.

These tests don't address compatibility issues in mixed cluster tests as
the latter contain a mix of older and newer nodes, so the output depends
on which node is picked as a data node since the forked TDigest library
is not backwards compatible (produces slightly different results).

* Fix test failures in docs and mixed cluster.

* Reduce buffer sizes in MergingDigest to avoid oom.

* Exclude more failing V7 compatibility tests.

* Update results for JdbcCsvSpecIT tests.

* Update results for JdbcDocCsvSpecIT tests.

* Revert unrelated change.

* More test fixes.

* Use version skips instead of blacklisting in mixed cluster tests.

* Switch TDigestState back to AVLTreeDigest.

* Update docs and tests with AVLTreeDigest output.

* Update flaky test.

* Remove dead code, esp around tracking of incoming data.

* Update docs/changelog/96086.yaml

* Delete docs/changelog/96086.yaml

* Remove explicit compression calls.

This was added to prevent concurrency tests from failing, but it leads
to reduced precision. Submit this to see if the concurrency tests are
still failing.

* Revert "Remove explicit compression calls."

This reverts commit 5352c96f65.

* Remove explicit compression calls to MedianAbsoluteDeviation input.

* Add unittests for AVL and merging digest accuracy.

* Fix spotless violations.

* Delete redundant tests and benchmarks.

* Fix spotless violation.

* Use the old implementation of AVLTreeDigest.

The latest library version is 50% slower and less accurate, as verified
by ComparisonTests.

* Update docs with latest percentile results.

* Update docs with latest percentile results.

* Remove repeated compression calls.

* Update more percentile results.

* Use approximate percentile values in integration tests.

This helps with mixed cluster tests, where some of the tests were
blocked.

* Fix expected percentile value in test.

* Revert in-place node updates in AVL tree.

Update quantile calculations between centroids and min/max values to
match v.3.2.

* Add SortingDigest and HybridDigest.

The SortingDigest tracks all samples in an ArrayList that
gets sorted for quantile calculations. This approach
provides perfectly accurate results and is the most
efficient implementation for up to millions of samples,
at the cost of bloated memory footprint.

The HybridDigest uses a SortingDigest for small sample
populations, then switches to a MergingDigest. This
approach combines the best performance and results for
small sample counts with very good performance and
acceptable accuracy for effectively unbounded sample
counts.

* Remove deps to the 3.2 library.

* Remove unused licenses for tdigest.

* Revert changes for SortingDigest and HybridDigest.

These will be submitted in a follow-up PR for enabling MergingDigest.

* Remove unused Histogram classes and unit tests.

Delete dead and commented out code, make the remaining tests run
reasonably fast. Remove unused annotations, esp. SuppressWarnings.

* Remove Comparison class, not used.

* Small fixes.

* Add javadoc and tests.

* Remove special logic for singletons in the boundaries.

While this helps with the case where the digest contains only
singletons (perfect accuracy), it has a major problem
(non-monotonic quantile function) when the first singleton is followed
by a non-singleton centroid. It's preferable to revert to the old
version from 3.2; inaccuracies in a singleton-only digest should be
mitigated by using a sorted array for small sample counts.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Revert changes to expected values in tests.

This is due to restoring quantile functions to match head.

* Tentatively restore percentile rank expected results.

* Use cdf version from 3.2

Update Dist.cdf to use interpolation, use the same cdf
version in AVLTreeDigest and MergingDigest.

* Revert "Tentatively restore percentile rank expected results."

This reverts commit 7718dbba59.

* Revert remaining changes compared to main.

* Revert excluded V7 compat tests.

* Exclude V7 compat tests still failing.

* Exclude V7 compat tests still failing.

* Restore bySize function in TDigest and subclasses.
2023-06-13 11:43:54 +03:00
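The SortingDigest idea described in #96086 (keep every sample in an ArrayList and sort it for quantile queries) can be sketched roughly as follows. This is an illustrative sketch, not the code from the PR: the class name `SortingDigestSketch` is hypothetical, and linear interpolation between neighbouring ranks is an assumed choice that the commit message does not specify.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a sorting-based digest; not the forked tdigest library code.
final class SortingDigestSketch {
    private final List<Double> samples = new ArrayList<>();
    private boolean sorted = true;

    // Every sample is retained, which is exact but costs memory proportional to the sample count.
    void add(double value) {
        samples.add(value);
        sorted = false;
    }

    // Returns the q-quantile (0 <= q <= 1), sorting lazily on first use after an add.
    // Assumption: linear interpolation between the two closest ranks.
    double quantile(double q) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        if (samples.isEmpty()) {
            return Double.NaN;
        }
        if (!sorted) {
            Collections.sort(samples);
            sorted = true;
        }
        double position = q * (samples.size() - 1);
        int lower = (int) Math.floor(position);
        int upper = Math.min(lower + 1, samples.size() - 1);
        double fraction = position - lower;
        return samples.get(lower) + fraction * (samples.get(upper) - samples.get(lower));
    }

    public static void main(String[] args) {
        SortingDigestSketch digest = new SortingDigestSketch();
        for (double v : new double[] { 5, 1, 3, 2, 4 }) {
            digest.add(v);
        }
        System.out.println(digest.quantile(0.5));
    }
}
```

A hybrid variant, as the commit message describes it, would start with a structure like this and switch to a MergingDigest once the sample count grows; the switching threshold is not stated in the message.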
Ryan Ernst f086ef1990
Make current Version pluggable for serverless (#96539)
Version.CURRENT is statically loaded as a constant early during startup.
Yet serverless needs to be able to override the current Version so it
can add additional versions. This commit makes Version.CURRENT pluggable
via SPI. Note that the only way for this to be plugged in is via an
additional jar on the boot layer.
2023-06-06 11:32:01 -04:00
Carlos Delgado 39b7b5eb56
Synonym Mgmnt API: PUT request (#95895) 2023-05-31 10:48:56 +02:00
Athena Brown d423f40037
Add mechanism to react to termination signals (#95850)
This commit adds a mechanism to be exposed internally to allow
plugins/modules to react to termination signals via SPI.

I'm not entirely sure how to test this - normally I'd use an
`ESIntegTestCase` and supply a stubbed version of the plugin but that's
more difficult with SPI.

Supersedes https://github.com/elastic/elasticsearch/pull/95518
2023-05-17 17:18:37 -04:00
Przemyslaw Gomulka dc03c47ada
Refactor RestMainAction into separate module (#95881)
We want to allow overriding the info (GET /) API in serverless, therefore this commit moves the RestMainAction and its transport classes into a module that has a rest plugin.

The main endpoint is often used in testing to verify that a cluster is ready, hence this commit also has to add a testing dependency on main to a lot of modules.

relates #95422
2023-05-10 14:39:00 +02:00
William Brafford a8f6205084
Add ReloadingPlugin type (#95743)
* Add a ReloadAwarePlugin interface with a qualified export
2023-05-02 21:27:49 -04:00
Jack Conradson 5314e5dd55
Add support for Reciprocal Rank Fusion to the search API (#93396)
This change at a high level adds global ranking on the coordinating node at the end of query reduction
prior to the fetch phase. Individual rank methods are defined in plugins.

The first rank plugin added as part of this change is reciprocal rank fusion (RRF). RRF uses a relatively
simple formula for merging 1...n result sets together with sum(1/(k+d)) where k is a ranking constant
and d is a document's scored position within a result set from a query.
2023-04-24 15:07:34 -07:00
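The sum(1/(k+d)) formula from #93396 can be illustrated with a small sketch. This is not the rank plugin's implementation: the class `RrfSketch` and its method names are hypothetical, positions are assumed to be 1-based, and k = 60 in the demo is an assumed value rather than one taken from the commit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of reciprocal rank fusion; not the Elasticsearch rank plugin code.
final class RrfSketch {

    // Sums 1/(k+d) per document id, where d is the (assumed 1-based) position in each result set.
    static Map<String, Double> fuse(List<List<String>> resultSets, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> resultSet : resultSets) {
            for (int i = 0; i < resultSet.size(); i++) {
                int d = i + 1;
                scores.merge(resultSet.get(i), 1.0 / (k + d), Double::sum);
            }
        }
        return scores;
    }

    // Returns document ids ordered by descending fused score.
    static List<String> rank(List<List<String>> resultSets, int k) {
        Map<String, Double> scores = fuse(resultSets, k);
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return ids;
    }

    public static void main(String[] args) {
        List<List<String>> sets = List.of(List.of("a", "b", "c"), List.of("b", "c", "a"));
        System.out.println(rank(sets, 60)); // prints [b, a, c]
    }
}
```

A document that appears near the top of several result sets accumulates more score than one that tops a single set and ranks low elsewhere, which is why "b" wins in the demo.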
Joe Gallo abc495d355
Move redact ingest processor into x-pack (#95426) 2023-04-21 15:04:49 -04:00
William Brafford fb93225f70
Move AbstractFileWatchingService to a common package for export (#95238)
* Move AbstractFileWatchingService to a common package
* Export the new package
* Clean up tests and member visibility
2023-04-20 13:14:26 -04:00
Martijn van Groningen 228fe1f804
Removed unused tsdb code. (#95317)
The TimeSeriesMetricsService and TimeSeriesMetrics.java classes were only used in the integration test that this change removes.
2023-04-18 15:31:51 +02:00
Ryan Ernst c619be4b5e
Move preallocate module to libs (#94884)
The preallocate module needs access to java.io internals. However, in
order to open java.io to a specific module, rather than the unnamed
module as was previously done, the said module must be in the boot
layer.

This commit moves the preallocate module to libs. It adds it to the main
lib dir, though it does not add it as a compile dependency of server.
2023-04-10 13:05:43 -07:00
Francisco Fernández Castaño 1421e2e751
Make some AtomicRegisterCoordinatorTests classes public for reuse (#94936) 2023-03-31 13:27:46 +02:00
Andrei Dan 223385f887
Introduce a _lifecycle/explain API for data stream backing indices (#94621)
This adds an {index}/_lifecycle/explain API to retrieve information
about an index's status within its lifecycle.

The response looks like so:
```
{
  "indices" : {
    ".ds-metrics-foo-2023.03.22-000001" : {
      "index" : ".ds-metrics-foo-2023.03.22-000001",
      "managed_by_dlm" : true,
      "index_creation_date_millis" : 1679475563571,
      "time_since_index_creation" : "843ms",
      "rollover_date_millis" : 1679475564293,
      "time_since_rollover" : "121ms",
      "lifecycle" : { },
      "generation_time" : "121ms"
    },
    ".ds-metrics-foo-2023.03.22-000002" : {
      "index" : ".ds-metrics-foo-2023.03.22-000002",
      "managed_by_dlm" : true,
      "index_creation_date_millis" : 1679475564351,
      "time_since_index_creation" : "63ms",
      "lifecycle" : { }
    }
  }
}
```
2023-03-27 08:44:40 +01:00
Andrei Dan fb033b9e82
Move SchedulerEngine and TimeValueSchedule to server/common/scheduler (#93862)
We built quite a bit of infrastructure to have one polling job
running via the `SchedulerEngine` and `ActiveSchedule`. This moves this
infrastructure outside x-pack to server so elasticsearch/modules can use
it and avoid re-implementing it using `threadPool.schedule`.
2023-02-17 08:55:40 +00:00
Iraklis Psaroudakis e144bef8e9
New TransportBroadcastUnpromotableAction action (#93600)
Introduces:

* New action that can be used to broadcast to unpromotable shards of a given IndexShardRoutingTable.
* New hook in ReplicationOperation for custom logic when the primary operation completes. If there is a failure, this increases the shard failures of the replication operation.
* Refresh action now uses the new hook to broadcast the unpromotable refresh action to all unpromotable shards.

Fixes ES-5454
Fixes ES-5212
2023-02-15 12:46:06 +02:00
Joe Gallo ea17a1945a
Upgrade geoip2 dependency (#93522) 2023-02-07 08:18:25 -05:00
Thomas Dullien 14cca12d9d
Improve the false positive rate of the bloom filter by setting 7 hash functions (#93283)
Co-authored-by: Adrien Grand <jpountz@gmail.com>
2023-02-04 13:42:26 +01:00
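To see why the number of hash functions matters in #93283, the standard Bloom filter approximation p ≈ (1 - e^(-k/b))^k can be evaluated for a filter with b bits per key and k hash functions. The sketch below is only an illustration: the class name is hypothetical, and the value of 10 bits per key is an assumption chosen for the demo, not a figure taken from the commit.

```java
// Hypothetical sketch of the textbook Bloom filter false positive estimate.
final class BloomFalsePositiveSketch {

    // Approximate false positive rate for k hash functions and b bits per key.
    static double falsePositiveRate(int hashFunctions, double bitsPerKey) {
        return Math.pow(1.0 - Math.exp(-hashFunctions / bitsPerKey), hashFunctions);
    }

    // The rate is minimised at k = b * ln 2 (rounded to an integer in practice).
    static double optimalHashFunctions(double bitsPerKey) {
        return bitsPerKey * Math.log(2);
    }

    public static void main(String[] args) {
        double bitsPerKey = 10.0; // assumed for illustration only
        for (int k = 1; k <= 10; k++) {
            System.out.printf("k=%d p=%.5f%n", k, falsePositiveRate(k, bitsPerKey));
        }
        System.out.printf("optimal k=%.2f%n", optimalHashFunctions(bitsPerKey));
    }
}
```

Under that assumed density the optimum falls close to 7, and both fewer and more hash functions give a higher estimated false positive rate.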
Przemyslaw Gomulka 2cdaabe783
[Stable plugin api] Drop api suffix in package names (#92905)
Refactoring that drops the api suffix from package name
This will have to be followed up by a plugins/examples fix in imports
Also set an artifact group name to `org.elasticsearch.plugin` in the plugin-api and plugin-analysis-api
2023-01-14 09:49:37 +01:00
Salvatore Campagna 546e5169ea
TSDB numeric compression (#92045)
This doc values format is meant to be used for TSDB. It provides
compression using delta encoding, gcd encoding and bit-packing
encoding. The expectation is that such encoding works well for
fields whose values are monotonically increasing, like the timestamp
field and counter metric fields. Encoding and decoding fields using
the new format is available behind a feature flag.
2023-01-12 15:42:49 +01:00
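The three encodings named in #92045 (delta, gcd and bit-packing) can be combined as in the sketch below. It is a simplified illustration and not the doc values format from the commit: the class `TsdbEncodingSketch`, its in-memory `long[]` layout and the single-block design are all assumptions, and it only handles non-decreasing inputs without overflow.

```java
import java.util.Arrays;

// Hypothetical sketch of delta + gcd + bit-packing; not the Elasticsearch codec.
final class TsdbEncodingSketch {

    static final class Encoded {
        final long first;       // first raw value
        final long gcd;         // common divisor of all deltas
        final int bitsPerValue; // width of each packed quotient
        final int count;        // number of original values
        final long[] packed;    // quotients packed back to back

        Encoded(long first, long gcd, int bitsPerValue, int count, long[] packed) {
            this.first = first;
            this.gcd = gcd;
            this.bitsPerValue = bitsPerValue;
            this.count = count;
            this.packed = packed;
        }
    }

    static long gcd(long a, long b) {
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Assumes at least one value and a non-decreasing sequence, so every delta is >= 0.
    static Encoded encode(long[] values) {
        int n = values.length - 1;
        long[] deltas = new long[n];
        long g = 0;
        for (int i = 0; i < n; i++) {
            deltas[i] = values[i + 1] - values[i]; // delta encoding
            g = gcd(g, deltas[i]);
        }
        if (g == 0) {
            g = 1; // all deltas are zero
        }
        long max = 0;
        for (int i = 0; i < n; i++) {
            deltas[i] /= g; // gcd encoding
            max = Math.max(max, deltas[i]);
        }
        int bits = Math.max(1, 64 - Long.numberOfLeadingZeros(max));
        long[] packed = new long[(n * bits + 63) / 64];
        for (int i = 0; i < n; i++) { // bit-packing
            int bitPos = i * bits;
            int word = bitPos >>> 6;
            int shift = bitPos & 63;
            packed[word] |= deltas[i] << shift;
            if (shift + bits > 64) {
                packed[word + 1] |= deltas[i] >>> (64 - shift); // value straddles two words
            }
        }
        return new Encoded(values[0], g, bits, values.length, packed);
    }

    static long[] decode(Encoded e) {
        long[] out = new long[e.count];
        out[0] = e.first;
        long mask = e.bitsPerValue == 64 ? -1L : (1L << e.bitsPerValue) - 1;
        for (int i = 0; i < e.count - 1; i++) {
            int bitPos = i * e.bitsPerValue;
            int word = bitPos >>> 6;
            int shift = bitPos & 63;
            long q = e.packed[word] >>> shift;
            if (shift + e.bitsPerValue > 64) {
                q |= e.packed[word + 1] << (64 - shift);
            }
            out[i + 1] = out[i] + (q & mask) * e.gcd;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] timestamps = { 1000L, 2000L, 3000L, 5000L };
        Encoded e = encode(timestamps);
        System.out.println("gcd=" + e.gcd + " bitsPerValue=" + e.bitsPerValue);
        System.out.println(Arrays.toString(decode(e)));
    }
}
```

For regularly spaced timestamps the deltas share a large common divisor, so each stored quotient needs only a few bits, which matches the commit's expectation about monotonically increasing fields.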
David Turner 48c1447dd8
Add o.e.a.a.cluster.coordination to exports from server (#92059)
Fixes the red squigglies in IntelliJ.
2022-12-12 09:23:15 +00:00
Mary Gouseti 8e9a403c65
Collect health API stats (#91559)
This PR introduces the collectors of the health API telemetry. Our
target telemetry has the following shape:

```
{
    "invocations": {
      "total": 22,
      "verbose_true": 12,
      "verbose_false": 10
    },
    "statuses": {
      "green": 10,
      "yellow": 4,
      "red": 8,
      "values": ["green", "yellow", "red"]
    },
    "indicators": {
      "red" : {
        "master_stability": 2,
        "ilm": 2,
        "slm": 4,
        "values": ["master_stability", "ilm", "slm"]
      },
      "yellow": {
        "disk": 1,
        "shards_availability": 1,
        "master_stability": 2,
        "values": ["disk", "shards_availability", "master_stability"]
      }
    },
    "diagnoses": {
      "red": {
        "elasticsearch:health:shards_availability:primary_unassigned": 1,
        "elasticsearch:health:disk:add_disk_capacity_master_nodes": 3,
        "values": ["elasticsearch:health:shards_availability:primary_unassigned", "elasticsearch:health:disk:add_disk_capacity_master_nodes"]
      },
      "yellow": {
        "elasticsearch:health:disk:add_disk_capacity_data_nodes": 1,
        "values": ["elasticsearch:health:disk:add_disk_capacity_data_nodes"]
      }
    }
}
```

This PR introduces the thread-safe `Counters` class and the
`HealthApiStats` which keeps track of the metrics above based on the
health api responses that it encounters. The `HealthApiStatsAction`
collects the `HealthApiStats` of all nodes.

Part of: #90877
2022-11-18 05:34:33 -05:00
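A thread-safe counter map of the kind #91559 describes could look roughly like the sketch below. This is not the `Counters` class from the PR: the class `CountersSketch`, its methods and the dotted key names (for example `invocations.total`) are hypothetical and only loosely mirror the telemetry shape shown in the commit message.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

// Hypothetical sketch of thread-safe named counters; not the Elasticsearch Counters class.
final class CountersSketch {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Increments the named counter by one, creating it on first use.
    void inc(String name) {
        inc(name, 1L);
    }

    void inc(String name, long amount) {
        counters.computeIfAbsent(name, key -> new LongAdder()).add(amount);
    }

    // Returns the current value, or 0 for a counter that was never incremented.
    long get(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }

    // Sorted point-in-time copy, suitable for rendering as a response.
    Map<String, Long> snapshot() {
        Map<String, Long> copy = new TreeMap<>();
        counters.forEach((name, adder) -> copy.put(name, adder.sum()));
        return copy;
    }

    public static void main(String[] args) {
        CountersSketch stats = new CountersSketch();
        // Simulate concurrent health API invocations from several threads.
        IntStream.range(0, 22).parallel().forEach(i -> {
            stats.inc("invocations.total");
            stats.inc(i < 12 ? "invocations.verbose_true" : "invocations.verbose_false");
        });
        System.out.println(stats.snapshot());
    }
}
```

`LongAdder` is used here because increments from many threads are far more frequent than reads; an `AtomicLong` per key would work as well.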
Pooya Salehi 327f50ba46
Prevalidate node removal API (pt. 1) (#88952)
This PR adds the first part of the Prevalidate Node Removal API. This
API allows checking whether attempting to remove some node(s) from the
cluster is likely to succeed or not. This check is useful when a node
needs to be removed from a RED cluster, without risking losing the last
copy of some RED shards.

In this PR, we only check whether a RED index is a Searchable Snapshot
index or not, in which case the removal of any node is safe as the RED
index is backed by a snapshot.

Relates #87776
2022-11-16 13:44:00 +01:00
Martijn van Groningen 1e186bb820
Move time_series aggregation to aggregations module. (#91356)
This also drops the TimeSeries ceremonial interface.

Note that this change also moves a search cancellation test to
the aggregations module. It does so by creating a search cancellation
base class that both server and this module share.

Relates to #90283 Relates to #82273
2022-11-08 12:27:35 -05:00
Jack Conradson 8b0d0716d1
Add profiling and documentation for dfs phase (#90536)
Adds profiling statistics for the dfs phase, and adds documentation for both the dfs phase profiling 
and kNN profiling.

Closes #89713
2022-10-05 09:54:36 -07:00
Martijn van Groningen 14cf04d74b
Introduce a new aggregation module (#90294)
This commit introduces a new aggregation module
and moves the `adjacency_matrix` to this new module.

The new module name is `aggregations`.
The new module will use the `org.elasticsearch.aggregations.bucket` package for all bucket aggregations. 

Relates to #90283
2022-09-30 16:24:52 +02:00
Przemyslaw Gomulka 35ea2b13b5
[Stable plugin API] Load plugin named components (#89969)
Stable plugins use @ extensible and @ NamedComponents annotations
to mark components to be loaded.
This commit loads extensible class names from extensibles.json and
named components from named_components.json.

The scanning mechanism that can generate these files will be done later in a gradle plugin/plugin installer

relates #88980
2022-09-13 09:05:08 +02:00
Nhat Nguyen cfad420cde
Enable BloomFilter for _id of non-datastream indices (#88409)
This PR adds BloomFilter to Elasticsearch and enables it for the _id 
field of non-data stream indices. BloomFilter should speed up the
performance of mget and update requests at a small expense of refresh,
merge, and storage.
2022-08-08 11:14:26 -04:00
Mary Gouseti d828c2a642
Health API - Monitoring local disk health (#88390)
This PR introduces the local health monitoring functionality needed for
#84811 . The monitor uses the `NodeService` to get the disk usage stats
and determines the node's disk health.

When a change in the disk's health is detected or when the health node changes,
this class would be responsible for sending the node's health to the health
node. Currently this is simulated with a method that just logs the
current health.

The monitor keeps the last reported health; this way, if something fails
on the next check, it will try to resend the new health state.
2022-08-03 17:40:26 +09:30
Rory Hunter 5c5981d27d
Introduce tracing interfaces (#87921)
Part of #84369. Split out from #87696. Introduce tracing interfaces in
advance of adding APM support to Elasticsearch. The only implementation
at this point is a no-op class.
2022-07-26 05:31:41 +09:30
Nikola Grcevski d7d9ff2950
Rename immutable cluster state to reserved cluster state (#88481) 2022-07-13 15:05:39 -04:00
Nikola Grcevski fc93f77d3b
Implement ILM/settings operator handlers (#88097)
Relates to #86224
2022-07-04 12:12:49 -04:00
Chris Hegarty 453f12c72d
Upgrade to Log4J 2.18.0 (#88237) 2022-07-04 11:30:38 +01:00
Nikola Grcevski e1d03efd1e
Add immutable 'operator' metadata classes for cluster state (#87763)
This commit only introduces the storage classes, unused for now.
Relates to #86224
2022-06-29 18:30:35 -04:00
Nikola Grcevski c2d1b22626
Add OperatorHandler interface (#87767) 2022-06-27 09:51:13 -04:00
Julie Tibshirani 3a9e511117
Move kNN search and dense vectors to core (#87815)
This PR moves kNN search and dense vector support out of an xpack plugin and
into server.

In #87625 we plan to integrate ANN search into the main `_search` endpoint as a
new top-level component called `knn`. So kNN will be a dedicated part of the
search request, and we'll have kNN logic within the search phases. The classes
and logic will live in server, matching the other search components like
suggesters, field collapsing, etc.
2022-06-22 21:10:20 -07:00
Artem Prigoda 52e2e374e8
Make Desired Nodes API operator-only (#87778)
* Remove Desired Nodes API from NON_OPERATOR_ACTIONS

* Add desired nodes to the operator privileges test

* Add desired nodes privileges integration tests

Resolves #87777.
2022-06-20 15:32:44 +02:00
Mary Gouseti f7ef6609be
Do not wait for the health node task (#87830)
The health node task is a permanent task; for this reason it doesn't make sense to wait for it in a method that waits for pending tasks. This fixes that.
2022-06-20 11:48:59 +02:00
Przemyslaw Gomulka 0ef15b49e9
Stable logging API - the basic use case (#86612)
Introducing a stable logging API under libs/logging.
This change covers the most common use cases for logging: fetching a logger with LogManager and emitting log messages with Logger and Level.
It is influenced by log4j2-api, but does not include Marker and LogBuilder methods.
Also, methods using org.apache.logging.log4j.util.Supplier are replaced with java.util.function.Supplier.

The basic implementation is present in server and injected statically in LogConfigurator

relates #84478
2022-06-13 10:25:54 +02:00
Ryan Ernst f5c0be5c89
Move spatial3d dependency to spatial (#87397)
Server depends on spatial3d, but it is only ever used by the spatial
xpack component. This commit moves the dependency there.

closes #87026
2022-06-07 12:54:11 -07:00
Ryan Ernst 4b44413783
Move declarative plugin sync to server cli (#87273)
When running in Docker, the elasticsearch-plugins.yml allows configuring
plugins that should be installed in the system. Upon Elasticsearch
starting up, plugins are installed/removed to match the configured
plugins. However, this happens late in startup, and it would be nice to
keep the main Elasticsearch process from ever writing outside the
configured data directories. Now that the server cli has been moved to
Java, this is possible.

This commit moves invocation of the plugins sync command into the server
cli. Note that the sync plugins action should probably be reworked as it
can implement Command directly now. However, this commit tries to be
the minimal change possible to remove plugin cli knowledge from server.
2022-06-01 15:52:02 -04:00