463 Commits

Author SHA1 Message Date
Mark Vieira cdbd7ad543
Add publishing plugin to elasticsearch-grok project (#89184)
It seems https://github.com/elastic/elasticsearch/pull/88982 introduced
a dependency on `elasticsearch-grok` to `x-pack-core`. Since the latter
is published to Maven Central, this means consumers will have issues
resolving its dependencies since `elasticsearch-grok` isn't published.
This pull request resolves this, by adding the publishing plugin to the
`grok` library. We'll then follow up separately to add that to our
release configuration.
2022-08-09 08:02:08 +09:30
Rene Groeschke 3909b5eaf9
Add verification metadata for dependencies (#88814)
Removes the custom dependency checksum functionality in favor of Gradle's built-in dependency verification support.

- Use sha256 in favor of sha1 as sha1 is not considered safe these days.

Closes https://github.com/elastic/elasticsearch/issues/69736
2022-08-04 09:51:16 +02:00
Nik Everett 87ab933c8b
Remove calls to deprecated xcontent method (#84733)
This removes many calls to the last remaining `createParser` method that
I deprecated in #79814, migrating callers to one of the new methods that
it created.
2022-08-01 22:18:03 +09:30
Artem Prigoda 8a159e9759
Make Tuple a record (#88280)
Tuple is used extensively across the ES codebase and can be effectively represented as a Java record.
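
To illustrate what the record conversion buys, here is a minimal sketch. The component names `v1`/`v2` and the `tuple` factory are assumptions made for this example and are not taken from the commit message.

```java
// Sketch only: a two-element tuple expressed as a Java record (Java 16+).
// The component names v1/v2 and the tuple() factory are assumed for illustration.
record Tuple<V1, V2>(V1 v1, V2 v2) {

    // Static factory so call sites can write Tuple.tuple(a, b).
    static <V1, V2> Tuple<V1, V2> tuple(V1 v1, V2 v2) {
        return new Tuple<>(v1, v2);
    }
}

class TupleDemo {
    public static void main(String[] args) {
        Tuple<String, Integer> t = Tuple.tuple("shards", 5);
        // Accessors, equals(), hashCode() and toString() are generated by the compiler.
        System.out.println(t.v1() + " -> " + t.v2());
        System.out.println(t.equals(Tuple.tuple("shards", 5)));
    }
}
```

The hand-written constructor, getters, `equals` and `hashCode` that a plain class would need all disappear, which is the main appeal for a type used this widely.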
2022-07-28 09:14:12 +02:00
Chris Hegarty f3cff66877
Fix EmbeddedImplClassLoaderTests on Windows (#88813) 2022-07-27 00:08:37 +01:00
Chris Hegarty 1ce64290a1
Add package cache to EmbeddedImplClassLoader (#88537)
The initial implementation of the embedded class loader took a brute
force approach to supporting multi-release JARs - iterating over all
possible release versions when searching for classes and resources. This
change improves upon that approach by deriving and caching package and
version specific maps, so class and resource loading can go directly to
the class and resource bytes, respectively, rather than searching.
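
The idea of deriving a per-package version map can be sketched roughly as follows. This is not the code from `EmbeddedImplClassLoader`: the class name `MultiReleaseIndex`, its methods, and the choice to index plain entry names are all hypothetical and only meant to show how a precomputed map avoids probing every release version.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only (hypothetical names): index a multi-release JAR's entries once,
// so lookups only probe versions that actually contain the requested package.
class MultiReleaseIndex {
    private static final String MR_PREFIX = "META-INF/versions/";

    // package path -> usable versions (newest first) whose versioned section contains it
    private final Map<String, List<Integer>> versionsByPackage = new HashMap<>();
    private final Set<String> entries;

    MultiReleaseIndex(Set<String> entryNames, int runtimeFeatureVersion) {
        this.entries = Set.copyOf(entryNames);
        for (String name : entryNames) {
            if (!name.startsWith(MR_PREFIX) || !name.endsWith(".class")) {
                continue; // base entries need no version bookkeeping
            }
            int slash = name.indexOf('/', MR_PREFIX.length());
            if (slash < 0) {
                continue;
            }
            int version = Integer.parseInt(name.substring(MR_PREFIX.length(), slash));
            if (version > runtimeFeatureVersion) {
                continue; // newer than the running JVM, so never visible
            }
            String pkg = packageOf(name.substring(slash + 1));
            List<Integer> versions = versionsByPackage.computeIfAbsent(pkg, k -> new ArrayList<>());
            if (!versions.contains(version)) {
                versions.add(version);
            }
        }
        versionsByPackage.values().forEach(v -> v.sort(Comparator.reverseOrder()));
    }

    // Returns the entry name to load for a path like "p/Foo.class", or null if absent.
    String resolve(String classPath) {
        for (int version : versionsByPackage.getOrDefault(packageOf(classPath), List.of())) {
            String candidate = MR_PREFIX + version + "/" + classPath;
            if (entries.contains(candidate)) {
                return candidate;
            }
        }
        return entries.contains(classPath) ? classPath : null;
    }

    private static String packageOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        return lastSlash < 0 ? "" : path.substring(0, lastSlash);
    }
}
```

Packages that never appear in a versioned section resolve with a single lookup against the base entries.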

It's hard to get empirical numbers to quantify just how much this change
improves the performance of classes loaded by this loader, and there are
typically only a couple of hundred classes loaded, but the initial cli
seems observably much quicker, while the server startup has improved
just a bit (at least on my machine).
2022-07-25 15:31:58 +01:00
Chris Hegarty 810736519f
Always close directory streams (#88560) 2022-07-15 08:49:16 +01:00
Ignacio Vera ff6604f1ea
Use a faster but less accurate log algorithm for computing Geotile Y coordinate (#87515)
This commit introduces a new algorithm to ESSloppyMath to compute the natural logarithm (base e).
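
The commit message does not describe the algorithm itself, so the following is only a generic sketch of the speed-for-accuracy trade-off, not the ESSloppyMath implementation. It splits the input into exponent and mantissa and uses a truncated series; the class name `SloppyLog` is hypothetical.

```java
// Sketch only: a cheap natural-log approximation for positive, finite, normal doubles.
// This is NOT the algorithm added to ESSloppyMath; it merely illustrates the trade-off.
class SloppyLog {
    private static final double LN2 = 0.6931471805599453;

    static double log(double x) {
        if (!(x > 0) || Double.isInfinite(x) || x < Double.MIN_NORMAL) {
            return Math.log(x); // fall back for zero, negatives, NaN, infinity, subnormals
        }
        int e = Math.getExponent(x);   // x = m * 2^e with m in [1, 2)
        double m = Math.scalb(x, -e);
        double s = (m - 1) / (m + 1);  // ln(m) = 2 * atanh(s), with s in [0, 1/3)
        double s2 = s * s;
        // Truncated series 2 * (s + s^3/3 + s^5/5 + s^7/7)
        double lnM = 2 * s * (1 + s2 * (1.0 / 3 + s2 * (1.0 / 5 + s2 / 7)));
        return e * LN2 + lnM;
    }
}
```

Dropping the higher-order terms keeps the work to a handful of multiplications at the cost of a small absolute error, which is acceptable when the result is later rounded to a tile coordinate.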
2022-06-29 12:25:22 +02:00
Armin Braun 64798df2f3
Cleanup indirection in batch close utils (#87685)
Some of these show up in profiling of heavy transport load.
It's not a big thing but it's always nice to save a few cycles
on transport threads.
2022-06-22 11:54:36 +02:00
Rene Groeschke cdf5bd7ed0
Rework testing conventions gradle plugin (#87213)
This PR reworks the testing conventions precommit plugin. This plugin now:
- is compatible with yaml, java rest tests and internalClusterTest (aka different sourceSets per test type)
- enforces test base class and simple naming conventions (as it did before)
- adds one check task per test sourceSet
- uses the worker api to improve task execution parallelism and encapsulation
- is gradle configuration cache compatible  

This also ports the TestingConventions integration testing to Spock and removes the build-tools-internal/test kit folder that is not required anymore. We also add some common logic for testing java related gradle plugins. 
We will apply further cleanup to other tests within our test suite in a dedicated follow-up.
2022-06-20 16:26:38 +02:00
Chris Hegarty ac9e1013bb
Re-enable several tests since JDK-8287097 has been fixed (#87803) 2022-06-17 16:46:33 +01:00
Chris Hegarty ca7783c429
Modularize the h3 component (#87737) 2022-06-16 13:44:05 +01:00
Francisco Fernández Castaño eb8c4ba97b
Keep track of desired nodes status in cluster state (#87474)
This commit adds desired nodes status tracking to the cluster state. Previously, status was tracked
in memory by DesiredNodesMembershipService. That approach had certain limitations and made
the consumer code more complex. This commit takes a simpler approach: the status is updated when
the desired nodes are updated or when a new node joins, and it is stored in the cluster state,
which makes that information easy to consume wherever it is needed.
Additionally, this commit moves test code from depending directly on DesiredNodes, which can be
seen as an internal data structure, to relying more on UpdateDesiredNodesRequest.

Relates #84165
2022-06-16 11:08:05 +02:00
Przemyslaw Gomulka 0ef15b49e9
Stable logging API - the basic use case (#86612)
Introducing a stable logging API under libs/logging.
This change covers the most common use cases for logging: fetching a logger with LogManager and emitting log messages with Logger and Level.
It is influenced by log4j2-api, but does not include the Marker and LogBuilder methods.
Also, methods using org.apache.logging.log4j.util.Supplier are replaced with java.util.function.Supplier.

The basic implementation is present in server and injected statically in LogConfigurator

relates #84478
2022-06-13 10:25:54 +02:00
Ryan Ernst 779871e73c
Remove impossible checked exception from jarhell (#87541)
The jarhell check declares a URISyntaxException. However, this should
not be possible as the paths and URLs come from the jdk conversion. This
commit makes a URISyntaxException when converting from URL to URI an
assertion error, similar to MalformedURLException when creating a URL.
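
The pattern described here can be sketched as below. The class and method names are hypothetical and this is not the jarhell code itself; it only shows a checked exception that should be impossible being turned into an `AssertionError` so callers no longer have to declare it.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;

// Sketch only (hypothetical helper): conversions whose inputs come from the JDK
// itself, so the checked exceptions are treated as programming errors.
class UrlConversions {

    static URI toUri(URL url) {
        try {
            return url.toURI();
        } catch (URISyntaxException e) {
            // The URL was produced by a JDK path conversion, so this should never happen.
            throw new AssertionError(e);
        }
    }

    static URL toUrl(Path path) {
        try {
            return path.toUri().toURL();
        } catch (MalformedURLException e) {
            throw new AssertionError(e);
        }
    }
}
```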
2022-06-09 06:22:13 -07:00
Przemysław Witek 8656a29675
[Transform] Implement per-transform num_failure_retries setting. (#87361) 2022-06-09 15:22:06 +02:00
Armin Braun f2987b417f
Lower overhead of RestApiVersion use in x-content parsing (#87356)
Noticed loads of duplicate lambdas on the heap and code related to these predicates
etc. pop up during benchmarking things that are hot on x-content parsing.
These changes greatly simplify the rest-api code (though it could be made even simpler I think)
and remove it from profiling for the most part.

Co-authored-by: Joe Gallo <joe.gallo@elastic.co>
2022-06-08 12:22:17 +02:00
Martijn van Groningen a4fdb0ba66
Only perform ensureNoSelfReferences check during ingest when needed (#87352)
The 'ensure no self reference' check during ingest only needs to be
performed when there is a chance that a map or list is added to
`IngestDocument` that references other entries in the `IngestDocument`
and when there is a risk that a circular reference is created between
entries.

Today this is only possible in `ScriptProcessor`, since a custom script
could do this. So doing this check when no script processor is used is
not useful. Doing this check just adds an additional tax to time spent
on ingest, that in many cases really isn't necessary.

A flag is added to `IngestDocument` that controls whether the 'ensure no
self reference' check is performed at the end when all pipelines have
been executed. The `ScriptProcessor` has been modified to set this flag
and so this check will be performed if pipelines that execute have one
or more script processors.
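
A minimal sketch of the flag-gated check follows. `SketchDocument` is a hypothetical stand-in for `IngestDocument`; the method names and the identity-based traversal are assumptions for illustration, not the actual implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch only: hypothetical stand-in for IngestDocument with an opt-in cycle check.
class SketchDocument {
    final Map<String, Object> source = new HashMap<>();
    private boolean checkSelfReferences = false;

    // A script-like processor would call this, since it can insert arbitrary structures.
    void requireSelfReferenceCheck() {
        checkSelfReferences = true;
    }

    // Called once after all pipelines have run; the walk is skipped unless requested.
    void finish() {
        if (checkSelfReferences) {
            ensureNoSelfReferences(source, Collections.newSetFromMap(new IdentityHashMap<>()));
        }
    }

    // Depth-first walk that tracks the containers on the current path by identity.
    private static void ensureNoSelfReferences(Object value, Set<Object> ancestors) {
        Iterable<?> children;
        if (value instanceof Map<?, ?> map) {
            children = map.values();
        } else if (value instanceof List<?> list) {
            children = list;
        } else {
            return; // scalars cannot form cycles
        }
        if (!ancestors.add(value)) {
            throw new IllegalArgumentException("self-reference detected");
        }
        for (Object child : children) {
            ensureNoSelfReferences(child, ancestors);
        }
        ancestors.remove(value);
    }
}
```

Documents that never pass through a script-like processor skip the traversal entirely, which is where the saving comes from.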

Closes #87335
2022-06-07 06:52:17 -04:00
Chris Hegarty d245458227
Modularize the ingest.common component (as well as dissect and grok dependent libs) (#87219)
This change modularizes the ingest.common component
by adding a module-info.java, as well as two dependent libs.

The project only requires the painless SPI to compile, so that was
fixed along the way (so that the compile module path can be
inferred directly from the dependencies).
2022-05-30 17:08:13 +01:00
Przemyslaw Gomulka 416a1b352c
Catch an exception due to incorrect pattern in Strings.format (#87132)
The Strings.format method, which is used heavily in logging with
Supplier, should handle exceptions when a format is incorrect.
This prevents hard-to-catch mistakes from blowing up in the server.
Those mistakes are especially hard to detect in logging, where the
code that creates a message might only be executed when the logger is at
debug or trace level, which is not always the case in CI.
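
A rough sketch of such a defensive wrapper is shown below. The class name `SafeStrings` and the fallback message are assumptions for this example; the real `Strings.format` may report the failure differently.

```java
import java.util.Arrays;
import java.util.IllegalFormatException;
import java.util.Locale;

// Sketch only (hypothetical class): format with Locale.ROOT and never let a bad
// pattern escape as an exception from a logging call.
class SafeStrings {

    static String format(String pattern, Object... args) {
        try {
            return String.format(Locale.ROOT, pattern, args);
        } catch (IllegalFormatException e) {
            // Fall back to a message that still carries the pattern and arguments.
            return "failed to format [" + pattern + "] with " + Arrays.toString(args) + ": " + e.getMessage();
        }
    }
}
```

A call such as `SafeStrings.format("%d items", "three")` then yields a diagnostic string instead of throwing from inside a debug-level log statement.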

relates #87077 (comment)

relates #86549
2022-05-30 09:05:40 +02:00
Przemyslaw Gomulka 24fa003f5c
Replace supplier of ParameterizedMessage with java.util.Supplier<String> (#86971)
This is the result of a structural search/replace in IntelliJ. It only affects log methods with a signature
logger.info(Supplier<?>), where the level could be info/debug etc. and the supplier argument has the form
()-> new ParameterizedMessage

This commit also introduced a Strings utility class to avoid passing Locale.ROOT to every
String.format(Locale.ROOT, pattern, args)
relates #86549
2022-05-23 08:51:07 +02:00
Chris Hegarty 0d5db357df
Skip on 19 (#87000) 2022-05-20 20:49:27 +01:00
Chris Hegarty 3071c6a055
Modularize Elasticsearch (#81066)
This PR represents the initial phase of Modularizing Elasticsearch (with
Java Modules).

This initial phase modularizes the core of the Elasticsearch server
with Java Modules, which is then used to load and configure extension
components atop the server. Only a subset of extension components are
modularized at this stage (other components come in a later phase).
Components are loaded dynamically at runtime with custom class loaders
(same as is currently done). Components with a module-info.class are
defined to a module layer.

This architecture is somewhat akin to the Modular JDK, where
applications run on the classpath. In the analogy, the Elasticsearch
server modules are the platform (thus are always resolved and present),
while components without a module-info.class are non-modular code
running atop the Elasticsearch server modules. The extension components
cannot access types from non-exported packages of the server modules, in
the same way that classpath applications cannot access types from
non-exported packages of modules from the JDK. Broadly, the core
Elasticsearch java modules simply "wrap" the existing packages and export
them. There are opportunities to export less, which is best done in more
narrowly focused follow-up PRs.
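
The JDK analogy can be observed with the standard `Module` API. The sketch below is only a small probe, not Elasticsearch code: it reports whether a class lives in a named module or in the unnamed (classpath) module. The class name `ModuleProbe` is hypothetical.

```java
// Sketch only: inspect which module a class belongs to using the standard JDK API.
class ModuleProbe {

    static String describe(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed() ? "named module " + module.getName() : "unnamed module (classpath)";
    }

    public static void main(String[] args) {
        // JDK classes always live in named platform modules.
        System.out.println(describe(String.class));
        // Code launched from the classpath ends up in the unnamed module.
        System.out.println(describe(ModuleProbe.class));
        // The boot layer holds the modules resolved at startup.
        System.out.println(ModuleLayer.boot().findModule("java.base").isPresent());
    }
}
```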

The Elasticsearch distribution startup scripts are updated to put jars
on the module path (the class path is empty), so the distribution will
run the core of the server as java modules. A number of key components
have been retrofitted with module-info.java's too, and the remaining
components can follow later. Unit and functional tests run as
non-modular (since they commonly require package-private access), while
higher-level integration tests, that run the distribution, run as
modular.

Co-authored-by: Chris Hegarty <christopher.hegarty@elastic.co>
Co-authored-by: Ryan Ernst <ryan@iernst.net>
Co-authored-by: Rene Groeschke <rene@elastic.co>
2022-05-20 13:11:42 +01:00
Francisco Fernández Castaño e91e7e653b
Add support for CPU ranges in desired nodes (#86434)
This commit adds support for CPU ranges in the desired nodes API. 

This aligns better with environments where administrators/orchestrators
can define lower and upper bounds for the amount of CPUs that the
desired node would get once deployed. 

This makes it possible to provide information about the expected CPU and any
allowed overcommit that the desired node will run on.

This was the previous expected body for the desired nodes API (we still support it):
```
PUT /_internal/desired_nodes/history/1
{
    "nodes" : [
        {
            "settings" : {
                 "node.name" : "instance-000187",
                 "node.external_id": "instance-000187",
                 "node.roles" : ["data_hot", "master"],
                 "node.attr.data" : "hot",
                 "node.attr.logical_availability_zone" : "zone-0"
            },
            "processors" : 8, 
            "memory" : "58gb",
            "storage" : "1700gb",
            "node_version" : "8.3.0"
        }
    ]
}
```

Now it's possible to define `processors` or `processors_range` as in:
```
PUT /_internal/desired_nodes/history/1
{
    "nodes" : [
        {
            "settings" : {
                 "node.name" : "instance-000187",
                 "node.external_id": "instance-000187",
                 "node.roles" : ["data_hot", "master"],
                 "node.attr.data" : "hot",
                 "node.attr.logical_availability_zone" : "zone-0"
            },
            "processors_range" : {"min": 8.0, "max": 16.0},
            "memory" : "58gb",
            "storage" : "1700gb",
            "node_version" : "8.3.0"
        }
    ]
}
```
Note that `max` in `processors_range` is optional.
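
A range with an optional upper bound could be modelled roughly as follows. The record name, the validation rules and the `contains` helper are assumptions for illustration and are not taken from the desired nodes implementation.

```java
// Sketch only (hypothetical type): a processors range with a required lower bound
// and an optional upper bound, both expressed as floating point numbers.
record ProcessorsRange(double min, Double max) {

    ProcessorsRange {
        if (!(min > 0) || Double.isInfinite(min)) {
            throw new IllegalArgumentException("min must be a positive, finite number: " + min);
        }
        if (max != null && (max.isNaN() || max.isInfinite() || max < min)) {
            throw new IllegalArgumentException("max must be finite and >= min: " + max);
        }
    }

    // Convenience factory for a range without an upper bound.
    static ProcessorsRange atLeast(double min) {
        return new ProcessorsRange(min, null);
    }

    boolean contains(double processors) {
        return processors >= min && (max == null || processors <= max);
    }
}
```

With this shape, `{"min": 8.0, "max": 16.0}` maps to `new ProcessorsRange(8.0, 16.0)` and a body without `max` maps to `ProcessorsRange.atLeast(8.0)`.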

This commit also moves from representing CPUs as integers to
accepting floating point numbers.

Note: I disabled the bwc yamlRestTests for versions < 8.3 since we introduced
a few "breaking changes" but since this is an internal API it should be fine.
2022-05-20 11:47:32 +02:00
Ryan Ernst b9c504b892
Replace most shell script logic with Java (#85758)
Elasticsearch provides several command line tools, as well as the main script to start elasticsearch. While most of the logic is abstracted away for cli tools, the main elasticsearch script has hundreds of lines of platform specific shell code. That code is difficult to maintain because it uses many special shell features which then must also exist on other platforms (i.e. Windows batch files). Additionally, the logic in these scripts is not easy to test: we must be on the actual platform and test with a full installation of Elasticsearch, which is relatively slow (compared to most in-process tests).

This commit replaces logic of the main server script, as well as the windows service management script, with Java. The new entrypoints use the CliToolLauncher. The server cli figures out all the jvm options and such necessary, then launches the real server process. If run in the foreground, the launcher will stay alive for the lifetime of Elasticsearch; the streams are effectively inherited so all output from Elasticsearch still goes to the console. If daemonizing, the launcher waits around until Elasticsearch is "ready" (this means the Node startup completed), then detaches and exits.

Co-authored-by: William Brafford <william.brafford@elastic.co>
2022-05-19 08:29:08 -07:00
Ryan Ernst a40ed708f0
Fix Terminal flushing and make api consistent (#86808)
This commit fixes the PrintWriters created by Terminal to use
autoflushing. It also fixes prompting to flush the error stream after
writing. Finally, it makes the APIs more consistent by adding getters
for verbosity and reader so that subclasses of Terminal can fully wrap
an existing terminal.
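
The effect of autoflushing can be seen with the standard `PrintWriter` constructor alone. This sketch does not use the Terminal class; `AutoflushDemo` is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

// Sketch only: shows why a PrintWriter without autoflush can leave output stuck in a buffer.
class AutoflushDemo {

    // Returns how many bytes reached the underlying stream after one println,
    // without any explicit flush() call.
    static int bytesVisibleAfterPrintln(boolean autoFlush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(sink, autoFlush);
        writer.println("prompt");
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("autoflush off: " + bytesVisibleAfterPrintln(false));
        System.out.println("autoflush on:  " + bytesVisibleAfterPrintln(true));
    }
}
```

Note that `PrintWriter` autoflush only triggers on `println`, `printf` and `format`, so a prompt written with `print` still needs an explicit flush.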

relates #85758
2022-05-16 16:26:12 -07:00
Chris Hegarty 74c03c21dc
Expand jar hell to include modules (#86622)
Expands jar hell to include modules from the system class loader.
2022-05-13 07:54:12 +01:00
Chris Hegarty ddc354f62f
Make embedded class loader module aware (#86355)
This change adds support to embedded class loader to load the provider
and implementation dependencies as modules - within their own module
layer - when the caller itself is a named module. Currently, this code
is not yet triggered during deployment, since the caller is always an
unnamed module, but the caller will be modularized in a subsequent
change.
2022-05-10 09:20:15 +01:00
Ignacio Vera 021989457d
Add getResolution method to H3 (#86519) 2022-05-09 11:10:42 +02:00
Chris Hegarty e30a069839
Fix Windows EmbeddedImplClassLoaderTests (#86413)
Fix Windows EmbeddedImplClassLoaderTests. Remove use of openStream in favor of getResourceAsStream.

Co-authored-by: Ryan Ernst <ryan@iernst.net>
2022-05-06 09:53:05 +01:00
Alan Woodward 4d076eee20
Upgrade to Lucene 9.2 snapshot efa5d6f4d43 (#86227)
Notable changes include:

- count implementations for MultiRangeQuery and IndexSortedNumericDocValuesRangeQuery, which may speed up certain aggregations
- more efficient decoding of docids in BKD reader
2022-05-05 15:48:13 +01:00
Ryan Ernst 733f9fa5b8
Move cli shutdown hook to CliToolLauncher (#86412)
Each Command subclass can implement close() so that resources will be
cleaned up on exceptional exit like SIGINT. This is implemented through
a shutdown hook added in the superclass constructor. However, this hook
makes testing difficult because the hook cannot be added in normal
tests, so a flag must be overridden when testing Command classes.

This commit moves the shutdown hook handling into the CliToolLauncher
that creates the command. It also adds non-evil tests that check how the
hook runs, in place of the old evil tests that actually registered a
real shutdown hook.
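
One way to keep hook logic testable is to build the hook body separately from its registration, as in the sketch below. `HookingLauncher` and its methods are hypothetical and do not mirror the actual CliToolLauncher API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch only (hypothetical launcher): the launcher, not the command, owns the shutdown hook.
class HookingLauncher {

    // Builds the hook body separately so tests can run it without registering a real hook.
    static Runnable closeOnShutdown(Closeable command) {
        return () -> {
            try {
                command.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    static void launch(Closeable command) {
        // Registered once, at launch time, so the command is closed on exits like SIGINT.
        Runtime.getRuntime().addShutdownHook(new Thread(closeOnShutdown(command)));
        // ... the command itself would run here ...
    }
}
```

A test can call `closeOnShutdown(command).run()` directly and assert that the command was closed, with no JVM-wide side effects.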

relates #85758
2022-05-05 06:50:45 -07:00
Armin Braun 017bf7ff91
Use faster and cleaner map parsing loop in ObjectParser.parse (#86319)
Much easier to use the new API and not have to worry about all the manual
checks on the field name that were dead code anyway because they'd only be
hit on malformed input bytes which Jackson would throw on before entering
our code.
2022-05-03 16:14:16 +02:00
Armin Braun 1c497ac516
Fix using FilterOutputStream without overriding bulk write (#86304)
Fix using the filter output stream without overriding the bulk write.
The usage in `directFieldAsBase64` seems like a serious performance bug
since the stream is used to write a potentially larger response here.

I also removed the `BlobOutputStream` that used to contain the same
fix now added to the no-close stream after realizing the class is pointless
to begin with to cut down on our usage of `FilterOutputStream` where the bulk
write fix is needed.
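
The underlying issue is that `java.io.FilterOutputStream.write(byte[], int, int)` forwards one byte at a time unless it is overridden. The sketch below demonstrates this with hypothetical helper classes; it is not the stream from the commit.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Sketch only: counts how the wrapped stream is called.
class CountingSink extends OutputStream {
    int singleByteWrites;
    int bulkWrites;

    @Override
    public void write(int b) {
        singleByteWrites++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        bulkWrites++;
    }
}

// A filter stream that forwards bulk writes in one call instead of byte by byte.
class BulkFilterOutputStream extends FilterOutputStream {
    BulkFilterOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
    }
}

class BulkWriteDemo {
    // Writes `length` bytes through either the plain or the overriding filter.
    static CountingSink copy(boolean overrideBulkWrite, int length) {
        CountingSink sink = new CountingSink();
        OutputStream filter = overrideBulkWrite ? new BulkFilterOutputStream(sink) : new FilterOutputStream(sink);
        try {
            filter.write(new byte[length], 0, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }
}
```

For a large response the difference is one delegated call versus one call per byte, which is why a missing override is easy to overlook and costly under load.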
2022-05-02 21:25:14 +02:00
Chris Hegarty a6177121ea
Make embedded class loader MRJAR aware (#86316)
This change updates the embedded class loader to be MRJAR aware.
2022-05-01 12:23:46 +01:00
Ryan Ernst ed749fcc5c
Move cli sysprops and envVars to execute parameter (#86279)
The sysprops and envVars members of Command provide cli implementations
with information about the jvm process that is running. This is
convenient for runtime, but difficult for tests to mock because they
must subclass the cli class.

This commit adds a ProcessInfo record, and plumbs it through the
main and execute methods. The new record includes system properties,
environment variables and the working directory. By having this be a
single new parameter, additional information can be added in the future
without again needing to modify the method signatures.
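
A record of this kind might look like the sketch below. The component names, the `fromSystem` factory and the `GreetTool` consumer are assumptions for illustration; only the three pieces of information come from the commit message.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Sketch only: component and method names are assumed, not taken from the commit.
record ProcessInfo(Map<String, String> sysprops, Map<String, String> envVars, Path workingDir) {

    // Captures the state of the running JVM once, at the entry point.
    static ProcessInfo fromSystem() {
        Map<String, String> props = new HashMap<>();
        for (String name : System.getProperties().stringPropertyNames()) {
            props.put(name, System.getProperty(name));
        }
        return new ProcessInfo(Map.copyOf(props), Map.copyOf(System.getenv()), Path.of("").toAbsolutePath());
    }
}

// Hypothetical tool that reads its environment only through the parameter.
class GreetTool {
    static String execute(ProcessInfo info) {
        return "hello " + info.envVars().getOrDefault("USER", "unknown") + " from " + info.workingDir();
    }
}
```

Because the tool never touches `System.getenv()` directly, a test can pass a hand-built `ProcessInfo` instead of subclassing the tool or mutating the test JVM.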

relates #85758
2022-04-29 13:47:30 -07:00
Armin Braun 943c0d551b
Use Faster API for Parsing Maps from XContent (#85732)
We can be a little more efficient when parsing maps and exploit
the fact that we know the next token is a name in a couple of cases.
I fixed the most performance relevant one but there's a couple more
that could make use of this API in a follow up.
2022-04-29 12:20:33 +02:00
Ryan Ernst f66ece6a24
Remove cli variant of SuppressForbidden (#86274)
The cli lib has the SuppressForbidden annotation, but so does core,
which cli depends on. This commit removes the SuppressForbidden from
cli, in favor of the one from core.

relates #85758
2022-04-28 18:21:33 -07:00
Ryan Ernst 3e581c66f1
Cleanup Terminal to make it easier to subclass (#86198)
Terminal is the abstraction Elasticsearch uses for all input and output,
both character based and binary. In an interactive shell, this is backed
by Java's Console, and in non-interactive it is backed by
stdin/stdout/stderr. Over time, the Terminal class has been amended to
support several different use cases, which has made constructing
subclasses for testing or filter based implementations complex. This
commit reworks Terminal so that the readers/writers/streams are
constructor arguments, instead of overrides. This allows subclasses to
simply call super with what is needed, rather than overloading several
methods and adding the same boilerplate implementation as others.

Note that the majority of the modifications here are to tests because
MockTerminal now has a factory method instead of direct constructor.

relates #85758
2022-04-27 19:06:05 -07:00
Chris Hegarty 06b14e57ec
Move embedded loader and provider locator to core (#86081) 2022-04-25 08:23:45 +01:00
Chris Hegarty ce6d2e7f31
remove xmlbeans jarhell exception (#86100) 2022-04-22 19:52:07 +01:00
Ryan Ernst b2c9028384
Move io utils to core package (#85954)
Most classes under elasticsearch-core had been moved to the o.e.core
package. However, a couple io related classes remained in an "internal"
package. This commit moves Streams and IOUtils to the core package, as
they are no more "internal" than the rest of the classes in core.
2022-04-19 21:26:28 -07:00
Ryan Ernst 118711efc7
Remove Terminal.readSecret with max length (#85962)
The Terminal class currently has two variants of readSecret, one that is
similar to Console.readPassword, and another that takes a maximum length
to read. This second variant was originally added as a protection when
reading the keystore password. However, there are no other uses of it,
and the passphrase has already been fully read in bin/elasticsearch by
the time it is read from Terminal, so it shouldn't be possible to
actually hit this edge case. This commit removes the maxLength variant.

relates #85758
2022-04-19 14:54:42 -07:00
Ryan Ernst d9e6fc8161
Move netutil to netty module (#85953)
This utils class was previously shared across transport implementations,
but is only used by netty now.
2022-04-19 14:53:56 -07:00
Ryan Ernst aafd2f92fc
Move docker env var settings handling out of bash (#85913)
In docker ES allows settings to be set via environment variables. This
is currently handled in complex bash logic. This commit moves that logic
into EnvironmentAwareCommand where the rest of the initial settings are
found.

relates #85758
2022-04-18 09:26:14 -07:00
Ryan Ernst 9f46aae615
Consolidating logging initialization in cli launcher (#85920)
Several mechanisms exist for initializing logging in cli tools. Some
base Command classes exist which initialize logging. But they do this
late, when they are constructed, which may be after static init has
occurred for classes grabbing a Logger. Other CLIs like node tool
explicitly initialize logging to avoid that problem.

This commit removes all of the LoggingAware classes, and
unifies logging configuration to occur at the very beginning of the cli
launcher.

relates #85758
2022-04-18 08:22:50 -07:00
Ryan Ernst e307f3222f
Check only major version for java compatibility (#85886)
This commit adjusts the version check for plugins to only check the Java
major version, instead of the full version string.
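
A major-version-only comparison can be sketched with the standard `Runtime.Version` API. The class name, the method and the "runtime must be at least the required major" rule are assumptions for this example, not the plugin check itself.

```java
// Sketch only (hypothetical check): compare Java versions by major (feature) number.
class JavaVersionCheck {

    // True when the runtime's major version is at least the required major version;
    // minor, security and patch components are ignored.
    static boolean isCompatible(String requiredVersion, Runtime.Version runtime) {
        return runtime.feature() >= Runtime.Version.parse(requiredVersion).feature();
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("17.0.2", Runtime.version()));
    }
}
```

Under this rule a plugin built against `17.0.2` is accepted on a `17.0.1` runtime, which a comparison of the full version would reject.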

fixes #85880
2022-04-14 11:02:57 -07:00
Ryan Ernst 1088ef6ded
Capture system properties and env variables for cli tools to use (#85885)
Currently any code needing to access system properties or environment
variables does it with the static methods provided by Java. While this
is ok in production since these are instantiated for the entire jvm
once, it makes any code reading these properties difficult to test
without mucking with the test jvm.

This commit adds system properties and environment variables to the base
Command class that our CLI tools use. While it does not propagate the
properties and env down for all possible uses in the system, it is the
first step, and it makes CLI testing a bit easier.
2022-04-14 09:22:57 -07:00
Ryan Ernst 1d4534f848
Introduce unified entrypoint for CLI scripts (#85821)
CLI scripts have a common infrastructure in that they call to the shared
elasticsearch-cli shell script which launches them with the appropriate
java command line. However, each underlying Java class must implement
its own main method.

This commit introduces a single main method to be shared by CLIs. The
new CliToolLauncher takes in system properties to determine which tool
is being run, and a new CliToolProvider SPI allows defining and finding
the named tools.

relates #85758

Co-authored-by: William Brafford <william.brafford@elastic.co>
2022-04-14 08:53:36 -07:00
Armin Braun f887ac22b4
Remove NIO Transport Plugin (#82085)
Removes NIO transport.
2022-04-12 11:00:26 +02:00