Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
This commit fixes a jarhell test to create an unnamed temp dir, instead
of the existing creation which uses the test method name. The reason
this causes problems is when running with many iterations, the test
method name is artificially adjusted to include seed information, using
special characters that are potentially invalid path characters.
closes #98949
Lots of spots where we did weird things around streams like redundant stream creation, redundant collecting
before adding all the collected elements to another collection or so, redundant streams for joining strings
and using less efficient `Collectors.toList` and in a few cases also incorrectly relying on the result being mutable.
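
A minimal sketch of the kind of cleanup described above. It is not taken from the actual diff: the class and method names are hypothetical, and it only illustrates the redundant-collect, redundant-join, and mutability points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from the commit.
final class StreamCleanupSketch {
    // Before: collect into a throwaway list, then copy it into the target list.
    static List<String> upperBefore(List<String> names) {
        List<String> result = new ArrayList<>();
        result.addAll(names.stream().map(n -> n.toUpperCase(Locale.ROOT)).collect(Collectors.toList()));
        return result;
    }

    // After: fill the target collection directly, no intermediate list.
    static List<String> upperAfter(List<String> names) {
        List<String> result = new ArrayList<>(names.size());
        for (String n : names) {
            result.add(n.toUpperCase(Locale.ROOT));
        }
        return result;
    }

    // Before: a stream created only to join strings.
    static String joinBefore(List<String> names) {
        return names.stream().collect(Collectors.joining(","));
    }

    // After: String.join does the same without a stream.
    static String joinAfter(List<String> names) {
        return String.join(",", names);
    }

    // Stream.toList() (Java 16+) returns an unmodifiable list, so callers
    // must not rely on being able to mutate the result.
    static List<String> upperUnmodifiable(List<String> names) {
        return names.stream().map(n -> n.toUpperCase(Locale.ROOT)).toList();
    }
}
```

`Collectors.toList()` makes no guarantee about the mutability of the returned list, so code that needs a mutable result is safer building its own `ArrayList` as in `upperAfter`.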
It's legitimate to wrap the delegate twice, with two different
assertOnce calls, which would yield different objects if and only if
assertions are enabled. So we'd better not ever use these things as map
keys etc.
This class was quite hot in recent benchmarks of shared-cache-based
searches and we can make instantiating the releasable locks a little cheaper.
Also, those same benchmarks showed a lot of visible time spent on
dealing with ref counts. I removed one layer of indirection in atomic
use from both the release-once and the abstract ref count which
should save a little in CPU caches as well.
Today `AbstractRefCounted` holds an `AtomicInteger` which holds the
actual ref count, which is an extra heap object and means that
acquiring/releasing refs always goes through that extra pointer lookup.
We use this utility extensively, on some pretty hot paths, so with this
commit we move to using a primitive `refCount` field with atomic
operations via a `VarHandle`.
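
A rough sketch of the approach, assuming a simplified ref-counted base class. `RefCountedSketch` and its method names are illustrative and are not the real `AbstractRefCounted` code; only the `VarHandle` calls are standard JDK API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical simplified class; not the actual AbstractRefCounted implementation.
abstract class RefCountedSketch {
    private static final VarHandle REF_COUNT;

    static {
        try {
            // Look up a handle for the primitive field so no AtomicInteger object is needed.
            REF_COUNT = MethodHandles.lookup().findVarHandle(RefCountedSketch.class, "refCount", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Accessed through REF_COUNT for updates; starts with one reference held by the creator.
    private volatile int refCount = 1;

    /** Tries to acquire a reference; fails once the count has reached zero. */
    final boolean tryIncRef() {
        while (true) {
            int current = refCount;
            if (current <= 0) {
                return false;
            }
            if (REF_COUNT.compareAndSet(this, current, current + 1)) {
                return true;
            }
        }
    }

    /** Releases a reference; returns true if this call released the last one. */
    final boolean decRef() {
        int remaining = (int) REF_COUNT.getAndAdd(this, -1) - 1;
        if (remaining < 0) {
            throw new IllegalStateException("already fully released");
        }
        if (remaining == 0) {
            closeInternal();
            return true;
        }
        return false;
    }

    final int refCount() {
        return refCount;
    }

    /** Called exactly once, when the last reference is released. */
    protected abstract void closeInternal();
}
```

The field lives directly in the object, so acquiring or releasing a reference touches one object instead of following a pointer to a separate `AtomicInteger`.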
In #94884 the ability to add qualified exports and opens from jars
upstream of server was added. Some Elasticsearch components need to
qualify their exports to another component. This commit tweaks the
loading of the exports services so that each loaded plugin/component
has their qualified exports handled automatically.
The preallocate module needs access to java.io internals. However, in
order to open java.io to a specific module, rather than the unnamed
module as was previously done, the said module must be in the boot
layer.
This commit moves the preallocate module to libs. It adds it to the main
lib dir, though it does not add it as a compile dependency of server.
Fixes #82794. Upgrade the spotless plugin, which addresses the issue
around formatting `instanceof` expressions. Formatting of statements
including lambdas seems to have improved too.
Today both `Releasables#wrap` implementations return a lambda which does
not render well in logs or exception messages. This commit improves
that. It also adds tests for these methods and moves the test suite to
the correct module.
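
To illustrate why a lambda renders poorly, here is a small sketch using a stand-in `Releasable` interface. The `wrap` method and the `"wrapped[...]"` output format are assumptions for illustration, not the actual `Releasables#wrap` code.

```java
import java.util.List;

// Hypothetical stand-ins; not the real Releasables utility.
final class WrapSketch {
    interface Releasable extends AutoCloseable {
        @Override
        void close();
    }

    // Returning a lambda here would print as something like WrapSketch$$Lambda...,
    // which says nothing about what is being released.
    static Releasable wrap(List<Releasable> releasables) {
        return new Releasable() {
            @Override
            public void close() {
                for (Releasable releasable : releasables) {
                    releasable.close();
                }
            }

            @Override
            public String toString() {
                // Assumed format: delegate to the wrapped items' own toString().
                return "wrapped" + releasables;
            }
        };
    }
}
```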
Some optimisations that I found when reusing searchable snapshot code elsewhere:
* Add an efficient input stream -> byte buffer path that avoids allocations + copies for heap buffers, this is non-trivial in its effects IMO
* Also at least avoid allocations and use existing thread-local buffer when doing input stream -> direct bb
* move `readFully` to lower level streams class to enable this
* Use same thread local direct byte buffer for frozen and caching index input instead of constantly allocating new heap buffers and writing those to disk inefficiently
We recently adjusted `RunOnce`, `Releasables#releaseOnce` and
`ActionListener#notifyOnce` so that once they have fired they drop the
now-unnecessary reference to the delegate. This commit introduces tests
to verify that this reference does genuinely become unreachable (i.e.
available for garbage collection) as expected.
It also fixes a bug in `ActionListener#notifyOnce` which caused us to
unexpectedly retain a reference to the delegate 🤦
Relates #92452 Relates #92507 Relates #92537
Use locale-independent `Strings.format` method instead of `String.format(Locale.ROOT, ...)`.
Inline `ESTestCase.forbidden` calls with `Strings.format` for consistency's sake.
Add `Strings.format` alias in `common.Strings`
Adds a null check and a `toString()` implementation which passes through
to the wrapped runnable. Also renames `RefCountedTests` to
`AbstractRefCountedTests` since they're really all about testing this
specific implementation.
Like #92452 and #92507 but for `Releasables#releaseOnce`: there's no
need to keep hold of the wrapped releasable after closing it, and in
some cases this might hold on to excessive heap. With this commit we
drop the reference to the delegate when it's complete.
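
One way to get this behaviour, sketched with a plain `Runnable` delegate. `ReleaseOnceSketch` is a hypothetical name and this is not the actual `Releasables#releaseOnce` implementation.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; the real utility wraps a Releasable rather than a Runnable.
final class ReleaseOnceSketch implements AutoCloseable {
    private final AtomicReference<Runnable> delegate;

    ReleaseOnceSketch(Runnable delegate) {
        this.delegate = new AtomicReference<>(Objects.requireNonNull(delegate));
    }

    @Override
    public void close() {
        // getAndSet(null) both guarantees single execution and drops the
        // reference, so the delegate can be garbage collected afterwards.
        Runnable toRun = delegate.getAndSet(null);
        if (toRun != null) {
            toRun.run();
        }
    }

    boolean hasReleased() {
        return delegate.get() == null;
    }
}
```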
Today `DisruptableMockTransport` leaks refs to transport messages in
various ways if the transport is rebooted. This commit adds the missing
ref-count handling.
Closes #91837
With the new StableBuildPlugin we do not add default dependencies to the plugin classpath.
That exposed an issue with the JarHellPlugin not really taking care of configuring the
jarHell configuration.
The JarHell Plugin only worked so far because the plugin build plugin added a transitive dependency
to elasticsearch-core, so it only worked by accident.
This adds a default configuration of the jarHell configuration that can be overridden manually.
Furthermore we make some inter-plugin dependencies explicit and add proper functional
test coverage.
These two mostly do the same, making `CompletableContext` and
the compatibility layer between the context and `ActionListener`
redundant.
Removing it here in part to set up a clean solution for #77999.
The initial implementation of the embedded class loader took a brute
force approach to supporting multi-release JARs - iterating over all
possible release versions when searching for classes and resources. This
change improves upon that approach by deriving and caching package and
version specific maps, so class and resource loading can go directly to
the class and resource bytes, respectively, rather than searching.
It's hard to get empirical numbers to quantify just how much this change
improves the performance of classes loaded by this loader, and there are
typically only a couple of hundred classes loaded, but the initial cli
seems observably much quicker, while the server startup has improved
just a bit (at least on my machine).
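
The idea of replacing a per-lookup version search with a precomputed map can be sketched as follows. This is a simplified, per-resource illustration with hypothetical names; the actual loader derives package and version specific maps, and its data structures are not shown in this message.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of resolving multi-release JAR entries once, up front.
final class MultiReleaseIndexSketch {
    private static final String VERSIONS_PREFIX = "META-INF/versions/";

    /**
     * Maps each logical resource name to the JAR entry that should serve it:
     * the highest versioned entry not newer than the runtime, else the base entry.
     */
    static Map<String, String> resolve(Collection<String> entries, int runtimeFeature) {
        Map<String, String> resolved = new HashMap<>();
        Map<String, Integer> chosenVersion = new HashMap<>();
        for (String entry : entries) {
            String name = entry;
            int version = 0; // 0 stands for the unversioned base entry
            if (entry.startsWith(VERSIONS_PREFIX)) {
                int slash = entry.indexOf('/', VERSIONS_PREFIX.length());
                if (slash < 0) {
                    continue;
                }
                try {
                    version = Integer.parseInt(entry.substring(VERSIONS_PREFIX.length(), slash));
                } catch (NumberFormatException e) {
                    continue; // not a versioned directory, ignore
                }
                if (version > runtimeFeature) {
                    continue; // too new for this runtime
                }
                name = entry.substring(slash + 1);
            }
            if (version > chosenVersion.getOrDefault(name, -1)) {
                chosenVersion.put(name, version);
                resolved.put(name, entry);
            }
        }
        return resolved;
    }
}
```

With the map cached, a lookup is a single `get` instead of a probe for every possible release version.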
The jarhell check declares a URISyntaxException. However, this should
not be possible as the paths and URLs come from the jdk conversion. This
commit makes a URISyntaxException when converting from URL to URI an
assertion error, similar to MalformedURLException when creating a URL.
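
The pattern looks roughly like this. `UrlUriSketch` and its method names are hypothetical and the real jarhell code is structured differently; only the exception handling idea is illustrated.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;

// Hypothetical helper; not the actual JarHell code.
final class UrlUriSketch {
    static URL toUrl(Path path) {
        try {
            return path.toUri().toURL();
        } catch (MalformedURLException e) {
            // The URL is derived from a JDK-produced URI, so this should never happen.
            throw new AssertionError(e);
        }
    }

    static URI toUri(URL url) {
        try {
            return url.toURI();
        } catch (URISyntaxException e) {
            // Same reasoning: the URL came from the JDK's own conversion.
            throw new AssertionError(e);
        }
    }
}
```

Callers then no longer need to declare or handle a checked exception that cannot occur in practice.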
Noticed loads of duplicate lambdas on the heap and code related to these predicates
etc. pop up during benchmarking things that are hot on x-content parsing.
These changes greatly simplify the rest-api code (though it could be made even simpler I think)
and remove it from profiling for the most part.
Co-authored-by: Joe Gallo <joe.gallo@elastic.co>
The Strings.format method, which is used heavily in logging with
Supplier, should handle exceptions when a format is incorrect.
This will prevent hard-to-catch mistakes from blowing up in the server.
Those mistakes are especially hard to detect in logging, where the
code that creates a message might only be executed when the logger is at
debug or trace level, which is not always the case in CI.
relates #87077 (comment)
relates #86549
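
A sketch of what such a forgiving format helper could look like. The fallback string returned on failure is an assumption made for illustration; this message does not say what the real `Strings.format` does after catching the exception.

```java
import java.util.Arrays;
import java.util.IllegalFormatException;
import java.util.Locale;

// Hypothetical helper; the fallback behaviour is an assumption, not the real implementation.
final class FormatSketch {
    static String format(String pattern, Object... args) {
        try {
            // Always Locale.ROOT, so callers do not have to pass it themselves.
            return String.format(Locale.ROOT, pattern, args);
        } catch (IllegalFormatException e) {
            // A bad pattern in a rarely executed log statement should not
            // take down the caller; return something still useful for debugging.
            return pattern + " " + Arrays.toString(args);
        }
    }
}
```

For example, `FormatSketch.format("%d", "x")` returns the raw pattern plus the arguments instead of throwing `IllegalFormatConversionException`.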
This is the result of a structural search/replace in IntelliJ. It only affects log methods with a signature
logger.info(Supplier<?>), where the level could be info/debug etc. and the supplier argument has the form
()-> new ParameterizedMessage
This commit also introduced a Strings utility class to avoid passing Locale.ROOT to every
String.format(Locale.ROOT, pattern, args)
relates #86549
This PR represents the initial phase of Modularizing Elasticsearch (with
Java Modules).
This initial phase modularizes the core of the Elasticsearch server
with Java Modules, which is then used to load and configure extension
components atop the server. Only a subset of extension components are
modularized at this stage (other components come in a later phase).
Components are loaded dynamically at runtime with custom class loaders
(same as is currently done). Components with a module-info.class are
defined to a module layer.
This architecture is somewhat akin to the Modular JDK, where
applications run on the classpath. In the analogy, the Elasticsearch
server modules are the platform (thus are always resolved and present),
while components without a module-info.class are non-modular code
running atop the Elasticsearch server modules. The extension components
cannot access types from non-exported packages of the server modules, in
the same way that classpath applications cannot access types from
non-exported packages of modules from the JDK. Broadly, the core
Elasticsearch java modules simply "wrap" the existing packages and export
them. There are opportunities to export less, which is best done in more
narrowly focused follow-up PRs.
The Elasticsearch distribution startup scripts are updated to put jars
on the module path (the class path is empty), so the distribution will
run the core of the server as java modules. A number of key components
have been retrofitted with module-info.java's too, and the remaining
components can follow later. Unit and functional tests run as
non-modular (since they commonly require package-private access), while
higher-level integration tests, that run the distribution, run as
modular.
Co-authored-by: Chris Hegarty <christopher.hegarty@elastic.co>
Co-authored-by: Ryan Ernst <ryan@iernst.net>
Co-authored-by: Rene Groeschke <rene@elastic.co>
This change adds support to embedded class loader to load the provider
and implementation dependencies as modules - within their own module
layer - when the caller itself is a named module. Currently, this code
is not yet triggered during deployment, since the caller is always an
unnamed module, but the caller will be modularized in a subsequent
change.
Notable changes include:
* count implementations for MultiRangeQuery and IndexSortedNumericDocValuesRangeQuery, which may speed up certain aggregations
* more efficient decoding of docids in BKD reader
Fix using the filter output stream without overriding the bulk write.
The usage in `directFieldAsBase64` seems like a serious performance bug
since the stream is used to write a potentially larger response here.
I also removed the `BlobOutputStream`, which used to contain the same
fix now added to the no-close stream, after realizing the class is pointless
to begin with. This cuts down on our usage of `FilterOutputStream` where the bulk
write fix is needed.
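
For context, `FilterOutputStream#write(byte[], int, int)` by default loops over `write(int)`, one byte at a time. The sketch below shows the override and a small counter to make the difference visible; the class names are hypothetical and are not the streams touched by this commit.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical demonstration classes; not the actual Elasticsearch streams.
final class BulkWriteSketch {
    /** Records how the delegate was called. */
    static final class CountingSink extends OutputStream {
        int singleByteCalls;
        int bulkCalls;

        @Override
        public void write(int b) {
            singleByteCalls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            bulkCalls++;
        }
    }

    /** A filter stream with the bulk write fix applied. */
    static final class PassThrough extends FilterOutputStream {
        PassThrough(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            // Forward the whole range in one call instead of byte by byte.
            out.write(b, off, len);
        }
    }

    /** Writes data through a filter stream, with or without the fix, and returns the sink. */
    static CountingSink writeThrough(boolean withFix, byte[] data) {
        CountingSink sink = new CountingSink();
        try (OutputStream stream = withFix ? new PassThrough(sink) : new FilterOutputStream(sink)) {
            stream.write(data, 0, data.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }
}
```

Without the override, an N-byte bulk write reaches the delegate as N single-byte calls; with it, the delegate sees one bulk call.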
Most classes under elasticsearch-core had been moved to the o.e.core
package. However, a couple of IO-related classes remained in an "internal"
package. This commit moves Streams and IOUtils to the core package, as
they are no more "internal" than the rest of the classes in core.
Since Java 9, the JDK has provided a means of parsing Java versions and
getting the current Java version. That class obviates the need for the
JavaVersion class of Elasticsearch. This commit removes the JavaVersion
class in favor of Runtime.Version.
Note that most of the changes here simply removed logic around
versioning because this change is intended only for the master branch,
where Java 17 is required.
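
For reference, a short sketch of the JDK API that replaces the custom class. `Runtime.version()` and `Runtime.Version.parse` are standard since Java 9 and `feature()` since Java 10; the wrapper class and method names here are hypothetical.

```java
// Hypothetical wrapper around the standard Runtime.Version API.
final class RuntimeVersionSketch {
    /** Feature (major) version of the running JVM, e.g. 17. */
    static int currentFeature() {
        return Runtime.version().feature();
    }

    /** True if the given version string is at least the given feature release. */
    static boolean isAtLeast(String version, int feature) {
        return Runtime.Version.parse(version).feature() >= feature;
    }

    /** Compares two version strings using the JDK's own ordering. */
    static int compare(String left, String right) {
        return Runtime.Version.parse(left).compareTo(Runtime.Version.parse(right));
    }
}
```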
Lucene issues that resulted in elasticsearch changes:
LUCENE-9820: Separate logic for reading the BKD index from logic to intersecting it.
LUCENE-10377: Replace 'sortPos' with 'enableSkipping' in SortField.getComparator()
LUCENE-10301: make the test-framework a proper module by moving all test
classes to org.apache.lucene.tests
LUCENE-10300: rewrite how resources are read in ukrainian morfologik analyzer
LUCENE-10054: Make HnswGraph hierarchical