This commit extends the ThirdPartyAuditTask check that adds the vector module when building, to include JDK 21.
This is needed now that Lucene has added support for the VectorUtilPanamaProvider with JDK 21. We want to keep the check limited to the specific versions/ranges that match those of Lucene.
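For illustration only, a hedged sketch of the gating involved (names and range are assumptions, not the actual task code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: add the incubating vector module just for the JDK majors
// that Lucene's Panama vector providers target (assumed here to be 20 and 21).
final class VectorModuleSupport {
    static List<String> extraJvmArgs(int runtimeJavaMajorVersion) {
        List<String> args = new ArrayList<>();
        if (runtimeJavaMajorVersion >= 20 && runtimeJavaMajorVersion <= 21) {
            args.add("--add-modules");
            args.add("jdk.incubator.vector");
        }
        return args;
    }
}
```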
This commit fixes an edge case in dependency info generation, much like
in #96355 for the dependency licenses check, where an Elasticsearch
dependency may not show up as a project dependency, as is the case
with serverless. Additionally, this fixes the dependency info task to
work properly when the licenses dir is missing because there are no
dependencies.
* Initial import for TDigest forking.
* Fix MedianTest.
More work is needed for TDigestPercentile*Tests and TDigestTest (and
the rest of the tests) in the tdigest lib to pass.
* Fix Dist.
* Fix AVLTreeDigest.quantile to match Dist for uniform centroids.
* Update docs/changelog/96086.yaml
* Fix `MergingDigest.quantile` to match `Dist` on uniform distribution.
* Add merging to TDigestState.hashCode and .equals.
Remove wrong asserts from tests and MergingDigest.
* Fix style violations for tdigest library.
* Fix typo.
* Fix more style violations.
* Fix more style violations.
* Fix remaining style violations in tdigest library.
* Update results in docs based on the forked tdigest.
* Fix YAML tests in aggs module.
* Fix YAML tests in x-pack/plugin.
* Skip failing V7 compat tests in modules/aggregations.
* Fix TDigest library unittests.
Remove redundant serializing interfaces from the library.
* Remove YAML test versions for older releases.
These tests don't address compatibility issues in mixed cluster tests, as
the latter contain a mix of older and newer nodes, so the output depends
on which node is picked as a data node; the forked TDigest library
is not backwards compatible (it produces slightly different results).
* Fix test failures in docs and mixed cluster.
* Reduce buffer sizes in MergingDigest to avoid oom.
* Exclude more failing V7 compatibility tests.
* Update results for JdbcCsvSpecIT tests.
* Update results for JdbcDocCsvSpecIT tests.
* Revert unrelated change.
* More test fixes.
* Use version skips instead of blacklisting in mixed cluster tests.
* Switch TDigestState back to AVLTreeDigest.
* Update docs and tests with AVLTreeDigest output.
* Update flaky test.
* Remove dead code, esp around tracking of incoming data.
* Update docs/changelog/96086.yaml
* Delete docs/changelog/96086.yaml
* Remove explicit compression calls.
This was added to prevent concurrency tests from failing, but it leads
to reduced precision. Submitting this to see if the concurrency tests are
still failing.
* Revert "Remove explicit compression calls."
This reverts commit 5352c96f65.
* Remove explicit compression calls to MedianAbsoluteDeviation input.
* Add unittests for AVL and merging digest accuracy.
* Fix spotless violations.
* Delete redundant tests and benchmarks.
* Fix spotless violation.
* Use the old implementation of AVLTreeDigest.
The latest library version is 50% slower and less accurate, as verified
by ComparisonTests.
* Update docs with latest percentile results.
* Update docs with latest percentile results.
* Remove repeated compression calls.
* Update more percentile results.
* Use approximate percentile values in integration tests.
This helps with mixed cluster tests, where some of the tests were
blocked.
* Fix expected percentile value in test.
* Revert in-place node updates in AVL tree.
Update quantile calculations between centroids and min/max values to
match v.3.2.
* Add SortingDigest and HybridDigest (see the sketch after this list).
The SortingDigest tracks all samples in an ArrayList that
gets sorted for quantile calculations. This approach
provides perfectly accurate results and is the most
efficient implementation for up to millions of samples,
at the cost of a bloated memory footprint.
The HybridDigest uses a SortingDigest for small sample
populations, then switches to a MergingDigest. This
approach combines the best performance and accuracy for
small sample counts with very good performance and
acceptable accuracy for effectively unbounded sample
counts.
* Remove deps to the 3.2 library.
* Remove unused licenses for tdigest.
* Revert changes for SortingDigest and HybridDigest.
These will be submitted in a follow-up PR for enabling MergingDigest.
* Remove unused Histogram classes and unit tests.
Delete dead and commented out code, make the remaining tests run
reasonably fast. Remove unused annotations, esp. SuppressWarnings.
* Remove Comparison class, not used.
* Small fixes.
* Add javadoc and tests.
* Remove special logic for singletons in the boundaries.
While this helps with the case where the digest contains only
singletons (perfect accuracy), it has a major problem
(a non-monotonic quantile function) when the first singleton is followed
by a non-singleton centroid. It's preferable to revert to the old
version from 3.2; inaccuracies in a singleton-only digest should be
mitigated by using a sorted array for small sample counts.
* Revert changes to expected values in tests.
This is due to restoring quantile functions to match head.
* Revert changes to expected values in tests.
This is due to restoring quantile functions to match head.
* Tentatively restore percentile rank expected results.
* Use cdf version from 3.2
Update Dist.cdf to use interpolation, use the same cdf
version in AVLTreeDigest and MergingDigest.
* Revert "Tentatively restore percentile rank expected results."
This reverts commit 7718dbba59.
* Revert remaining changes compared to main.
* Revert excluded V7 compat tests.
* Exclude V7 compat tests still failing.
* Exclude V7 compat tests still failing.
* Restore bySize function in TDigest and subclasses.
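A minimal sketch, under assumed names and an assumed switch threshold, of the SortingDigest/HybridDigest idea described in the list above (not the forked library's code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: keep raw samples (exact quantiles) while the population is small,
// then hand everything off to an approximate digest once a threshold is crossed.
final class HybridSketch {
    interface Digest {
        void add(double value);
        double quantile(double q);
    }

    private static final int SWITCH_THRESHOLD = 1_000; // assumed cut-over point
    private final List<Double> samples = new ArrayList<>();
    private final Digest approximate;
    private boolean switched = false;

    HybridSketch(Digest approximate) {
        this.approximate = approximate;
    }

    void add(double value) {
        if (switched) {
            approximate.add(value);
            return;
        }
        samples.add(value);
        if (samples.size() > SWITCH_THRESHOLD) {
            samples.forEach(approximate::add);
            samples.clear();
            switched = true;
        }
    }

    double quantile(double q) {
        if (switched) {
            return approximate.quantile(q);
        }
        // Exact quantile over the sorted samples (assumes at least one sample was added).
        Collections.sort(samples);
        int index = (int) Math.round(q * (samples.size() - 1));
        return samples.get(index);
    }
}
```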
Notable changes:
* more efficient backwards reads in NIOFSDirectory
* faster merging when using soft deletes
* workaround security manager when using vector API
This updates the JarApiComparisonTask to be a bit more robust so it no
longer requires the `javap` command to be on the path and instead
locates it from the current build JDK. This ensures, firstly, that we're
using the `javap` executable that corresponds to the Java compiler we're
building with and, secondly, that the task will work even if
`JAVA_HOME/bin` isn't added to `PATH`.
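A hedged sketch of the lookup described above (illustrative, not the actual task code); `javaHome` is assumed to come from the build's JDK configuration:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Illustrative only: resolve javap from a given JDK home instead of relying on PATH.
final class JavapLocator {
    static Path javapExecutable(Path javaHome) {
        boolean windows = System.getProperty("os.name").toLowerCase(Locale.ROOT).contains("win");
        Path javap = javaHome.resolve("bin").resolve(windows ? "javap.exe" : "javap");
        if (Files.isExecutable(javap) == false) {
            throw new IllegalStateException("javap not found under " + javaHome);
        }
        return javap;
    }
}
```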
This adds IndexVersion, which represents the index data & metadata version, separate from the release version. Similar to TransportVersion, this will eventually be completely decoupled from the release version.
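A minimal sketch of the idea (not the actual class): a version keyed by its own integer id, ordered independently of the release Version.

```java
// Illustrative only: the index data & metadata version as a standalone, comparable id.
public record IndexVersionSketch(int id) implements Comparable<IndexVersionSketch> {
    public boolean onOrAfter(IndexVersionSketch other) {
        return id >= other.id;
    }

    @Override
    public int compareTo(IndexVersionSketch other) {
        return Integer.compare(id, other.id);
    }
}
```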
Most relevant changes:
- add api to allow concurrent query rewrite (GITHUB-11838 Add api to allow concurrent query rewrite apache/lucene#11840)
- knn query rewrite (Concurrent rewrite for KnnVectorQuery apache/lucene#12160)
- Integrate the incubating Panama Vector API (Integrate the Incubating Panama Vector API apache/lucene#12311)
As part of this commit I moved the ES codebase off of overriding or relying on the deprecated rewrite(IndexReader) method in favour of using rewrite(IndexSearcher) instead. For score functions, I opted not to break existing plugins and instead create a new IndexSearcher whenever we rewrite a filter; otherwise we'd need to change the ScoreFunction#rewrite signature to take a searcher instead of a reader.
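A minimal sketch of that approach for score functions, assuming Lucene's searcher-based rewrite from the upgrade above: wrap the reader in a fresh IndexSearcher so the reader-based signature can stay in place.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

// Illustrative only: keep a reader-based entry point while delegating to the
// searcher-based rewrite that Lucene now expects.
final class FilterRewriter {
    static Query rewriteFilter(Query filter, IndexReader reader) throws IOException {
        IndexSearcher searcher = new IndexSearcher(reader);
        return searcher.rewrite(filter);
    }
}
```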
Co-authored-by: ChrisHegarty <christopher.hegarty@elastic.co>
The dependency licenses check is meant to ensure we have license
information included for external dependencies. The check currently
looks at all non-project dependencies. This works within the
elasticsearch repo, since internal dependencies are all project
dependencies. However, in serverless the dependencies will be jars from
the upstream project (or rather, not project dependencies, since it is a
compound build). This commit loosens the dependency license check filter
to omit any dependencies that have an elasticsearch group.
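A hedged sketch of the loosened filter (illustrative, not the actual check code):

```java
import org.gradle.api.artifacts.ModuleVersionIdentifier;

// Illustrative only: treat anything in an elasticsearch group as internal and skip the
// license requirement for it, whether it arrives as a project dependency or as a jar
// from a compound (serverless) build.
final class LicenseCheckFilter {
    static boolean requiresLicenseInfo(ModuleVersionIdentifier id) {
        return id.getGroup().startsWith("org.elasticsearch") == false;
    }
}
```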
Related to https://github.com/elastic/elasticsearch/issues/96207, I
believe our retry logic is either a) contributing to weird failures or
b) masking the root cause of failures in our Docker builds. I'm
temporarily disabling this for now to try and get some better
diagnostics.
Using Gradle toolchain support requires refactoring how the composite build is composed.
We added three toolchain resolvers:
1. A resolver for the defined bundled version from Oracle as OpenJDK.
2. A resolver for all available JDKs from Adoptium.
3. A resolver for archived Oracle JDK distributions.
With these in place we should be able to remove the JdkDownloadPlugin altogether, but we'll do that in a separate effort.
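A minimal sketch of what one such resolver looks like, assuming the Gradle 7.6+ `JavaToolchainResolver` API; the class name and URL are illustrative, not the actual resolvers.

```java
import java.net.URI;
import java.util.Optional;

import org.gradle.jvm.toolchain.JavaToolchainDownload;
import org.gradle.jvm.toolchain.JavaToolchainRequest;
import org.gradle.jvm.toolchain.JavaToolchainResolver;

// Illustrative only: map a requested toolchain spec to a download URI.
public abstract class ExampleJdkResolver implements JavaToolchainResolver {
    @Override
    public Optional<JavaToolchainDownload> resolve(JavaToolchainRequest request) {
        int version = request.getJavaToolchainSpec().getLanguageVersion().get().asInt();
        // Hypothetical URL; the real resolvers point at the bundled OpenJDK build,
        // Adoptium, and archived Oracle distributions.
        URI uri = URI.create("https://example.invalid/jdk-" + version + "-linux-x64.tar.gz");
        return Optional.of(JavaToolchainDownload.fromUri(uri));
    }
}
```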
Fixes #95094
* Fix release highlights generator
* Add missing quotes to and remove slashes from test results too.
* Remove newlines
* One more newline
* One more newline
* Add newline before endif in test results
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The preallocate module raises endless exceptions since we fail to get
the file descriptor from the file channel in `FileDescriptorFieldAction` without the export.
#92776 introduced a bwc test to make sure we do not break the stable plugin API. That change was merged into 8.7.0.
At the same time, an artifact group rename was merged into 8.7.0: https://github.com/elastic/elasticsearch/pull/92905/files
This commit fixes the group name used in a plugin.
The preallocate module needs access to java.io internals. However, in
order to open java.io to a specific module, rather than the unnamed
module as was previously done, that module must be in the boot
layer.
This commit moves the preallocate module to libs. It adds it to the main
lib dir, though it does not add it as a compile dependency of server.
Version.java currently contains mappings to the Lucene version for each
Elasticsearch version. The only use of this in the build logic is for
filtering based on index compatibility. However, that compatibility can
be inferred from the Elasticsearch major version, since there is a
one-to-one mapping between Elasticsearch and Lucene major versions. This
commit removes Lucene version extraction from the build bwc logic.
Fixes #82794. Upgrade the spotless plugin, which addresses the issue
around formatting `instanceof` expressions. Formatting of statements
including lambdas seems to have improved too.
* Ensure breaking-changes tag exists for both cases
This was previously added so that it only worked when there were no
breaking changes, but it should also apply when there are breaking changes.
* Revert "Ensure breaking-changes tag exists for both cases"
This reverts commit f8cb87ad45.
* Ensure breaking-changes tag exists for both cases
This was previously added so that it only worked when there were no
breaking changes, but it should also apply when there are breaking changes.
* Improved wording
We generate a tarball with various test reports, logs and diagnostics on
build completion when running in CI. Adapt this script to also support
Buildkite, in addition to Jenkins.
This PR is a follow-up to
https://github.com/elastic/elasticsearch/pull/93414, which allows using
API keys to authenticate cross cluster requests in the new remote
cluster security model.
The main change is around removing restrictions and allowing API key
authentication subjects on the fulfilling (server) side.
This new snapshot has the following changes that could be interesting:
- Less contention on the indexing path.
- Faster flushing of keywords when index sorting is enabled.
A rename/refactor PR that uses `cross_cluster_access` in place of
`remote_access` wherever appropriate, since `cross_cluster_access` is a
more precise, clearer term. No functional changes; however, I did make a
few tweaks around version handling.
- Remove custom checksum build logic in the wrapper task
- Adjust JDK home handling to account for the change in behaviour in Gradle. This requires providing canonical paths for provisioned JDK homes.
- Fix a test by adding a workaround for a bug in the configuration cache
This updates the Gradle wrapper to 7.6.1. This patch release contains a
fix for incremental compilation of Java modules that we raised against
Gradle 7.6; see https://github.com/gradle/gradle/issues/23067
This change updates slf4j to 2.0.6 (latest at time of writing). 2.0.6 is already in the build (used by another component), so no need to add an entry to the gradle verification metadata.
The initial motivation for the upgrade is the addition of a stable module name to slf4j, post 1.7, which can be seen by the update to the requires statement in the module-info change. A later change, that will propose to modularise azure-repository, transitively requires org.slf4j.
When resolving the JARs for checking stable plugin API compatibility
we want to disable transitive dependency resolution since we just need
the API jar, not any of its dependencies.
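A hedged sketch of the idea in Gradle's Java API (the configuration name is illustrative):

```java
import org.gradle.api.Project;
import org.gradle.api.artifacts.Configuration;

// Illustrative only: a dedicated configuration with transitive resolution turned off,
// so resolving it fetches just the stable API jar and nothing else.
final class StablePluginApiConfigurations {
    static Configuration createApiJarConfiguration(Project project) {
        Configuration conf = project.getConfigurations().create("stablePluginApiJar");
        conf.setTransitive(false);
        return conf;
    }
}
```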
We should only test snapshot versions when running "check". Due to a
miswired task dependency, we were incorrectly running tests for the
full matrix of supported backward compatible versions.
Upgrades jackson to 2.14.2 everywhere except for the azure plugin, which depends on:
- jackson-databind 2.13.4.2
- jackson-core 2.13.4
- jackson-dataformat-xml 2.13.4
- jackson-datatype-jsr310 2.13.4

Related: #90553
A replacement PR for #91725
We have recently upgraded to a Lucene 9.5.0 snapshot. With this commit we upgrade to the final 9.5.0 release.
Main changes are around the float vector field, query and values API.
Today we forbid the trappy `String#format` and `String#formatted`
overloads that do not take a `Locale`, and suggest using the `String#format`
override which accepts an explicit `Locale`. These days a better alternative is
`Strings#format`, so this commit adjusts the message that
`forbiddenApis` returns to reflect that.
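For illustration, the preferred pattern, assuming the `org.elasticsearch.core.Strings` helper that the adjusted message points at:

```java
import org.elasticsearch.core.Strings;

final class FormattingExample {
    static String describe(String index, int shard) {
        // Preferred: Elasticsearch's own helper, which avoids the default-locale pitfall.
        // Forbidden by forbiddenApis: String#format / String#formatted without an explicit Locale.
        return Strings.format("shard [%d] of index [%s] failed", shard, index);
    }
}
```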
The templates are straightforward java-like files with minimal string
replace and ifdef support. These template files are processed by ANTLR's
StringTemplate library to produce java source code files, which are
checked into the repository.
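For illustration, a tiny example of the expansion step, assuming ANTLR's StringTemplate4 `ST` API and a made-up template string (the real template files are java-like, as described above):

```java
import org.stringtemplate.v4.ST;

// Illustrative only: expand a java-like template with simple placeholders per specialization.
public class RenderExample {
    public static void main(String[] args) {
        ST template = new ST("public final class <Type>ArrayState { private final <type>[] values; }");
        template.add("Type", "Int");
        template.add("type", "int");
        System.out.println(template.render());
    }
}
```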
This source code generation mechanism is separate from the one that
generates the aggs implementations. The aggs generation and the data
classes generation are different use cases. The latter is a
convenience to reduce the friction of specialised data class types while
ensuring consistency and maintainability, whereas the former is intended
to create optimised versions of particular aggs, given a particular
recipe (one could envisage a point where aggs specialisations are
generated at runtime, rather than at compile time).
For now, the data classes are generated into a separate output
directory, to avoid Gradle issues. A later change should consider how to
best merge the output of the annotation-processor-generated aggs and the
string-template-generated data classes.
Co-authored-by: Rene Groeschke <rene@elastic.co>
We need to verify, for each release, that our stable plugin APIs
are not breaking.
This commit adds some Gradle support for basic backwards compatibility
testing. On the Gradle side, we add a new qa project to test the
current commit against downloads of released versions, and against
fresh builds of snapshot versions.
As for the actual comparison, we break up the output of javap (the
Java class file disassembler) by line and create maps of classes to public class,
field, and method declarations within those class files. We then
check that the signature map from the new jar is not missing any
elements present in the old jar. This method has known limitations,
which are documented in the JarApiComparisonTask class.
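A minimal sketch of that comparison (not the actual task code): every class/member signature present in the old jar must still be present in the new one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: given maps of class name -> public declarations (parsed from javap
// output), report anything present in the old API but missing from the new one.
final class SignatureComparison {
    static List<String> missingFromNewApi(Map<String, Set<String>> oldApi, Map<String, Set<String>> newApi) {
        List<String> missing = new ArrayList<>();
        oldApi.forEach((clazz, members) -> {
            Set<String> current = newApi.getOrDefault(clazz, Set.of());
            for (String member : members) {
                if (current.contains(member) == false) {
                    missing.add(clazz + ": " + member);
                }
            }
        });
        return missing;
    }
}
```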
Co-authored-by: Mark Vieira <portugee@gmail.com>
The docker-compose plugin uses randomized auto-generated project names
for test fixtures. This can cause issues on some platforms where it will
generate an invalid identifier. This commit simply configures the plugin
to use the gradle project name for docker compose.
This PR adds settings and infrastructure to support a new Remote Cluster port,
to be used in Remote Cluster Security 2.0. Specifically, this commit adds new
settings that allow opening a new Remote Cluster port, which will eventually
exclusively support the new cross-cluster authentication method. Since support
for that new authentication method is still under construction, these settings
are hidden behind a feature flag.
The new settings cover all Transport profile settings, to ensure that users will
not have to be exposed to Transport Profiles directly to make use of RCS2.0
functionality. This includes all core settings, as well as IP filter and SSL
configuration.
Co-authored-by: Yang Wang <yang.wang@elastic.co>
Refactoring that drops the api suffix from the package name.
This will have to be followed up by a plugins/examples fix for imports.
Also sets the artifact group name to `org.elasticsearch.plugin` in the plugin-api and plugin-analysis-api.