Commit Graph

403 Commits

Author SHA1 Message Date
David Turner bfcc93a042
Anonymize AbstractRefCounted (#77208)
Today `AbstractRefCounted` has a `name` field which is only used to
construct the exception message when calling `incRef()` after it's been
closed. This isn't really necessary, the stack trace will identify the
reference in question and give loads more useful detail besides. It's
also slightly irksome to have to name every single implementation.

This commit drops the name and the constructor parameter, and also
introduces a handy factory method for use when there's no extra state
needed and you just want to run a method or lambda when all references
are released.
2021-09-03 07:59:44 +01:00
Rene Groeschke 35ec6f348c
Introduce simple public yaml-rest-test plugin (#76554)
This introduces a basic public yaml rest test plugin that is supposed to be used by external 
elasticsearch plugin authors. This is driven by #76215

- Rename yaml-rest-test to intern-yaml-rest-test
- Use public yaml plugin in example plugins

Co-authored-by: Mark Vieira <portugee@gmail.com>
2021-08-31 08:45:52 +02:00
Rory Hunter d01efa4fd6
Changes to keep Checkstyle happy after reformatting (#76464)
* Reformatting to keep Checkstyle after formatting

* Configure spotless everywhere, and disable the tasks if necessary

* Add XContentBuilder helpers, fix test

* Tweaks

* Add a TODO

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-08-18 07:15:55 -04:00
Armin Braun 873fbf7b65
Fix Leaking Http Channel Objects when Http Client Stats are Disabled (#76257)
We have to remove the channel from the internal collection of channels when stats are disabled.

Closes #76183
2021-08-10 12:39:12 +02:00
Armin Braun d510fa4841
Upgrade to Netty 4.1.66 (#76135)
Upgrade to 4.1.66 which might bring some helpful
improvements to how TLS exceptions are propagated.
2021-08-05 09:24:52 +02:00
Yang Wang bad4e88278
[Test] Allow trace.id in default thread context (#76089)
A new trace.id header is added by #74210. It is handled almost the same
way as x-opaque-id. Specifically, it gets passed into default thread
context. This means the existing assertion should expect it in addition
to x-opaque-id.
2021-08-05 10:45:05 +10:00
Armin Braun ffeaab8e58
Fix Issues in Netty4MessageChannelHandler (#75861)
Fixes a few rough edges in this class:
* we need to always pass a flush call down the pipeline and not just conditionally
if they apply to the message handler, otherwise we lose flushes e.g. when a channel
becomes not-writable due to a write from off the event-loop that exceeds the outbound
buffer size
  * this is suspected of causing recently observed intermittent and unexplained slow message writes (logged by the outbound slow logger) where a message became stuck until a subsequent message was sent (e.g. during period leader checks or so)
* Pass size `0` messages down the pipeline instead of just resolving their promise to avoid
unexpected behavior (though we don't make use of `0`-length writes as of today
* Avoid unnecessary flushes in queued-writes loop and only flush if the channel stops being
writable
* Release buffers on queued writes that we fail on channel close (not doing this wasn't causing bugs today because we release the underlying bytes elsewhere but could cause trouble later)

Unfortunately, I was not able to reproduce the issue in the first point reliably as the timing is really tricky. I therefore tried to make this PR as short and uncontroversial as possible. I think there's possible further improvements here and this should have been caught by a test but it's not yet clear to me how to design a reliable reproducer here.
2021-07-30 16:09:11 +02:00
Mark Vieira 9d14bc91d7
Set netty available processors system property for tests globally (#75699) 2021-07-27 11:21:42 -07:00
Mark Vieira 4ef48e8db5 Fix thirdPartyAudit task configuration 2021-06-29 15:43:05 -07:00
Ryan Ernst ab1a2e4a84
Add precommit task for detecting split packages (#73784)
Modularization of the JDK has been ongoing for several years. Recently
in Java 16 the JDK began enforcing module boundaries by default. While
Elasticsearch does not yet use the module system directly, there are
some side effects even for those projects not modularized (eg #73517).
Before we can even begin to think about how to modularize, we must
Prepare The Way by enforcing packages only exist in a single jar file,
since the module system does not allow packages to coexist in multiple
modules.

This commit adds a precommit check to the build which detects split
packages. The expectation is that we will add the existing split
packages to the ignore list so that any new classes will not exacerbate
the problem, and the work to cleanup these split packages can be
parallelized.

relates #73525
2021-06-08 15:04:23 -07:00
Ryan Ernst 68817d7ca2
Rename o.e.common in libs/core to o.e.core (#73909)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
2021-06-08 09:53:28 -07:00
Ryan Ernst 64054de1ac
Rename bootstrap package in core jar (#73788)
The org.elasticsearch.bootstrap package exists in server with classes
for starting up Elasticsearch. The elasticsearch-core jar has a handful
of classes that were split out from there, namely java version parsing
and jarhell. This commit moves those classes to a new
org.elasticsearch.jdk package so as to not split the server owned
bootstrap package.

relates #73784
2021-06-07 08:14:44 -07:00
Ryan Ernst e9970aa7e3
Upgrade netty to 4.1.63 (#73011) 2021-05-27 15:51:35 -07:00
Armin Braun d65a16ad1d
Use NettyByteBufSizer for Outbound Connections (#72193)
We should also use the sizer for outbound connections like
we do for inbound. Also made it a singleton since we use it in 3
spots now and there's no point instantiating it multiple times.
2021-04-26 19:26:08 +02:00
Rene Groeschke 5bcd02cb4d
Restructure build tools java packages (#72030)
Related to #71593 we move all build logic that is for elasticsearch build only into
the org.elasticsearch.gradle.internal* packages

This makes it clearer if build logic is considered to be used by external projects
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.

This is a very first step towards that direction.
2021-04-26 14:53:55 +02:00
Dan Hermann eb345b2a8f
Deprecate legacy index template API endpoints (#71309) 2021-04-16 08:07:28 -05:00
Mark Vieira 50d6e1369a Mute Netty4HeadBodyIsEmptyIT.testTemplateExists 2021-04-13 18:00:01 -07:00
Rory Hunter fb1921c9dc
Change deprecation logs data stream name (#68737)
More fixes to deprecation log indexing so that the data stream name and document
contents are more ECS-compatible.
2021-04-13 15:11:57 +01:00
Jake Landis 279fde375e
Apply REST API compatibility testing for the :modules (#71137) 2021-04-02 11:20:54 -05:00
Jason Tedor 32314493a2
Pass override settings when creating test cluster (#71203)
Today when creating an internal test cluster, we allow the test to
supply the node settings that are applied. The extension point to
provide these settings has a single integer parameter, indicating the
index (zero-based) of the node being constructed. This allows the test
to make some decisions about the settings to return, but it is too
simplistic. For example, imagine a test that wants to provide a setting,
but some values for that setting are not valid on non-data nodes. Since
the only information the test has about the node being constructed is
its index, it does not have sufficient information to determine if the
node being constructed is a non-data node or not, since this is done by
the test framework externally by overriding the final settings with
specific settings that dicate the roles of the node. This commit changes
the test framework so that the test has information about what settings
are going to be overriden by the test framework after the test provide
its test-specific settings. This allows the test to make informed
decisions about what values it can return to the test framework.
2021-04-02 10:20:36 -04:00
Mark Vieira 6339691fe3
Consolidate REST API specifications and publish under Apache 2.0 license (#70036) 2021-03-26 16:20:14 -07:00
David Turner 60d53c0206
Stop double-starting transport service in tests (#70056)
Today in tests we often use a utility method that creates and starts a
transport service, and then we start it again in the tests anyway. This
commit removes this unnecessary code and asserts that we only ever call
`TransportService#acceptIncomingRequests` once.
2021-03-08 11:04:43 +00:00
Yang Wang c0fc847883 Add comment to explain the extra test check for FIPS 2021-03-03 12:48:26 +11:00
Armin Braun bb77ab46e0
Stop Ignoring Exceptions on Close in Network Code (#69665)
We should not be ignoring and suppressing exceptions on releasing
network resources quietly in these spots.

Co-authored-by: David Turner <david.turner@elastic.co>
2021-03-01 14:38:18 +01:00
Yang Wang 9711975619
[Test] Remove some watcher indices from comparison (#69497)
Since #67588, .triggered_watches and .watches indices are no longer created on node startup. This PR removing them from the warnings for comparison.
2021-02-24 14:25:59 +11:00
Mark Vieira a92a647b9f Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

 - Updating LICENSE and NOTICE files throughout the code base, as well
   as those packaged in our published artifacts
 - Update IDE integration to now use the new license header on newly
   created source files
 - Remove references to the "OSS" distribution from our documentation
 - Update build time verification checks to no longer allow Apache 2.0
   license header in Elasticsearch source code
 - Replace all existing Apache 2.0 license headers for non-xpack code
   with updated header (vendored code with Apache 2.0 headers obviously
   remains the same).
 - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 16:10:53 -08:00
Rory Hunter ad1f876daa
Replace NOT operator with explicit `false` check (#67817)
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
2021-01-26 14:47:09 +00:00
Armin Braun 6d025d3a27
Log Slowness on Sending Transport Messages (#67664)
Similar to #62444 but for the outbound path.

This does not detect slowness in individual transport handler logic,
this is done via the inbound handler logging already, but instead
warns if it takes a long time to hand off the message to the relevant
transport thread and then transfer the message over the wire.
This gives some visibility into the stability of the network
connection itself and into the reasons for slow network
responses (if they are the result of slow networking on the sender).
2021-01-19 12:19:32 +01:00
Tim Vernum 248b6a89e8
Update template warning for FIPS in Netty test (#67067)
This changes the expected error message (on FIPS) so that the
order of the templates (and their associated patterns) matches
the (newly updated) order generated by the server.

Relates: #67066
Resolves: #66820
2021-01-07 12:01:23 +11:00
Mayya Sharipova 5b6675ab0d
Mute testTemplateExists (#66863)
Mute Netty4HeadBodyIsEmptyIT.testTemplateExists, as it fails in FIPS
mode.

Relates to #66820
2020-12-29 10:45:16 -05:00
Tim Vernum 22bc833d85
Skip netty4 yaml test in FIPS mode (#66842)
The "Netty loaded" YAML test asserts that the configured transport is
"netty4", however when in FIPS mode, the tests enable security and the
configured transport is "security4".

This change skips the netty4 yaml test when running in FIPS mode.

Resolves: #66818
2020-12-29 18:28:23 +11:00
Ioannis Kakavas bd873698bc
Ensure CI is run in FIPS 140 approved only mode (#64024)
We were depending on the BouncyCastle FIPS own mechanics to set
itself in approved only mode since we run with the Security
Manager enabled. The check during startup seems to happen before we
set our restrictive SecurityManager though in
org.elasticsearch.bootstrap.Elasticsearch , and this means that
BCFIPS would not be in approved only mode, unless explicitly
configured so.

This commit sets the appropriate JVM property to explicitly set
BCFIPS in approved only mode in CI and adds tests to ensure that we
will be running with BCFIPS in approved only mode when we expect to.
It also sets xpack.security.fips_mode.enabled to true for all test clusters
used in fips mode and sets the distribution to the default one. It adds a
password to the elasticsearch keystore for all test clusters that run in fips
mode.
Moreover, it changes a few unit tests where we would use bcrypt even in
FIPS 140 mode. These would still pass since we are bundling our own
bcrypt implementation, but are now changed to use FIPS 140 approved
algorithms instead for better coverage.

It also addresses a number of tests that would fail in approved only mode
Mainly:

    Tests that use PBKDF2 with a password less than 112 bits (14char). We
    elected to change the passwords used everywhere to be at least 14
    characters long instead of mandating
    the use of pbkdf2_stretch because both pbkdf2 and
    pbkdf2_stretch are supported and allowed in fips mode and it makes sense
    to test with both. We could possibly figure out the password algorithm used
    for each test and adjust password length accordingly only for pbkdf2 but
    there is little value in that. It's good practice to use strong passwords so if
    our docs and tests use longer passwords, then it's for the best. The approach
    is brittle as there is no guarantee that the next test that will be added won't
    use a short password, so we add some testing documentation too.
    This leaves us with a possible coverage gap since we do support passwords
    as short as 6 characters but we only test with > 14 chars but the
    validation itself was not tested even before. Tests can be added in a followup,
    outside of fips related context.

    Tests that use a PKCS12 keystore and were not already muted.

    Tests that depend on running test clusters with a basic license or
    using the OSS distribution as FIPS 140 support is not available in
    neither of these.

Finally, it adds some information around FIPS 140 testing in our testing
documentation reference so that developers can hopefully keep in
mind fips 140 related intricacies when writing/changing docs.
2020-12-23 21:00:49 +02:00
Ryan Ernst cfe6e68fde
Add codebases hints for testing (#65281)
This commit adds codebases files for the remaining plugin security
policies that use jar specific grants.
2020-11-20 10:01:51 -08:00
Rene Groeschke 810e7ff6b0
Move tasks in build scripts to task avoidance api (#64046)
- Some trivial cleanup on build scripts
- Change task referencing in build scripts to use task avoidance api
where replacement is trivial.
2020-11-12 12:04:15 +01:00
Przemyslaw Gomulka 0bb64cbe1e
Handle incorrect header values (#64708)
When Accept or Content-Type header values are incorrect, a request
should be gracefully rejected and an exception message returned to a
client.

relates #64689
relates #51816
2020-11-10 09:07:32 +01:00
Ryan Ernst 9b07042e59
Fix incorrect setContextClassLoader permissions (#64745)
Netty requires the setContextClassLoader permission. However, many of
our policy files incorrectly use `*` as the name, thinking
`setContextClassLoader` is the actions element of the permission (it
looks like a copy paste error that was then itself copy pasted through
several policy files). This commit corrects these permissions, which had
actually granted all RuntimePermissions.
2020-11-07 11:42:58 -08:00
Armin Braun 9b4a151005
Revert Serializing Outbound Transport Messages on IO Threads (#64632)
Serializing outbound transport message on the IO loop was introduced in https://github.com/elastic/elasticsearch/pull/56961. Unfortunately it turns out that this is incompatible with assumptions made by CCR code here: f22ddf822e/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/repositories/GetCcrRestoreFileChunkAction.java (L60-L61) and that are not easy to work around on short notice.

Raising reverting this move (as a temporary solution, it's still a valuable change long-term) as a blocker therefore as this seriously affects the stability of the initial phase of the CCR following by causing corrupted bytes to be send to the follower.
2020-11-05 14:41:30 +01:00
Przemyslaw Gomulka c219e176e9
Introduce per REST endpoint media types (#64406)
Per REST endpoint media types declaration allows to make parsing/validation more strict.

If a media type was declared only in one endpoint (for instance CSV in SQL endpoint) it should not be allowed to parse that media type when using other endpoints.
However, the Compatible API need to be able to understand all media types supported by Elasticsearch in order to parse a compatible-with=version parameter.
This implies that endpoints need to declare which media type they support and how to parse them (if introducing new media types - like SQL).

How to parse:
MediaType interface still serves as an abstraction on top of XContentType and TextFormat. It also has a declaration of mappings String-MediaType with parameters. Parameters declares the names of parameters and regex to validate its values.
This instructs how to perform the parsing. For instance - XContentType.JSON has the mapping of application/vnd.elasticsearch+json -> JSON and allows parameters compatible-with=\d and charset=utf-8

MediaTypeParser was simplified into ParsedMediaType class with static factory method for parsing.

How to declare:
RestHandler interface is extended with a validAcceptMediaTypes which returns a MediaTypeRegistry - a class that encapsulates mappings of string (type/subtype) to MediaType, allowed parameters and formatPathParameter values.
We only need to allow of declaration of valid media types for Accept header. Content-Type valid media types are fixed to XContentType instances - json, yaml, smile, cbor.

relates #51816
2020-11-05 09:47:13 +01:00
Tim Brooks 1547bd672d
Transfer network bytes to smaller buffer (#62673)
Currently we read in 64KB blocks from the network. When TLS is not
enabled, these bytes are normally passed all the way to the application
layer (some exceptions: compression). For the HTTP layer this means that
these bytes can live throughout the entire lifecycle of an indexing
request.

The problem is that if the reads from the socket are small, this means
that 64KB buffers can be consumed by 1KB or smaller reads. If the socket
buffer or TCP buffer sizes are small, the leads to massive memory
waste. It has been identified as a major source of OOMs on coordinating
nodes as Elasticsearch easily exhausts the heap for these network bytes.

This commit resolves the problem by placing a handler after the TLS
handler to copy these bytes to a more appropriate buffer size as
necessary. This comes after TLS, because TLS is a framing layer which
often resolves this problem for us (the 64KB buffer will be decoded
into a more appropriate buffer size). However, this extra handler will
solve it for the non-TLS pipelines.
2020-09-30 11:31:54 -06:00
Tim Brooks 19c19f28cb
Split up large HTTP responses in outbound pipeline (#62666)
Currently Netty will batch compression an entire HTTP response
regardless of its content size. It allocates a byte array at least of
the same size as the uncompressed content. This causes issues with our
attempts to remove humungous G1GC allocations. This commit resolves the
issue by split responses into 128KB chunks.

This has the side-effect of making large outbound HTTP responses that
are compressed be send as chunked transfer-encoding.
2020-09-24 14:20:12 -06:00
Tim Brooks 2ae1cef99b
Log alloc description after netty processors set (#62741)
Currently we log the NettyAllocator description when the netty plugin is
created. Unfortunately, this hits certain static fields in Netty which
triggers the settings of the number of CPU processors. This conflicts
with out Elasticsearch behavior to override this based on a setting.

This commit resolves the issue by logging after the processors have been
set.
2020-09-21 19:36:12 -06:00
Tim Brooks 326286a22e
Change netty pool chunk size to respect G1 region (#62410)
Currently the netty pool chunk size defaults to 16MB. The number does
not play well with the G1GC which causes this to consume entire regions.
Additionally, we normally allocated arrays of size 64KB or less. This
means that Elasticsearch could handle a smaller pool chunk size to play
nicer with the G1GC.
2020-09-21 13:50:20 -06:00
Tim Brooks d5f9e4ecb0
Move CorsHandler to server (#62007)
Currently we duplicate our specialized cors logic in all transport
plugins. This is unnecessary as it could be implemented in a single
place. This commit moves the logic to server. Additionally it fixes a
but where we are incorrectly closing http channels on early Cors
responses.
2020-09-08 08:36:18 -06:00
Rene Groeschke dd74be0f83
Merge test runner task into RestIntegTest (#60261)
* Merge test runner task into RestIntegTest
* Reorganizing Standalone runner and RestIntegTest task
* Rework general test task configuration and extension
2020-08-03 12:07:41 +02:00
David Turner 3b80be69d1
Fix network logging test failures (#60334)
In #60297 we added some tests related to logging from the transport
layer, but these tests failed occasionally since the cluster
was kept alive between test invocations but the logging framework
expected it only to be used for a single test. With this commit we
reduce the scope of the internal test cluster to `TEST` to solve this
problem.

Closes #60321.
2020-07-29 08:28:44 +01:00
David Turner c98be4d29d Mute tests for #60321 2020-07-28 18:12:21 +01:00
David Turner 940d618186
Log and track open/close of transport connections (#60297)
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.

This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
2020-07-28 16:58:00 +01:00
Yannick Welsch 2078509af5
Set specific keepalive options by default on supported platforms (#59278)
keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are
killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their
configuration, which often times do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of
distributed systems such as Elasticsearch.

This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations
that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5
minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings
unless they are deemed to be set too high by providing better out-of-the-box defaults.
2020-07-27 13:38:24 +02:00
Jake Landis 7dd57c9415
Introduce javaRestTest source set/task and convert modules (#59939)
Introduce a javaRestTest source set and task to compliment the yamlRestTest.
javaRestTest differs such that the code is sourced from Java and may have
different dependencies and setup requirements for the test clusters. This also
allows the tests to run in parallel in different cluster instances to prevent any
cross test contamination between the two types of tests.

Included in this PR is all :modules no longer use the integTest task. The tests
are now driven by test, yamlRestTest, javaRestTest, and internalClusterTest.
Since only :modules (and :rest-api-spec) have been converted to yamlRestTest
we can now disable the integTest task if either yamlRestTest or javaRestTest have
been applied. Once all projects are converted, we can delete the integTest task.

related: #56841
related: #59444
2020-07-21 17:17:17 -05:00
Jake Landis ddd882b835
Convert modules to use yamlRestTest (#59089)
This commit moves the modules REST tests to the
newly introduced yamlRestTest source set. A few
tests have also been re-named to include the correct
IT suffix. Without changing the names, the testing
conventions task would fail since now that the YAML
tests are no longer present pacify the convention.
These tests have moved to the internalClusterTest
source set.

related: #56841
2020-07-13 11:32:42 -05:00