117 Commits

Author SHA1 Message Date
Armin Braun 15f6cfe6e1
Dry Up XContent Parser Construction (#75114)
Clean up duplication in how we parse byte arrays directly.
2021-07-08 14:14:19 +02:00
Felix Barnsteiner 67fbc337ea
Json processor: allow duplicate keys (#74956) 2021-07-06 15:02:32 +02:00
Ryan Ernst 63012c8a40
Move ParseField to o.e.c.xcontent (#73923)
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.

relates #73784
2021-06-08 13:32:14 -07:00
Ryan Ernst 68817d7ca2
Rename o.e.common in libs/core to o.e.core (#73909)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
2021-06-08 09:53:28 -07:00
Przemyslaw Gomulka 4bdd00d452
[Rest Api Compatibility] Typed endpoint for bulk api (#73571)
Retrofits the typed endpoint and the type in request parsing.
The original types removal commit is #46983.

relates #51816
2021-06-07 19:36:31 +02:00
Ryan Ernst f98b374cf6
Revert "Upgrade Azure SDK and Jackson (#72833) (#72995)" (#73837)
The recent upgrade of the Azure SDK has caused a few test failures that
have been difficult to debug and do not yet have a fix. In particular, a
change to the netty reactor resolving
(https://github.com/reactor/reactor-netty/issues/1655). We need to wait
for a fix for that issue, so this reverts commit
6c4c4a0ecb.

relates #73493
2021-06-07 10:20:46 -07:00
Nhat Nguyen d6d5d0d66d Fix assertion message in directFieldAsBase64 method
Relates #73804
2021-06-06 22:56:15 -04:00
Nhat Nguyen cb1144886f
Allow build XContent directly from Writable (#73804)
Today, writing a Writable value to XContent in Base64 format performs 
these steps: (1) create a BytesStreamOutput, (2) write Writable to that
output, (3) encode a copy of bytes from that output stream, (4) create a
string from the encoded bytes, (5) write the encoded string to XContent. 
These steps allocate/use 5 times the memory of writing the encoded chars
directly to the output of XContent.

This API would help reduce memory usage when storing a large response 
of an async search.

Relates #67594
2021-06-06 12:11:52 -04:00
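The difference between the buffered steps and direct encoding described in #73804 above can be sketched with the JDK alone. This is a rough illustration, not the Elasticsearch implementation: ByteWriter is a hypothetical stand-in for Writable, and a plain OutputStream stands in for the XContent output.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class DirectBase64Sketch {

    /** Hypothetical stand-in for a Writable: anything that can serialize itself to a stream. */
    interface ByteWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** The buffered path, following the five steps listed in the commit message. */
    static void writeBuffered(ByteWriter value, OutputStream target) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();   // (1) temporary output
        value.writeTo(raw);                                        // (2) serialize into it
        byte[] copy = raw.toByteArray();                           // (3) copy of the bytes
        String encoded = Base64.getEncoder().encodeToString(copy); // (4) encoded string
        target.write(encoded.getBytes(StandardCharsets.US_ASCII)); // (5) write the string
    }

    /** The direct path: bytes are encoded as they are written, with no intermediate copies. */
    static void writeDirect(ByteWriter value, OutputStream target) throws IOException {
        // Closing the encoder emits the final Base64 padding. The filter below keeps
        // that close from propagating, so the caller's target stays open.
        OutputStream keepOpen = new FilterOutputStream(target) {
            @Override
            public void write(byte[] b, int off, int len) throws IOException {
                out.write(b, off, len);
            }

            @Override
            public void close() throws IOException {
                out.flush();
            }
        };
        try (OutputStream encoder = Base64.getEncoder().wrap(keepOpen)) {
            value.writeTo(encoder);
        }
    }
}
```

Both methods produce the same encoded output; only the number of intermediate allocations differs.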
Ryan Ernst 6c4c4a0ecb
Upgrade Azure SDK and Jackson (#72833) (#72995)
This commit upgrades the Azure SDK to 12.11.0 and Jackson to 2.12.2. The
Jackson upgrade must happen at the same time due to Azure depending on
this new version of Jackson.

closes #66555
closes #67214

Co-authored-by: Francisco Fernández Castaño <francisco.fernandez.castano@gmail.com>
2021-05-27 07:55:18 -07:00
Ryan Ernst 8cd3944a0a
Revert "Upgrade Azure SDK and Jackson (#72833)"
This reverts commit dca0e92bef.
2021-05-06 20:51:31 -07:00
Ryan Ernst dca0e92bef
Upgrade Azure SDK and Jackson (#72833)
This commit upgrades the Azure SDK to 12.11.0 and Jackson to 2.12.2. The
Jackson upgrade must happen at the same time due to Azure depending on
this new version of Jackson.

closes #66555
closes #67214
2021-05-06 20:36:42 -07:00
Alan Woodward b27eaa38dc
Remove 'external values', and replace with swapped out XContentParsers (#72203)
The majority of field mappers read a single value from their positioned
XContentParser, and do not need to call nextToken. There is a general
assumption that the same holds for any multifields defined on them, and
so the XContentParser is passed down to their multifields builder as-is.
This assumption does not hold for mappers that accept json objects,
and so we have a second mechanism for passing values around called
'external values', where a mapper can set a specific value on its context
and child mappers can then check for these external values before reading
from xcontent. The disadvantage of this is that every field mapper now
needs to check its context for external values. Because the values are
defined by their java class, we can also know that in the vast majority of
cases this functionality is unused. We have only two mappers that actually
make use of this, CompletionFieldMapper and GeoPointFieldMapper.

This commit removes external values entirely, and replaces it with the ability
to pass a modified XContentParser to multifields. FieldMappers can just check
the parser attached to their context for data and don't need to worry about
multiple sources.

Plugins implementing field mappers will need to take the removal of external
values into account. Implementations that are passing structured objects
as external values should instead use ParseContext.switchParser and
wrap the objects using MapXContentParser.wrapObject().

GeoPointFieldMapper passes on a fake parser that just wraps its input data
formatted as a geohash; CompletionFieldMapper has a slightly more complicated
parser that in general wraps its metadata, but if textOrNull() is called without
the parser being advanced just returns its text input.

Relates to #56063
2021-04-29 09:17:18 +01:00
Alan Woodward 289d202cb2
Rework geo mappers to index value by value (#71696)
The various geo field mappers are organised in a hierarchy that shares
parsing and indexing code. This ends up over-complicating things,
particularly when we have some mappers that accept multiple values
and others that only accept singletons. It also leads to confusing
ignore_malformed behaviour: geo fields will ignore
all values if a single one is badly formed, while all other field mappers
will only ignore the problem value and index the rest. Finally, this
structure makes adding index-time scripts to geo_point needlessly
complex.

This commit refactors the indexing logic of the hierarchy to move the
individual value indexing logic into the concrete implementations,
and aligns the ignore_malformed behaviour with that of other mappers.

It contains two breaking changes:

* The geo field mappers no longer check for external field values on the
  parse context. This added considerable complication to the refactored
  parse methods, and is unused anywhere in our codebase, but may
  impact plugin-based field mappers which expect to use geo fields
  as multifields
* The geo_point field mapper now passes geohashes to its multifields
  one-by-one, instead of formatting them into a comma-delimited
  string and passing them all at once. Completion multifields using
  this as an input should still behave as normal because by default
  they would split this combined geohash string on the commas in any
  case, but keyword subfields may look different.

Fixes #69601
2021-04-19 12:38:01 +01:00
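The aligned ignore_malformed behaviour from #71696 above (skip only the bad value, index the rest) can be sketched as follows. The class, the "lat,lon" string format, and the method names are assumptions made for illustration; the real mappers work on XContentParser input.

```java
import java.util.ArrayList;
import java.util.List;

final class IgnoreMalformedSketch {

    /** Parses a "lat,lon" string; throws IllegalArgumentException on malformed input. */
    static double[] parsePoint(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected lat,lon but got [" + value + "]");
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        return new double[] { Double.parseDouble(parts[0].trim()), Double.parseDouble(parts[1].trim()) };
    }

    /** Indexes value by value, so one malformed value does not discard the others. */
    static List<double[]> indexValues(List<String> values, boolean ignoreMalformed) {
        List<double[]> indexed = new ArrayList<>();
        for (String value : values) {
            try {
                indexed.add(parsePoint(value));
            } catch (IllegalArgumentException e) {
                if (ignoreMalformed == false) {
                    throw e;
                }
                // ignore_malformed: drop only this value and continue with the rest
            }
        }
        return indexed;
    }
}
```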
Jack Conradson 065d7696c2
Add missing boolean array to unknown value writers for xcontent (#71651)
This change adds the ability to call value on an XContentBuilder and consume a boolean[]. This was 
missing from the set of other writers for the unknown value call.
2021-04-13 12:49:14 -07:00
Przemyslaw Gomulka 45ef2ab63c
Make restApiVersion on XContentBuilder final (#70878)
When passing in restApiVersion during creation of XContentBuilder
it makes it more clear that this field is final.
This prevents accidental change of the version during the xcontent
creation.
The withCompatibleVersion method can also be removed, since the field
only needs to be set in constructor.

relates #51816
2021-03-29 10:28:39 +02:00
Przemyslaw Gomulka a54685cf48
Parsing: Validate that fields are not registered twice (#70243)
It is possible that a developer accidentally declares two parsers for the
same field name.
This commit introduces a validation to prevent that from happening.
2021-03-18 07:45:44 +01:00
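The kind of validation added in #70243 above might look like the sketch below. FieldRegistrySketch and its methods are hypothetical; the actual check lives in the x-content parser classes and may differ in detail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class FieldRegistrySketch {

    private final Map<String, Function<String, Object>> parsers = new HashMap<>();

    /** Registers a parser for a field name and rejects a second registration of the same name. */
    void declareField(String name, Function<String, Object> parser) {
        if (parsers.putIfAbsent(name, parser) != null) {
            throw new IllegalArgumentException("Parser already registered for name=[" + name + "]");
        }
    }

    int fieldCount() {
        return parsers.size();
    }
}
```

Failing at declaration time surfaces the mistake when the parser is built rather than when a document is parsed.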
Przemyslaw Gomulka 9ad9c781de
Add compatible logging when parsing a compatible field (#69539)
#68808 introduced the ability to declare fields that are only available for parsing when a compatible API is used.

This commit replaces deprecation logging with compatible logging when a 'compatible only' field is used. It also refactors the LoggingDeprecationHandler method names.

relates #51816
2021-03-09 12:29:40 +01:00
Przemyslaw Gomulka 8d09fbf82b
Allow for field declaration for future rest versions (#69774)
When renaming/removing a field, a new field might be declared which
should be parseable starting with the current version.
This commit changes the way ParseField is declared for a compatible
version. Instead of a concrete version, a boolean function is used
to indicate for which versions a field is parseable. The onOrAfter and
equalTo functions are declared on RestApiVersion to allow for
this.
2021-03-05 08:27:00 +01:00
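The idea in #69774 above of declaring a field with a boolean function over versions, rather than a concrete version, can be sketched as below. The enum and class here are simplified, hypothetical stand-ins; only the names onOrAfter and equalTo come from the commit message.

```java
import java.util.function.Predicate;

/** Hypothetical stand-in for RestApiVersion with two major versions. */
enum RestApiVersionSketch {
    V_7,
    V_8;

    static Predicate<RestApiVersionSketch> onOrAfter(RestApiVersionSketch version) {
        // Enum constants compare by declaration order.
        return candidate -> candidate.compareTo(version) >= 0;
    }

    static Predicate<RestApiVersionSketch> equalTo(RestApiVersionSketch version) {
        return candidate -> candidate == version;
    }
}

/** Hypothetical field declaration that carries the versions it is parseable for. */
final class VersionedFieldSketch {
    final String name;
    private final Predicate<RestApiVersionSketch> parseableFor;

    VersionedFieldSketch(String name, Predicate<RestApiVersionSketch> parseableFor) {
        this.name = name;
        this.parseableFor = parseableFor;
    }

    boolean isParseable(RestApiVersionSketch requested) {
        return parseableFor.test(requested);
    }
}
```

With this shape, a renamed field can keep its old name parseable only for equalTo(V_7) while the new name is declared with onOrAfter(V_8).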
Joe Gallo f2763edb2d
Additional renames of RestApiCompatibleVersion to RestApiVersion (#69913) 2021-03-03 13:56:03 -05:00
Joe Gallo 638735bbb9
Rename RestApiCompatibleVersion to RestApiVersion (#69897) 2021-03-03 12:17:48 -05:00
Przemyslaw Gomulka f22adc47d8
Refactor ObjectParser and CompatibleObjectParser to support REST Compatible API (#68808)
In order to support compatible fields when parsing XContent, additional information has to be set during ParseField declaration.
This commit adds a set of RestApiCompatibleVersion to a ParseField in order to specify on which versions a field is supported. By default, a ParseField can be parsed on both the current and previous major versions.

ObjectParser, which is used for constructing objects using 'setters', has its fieldParsersMap changed to a Map of Maps, keyed by RestApiCompatibility. This allows choosing the set of field parsers specified by a request.
Under the RestApiCompatibility.minimumSupported key, there is a map that contains field parsers for both previous and current versions.
Under RestApiCompatibility.current, there are only the current version's fields (compatible fields are not present).

ConstructingObjectParser, which is used for constructing objects using 'constructors', is modified to contain a map from version to constructorArgInfo: the declarations of fields to be set on a constructor, depending on the version.

relates #51816
2021-02-16 11:33:11 +01:00
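The "Map of Maps" layout described in #68808 above can be sketched as follows. All names are hypothetical, and parsers are represented by plain strings to keep the example small; the real ObjectParser stores field parser objects.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class VersionedParserMapSketch {

    /** Hypothetical stand-in for the two supported REST API versions. */
    enum ApiVersion { MINIMUM_SUPPORTED, CURRENT }

    // Outer key: requested API version. Inner map: field name -> parser (a String here).
    private final Map<ApiVersion, Map<String, String>> fieldParsers = new EnumMap<>(ApiVersion.class);

    /** Registers a field parser under every version the field is supported on. */
    void declareField(String name, String parser, Set<ApiVersion> supportedOn) {
        for (ApiVersion version : supportedOn) {
            fieldParsers.computeIfAbsent(version, v -> new HashMap<>()).put(name, parser);
        }
    }

    /** Returns the field names that can be parsed for the version set on a request. */
    Set<String> fieldsFor(ApiVersion requested) {
        return fieldParsers.getOrDefault(requested, Map.of()).keySet();
    }
}
```

A field declared for both versions appears in both inner maps, while a compatible-only field appears only under MINIMUM_SUPPORTED.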
Przemyslaw Gomulka 71d43b598d
Refactor usage of compatible version (#68648)
Compatible API version is at the moment represented by both Version and
byte, the latter representing a major version. This can lead to
confusion about which representation to use, as well as to incorrect
assumptions that minor versions are supported (with the use of
Version.V_7_0_0).

The current usage of XContentParser.useCompatible also does not allow
defining two compatible implementations. This is not about supporting
N-2 compatibility, but about allowing development to continue when a
major release is performed.

This commit introduces the CompatibleVersion object, which wraps a
major version of the compatible API.

relates #68100
2021-02-10 10:22:34 +01:00
Rory Hunter 780f273067
Replace NOT operator with explicit `false` check - part 8 (#68625)
Part 8.

We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
2021-02-08 15:20:34 +00:00
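As a small illustration of the rule described in #68625 above, the two methods below behave identically; only the second follows the explicit `false` convention. The class and method names are made up for this example.

```java
import java.util.List;

final class ExplicitFalseCheckSketch {

    /** Uses the logical not operator, which the Checkstyle rule would flag. */
    static String describeWithNot(List<String> items) {
        if (!items.isEmpty()) {
            return "has items";
        }
        return "empty";
    }

    /** Same logic, written with an explicit comparison against false. */
    static String describeExplicit(List<String> items) {
        if (items.isEmpty() == false) {
            return "has items";
        }
        return "empty";
    }
}
```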
Mark Vieira a92a647b9f Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

 - Updating LICENSE and NOTICE files throughout the code base, as well
   as those packaged in our published artifacts
 - Update IDE integration to now use the new license header on newly
   created source files
 - Remove references to the "OSS" distribution from our documentation
 - Update build time verification checks to no longer allow Apache 2.0
   license header in Elasticsearch source code
 - Replace all existing Apache 2.0 license headers for non-xpack code
   with updated header (vendored code with Apache 2.0 headers obviously
   remains the same).
 - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 16:10:53 -08:00
Przemyslaw Gomulka 392d924304
Extend NamedObjectRegistry with compatible entries (#68274)
When parsing xContent objects, named objects can be registered and need
to be parsed as per N-1 rules.
Named objects are declared on a parser (PARSER.declareNamedObject) and
the corresponding parsing logic is registered in NamedObjectRegistry.
There is a need to use an old N-1 "configuration" of the registry in
order to support the compatible parsing logic.

This commit extends the NamedObjectRegistry with an additional set of
compatible entries, which provide the compatible parsing logic. Those
entries are then used in parseNamedObject when the xContentParser is
using compatibility.

relates #51816
2021-02-02 09:16:01 +01:00
Przemyslaw Gomulka 0c648a2459
Make XContentParser aware of compatible API version (#68113)
Makes XContentParser aware of the compatible API version. This is preparatory work for supporting compatible changes to named xcontent object parsing.
XContentParser#useCompatibleApi will not be used for surgical compatible implementations; it will only be used in the infrastructure to select a compatible named xcontent registry. Therefore, information about the exact major version is not needed.

relates #51816
2021-02-01 19:05:48 +01:00
Rory Hunter ad1f876daa
Replace NOT operator with explicit `false` check (#67817)
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
2021-01-26 14:47:09 +00:00
Przemyslaw Gomulka 8ce39ddc2b
Make ParsedMediaType truly immutable (#67552)
ParsedMediaType was not truly immutable, and it was possible to
accidentally modify other ParsedMediaTypes when building one.
ParsedMediaType#parseMediaType(XContentType,Map) was modifying
ParsedMediaTypes within XContentType. This should never happen.

This commit addresses this by making the Map of parameters immutable.

closes #67545
2021-01-18 08:54:18 +01:00
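The fix described in #67552 above, making the parameter map immutable so that building one instance cannot alter another, can be sketched like this. ImmutableMediaTypeSketch is a hypothetical, simplified class and not the actual ParsedMediaType.

```java
import java.util.HashMap;
import java.util.Map;

final class ImmutableMediaTypeSketch {

    private final String type;
    private final Map<String, String> parameters;

    ImmutableMediaTypeSketch(String type, Map<String, String> parameters) {
        this.type = type;
        // Defensive, unmodifiable copy: later changes to the caller's map cannot leak in.
        this.parameters = Map.copyOf(parameters);
    }

    /** Builds a new instance with one more parameter, leaving this instance untouched. */
    ImmutableMediaTypeSketch withParameter(String name, String value) {
        Map<String, String> merged = new HashMap<>(parameters);
        merged.put(name, value);
        return new ImmutableMediaTypeSketch(type, merged);
    }

    Map<String, String> parameters() {
        return parameters;
    }
}
```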
Przemyslaw Gomulka 08b6bd1141
Precompute ParsedMediaType for XContentType (#67409)
XContentType instances should have precomputed ParsedMediaType instances,
which prevents unnecessary parsing of their media types.

The unnecessary parsing was causing a performance drop, and this commit addresses
that. Performance was affected because an XContentBuilder is newly created for
each request, which in the case of bulk requests can mean many instances.
Each XContentBuilder instance uses XContentType.mediaType, which was parsed
into a ParsedMediaType. In a bulk request that is unnecessary, as the content type
is already known from XContent#type.

By default, when Accept or Content-Type headers are not specified,
application/json; charset=UTF-8 is used as the response content type.
The redundant charset parameter is dropped in this commit as well.
2021-01-14 14:01:02 +01:00
Przemyslaw Gomulka 5e74f79e22
Support response content-type with versioned media type (#65500)
This commit allows returning the correct requested response content-type; previously this did not work for versioned media types.
It is done by adding new vendor-specific instances to the XContent and TextFormat enums. These instances can then "format" the response content type string when provided with parameters. This is similar to what the SQL plugin does with its media types.

#51816
2021-01-05 09:23:22 +01:00
Rene Groeschke defaa93902
Avoid tasks materialized during configuration phase (#65922)
* Avoid tasks materialized during configuration phase
* Fix RestTestFromSnippet testRoot setup
2020-12-12 16:14:17 +01:00
Hendrik Muhs 9b47889153
[Transform] use ISO dates in output instead of epoch millis (#65584)
Transform writes dates as epoch millis, which does not work for historic data in some cases or is
unsupported. Dates should be written as such. With this PR, transform starts writing dates in ISO
format, but as existing transforms might rely on the old format, it provides backwards compatibility for
old jobs as well as a setting to write dates as epoch millis.

fixes #63787
2020-12-07 15:34:28 +01:00
Przemyslaw Gomulka 91df1b8edf
Introduce Compatible Version plugin (#64481)
A RestCompatibilityPlugin and its xpack implementation allow calculating the version requested on a REST request. It uses the Accept and Content-Type headers to return a version.
It also validates the allowed combinations of versions and values provided in the Accept/Content-Type headers.

relates #51816
2020-11-23 14:19:23 +01:00
Przemyslaw Gomulka 618d8bcec6
Allow registering compatible handlers (#64423)
Adds infrastructure to allow registration of compatible handlers.
Compatible handlers are RestHandlers used for handling REST requests from old-version clients (CURRENT-1 version). They might be registered under an endpoint that was removed or changed in the CURRENT version (a different path or method, or an endpoint removed completely).
But they can also be registered under the same endpoint (same path and method as the RestHandler in CURRENT).
A RestHandler's endpoint is at the moment two-dimensional: a method and a path.

This PR adds a third dimension: a version.

Registration:
RestHandler declares a new compatibleWithVersion method, which is overridden by compatible handlers to return Version.CURRENT - 1. By default the method returns Version.CURRENT.
compatibleWithVersion is used when iterating over handlers within RestController#registerHandler. The returned value is used to set a version on MethodHandlers.

Lookup:
An interface, CompatibleVersion, is introduced in order to abstract the logic that calculates the compatible version requested by a user.
It is not implemented in this PR. A simplified implementation that always returns Version.CURRENT is used.
Within RestController, a version is calculated with the use of CompatibleVersion, then the lookup for MethodHandlers is performed (the logic is the same).
Once it is found, an additional lookup for a RestHandler for the requested version is made.

The requested version also has to be passed down to XContentBuilder in order to allow for per-version serialisation logic.

relates #51816
2020-11-16 09:11:24 +01:00
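The three-dimensional lookup (path, method, version) described in #64423 above can be sketched as a nested map. This is a simplified illustration with hypothetical names: handlers are plain strings, versions are major version numbers, and paths are matched literally rather than through the real RestController path matching.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class VersionedHandlerRegistrySketch {

    // path -> HTTP method -> major version -> handler name
    private final Map<String, Map<String, Map<Integer, String>>> handlers = new HashMap<>();

    /** Registers a handler under a path, a method and the version it is compatible with. */
    void register(String path, String method, int version, String handler) {
        handlers.computeIfAbsent(path, p -> new HashMap<>())
            .computeIfAbsent(method, m -> new HashMap<>())
            .put(version, handler);
    }

    /** Finds the handlers for path and method first, then picks the one for the requested version. */
    Optional<String> lookup(String path, String method, int requestedVersion) {
        Map<Integer, String> byVersion = handlers.getOrDefault(path, Map.of()).getOrDefault(method, Map.of());
        return Optional.ofNullable(byVersion.get(requestedVersion));
    }
}
```

The same path and method can therefore carry both a current handler and a compatible one, and a removed endpoint can exist only under the older version.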
Rene Groeschke 810e7ff6b0
Move tasks in build scripts to task avoidance api (#64046)
- Some trivial cleanup on build scripts
- Change task referencing in build scripts to use task avoidance api
where replacement is trivial.
2020-11-12 12:04:15 +01:00
Przemyslaw Gomulka 0bb64cbe1e
Handle incorrect header values (#64708)
When Accept or Content-Type header values are incorrect, a request
should be gracefully rejected and an exception message returned to a
client.

relates #64689
relates #51816
2020-11-10 09:07:32 +01:00
Przemyslaw Gomulka dd5bcd4093
Ignore media ranges when parsing (#64721)
Browsers send media ranges with quality factors in the Accept header.
We should ignore the value and respond with application/json.

closes #64689
relates #51816
2020-11-09 09:03:18 +01:00
Przemyslaw Gomulka 710500a50a
Do not allow spaces within MediaType's parameters (#64650)
Per https://tools.ietf.org/html/rfc7231#section-3.1.1.1 a MediaType can
have spaces around a parameter pair, but spaces are not allowed within a
parameter. That means a parameter pair forbids spaces around =

follow up after
#64406 (comment)
relates #51816
2020-11-06 09:25:21 +01:00
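A minimal sketch of the rule from #64650 above: whitespace around a parameter pair is accepted, whitespace inside the pair (for example around =) is rejected. The class is hypothetical and deliberately simple; it does not handle quoted-string values, and it is not the ParsedMediaType implementation.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class MediaTypeParamSketch {

    /** Parses the parameters of a header value such as "application/json; charset=utf-8". */
    static Map<String, String> parseParameters(String headerValue) {
        String[] parts = headerValue.split(";");
        Map<String, String> parameters = new LinkedHashMap<>();
        // parts[0] is the type/subtype; every following part is one parameter pair.
        for (int i = 1; i < parts.length; i++) {
            String pair = parts[i].trim(); // spaces around the pair are allowed
            String[] keyValue = pair.split("=");
            boolean hasInnerSpace = pair.chars().anyMatch(Character::isWhitespace);
            if (hasInnerSpace || keyValue.length != 2 || keyValue[0].isEmpty()) {
                throw new IllegalArgumentException("invalid parameter [" + parts[i] + "] in [" + headerValue + "]");
            }
            parameters.put(keyValue[0].toLowerCase(Locale.ROOT), keyValue[1]);
        }
        return parameters;
    }
}
```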
Przemyslaw Gomulka c219e176e9
Introduce per REST endpoint media types (#64406)
Declaring media types per REST endpoint makes parsing and validation stricter.

If a media type is declared in only one endpoint (for instance, CSV in the SQL endpoint), parsing that media type should not be allowed when using other endpoints.
However, the Compatible API needs to be able to understand all media types supported by Elasticsearch in order to parse a compatible-with=version parameter.
This implies that endpoints need to declare which media types they support and how to parse them (if introducing new media types, like SQL does).

How to parse:
The MediaType interface still serves as an abstraction on top of XContentType and TextFormat. It also declares mappings from String to MediaType, with parameters. The parameters declaration gives the names of parameters and a regex to validate their values.
This instructs how to perform the parsing. For instance, XContentType.JSON has the mapping application/vnd.elasticsearch+json -> JSON and allows the parameters compatible-with=\d and charset=utf-8.

MediaTypeParser was simplified into a ParsedMediaType class with a static factory method for parsing.

How to declare:
The RestHandler interface is extended with validAcceptMediaTypes, which returns a MediaTypeRegistry: a class that encapsulates mappings of strings (type/subtype) to MediaType, the allowed parameters, and formatPathParameter values.
We only need to allow declaring valid media types for the Accept header. Valid Content-Type media types are fixed to the XContentType instances: json, yaml, smile, cbor.

relates #51816
2020-11-05 09:47:13 +01:00
Lee Hinman 7620e9415c
Add assert that raw and readable xcontent field names are different (#63332)
This adds asserts that will catch the case where we accidentally provide the same raw and readable
field name in xcontent.
2020-10-06 10:52:49 -06:00
Przemyslaw Gomulka d04edcdfe7
Allow parsing Content-Type and Accept headers with version (#61427)
Content-Type and Accept headers can be populated with a versioned form of media types,
like application/vnd.elasticsearch+json;compatible-with=7,
where previously only simple forms such as application/json (or cbor, yaml, ...) were used; these are still supported.
This commit extends MediaTypeParser to validate the parameters.

relates #51816
2020-10-05 15:12:48 +02:00
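Extracting the requested major version from a versioned media type, as introduced in #61427 above, can be sketched with a regular expression. The class, the pattern, and the restriction to a single compatible-with parameter are assumptions for illustration; the real parser validates parameters in a more general way.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompatibleWithSketch {

    // Matches e.g. "application/vnd.elasticsearch+json;compatible-with=7".
    private static final Pattern VERSIONED_MEDIA_TYPE = Pattern.compile(
        "application/vnd\\.elasticsearch\\+(json|yaml|smile|cbor)\\s*;\\s*compatible-with=(\\d+)",
        Pattern.CASE_INSENSITIVE
    );

    /** Returns the requested major version, or empty for plain media types such as application/json. */
    static OptionalInt requestedMajorVersion(String headerValue) {
        Matcher matcher = VERSIONED_MEDIA_TYPE.matcher(headerValue.trim());
        if (matcher.matches()) {
            return OptionalInt.of(Integer.parseInt(matcher.group(2)));
        }
        return OptionalInt.empty();
    }
}
```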
Przemyslaw Gomulka 86ba7324c8
Media-type parser (#61987)
Splits the method XContentType.fromMediaTypeOrFormat into two separate methods. This will help to validate the media type provided in Accept or Content-Type headers.
Extracts the parsing logic from XContentType (the fromMediaType and fromFormat methods) to a separate MediaTypeParser class. This will help reuse the same parsing logic for XContentType and TextFormat (used in SQL).

Parsing of `Media-Types type/subtype; parameters` is defined in https://tools.ietf.org/html/rfc7231#section-3.1.1.1

part of #61427
2020-09-17 16:47:32 +02:00
Armin Braun 8fc9581d35
Speed up XContent Collection Parsing (#61442)
1. Get rid of the capturing lambda on the hot path that inlines very badly
2. Remove as many bounds checks as possible, thereby reducing method size and improving inlining
2020-08-27 09:24:49 +02:00
Armin Braun 305bebfcad
Stop Needlessly Copying Bytes in XContent Parsing (#61447)
Wrapping a `BytesArray` in a `StreamInput` for deserialization is inefficient.
This forces Jackson to internally buffer (i.e. copy) all bytes from the `BytesArray`
before deserializing, adding overhead for copying the bytes and managing the buffers.

This commit fixes a number of spots where `BytesArray` is the most common type of
`BytesReference` to special case this type and parse it more efficiently.
Also improves parsing `String`s to use the more efficient direct `String` parsing APIs.
2020-08-24 15:03:21 +02:00
Armin Braun 82f040716d
Fix Broken Stream Close in writeRawValue (#60625)
Small oversight in #56078 that only showed up during backporting, where a stream copy was turned from a non-closing into a closing one. Part of a test is enhanced in this PR to make the problem show up in master as well, even though we practically never use this method with stream targets that actually close.
2020-08-04 12:46:15 +02:00
Armin Braun e28dbde289
Unify Stream Copy Buffer Usage (#56078)
We have various ways of copying between two streams and handling thread-local
buffers throughout the codebase. This commit unifies a number of them and
removes buffer allocations in many spots.
2020-08-03 18:21:24 +02:00
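One common way to unify stream copies around a shared buffer, in the spirit of #56078 above, is a single helper backed by a thread-local byte array. The sketch below is an assumption about the general technique, not the Elasticsearch helper itself; the class name and the 8 KiB buffer size are arbitrary.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopySketch {

    // One reusable buffer per thread, so repeated copies do not allocate.
    private static final ThreadLocal<byte[]> BUFFER = ThreadLocal.withInitial(() -> new byte[8 * 1024]);

    /** Copies all bytes from in to out and returns the count. Neither stream is closed. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = BUFFER.get();
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Leaving both streams open keeps the closing decision with the caller, which is the distinction that #60625 later had to correct for writeRawValue.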
Yang Wang a28ce1e21c
Improve role cache efficiency for API key roles (#58156)
This PR ensures that the same roles are cached only once, even when they come from different API keys.
API key role descriptors and limited role descriptors are now saved in Authentication#metadata
as raw bytes instead of a deserialised Map<String, Object>.
Hashes of these bytes are used as keys for API key roles. Only when the required role is not found
in the cache are the bytes deserialised to build the RoleDescriptors. The deserialisation goes directly
from raw bytes to RoleDescriptors without the current detour of
"bytes -> Map -> bytes -> RoleDescriptors".
2020-07-13 21:23:23 +10:00
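The caching scheme from #58156 above (hash the raw bytes, deserialise only on a cache miss) can be sketched as below. Everything here is illustrative: the class is hypothetical, SHA-256 is an assumed hash choice since the commit message does not name one, and turning the bytes into a String stands in for building RoleDescriptors.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

final class RoleCacheSketch {

    private final Map<String, String> cache = new HashMap<>();
    private int deserialisations = 0;

    /** Returns the cached roles for these bytes, deserialising only when the hash is unknown. */
    String rolesFor(byte[] rawRoleDescriptors) {
        return cache.computeIfAbsent(hash(rawRoleDescriptors), key -> {
            deserialisations++;
            // Stand-in for deserialising raw bytes directly into role descriptors.
            return new String(rawRoleDescriptors, StandardCharsets.UTF_8);
        });
    }

    int deserialisations() {
        return deserialisations;
    }

    /** Hashes the raw bytes; SHA-256 is an assumption made for this sketch. */
    static String hash(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

Two API keys that carry byte-identical role descriptors hash to the same key, so they share one cache entry and one deserialisation.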
Przemysław Witek d39bde9c95
Simplify parser declarations when specialist types are stored in strings (#58996) 2020-07-06 12:15:42 +02:00
Rene Groeschke 9526c7a4b3
Replace compile configuration usage with api (#58451)
- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make the test runtime classpath an input for the testing convention
  - required as the java library will by default not have a built jar file
  - the jar file is now an explicit input of the task and Gradle will ensure it is properly built
2020-06-30 09:37:09 +02:00
Rene Groeschke 5f9d1f1d7c
Unify dependency licenses task configuration (#58116)
- Remove duplicate dependency configuration
- Use task avoidance api across the build
- Remove redundant licensesCheck config
2020-06-17 18:27:16 +02:00