Commit Graph

6203 Commits

Author SHA1 Message Date
Alan Woodward f2ac4f9953
Avoid using external values in parent-join and percolator mappers (#71834)
We would like to remove the use of 'external values' in document parsing.
This commit simplifies two of the four places it is currently used, by adding
direct indexValue methods to BinaryFieldMapper and ParentIdFieldMapper.

Relates to #56063
2021-04-20 12:18:42 +01:00
Alan Woodward ee3510b766
Add index-time scripts to geo_point field mapper (#71861)
This commit adds the ability to define an index-time geo_point field
with a script parameter, allowing you to calculate points from other
values within the indexed document.
2021-04-20 10:24:25 +01:00
Jack Conradson 4d986757e0
Disallow stored scripts from runtime field contexts (#71863)
This change disallows stored scripts in runtime field contexts. It adds a constructor parameter to the
script context that specifies whether a context is allowed to be used by a stored script.
2021-04-19 13:28:48 -07:00
Nhat Nguyen 46ada227dc
Expose dynamic_templates parameter in Ingest (#71716)
This change exposes the newly introduced parameter `dynamic_templates`
in ingest. This parameter can be set by a set processor or a script processor.

Relates #69948
2021-04-19 11:34:13 -04:00
Luca Cavanna ee5cd443c4
Unify supported runtime fields script contexts (#71833)
There are a few places where we need to access all of the supported runtime fields script contexts. Up until now we have listed them in all those places, but a better way is to list them in one place and access that same list from all consumers. This is what this commit introduces.

Along with the introduction of runtime fields contexts in ScriptModule, we rename the whitelist files so that they contain their corresponding context name to simplify looking them up.
2021-04-19 17:23:59 +02:00
Dan Hermann eb345b2a8f
Deprecate legacy index template API endpoints (#71309) 2021-04-16 08:07:28 -05:00
Dan Hermann 60345ac181
Option to disable device type parsing in user agent processor (#71625) 2021-04-16 07:08:30 -05:00
Przemko Robakowski 2b81d729be
Remove assertion from DatabaseRegistry (#71764)
This change removes an assertion from DatabaseRegistry. We can easily lose the .geoip_databases index while the persistent task state is still in the cluster state; this is not an assertion failure but an ordinary failure, and it should be signalled as one.

This also tries to fix packaging tests by avoiding duplicates in elasticsearch.yml.

Closes #71762
2021-04-15 21:44:42 +02:00
Przemko Robakowski 308aee283d
Update GeoIP processor documentation (#71211)
This PR adds documentation for the GeoIPv2 auto-update feature.
It also renames the related settings from geoip.downloader.* to ingest.geoip.downloader.* to follow the same convention as the current setting.

Relates to #68920

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-04-15 13:47:09 +02:00
Przemko Robakowski 39eb12a972
Enable GeoIP downloader by default (#71505)
This change enables GeoIP downloader by default.
It removes the feature flag but adds a flag that tests use to disable it again (as we don't want to hammer the GeoIP database service with every test cluster we spin up).

Relates to #68920
2021-04-15 12:28:37 +02:00
Jack Conradson 301dcb64aa
Add Runtime Fields Contexts to Painless Execute API (#71374)
This change adds support for the 7 different runtime fields contexts to the Painless Execute API. Each 
context can accept the standard script input (source and params) along with a user-defined document 
and an index name to pull mappings from. The results depend on the output of the runtime field 
type.

Closes #70467
2021-04-14 08:56:56 -07:00
Andrew Stucki c102566a64
Network direction processor supports dynamic internal networks specification (#68712) 2021-04-14 08:13:42 -05:00
Nhat Nguyen a461597c75
Upgrade to Lucene 8.8.2 on 8.0 (#71587) 2021-04-14 08:52:23 -04:00
Andrew Stucki f491af65c0
Registered domain processor (#67611) 2021-04-14 07:18:03 -05:00
Alan Woodward 05551dd77b
Add index-time scripts to date field mapper (#71633)
This commit allows you to set 'script' and 'on_script_error' parameters
on date field mappers, meaning that runtime date fields can be made indexed
simply by moving their definitions from the runtime section of the mappings
to the properties section.
2021-04-14 09:18:05 +01:00
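As a rough illustration of the move described in #71633, the sketch below relocates a date field definition from the `runtime` section of a mappings body to the `properties` section. The helper `promote_runtime_field`, the field names, and the script source are hypothetical; only the `script` and `on_script_error` parameter names come from the commit message, and the value `"fail"` is an assumption.

```python
import copy


def promote_runtime_field(mappings, name, on_script_error="fail"):
    """Return a copy of `mappings` with runtime field `name` moved to `properties`.

    Hypothetical helper: it only rearranges a plain dict and does not talk
    to a cluster.
    """
    result = copy.deepcopy(mappings)
    # Raises KeyError if `name` is not defined in the runtime section.
    definition = result.get("runtime", {}).pop(name)
    definition["on_script_error"] = on_script_error  # assumed value
    result.setdefault("properties", {})[name] = definition
    return result


# Illustrative mappings body; field names and script are made up.
runtime_mappings = {
    "runtime": {
        "event_day": {
            "type": "date",
            "script": {"source": "emit(doc['event_millis'].value)"},
        }
    },
    "properties": {"event_millis": {"type": "long"}},
}

indexed_mappings = promote_runtime_field(runtime_mappings, "event_day")
```

The field definition itself is unchanged by the move, which is the point the commit message makes: the same script serves as a runtime field or as an index-time script depending on the section it lives in.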
Mark Vieira 50d6e1369a Mute Netty4HeadBodyIsEmptyIT.testTemplateExists 2021-04-13 18:00:01 -07:00
Lyudmila Fokina 3b0b7941ae
Warn users if security is implicitly disabled (#70114)
* Warn users if security is implicitly disabled

Elasticsearch has security features implicitly disabled by default for
Basic and Trial licenses, unless explicitly set in the configuration
file.
This may be good for onboarding, but it can also lead to unintended
insecure clusters.
This change introduces clear warnings when security features are
implicitly disabled:
- a warning header in each REST response if security is implicitly
  disabled;
- a log message during cluster boot.
2021-04-13 18:33:41 +02:00
Przemko Robakowski 46efa6ad04
Fix problems in GeoIPv2 code (#71598)
This change fixes a number of problems in the GeoIPv2 code:

- closes streams from Files.list in GeoIpCli, which should fix tests on Windows
- makes sure that the total download time in GeoIP stats is non-negative (we serialize it as a vInt, which can cause problems with negative numbers, and a negative value can occur when the clock is changed during operation)
- fixes handling of failed/simultaneous downloads. #69951 was meant to prevent 2 persistent tasks from indexing chunks, but it would prevent any update if a single download failed mid-indexing. This change uses the timestamp (lastUpdate) as a sort of UUID. This should still prevent 2 tasks from stepping on each other's toes (overwriting chunks), and in the end only a single task should be able to update the task state (this is handled by the persistent tasks framework)

Closes #71145
2021-04-13 17:10:45 +02:00
Rory Hunter fb1921c9dc
Change deprecation logs data stream name (#68737)
More fixes to deprecation log indexing so that the data stream name and document
contents are more ECS-compatible.
2021-04-13 15:11:57 +01:00
Alan Woodward 67db2538f8
Add index-time scripts to IP field mapper (#71617)
This commit allows you to set 'script' and 'on_script_error' parameters
on IP field mappers, meaning that runtime IP fields can be made indexed
simply by moving their definitions from the runtime section of the mappings
to the properties section.
2021-04-13 13:40:10 +01:00
Dan Hermann 502ff027a8
URI parts processor handles URLs containing spaces (#71559) 2021-04-13 07:35:37 -05:00
Martijn van Groningen a478b5ff72
Fix GeoIpProcessorNonIngestNodeIT#testLazyLoading(...) (#71579)
Ensure that the index request is routed to the ingest node,
so that the lazy loading of the geoip database occurs
on the ingest node (which is what is asserted later on).
Otherwise the database is lazy loaded on a different node.

(without this fix, this test fails reproducibly with
`-Dtests.seed=2E234CC71CE96F4F`)

Closes #71251
2021-04-13 10:01:52 +02:00
Jack Conradson 597b8b4519
Remove loop counter for foreach loops (#71602)
This change removes the loop counter for foreach loops, which is a regression from 7.9.3
documented in #71584. Not counting iterations of foreach loops should be relatively safe, as they have a
natural ending point. We can consider adding the check back in once we have configurable loop
counters.
2021-04-12 16:35:50 -07:00
Alan Woodward 5e11709693
Add scripts to keyword field mapper (#71555)
This commit adds script and on_script_error parameters to
keyword field mappers, allowing you to define index-time scripts
for keyword fields.
2021-04-12 16:46:02 +01:00
Luca Cavanna 1469e18c98
Add support for script parameter to boolean field mapper (#71454)
Relates to #68984
2021-04-12 10:04:12 +02:00
Przemko Robakowski 44a2ae4893
Add GeoIP CLI integration test (#71381)
This change adds an additional test to GeoIpDownloaderIT which checks that artifacts produced by the GeoIP CLI tool can be consumed by a cluster in the same way as those from our original service.
It does so by running the tool from a fixture which then simply serves the generated files (this is exactly the way users are supposed to use the tool as well).

Relates to #68920
2021-04-08 12:49:29 +02:00
Stuart Tettemer 9d622e52e4
Script: User tree ToXContent (#70893)
Adds ability to serialize the user tree to an
XContent builder, handling all user tree decorations.

To avoid creating a tight dependency, the `ToXContent`
implementation is kept outside the relevant nodes.

Uses a wrapper around the `XContentBuilder` to change
checked `IOExceptions` into runtime exceptions to conform
to UserTreeVisitor API contract.

Adds new debugger method, `Debugger.phases` which
allows the caller to attach visitors at the three main phases
of Painless compilation.
2021-04-07 12:49:39 -05:00
Francisco Fernández Castaño e6894960f4
Include URLHttpClientIOException on URLBlobContainerRetriesTests testReadBlobWithReadTimeouts (#71318)
In some scenarios where the read timeout is too tight it's possible
that the http request times out before the response headers have
been received, in that case an URLHttpClientIOException is thrown.
This commit adds that exception type to the expected set of read timeout
exceptions.

Closes #70931
2021-04-06 14:58:57 +02:00
David Turner b690798348
Reduce size of MANAGEMENT threadpool on small node (#71171)
Today by default the `MANAGEMENT` threadpool always permits 5 threads
even if the node has a single CPU, which unfairly prioritises management
activities on small nodes. With this commit we limit the size of this
threadpool to the number of processors if less than 5.

Relates #70435
2021-04-06 12:58:07 +01:00
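The sizing rule from #71171 can be sketched in a few lines. This is only an illustration of "the number of processors if less than 5"; the function name and the lower bound of 1 are assumptions, not the actual thread pool code.

```python
def management_pool_size(allocated_processors, cap=5):
    """Upper bound on MANAGEMENT threads: the processor count, capped at `cap`.

    The floor of 1 is an assumption added so the pool is never empty.
    """
    return max(1, min(cap, allocated_processors))
```

Under this rule a single-CPU node gets one management thread instead of five, while nodes with five or more processors keep the previous limit.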
Christoph Büscher a413ae67e3
Propagate index errors in field_caps (#70245)
Currently we don't report any exceptions occurring during field_caps requests back to the user.
This PR adds a new failure section to the response which contains exceptions per index.
In addition the response contains another field, `failed_indices`, with the number of indices that threw
an exception. If all of the requested indices fail, the whole request fails; otherwise the request succeeds
and it is up to the caller to check for potential errors in the response body.

Closes #68994
2021-04-06 12:02:24 +02:00
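The partial-failure rule in #70245 can be modelled roughly as below. The response shape is a simplification: `failed_indices` is named in the commit message, while the `failures` key, the `indices` key, and the `AllIndicesFailed` exception are assumptions made for this sketch.

```python
class AllIndicesFailed(Exception):
    """Hypothetical error raised when every requested index failed."""


def merge_field_caps(per_index):
    """Combine per-index results into one response.

    `per_index` maps an index name either to its field capabilities (a dict)
    or to the Exception raised while computing them.
    """
    failures = [
        {"indices": [index], "failure": str(outcome)}
        for index, outcome in per_index.items()
        if isinstance(outcome, Exception)
    ]
    if per_index and len(failures) == len(per_index):
        # Every index failed, so the whole request fails.
        raise AllIndicesFailed(f"{len(failures)} of {len(per_index)} indices failed")
    succeeded = [
        index for index, outcome in per_index.items()
        if not isinstance(outcome, Exception)
    ]
    # Partial failure: the request succeeds and the caller must inspect the body.
    return {
        "indices": succeeded,
        "failed_indices": len(failures),
        "failures": failures,
    }
```

A caller would therefore check `failed_indices` even on a successful response, since a non-zero value means some indices are missing from the result.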
Stuart Tettemer 44dc1af04b
Script: whitelist CIDR api (#71258)
* Script: whitelist CIDR api

Exposes the CIDR convenience class in painless.

API:
```
class CIDR {
  CIDR(String)
  boolean contains(String)
}
```

```
CIDR c = new CIDR('10.1.1.0/25');
c.contains('10.1.1.127'); // true
c.contains('10.1.1.129'); // false

c = new CIDR('2001:0db8:85a3::/64');
c.contains('2001:0db8:85a3:0000:0000:8a2e:0370:7334'); // true
c.contains('2001:0db8:85a3:0001:0000:8a2e:0370:7334'); // false

c.contains(null); // false
c.contains(''); // false
```

Closes: #60668
2021-04-05 09:44:53 -05:00
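For readers who want to check the membership results quoted in #71258 outside of Painless, the following sketch reproduces the same checks with Python's standard `ipaddress` module. The `Cidr` class here is a stand-in written for this illustration, not the Painless class; malformed addresses raise `ValueError` in this sketch, and the behaviour of the real class in that case is not stated in the commit.

```python
import ipaddress


class Cidr:
    """Illustrative stand-in for the CIDR convenience class."""

    def __init__(self, cidr):
        self._network = ipaddress.ip_network(cidr)

    def contains(self, address):
        # None and the empty string are reported as not contained,
        # matching the examples in the commit message.
        if not address:
            return False
        # Addresses of a different IP version are never contained.
        return ipaddress.ip_address(address) in self._network


c = Cidr("10.1.1.0/25")
v4_inside = c.contains("10.1.1.127")
v4_outside = c.contains("10.1.1.129")
```

A /25 network covers host addresses 0 through 127 in the last octet, which is why 10.1.1.127 is inside the range and 10.1.1.129 is not.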
Benjamin Trent 9b1ef4982d
Muting test for issue #71251 (#71253) 2021-04-02 13:39:30 -04:00
Jake Landis 279fde375e
Apply REST API compatibility testing for the :modules (#71137) 2021-04-02 11:20:54 -05:00
Jason Tedor 32314493a2
Pass override settings when creating test cluster (#71203)
Today when creating an internal test cluster, we allow the test to
supply the node settings that are applied. The extension point to
provide these settings has a single integer parameter, indicating the
index (zero-based) of the node being constructed. This allows the test
to make some decisions about the settings to return, but it is too
simplistic. For example, imagine a test that wants to provide a setting,
but some values for that setting are not valid on non-data nodes. Since
the only information the test has about the node being constructed is
its index, it does not have sufficient information to determine if the
node being constructed is a non-data node or not, since this is done by
the test framework externally by overriding the final settings with
specific settings that dictate the roles of the node. This commit changes
the test framework so that the test has information about what settings
are going to be overridden by the test framework after the test provides
its test-specific settings. This allows the test to make informed
decisions about what values it can return to the test framework.
2021-04-02 10:20:36 -04:00
Jason Tedor a5a5278954
Remove legacy role settings (#71163)
This commit removes the previously deprecated legacy role
settings. These settings have been replaced by node.roles.
2021-04-01 19:31:55 -04:00
Dan Hermann 370ce5d516
Exclude invalid url-encoded strings from randomized tests (#71085) 2021-04-01 11:11:55 -05:00
Przemko Robakowski 61fe14565a
Add tool for preparing local GeoIp database service (#71018)
Air-gapped environments can't simply use the GeoIp database service provided by Infra, so they have to either use a proxy or recreate a similar service themselves.
This PR adds a tool to make this process easier. The basic workflow is:

1) download databases from the MaxMind site to a single directory (either .mmdb files or gzipped tarballs with .tgz suffix)
2) run the tool with $ES_PATH/bin/elasticsearch-geoip -s directory/to/use [-t target/directory]
3) serve static files from that directory (for example with docker run -v directory/to/use:/usr/share/nginx/html:ro nginx)
4) use the server above as the endpoint for GeoIpDownloader (geoip.downloader.endpoint setting)
5) to update the databases, simply put new files in the directory and run the tool again

This change also adds support for relative paths in the overview json because the cli tool doesn't know about the address it would be served under.

Relates to #68920
2021-03-31 12:30:21 +02:00
Alan Woodward 1653f2fe91
Add script parameter to long and double field mappers (#69531)
This commit adds a script parameter to long and double fields that makes
it possible to calculate a value for these fields at index time. It uses the same
script context as the equivalent runtime fields, and allows for multiple index-time
scripted fields to cross-refer while still checking for indirection loops.
2021-03-31 11:14:11 +01:00
Mark Vieira 9dd82f8200 Format JVM option properly 2021-03-30 17:41:59 -07:00
Mark Vieira e707a6372b Workaround for running old ES fixture on certain aarch64 systems 2021-03-30 16:37:31 -07:00
Shan Swanlow 1b60ec05a9
Add CIDR processing class (#69630)
Adapted logic from CIDRUtils to create a new class for CIDR processing.

Relates #60668
2021-03-30 10:00:45 -05:00
Przemko Robakowski f1feece422
Fix DatabaseRegistry for Windows (#71008)
In DatabaseRegistry we tried to replace a file that was still open. This is not a problem under Linux and macOS, but Windows doesn't like it.
It was caught by our CI with reproducible failures when WindowsFS was set up by Lucene.
Now we skip one temp file and use GzipInputStream directly which fixes this problem.

Marking as non-issue since the code was not released yet.

Closes #70977
Closes #71006
2021-03-29 23:57:57 +02:00
Shahzad f7efa3eaba
Extract device type from user agent info (#69322) 2021-03-29 16:34:53 -05:00
Mark Vieira d407a9699e Mute DatabaseRegistryTests.testCheckDatabases 2021-03-29 11:12:50 -07:00
Alan Woodward c475fd9e8a
Move runtime fields classes into common packages (#70965)
Runtime fields currently live in their own java package. This is really
a leftover from when they were in their own module; now that they are
in core they should instead live in the common packages for classes of
their kind.

This commit makes the following moves:
org.elasticsearch.runtimefields.mapper => org.elasticsearch.index.mapper
org.elasticsearch.runtimefields.fielddata => org.elasticsearch.index.fielddata
org.elasticsearch.runtimefields.query => org.elasticsearch.search.runtime

The XFieldScript fields are moved out of the `mapper` package into 
org.elasticsearch.scripts, and the `PARSE_FROM_SOURCE` default scripts
are moved from these Script classes directly into the field type classes that
use them.
2021-03-29 12:02:01 +01:00
Przemko Robakowski b025f51ece
Add support for .tgz files in GeoIpDownloader (#70725)
We have to ship COPYRIGHT.txt and LICENSE.txt files alongside .mmdb files for legal compliance. Infra will pack these in single .tgz (gzipped tar) archive provided by GeoIP databases service.
This change adds support for that format to GeoIpDownloader and DatabaseRegistry
2021-03-29 12:46:27 +02:00
Mark Vieira 6339691fe3
Consolidate REST API specifications and publish under Apache 2.0 license (#70036) 2021-03-26 16:20:14 -07:00
Mayya Sharipova ccfdbb4d15
Fix binary docvalue_fields with padding (#70826)
Previously docvalue_fields for binary values with padding did not
output the padding. We consider it to be a bug because: 1) es would
not be able to parse these values 2) output from source filtering
and the fields API is different and does output padding.

This patch fixes this by outputting padding for binary
docvalue_fields where it is present.
2021-03-26 16:18:20 -04:00
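To see why dropping the padding matters (#70826), the sketch below encodes a two-byte value and then tries to decode it with and without the trailing `=`. Python's strict Base64 decoder is used here purely as a stand-in for a parser that requires padding; it is not the Elasticsearch parser.

```python
import base64
import binascii

raw = b"ab"  # 2 bytes, so the Base64 form needs one '=' pad character
padded = base64.b64encode(raw).decode("ascii")
unpadded = padded.rstrip("=")


def can_parse(value):
    """Return True if `value` decodes as strictly padded Base64."""
    try:
        base64.b64decode(value, validate=True)
        return True
    except binascii.Error:
        return False
```

Here `padded` round-trips to the original bytes, while `unpadded` is rejected by the strict decoder, which mirrors the first reason given in the commit message.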
Alan Woodward 19da36ab86
Remove MappedFieldType#setEagerGlobalOrdinals (#70920)
This is the only remaining setter on MappedFieldType, and removing
it makes the base class entirely final. We now only override the
eagerGlobalOrdinals method on types that actually support it.
2021-03-26 17:03:29 +00:00
Francisco Fernández Castaño 3f8a9256ea
Add searchable snapshots integration tests for URL repositories (#70709)
Relates #69521
2021-03-26 15:23:44 +01:00
Dan Hermann eb7d8beb0c
Fix typo in validation for destination port (#70883) 2021-03-26 07:27:03 -05:00
Nik Everett 91c700bd99
Super randomized tests for fetch fields API (#70278)
We've had a few bugs in the fields API where it doesn't behave like we'd
expect. Typically this happens because it isn't obvious what we expect. So
we'll try and use randomized testing to ferret out what we want. This adds
a test for most field types that asserts that `fields` works similarly
to `docvalue_fields`. We expect this to be true for most fields.

It does so by forcing all subclasses of `MapperTestCase` to define a
method that makes random values. It declares a few other hooks that
subclasses can override to further randomize the test.

We skip the test for a few field types that don't have doc values:
* `annotated_text`
* `completion`
* `search_as_you_type`
* `text`
We should come up with some way to test these without doc values, even
if it isn't as nice. But that is a problem for another time, I think.

We skip the test for a few more types just because I wanted to cut this
PR in half so we could get to reviewing it earlier. We'll get to those
in a follow up change.

I've filed a few bugs for things that are inconsistent with
`docvalue_fields`. Typically that means that we have to limit the
random values that we generate to those that *do* round trip properly.
2021-03-24 14:16:27 -04:00
Martijn van Groningen be38621ab3
Improve GeoIpDownloaderIT test case (#70640)
GeoIpDownloaderIT should be able to reuse test clusters and
run tests against a test cluster with multiple nodes.
2021-03-24 13:58:46 +01:00
Przemko Robakowski 11bd61d059
Fix GeoIpDownloaderStatsIT.testStats (#70809)
A node in the GeoIpStats response can have no databases field if no databases have been downloaded to that node yet. We have to check that the key is present before processing it to avoid an NPE.

Fixes #70789
2021-03-24 13:37:33 +01:00
Dan Hermann 1117915a9d
Remove obsolete BWC checks for ingest (#70779) 2021-03-24 07:23:21 -05:00
Przemko Robakowski f5b7aad8b7
Add stats endpoint to GeoIpDownloader (#70282)
This change adds _geoip/stats endpoint that can be used to collect basic data about geoip downloader (successful, failed and skipped downloads, current db count and total time spent downloading).
It also fixes missing/wrong origins for clients that will break if used with security.

Relates to #68920
2021-03-23 14:34:32 +01:00
Dan Hermann 2be24f0bb7
MurmurHash3 support for fingerprint processor (#70632) 2021-03-23 08:17:29 -05:00
Luca Cavanna edb42690bc
Split RuntimeFieldType from corresponding MappedFieldType (#70695)
So far the runtime section supports only leaf field types, hence the internal representation is based on `RuntimeFieldType`, which directly extends `MappedFieldType`. This is straightforward but limiting for e.g. an alias field that points to another field, or for object fields, which are not directly queryable and hence should not be a `MappedFieldType`, even though their subfields should be.

This commit makes `RuntimeFieldType` an interface, effectively splitting the definition of a runtime field as defined and returned in the mappings from its internal representation in terms of `MappedFieldType`.

The existing runtime script field types still extend `MappedFieldType` and now also implement the new interface, which makes the change rather simple.
2021-03-23 10:57:44 +01:00
Jack Conradson 8092b4e01b
Add a new ANTLR lexer for Painless suggestions (#70517)
This adds a new lexer for Painless suggestions that differs from the standard Painless lexer in that it supports types within the lexer from a PainlessLookup. This is possible because a PainlessLookup is always required to generate appropriate suggestions. This also makes the required state machines coming up much simpler to implement.
2021-03-18 14:02:29 -07:00
Stuart Tettemer d599adc8a8
Script: Always dup new objects (#70479)
A script that creates a new object but doesn't use it,
such as by assigning it to a variable, would fail to compile
with an `ArrayIndexOutOfBoundsException`.

`new ArrayList(); return 1;` is an example that
currently fails.

This is because we do not `dup` the result of
`new` if the new object is unread.

This is a problem because we always `pop` the
operand stack at the end of a statement (see ASM
phase's `visitStatement`). The number of `pop`s
depends on the type of expression in the statement.

A new object needs to be `pop`ed once.
However, the operand stack is empty if the
new object is not read in the statement.

This change always `dup`s the result of `new`
so `visitStatement` has something to `pop`.

Fixes: #70478
2021-03-18 12:03:34 -05:00
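The stack imbalance described in #70479 can be reproduced with a toy model of the operand stack. This is a deliberately simplified simulation, not the Painless or ASM code: the instruction names are abbreviated, constructor arguments are ignored, and an `IndexError` stands in for the `ArrayIndexOutOfBoundsException` seen at compile time.

```python
def run(instructions):
    """Simulate operand-stack depth for a tiny subset of JVM-like instructions."""
    stack = []
    for op in instructions:
        if op == "new":
            stack.append("ref")      # push a reference to the new object
        elif op == "dup":
            stack.append(stack[-1])  # duplicate the top of the stack
        elif op == "invokespecial_init":
            stack.pop()              # the constructor call consumes one reference
        elif op == "pop":
            if not stack:
                raise IndexError("pop on empty operand stack")
            stack.pop()              # end-of-statement cleanup
        else:
            raise ValueError(f"unknown instruction: {op}")
    return len(stack)


# `new ArrayList();` as a statement, with the result `dup`ed before construction.
with_dup = ["new", "dup", "invokespecial_init", "pop"]
# The same statement when the `dup` is skipped because the object is unread.
without_dup = ["new", "invokespecial_init", "pop"]
```

Running `with_dup` leaves the stack empty after the final `pop`, whereas `without_dup` fails on that `pop` because the constructor call already consumed the only reference.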
Francisco Fernández Castaño b1c4cb4451
Take into account range start to compute the current stream end on url repositories. (#70509)
Closes #70310
2021-03-18 15:44:03 +01:00
Dan Hermann 2c3f297090
Convert processor supports validation of IPv4/IPv6 addresses (#69989) 2021-03-18 01:19:14 -05:00
Martijn van Groningen 3d3ec5c4fe
Take the node id into account when creating geoip tmp dir. (#70462)
This change adjust where the geoip tmp directory is created
to avoid issues when running multiple nodes on the same machine.

In the java tmp dir, a 'geoip-databases' directory is created and
directly under this directory a directory with the node id as name is created.
This allows safely running multiple nodes on the same machine (this
happens mainly during tests).

Closes #69972
Relates to #68920
2021-03-17 13:20:56 +01:00
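The directory layout from #70462 amounts to one extra path component per node. The sketch below only computes the path; the function name and the `tmp_root` parameter are hypothetical, and nothing is created on disk.

```python
import tempfile
from pathlib import Path


def geoip_tmp_dir(node_id, tmp_root=None):
    """Return <tmp>/geoip-databases/<node_id> without creating it."""
    root = Path(tmp_root) if tmp_root is not None else Path(tempfile.gettempdir())
    return root / "geoip-databases" / node_id
```

Two nodes on the same machine share the `geoip-databases` parent but never the same leaf directory, which is what makes running several nodes side by side safe.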
David Kyle 05637bc713
Mute URLBlobContainerRetriesTests::testReadRangeBlobWithRetries (#70374) 2021-03-15 11:41:55 +00:00
Przemyslaw Gomulka 91956145f0
[Rest Compatible Api] update and delete by query using size field (#69606)
A follow-up to #69037, which added back the size field for the reindex API.
The original PR #43373 also removed the size field from the update by query and delete by query APIs.
This commit allows the size field to be used with the Compatible API for the update_by_query and delete_by_query APIs.
2021-03-15 09:17:53 +01:00
Dan Hermann 79689fd899
Fix failing SetProcessorTests.testCopyFromOtherField test (#70150) 2021-03-12 06:19:07 -06:00
Przemko Robakowski 1787d7f988
Make GeoIp tests more robust (#70305)
This change modifies GeoIpDownloaderIT to keep waiting in assertBusy despite errors (by wrapping the whole body in try-catch) and adds an additional assertion to debug the failures tracked in #69594
2021-03-11 13:02:23 +01:00
Rory Hunter d181b947c2
Remove depth limit from checkstyle negation rule (#70274)
The Checkstyle rule that bans unary negation in favour of an explicit
`== false` has a `maximumDepth` of 2 configured, which meant that it
didn't catch all violations. The `maximumDepth` isn't required (actually
it has a really high default), so this change removes the limit and
fixes the resulting violations.
2021-03-10 22:06:50 +00:00
Mayya Sharipova 1de0b616eb
Add positive_score_impact to rank_features type (#69994)
The rank_features field type lacks the positive_score_impact parameter
that the rank_feature type has. This change adds this parameter.

Closes #68619
2021-03-10 14:55:54 -05:00
Dan Hermann 32739ce2dc
Fix handling of non-integer port values in community_id processor (#70148) 2021-03-10 07:57:22 -06:00
Martijn van Groningen 9b0ee0d14f
Muted GeoIpDownloaderIT#testUseGeoIpProcessorWithDownloadedDBs(...) test,
see #69972
2021-03-10 14:13:01 +01:00
Martijn van Groningen 71c3854d76
Fix GeoIpDownloaderIT#testUseGeoIpProcessorWithDownloadedDBs(...) test (#70215)
The test failure looks legit, because there is a possibility that the same database
was downloaded twice. See added comment in DatabaseRegistry class.

Relates to #69972
2021-03-10 13:30:25 +01:00
Martijn van Groningen 0c82c4c789
Fix DatabaseRegistryTests (#70180)
This test predefined the expected md5 hashes in constants, which were the hashes expected with java15.
However, java16 creates different md5 hashes, so the expected md5 hashes don't match
the actual md5 hashes, which caused tests in this test suite to fail (when running
with java16 only).

The test now generates the expected md5 hash during the test instead of using predefined constants.

Closes #69986
2021-03-10 11:37:05 +01:00
Jim Ferenczi ff50da5a77
Remove the _parent_join metadata field (#70143)
This commit removes the metadata field _parent_join
that was needed to ensure that only one join field is used in a mapping.
It is replaced with a validation at the field level.
This change also fixes a [bug](https://github.com/elastic/kibana/issues/92960) in the handling of parent join fields in _field_caps.
This metadata field throws an unexpected exception in [7.11](https://github.com/elastic/elasticsearch/pull/63878)
when checking if the field is aggregatable.
That's now fixed since this unused field has been removed.
2021-03-10 09:19:30 +01:00
Martijn van Groningen e3a375e279
Fix ReloadingDatabasesWhilePerformingGeoLookupsIT (#70163)
Wait for ingest threads to stop using the DatabaseReaderLazyLoader, so that during the next run the db update thread doesn't try to remove the db again (because the file hasn't yet been deleted).

Also delete the tmp dirs this test creates at the end of the test, so that when repeating this test many times, it doesn't accumulate many directories with database files.

Closes #69980
2021-03-09 21:13:32 +01:00
Alan Woodward 139ff8657a
Require `meta` field for MappedFieldType to be non-null (#70145)
The transport action for FieldCapabilities assumes the meta field for a MappedFieldType
is traversable. This commit adds a requirement to MappedFieldType itself to ensure that
it is implemented for all subtypes.
2021-03-09 15:40:03 +00:00
Przemko Robakowski 2950308be9
Fix clean up of old entries in DatabaseRegistry.initialize (#70135)
This change switches clean up in DatabaseRegistry.initialize from using Files.walk and stream operations to Files.walkFileTree which can be made more robust in case of errors
2021-03-09 13:54:45 +01:00
Przemko Robakowski dec44d067b
Fix GeoIpDownloaderExecutor.setEnabled (#70111)
When the geoip.downloader.enabled setting changes we should try to start/stop the geoip task from a single node only; other requests will definitely fail.
This change also extends the timeout in GeoIpDownloaderIT, as the current short one sometimes fails in CI
2021-03-08 22:28:41 +01:00
Francisco Fernández Castaño ae5308c638
Add support for range reads and retries to URL repositories (#69521) 2021-03-08 13:14:12 +01:00
David Turner 60d53c0206
Stop double-starting transport service in tests (#70056)
Today in tests we often use a utility method that creates and starts a
transport service, and then we start it again in the tests anyway. This
commit removes this unnecessary code and asserts that we only ever call
`TransportService#acceptIncomingRequests` once.
2021-03-08 11:04:43 +00:00
Martijn van Groningen 22c63aa711
Muted a few DatabaseRegistryTests tests, see #69986 2021-03-04 19:03:00 +01:00
Dan Hermann be917d4a09
Test for final pipelines when target index is changed (#69457) 2021-03-04 10:40:50 -06:00
Martijn van Groningen d5995c5922
Mute ReloadingDatabasesWhilePerformingGeoLookupsIT#test(), see #69980 2021-03-04 17:00:30 +01:00
Martijn van Groningen 9fb707a08a
Muted GeoIpDownloaderIT#testUseGeoIpProcessorWithDownloadedDBs() test, see #69972 2021-03-04 15:56:31 +01:00
Martijn van Groningen 6c35c25081
Add DatabaseRegistry for locally managing databases managed by GeoIpDownloader (#69540)
This component is responsible for making the databases maintained by GeoIpDownloader
available for ingest processors.

Also provides a lookup mechanism for geoip processors with fallback to {@link LocalDatabases}.
All databases are downloaded into a geoip tmp directory, which is created at node startup.

The following high level steps are executed after each cluster state update:
1) Check which databases are available in {@link GeoIpTaskState},
   which is part of the geoip downloader persistent task.
2) For each database check whether the databases have changed
   by comparing the local and remote md5 hash or are locally missing.
3) For each database identified in step 2 start downloading the database
   chunks. Each chunk is appended to a tmp file (inside geoip tmp dir) and
   after all chunks have been downloaded, the database is uncompressed and
   renamed to the final filename. After this the database is loaded and
   if there is an old instance of this database then that is closed.
4) Cleanup locally loaded databases that are no longer mentioned in {@link GeoIpTaskState}.

Relates to #68920
2021-03-04 15:01:13 +01:00
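Steps 2 and 4 of the DatabaseRegistry description in #69540 boil down to a comparison of two name-to-md5 maps. The sketch below shows that comparison only; `plan_sync` and its arguments are hypothetical names, and the download, decompression, and reload work of step 3 is left out.

```python
def plan_sync(local_md5, remote_md5):
    """Decide which databases to download and which to clean up.

    Both arguments map a database name to its md5 hex digest: `local_md5`
    for databases loaded on this node, `remote_md5` for those listed in
    the task state.
    """
    # Step 2: changed or locally missing databases must be downloaded.
    to_download = sorted(
        name for name, md5 in remote_md5.items() if local_md5.get(name) != md5
    )
    # Step 4: databases no longer mentioned in the task state are removed.
    to_remove = sorted(name for name in local_md5 if name not in remote_md5)
    return to_download, to_remove
```

A database whose local and remote hashes match appears in neither list, so an unchanged cluster state triggers no downloads.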
Stuart Tettemer 9370b1c006
Script: no compile rate limit for ingest templates (#69841)
* Script: no compile rate limit for ingest templates

Remove the compilation rate limit for ingest templates.

Creates a new context, `ingest_template`, with an unlimited
compilation rate limit.

The `template` context is used in many places so it cannot be
exempted from rate limits.

Fixes: #64595
2021-03-04 07:58:34 -06:00
Przemko Robakowski d683ee12e0
Use OpType.CREATE in GeoIpDownloader (#69951)
When indexing new chunks in GeoIpDownloader we should never have to overwrite old chunks; if we try to, that means there are 2 simultaneous executions. This change forces one of them to throw an error in such a case.
2021-03-04 10:53:17 +01:00
Nik Everett 10e2f90560
Speed up aggs with sub-aggregations (#69806)
This allows many of the optimizations added in #63643 and #68871 to run
on aggregations with sub-aggregations. This should:
* Speed up `terms` aggregations on fields with less than 1000 values that
  also have sub-aggregations. Locally I see 2 second searches run in 1.2
  seconds.
* Apply that same speedup to `range` and `date_histogram` aggregations but
  it feels less impressive because the point range queries are a little
  slower to get up and go.
* Massively speed up `filters` aggregations with sub-aggregations that
  don't have a `parent` aggregation or collect "other" buckets. Also
  save a ton of memory while collecting them.
2021-03-03 18:04:47 -05:00
Dan Hermann 0c7e9d891d
Mute failing SetProcessorTests.testCopyFromOtherField (#69877) 2021-03-03 08:18:05 -06:00
Dan Hermann 926dc4f65b
Set processor's copy_from should deep copy non-primitive mutable types (#69349) 2021-03-03 07:09:32 -06:00
Przemko Robakowski 02dbe33780
Update GeoIP database service URL (#69862)
This change updates GeoIP database service URL to the new https://geoip.elastic.co/v1/database and removes (now optional) key/UUID parameter.
It also fixes geoip-fixture to provide 3 different test databases (City, Country and ASN).
It also unmutes GeoIpDownloaderIT.testGeoIpDatabasesDownload with additional logging and increased timeouts, which tries to address #69594
2021-03-03 14:02:34 +01:00
Yang Wang c0fc847883 Add comment to explain the extra test check for FIPS 2021-03-03 12:48:26 +11:00
Martijn van Groningen 32dc9f5a21
Change reloading databases while performing ingest test (#69794)
to reload dbs via the `LocalDataBases#updateDatabase(...)`
method instead of relying on picking up changes from file system.

Closes #69475
2021-03-02 15:51:56 +01:00
Luca Cavanna 7ee55a7232
Move runtime fields yaml tests to runtime-fields-common module (#69790)
The runtime fields yaml tests are part of the x-pack yaml test suite, but the runtime fields code has been moved to server. This commit moves the yaml tests to the runtime-fields-common module. The reason why all the tests are moved to the module (rather than only the grok and dissect one which the module provides) is that painless is not available in the ordinary yaml tests and these tests rely on painless heavily.

The only tests that are left behind are the telemetry ones, as telemetry is the last bit that needs to be migrated.
2021-03-02 15:19:12 +01:00
Luca Cavanna a46977f8cd
Move grok and dissect runtime fields to specific module (#69673)
As a follow-up of moving runtime fields to server, we'd like to remove the xpack plugin portions that are left. One part of this is the grok and dissect implementation which depends on grok and dissect libraries that painless does not have available. A new runtime-fields-common module is created to hold their implementations and plug in the necessary whitelists.
2021-03-02 10:08:24 +01:00
Jay Modi 1487a5a991
Introduce system index types including external (#68919)
This commit introduces system index types that will be used to
differentiate behavior. Previously system indices were all treated the
same regardless of whether they belonged to Elasticsearch, a stack
component, or one of our solutions. Upon further discussion and
analysis this decision was not in the best interest of the various
teams and instead a new type of system index was needed. These system
indices will be referred to as external system indices. Within external
system indices, an option exists for these indices to be managed by
Elasticsearch or to be managed by the external product.

In order to represent this within Elasticsearch, each system index will
have a type and this type will be used to control behavior.

Closes #67383
2021-03-01 10:38:53 -07:00
Armin Braun bb77ab46e0
Stop Ignoring Exceptions on Close in Network Code (#69665)
We should not be ignoring and suppressing exceptions on releasing
network resources quietly in these spots.

Co-authored-by: David Turner <david.turner@elastic.co>
2021-03-01 14:38:18 +01:00
Armin Braun 4ce162a3dc
Forbid ActionListener#onFailure to Throw in more Places (#69420)
Same assertion we have in `.map` we can extend to more places
with recent changes to prevent unexpected listener behavior going forward.
2021-02-27 21:21:09 +01:00
Martijn van Groningen 2fd633791e
Muted test see #69475 2021-02-25 19:55:51 +01:00
Christoph Büscher c67b2384fe
Make keyword_marker filter updateable (#65457)
Currently we don't allow `keyword_marker` filter file resources to be reloaded via the
`_reload_search_analyzers` API. It would make sense to allow reloading this when the
file content has changed to allow e.g. for updating stemmer exception rules at search time
without having to close and re-open the index in question. This change adds the updateable
flag to this token filter in the same way it is used for synonym filters. Analyzers containing
updateable keyword_marker filters would not be allowed to be used at index time but at
search time only, similar to what we allow for synonym filters.

Closes #65355
2021-02-25 16:44:11 +01:00
Martijn van Groningen 69884aded9
Overwrite database with different file in test. (#69598)
Forward port of #69525 to master branch.

When running with java 8 runtime, when overwriting a db file with the same content in a short time window,
the change isn't detected and no reload happens which makes the test fail.
This change overwrites files using different source files.

Closes #69475
2021-02-25 14:56:48 +01:00
Henning Andersen 5147370de7
Mute testGeoIpDatabasesDownload (#69595) 2021-02-25 13:14:13 +01:00
Dan Hermann e52d60f848
Add missing tests for ingest processors (#69438) 2021-02-24 08:30:05 -06:00
Przemko Robakowski a64f4dbedf
Adjust GeoIpTaskParams and GeoIpTaskState versions (#69519)
This change adjusts versions in GeoIpTaskParams and GeoIpTaskState after backporting #68424 to 7.x
2021-02-24 12:02:29 +01:00
Przemko Robakowski 6e6d5a29ee
Add ToS query parameter to GeoIP downloader (#69495)
This change adds query parameter confirming that we accept ToS of GeoIP database service provided by Infra.
It also changes integration test to use lower timeout when using local fixture.

Relates to #68920
2021-02-24 10:54:54 +01:00
Przemko Robakowski 2d6ee88ad3
Fix GeoIpProcessorNonIngestNodeIT.testLazyLoading (#69499)
This change disables GeoIP downloader in GeoIpProcessorNonIngestNodeIT which should fix failure from #69496
It also adds clean up in GeoIpDownloaderIT which disables downloader after test which should prevent similar failures in that class.

Closes #69496
2021-02-24 09:11:11 +01:00
Yang Wang 9711975619
[Test] Remove some watcher indices from comparison (#69497)
Since #67588, .triggered_watches and .watches indices are no longer created on node startup. This PR removes them from the warnings for comparison.
2021-02-24 14:25:59 +11:00
Mark Vieira 4e429afb4b
Simplify test fixture usage in ingest-geoip project (#69488) 2021-02-23 13:34:41 -08:00
Martijn van Groningen 9a18c42d34
Lower resource reload interval in geoip tests (#69480)
Lower resource reload interval setting for tests that reload config maxmind databases.

Closes #69475
2021-02-23 20:17:51 +01:00
Przemko Robakowski 2ba3e929e7
GeoIP database downloader (#68424)
This change adds component that will download new GeoIP databases from infra service
New databases are downloaded in chunks and stored in .geoip_databases index
Downloads are verified against MD5 checksum provided by the server
Current state of all stored databases is stored in cluster state in persistent task state
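
The checksum step above can be illustrated with a small sketch. It is not the downloader's actual code: the class `Md5ChunkCheck` and its methods are hypothetical names, and the chunks are plain byte arrays rather than documents read from the `.geoip_databases` index. It only shows how a digest can be computed incrementally over chunks and compared with a checksum supplied by a server.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: verifies chunked data against an expected MD5 checksum. */
final class Md5ChunkCheck {

    /** Feeds the chunks into one MD5 digest, in order, and returns it as lowercase hex. */
    static String md5Hex(byte[]... chunks) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
        for (byte[] chunk : chunks) {
            digest.update(chunk);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Returns true if the chunks hash to the checksum announced by the server. */
    static boolean matches(String expectedHex, byte[]... chunks) {
        return md5Hex(chunks).equalsIgnoreCase(expectedHex);
    }

    public static void main(String[] args) {
        byte[] first = "hello ".getBytes(StandardCharsets.UTF_8);
        byte[] second = "world".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(first, second));
    }
}
```

Hashing chunk by chunk means the whole database never has to be held in memory to verify it.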

Relates to #68920
2021-02-23 19:41:18 +01:00
Joe Gallo f3aac00f5d
Apply yaml-rest-compat-test to rest-api-spec (#69462) 2021-02-23 13:15:53 -05:00
Martijn van Groningen 683a14c504
Allow custom geoip databases to be updated at runtime. (#68901)
Custom geoip databases can be provided via the config/ingest-geoip directory,
which are loaded at node startup time. This change adds the functionality
that reloads custom databases at runtime.

Added reference counting when getting a DatabaseReaderLazyLoader instance,
to avoid closing a maxmind db reader while it is still being used.
There is a small window of time where this might happen during a database update.

A DatabaseReaderLazyLoader instance (which wraps a Maxmind db reader) from config database directory:
* Is closed immediately if there are no usages of it by any geoip processor instance as part of the database reload.
* When there are usages, then it is not closed immediately after a database reload. It is closed by the caller that did the last geoip lookup using this DatabaseReaderLazyLoader instance.
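
The closing rules above can be sketched roughly as follows. This is a minimal illustration, not the actual `DatabaseReaderLazyLoader` code: the class `RefCountedReader` and its methods `tryAcquire`, `release`, `retire` and `isClosed` are hypothetical names, and the underlying reader is replaced by a boolean flag.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reference-counted database reader. */
final class RefCountedReader {
    // Starts at 1: the reference held by whoever keeps track of the loaded databases.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean closed;

    /** Called before a lookup; fails once the last reference has been released. */
    boolean tryAcquire() {
        while (true) {
            int current = refs.get();
            if (current == 0) {
                return false;
            }
            if (refs.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Called after a lookup; whoever drops the last reference closes the reader. */
    void release() {
        if (refs.decrementAndGet() == 0) {
            closed = true; // a real implementation would close the underlying reader here
        }
    }

    /** Called once by a database reload: drops the owner's reference. */
    void retire() {
        release();
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        RefCountedReader reader = new RefCountedReader();
        reader.tryAcquire(); // a lookup is in flight
        reader.retire();     // a reload happens while the lookup is still running
        System.out.println("closed after reload: " + reader.isClosed());
        reader.release();    // the last lookup finishes
        System.out.println("closed after last lookup: " + reader.isClosed());
    }
}
```

Starting the count at one lets the same `release` path cover both cases: with no lookups in flight the reload itself closes the reader, otherwise the last caller does.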
2021-02-23 15:16:52 +01:00
Dan Hermann adf012b847
Consolidate ingest processors into single module (#69121) 2021-02-23 07:35:50 -06:00
Armin Braun ad9760325c
Some Enhancements to ActionListener (#69103)
This adds best effort `toString` rendering to the various wrapping
action listeners to make `TRACE` logging more useful (e.g. when logging rejected executions).
Currently such logging only prints the top level listener's `toString`, which isn't helpful
for finding the origin of a listener in case of wrapped listeners.
Also this change makes the `delegateX` methods less verbose to use and makes use of them
in a few spots where they weren't yet used.
2021-02-23 07:42:23 +01:00
Mark Vieira 2b5a103591
Ensure that rest compatibility test task classpath is setup correctly (#69306) 2021-02-22 10:49:33 -08:00
Luca Cavanna 47e28f5663
Remove DynamicRuntimeFieldsBuilder (#69350)
The extension point that used to allow to plug in the dynamic:runtime behaviour has been removed. The corresponding abstraction (interface + x-pack impl) can now be removed.
2021-02-22 16:22:10 +01:00
Luca Cavanna 1e610596eb
Move runtime fields to server (#69223)
This commit makes a start on moving runtime fields to server.

The runtime field types are now built-in. The dynamic fields builder extension (needed to make dynamic:runtime work) is removed: further simplifications are needed but are left for a follow-up. It is still possible to plug in custom runtime field types through the same extension point that the xpack plugin used to use.

The runtime fields script contexts are now also moved to server, and the corresponding whitelists are part of painless-lang with the exception of grok and dissect which will require additional work.

The runtime fields xpack plugin still exists to hold the grok and dissect extensions needed, which depend on grok and dissect lib and we will address as a follow-up. Also, the qa tests and yaml tests are yet to be moved.

Despite the need for additional work, runtime fields are fully functional and users should not be impacted by this change.
2021-02-22 12:19:07 +01:00
Jack Conradson 4405414915
Improve null def access error message (#69226)
This change adds a checkNull method that will check any def reference when accessed for null. If the 
def reference is null it throws a more descriptive error message with the name of the method/field 
that is accessed.

For the megamorphic cache, we use the checkNull as the filter instead of just getClass when looking 
up if we need to compute a new MethodHandle for a changed type.

Fixes: #53129
2021-02-19 08:15:24 -08:00
Mark Vieira f2b27c18c2 Mute reindex rest compatibility tests 2021-02-18 10:10:14 -08:00
David Turner d3e0a571eb
URL repos and searchable snapshots don't mix (#69197)
Provides docs and a better error message regarding using URL
repositories with searchable snapshots.

Relates #68918
2021-02-18 17:50:50 +00:00
Jason Tedor 0cd4863585
Introduce ES_JAVA_HOME (#68954)
This commit introduces a dedicated environment variable ES_JAVA_HOME to
determine the JDK used to start (if not using the bundled JDK). This
environment variable will replace JAVA_HOME. The reason that we are
making this change is because JAVA_HOME is a common environment variable
and sometimes users have it set in their environment from other JDK
applications that they have installed on their system. In this case,
they would accidentally end up not using the bundled JDK despite their
intentions. By using a dedicated environment variable specific to
Elasticsearch, we avoid this potential for conflict. With this commit,
we introduce the new environment variable, and deprecate the use of
JAVA_HOME. We will remove support for JAVA_HOME in a future commit.
2021-02-17 12:41:23 -05:00
Stuart Tettemer dde3df29c5
Scripting: augmented javadoc to api spec (#69082)
Accept system property `packageSources` with mapping of
package names to source roots.

Format `<package0>:<path0>;<package1>:<path1>`.
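
As an illustration of the format above, a value of this shape could be parsed along the following lines. This is only a sketch under assumptions: the class `PackageSourcesParser` is a hypothetical name, not the build's actual parsing code, and it assumes each entry is split at its first colon.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for a value shaped like <package0>:<path0>;<package1>:<path1>. */
final class PackageSourcesParser {

    /** Returns package name to source root, in the order the entries were given. */
    static Map<String, String> parse(String value) {
        Map<String, String> roots = new LinkedHashMap<>();
        if (value == null || value.trim().isEmpty()) {
            return roots;
        }
        for (String entry : value.split(";")) {
            // Assumption: the first colon separates the package from the path,
            // so later colons stay part of the path.
            int colon = entry.indexOf(':');
            if (colon <= 0 || colon == entry.length() - 1) {
                throw new IllegalArgumentException("expected <package>:<path> but got [" + entry + "]");
            }
            roots.put(entry.substring(0, colon), entry.substring(colon + 1));
        }
        return roots;
    }

    public static void main(String[] args) {
        // The package names and paths here are made up for the example.
        System.out.println(parse("org.example.a:/src/a;org.example.b:/src/b"));
    }
}
```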

Checks ESv2 License in addition to GPLv2.
2021-02-17 10:21:27 -06:00
Przemyslaw Gomulka 587e5b4ab9
[REST Compatible API] Reindex size field (#69037)
For re-index requests, the outermost "size" field was deprecated and
renamed to "max_docs" in 7.x (#43373). With this commit, the "size" field will
remain available (but still deprecated) throughout 8.x when REST API
compatibility is requested.

relates #51816
2021-02-17 07:59:31 +01:00
Stuart Tettemer 36b69143d3
Scripting: fallback to declaring class javadoc (#68842)
Use the 'declaring' class's version of a method's javadoc if
the current class does not have javadoc for the method.

So, if HashMap.java's toString() has no javadoc, fall back to Object.java.
2021-02-16 12:36:41 -06:00
Alan Woodward 8fba6e4a6d
Handle ignored fields directly in SourceValueFetcher (#68738)
Currently, the value fetcher framework handles ignored fields by reading
the stored values of the _ignored metadata field, and passing these through
on calls to fetchValues(). However, this means that if a document has multiple
values indexed for a field, and one malformed value, then the fields API will
ignore everything, including the valid values, and return an empty list for this
document.

If a document source contains a malformed value, then it must have been
ignored at index time. Therefore, we can safely assume that if we get an
exception parsing values from source at fetch time, they were also ignored
at index time and they can be skipped. This commit moves this exception
handling directly into SourceValueFetcher and ArraySourceValueFetcher,
removing the need to inspect the _ignored metadata and fixing the case
of mixed valid and invalid values.
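
The idea of skipping only the malformed values can be sketched like this. It is a simplified illustration, not the real `SourceValueFetcher`: the class `LenientValueFetch` and the `parser` function are hypothetical, and any runtime exception from parsing is treated as "this value was ignored at index time".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch: keep the values that parse, skip the ones that throw. */
final class LenientValueFetch {

    static <T> List<T> fetchValues(List<Object> sourceValues, Function<Object, T> parser) {
        List<T> values = new ArrayList<>();
        for (Object raw : sourceValues) {
            try {
                values.add(parser.apply(raw));
            } catch (RuntimeException e) {
                // Assumption taken from the description above: a value that fails to
                // parse here was also ignored at index time, so it is skipped.
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<Object> source = Arrays.<Object>asList("1", "not-a-number", "3");
        List<Integer> parsed = fetchValues(source, value -> Integer.parseInt(value.toString()));
        System.out.println(parsed);
    }
}
```

Handling the failure per value, rather than per document, is what keeps the valid values in the response.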
2021-02-16 15:19:15 +00:00
Christoph Büscher 45b1c46f1e
Relaxing score comparisons for rank_eval tests (#68976)
The currently allowed delta in these tests is too strict to account for slight floating point
differences across different platforms, e.g. ARM.

Closes #68936
2021-02-16 12:47:14 +01:00
Alan Woodward dbff7bea37
Rename DocValueFetcher.Leaf to FormattedDocValues (#68818)
Also moves it to a top-level interface in fielddata. It is not only used by
DocValueFetcher any more, and Leaf does not really describe what
it does or what it provides.
2021-02-15 10:03:25 +00:00
Gordon Brown 3f6472de74
Introduce "Feature States" for managing snapshots of system indices (#63513)
This PR expands the meaning of `include_global_state` for snapshots to include system indices. If `include_global_state` is `true` on creation, system indices will be included in the snapshot regardless of the contents of the `indices` field. If `include_global_state` is `true` on restoration, system indices will be restored (if included in the snapshot), regardless of the contents of the `indices` field. Index renaming is not applied to system indices, as system indices rely on their names matching certain patterns. If restored system indices are already present, they are automatically deleted prior to restoration from the snapshot to avoid conflicts.

This behavior can be overridden to an extent by including a new field in the snapshot creation or restoration call, `feature_states`, which contains an array of strings indicating the "feature" for which system indices should be snapshotted or restored. For example, this call will only restore the `watcher` and `security` system indices (in addition to `index_1`):

```
POST /_snapshot/my_repository/snapshot_2/_restore
{
  "indices": "index_1",
  "include_global_state": true,
  "feature_states": ["watcher", "security"]
}
```

If `feature_states` is present, the system indices associated with those features will be snapshotted or restored regardless of the value of `include_global_state`. All system indices can be omitted by providing a special value of `none` (`"feature_states": ["none"]`), or included by omitting the field or explicitly providing an empty array (`"feature_states": []`), similar to the `indices` field.

The list of currently available features can be retrieved via a new "Get Snapshottable Features" API:
```
GET /_snapshottable_features
```

which returns a response of the form:
```
{
    "features": [
        {
            "name": "tasks",
            "description": "Manages task results"
        },
        {
            "name": "kibana",
            "description": "Manages Kibana configuration and reports"
        }
    ]
}
```

Features currently map one-to-one with `SystemIndexPlugin`s, but this should be considered an implementation detail. The Get Snapshottable Features API and snapshot creation rely upon all relevant plugins being installed on the master node.

Further, the list of feature states included in a given snapshot is exposed by the Get Snapshot API, which now includes a new field, `feature_states`, which contains a list of the feature states and their associated system indices which are included in the snapshot. All system indices in feature states are also included in the `indices` array for backwards compatibility, although explicitly requesting system indices included in a feature state is deprecated. For example, an excerpt from the Get Snapshot API showing `feature_states`:
```
"feature_states": [
    {
        "feature_name": "tasks",
        "indices": [
            ".tasks"
        ]
    }
],
"indices": [
    ".tasks",
    "test1",
    "test2"
]
```

Co-authored-by: William Brafford <william.brafford@elastic.co>
2021-02-11 11:55:14 -07:00
Martijn van Groningen 5529b3d583
Changed how geoip cache is integrated with geoip processor. (#68581)
This change helps facilitate allowing maxmind databases to be updated at runtime.
This will make it easier to purge the cache if a database changes.

Made the following changes:
* Changed how geoip processor integrates with the cache. The cache is moved from the geoip processor to DatabaseReaderLazyLoader class.
* Changed the cache key from ip + response class to ip + database_path.
* Moved GeoIpCache from IngestGeoIpPlugin class to be a top level class.
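
A rough sketch of why keying by ip + database_path helps with purging follows. The class `GeoCacheSketch` is hypothetical and far simpler than the real `GeoIpCache`: responses are plain strings and there is no size limit or eviction.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache keyed by ip + database path. */
final class GeoCacheSketch {

    static final class Key {
        final String ip;
        final String databasePath;

        Key(String ip, String databasePath) {
            this.ip = ip;
            this.databasePath = databasePath;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o instanceof Key == false) {
                return false;
            }
            Key other = (Key) o;
            return ip.equals(other.ip) && databasePath.equals(other.databasePath);
        }

        @Override
        public int hashCode() {
            return Objects.hash(ip, databasePath);
        }
    }

    private final Map<Key, String> entries = new ConcurrentHashMap<>();

    void put(String ip, String databasePath, String response) {
        entries.put(new Key(ip, databasePath), response);
    }

    String get(String ip, String databasePath) {
        return entries.get(new Key(ip, databasePath));
    }

    /** Drops every entry that was computed from the given database file. */
    void purge(String databasePath) {
        entries.keySet().removeIf(key -> key.databasePath.equals(databasePath));
    }

    public static void main(String[] args) {
        GeoCacheSketch cache = new GeoCacheSketch();
        cache.put("8.8.8.8", "/config/ingest-geoip/a.mmdb", "response from a");
        cache.put("8.8.8.8", "/config/ingest-geoip/b.mmdb", "response from b");
        cache.purge("/config/ingest-geoip/a.mmdb");
        System.out.println(cache.get("8.8.8.8", "/config/ingest-geoip/b.mmdb"));
    }
}
```

Because the database path is part of the key, entries from an updated database can be removed without touching results that came from other databases.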
2021-02-11 10:15:19 +01:00
Igor Motov 0bbc6addd9
Revert "Remove aggregation's postCollect phase" (#68615)
This partially reverts #64016 and adds #67839 and adds
additional tests that would have caught issues with the changes
in #64016. It's mostly Nik's code, I am just cleaning things up
a bit.

Co-authored-by: Nik Everett <nik9000@gmail.com>
2021-02-10 19:12:50 -05:00
Stuart Tettemer c35eebea9d
Scripting: capture structured javadoc from stdlib (#68782)
Clean javadoc tags and strip html.
Methods and constructors have an optional `javadoc` field.  All fields under
`javadoc` are optional but at least one will be present.

Fields also have optional `javadoc` field which, if present, is a string.
```
"javadoc": {
  "description": "...",

  // from @param <param name> <param description>
  "parameters": {
    "p1": "<p1 description>",
    "p2": "<p2 description>"
  },

  // from @return
  "return": "...",

  // from @throws <type> <description>
  "throws": [
    [
      "IndexOutOfBoundsException",
      "<description>"
    ],
    [
      "IOException",
      "<description>"
    ]
  ]
}
```
2021-02-10 09:20:52 -06:00
Julie Tibshirani 936abca50a
Rename MatchQuery -> MatchQueryParser. (#68716)
This commit renames `MatchQuery` to make it clear it's not a query. Its purpose
is actually to produce Lucene queries through its `parse` method.

It also renames `MultiMatchQuery` -> `MultiMatchQueryParser`.
2021-02-09 08:56:00 -08:00
Stuart Tettemer 11524758e1
Scripting: enforce GPLv2 for parsed stdlib docs (#68601)
When parsing Java standard library javadocs, we need to ensure
that our use will comply with the license.  Our use complies
with GPLv2 licenses but may not comply with proprietary licenses.

Reject .java files that have non-GPL licenses when parsing them
for parameter names and javadoc comments.
2021-02-08 11:08:01 -06:00
Rory Hunter 2d44cce31e
Replace NOT operator with explicit `false` check - part 9 (#68645)
Part 9.

We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
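
For readers unfamiliar with the rule, the two spellings look like this. The class and method names are made up for the example, and both methods behave identically.

```java
/** Hypothetical example of the two ways to negate a boolean expression. */
final class ExplicitFalseCheck {

    // The form the rule flags: logical not operator.
    static boolean hasTextWithNot(String value) {
        return value != null && !value.isEmpty();
    }

    // The form the rule asks for: explicit comparison against false.
    static boolean hasText(String value) {
        return value != null && value.isEmpty() == false;
    }

    public static void main(String[] args) {
        System.out.println(hasText("abc") == hasTextWithNot("abc"));
        System.out.println(hasText("") == hasTextWithNot(""));
    }
}
```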
2021-02-08 15:28:57 +00:00
Julie Tibshirani af1cc495b2
Remove support for _type in searches (#68564)
Types are no longer allowed in requests in 8.0, so we can remove support for
using the `_type` field within a search request.

Relates to #41059.
Closes #68311.
2021-02-05 12:13:05 -08:00
Christoph Büscher e2d5183af0
Return structured nested data in ‘fields’ API
At the moment, the ‘fields’ API handles nested fields the same way it handles non-nested object arrays: it just returns them in a flat list. However, the relationship between nested fields is something we should try to preserve, since this is the main purpose of mapping something as “nested” instead of just using an object.

This PR changes this by returning grouped field values that are inside a nested object according to the nested object they initially appear in. Any further object structures inside a nested object are again returned as a flattened list. Fields inside nested fields don’t appear in the flattened response outside of the nested path any more. The grouping of fields inside nested objects is applied recursively if nested mappings are defined inside another nested mapping.

Closes #63709
2021-02-05 11:05:03 +01:00
Nik Everett a48a489e79
Painless: improve error message on non-constant (#68517)
As of #68088 painless can have methods where all parameters must be
constant. This improves the error message when the parameter isn't. It's
still not super great, but it's better and it's what we can easily give
at that point in the compiler.
2021-02-04 11:08:32 -05:00
Nik Everett fdb147ad6a
Painless: improve bad regex pattern syntax error (#68520)
Adds extra information about the actual error in the pattern to the
error painless returns when you specify a bad pattern. This information
was hiding in the exception that `Pattern.compile` throws but isn't
included in its message so we were never showing it to anyone. These
error messages include such gems as:

* named capturing group <name> does not exist
* Look-behind group does not have an obvious maximum length
* Unclosed counted closure

Now you'll get to know what you need to change about your pattern and
not just where it went wrong!
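
In plain Java, the extra detail carried by the exception from `Pattern.compile` can be read roughly like this. The sketch is not Painless's error-handling code: the class `RegexErrorDetail` and the string it builds are hypothetical, while `getDescription()` and `getIndex()` are standard `PatternSyntaxException` accessors.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical helper that surfaces what is wrong with a pattern, not just that it is wrong. */
final class RegexErrorDetail {

    static String describe(String regex) {
        try {
            Pattern.compile(regex);
            return "ok";
        } catch (PatternSyntaxException e) {
            // getDescription() says what is wrong, getIndex() says roughly where.
            return e.getDescription() + " near index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("a+"));
        System.out.println(describe("a{1"));
    }
}
```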
2021-02-04 09:49:13 -05:00
Nik Everett e686e18819
Simpler regex constants in painless (#68486)
Replaces the double `Pattern.compile` invocations in painless scripts
with the fancy constant injection we added in #68088. This caused one of
the tests to fail. It turns out that we weren't fully iterating the IR
tree during the constant folding phases. I started experimenting and
added a ton of tests that failed. Then I fixed them by changing the IR
tree walking code.
2021-02-03 16:51:01 -05:00
nagads 2af3e8e7e2
[Painless] Augmentation.join can't handle empty strings at the start (#68251)
Fixes #33434
2021-02-03 12:30:54 -05:00
Mark Vieira a92a647b9f Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

 - Updating LICENSE and NOTICE files throughout the code base, as well
   as those packaged in our published artifacts
 - Update IDE integration to now use the new license header on newly
   created source files
 - Remove references to the "OSS" distribution from our documentation
 - Update build time verification checks to no longer allow Apache 2.0
   license header in Elasticsearch source code
 - Replace all existing Apache 2.0 license headers for non-xpack code
   with updated header (vendored code with Apache 2.0 headers obviously
   remains the same).
 - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 16:10:53 -08:00
Stuart Tettemer 97eda6fab2
Scripting: refactor use of stdlib extractor (#68402)
If there's no java stdlib path, `StdlibJavadocExtractor` is unnecessary.

This creates a separate code path for that case, which removes a
bunch of checking that `StdlibJavadocExtractor` is `null`.
2021-02-02 12:28:11 -06:00
Rory Hunter b4514228f0
Replace NOT operator with explicit `false` check - part 5 (#68360)
Part 5.

We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
2021-02-02 14:27:33 +00:00
Stuart Tettemer 460b33f9b2
Scripting: readable array types for context api (#68237)
Instead of fixing up array names post-hoc when generating the api spec
and painless context docs, fix them up in the `_scripts/painless/_context`
API call.
2021-02-01 15:10:23 -06:00
Nik Everett 419ce10989
Add grok and dissect methods to runtime fields (#68088)
This adds a `grok` and a `dissect` method to runtime fields which
returns a `Matcher` style object you can use to get the matched
patterns. A fairly simple script to extract the "verb" from an apache
log line with `grok` would look like this:
```
String verb = grok('%{COMMONAPACHELOG}').extract(doc["message"].value)?.verb;
if (verb != null) {
  emit(verb);
}
```

And `dissect` would look like:
```
String verb = dissect('%{clientip} %{ident} %{auth} [%{@timestamp}] "%{verb} %{request} HTTP/%{httpversion}" %{status} %{size}').extract(doc["message"].value)?.verb;
if (verb != null) {
  emit(verb);
}
```

We'll work later to get it down to a clean looking one liner, but for
now, this'll do.

The `grok` and `dissect` methods are special in that they only run at
script compile time. You can't pass non-constants to them. They'll
produce compile errors if you send in a bad pattern. This is nice
because they can be expensive to "compile" and there are many other
optimizations we can make when the patterns are available up front.

Closes #67825
2021-02-01 14:16:01 -05:00
Ignacio Vera 747773d5af
Upgrade to Lucene 8.8.0 (#68272) 2021-02-01 13:36:03 +01:00
Nik Everett b52ea0b02b
Fix painless build in eclipse (#68166)
Painless now has a `doc` source which has its own dependencies. That's
lovely for everything but Eclipse doesn't understand that sort of thing.
This adds the docs dependencies to the regular build path when building
with Eclipse.
2021-01-29 09:40:24 -05:00
Alan Woodward d981cf2dff
Remove intermediate SearchLookup classes (#68052)
SearchLookup has two intermediate classes, DocMap and StoredFieldsLookup, that
are simple factories for their Leaf implementations. They are never accessed
outside SearchLookup, with the exception of two calls on DocMap that can be
easily refactored. This commit removes them, making SearchLookup.getLeafSearchLookup
directly responsible for creating the leaf lookups.
2021-01-29 10:44:05 +00:00
Stuart Tettemer 7c98d9a052
Scripting: Parse stdlib files for parameter names (#67837)
* Scripting: Parse stdlib files for parameter names

* Task `generateContextApiSpec` takes optional system parameter
  `jdksrc` with path to extracted java standard library source
  files.
* stdlib source files are the source of `parameter_names` list
  and the `javadoc` value.
* javadoc values may contain newlines and markup such as
  `{@code XX}`, `<p>`, `@throws`

Example method:
```
{
  "declaring": "Appendable",
  "name": "append",
  "return": "Appendable",
  "javadoc": "Appends a subsequence of the specified character sequence to this...",
  "parameters": ["CharSequence", "int", "int" ],
  "parameter_names": ["csq", "start", "end"]
}
```
2021-01-28 13:38:47 -06:00
Gordon Brown 4fe7a612fc
Allow population of Enrich indices to work with System Index protections (#67406)
This PR does three things:
1) Tweaks existing reindex infrastructure so that different clients can be used for the "search" part and the "index" part of a reindex operation, and
2) Modifies Enrich to take advantage of this to perform the "search" part in the security context of the current user (so that DLS/FLS etc. are properly applied) while performing the "index" part in the security context of the Enrich plugin (so that access to system indices, and `.enrich-*` in particular, is allowed regardless of the permissions of the current user).
3) Adds integration tests for the above, to verify that Enrich does not leak info protected by DLS and/or FLS.

Co-authored-by: Jay Modi <jay.modi@elastic.co>
2021-01-28 10:17:26 -07:00
Alan Woodward f6aed63442
Simplify getting persistent map from SourceLookup (#67331)
SourceLookup has two different ways of getting its internal source as an object
that will persist once the lookup has moved to a separate object. The first,
source(), can return null, and so only works after the source map has been
set explicitly, or one of the other map functions has been called. The second,
loadSourceIfNeeded() will never return null, and can be used to lazily load
values from stored fields.

In the past, the distinction between these two methods was important because
you could use a null check on source() to see if the source field was enabled.
We can now do this by checking the isSourceEnabled() method on
SearchExecutionContext.

This commit merges both fields into a single source() method, with the semantics
of the old loadSourceIfNeeded() method.
2021-01-27 11:08:22 +00:00
Rory Hunter ad1f876daa
Replace NOT operator with explicit `false` check (#67817)
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.

We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
2021-01-26 14:47:09 +00:00
Dan Hermann b330493a4b
Rename mime_type configuration option to media_type (#67860) 2021-01-25 11:29:12 -06:00
Przemyslaw Gomulka c2c50d5aed
Make scripted search templates work with new mediaType from XContentType.JSON (#67677)
Stored scripts can have a content_type option set; when it is empty, it defaults to XContentType.JSON#mediaType(). Commit 5e74f79 changed this method in master (ES8) to return application/json;charset=utf-8 (previously application/json; charset=UTF-8).
This means that after upgrading ES from version 7 to 8, a stored script will fail when used, because the encoder is matched by string equality (as a map key).

This commit addresses this by also adding the old application/json; charset=UTF-8 back into the encoders map.

closes #66986
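
The lookup problem described above can be sketched as follows. The class `EncoderLookup`, its `encode` method and the toy JSON escaper are hypothetical and only stand in for the real encoders map; the two media type strings are the ones quoted in the message.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical encoders map that is keyed by the exact media type string. */
final class EncoderLookup {

    static final Map<String, UnaryOperator<String>> ENCODERS = new HashMap<>();

    static {
        // Toy escaper for illustration only, not a complete JSON string encoder.
        UnaryOperator<String> json = s -> s.replace("\\", "\\\\").replace("\"", "\\\"");
        ENCODERS.put("application/json;charset=utf-8", json);  // new form
        ENCODERS.put("application/json; charset=UTF-8", json); // old form, kept for stored scripts
    }

    static String encode(String contentType, String value) {
        UnaryOperator<String> encoder = ENCODERS.get(contentType);
        if (encoder == null) {
            throw new IllegalArgumentException("No encoder found for MIME type [" + contentType + "]");
        }
        return encoder.apply(value);
    }

    public static void main(String[] args) {
        // Both spellings resolve to the same encoder.
        System.out.println(encode("application/json;charset=utf-8", "a \"quoted\" word"));
        System.out.println(encode("application/json; charset=UTF-8", "a \"quoted\" word"));
    }
}
```

Registering both strings is the smallest change that keeps an exact-match lookup working; normalizing the media type before the lookup would be an alternative design.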
2021-01-21 12:03:38 +01:00
Jim Ferenczi e77c523bd9
Upgrade to a new lucene 8.8.0 snapshot (#67691)
This change upgrades to the latest Lucene 8.8.0 snapshot.
It also restores the compression on binary doc values that was lost in the last snapshot upgrade.
The compression is now configurable on binary doc values but we don't expose this functionality yet so this commit ensures that we pick the same compression mode as previous releases (BEST_COMPRESSION).
2021-01-19 13:33:19 +01:00
Armin Braun 6d025d3a27
Log Slowness on Sending Transport Messages (#67664)
Similar to #62444 but for the outbound path.

This does not detect slowness in individual transport handler logic,
this is done via the inbound handler logging already, but instead
warns if it takes a long time to hand off the message to the relevant
transport thread and then transfer the message over the wire.
This gives some visibility into the stability of the network
connection itself and into the reasons for slow network
responses (if they are the result of slow networking on the sender).
2021-01-19 12:19:32 +01:00
Mayya Sharipova 76482210b8
Add linear function to rank_feature query (#67438)
This adds a linear function to the set of functions available
for rank_feature query

Closes #49859
2021-01-18 11:44:13 -05:00
Rory Hunter 1a05a5ac24
Introduce deprecation categories (#67443)
Closes #64824. Introduce the concept of categories to deprecation
logging. Every location where we log a deprecation message must now
include a deprecation category.
2021-01-18 16:16:54 +00:00
Rene Groeschke ca96612245
Remove debugging printlns from build scripts 2021-01-18 15:19:19 +01:00
Rene Groeschke f83d545b81
Port UrlFixture to test fixture plugin (#67169)
- Port UrlFixture to test fixture plugin
- Avoid exposing PID and PORT for http fixture when not required
- Make AbstractHttpFixture work inside and outside docker
- Check directories when running UrlFixture
2021-01-18 14:59:18 +01:00
gf2121 92f85981a7
Avoid duplicate serialization for TermsQueryBuilder (#67223)
Avoid duplicate serialization for TermsQuery.
2021-01-18 09:04:29 +01:00
Nik Everett 217c9e0c04
Fix painless tests in eclipse (#67602)
`Augmentation.java` had a zero width space [1] in two method
definitions:
```
    public static String[] split(Pattern receiver, int limitFactor, CharSequence input, int limit) {
                                ^------- Right before the ( character
    public static Stream<String> splitAsStream(Pattern receiver, int limitFactor, CharSequence input) {
                                              ^ Right before the ( here too
```

Sadly, Eclipse and javac treat this character differently. Eclipse seems
to include it in the method name and javac seems to treat it as regular
space. This caused all the unit tests for painless to fail to load
because they couldn't find the `split` and `splitAsStream`
augmentations. But if you listed all of the methods they looked like
they were there. If you crack open the line in a hex editor you can see
it.
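
As an alternative to a hex editor, a small check like the following can locate the character. This is only an illustrative sketch; the class and method names are hypothetical and not part of the change.

```java
public class ZeroWidthSpaceSketch {
    // Returns the index of the first zero width space (U+200B) in the text, or -1 if absent.
    static int indexOfZeroWidthSpace(String text) {
        return text.indexOf('\u200B');
    }
}
```

Running this over each line of a source file would flag declarations such as the two shown above.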

Eclipse is tracking [2] similar issues.

[1]: https://en.wikipedia.org/wiki/Zero-width_space
[2]: https://bugs.eclipse.org/bugs/show_bug.cgi?id=547601
2021-01-15 14:42:18 -05:00
Julie Tibshirani 5852fbedf5
Rename QueryShardContext -> SearchExecutionContext. (#67490)
We decided to rename `QueryShardContext` to clarify that it supports all parts
of search request execution. Before there was confusion over whether it should
only be used for building queries, or maybe only used in the query phase. This
PR also updates the javadocs.

Closes #64740.
2021-01-14 09:11:59 -08:00
Luca Cavanna df7041f45a
Remove last DocumentMapper reference from MappingLookup (#67157)
As part of #66295 we made QueryShardContext perform mapping lookups through MappingLookup rather than MapperService. That helps as MapperService relies on DocumentMapper, which may change throughout the execution of the search request. At search time, the percolate query also needs to parse documents, which made us add a parse method to MappingLookup. This parse method currently relies on calling DocumentMapper#parseDocument through a function, but we would rather make this easier to follow (see https://github.com/elastic/elasticsearch/pull/66295/files#r544639868).

We recently removed the need to provide the entire DocumentMapper to DocumentParser#parse, opening the possibility for using DocumentParser directly when needing to parse a document at query time. This commit adds everything that is needed (namely Mapping, IndexSettings and IndexAnalyzers) to MappingLookup so that it can parse a document through DocumentParser without relying on DocumentMapper.

As a bonus, given that MappingLookup holds a reference to these three additional objects, we can make DocumentMapper rely on MappingLookup to retrieve those and not hold its own same references to them.
Along the same lines, given that MappingLookup holds all that's necessary to parse a document, the signature of DocumentParser#parse can be simplified by replacing most of its arguments with MappingLookup and retrieving what is needed from it.
2021-01-12 11:48:51 +01:00
Ignacio Vera 604ee06a3b
Upgrade to lucene-8.8-snapshot-f73f6b1 (#67228) 2021-01-12 08:03:00 +01:00
Dan Hermann eddab39e2f
Configurable MIME type for mustache template encoding on set processor (#65314) 2021-01-07 07:40:57 -06:00
Tim Vernum 248b6a89e8
Update template warning for FIPS in Netty test (#67067)
This changes the expected error message (on FIPS) so that the
order of the templates (and their associated patterns) matches
the (newly updated) order generated by the server.

Relates: #67066
Resolves: #66820
2021-01-07 12:01:23 +11:00
Stuart Tettemer 8a001d1a40
Scripting: whitelist Json functions for ingest (#67118) 2021-01-06 13:08:01 -06:00
Stuart Tettemer 93bc36ef6f
Scripting: Add OSS whitelist to execute API (#67038)
* Scripting: Add OSS whitelist to execute API

* Ingest
* Score
* MovFn
* Json

Fixes: #67035
2021-01-05 15:27:27 -06:00
Przemko Robakowski e1c6cbced7
Fix whitespace as a separator in CSV processor (#67045)
This change fixes a problem when using space or tab as a separator in the CSV processor: we now check whether the current character is the separator before we check whether it is whitespace.

This also improves tests to always check all combinations of separators and quotes.
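
The ordering of the two checks can be sketched as follows. This is a simplified, unquoted-field splitter written for illustration under that assumption; it is not the CSV processor's parser, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvSeparatorOrderSketch {
    // Splits an unquoted line into fields, skipping leading whitespace in each field.
    static List<String> split(String line, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == separator) {
                // Checked first, so ' ' or '\t' used as the separator still ends a field.
                fields.add(current.toString());
                current.setLength(0);
            } else if (Character.isWhitespace(c) && current.length() == 0) {
                // Leading whitespace is skipped only when it is not the separator.
                continue;
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

If the whitespace check came first, a space or tab separator would be swallowed as padding and the line would collapse into a single field.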

Closes #67013
2021-01-05 22:19:31 +01:00
Jack Conradson b0eb81301a
Fix static inner class resolution in Painless (#67027)
When removing the "lexer hack" to remove type context from the lexer, static inner class resolution 
wasn't properly accounted for. This change adds code to handle static inner class resolution.
2021-01-05 11:05:24 -08:00
David Turner b3e550c289
Make InternalClusterInfoService async (#66993)
This commit reworks the InternalClusterInfoService to run
asynchronously, using timeouts on the stats requests instead of
implementing its own blocking timeouts. It also improves the logging of
failures by identifying the nodes that failed or timed out. Finally it
ensures that only a single refresh is running at once, enqueueing later
refresh requests to run immediately after the current refresh is
finished rather than racing them against each other.
2021-01-05 17:58:30 +00:00
Przemyslaw Gomulka 5e74f79e22
Support response content-type with versioned media type (#65500)
This commit allows returning a correct requested response content-type - it did not work for versioned media types.
It is done by adding new vendor specific instances to XContent and TextFormat enums. These instances can then "format" the response content type string when provided with parameters. This is similar to what SQL plugin does with its media types.

#51816
2021-01-05 09:23:22 +01:00
Jack Conradson fbedb66075
Remove leniency for casting from def to void in Painless (#66957)
This leniency was originally for lambda and method reference conversions, but they are both special 
cased now. This change removes the unnecessary leniency of a cast from a def type to a void 
type. This also fixes (#66175).
2021-01-04 15:00:02 -08:00
Luca Cavanna dbefc05e6e
Don't require DocumentMapper as an argument when parsing a document (#66780)
Currently, an incoming document is parsed through `DocumentMapper#parse`, which in turn calls `DocumentParser#parseDocument`, providing `this` among other arguments. As part of the effort to reduce usages of `DocumentMapper` where possible, as it represents the mutable side of mappings (through mapping updates) and involves complexity, we can carry around only the needed components. This does add some required arguments to `DocumentParser#parseDocument`, though it makes dependencies clearer. This change does not affect end consumers, as they all go through DocumentMapper anyway, but by not needing to provide DocumentMapper to parseDocument, we may be able to unblock further improvements down the line.

Relates to #66295
2021-01-04 15:34:44 +01:00
Rene Groeschke eee6e11883
Port all task definitions to task avoidance api (#66738)
This finishes porting all tasks created in gradle build scripts and plugins to use 
the task avoidance api (see #56610)

* Port all task definitions to task avoidance api
* Fix last task created during configuration
* Fix test setup in  :modules:reindex
* Declare proper task inputs
2021-01-04 12:32:19 +01:00
Mark Tozzi e26c9bbd52
Rename BYTES ValuesSourceType to reflect intended usage (#66762) 2020-12-30 12:39:17 -05:00
Mayya Sharipova 5b6675ab0d
Mute testTemplateExists (#66863)
Mute Netty4HeadBodyIsEmptyIT.testTemplateExists, as it fails in FIPS
mode.

Relates to #66820
2020-12-29 10:45:16 -05:00
Tim Vernum 22bc833d85
Skip netty4 yaml test in FIPS mode (#66842)
The "Netty loaded" YAML test asserts that the configured transport is
"netty4", however when in FIPS mode, the tests enable security and the
configured transport is "security4".

This change skips the netty4 yaml test when running in FIPS mode.

Resolves: #66818
2020-12-29 18:28:23 +11:00
Przemyslaw Gomulka 8f74f18257
Fix ingest java week based year defaulting (#65717)
If year, year of era, or week-based year is not specified, the ingest Java
date processor defaults the year to the current year.
However, the current implementation mistook the weekBasedYear field for
weekOfWeekBasedYear. This has led to incorrect defaulting.
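
The two fields are easy to confuse but hold very different values. A short sketch using the standard `java.time.temporal.WeekFields` API with ISO week rules (the class name is hypothetical, and the choice of ISO rules is an assumption made for the example):

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;

public class WeekBasedYearSketch {
    // The year that the date's week belongs to under ISO week rules.
    static int weekBasedYear(LocalDate date) {
        return date.get(WeekFields.ISO.weekBasedYear());
    }

    // The week number within that week-based year (1 to 52 or 53).
    static int weekOfWeekBasedYear(LocalDate date) {
        return date.get(WeekFields.ISO.weekOfWeekBasedYear());
    }
}
```

For 2021-01-01, `weekBasedYear` is 2020 and `weekOfWeekBasedYear` is 53, so checking or defaulting one field where the other was intended does not give the expected year.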

relates #63458
2020-12-28 10:49:31 +01:00
Ioannis Kakavas bd873698bc
Ensure CI is run in FIPS 140 approved only mode (#64024)
We were depending on the BouncyCastle FIPS own mechanics to set
itself in approved only mode since we run with the Security
Manager enabled. The check during startup seems to happen before we
set our restrictive SecurityManager though in
org.elasticsearch.bootstrap.Elasticsearch , and this means that
BCFIPS would not be in approved only mode, unless explicitly
configured so.

This commit sets the appropriate JVM property to explicitly set
BCFIPS in approved only mode in CI and adds tests to ensure that we
will be running with BCFIPS in approved only mode when we expect to.
It also sets xpack.security.fips_mode.enabled to true for all test clusters
used in fips mode and sets the distribution to the default one. It adds a
password to the elasticsearch keystore for all test clusters that run in fips
mode.
Moreover, it changes a few unit tests where we would use bcrypt even in
FIPS 140 mode. These would still pass since we are bundling our own
bcrypt implementation, but are now changed to use FIPS 140 approved
algorithms instead for better coverage.

It also addresses a number of tests that would fail in approved only mode,
mainly:

    Tests that use PBKDF2 with a password less than 112 bits (14char). We
    elected to change the passwords used everywhere to be at least 14
    characters long instead of mandating
    the use of pbkdf2_stretch because both pbkdf2 and
    pbkdf2_stretch are supported and allowed in fips mode and it makes sense
    to test with both. We could possibly figure out the password algorithm used
    for each test and adjust password length accordingly only for pbkdf2 but
    there is little value in that. It's good practice to use strong passwords so if
    our docs and tests use longer passwords, then it's for the best. The approach
    is brittle as there is no guarantee that the next test that will be added won't
    use a short password, so we add some testing documentation too.
    This leaves us with a possible coverage gap since we do support passwords
    as short as 6 characters but we only test with > 14 chars but the
    validation itself was not tested even before. Tests can be added in a followup,
    outside of fips related context.

    Tests that use a PKCS12 keystore and were not already muted.

    Tests that depend on running test clusters with a basic license or
    using the OSS distribution, as FIPS 140 support is not available in
    either of these.

Finally, it adds some information around FIPS 140 testing in our testing
documentation reference so that developers can hopefully keep in
mind fips 140 related intricacies when writing/changing docs.
2020-12-23 21:00:49 +02:00
Jim Ferenczi c756ce1acf
Sort field tiebreaker for PIT (point in time) readers (#66093)
This commit introduces a new sort field called `_shard_doc` that
can be used in conjunction with a PIT to consistently tiebreak
identical sort values. The sort value is a numeric long that is
composed of the ordinal of the shard (assigned by the coordinating node)
and the internal Lucene document ID. These two values are consistent within
a PIT so this sort criteria can be used as the tiebreaker of any search
requests.
Since this sort criteria is stable we'd like to add it automatically to any
sorted search requests that use a PIT but we also need to expose it explicitly
in order to be able to:
* Reverse the order of the tiebreaking, useful to search "before" `search_after`.
* Force the primary sort to use it in order to benefit from the `search_after` optimization when sorting by index order (to be released in Lucene 8.8).

I plan to add the documentation and the automatic configuration for PIT in a follow up since this change is already big.
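
One way to picture a long built from a shard ordinal and a Lucene document ID is the packing below. The bit layout (shard ordinal in the upper 32 bits, doc ID in the lower 32 bits) is an assumption made for illustration, and the class and method names are hypothetical; the message above only states that the value is composed of the two parts.

```java
public class ShardDocSketch {
    // Assumed layout: shard ordinal in the upper 32 bits, doc ID in the lower 32 bits.
    static long encode(int shardOrdinal, int docId) {
        return ((long) shardOrdinal << 32) | (docId & 0xFFFFFFFFL);
    }

    // Recovers the shard ordinal from the upper 32 bits.
    static int shardOrdinal(long value) {
        return (int) (value >>> 32);
    }

    // Recovers the doc ID from the lower 32 bits.
    static int docId(long value) {
        return (int) value;
    }
}
```

With this layout, values order first by shard ordinal and then by doc ID, which is the property a tiebreaker needs as long as both parts stay fixed for the lifetime of the PIT.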

Relates #56828
2020-12-18 12:13:12 +01:00
Armin Braun 3819fcb582
Add Ability to Write a BytesReference to BlobContainer (#66501)
Except when writing actual segment files to the blob store
we always write `BytesReference` instead of a stream.
Only having the stream API available forces needless copies
on us. I fixed the straight-forward needless copying for
HDFS and FS repos in this PR, we could do similar fixes for
GCS and Azure as well and thus significantly reduce the peak
memory use of these writes on master nodes in particular.
2020-12-17 17:42:29 +01:00
Julie Tibshirani d0683141f4
Ensure all query builder tests consider older versions. (#66401)
This PR removes outdated overrides in some tests that prevent them from testing
older index versions. Also removes an old comment + logic from
AggregatorFactoriesTests.
2020-12-16 09:19:26 -08:00
Nik Everett 7b3c6f2a0c
Further clean up in AggregatorTestCase (#66395)
Drops `AggregatorTestCase#mapperServiceMock` because it is getting in
the way of other work I'm doing for runtime fields. It was only
overridden to test the `parent` and `child` aggregation to add the
`MappedFieldType`s for join fields in the backdoor. Those aggregations
can just as easily add those fields in the normal method calls.
2020-12-16 11:56:04 -05:00
Jim Ferenczi 6d1f43c6d2
Fix search_as_you_type field with term_vector (#66432)
This commit fixes a bug in the search_as_you_type field that was introduced during
the refactoring of the field mapper. The prefix field that is used internally
by the search_as_you_type mapper doesn't need term vector even if they are activated
on the main field. So this commit ensures that we don't copy the options from the main
field when we create the prefix sub-field.

Closes #66407
2020-12-16 17:04:51 +01:00
Martijn Laarman e31e3dea32
Add `visibility` to the rest-spec-api (#56104) 2020-12-14 12:23:28 +01:00
Rene Groeschke defaa93902
Avoid tasks materialized during configuration phase (#65922)
* Avoid tasks materialized during configuration phase
* Fix RestTestFromSnippet testRoot setup
2020-12-12 16:14:17 +01:00
Stuart Tettemer 0be70fd35b
Scripting: whitelist api spec, refs #49879 (#66120) 2020-12-09 12:24:36 -06:00
Stuart Tettemer 2aa2224b36
Scripting: Whitelist API spec gradle task (#66050)
Adds `generateContextApiSpec` gradle task that generates whitelist api
specs under `modules/lang-painless/src/main/generated/whitelist-json`.

The common classes are in `painless-common.json`, the specialized classes
per context are in `painless-$context.json`.

eg. `painless-aggs.json` has the specialization for the aggs contexts

Refs: #49879
2020-12-09 09:16:37 -06:00
Martijn Laarman 8d3def3e1f
Add Accept & Content-Type headers to rest api spec (#53979)
Co-authored-by: Russ Cam <russ.cam@elastic.co>
2020-12-09 14:43:05 +01:00
Luca Cavanna e144471b3e
Introduce dynamic runtime setting (#65489)
The dynamic:runtime setting is similar to dynamic:true in that it dynamically defines fields based on values parsed from incoming documents. Though instead of defining leaf fields under properties, it defines them as runtime fields under the runtime section. This is useful in scenarios where search speed can be traded for storage costs, given that runtime fields are loaded at runtime rather than indexed.
2020-12-08 15:29:24 +01:00
Jake Landis c35d7c0f5a
Convert module/mappers-extra to an internal cluster test (#65971)
modules/mappers-extra should be an internal cluster test, not
a javaRestTest. This test will work correctly until
you try to modify the javaRestTest test cluster. Then
it will treat the javaRestTest as an external cluster
to its own test cluster, potentially causing issues with
tests.
2020-12-08 07:59:27 -06:00
Rene Groeschke 0911d04467
Make AntFixture handling task provider api compliant (#65832)
This tweaks the AntFixture handling to make it compliant with the task avoidance api.
Tasks of type StandaloneRestTestTask are now generally finalised by using the typed ant stop task,
which allows us to remove error-prone dependsOn overrides in StandaloneRestTestTask. As a result
we also ported more task definitions in the build to the task avoidance api.

The next work item regarding AntFixture handling is porting AntFixture to a plain Gradle task and removing
Groovy AntBuilder, which will allow us to port more build logic from Groovy to Java but is out of the scope of
this PR.
2020-12-08 13:07:36 +01:00
Fan Jingbo 74141c17e8
[DOCS] Fix typos (#65951) 2020-12-07 11:39:04 -05:00
Jack Conradson a44ad560a2
Complete replacing member data with decorations in the ir tree (#64825)
This change replaces all the member data in the ir nodes with decorations instead. This completes the 
transition to a decoration system in the ir tree. This change allows for maximum flexibility when 
modifying existing phases or adding additional phases.
2020-12-03 12:01:07 -08:00
Armin Braun 06a31a0aca
Add List Append Utility Method (#65576)
(list -> copy -> add one -> wrap immutable) is a pretty common pattern in CS
updates and tests => added a shortcut for it here and used it in easily identifiable
spots.
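
The pattern can be sketched as a single helper. The name `appendToCopy` and the enclosing class are hypothetical; the actual utility added by this commit may differ in name and signature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListAppendSketch {
    // Copies the list, adds one element, and wraps the result as unmodifiable.
    static <T> List<T> appendToCopy(List<T> list, T element) {
        List<T> copy = new ArrayList<>(list.size() + 1);
        copy.addAll(list);
        copy.add(element);
        return Collections.unmodifiableList(copy);
    }
}
```

The input list is left untouched, and the returned list rejects further modification, which suits state objects that are replaced rather than mutated.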
2020-12-01 02:47:21 +01:00
Nik Everett c227554080
Remove SearchContext from constructing aggregations (#64953)
This replaces the `SearchContext` passed to the ctor of `Aggregation`s
with `AggregationContext`. It ends up adding a fairly large number of
methods to `AggregationContext` but in exchange it shows a path to
removing a few methods from `SearchContext`. That seems nice!

It also gives us an accurate inventory of "all of the stuff" that
aggregations use to build and run.
2020-11-30 13:19:44 -05:00
Alan Woodward dcd4fadc32
Make runtime fields highlightable (#65560)
This commit adds the ability to request highlights for runtime fields.

There are two changes included here:

* You can ask QueryShardContext for the index analyzer for a specific
  field, and provide an optional lambda to be executed if the field is
  unmapped. This allows us to return the standard Keyword analyzer by
  default for fields that have not been configured in the mappings.
* HighlighterUtil.loadValues() now correctly handles values fetched
  from a DocValueFetcher as well as a source fetcher, by setting the
  LeafReaderContext from the passed-in HitContext.

The first change has a number of knock-on effects, notably that
MoreLikeThisQuery has to stop using its configured analyzer directly
in equals() and hashcode() checks as the anonymous analyzer
returned from QueryShardContext#getIndexAnalyzer() uses object
identity for comparisons.
2020-11-30 09:08:03 +00:00
Alan Woodward 1a8ce8716d
Restore use of default search and search_quote analyzers (#65491)
In the refactoring of TextFieldMapper, we lost the ability to define
a default search or search_quote analyzer in index settings. This
commit restores that ability, and adds some more comprehensive
testing.

Fixes #65434
2020-11-26 16:57:45 +00:00