2479 Commits

Author SHA1 Message Date
Joe Gallo f444f64cd0
Adjust the MaxSinglePrimarySizeCondition version (#68553)
and re-enable BWC tests
2021-02-04 16:50:13 -05:00
Jason Tedor 01944627ed
Revert "Continue to publish REST API specifications under Apache 2.0 license (#68488)"
This reverts commit 92b59d994f.
2021-02-04 08:21:05 -05:00
Mark Vieira 92b59d994f
Continue to publish REST API specifications under Apache 2.0 license (#68488) 2021-02-03 13:46:20 -08:00
Joe Gallo 4d18334442
Add max_single_primary_size as a condition for the rollover index API (#67842) 2021-02-03 10:39:06 -05:00
Mark Vieira a92a647b9f Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

 - Updating LICENSE and NOTICE files throughout the code base, as well
   as those packaged in our published artifacts
 - Update IDE integration to now use the new license header on newly
   created source files
 - Remove references to the "OSS" distribution from our documentation
 - Update build time verification checks to no longer allow Apache 2.0
   license header in Elasticsearch source code
 - Replace all existing Apache 2.0 license headers for non-xpack code
   with updated header (vendored code with Apache 2.0 headers obviously
   remains the same).
 - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 16:10:53 -08:00
Hendrik Muhs 4cbe61467c
add possibility to mute yaml tests by operating system (#67681)
this change adds the possibility to mute yaml tests based on operating system to avoid muting whole
tests
2021-02-01 09:33:23 +01:00
William Brafford 42b748588d
Allow the "*,-*" ("match none") pattern for destructive actions when destructive_requires_name is true (#68021)
Since the "*,-*" pattern resolves to "no indices", it makes a normally
destructive action into a non-destructive one. Rather than throwing a
wildcards-not-allowed exception, we can allow this pattern to pass
without triggering an exception. This allows the security layer to
safely use a "*,-*" pattern to indicate a "no indices" result for its
index resolution step, which is important because otherwise we get
wildcards-not-allowed exceptions when trying to delete nonexistent
concrete indices. For simplicity, we require exactly "*,-*", rather than
any other wildcards that might be logically equivalent.
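
A minimal sketch of the rule described above, assuming the index expression has already been split on commas. The function name, the constant, and the error text are hypothetical and are not taken from the Elasticsearch code base:

```python
# Hypothetical names; this is an illustration, not the Elasticsearch implementation.
MATCH_NONE = ("*", "-*")


def validate_destructive_action(expressions, destructive_requires_name=True):
    """Raise ValueError if a destructive request uses disallowed wildcards."""
    if not destructive_requires_name:
        return
    # Exactly "*,-*" resolves to no indices, so nothing can be destroyed.
    if tuple(expressions) == MATCH_NONE:
        return
    for expression in expressions:
        if expression == "_all" or "*" in expression:
            raise ValueError(f"wildcards are not allowed: {expression!r}")
```

Only the exact two-element pattern is let through; a logically equivalent pattern such as `*,-*,-x` is still rejected, which mirrors the "for simplicity" choice in the message.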
2021-01-28 14:08:29 -05:00
David Turner 06e141888f
Reinstate BWC snapshot tests (#67938)
This commit mostly reverts #67934, except for the change to the version
constant `REPOSITORY_UUID_IN_REPO_DATA_VERSION`.

Completes the backport of #67829 via #67899
2021-01-25 18:36:12 +00:00
David Turner faed3e7199
Temporarily suppress BWC snapshot tests (#67934)
This commit suppresses any BWC tests related to snapshots in `master` so
that #67899 can be merged to `7.x`. It will mostly be reverted after the
merge of #67899 is complete.

Relates #66431
2021-01-25 17:48:47 +00:00
David Turner e5a15d4fcb
Introduce repository UUIDs (#67829)
Today a snapshot repository does not have a well-defined identity. It
can be reregistered with a different cluster under a different name, and
can even be registered with multiple clusters in readonly mode.

This presents problems for cases where we need to refer to a specific
snapshot in a globally-unique fashion. Today we rely on the repository
being registered under the same name on every cluster, but this is not a
safe assumption.

This commit adds a UUID that can be used to uniquely identify a
repository. The UUID is stored in the top-level index blob, represented
by `RepositoryData`, and is also usually copied into the
`RepositoryMetadata` that represents the repository in the cluster
state. The repository UUID is exposed in the get-repositories API; other
more meaningful consumers will be added in due course.
2021-01-25 12:17:52 +00:00
David Turner bc1f50c523
Permit wait_for_active_shards warnings in master (#67498)
Part of the fixes for #66419, this commit permits nodes to emit the
deprecation warning regarding not specifying `?wait_for_active_shards`
when closing an index in 7.x versions for x ≥ 12. This change is
required on `master` too since the BWC tests encounter these warnings.

Relates #67246, which is the 7.x part of this change.
2021-01-14 15:55:43 +00:00
Andrei Stefan e3386e155c
Add minimum compatibility version to SearchRequest (#65896)
* Adds a minimum version request parameter to SearchRequest.
The minimum version helps fail a request if any shards
involved in the search do not meet the compatibility requirements
(all shards need to have a version equal to or later than the minimum
version provided).
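
The check can be pictured roughly as follows. This is a sketch under the assumption that versions are compared as `(major, minor, patch)` tuples; the function name and data shapes are hypothetical:

```python
def check_minimum_version(shard_versions, minimum_version):
    """Fail if any shard lives on a node older than minimum_version.

    shard_versions maps a shard name to the (major, minor, patch) tuple
    of the node holding it (an assumed representation).
    """
    too_old = {s: v for s, v in shard_versions.items() if v < minimum_version}
    if too_old:
        raise RuntimeError(
            f"shards below minimum version {minimum_version}: {sorted(too_old)}"
        )
```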
2021-01-13 00:50:30 +02:00
David Turner ec08f924c7
Introduce ?wait_for_active_shards=index-setting (#67158)
In 7.x the close indices API defaulted to `?wait_for_active_shards=0`
but from 8.0 it defaults to respecting the index settings instead.  This
commit introduces the `index-setting` value for this parameter on this
API allowing users to opt-in to the future behaviour today, and emits a
deprecation warning indicating that the default no longer needs to be
used and will be unsupported in future.

In 7.x a follow up PR will introduce support for the same
`index-setting` value for this parameter and will emit deprecation
warnings if users try and use the default instead.

Relates #66419
2021-01-11 08:33:16 +00:00
Nik Everett a9e8a6a31b
Update skip after backport of #67043 (#67191)
Now that #67043 has been backported we can update the skip so the bwc
tests don't complain.
2021-01-07 17:01:53 -05:00
Nik Everett f23e568948 Update skip before backport of #67043
When I merged #67043 it had an integration test for the thing it was
fixing but it still fails in the bwc tests. Yikes! I should know better
but life is life. Anyway, this updates the skip to ignore the test for
now. I'll reenable once the backport is in.
2021-01-07 10:57:39 -05:00
Nik Everett b0747c5a76
Fix bug with nested and filters agg (#67043)
Fixes a bug where nested documents that match a filter in the `filters`
agg will be counted as matching the filter. Usually nested documents
only match if you explicitly ask to match them. Worse, we only match them
in the "filter by filter" mode that we wrote to speed up date_histogram.
The `filters` agg is fairly rare, but with #63643 we run
`date_histogram` and `range` aggregations using `filters`.
2021-01-07 10:05:59 -05:00
Julie Tibshirani 1515f36f7f
Make sure shared source always represents the top-level root document. (#66725)
We started passing down the root document's _source when processing
nested hits, to avoid reloading and reparsing the root source for each hit.
Unfortunately the approach did not work when there are multiple layers of
`inner_hits`. In this case, the second-layer inner hit received its immediate
parent's source instead of the root source. This parent source is filtered to
just contain the parts corresponding to the nested document, but the source
parsing logic is designed to always operate on the top-level root source. This
caused failures when loading the second-layer inner hits.

This PR makes sure to always pass the root document's _source when processing
inner hits, even if there are multiple layers.
2021-01-05 08:17:26 -08:00
Nik Everett dd1ffe3900
Update skip after backport of #66295 (#66808)
Now that #66295 has landed in 7.11 we can run the bwc tests it adds
against that branch.
2020-12-23 16:33:34 -05:00
Ioannis Kakavas bd873698bc
Ensure CI is run in FIPS 140 approved only mode (#64024)
We were depending on the BouncyCastle FIPS own mechanics to set
itself in approved only mode since we run with the Security
Manager enabled. The check during startup seems to happen before we
set our restrictive SecurityManager though in
org.elasticsearch.bootstrap.Elasticsearch , and this means that
BCFIPS would not be in approved only mode, unless explicitly
configured so.

This commit sets the appropriate JVM property to explicitly set
BCFIPS in approved only mode in CI and adds tests to ensure that we
will be running with BCFIPS in approved only mode when we expect to.
It also sets xpack.security.fips_mode.enabled to true for all test clusters
used in fips mode and sets the distribution to the default one. It adds a
password to the elasticsearch keystore for all test clusters that run in fips
mode.
Moreover, it changes a few unit tests where we would use bcrypt even in
FIPS 140 mode. These would still pass since we are bundling our own
bcrypt implementation, but are now changed to use FIPS 140 approved
algorithms instead for better coverage.

It also addresses a number of tests that would fail in approved only mode,
mainly:

    Tests that use PBKDF2 with a password less than 112 bits (14char). We
    elected to change the passwords used everywhere to be at least 14
    characters long instead of mandating
    the use of pbkdf2_stretch because both pbkdf2 and
    pbkdf2_stretch are supported and allowed in fips mode and it makes sense
    to test with both. We could possibly figure out the password algorithm used
    for each test and adjust password length accordingly only for pbkdf2 but
    there is little value in that. It's good practice to use strong passwords so if
    our docs and tests use longer passwords, then it's for the best. The approach
    is brittle as there is no guarantee that the next test that will be added won't
    use a short password, so we add some testing documentation too.
    This leaves us with a possible coverage gap since we do support passwords
    as short as 6 characters but we only test with > 14 chars but the
    validation itself was not tested even before. Tests can be added in a followup,
    outside of fips related context.

    Tests that use a PKCS12 keystore and were not already muted.

    Tests that depend on running test clusters with a basic license or
    using the OSS distribution as FIPS 140 support is available in
    neither of these.

Finally, it adds some information around FIPS 140 testing in our testing
documentation reference so that developers can hopefully keep in
mind fips 140 related intricacies when writing/changing docs.
2020-12-23 21:00:49 +02:00
Nik Everett 3e3152406a
Bust the request cache when the mapping changes (#66295)
This makes sure that we only serve a hit from the request cache if it
was built using the same mapping and that the same mapping is used for
the entire "query phase" of the search.
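
One way to picture the idea is a cache whose key includes something that changes whenever the mapping changes. The sketch below assumes a simple `mapping_version` value for that purpose; the class, the method, and the key layout are hypothetical and may differ from the real change:

```python
class RequestCache:
    """Toy request cache keyed by the request and the mapping it was built with."""

    def __init__(self):
        self._entries = {}

    def get_or_compute(self, request, mapping_version, compute):
        # A changed mapping yields a different key, so stale results are never served.
        key = (request, mapping_version)
        if key not in self._entries:
            self._entries[key] = compute()
        return self._entries[key]
```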

Closes #62033
2020-12-23 13:19:02 -05:00
Seth Michael Larson 11153fcb33
Use single backslash for nested paths (#66794) 2020-12-23 11:57:19 -06:00
Julie Tibshirani 652ff74adf Adjust test skips now that inner_hits fix is backported. 2020-12-21 15:08:34 -08:00
Julie Tibshirani 15f5758957
Fix regressions around nested hits and disabled _source. (#66572)
This PR fixes two bugs that can arise when _source is disabled and we fetch nested documents:
* Fix exception when highlighting `inner_hits` with disabled _source.
* Fix exception in nested `top_hits` with disabled _source.
* Add more tests for highlighting `inner_hits`.
2020-12-18 14:06:52 -08:00
Jim Ferenczi c756ce1acf
Sort field tiebreaker for PIT (point in time) readers (#66093)
This commit introduces a new sort field called `_shard_doc` that
can be used in conjunction with a PIT to consistently tiebreak
identical sort values. The sort value is a numeric long that is
composed of the ordinal of the shard (assigned by the coordinating node)
and the internal Lucene document ID. These two values are consistent within
a PIT so this sort criteria can be used as the tiebreaker of any search
requests.
Since this sort criteria is stable we'd like to add it automatically to any
sorted search requests that use a PIT but we also need to expose it explicitly
in order to be able to:
* Reverse the order of the tiebreaking, useful to search "before" `search_after`.
* Force the primary sort to use it in order to benefit from the `search_after` optimization when sorting by index order (to be released in Lucene 8.8).
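
As a rough illustration of how such a tiebreaker can work, the sketch below packs the shard ordinal into the high 32 bits and the document ID into the low 32 bits. The message only says the value is composed of the two numbers, so the exact bit layout and the names here are assumptions:

```python
def shard_doc(shard_ordinal, doc_id):
    """Combine a shard ordinal and a Lucene doc ID into one sortable integer.

    Assumed layout: shard ordinal in the high 32 bits, doc ID in the low 32.
    """
    return (shard_ordinal << 32) | (doc_id & 0xFFFFFFFF)


def sort_hits(hits):
    """Sort by the primary sort value, then tiebreak on the packed value."""
    return sorted(hits, key=lambda h: (h["sort"], shard_doc(h["shard"], h["doc"])))
```

Because both components are stable within a PIT, two hits with the same primary sort value always come back in the same relative order.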

I plan to add the documentation and the automatic configuration for PIT in a follow up since this change is already big.

Relates #56828
2020-12-18 12:13:12 +01:00
Steve Gordon 96555bfa33
Mark Cat Tasks API as experimental in rest-api-spec (#66536) 2020-12-17 17:54:21 +00:00
Rory Hunter 1eb50f876e Tweak version skip range in cat.plugin yml test 2020-12-17 10:06:32 +00:00
Rory Hunter 4ff612550e
Allow bootstrap plugins to appear in _cat/plugins (#66260)
Closes #66107.

Bootstrap plugins are not loaded in the main Elasticsearch process, but
instead take effect only when ES is starting. As such, these plugins are
skipped when ES loads all installed plugins.

As a result, it was impossible for the plugins _cat API to report
whether any bootstrap plugins are installed.

Fix this by adjusting how the loading process skips bootstrap plugins,
and then tweaking the plugins _cat API so that bootstrap plugins can
optionally be included in the response.
2020-12-17 09:30:16 +00:00
Julie Tibshirani 6bc56d18e5
Fix failure in fvh REST tests. (#66192)
In general, we can't guarantee that a match_all query will return documents in
the order they were indexed. This PR adds an ID to each document to avoid
relying on document order.
2020-12-15 11:28:51 -08:00
Jay Modi 7011cbac5d
Fix cat tasks api params in spec and handler (#66272)
This commit fixes the cat tasks api parameter specification and the
handler so that the parameters are consumed during request preparation.

Closes #59493
2020-12-14 12:29:25 -07:00
Jay Modi 410ae396bd
Remove suggest reference in some API specs (#66180)
This commit removes the reference to the suggest index metric in the
nodes stats and indices stats rest api spec files. Suggest has been
removed so it is no longer correct to have it in these files.

Closes #43407
2020-12-14 10:00:04 -07:00
Martijn Laarman e31e3dea32
Add `visibility` to the rest-spec-api (#56104) 2020-12-14 12:23:28 +01:00
Julie Tibshirani a40186a7f4 Adjust skip version in fvh REST tests.
We can expand the compatible versions now that the bug fix has been backported.
2020-12-10 13:39:47 -08:00
Fernando Briano 91595657a2
Wraps timestamp values in quotes in search.aggregation histogram YAML test (#66153) 2020-12-10 12:19:03 +00:00
Dimitrios Liappis 5b24829796
Mute fvh REST tests (#66149)
Relates #66147
2020-12-10 12:13:23 +02:00
Julie Tibshirani 3cbe0eadce Ensure consistent hit order in fvh REST tests. 2020-12-09 17:53:17 -08:00
Julie Tibshirani ddf1f4cdb8
Fix bug where fvh fragments could be loaded from wrong doc (#65641)
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.

The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes #65533.
2020-12-09 14:47:11 -08:00
Ioannis Kakavas 3cd93eef92
Skip REST YAML tests in FIPS 140 mode (#65735)
Currently we don't have a way to selectively mute REST YAML tests
in FIPS 140 mode.

This commit introduces a new feature (fips_140) that can be used
in skip blocks to allow that.
2020-12-09 15:45:31 +02:00
Martijn Laarman 8d3def3e1f
Add Accept & Content-Type headers to rest api spec (#53979)
Co-authored-by: Russ Cam <russ.cam@elastic.co>
2020-12-09 14:43:05 +01:00
Nik Everett 9e57a46c56
Update skip after backport (#66024)
After backporting #65707 we can now run it in our backwards
compatibility tests.
2020-12-08 12:01:56 -05:00
Nik Everett ecb4e16d35
Fix sneaky date_histogram bug (#65707)
`date_histogram` has a bug with `offset` and `extended_bounds` when it
needs to create an "empty" aggregation result: it includes the bounds
twice! Wooops!

I broke this a while back when I started trying to merge `offset` into
`Rounding`. I never finished that merge, sadly. Interestingly, we've
discovered that the merge is required to properly handle daylight
savings time (#56305) but it isn't really something we're looking to
solve today. For now, this just stops counting the offset twice.

Closes #65624
2020-12-07 15:39:52 -05:00
Nik Everett 92e803a06d
Fix test sending body as url parameter (#65779)
Our test framework randomly sends the body of requests over the `source`
parameter. We never send the body if it is more than 2000 bytes because
our HTTP receiver can't handle lines of more than 4096 bytes. The thing is,
when we do that 2000 bytes check we do it against the reported length of
the body, not the body after it has been url encoded. And url encoding
strings can *vastly* increase their size. Which could cause us to send
some requests over the URL that are longer than 4096 bytes.

This fixes that by checking the url encoded length as well. We keep the
2000 byte check of the unencoded length because it is a nice fast check,
even if it is a bit inaccurate.
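
The two-step check can be sketched like this. The function name is hypothetical, and comparing the encoded body directly against the 4096-byte line limit is an assumption; the message does not say which threshold the encoded length is checked against:

```python
from urllib.parse import quote

MAX_UNENCODED_BYTES = 2000  # fast, slightly inaccurate check from the message
MAX_ENCODED_BYTES = 4096    # assumed limit for the encoded body


def can_send_as_source_param(body):
    """Return True if the body is small enough to send as a URL parameter."""
    # Cheap check first: the raw length is known without any encoding work.
    if len(body.encode("utf-8")) > MAX_UNENCODED_BYTES:
        return False
    # Percent-encoding can triple the size, so check the encoded length too.
    return len(quote(body, safe="")) <= MAX_ENCODED_BYTES
```

A body made of 1500 `{` characters passes the first check but encodes to 4500 bytes, which is the case the fix is meant to catch.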

Closes #65718
2020-12-07 12:26:34 -05:00
Seth Michael Larson 56a25bf08b
Mark Task APIs as experimental in rest-api-spec 2020-12-03 13:47:13 -06:00
Christoph Büscher 8c29e54bf3 Adapt test skip versions after backport 2020-12-03 16:10:55 +01:00
Christos Soulios 1c8a97784e
Lower minimum compatibility version of _doc_count field tests (#65790)
After merging _doc_count field type in v7.11.0 (#64594), this PR lowers the minimum compatibility
version from v8.0.0 to v7.11.0

Relates to #64503
2020-12-03 12:08:00 +02:00
Alan Woodward f0fc1b3dad
Use scriptless fields in CoreTestTranslator (#65599)
CoreTestTranslator re-writes some core search yaml tests into tests
for runtime queries, and mimics source-only fields by building ad-hoc
painless scripts for each runtime type. We now have source-only fields
built in via scriptless mappings, so we can cut over to using these
instead.
2020-12-02 10:06:40 +00:00
Christoph Büscher 3c3a43249f
Support unmapped fields in search 'fields' option (#65386)
Currently, the 'fields' option only supports fetching mapped fields. Since
'fields' is meant to be the central place to retrieve document content, it
should allow for loading unmapped values. This change adds implementation and
tests for this feature.

Closes #63690
2020-12-01 21:40:27 +01:00
Christoph Büscher c327794ae8
Fix range query on date fields for number inputs (#63692)
Currently, if you write a date range query with numeric 'to' or 'from' bounds,
they can be interpreted as years if no format is provided. We use
"strict_date_optional_time||epoch_millis" in this case that can interpret inputs
like 1000 as the year 1000 for example.
This PR changes this to always interpret and parse numbers with the "epoch_millis"
parser if no other formatter was provided.
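
The intended behaviour, when no format is given, can be illustrated as follows. This is only a sketch: `parse_bound` is a hypothetical helper, and Python's ISO parser stands in for `strict_date_optional_time`:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_bound(value):
    """Parse a range bound when no explicit format was provided."""
    # Numbers are always milliseconds since the epoch, never a year.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return EPOCH + timedelta(milliseconds=value)
    # Strings go through the date parser (ISO 8601 as a stand-in here).
    parsed = datetime.fromisoformat(value)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```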

Closes #63680
2020-12-01 18:49:50 +01:00
Rene Groeschke 97749a3372
Port rest integ tests to use task avoidance api (#65011)
This ports the majority of the rest integ tests tasks to use the task avoidance api.

- There are some edge cases left that we need to investigate, but we can do that separately.
2020-11-26 10:30:06 +01:00
Christoph Büscher 6854974c1c
Reactivate deactivated integration test (#65388) 2020-11-24 10:48:14 +01:00
Mark Vieira 53b10dbb66 Mute "Tests catching other exceptions per item" 2020-11-23 10:16:01 -08:00
Christoph Büscher ae3880ed51
Better handling of item errors in _mtermvectors API (#65324)
Currently an error in a `_mtermvectors` request, for example because of querying
through an alias that has several indices assigned to it, fails the whole request.
Instead we should only fail the problematic item in the multi-item request, as we
do in the same situations in `_mget`.
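
The per-item error handling can be sketched as below. The function, the `fetch_one` callback, and the shape of the error entry are hypothetical and only illustrate the pattern:

```python
def multi_term_vectors(items, fetch_one):
    """Process every item; a failing item yields an error entry, not a failed request."""
    docs = []
    for item in items:
        try:
            docs.append(fetch_one(item))
        except Exception as error:  # only this item fails
            docs.append(
                {"_index": item.get("_index"), "_id": item.get("_id"), "error": str(error)}
            )
    return {"docs": docs}
```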
2020-11-23 16:01:09 +01:00
Henning Andersen 21892817c4
Test version update for must_exist test (#65216)
Relates #65141
2020-11-19 07:45:01 +01:00
Henning Andersen 67e1128e05
Fix remove alias with must_exist (#65141)
Remove alias now parses the must_exist flag and results in a 404
(not found), if the index does not have the alias.

Closes #62642
Relates #58100

Co-Authored-By: Luca Cavanna <javanna@users.noreply.github.com>
2020-11-18 16:00:15 +01:00
Dan Hermann 923b2b90c5
Remove the deprecated local parameter for _cat/indices (#64868) 2020-11-16 07:53:16 -06:00
Dan Hermann f63a3b5cdc
Remove the deprecated local parameter for _cat/shards (#64867) 2020-11-13 07:34:15 -06:00
Lee Hinman bf63edde8d
Mark component and composable index template APIs as stable (#65013)
These were previously marked as experimental, but as we have not had any changes made or needed, we
are marking these as stable.
2020-11-12 14:16:57 -07:00
Dan Hermann c829f8edd1
Remove deprecated _upgrade API (#64732) 2020-11-12 11:09:56 -06:00
Fernando Briano b95bcc8346
[DOCS] Adds more information about skip in REST API spec tests (#64775) 2020-11-12 10:09:05 +00:00
Nik Everett e735d46bf2
Skip optimization if there are few docs (#64897)
In #63643 we added an optimization to the `range` agg that uses
`filters` if we believe it'd be faster to do. It turns out that building
the `ScorerSupplier` for the `filters` has a fairly high overhead. So if
we think the query would match only a couple thousand documents per
filter we avoid the optimization altogether. This keeps aggregations
that complete in a couple of milliseconds from wasting 50ms juggling
filters.
2020-11-11 11:45:40 -05:00
Daniel Mitterdorfer 4acde700d3
Mute field collapsing tests in MixedClusterClientYamlTestSuiteIT (#64912)
Relates #52416
2020-11-11 11:44:03 +01:00
Nik Everett a08b52f3bd
Add `runtime_mappings` to search request (#64374)
This adds a way to specify the `runtime_mappings` on a search request
which are always "runtime" fields. It looks like:
```
curl -XDELETE -uelastic:password -HContent-Type:application/json localhost:9200/test
curl -XPOST -uelastic:password -HContent-Type:application/json 'localhost:9200/test/_bulk?pretty&refresh' -d'
{"index": {}}
{"animal": "cat", "sound": "meow"}
{"index": {}}
{"animal": "dog", "sound": "woof"}
{"index": {}}
{"animal": "snake", "sound": "hisssssssssssssssss"}
'

curl -XPOST -uelastic:password -HContent-Type:application/json localhost:9200/test/_search?pretty -d'
{
  "runtime_mappings": {
    "animal.upper": {
      "type": "keyword",
      "script": "for (String s : doc[\"animal.keyword\"]) {emit(s.toUpperCase())}"
    }
  },
  "query": {
    "match": {
      "animal.upper": "DOG"
    }
  }
}'
```

NOTE:
If we have to send a search request with runtime mappings to a node that
doesn't support runtime mappings at all then we'll fail the search
request entirely. The alternative would be to not send those runtime
mappings and let the node fail the search request with an "unknown field"
error. I believe this would be surprising because you defined
the field in the search request.

NOTE:
It isn't obvious but you can also use `runtime_mappings` to override fields
inside objects by naming the runtime fields with `.` in them. Like this:
```
curl -XDELETE -uelastic:password -HContent-Type:application/json localhost:9200/test
curl -uelastic:password -XPOST -HContent-Type:application/json localhost:9200/test/_bulk?refresh -d'
{"index":{}}
{"name": {"first": "Andrew", "last": "Wiggin"}}
{"index":{}}
{"name": {"first": "Julian", "last": "Delphiki", "suffix": "II"}}
'

curl -uelastic:password -XPOST -HContent-Type:application/json localhost:9200/test/_search?pretty -d'{
  "runtime_mappings": {
    "name.first": {
      "type": "keyword",
      "script": "if (\"Wiggin\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Ender\");} else if (\"Delphiki\".equals(doc[\"name.last.keyword\"].value)) {emit(\"Bean\");}"
    }
  },
  "query": {
    "match": {
      "name.first": "Bean"
    }
  }
}'
```

Relates to #59332
2020-11-10 12:38:59 -05:00
Dan Hermann fae9b06cd5
Adjust deprecation version after backport (#64794)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-11-09 13:43:47 -06:00
Dan Hermann 82242f7c3f
Adjust deprecation version after backport (#64789) 2020-11-09 13:43:24 -06:00
Nik Everett 7ceed1369d
Speed up date_histogram without children (#63643)
This speeds up `date_histogram` aggregations without a parent or
children. This is quite common - it's the aggregation that Kibana's Discover
uses all over the place. Also, we hope to be able to use the same
mechanism to speed up aggs with children one day, but that day isn't today.

The kind of speedup we're seeing is fairly substantial in many cases:
```
|                              |                                            |  before |   after |    |
| 90th percentile service time |           date_histogram_calendar_interval | 9266.07 | 1376.13 | ms |
| 90th percentile service time |   date_histogram_calendar_interval_with_tz | 9217.21 | 1372.67 | ms |
| 90th percentile service time |              date_histogram_fixed_interval | 8817.36 | 1312.67 | ms |
| 90th percentile service time |      date_histogram_fixed_interval_with_tz | 8801.71 | 1311.69 | ms | <-- discover's agg
| 90th percentile service time | date_histogram_fixed_interval_with_metrics | 44660.2 | 43789.5 | ms |
```

This uses the work we did in #61467 to precompute the rounding points for
a `date_histogram`. Now, when we know the rounding points we execute the
`date_histogram` as a `range` aggregation. This is nice for three reasons:
1. We can further rewrite the `range` aggregation (see below)
2. We don't need to allocate a hash to convert rounding points
   to ordinals.
3. We can send precise cardinality estimates to sub-aggs.

Points 2 and 3 above are nice, but most of the speed difference comes from
point 1. Specifically, we now look into executing `range` aggregations as
a `filters` aggregation. Normally the `filters` aggregation is quite slow
but when it doesn't have a parent or any children then we can execute it
"filter by filter" which is significantly faster. So fast, in fact, that
it is faster than the original `date_histogram`.

The `range` aggregation is *fairly* careful in how it rewrites, giving up
on the `filters` aggregation if it won't collect "filter by filter" and
falling back to its original execution mechanism.


So an aggregation like this:

```
POST _search
{
  "size": 0,
  "query": {
    "range": {
      "dropoff_datetime": {
        "gte": "2015-01-01 00:00:00",
        "lt": "2016-01-01 00:00:00"
      }
    }
  },
  "aggs": {
    "dropoffs_over_time": {
      "date_histogram": {
        "field": "dropoff_datetime",
        "fixed_interval": "60d",
        "time_zone": "America/New_York"
      }
    }
  }
}
```

is executed like:

```
POST _search
{
  "size": 0,
  "query": {
    "range": {
      "dropoff_datetime": {
        "gte": "2015-01-01 00:00:00",
        "lt": "2016-01-01 00:00:00"
      }
    }
  },
  "aggs": {
    "dropoffs_over_time": {
      "range": {
        "field": "dropoff_datetime",
        "ranges": [
          {"from": 1415250000000, "to": 1420434000000},
          {"from": 1420434000000, "to": 1425618000000},
          {"from": 1425618000000, "to": 1430798400000},
          {"from": 1430798400000, "to": 1435982400000},
          {"from": 1435982400000, "to": 1441166400000},
          {"from": 1441166400000, "to": 1446350400000},
          {"from": 1446350400000, "to": 1451538000000},
          {"from": 1451538000000}
        ]
      }
    }
  }
}
```

Which in turn is executed like this:

```
POST _search
{
  "size": 0,
  "query": {
    "range": {
      "dropoff_datetime": {
        "gte": "2015-01-01 00:00:00",
        "lt": "2016-01-01 00:00:00"
      }
    }
  },
  "aggs": {
    "dropoffs_over_time": {
      "filters": {
        "filters": {
          "1": {"range": {"dropoff_datetime": {"gte": "2014-12-30 00:00:00", "lt": "2015-01-05 05:00:00"}}},
          "2": {"range": {"dropoff_datetime": {"gte": "2015-01-05 05:00:00", "lt": "2015-03-06 05:00:00"}}},
          "3": {"range": {"dropoff_datetime": {"gte": "2015-03-06 00:00:00", "lt": "2015-05-05 00:00:00"}}},
          "4": {"range": {"dropoff_datetime": {"gte": "2015-05-05 00:00:00", "lt": "2015-07-04 00:00:00"}}},
          "5": {"range": {"dropoff_datetime": {"gte": "2015-07-04 00:00:00", "lt": "2015-09-02 00:00:00"}}},
          "6": {"range": {"dropoff_datetime": {"gte": "2015-09-02 00:00:00", "lt": "2015-11-01 00:00:00"}}},
          "7": {"range": {"dropoff_datetime": {"gte": "2015-11-01 00:00:00", "lt": "2015-12-31 00:00:00"}}},
          "8": {"range": {"dropoff_datetime": {"gte": "2015-12-31 00:00:00"}}}
        }
      }
    }
  }
}
```

And *that* is faster because we can execute it "filter by filter".
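
The first rewrite step, turning precomputed rounding points into `range` buckets, can be sketched as follows. The function name is hypothetical; the output shape follows the `ranges` array in the example above, where the last bucket is open-ended:

```python
def rounding_points_to_ranges(points):
    """Build consecutive [from, to) ranges from sorted rounding points."""
    ranges = [{"from": lo, "to": hi} for lo, hi in zip(points, points[1:])]
    if points:
        # The final bucket has no upper bound.
        ranges.append({"from": points[-1]})
    return ranges
```

Feeding it the epoch-millisecond rounding points from the example reproduces the same list of ranges.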

Finally, notice the `range` query filtering the data. That is required for
the data set that I'm using for testing. The "filter by filter" collection
mechanism for the `filters` agg needs special case handling when the query
is a `range` query and the filter is a `range` query and they are both on
the same field. That special case handling "merges" the range query.
Without it "filter by filter" collection is substantially slower. It's still
quite a bit quicker than the standard `filter` collection, but not nearly
as fast as it could be.
2020-11-09 14:20:25 -05:00
Dan Hermann 5e146f007c
Adjust BWC after #64677 (Deprecate upgrade API) (#64695) 2020-11-06 08:20:04 -06:00
Christos Soulios 4dc833fa44
Add doc_count field mapper (#64503)
Bucket aggregations compute bucket doc_count values by incrementing the doc_count by 1 for every document collected in the bucket.

When using summary fields (such as aggregate_metric_double) one field may represent more than one document. To provide this functionality we have implemented a new field mapper (named doc_count field mapper). This field is a positive integer representing the number of documents aggregated in a single summary field.

Bucket aggregations will check if a field of type doc_count exists in a document and will take this value into consideration when computing doc counts.
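
In effect, a bucket increments by the document's stored count when one is present and by 1 otherwise. A minimal sketch, assuming documents are plain dictionaries and the count lives under `_doc_count`; the function name is hypothetical:

```python
def bucket_doc_counts(docs, key_field):
    """Count documents per bucket key, honouring a per-document _doc_count."""
    counts = {}
    for doc in docs:
        key = doc[key_field]
        # A summary document may stand for many original documents.
        counts[key] = counts.get(key, 0) + doc.get("_doc_count", 1)
    return counts
```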
2020-11-03 17:47:17 +02:00
Mark Tozzi f4e497e0bd
Revert terms bwc disable (#64233)
* Revert "disable BWC tests that will fail with the new include/exclude work (#64025)"

This reverts commit dc073d21d9.

* fix version number for BWC

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-11-02 10:50:43 -05:00
Nik Everett 2cb5803d29
Update skip after backport (#64369)
Now that #64214 has landed in 7.10 we can run the bwc test including it.
2020-10-29 14:39:13 -04:00
Boice Huang 41fbc52743
Deprecate the 'local' parameter of /_cat/indices (#62198) 2020-10-29 06:59:05 -05:00
Boice Huang 7d65278642
Deprecate the 'local' parameter of /_cat/shards (#62197) 2020-10-29 06:58:32 -05:00
Nik Everett 3af540b50d
Remove aggregation's postCollect phase (#64016)
After #63811 it became clear to me that `postCollect` is kind of
dangerous and not all that useful. So this removes it.

The trouble with `postCollect` is that it all happened right after we
finished calling `collect` on the `LeafBucketCollectors` but before we
built the aggregation results. But in #63811 we found out that we can't
call `postCollect` on the children of `parent` or `child` aggregators
until we know *which* aggregation results we're building.

So this removes `postCollect` and moves all of the things we did at
post-collect phase into `buildAggregations` or into hooks called in
those methods.
2020-10-28 17:33:27 -04:00
Nik Everett 9dc43c95ad hush bwc tests
Backport is coming. Please wait.
2020-10-27 16:37:51 -04:00
Nik Everett 7feb19a74f
Make sure non-collecting aggs include sub-aggs (#64214)
Now that we're consistently using `can_match` to filter which shards we
run on we can get this confusing case:
1. You have a search with, say, a range and a sub-agg.
2. That search has a query that `can_match` can recognize will match no
   docs. On *any* shard.
3. So we dutifully run it on a single shard so it can produce the
   "empty" aggs.
4. The shard we pick happens to not have the target of the range mapped.
5. This kicks in the special range aggregator that doesn't collect any
   documents.
6. Before this commit, that range aggregator *also* never produced any
   sub-aggs.

So, without this change, it was quite possible for a search that
happened to match no documents to "throw away" the sub-aggs of a range
and a few other aggs.

We've had this problem for a long, long time but it is more confusing
now because `can_match` is really kicking in and causing us to see cases
where it looks like you are targeting a lot of shards but you really are
only targeting a couple. It used to be that to get the "no sub-aggs"
behavior you had to explicitly target only shards that didn't map the
target field of the `range` agg. And, like, in that case it isn't too
bad because you targeted a sort of degenerate shard. But now that
`can_match` is doing its thing you can end up with the confusing steps
above. It took me several hours to track down what was happening, and I know
how the individual pieces of all of this work. It took four hours to
figure out how they fit together in this case....

Anyway! This replaces all the aggregator implementations that throw out
the sub-aggregators with ones that keep them. I think this'll be less
confusing in the future.

Closes #64142
2020-10-27 15:45:24 -04:00
Jason Tedor 117d79b5e9
Adjust defaults for tiered data roles (#64015)
This commit adjusts the defaults for the tiered data roles so that they
are enabled by default, or if the node has the legacy data role. This
ensures that the default experience is that the tiered data roles are
enabled.

To fully specify the behavior for the tiered data roles:
 - starting a new node with the defaults: enabled
 - starting a new node with node.roles configured: enabled if and only
   if the tiered data roles are explicitly configured, independently
   of the node having the data role
 - starting a new node with node.data enabled: enabled unless the
   tiered data roles are explicitly disabled
 - starting a new node with node.data disabled: disabled unless the
   tiered data roles are explicitly enabled
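
The four cases above can be read as a small decision function. The following is a sketch under stated assumptions rather than the real settings code: the function name, its parameters, and the set of tier role names are illustrative, and the tiered roles are treated as a single on/off group for simplicity.

```python
# Illustrative tier role names; treated here as an assumption of the sketch.
TIERED_ROLES = {"data_content", "data_hot", "data_warm", "data_cold"}


def tiered_roles_enabled(node_roles=None, node_data=None, tiered_explicit=None):
    """Return whether the tiered data roles end up enabled.

    node_roles      -- list of roles if node.roles is configured, else None
    node_data       -- True/False if the legacy node.data is set, else None
    tiered_explicit -- True/False if the tiered roles were explicitly
                       enabled/disabled via legacy-style settings, else None
    """
    if node_roles is not None:
        # node.roles configured: only an explicit tier role counts,
        # independently of the plain "data" role.
        return any(role in TIERED_ROLES for role in node_roles)
    if tiered_explicit is not None:
        # An explicit choice overrides whatever node.data implies.
        return tiered_explicit
    if node_data is not None:
        # Otherwise follow the legacy data role.
        return node_data
    # Pure defaults: enabled.
    return True
```

For example, `tiered_roles_enabled(node_roles=["data"])` is `False`, while `tiered_roles_enabled(node_data=True)` is `True`.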
2020-10-27 12:47:14 -04:00
Mark Tozzi dc073d21d9
disable BWC tests that will fail with the new include/exclude work (#64025) 2020-10-27 09:43:06 -04:00
Christoph Büscher 498e264df4
Count only mapped fields towards docvalue_fields limit (#63806)
Currently we count every field requested in the search request body's
'docvalue_fields' section towards the limit defined by
the 'max_docvalue_fields_search' index setting which defaults to 100. This can
be a problem e.g. if the user searches across several indices with some fields
present in one index but not the other and has to add the joint set of field
names to the query. We currently trip the limit even if the number of actually
mapped fields in each index is below the limit.
This change adds a step to distinguish between mapped and unmapped fields and
only count the former towards the limit.
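
As a rough illustration of the new counting rule (not the actual implementation), the sketch below filters the requested fields down to those mapped in a given index before comparing against the limit. The helper name and the error text are hypothetical; the default of 100 comes from the description above.

```python
def count_towards_limit(requested_fields, mapped_fields, limit=100):
    """Return the requested fields that are mapped, enforcing the limit.

    Only mapped fields count towards the limit; unmapped ones are ignored.
    """
    mapped = set(mapped_fields)
    counted = [field for field in requested_fields if field in mapped]
    if len(counted) > limit:
        raise ValueError(
            f"too many mapped docvalue fields requested: {len(counted)} > {limit}"
        )
    return counted


# A search across two indices requests the joint set of 150 field names,
# but each index maps only 80 of them.
requested = [f"field_{i}" for i in range(150)]
index_a = [f"field_{i}" for i in range(0, 80)]
index_b = [f"field_{i}" for i in range(70, 150)]

counted_a = count_towards_limit(requested, index_a)
counted_b = count_towards_limit(requested, index_b)
```

Neither index trips the limit here, even though the request itself names more than 100 fields.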

Closes #63730
2020-10-21 17:51:11 +02:00
Armin Braun 6128d357fc
Add REST Test for Snapshot Clone API (#63863)
Adds snapshot clone REST tests and HLRC support for the API.
2020-10-19 15:36:36 +02:00
Ignacio Vera 1dea28a878
Use Globals Ords in Cardinality aggregation for low cardinality fields (#62560)
New Cardinality aggregator implementation that uses global ords.
2020-10-13 09:19:15 +02:00
Julie Tibshirani 62857b49d1
Add support for missing value fetchers. (#63515)
This PR implements value fetching for the following field types:
* `text` phrase and prefix subfields
* `search_as_you_type`, plus its subfields
* `token_count`, which is implemented by fetching doc values

Supporting these types helps ensure that retrieving all fields through
`"fields": ["*"]` doesn't fail because of unsupported value fetchers.
2020-10-12 13:57:29 -07:00
Alan Woodward f4c85e4562
Convert TextFieldMapper to parametrized form (#63269)
As a result of this, we can remove a chunk of code from TypeParsers as well. Tests
for search/index mode analyzers have moved into their own file. This commit also
rationalises the serialization checks for parameters into a single SerializerCheck
interface that takes the values includeDefaults, isConfigured and the value
itself.

Relates to #62988
2020-10-07 10:29:29 +01:00
Igor Motov 504de90940
Update the cat tasks test skip version after backport (#63142)
Since #63036 is now backported, we can enable this test for earlier versions.

Relates to #61118
2020-10-01 14:46:24 -04:00
Igor Motov f55a8cc508
Add support for x_opaque_id to _cat/tasks (#63036)
Adds an optional column with support for x_opaque_id to _cat/tasks API.

Closes #61118
2020-10-01 11:58:16 -04:00
Alan Woodward 981258b02b
Remove TypeFieldMapper (#62838)
We don't need a special TypeFieldMapper for anything in particular; all access
to the type field can be done via a TypeFieldType that issues appropriate
deprecation warnings.

Relates to #41059
2020-09-30 15:47:29 +01:00
Alan Woodward 41bce50e71
Introduce FetchContext (#62357)
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
2020-09-17 09:46:03 +01:00
Nik Everett 8a9028c169
Fix docvalue fetch for scaled floats (#62425)
In #61995 I moved the `docvalue_field` fetch code into a place where I
could share it with the fancy new `fields` fetch API. Specifically,
runtime fields can use all that doc values code now. But I broke
`scaled_floats` by switching how they are fetched from `double` to
`string`. This adds the override you need to switch them back.
2020-09-15 20:34:54 -04:00
Nik Everett 9a127adb4b
Implement fields fetch for runtime fields (#61995)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 15:57:26 -04:00
Nik Everett 049bca0959
Add more debugging information for cardinality agg (#62317)
This adds two extra bits of info to the profiler:
1. Count of the number of different types of collectors. This lets us figure
   out if we're using the optimization for segment ordinals. It adds a few
   more similar counters just for good measure.
2. Profiles the `getLeafCollector` and `postCollection` methods. These are
   non-trivial for some aggregations, like cardinality.
2020-09-15 08:49:13 -04:00
Julie Tibshirani 08cd1a6118 Adjust the skip version on 'fields' inner hits tests. 2020-09-14 12:29:16 -07:00
Julie Tibshirani f29c743a47
Support the 'fields' option in inner_hits and top_hits. (#62259)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 10:08:58 -07:00
Nik Everett 5649780939
Fix bug with terms' min_doc_count (#62130)
The `global_ordinals` implementation of `terms` had a bug when
`min_doc_count: 0` that'd cause sub-aggregations to have array index out
of bounds exceptions. Ooops. My fault. This fixes the bug by assigning
ordinals to those buckets.

Closes #62084
2020-09-09 11:21:56 -04:00
bellengao c0dfb45191
Add test for item-level error when no write index defined for an alias in bulk API (#55503)
Co-authored-by: Jake Landis <jake.landis@elastic.co>
2020-09-04 09:29:23 -05:00
Dan Hermann 5f290c226c
Adjust BWC after backport of 60818 (#61781) 2020-09-01 07:56:13 -05:00
Nhat Nguyen 879279c9b4
Introduce point in time APIs in x-pack basic (#61062)
This commit introduces a new API that manages point-in-times in x-pack 
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.

A point in time must be opened before being used in search requests. The 
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes an `id`, which should be
passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
            "id":  "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
            "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` has
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
2020-08-24 20:24:35 -04:00
Julie Tibshirani 66304d380d Remove a skip in 'fields' REST tests.
The skip can be removed now that #61383 is backported.
2020-08-20 15:00:20 -07:00
Julie Tibshirani 468e58bb6a
Ensure fetch fields aren't dropped when rewriting search. (#61383)
Previously we didn't retain the requested fields when performing a shallow copy
of the search source. This meant that when a search was rewritten, we could drop
the requested fields and fail to return them in the response.
2020-08-20 14:11:19 -07:00
bellengao 651242b207
Fix wrong result when executing bulk requests with and without pipeline (#60818) 2020-08-19 06:54:58 -05:00
Nhat Nguyen 1d43e7648b Prevent shard relocation while closing index (#61072)
We might fail to close an index if some shards are being relocated.

Close #60913
2020-08-18 14:06:50 -04:00
Ryan Ernst fdb300961e
Increase docs and client rest test timeouts for Darwin (#61075)
The Darwin CI hosts continue to struggle with timeouts. This commit
increases the timeouts for docs and client rest tests.

relates #58286
2020-08-13 21:21:24 -07:00
Nik Everett 9e9f4e3302
Some progress on failing runtime fields tests (bring #61098 to master) (#61101)
This breaks apart a test for the `terms` aggregation into one that
works for runtime fields and one that doesn't.
2020-08-13 15:06:10 -04:00
Nik Everett 1ae9878dae
Break up a test for runtime fields (brings #60931 to master) (#61103)
Breaks up an integration test into one that runtime fields can run and
one that runtime fields have to skip. This is because runtime fields
don't have global ords and we assert things *about* global ords in the
test we have to skip.
2020-08-13 15:05:12 -04:00
Nik Everett ce01e480e7
Fix a leftover skip (#60932)
Looks like I never updated a test skip many many months ago.
2020-08-10 16:24:57 -04:00
István Zoltán Szabó 7b71a686eb
[DOCS] Fixes broken links in rest API spec. (#60582) 2020-08-03 15:29:51 +02:00
Julie Tibshirani 74b56f3b67
Avoid using string 'y' in fields REST tests. (#60471)
Some yaml parsers interpret 'y' and 'yes' as the boolean 'true'.
2020-07-31 09:09:17 -07:00
James Rodewig aec26b1a23
[DOCS] Move search pagination content to one page (#60515) 2020-07-31 11:43:06 -04:00
Dan Hermann 90062d5c0d
Fix failing test for resolve index API (#60306) 2020-07-31 07:52:44 -05:00
Tim Brooks 50c56233ff
Reenable BWC after memory limit api backport (#60415)
This commit reenables BWC and updates version constants after the
backport of #60342.
2020-07-29 13:30:59 -06:00
Nik Everett 944a6c243c
Allows nanosecond resolution in search_after (#60328)
This fixes `search_after` to properly parse string formatted dates that
have nanosecond resolution.

Closes #52424
2020-07-29 14:29:10 -04:00
Tim Brooks b1a6271ec8
Add configured indexing memory limit to node stats (#60342)
This commit adds the configured memory limit to the node stats API.
2020-07-29 11:20:59 -06:00
David Turner 48981b4042
Fix up BWC following backport of #60297 (#60313)
Co-authored-by: Igor Motov <igor@motovs.org>
2020-07-28 20:12:58 +01:00
Igor Motov a23f1a00f5
Prepare for backport of aggregations in node info (#60309)
This commit temporarily disables all bwc tests until #60256 is merged.
2020-07-28 14:05:39 -04:00
Julie Tibshirani a1f7ff7f07 Adjust the BWC version for the 'fields' param.
We can lower it to 7.10.0 now that it's been backported.
2020-07-28 10:58:33 -07:00
David Turner 940d618186
Log and track open/close of transport connections (#60297)
Transport connections between nodes remain in place until one or other
node shuts down or the connection is disrupted by a flaky network.
Today it is very difficult to demonstrate that transient failures and
cluster instability are caused by the network even though this is often
the case. In particular, transport connections open and close without
logging anything, even at `DEBUG` level, making it very hard to quantify
the scale of the problem or to correlate the networking problems with
external events.

This commit adds the missing `DEBUG`-level logging when transport
connections open and close, and also tracks the total number of
transport connections a node has opened as a measure of the stability of
the underlying network.
2020-07-28 16:58:00 +01:00
Julie Tibshirani 8a89d95372
Add search `fields` parameter to support high-level field retrieval. (#60100)
This feature adds a new `fields` parameter to the search request, which
consults both the document `_source` and the mappings to fetch fields in a
consistent way. The PR merges the `field-retrieval` feature branch.

Addresses #49028 and #55363.
2020-07-27 13:25:55 -07:00
Igor Motov 20c5f7edff
Add aggregation list to node info (#60074)
Adds a full list of supported aggregations to the node info API. This list
will be used in transform tests and telemetry mapping tests that will be added
as follow-up PRs.

Fixes #59774
2020-07-27 14:45:02 -04:00
Tim Brooks 5c227dac88
Implement human readable indexing pressure stats (#60022)
The indexing pressure stats do not currently have human readable
variants. This commit adds human readable variants and updates the
documentation.
2020-07-22 09:54:51 -06:00
Nik Everett ad3a3f07d4
Fix bug in deep pipeline agg serialization (forward port of tests in #59984) (#60018)
In #54716 I removed pipeline aggregators from the aggregation result
tree and caused us to read them from the request. This saves a bunch of
round trip bytes, which is neat. But there was a bug in the backwards
compatibility logic. You see, we still have to give the pipeline
aggregations to nodes older than 7.8 over the wire because that is how
they know what pipelines to run. They have the pipelines in the request
but they don't read them. They use the ones in the response tree.

Anyway, we had a bug where we were never sending pipelines defined two
levels down. So while you are upgrading the pipeline wouldn't run.
Sometimes. If the data node of the "first" result was post-7.8 and the
coordinating node was pre-7.8.

This fixes the bug.
2020-07-21 16:49:26 -04:00
Lee Hinman 15e674fa77 Fix skip version for 14_alias_to_multiple_indices.yml
The message was slightly changed but only in 7.9+

Relates to #59806
2020-07-17 16:56:13 -06:00
Lee Hinman ebcf5d525d
Fix retrieving data stream stats for a DS with multiple backing indices (#59806)
* Fix retrieving data stream stats for a DS with multiple backing indices

This API incorrectly had `allowAliasesToMultipleIndices` set to false in the default options for the
request. This changes it from `false` to `true` and enhances a test to exercise the functionality.

Resolves #59802

* Fix test for wording change
2020-07-17 13:52:44 -06:00
Benjamin Trent 1c7e16319d
[ML] adjusting bwc tests and serialization for require_alias (#59780) 2020-07-17 12:27:52 -04:00
Lee Hinman 19a380ac63
Allow simulating existing composable index template (#59733)
This change allows simulating replacing a composable template with a different version, for example:

```
POST /_index_template/_simulate/my-template
{
  "index_patterns": ["idx*"],
  "composed_of": ["ct1"],
  "priority": 10,
  "template": {
    "settings": {
      "index.lifecycle.name": "policy"
    }
  }
}
```

Should simulate as if `my-template` were replaced with the template specified in the body.

Resolves #59152
2020-07-17 10:12:30 -06:00
Benjamin Trent f72b893fd3
Adding new `require_alias` option to indexing requests (#58917)
This commit adds the `require_alias` flag to requests that create new documents.

This flag, when `true`, prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias.

When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings.

This is useful when an alias is required instead of a concrete index.

closes https://github.com/elastic/elasticsearch/issues/55267
2020-07-17 08:45:46 -04:00
Igor Motov 5d1d397e48
Prepare for backport of hard bounds in Histograms (#59720)
Histograms are used in many tests, so it is not practical to hunt them all down
and disable them one by one. This commit temporarily disables all bwc tests until
#59656 is merged.
2020-07-16 14:41:50 -04:00
Dan Hermann 902c1fa80a
Move REST specs for data streams (#59634) 2020-07-16 08:56:58 -05:00
James Baiera 1fb4a5ed31
Remove unneeded rest params from Data Stream Stats (#59575)
This PR removes the expand_wildcards and forbid_closed_indices parameters from the Data 
Streams Stats REST endpoint. These options are required for broadcast requests, but are not 
needed for anything in terms of resolving data streams. Instead, we just set a default set of 
IndicesOptions on the transport request.
2020-07-15 14:24:15 -04:00
James Baiera 589bb1f26c
Data Stream Stats API (#58707)
This API reports on statistics important for data streams, including the number of data 
streams, the number of backing indices for those streams, the disk usage for each data 
stream, and the maximum timestamp for each data stream.
2020-07-14 14:51:32 -04:00
Tim Brooks aa14860597
Separate coordinating and primary bytes in stats (#59487)
Currently we combine coordinating and primary bytes into a single bucket
for indexing pressure stats. This makes sense for rejection logic.
However, for metrics it would be useful to separate them.
2020-07-14 12:22:42 -06:00
Mayya Sharipova b130a1bfbb
Fix the test version in highlighters test (#59515)
This was supposed to be done after the backport but was missed.

Related to #53408
2020-07-14 11:24:34 -04:00
Dan Hermann 1d3a723f68
Adds write_index_only option to put mapping API (#59396) 2020-07-14 08:25:10 -05:00
Andrei Dan 5609353c5d
Default to @timestamp in composable template datastream definition (#59317)
This makes the data_stream timestamp field specification optional when
defining a composable template.
When there isn't one specified it will default to `@timestamp`.
2020-07-14 11:45:48 +01:00
Andrei Dan 4e72f43d62
Composable templates: add a default mapping for @timestamp (#59244)
This adds a low precedence mapping for the `@timestamp` field with
type `date`.
This will aid with the bootstrapping of data streams as a timestamp
mapping can be omitted when nanos precision is not needed.
2020-07-14 09:19:00 +01:00
Tim Brooks 9f22634bcd
Update versions for indexing pressure backport (#59472)
This commit updates the node stats version constants to reflect the fact
that index pressure stats were backported to 7.9. It also reenables BWC
tests.
2020-07-13 18:29:32 -06:00
Igor Motov 12c61e0d80
Change 7.9.99 -> 7.99.99 in tests (#59469)
Since we are most likely going to have 7.10, we should update the version
in test skips to 7.99.99.
2020-07-13 17:26:15 -04:00
Igor Motov 0af410ad0b
Adds hard_bounds to histogram aggregations (#59175)
* Adds hard_bounds to histogram aggregations

Adds a hard_bounds parameter to explicitly limit the buckets that a histogram
can generate. This is especially useful in case of open ended ranges that can
produce a very large number of buckets.

Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
2020-07-13 16:43:11 -04:00
Nik Everett e6e906d0e7 Update skip after backport of #42035 2020-07-13 16:39:30 -04:00
Tim Brooks b87bb86d88
Adding indexing pressure stats to node stats API (#59247)
We have recently added internal metrics to monitor the amount of
indexing occurring on a node. These metrics introduce back pressure to
indexing when memory utilization is too high. This commit exposes these
stats through the node stats API.
2020-07-13 10:37:46 -06:00
Martijn van Groningen 40b9fd49e0
Make data streams a basic licensed feature. (#59293)
* Create new data-stream xpack module.
* Move TimestampFieldMapper to the new module,
  this results in storing a composable index template
  with data stream definition only to work with default
  distribution. This way data streams can only be used
  with default distribution, since a data stream can
  currently only be created if a matching composable index
  template exists with a data stream definition.
* Renamed `_timestamp` meta field mapper 
   to `_data_stream_timestamp` meta field mapper.
* Add logic to put composable index template api
  to fail if `_data_stream_timestamp` meta field mapper
  isn't registered. So that a more understandable
  error is returned when attempting to store a template
  with data stream definition via the oss distribution.

In a follow up the data stream transport and
rest actions can be moved to the xpack data-stream module.
2020-07-13 11:43:42 +02:00
Dan Hermann 198b4253d9
Data stream admin actions are now index-level actions (#59095) 2020-07-10 12:25:07 -05:00
Martijn van Groningen cb6b05d12b
Fix the timestamp field of a data stream to @timestamp (#59076)
The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
  index template can only be set to '@timestamp'.
* Removed the custom data stream timestamp field validation in favor of reusing the validation from `TimestampFieldMapper`;
  the only remaining check is that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method
  to `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (#58956) that causes timestamp field validation to be performed
  for each template instead of for the final mappings that are created.
* Only apply the _timestamp meta field if the index is created as part of a data stream or data stream rollover;
  this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition.

Relates to #58642
Relates to #53100
Closes #58956
Closes #58583
2020-07-08 09:41:47 +02:00
James Rodewig 2be9db01c8
[DOCS] Replace `datatype` with `data type` (#58972) 2020-07-07 13:52:10 -04:00
Andrei Dan 0d9c98a823
GET data stream API returns additional information (#59128)
This adds the data stream's index template, the configured ILM policy
(if any) and the health status of the data stream to the GET _data_stream
response.

Restoring a data stream from a snapshot could install a data stream that
doesn't match any composable templates. This also makes the `template` 
field in the `GET _data_stream` response optional.
2020-07-07 17:52:40 +01:00
Nik Everett 38667b8fe5
Update skip after backport (#59153)
Update a skip after backporting #59099.
2020-07-07 11:33:01 -04:00
Jake Landis bd49766fde
Add build plugin to rest-api-spec to properly generate pom (#59142)
The build plugin is still necessary to generate the POM file.
This commit adds the build plugin back to the rest-api-spec
project. It was recently removed as it was thought to be
unnecessary.
2020-07-07 09:58:07 -05:00
Nik Everett 3b3ed4b4a7
Fix lookup support in adjacency matrix (#59099)
This request:
```
POST /_search
{
  "aggs": {
    "a": {
      "adjacency_matrix": {
        "filters": {
          "1": {
            "terms": { "t": { "index": "lookup", "id": "1", "path": "t" } }
          }
        }
      }
    }
  }
}
```

Would fail with a 500 error and a message like:
```
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_state_exception",
        "reason":"async actions are left after rewrite"
      }
    ]
  }
}
```

This fixes that by moving the query rewrite phase from a synchronous
call on the data nodes into the standard aggregation rewrite phase which
can properly handle the asynchronous actions.
2020-07-06 18:53:19 -04:00
Jake Landis 333a5d8cdf
Create plugin for yamlTest task (#56841)
This commit creates a new Gradle plugin to provide a separate task name
and source set for running YAML based REST tests. The only project
converted to use the new plugin in this PR is distribution/archives/integ-test-zip.
Its testing has been moved to :rest-api-spec, since that makes the most
sense and avoids a small but awkward change to the distribution plugin.

The remaining cases in modules, plugins, and x-pack will be handled in followups.

This plugin is distinctly different from the plugin introduced in #55896 since
the YAML REST tests are intended to be black box tests over HTTP. As such they
should not (by default) have access to the classpath for that which they are testing.

The YAML based REST tests will be moved to separate source sets (yamlRestTest).
Which source set is the target for the test resources depends on whether this
new plugin is applied. If it is not applied, it will default to the test source
set.

Further, this introduces a breaking change for plugin developers that
use the YAML testing framework. They will now need to either use the new source set
and matching task, or configure the rest resources to use the old "test" source set that
matches the old integTest task. (The former should be preferred).

As part of this change (which is also breaking for plugin developers) the
rest resources plugin has been removed from the build plugin and now requires
either explicit application or application via the new YAML REST test plugin.

Plugin developers should be able to fix the breaking changes to the YAML tests
by adding apply plugin: 'elasticsearch.yaml-rest-test' and moving the YAML tests
under a yamlRestTest folder (instead of test)
2020-07-06 12:13:01 -05:00
Dan Hermann be45d016e4
Delete data stream API accepts multiple names (#58833) 2020-07-06 06:45:58 -05:00
Dan Hermann 100e7a6651
Data stream support for indices shard stores API (#58591) 2020-07-06 06:44:31 -05:00
Dan Hermann c3aaf33d73
Ignore matching data streams if include_data_streams is false (#57900) 2020-07-03 08:46:01 -05:00
Dan Hermann e592a9a5e7
Add include_data_streams flag for authorization (#58154) 2020-07-02 23:53:31 -05:00
Dan Hermann 50ed781c9f
Mirror privileges over data streams to their backing indices (#58381) 2020-07-02 21:56:16 -05:00
Martijn van Groningen 001b3fb440
Add data stream timestamp validation via metadata field mapper (#58582)
This commit adds a new metadata field mapper that validates
that a document has exactly one timestamp value in the data stream timestamp field and
that the timestamp field mapping only has `type`, `meta` or `format` attributes configured.
Other attributes can affect the guarantee that an index with this meta field mapper has a 
useable timestamp field.

The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever
a new backing index of a data stream is created.

Relates to #53100
2020-07-02 10:58:18 +02:00
David Turner acf031cdb5
Forbid read-only-allow-delete block in blocks API (#58727)
* Forbid read-only-allow-delete block in blocks API

The read-only-allow-delete block is not really under the user's control
since Elasticsearch adds/removes it automatically. This commit removes
support for it from the new API for adding blocks to indices that was
introduced in #58094.

* Missing xref

* Reword paragraph on read-only-allow-delete block
2020-07-01 12:57:34 +01:00
David Turner 83d6589b2a
Account for remaining recovery in disk allocator (#58029)
Today the disk-based shard allocator accounts for incoming shards by
subtracting the estimated size of the incoming shard from the free space on the
node. This is an overly conservative estimate if the incoming shard has almost
finished its recovery since in that case it is already consuming most of the
disk space it needs.

This change adds to the shard stats a measure of how much larger each store is
expected to grow, computed from the ongoing recovery, and uses this to account
for the disk usage of incoming shards more accurately.
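
The difference between the two estimates can be shown with a little arithmetic. This is a sketch only: the function names and the `(expected_total_bytes, bytes_already_recovered)` tuple layout are assumptions for illustration, not the allocator's real data structures.

```python
def free_space_old_estimate(free_bytes, incoming_shards):
    """Previous behavior: subtract the full expected size of every incoming shard."""
    return free_bytes - sum(total for total, _recovered in incoming_shards)


def free_space_new_estimate(free_bytes, incoming_shards):
    """New behavior: subtract only what each recovering shard still has to grow by.

    Bytes already recovered are already reflected in free_bytes, so they
    should not be subtracted a second time.
    """
    return free_bytes - sum(
        max(total - recovered, 0) for total, recovered in incoming_shards
    )


# A node with 100 units free and one incoming shard of size 40 that has
# already written 35 units to disk.
incoming = [(40, 35)]
old = free_space_old_estimate(100, incoming)
new = free_space_new_estimate(100, incoming)
```

In this example the old estimate leaves 60 units of usable space, while the new one leaves 95, since only 5 units of the recovery remain.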
2020-07-01 08:04:45 +01:00
Lee Hinman 29c05544ec
Fix template name in mapping composition yml test (#58788)
The warning was copied from elsewhere and just needed to use the correct template and index name.
2020-06-30 17:03:47 -06:00
Julie Tibshirani 416cb6b31e Adjust the skip version for template mapping merging REST test. 2020-06-30 13:22:16 -07:00
Martijn van Groningen 906aed4a88
Add data stream support to put mapping and update index settings APIs. (#58231)
Change update index setting and put mapping api
to execute on all backing indices if data stream is targeted.

Relates #53100
2020-06-30 17:23:27 +02:00
Lee Hinman 3b68df2355
Add default composable templates for new indexing strategy (#57629)
This commit adds the component and composable templates, as well as ILM policies, for the new
default indexing strategy. It installs:

- logs-default-mappings (component)
- logs-default-settings (component)
- logs-default-policy (ilm policy)
- logs-default-template (composable template)
- metrics-default-mappings (component)
- metrics-default-settings (component)
- metrics-default-policy (ilm policy)
- metrics-default-template (composable template)

These templates and policies are managed by a new x-pack module, `stack`, and can be disabled by
setting `stack.templates.enabled` to `false`.

These ensure that patterns for the `logs-*-*` and `metrics-*-*` indices are set up to create data
streams with the proper mappings and settings.

This also makes changes to the `IndexTemplateRegistry` to support installing component and
composable templates (previously it supported only legacy templates).

Resolves #56709
2020-06-30 09:19:37 -06:00
Yannick Welsch e4df92815e Adapt BWC after backport of (#58094) 2020-06-30 14:09:03 +02:00
Yannick Welsch 5e345e115b
Add index block api (#58094)
Adds an API for putting an index block in place, which also ensures for write blocks that, once successfully returning to
the user, all shards of the index are properly accounting for the block, for example that all in-flight writes to an index have
been completed after adding the write block.

This API allows coordinating more complex workflows, where it is crucial that an index is no longer receiving writes after
the API completes, useful for example when marking an index as read-only during an upgrade in order to reindex its
documents.
2020-06-30 09:33:15 +02:00
Julie Tibshirani 676893a263
Merge mappings for composable index templates (#58521)
This PR implements recursive mapping merging for composable index templates.

When creating an index, we perform the following:
* Add each component template mapping in order, merging each one in after the
last.
* Merge in the index template mappings (if present).
* Merge in the mappings on the index request itself (if present).

Some principles:
* All 'structural' changes are disallowed (but everything else is fine). An
object mapper can never be changed between `type: object` and `type: nested`. A
field mapper can never be changed to an object mapper, and vice versa.
* Generally, each section is merged recursively. This includes `object`
mappings, as well as root options like `dynamic_templates` and `meta`. Once we
reach 'leaf components' like field definitions, they always overwrite an
existing one instead of being merged.
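
The merge rules above can be sketched roughly as follows. This is an illustrative Python model, not the actual Elasticsearch implementation: `merge_mappings` and `is_object` are hypothetical names, the input is assumed to be a plain `properties` dictionary, and the error messages are invented.

```python
def is_object(definition):
    """Treat anything with sub-properties or an object/nested type as an object mapper."""
    return "properties" in definition or definition.get("type") in ("object", "nested")


def merge_mappings(base, incoming, path=""):
    """Recursively merge the `incoming` properties into a copy of `base`."""
    merged = dict(base)
    for name, new_def in incoming.items():
        full_name = f"{path}.{name}" if path else name
        old_def = merged.get(name)
        if old_def is None:
            merged[name] = new_def
            continue
        # Structural change: a field mapper can never become an object mapper, or vice versa.
        if is_object(old_def) != is_object(new_def):
            raise ValueError(f"cannot change [{full_name}] between field and object")
        if is_object(old_def):
            # Structural change: `object` and `nested` are not interchangeable.
            if old_def.get("type", "object") != new_def.get("type", "object"):
                raise ValueError(f"cannot change [{full_name}] between object and nested")
            combined = {**old_def, **new_def}
            combined["properties"] = merge_mappings(
                old_def.get("properties", {}), new_def.get("properties", {}), full_name
            )
            merged[name] = combined
        else:
            # Leaf field definitions overwrite the existing one instead of being merged.
            merged[name] = new_def
    return merged
```

Applying this once per component template, then for the index template, then for the index request gives the ordering described in the list above.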

Relates to #53101.
2020-06-29 15:00:40 -07:00
Enrico Zimuel 9a7a28958a
Added PERL reserved words in REST keywords (#58535) 2020-06-26 12:11:22 +02:00
Dan Hermann edc15d7c90
Add data stream support to open index API (#58487) 2020-06-25 09:13:53 -05:00
Dan Hermann 603c5f7a48
Data stream support for get field mappings API (#58488)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-06-25 08:08:51 -05:00
Dan Hermann 991f635c7f
Data stream support for search shards API (#58486) 2020-06-25 07:57:56 -05:00
Rory Hunter 48c9a0776b Update rest-api-spec keyword list
Follow-up to 35aecf4c9a. Somehow I missed the fact that there's an ILM
API named `retry`, which is a keyword in Ruby. I've removed it from the
keywords list.
2020-06-25 09:53:16 +01:00
Rory Hunter 35aecf4c9a
Validate that REST API names do not contain keywords (#58452)
If an API name (or components of a name) overlaps with a reserved word in
the programming language for an ES client, then it's possible that the code
that is generated from the API will not compile. This PR adds validation to
check for such overlaps.
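
A minimal sketch of this kind of check, in Python. It assumes that the "components" of an API name are its dot-separated parts; the `RESERVED` set and the `keyword_conflicts` helper are hypothetical and far smaller than any real keyword list.

```python
# Hypothetical, deliberately tiny keyword list for illustration only.
RESERVED = {"import", "class", "return", "for"}


def keyword_conflicts(api_name, reserved=RESERVED):
    """Return the components of `api_name` that collide with a reserved word."""
    return [part for part in api_name.split(".") if part in reserved]
```

For example, `keyword_conflicts("dangling_indices.import")` flags the `import` component, while `keyword_conflicts("indices.create")` returns an empty list.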
2020-06-25 09:47:05 +01:00
Martijn van Groningen 0166e0e5a3
Re-enable data streams yaml tests in bwc mode (#58403) 2020-06-24 14:55:57 +02:00
James Dorfman e99d287fbb
Add Variable Width Histogram Aggregation (#42035)
Implements a new histogram aggregation called `variable_width_histogram` which
dynamically determines bucket intervals based on document groupings. These
groups are determined by running a one-pass clustering algorithm on each shard
and then reducing each shard's clusters using an agglomerative
clustering algorithm.

This PR addresses #9572.

The shard-level clustering is done in one pass to minimize memory overhead. The
algorithm was lightly inspired by
[this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches
a small number of documents to sample the data and determine initial clusters.
Subsequent documents are then placed into one of these clusters, or a new one
if they are an outlier. This algorithm is described in more detail in the
aggregation's docs.

At reduce time, a
[hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering)
algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304)
continually merges the closest buckets from all shards (based on their
centroids) until the target number of buckets is reached.

The final values produced by this aggregation are approximate. Each bucket's
min value is used as its key in the histogram. Furthermore, buckets are merged
based on their centroids and not their bounds. So it is possible that adjacent
buckets will overlap after reduction. Because each bucket's key is its min,
this overlap is not shown in the final histogram. However, when such overlap
occurs, we set the key of the bucket with the larger centroid to the midpoint
between its minimum and the smaller bucket’s maximum:
`min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to
increase the accuracy of the clustering.
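
The reduce step and the key heuristic can be sketched as below. This is a simplified Python model under stated assumptions, not the aggregation's real code: buckets are plain dictionaries, the merged centroid is assumed to be the doc-count-weighted mean, and `reduce_buckets` and `bucket_keys` are hypothetical names.

```python
def reduce_buckets(buckets, target):
    """Merge the two closest buckets (by centroid) until `target` buckets remain."""
    buckets = sorted(buckets, key=lambda b: b["centroid"])
    while len(buckets) > target:
        # In one dimension the closest pair is always adjacent once sorted by centroid.
        i = min(
            range(len(buckets) - 1),
            key=lambda k: buckets[k + 1]["centroid"] - buckets[k]["centroid"],
        )
        a, b = buckets[i], buckets[i + 1]
        count = a["doc_count"] + b["doc_count"]
        merged = {
            "min": min(a["min"], b["min"]),
            "max": max(a["max"], b["max"]),
            # Assumption: the new centroid is weighted by document count.
            "centroid": (a["centroid"] * a["doc_count"] + b["centroid"] * b["doc_count"]) / count,
            "doc_count": count,
        }
        buckets[i : i + 2] = [merged]
    return buckets


def bucket_keys(buckets):
    """Use each bucket's min as its key, moving it to the midpoint when buckets overlap."""
    keys = []
    previous_max = None
    for bucket in buckets:  # expected to be sorted by centroid
        key = bucket["min"]
        if previous_max is not None and key < previous_max:
            key = (bucket["min"] + previous_max) / 2
        keys.append(key)
        previous_max = bucket["max"]
    return keys
```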

Nodes are unable to share centroids during the shard-level clustering phase. In
the future, resolving https://github.com/elastic/elasticsearch/issues/50863
would let us solve this issue. 

It doesn’t make sense for this aggregation to support the `min_doc_count`
parameter, since clusters are determined dynamically. The `order` parameter is
not supported here to keep this large PR from becoming too complex.
2020-06-23 09:26:54 -04:00
Martijn van Groningen 085ba99fba
Keep track of timestamp_field mapping as part of a data stream (#58096)
Relates to #53100

* use mapping source directly instead of using mapper service to extract the relevant mapping details
* moved assertion to TimestampField class and added helper method for tests
* Improved logic that inserts timestamp field mapping into a mapping.
If the timestamp field path consisted of object fields and
if the final mapping did not contain the parent field then an error
occurred, because the prior logic assumed that the object field existed.
2020-06-22 12:01:01 +02:00
Jim Ferenczi b0f4024879
Adapt bwc version after backport of #58299 (#58300)
This commit adapts the bwc version in preparation of the backport
to 7.x. The bwc tests are disabled in order to allow the merge of
#58299.

Relates #58299
2020-06-18 10:22:54 +02:00
Rory Hunter 1f6c953194
Rename dangling index APIs (#58266)
The dangling_indices.import API name could cause issues in the client
libs because import is a reserved word in many languages. Rename the
API to avoid this, and rename the other APIs for consistency.

Related to #48366.
2020-06-18 08:57:39 +01:00
Jim Ferenczi 90c9b95ca0
Allow index filtering in field capabilities API (#57276)
* Add index filtering in field capabilities API

This change allows using an `index_filter` in the
field capabilities API. Indices are filtered from
the response if the provided query rewrites to `match_none`
on every shard:

````
GET metrics-*
{
  "index_filter": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gt": "2019"
            }
          }
        }
      ]
    }
  }
}
````

The filtering is done on a best-effort basis: it uses the can match phase
to rewrite queries to `match_none` instead of fully executing the request.
The first shard that can match the filter is used to create the field
capabilities response for the entire index.
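
As a toy model of the filtering rule (not the real can-match implementation), the Python sketch below treats each shard as the maximum `@timestamp` it holds and assumes a `range` filter with `gt` rewrites to `match_none` when that maximum is not above the bound. The function names and the integer timestamps are hypothetical.

```python
def shard_can_match(shard_max_timestamp, gt):
    """A `gt` range filter cannot match a shard whose newest document is not above the bound."""
    return shard_max_timestamp > gt


def filter_indices(indices, gt):
    """Keep an index if at least one of its shards can match the filter."""
    return [
        name
        for name, shard_maxima in indices.items()
        if any(shard_can_match(maximum, gt) for maximum in shard_maxima)
    ]
```

An index is dropped from the response only when every one of its shards fails the check, which mirrors the "rewrites to `match_none` on every shard" condition.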

Closes #56195
2020-06-17 22:53:53 +02:00
Rory Hunter ebe8951879
Implement dangling indices API (#50920)
Part of #48366. Implement an API for listing, importing and deleting dangling
indices.

Co-authored-by: David Turner <david.turner@elastic.co>
2020-06-16 15:19:17 +01:00
Dan Hermann cce279bbb3
Prohibit clone, shrink, and split on a data stream's write index (#58104) 2020-06-16 08:37:48 -05:00
Nik Everett 7c7fe0152d
Save memory when auto_date_histogram is not on top (#57304)
This builds an `auto_date_histogram` aggregator that natively aggregates
from many buckets and uses it when the `auto_date_histogram` used to use
`asMultiBucketAggregator` which should save a significant amount of
memory in those cases. In particular, this happens when
`auto_date_histogram` is a sub-aggregator of a multi-bucketing aggregator
like `terms` or `histogram` or `filters`. For the most part we preserve
the original implementation when `auto_date_histogram` only collects from
a single bucket.

It isn't possible to "just port the aggregator" without taking a pretty
significant performance hit because we used to rewrite all of the
buckets every time we switched to a coarser and coarser rounding
configuration. Without some major surgery to how to delay sub-aggs
we'd end up rewriting the delay list zillions of times if there are many
buckets.

The multi-bucket version of the aggregator has a "budget" of "wasted"
buckets and only rewrites all of the buckets when we exceed that budget.
Now that we don't rebucket every time we increase the rounding we can no
longer get an accurate count of the number of buckets! So instead the
aggregator uses an estimate of the number of buckets to trigger switching
to a coarser rounding. This estimate is likely to be *terrible* when
buckets are far apart compared to the rounding. So it also uses the
difference between the first and last bucket to trigger switching to a
coarser rounding. Which covers for the shortcomings of the bucket
estimation technique pretty well. It also causes the aggregator to emit
fewer buckets in cases where they'd be reduced together on the
coordinating node. This is wonderful! But probably fairly rare.

All of that does buy us some speed improvements when the aggregator is
a child of multi-bucket aggregator:
Without metrics or time zone: 25% faster
With metrics: 15% faster
With time zone: 22% faster

Relates to #56487
2020-06-15 14:33:31 -04:00
Dan Hermann e515adb07e
Fix REST test for resolve index API (#58043) 2020-06-12 13:14:58 -05:00
Dan Hermann f9f39d75fa
Mute failing REST tests with correct syntax (#58048) 2020-06-12 09:28:00 -05:00
Dan Hermann 780603d9f7
Mute failing REST tests 2020-06-12 08:54:12 -05:00
Dan Hermann 9724fa9dc8
Resolve index API (#57626) 2020-06-12 06:25:16 -05:00
Martijn van Groningen eb6f46a342
Enforce valid field mapping exists for timestamp_field in templates. (#57741)
Relates to #53100
2020-06-12 13:22:20 +02:00
Martijn van Groningen 01b70b4068
Prohibit append-only writes targeting backing indices directly. (#57788)
Append-only writes can only target the corresponding data stream.

Relates to #53100
2020-06-11 11:29:27 +02:00
Russ Cam a85f2bede8
Mark Component and Index template APIs as experimental (#57910)
This commit marks the Component Template and
Index Template APIs as experimental.
2020-06-10 14:06:32 +10:00
Lee Hinman c688eb69c7
Disallow merging existing mapping field definitions in templates (#57701)
* Disallow merging existing mapping field definitions in templates

This commit changes the merge strategy introduced in #55607 and #55982. Instead of overwriting these
fields, we now prevent them from being merged with an exception when a user attempts to
overwrite a field.

As part of this, a more robust validation has been added. The existing validation checked whether
templates (composable and component) were valid on their own, this new validation now checks that
the composite template (mappings/settings/aliases) is valid. This means that when a composable
template is added or updated, we confirm that it is valid with its component pieces. When a
component template is updated we ensure that all composable templates that make use of the component
template continue to be valid before allowing the component template to be updated.

This change also necessitated changes in the tests, however, I have left tests that exercise mapping
merging with nested object fields as `@AwaitsFix`, as we intend to change the behavior soon to allow
merging in a recursive-with-replacement fashion (see: #57393). I have added tests that check the new
disallowing behavior in the meantime.

* Use functional instead of imperative prefix collection

* Use IndexService.withTempIndexService

* Rename tests

* Fix tests

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-06-08 08:57:13 -06:00
Dan Hermann 904bdae9ff
Change default backing index naming scheme (#57721) 2020-06-08 08:39:55 -05:00
Dan Hermann c4334ee074
Prohibit closing the write index for a data stream (#57692) 2020-06-05 10:00:13 -05:00
Nik Everett 1c7bd29f4c
update skip after backport of #57397 (#57694) 2020-06-04 15:37:53 -04:00
Nik Everett 4b9e378d4a Bump skip before backport 2020-06-04 12:16:20 -04:00
Nik Everett 69cd4435b2
Merge remaining sig_terms into terms (#57397)
Merges the remaining implementation of `significant_terms` into `terms`
so that we can more easily make them work properly without
`asMultiBucketAggregator` which *should* save memory and speed them up.

Relates #56487
2020-06-04 11:22:03 -04:00
Russ Cam f77005a0e2
Update snapshot.delete.json to make snapshot a list (#57326)
Relates: elastic/elasticsearch#55474

This commit updates the snapshot.delete.json REST API spec
to make snapshot a list type, now that it can accept a
list of comma-separated snapshot names
2020-06-03 09:48:51 +10:00
Nik Everett 474a3fc49f
Update skip after backport of #57438 (#57550) 2020-06-02 16:22:03 -04:00
Nik Everett b072f5f002
Fix an optimization in terms agg (#57438)
When the `terms` agg runs against strings and uses global ordinals it
has an optimization when it collects segments that only ever have a
single value for the particular string. This is *very* common. But I
broke it in #57241. This fixes that optimization and adds `debug`
information that you can use to see how often we collect segments of
each type. And adds a test to make sure that I don't break the
optimization again.

We also had a specialization for when there isn't a filter on the terms
to aggregate. I had removed that specialization in #57241 which resulted
in some slow down as well. This adds it back but in a more clear way.
And, hopefully, a way that is marginally faster when there *is* a
filter.

Closes #57407
2020-06-02 13:57:27 -04:00
Nik Everett 27bff25cf8
Update skip after backport of #57277 (#57379) 2020-05-29 16:20:07 -04:00
Nik Everett 460b204f8e
Save memory when histogram agg is not on top (#57277)
This saves some memory when the `histogram` aggregation is not a top
level aggregation by dropping `asMultiBucketAggregator` in favor of
natively implementing multi-bucket storage in the aggregator. For the
most part this just uses the `LongKeyedBucketOrds` that we built the
first time we did this.
2020-05-29 09:54:47 -04:00
Nik Everett d0a253db5b
Update skip after backport of #57241 (#57316) 2020-05-29 08:03:34 -04:00
Martijn van Groningen 9d07229879
Change cluster info actions to be able to resolve data streams. (#56878)
With this change the following APIs will be able to resolve data streams:
get index, get mappings and ilm explain APIs.

Relates to #53100
2020-05-29 11:04:55 +02:00
Russ Cam 0b041cccd8
Deprecate local param in get_mapping.json (#57265)
Relates: elastic/elasticsearch#55014

This commit deprecates the local param in get_mapping.json.
This parameter is a no-op and field mappings are always retrieved locally.
2020-05-29 12:24:44 +10:00
Nik Everett 29e9e79656 Update skip before backport
I accidentally didn't put the customary "skip the last version" on
 #57241 and the PR tests didn't catch it. This adds it.
2020-05-28 16:01:37 -04:00
Martijn van Groningen 9f6bc6856b
Re-enable data stream bwc tests (#57293)
after merging #57275
2020-05-28 21:36:03 +02:00
Nik Everett 974d236fbc
Make global ords terms simpler to understand (#57241)
When the `terms` agg operates on non-numeric data it can collect it via
global ordinals. It actually has two separate collection strategies,
one "dense" and one "remapping". Each of *those* strategies has two
"iteration" strategies that it uses to build buckets, depending on
whether or not we need buckets with `0` docs in them. Previously this
was done with several `null` checks and never really explained. This
change replaces those checks with two `CollectionStrategy` classes which
have good stuff like documentation.
2020-05-28 15:29:31 -04:00
Christoph Büscher 3d4f9fedaf
Check for negative "from" values in search request body (#54953)
Today we already disallow negative values for the "from" parameter in the search
API when it is set as a request parameter and setting it on the
SearchSourceBuilder, but it is still parsed without complaint from a search
body, leading to differing exceptions later. This PR changes this behavior to be
the same regardless of setting the value directly, as url parameter or in the
search body. While we silently accepted "-1" as meaning "unset" and used the
default value of 0 so far, any negative from-value is now disallowed.
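
A rough Python sketch of applying one check to every entry point. The helper names, the error message, and the choice that a body value overrides a URL value are all assumptions made for illustration.

```python
def validate_from(value, source):
    """Reject negative `from` values no matter where they were supplied."""
    if value < 0:
        raise ValueError(f"[from] parameter cannot be negative, found [{value}] in {source}")
    return value


def resolve_from(url_param=None, body=None):
    """Resolve `from` from a URL parameter and/or a search body, defaulting to 0."""
    result = 0
    if url_param is not None:
        result = validate_from(int(url_param), "url parameter")
    # Assumption for this sketch: a value in the body takes precedence.
    if body is not None and "from" in body:
        result = validate_from(int(body["from"]), "search body")
    return result
```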

Closes #54897
2020-05-28 16:25:19 +02:00
Martijn van Groningen f8b090b641
Ensure template exists when creating data stream (#56888)
Limit the creation of data streams only for namespaces that have a composable template with a data stream definition.

This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover.

Also remove `timestamp_field` parameter from create data stream request and
let the create data stream api resolve the timestamp field
from the data stream definition snippet inside a composable template.

Relates to #53100
2020-05-28 13:11:15 +02:00
Dan Hermann 7a67395807
Limit _cat/indices test to versions with fix (#57244) 2020-05-27 16:06:33 -05:00
Nik Everett 69661252fc
Update skip after backport of #56789 (#57238) 2020-05-27 16:30:31 -04:00
Lee Hinman 4dc32611fc
Rename template V2 classes to ComposableTemplate (#57183)
This PR changes the name of the Index Template V2 classes to "Composable Templates", it also ensures there are no mentions of "V2" in the documentation or error/warning messages. V1 templates are referred to as "legacy" templates.

Resolves #56609
2020-05-27 09:32:10 -06:00
Nik Everett 9aaab6efdd
Save memory on numeric sig terms when not top (#56789)
This saves memory when running numeric significant terms which are not
at the top level by merging its collection into numeric terms and relying
on the optimization that we made in #55873.
2020-05-27 10:53:09 -04:00
Russ Cam 38a17f299f
Update track_total_hits to union type (#51846)
* Update track_total_hits to union type

This commit updates track_total_hits parameter type to a union
of boolean and number, to reflect the possible values that can
be passed.

* Update rest-api-spec/src/main/resources/rest-api-spec/api/search.json

Co-Authored-By: Karel Minarik <karel.minarik@gmail.com>

* Update rest-api-spec/src/main/resources/rest-api-spec/api/search.json

Co-authored-by: Karel Minarik <karel.minarik@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-05-27 12:25:45 +10:00
James Rodewig eae4a1c953
[DOCS] Add delete snapshot repo API docs (#57043)
Changes:

* Adds API reference docs for the delete snapshot repo API.

* Corrects an error in the delete snapshot repo API spec. Comma-separated
repository names are not supported.

* Relocates the existing delete snapshot repo API example docs.
2020-05-21 13:59:48 -04:00
Nik Everett 7790e815fe
Update skip after backport of #56921 (#56974) 2020-05-20 11:20:20 -04:00
Dan Hermann 4eb9d18b72
Handle exceptions when building _cat/indices response (#56993) 2020-05-20 09:44:40 -05:00
Nik Everett bea2341c9e
Save memory when date_histogram is not on top (#56921)
When `date_histogram` is a sub-aggregator it used to allocate a bunch of
objects for every one of its parent's buckets. This uses the data
structures that we built in #55873 to rework the `date_histogram`
aggregator instead of all of the allocation.

Part of #56487
2020-05-19 13:48:25 -04:00
Ioannis Kakavas 127646c496
Adjust version mute for reload secure settings (#56938)
We can safely run the reload_secure_settings tests
after 7.7.0, since the relevant changes have long been
backported there
2020-05-19 18:13:44 +03:00
Tomas Della Vedova f5360fc984
[DOCS] Fix component template API link in JSON specs (#56884) 2020-05-19 09:59:06 -04:00
Ioannis Kakavas 9ae9bdc9a3
Adjust reload keystore test to pass in FIPS (#56889)
In KeystoreWrapper class we determine if the error to decrypt a
given keystore is caused by a wrong password based on the exception
that the SunJCE implementation of AES is
throwing(AEADBadTagException). Other implementations from other
Security Providers fail with a different exception and as such we
cannot differentiate between a corrupted file and a wrong password
in a foolproof way.
As in other tests such as in
KeyStoreWrapperTests#testDecryptKeyStoreWithWrongPassword
we handle this by matching both possible exception messages.
2020-05-19 15:25:45 +03:00
Lee Hinman d3ccada06f
Add template simulation API for simulating template composition (#56842)
This adds an API for simulating template composition with or without an index template.

It looks like:

```
POST /_index_template/_simulate/my-template
```

To simulate a template named `my-template` that already exists, or, to simulate a template that does
not already exist:

```
POST /_index_template/_simulate
{
  "index_patterns": ["my-index"],
  "composed_of": ["ct1", "ct2"]
}
```

This is related to #55686, which adds an API to simulate composition based on an index name (hence
the `_simulate_index` vs `_simulate`).

This commit also adds reference documentation for both simulation APIs.

Relates to #53101
Resolves #56390
Resolves #56255
2020-05-18 15:11:42 -06:00
Dan Hermann a62483f7b5
Rename endpoint from plural "_data_streams" to singular "_data_stream" (#56762) 2020-05-15 08:23:43 -05:00
Ryan Ernst c0ee68b0a0
Move publishing configuration to a separate plugin (#56727)
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
2020-05-14 18:56:59 -07:00
Lee Hinman cad030d8d7
Don't allow invalid template combinations (#56397)
This commit removes the ability to put V2 index templates that reference missing component templates.
It also prevents removing component templates that are being referenced by an existing V2 index
template.
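
The two rules can be modelled with a small in-memory registry. This Python sketch is purely illustrative: `TemplateRegistry`, its methods, and the error messages are hypothetical and do not mirror Elasticsearch classes.

```python
class TemplateRegistry:
    """Toy registry that keeps index templates and component templates consistent."""

    def __init__(self):
        self.components = {}
        self.index_templates = {}

    def put_component(self, name, body):
        self.components[name] = body

    def put_index_template(self, name, composed_of):
        # Rule 1: an index template may not reference missing component templates.
        missing = [c for c in composed_of if c not in self.components]
        if missing:
            raise ValueError(f"index template [{name}] references missing components {missing}")
        self.index_templates[name] = list(composed_of)

    def delete_component(self, name):
        # Rule 2: a component template that is still referenced may not be removed.
        users = sorted(t for t, parts in self.index_templates.items() if name in parts)
        if users:
            raise ValueError(f"component template [{name}] is still used by {users}")
        del self.components[name]
```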

Relates to #53101
Resolves #56314
2020-05-14 15:33:35 -06:00
Nik Everett f433fd472c
Update skip after backport of #56208 (#56719) 2020-05-13 17:36:31 -04:00
Nik Everett 4a8d93f55b
Add list of deferred aggregations to the profiler (#56208)
This adds a few things to the `breakdown` of the profiler:
* `histogram` aggregations now contain `total_buckets` which is the
  count of buckets that they collected. This could be useful when
  debugging a histogram inside of another bucketing agg that is fairly
  selective.
* All bucketing aggs that can delay their sub-aggregations will now add
  a list of delayed sub-aggregations. This is useful because we
  sometimes have fairly involved logic around which sub-aggregations get
  delayed and this will save you from having to guess.
* Aggregations wrapped in the `MultiBucketAggregatorWrapper` can't
  accurately add anything to the breakdown. Instead the wrapper
  adds a marker entry `"multi_bucket_aggregator_wrapper": true` so we
  can quickly pick out such aggregations when debugging.

It also fixes a bug where `_count` breakdown entries were contributing
to the overall `time_in_nanos`. They didn't add a large amount of time
so it is unlikely that this caused a big problem, but I was there.

To support the arbitrary breakdown data this reworks the profiler so
that the `breakdown` can contain any data that is supported by
`StreamOutput#writeGenericValue(Object)` and
`XContentBuilder#value(Object)`.
2020-05-13 08:30:38 -04:00
Martijn van Groningen e9cc3de173
Fix allowed warning in data stream rest test. (#56630) 2020-05-12 21:02:37 +02:00
Jake Landis 525522e187
json spec: allow null for documentation url (#55749)
This commit allows the JSON schema's documentation.url property to have a null value.
This can be useful for cases where a feature is under development, and does not have
documentation published yet.

This commit also adds a documentation.url for two ml resources.
2020-05-12 12:51:24 -05:00
Martijn van Groningen c4082384db
Enable bwc tests after backporting index templates v2 data stream integration (#56615)
Relates to #55377
2020-05-12 18:10:20 +02:00
James Rodewig 3ebbf895f2
[DOCS] Add clean up snapshot repository API docs (#56519) 2020-05-12 08:56:29 -04:00
Martijn van Groningen 74e2c01138
Auto create data streams using index templates v2 (#55377)
This commit adds the ability to auto create data streams using index templates v2.
Index templates (v2) now have a data_stream field that includes a timestamp field.
If it is provided and the index name matches that template, then a data stream
(plus first backing index) is auto created.

Relates to #53100
2020-05-12 13:42:59 +02:00
Lee Hinman fc708ccca4
Remove prefer_v2_templates query string parameter (#56546)
This commit removes the `prefer_v2_templates` flag and setting. This was a brief setting that
allowed specifying whether V1 or V2 template should be used when an index is created. It has been
removed in favor of V2 templates always having priority.

Relates to #53101
Resolves #56528

This is not a breaking change because this flag was never in a released version.
2020-05-11 14:56:48 -06:00
Nik Everett 7c367de13b
Update skip after backport of #56252 (#56379) 2020-05-07 15:41:45 -04:00
Nik Everett 923fc988ad
Fix auto_date_histogram interval (#56252)
`auto_date_histogram` was returning the incorrect `interval` because
of a combination of two things:
1. When pipeline aggregations rewrote `auto_date_histogram` we reset the
   interval to 1. Oops. Fixed that.
2. *Every* bucket aggregation was rewriting its buckets as though there
   was a pipeline aggregation even if there aren't any. This is a bit
   silly so we skip that too.

Closes #56116
2020-05-07 08:19:53 -04:00
Przemko Robakowski 7ca47f52e8
Add prefer_v2_templates parameter to Reindex (#56253)
* prefer_v2_templates for reindex
2020-05-06 22:01:13 +02:00
Jake Landis 32269f1a6d
_cat/threadpool remove "size" and add "time" params (#55736)
The rest spec and documentation for _cat/threadpool supports a "size" parameter.
However, the "size" parameter will have no impact since there are no values
of type "SizeValue" of the return value of this _cat api.

This commit removes the "size" param from the spec and documentation.

This commit also adds support for the "time" param, which is used to
format the "keep_alive" column. By default, the output
should not change since the "TimeValue" rendered default (via RestTable)
is toString(), and the code prior to this also called toString().

closes #54478
2020-05-06 14:26:08 -05:00
Dan Hermann 117055d49e
Get index includes parent data stream for backing indices (#56022) 2020-05-05 13:40:15 -05:00
Jake Landis e392ce939a
deprecate size from cat.thread_pool in json spec (#55984) 2020-04-30 11:36:20 -05:00
Andrei Dan e256becad7
Conditionally run tests asserting overlapping templates (#56028)
Only run the tests verifying the overlapping index templates when there is
no `global` index template (ie. when the default shards are not changed)
2020-04-30 16:07:34 +01:00
Andrei Dan 475790c34e
Add HLRC support for simulate index template api (#55936) 2020-04-30 14:24:46 +01:00
Andrei Dan e3e9782b20
Update template v2 api rest spec (#55948)
This removes the specification of `order`, as it is not a parameter of the
v2 put template api (the priority is the equivalent of `order` and is
defined in the body), and adds a short description of the `cause` parameter
(which is currently used to track the cluster update task).
2020-04-30 10:52:52 +01:00
Andrei Dan 1a5845edce
Add simulate template composition API _index_template/_simulate_index/{name} (#55686)
This adds a new api to simulate matching the given index name against the
 index templates in the system.

The syntax for the new API takes the following form:

POST _index_template/_simulate_index/{index_name}
{
  "index_patterns": ["logs-*"],
  "priority": 15,
  "template": {
    "settings": {
      "number_of_shards": 3
    }
    ...
  }
}

Where the body is optional, but we support the entire body used by the
PUT _index_template/{name} api. When the body is specified we'll simulate
matching the given index against a system that'd have the given index
template together with the index templates that exist in the system.

The response, in both cases, will return the matching template's resolved
settings, mappings and aliases, together with a special field that'll print any
overlapping templates and their corresponding index patterns.
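
The matching described above might be modelled like this. The Python sketch assumes the highest `priority` wins and uses `fnmatch.fnmatchcase` as a stand-in for index pattern matching (it also understands `?` and `[]`, which is broader than a plain `*` wildcard); `simulate_index` and the template dictionaries are hypothetical.

```python
from fnmatch import fnmatchcase


def simulate_index(index_name, templates):
    """Return the winning template name plus the other templates that also match."""
    matching = [
        (name, template)
        for name, template in templates.items()
        if any(fnmatchcase(index_name, pattern) for pattern in template["index_patterns"])
    ]
    if not matching:
        return None, []
    # Assumption: the template with the highest priority is the one applied.
    matching.sort(key=lambda item: item[1].get("priority", 0), reverse=True)
    winner, *rest = matching
    overlapping = [
        {"name": name, "index_patterns": template["index_patterns"]}
        for name, template in rest
    ]
    return winner[0], overlapping
```

Simulating a body that is not yet stored amounts to adding it to `templates` under a temporary name before calling the function.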
2020-04-29 11:27:15 +01:00
David Roberts e0f38896fd
[ML] Adjust BWC after daily_model_snapshot_retention_after_days backport (#55911)
Simplifying BWC code after merging #55891
2020-04-29 10:49:40 +01:00
zacharymorn 498fc66cc8
Add API specs for voting config exclusions (#55760)
Closes #48131
2020-04-29 08:34:10 +01:00
Lee Hinman 3fc17b1b55
Adjust skip version for _cat/templates yml tests (#55871)
Now that #55829 has been backported (#55866) we can adjust these skip versions to allow testing with
7.8+.

Relates to #53101
2020-04-28 11:05:00 -06:00
Dan Hermann 0077714bfe
REST test for rolling data streams (#55802) 2020-04-28 11:52:36 -05:00
Lee Hinman 61acf602fc
Add support for V2 index templates to /_cat/templates (#55829)
This adds support for V2 index templates to the cat templates API. It uses the `order` field as
priority in order not to break compatibility, while adding the `composed_of` field to show component
templates that are used from an index template.

Relates to #53101
2020-04-28 09:25:54 -06:00
Dan Hermann bcf86000e5
Delete index API properly handles backing indices for data streams (#55690) 2020-04-24 16:34:57 -05:00
Zachary Tong 76170eded1 Update version skip after backport 2020-04-24 10:19:03 -04:00
Zachary Tong 9f165bd44e
Aggs must specify a `field` or `script` (or both) (#52226)
* Aggs must specify a `field` or `script` (or both)

This adds a validation to VSParserHelper to ensure that a field or
script or both are specified by the user.  This is technically
required today already, but throws an exception much deeper
in the agg framework and has a very unintuitive error for the user
(as well as eating more resources instead of failing early)

* Fix StringStats test

* Add yaml test

* Skip test on older versions
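
A sketch of the fail-early validation from the first item, in Python. The function name and the error text are hypothetical; only the rule itself (a `field`, a `script`, or both must be present) comes from the description above.

```python
def validate_values_source(agg_name, config):
    """Fail early when an aggregation specifies neither `field` nor `script`."""
    if "field" not in config and "script" not in config:
        raise ValueError(
            f"aggregation [{agg_name}] must specify one of [field, script], but none were given"
        )
    return config
```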

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-04-23 14:26:38 -04:00
Jake Landis 2b0900d33b
Validate REST specs against schema (#55117)
A JSON schema was recently introduced for the REST API specification. #54252
This PR introduces a 3rd party validation tool to ensure that the
REST specification conforms to the schema.

The task is applied to the 3 projects that contain REST API specifications.
The plugin wires this task into the precommit task, and should be
considered as part of the public API for the build tools for any plugin
developer to contribute their plugin's specification.

An ignore parameter has been introduced for the task to allow specific
files to be ignored from the validation. The ignored files in this PR
will soon get issues logged and a link so they can be fixed.

Closes #54314
2020-04-21 18:18:18 -05:00
Fernando Briano 4e9dd2b292
Add skip arbitrary_key to nodes.reload_secure_settings YAML test (#55402) 2020-04-21 16:19:38 +01:00
Lee Hinman 93021f72aa
Adjust serialization versions for prefer_v2_templates flag (#55478)
This adjusts the minimum version for serialization for #55411.

It should only be merged after #55476 has been merged
2020-04-20 14:07:40 -06:00
Lee Hinman 0202e1ae96
Add prefer_v2_templates flag and index setting (#55411)
This commit adds a new querystring parameter on the following APIs:
- Index
- Update
- Bulk
- Create Index
- Rollover

These APIs now support a `?prefer_v2_templates=true|false` flag. This flag determines whether index
creation prefers V2 index templates or V1 templates. This flag defaults to `false` and will be
changed to `true` for 8.0+ in subsequent work.

Additionally, setting this flag internally sets the `index.prefer_v2_templates` index-level setting.
This setting is used so that actions that automatically create a new index (things like rollover
initiated by ILM) will inherit the preference from the original index. This setting is dynamic so
that a transition from v1 to v2 templates can occur for long-running indices grouped by an alias
performing periodic rollover.

This also adds support for sending this parameter to the High Level Rest Client.

Relates to #53101
2020-04-20 10:04:42 -06:00
zhichen 05066aecf0
Add Bulk stats track the bulk per shard (#52208)
* Add Bulk stats track the bulk sizes per shard and the time spent on the bulk shard request (#50536)(#47345)
2020-04-20 11:09:29 +02:00
Dan Hermann 4a8b84349d
Mute data stream YML tests until backport (#55406) 2020-04-17 10:48:43 -05:00
Dan Hermann e1730452e3
Add explicit generation attribute to data streams (#55342) 2020-04-17 09:02:09 -05:00
Martijn van Groningen d649826a94
Re-enable data stream yaml bwc tests. (#55367)
After backporting #55337
2020-04-17 09:44:31 +02:00