Commit Graph

6411 Commits

Author SHA1 Message Date
Lee Hinman 5adbf67c08
Add ILM histore store index (#50287)
* Add ILM histore store index

This commit adds an ILM history store that tracks the lifecycle
execution state as an index progresses through its ILM policy. ILM
history documents store output similar to what the ILM explain API
returns.

An example document with ALL fields (not all documents will have all
fields) would look like:

```json
{
  "@timestamp": 1203012389,
  "policy": "my-ilm-policy",
  "index": "index-2019.1.1-000023",
  "index_age":123120,
  "success": true,
  "state": {
    "phase": "warm",
    "action": "allocate",
    "step": "ERROR",
    "failed_step": "update-settings",
    "is_auto-retryable_error": true,
    "creation_date": 12389012039,
    "phase_time": 12908389120,
    "action_time": 1283901209,
    "step_time": 123904107140,
    "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}",
    "step_info": "{... etc step info here as json ...}"
  },
  "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace"
}
```

These documents go into the `ilm-history-1-00000N` index to provide an
audit trail of the operations ILM has performed.

This history storage is enabled by default but can be disabled by setting
`index.lifecycle.history_index_enabled` to `false.`

Resolves #49180
2019-12-18 16:09:59 -07:00
James Rodewig b8a62ce8f7
[DOCS] Document `thread_pool` node stats (#50330) 2019-12-18 16:57:38 -05:00
lcawl d8a94f0397 [DOCS] Fixes security links 2019-12-18 11:51:03 -08:00
Lisa Cawley 68e02a19d8
[DOCS] Move machine learning results definitions into APIs (#50257) 2019-12-18 09:50:31 -08:00
Igor Motov a26e4d1e5e
Geo: Switch generated WKT to upper case (#50285)
Switches generated WKT to upper case to
conform to the standard recommendation.

Relates #49568
2019-12-18 07:28:56 -10:00
James Rodewig a762c29dcf
[DOCS] Clarify frozen indices are read-only (#50318)
The freeze index API docs state that frozen indices are blocked for
write operations.

While this implies frozen indices are read-only, it does not explicitly
use the term "read-only", which is found in other docs, such as the
force merge docs.

This adds the "ready-only" term to the freeze index API docs as well
as other clarification.
2019-12-18 12:17:41 -05:00
Christoph Büscher 7f90ff64a3
[Docs] Remove `intervals` filter rule from allowed top-level rules (#50320)
The `filter` rule is not allowed on the top-level of the query, so removing it
from the list of allowed rules. Where it can be nested inside other rules, those
rules already mention it.
2019-12-18 17:35:35 +01:00
Adrien Grand 2d627ba757
Add per-field metadata. (#49419)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2019-12-18 17:27:38 +01:00
Kevin Woblick 77d94caa70 [DOCS] Add warning about Docker port exposure (#50169)
Docker bypasses the Uncomplicated Firewall (UFW) on Linux by editing the `iptables` config directly, which leads to the exposure of port 9200, even if you blocked it via UFW.

This adds a warning along with work-arounds to the docs.

Signed-off-by: Kovah <mail@kovah.de>
2019-12-18 09:03:44 -05:00
István Zoltán Szabó 50e26d40a2
[DOCS] Adds GET, GET stats and DELETE inference APIs (#50224)
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
2019-12-18 09:10:12 +01:00
Lisa Cawley 6d608e6a0d
[DOCS] Move transform resource definitions into APIs (#50108) 2019-12-17 09:01:31 -08:00
James Rodewig 230d4765d3
[DOCS] Add identifier mapping tip to numeric and keyword datatype docs (#49933)
Users often mistakenly map numeric IDs to numeric datatypes. However,
this is often slow for the `term` and other term-level queries.

The "Tune for search speed" docs includes advice for mapping numeric
IDs to `keyword` fields. However, this tip is not included in the
`numeric` or `keyword` field datatype doc pages.

This rewords the tip in the "Tune for search speed" docs, relocates it
to the `numeric` field docs, and reuses it using tagged regions.
2019-12-17 09:31:07 -05:00
Jim Ferenczi 804a5042e7
Optimize composite aggregation based on index sorting (#48399)
Co-authored-by: Daniel Huang <danielhuang@tencent.com>

This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification.
In such case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue.
The optimization is also applicable when a query other than match_all is provided. However the optimization is deactivated for sources that match the index sort in the following cases:
  * Multi-valued source, in such case early termination is not possible.
  * missing_bucket is set to true
2019-12-17 14:02:06 +01:00
Lisa Cawley 207094cd67
[DOCS] Moves model snapshot resource definitions into APIs (#50157)
Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>
2019-12-16 10:42:30 -08:00
Ignacio Vera 7c559be31c
"CONTAINS" support for BKD-backed geo_shape and shape fields (#50141)
Lucene 8.4 added support for "CONTAINS", therefore in this commit those
changes are integrated in Elasticsearch. This commit contains as well a
bug fix when querying with a geometry collection with "DISJOINT" relation.
2019-12-16 07:43:42 +01:00
James Rodewig 6749b63ace
[DOCS] Add `index-extra-title-page.html` for direct HTML migration (#50189) 2019-12-13 12:44:12 -05:00
Tim Brooks 0cedb9e251
Update remote cluster stats to support simple mode (#49961)
Remote cluster stats API currently only returns useful information if
the strategy in use is the SNIFF mode. This PR modifies the API to
provide relevant information if the user is in the SIMPLE mode. This
information is the configured addresses, max socket connections, and
open socket connections.
2019-12-13 09:16:53 -07:00
James Rodewig 9907b0aab8
[DOCS] Reformat token count limit filter docs (#49835) 2019-12-13 08:43:35 -05:00
James Rodewig ff2259bf81
[DOCS] Document JVM node stats (#49500)
* [DOCS] Document JVM node stats

Documents the `jvm` parameters returned by the `_nodes/stats` API.

Co-Authored-By: James Baiera <james.baiera@gmail.com>
2019-12-12 15:41:20 -05:00
Lisa Cawley d442ff9223
[DOCS] Updates transform screenshots and text (#50059) 2019-12-12 08:20:39 -08:00
James Rodewig 4dfc07c922
[DOCS] Reformat lowercase token filter docs (#49935) 2019-12-12 09:39:06 -05:00
James Rodewig 2d9ee5ddfe
[DOCS] Correct percentile rank agg example response (#50052)
The example snippets in the percentile rank agg docs use a test dataset
named `latency`, which is generated from docs/gradle.build.

At some point the dataset and example snippets were updated, but the
text surrounding the snippets was not. This means the text and the
example snippets shown no longer match up.

This corrects that by changing the snippets using /TESTRESPONSE magic comments.
2019-12-12 08:38:48 -05:00
István Zoltán Szabó 3857e3d94f
[DOCS] Moves data frame analytics job resource definitions into APIs (#50021) 2019-12-12 10:59:37 +01:00
Lisa Cawley ca482127fa
[DOCS] Move job count resource definitions into API (#50057)
Co-Authored-By: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-Authored-By: David Roberts <dave.roberts@elastic.co>
Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>
2019-12-11 11:17:15 -08:00
Lisa Cawley 3d96e6b68e
[DOCS] Move datafeed resource definitions into APIs (#50005)
Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>
2019-12-11 09:50:41 -08:00
Przemko Robakowski 64e1a774fc
CSV ingest processor (#49509)
* CSV Processor for Ingest

This change adds new ingest processor that breaks line from CSV file into separate fields.
By default it conforms to RFC 4180 but can be tweaked.

Closes #49113
2019-12-11 14:52:04 +01:00
Patryk Krawaczyński de4f701a19 [DOCS] Document `index.queries.cache.enabled` as a static setting (#49886) 2019-12-10 14:23:14 -05:00
Peter Johnson 263f5bd6b6 [Docs] Fix typo in function-score-query.asciidoc (#50030) 2019-12-10 17:33:36 +01:00
Adrien Grand 1329acc094
Upgrade to lucene 8.4.0-snapshot-662c455. (#50016)
Lucene 8.4 is about to be released so we should check it doesn't cause problems
with Elasticsearch.
2019-12-10 17:09:36 +01:00
Lisa Cawley 3e6dc03de6
[DOCS] Removes realm type security setting (#50001) 2019-12-10 08:03:43 -08:00
James Rodewig 0062d5f301
[DOCS] Remove shadow replica reference (#50029)
Removes a reference to shadow replicas from the cat shards API docs
and a comment in cluster/routing/UnassignedInfo.java.

Shadow replicas were removed with #23906.
2019-12-10 09:30:04 -05:00
Dimitris Athanasiou 269425b54d
[ML] Introduce randomize_seed setting for regression and classification (#49990)
This adds a new `randomize_seed` for regression and classification.
When not explicitly set, the seed is randomly generated. One can
reuse the seed in a similar job in order to ensure the same docs
are picked for training.
2019-12-10 10:22:53 +02:00
James Rodewig e520b85675
[DOCS] Skip synced flush docs tests (#49986)
The current snippets in the synced flush docs can cause conflicts with
other background syncs, such as the global checkpoint sync or retention
lease sync, in the docs tests.

This skips tests for those snippets to avoid conflicts.
2019-12-09 13:16:16 -05:00
Artur Carvalho c21eb986b2 [Docs] Fix typo in getting-started.asciidoc (#49985) 2019-12-09 16:25:21 +01:00
James Rodewig 4415f1a536
[DOCS] Correct inline shape snippets in shape query docs (#49921)
In the shape query docs, the index mapping snippet uses the "geometry"
shape field mapping. However, the doc index snippet uses the "location"
property.

This changes the "location" property to "geometry". It also adds a
comment containing the search result snippet. This should prevent
similar issues in the future.
2019-12-09 08:39:17 -05:00
Ryan Ernst 59a571edd5
Fix incorrect use of multiline NOTE in rpm docs (#49962)
This was a copy/paste error from #49893. This commit converts the NOTE
to use inline style instead of one needing closing linebreak.
2019-12-06 17:43:12 -08:00
Ryan Ernst 16a7a04664
Disable repo configuration for rpm based systems (#49893)
This commit changes the recommended repository file for rpm based
systems to be disabled by default. This is a safer practice so upgrades
of the system do no accidentally upgrade elasticsearch itself.

closes #30660
2019-12-06 15:54:30 -08:00
Lisa Cawley 0f51bc2f72
[DOCS] Move anomaly detection job resource definitions into APIs (#49700)
Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>
2019-12-06 15:32:07 -08:00
Przemko Robakowski c57032f622
Allow list of IPs in geoip ingest processor (#49573)
* Allow list of IPs in geoip ingest processor

This change lets you use array of IPs in addition to string in geoip processor source field.
It will set array containing geoip data for each element in source, unless first_only parameter
option is enabled, then only first found will be returned.

Closes #46193
2019-12-06 21:57:06 +01:00
István Zoltán Szabó e5d512a8ed
[DOCS] Fixes classification evaluation example response. (#49905) 2019-12-06 13:24:22 +01:00
István Zoltán Szabó 37cc0b6c9e
[DOCS] Fixes attribute in transforms overview. (#49898) 2019-12-06 10:23:01 +01:00
Hendrik Muhs 25474c62f1
[Transform][DOCS]rewrite client ip example to use continuous transform (#49822)
adapt the transform example for suspicious client ips to use continuous transform
2019-12-06 08:19:21 +01:00
Orhan Toy f002cd1b6f [DOCS] Minor typo fixes in reindex.asciidoc (#49863) 2019-12-05 20:24:22 +01:00
István Zoltán Szabó f7a5b73972
[DOCS] Adds an example of preprocessing actions to the PUT DFA API docs (#49831) 2019-12-05 14:15:19 +01:00
István Zoltán Szabó c793e80d3b
[DOCS] Fixes typo in the ML anomaly detection time functions docs. (#49834) 2019-12-05 09:57:01 +01:00
James Rodewig 72dd49ddcc
[DOCS] Document `minimum_should_match` defaults for `bool` query (#48865)
Adds documentation for the `minimum_should_match` parameter to the `bool` query docs. Includes docs for the default values:

- `1` if the `bool` query includes at least one `should` clause and no `must` or `filter` clauses
- `0` otherwise
2019-12-04 12:44:13 -05:00
James Rodewig e964a97005
[DOCS] Reformat length token filter docs (#49805)
* Adds a title abbreviation
* Updates the description and adds a Lucene link
* Reformats the parameters section
* Adds analyze, custom analyzer, and custom filter snippets

Relates to #44726.
2019-12-04 09:58:19 -05:00
Alexander Reelsen 062f9f03bf
Docs: Fix & test more grok processor documentation (#49447)
The documentation contained a small error, as bytes and duration was not
properly converted to a number and thus remained a string.

The documentation is now also properly tested by providing a full blown
simulate pipeline example.
2019-12-03 11:47:27 +01:00
James Rodewig 1a574115c1
[DOCS] Document CCR compatibility requirements (#49776)
* Creates a prerequisites section in the cross-cluster replication (CCR)
  overview.
* Adds concise definitions for local and remote cluster in a CCR context.
* Documents that the ES version of the local cluster must be the same
  or a newer compatible version as the remote cluster.
2019-12-02 15:52:13 -05:00
James Rodewig 6ea54eecf0
[DOCS] Reformat keep types and keep words token filter docs (#49604)
* Adds title abbreviations
* Updates the descriptions and adds Lucene links
* Reformats parameter definitions
* Adds analyze and custom analyzer snippets
* Adds explanations of token types to keep types token filter and tokenizer docs
2019-12-02 09:22:21 -05:00