Commit Graph

205 Commits

Author SHA1 Message Date
Lisa Cawley dd4ede5c56
[DOCS] Adds filter and calendar attributes (#50566) 2020-01-02 10:59:54 -08:00
lcawl c7408a25f1 [DOCS] Minor fixes in ML APIs 2019-12-30 15:21:18 -08:00
James Rodewig e8a6d4a3fb
[DOCS] Remove unneeded redirects (#50476)
The docs/reference/redirects.asciidoc file stores a list of relocated or
deleted pages for the Elasticsearch Reference documentation.

This prunes several older redirects that are no longer needed and
don't require work to fix broken links in other repositories.
2019-12-26 07:49:41 -05:00
Lisa Cawley 6501338a9e
[DOCS] Remove redundant results from ML APIs (#50477) 2019-12-24 08:34:03 -08:00
Orhan Toy 48342740c5 [DOCS] Fixes "enables you to" typos (#50225) 2019-12-23 14:38:37 -05:00
Lisa Cawley 362ce41eaf
[DOCS] Updates ML links (#50387) 2019-12-19 14:47:28 -08:00
lcawl d8a94f0397 [DOCS] Fixes security links 2019-12-18 11:51:03 -08:00
Lisa Cawley 68e02a19d8
[DOCS] Move machine learning results definitions into APIs (#50257) 2019-12-18 09:50:31 -08:00
István Zoltán Szabó 50e26d40a2
[DOCS] Adds GET, GET stats and DELETE inference APIs (#50224)
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
2019-12-18 09:10:12 +01:00
Lisa Cawley 207094cd67
[DOCS] Moves model snapshot resource definitions into APIs (#50157)
Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>
2019-12-16 10:42:30 -08:00
István Zoltán Szabó 3857e3d94f
[DOCS] Moves data frame analytics job resource definitions into APIs (#50021) 2019-12-12 10:59:37 +01:00
Lisa Cawley ca482127fa
[DOCS] Move job count resource definitions into API (#50057)
Co-Authored-By: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-Authored-By: David Roberts <dave.roberts@elastic.co>
Co-Authored-By: Ed Savage <32410745+edsavage@users.noreply.github.com>
2019-12-11 11:17:15 -08:00
Lisa Cawley 3d96e6b68e
[DOCS] Move datafeed resource definitions into APIs (#50005)
Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>
2019-12-11 09:50:41 -08:00
Dimitris Athanasiou 269425b54d
[ML] Introduce randomize_seed setting for regression and classification (#49990)
This adds a new `randomize_seed` for regression and classification.
When not explicitly set, the seed is randomly generated. One can
reuse the seed in a similar job in order to ensure the same docs
are picked for training.
2019-12-10 10:22:53 +02:00
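A rough sketch of the seed reuse this commit describes: two request bodies for similar regression jobs that share one `randomize_seed`. The placement of `randomize_seed` inside the `regression` object, the index names, the `price` field, and the helper `regression_config` are assumptions made for illustration and are not taken from the commit.

```python
import json


def regression_config(source_index, dest_index, dependent_variable, randomize_seed=None):
    """Build a request body for a regression job (hypothetical helper)."""
    regression = {"dependent_variable": dependent_variable}
    # Leaving the seed out lets the server generate one at random,
    # which is the behavior the commit message describes.
    if randomize_seed is not None:
        regression["randomize_seed"] = randomize_seed
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "analysis": {"regression": regression},
    }


first = regression_config("houses", "houses-out-1", "price", randomize_seed=42)
# Reuse the first job's seed so the same docs are picked for training.
seed = first["analysis"]["regression"]["randomize_seed"]
second = regression_config("houses", "houses-out-2", "price", randomize_seed=seed)
print(json.dumps(second, indent=2))
```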
Lisa Cawley 0f51bc2f72
[DOCS] Move anomaly detection job resource definitions into APIs (#49700)
Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>
2019-12-06 15:32:07 -08:00
István Zoltán Szabó e5d512a8ed
[DOCS] Fixes classification evaluation example response. (#49905) 2019-12-06 13:24:22 +01:00
István Zoltán Szabó f7a5b73972
[DOCS] Adds an example of preprocessing actions to the PUT DFA API docs (#49831) 2019-12-05 14:15:19 +01:00
István Zoltán Szabó c793e80d3b
[DOCS] Fixes typo in the ML anomaly detection time functions docs. (#49834) 2019-12-05 09:57:01 +01:00
Dimitris Athanasiou bad07b76f7
[ML] Add optional source filtering during data frame reindexing (#49690)
This adds a `_source` setting under the `source` setting of a data
frame analytics config. The new `_source` is reusing the structure
of a `FetchSourceContext` like `analyzed_fields` does. Specifying
includes and excludes for source allows selecting which fields
will get reindexed and will be available in the destination index.

Closes #49531
2019-11-29 14:20:31 +02:00
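A minimal sketch of how the `source` object might look with the new `_source` filter, assuming the includes and excludes are passed as lists under the keys `includes` and `excludes`. The helper `analytics_source`, the index name, and the field patterns are hypothetical.

```python
def analytics_source(index, includes=None, excludes=None):
    """Build the `source` object of a data frame analytics config (hypothetical helper)."""
    source = {"index": index}
    source_filter = {}
    if includes:
        source_filter["includes"] = list(includes)
    if excludes:
        source_filter["excludes"] = list(excludes)
    # Only add `_source` when a filter was actually requested.
    if source_filter:
        source["_source"] = source_filter
    return source


# Fields matching the includes (minus the excludes) would be reindexed
# into the destination index.
print(analytics_source("logs", includes=["metrics.*"], excludes=["metrics.debug"]))
```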
lcawl 3b3f3ca925 [DOCS] Fixes typo in ML resources 2019-11-26 10:28:18 -08:00
lcawl 63b944c00f [DOCS] Fixes data type formatting 2019-11-26 08:21:39 -08:00
David Roberts 40c951d781
[ML] Add default categorization analyzer definition to ML info (#49545)
The categorization job wizard in the ML UI will use this
information when showing the effect of the chosen categorization
analyzer on a sample of input.
2019-11-25 13:20:12 +00:00
Dimitris Athanasiou 5a6967af57
[ML][DOCS] Anomaly detection job retention days settings do not require restart (#49546) 2019-11-25 15:12:41 +02:00
Dimitris Athanasiou 0390ec3627
[ML] Explain data frame analytics API (#49455)
This commit replaces the _estimate_memory_usage API with
a new API, the _explain API.

The API consolidates information that is useful before
creating a data frame analytics job.

It includes:

- memory estimation
- field selection explanation

Memory estimation is moved here from what was previously
calculated in the _estimate_memory_usage API.

Field selection is a new feature that explains to the user
whether each available field was selected to be included or
not in the analysis. In the case it was not included, it also
explains the reason why.
2019-11-22 20:08:14 +02:00
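To illustrate the field selection explanation, here is a small sketch that lists the fields left out of the analysis together with the stated reason. The response shape (`field_selection`, `name`, `is_included`, `reason`, `memory_estimation`) and all sample values are assumptions for illustration; the commit message only says that the API reports memory estimation and a per-field explanation.

```python
# Hypothetical response body; key names and values are illustrative only.
sample_response = {
    "field_selection": [
        {"name": "price", "is_included": True},
        {"name": "listing_id", "is_included": False, "reason": "unsupported type"},
    ],
    "memory_estimation": {"expected_memory_without_disk": "128mb"},
}


def excluded_fields(explain_response):
    """Map each field that was not selected to the reason it was left out."""
    return {
        field["name"]: field.get("reason", "no reason given")
        for field in explain_response.get("field_selection", [])
        if not field.get("is_included", False)
    }


print(excluded_fields(sample_response))
```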
Lisa Cawley 8d214e851c
[DOCS] Clarify ML job closure prerequisites (#49265) 2019-11-19 08:31:24 -08:00
David Roberts b6c6387af5
[TEST] Mute docs snippet test in close-job.asciidoc (#49000)
Due to https://github.com/elastic/elasticsearch/pull/48583#issuecomment-552991325
2019-11-12 17:31:07 +00:00
Benjamin Trent ee8853fbc1
[ML] Add new geo_results.(actual_point|typical_point) fields for `lat_long` results (#47050)

Related PR: https://github.com/elastic/ml-cpp/pull/809
2019-11-11 13:21:18 -05:00
István Zoltán Szabó 7180b90646
[DOCS] Removes best practice about fields that are highly correlated to the dependent variable. (#48935) 2019-11-11 10:00:11 -05:00
István Zoltán Szabó e9cec6e1f7
[DOCS] Extends analyzed_fields description in PUT DFA API docs. (#48307) 2019-11-11 09:53:59 -05:00
István Zoltán Szabó 6c3fed8d4d
[DOCS] Adds classification type DFA API docs and ml-shared.asciidoc (#48241) 2019-11-06 07:40:27 -05:00
István Zoltán Szabó fe92cd0a26
[DOCS] Adds classification type evaluation docs to the DFA evaluation API (#47657) 2019-11-06 07:37:14 -05:00
Lisa Cawley 29ac34a45c
[DOCS] Re-enable code snippet testing in close anomaly detection job API (#48259) 2019-10-28 08:08:38 -07:00
David Roberts d308095b28
[ML] Add option to stop datafeed that finds no data (#47922)
Adds a new datafeed config option, max_empty_searches,
that tells a datafeed that has never found any data to stop
itself and close its associated job after a certain number
of real-time searches have returned no data.
2019-10-14 13:26:06 +01:00
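The stopping rule described in this commit can be sketched as a small counter. This is an illustration of the described behavior, not the Elasticsearch implementation; the class name `EmptySearchTracker` and the choice to stop once the count reaches `max_empty_searches` are assumptions.

```python
class EmptySearchTracker:
    """Tracks empty real-time searches for a datafeed (illustrative only)."""

    def __init__(self, max_empty_searches):
        self.max_empty_searches = max_empty_searches
        self.empty_searches = 0
        self.has_seen_data = False

    def record_search(self, hit_count):
        """Record one real-time search; return True if the datafeed should stop."""
        if hit_count > 0:
            self.has_seen_data = True
            return False
        if self.has_seen_data:
            # The option only applies to datafeeds that have never found data.
            return False
        self.empty_searches += 1
        return self.empty_searches >= self.max_empty_searches


tracker = EmptySearchTracker(max_empty_searches=3)
decisions = [tracker.record_search(0) for _ in range(3)]
print(decisions)
```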
David Roberts fd83c18cc1
[ML] Add lazy assignment job config option (#47726)
This change adds:

- A new option, allow_lazy_open, to anomaly detection jobs
- A new option, allow_lazy_start, to data frame analytics jobs

Both work in the same way: they allow a job to be
opened/started even if no ML node exists that can
accommodate the job immediately. In this situation
the job waits in the opening/starting state until ML
node capacity is available. (The starting state for data
frame analytics jobs is new in this change.)

Additionally, the ML nightly maintenance task now
creates audit warnings for ML jobs that are unassigned.
This means that jobs that cannot be assigned to an ML
node for a very long time will show a yellow warning
triangle in the UI.

A final change is that it is now possible to close a job
that is not assigned to a node without using force.
This is because previously jobs that were open but
not assigned to a node were an aberration, whereas
after this change they'll be relatively common.
2019-10-14 12:13:01 +01:00
István Zoltán Szabó 448d19f0ca
[DOCS] Adds supported fields section to the PUT DFA API description (#47842) 2019-10-10 12:34:39 +02:00
István Zoltán Szabó ab08c0cd76
[DOCS] Extends the analyzed_fields description in the PUT DFA API docs (#47791) 2019-10-09 18:13:33 +02:00
Dimitris Athanasiou e99435a7f6
[ML] Additional outlier detection parameters (#47600)
Adds the following parameters to `outlier_detection`:

- `compute_feature_influence` (boolean): whether to compute
   feature influence scores
- `outlier_fraction` (double): the proportion of the data set assumed
   to be outlying prior to running outlier detection
- `standardization_enabled` (boolean): whether to apply standardization
   to the feature values
2019-10-07 15:28:21 +03:00
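A minimal sketch of an `analysis` object that sets the three parameters listed in this commit. The helper `outlier_detection_analysis`, the sample values, and the range check on `outlier_fraction` (treated as a proportion between 0 and 1) are assumptions for illustration.

```python
def outlier_detection_analysis(compute_feature_influence=None,
                               outlier_fraction=None,
                               standardization_enabled=None):
    """Build the `analysis` object for outlier detection (hypothetical helper)."""
    params = {}
    if compute_feature_influence is not None:
        params["compute_feature_influence"] = bool(compute_feature_influence)
    if outlier_fraction is not None:
        # Assumed validation: a proportion of the data set lies in [0, 1].
        if not 0.0 <= outlier_fraction <= 1.0:
            raise ValueError("outlier_fraction must be between 0 and 1")
        params["outlier_fraction"] = float(outlier_fraction)
    if standardization_enabled is not None:
        params["standardization_enabled"] = bool(standardization_enabled)
    return {"outlier_detection": params}


print(outlier_detection_analysis(compute_feature_influence=True,
                                 outlier_fraction=0.05,
                                 standardization_enabled=False))
```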
Lisa Cawley 4e4990c6a0
[DOCS] Cleans up links to security content (#47610) 2019-10-04 16:10:26 -07:00
István Zoltán Szabó b03be6e816
[DOCS] Fixes an attribute in the update datafeed API docs. (#47551) 2019-10-04 08:42:30 +02:00
István Zoltán Szabó c0da956b6e
[DOCS] Amends update datafeed API docs (#47448) 2019-10-03 13:12:19 +02:00
István Zoltán Szabó 4977baf63a
[DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs (#46966)
* [DOCS] Adds examples to the PUT dfa and the evaluate dfa APIs.

* [DOCS] Removes extra lines from examples.

* Update docs/reference/ml/df-analytics/apis/evaluate-dfanalytics.asciidoc

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* Update docs/reference/ml/df-analytics/apis/put-dfanalytics.asciidoc

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* [DOCS] Explains examples.
2019-10-02 10:26:20 +02:00
István Zoltán Szabó aa7c4030cd
[DOCS] Fine tunes update anomaly detection job API documentation (#47280)
* [DOCS] Fine tunes update anomaly detection job API documentation.
* [DOCS] Removes delimiter to fix the table.
2019-10-02 10:04:35 +02:00
István Zoltán Szabó 4073499f43
[DOCS] Fixes typos in the PUT dfa and the evaluate dfa documentation. (#47348) 2019-10-02 09:49:59 +02:00
István Zoltán Szabó a6c517a96e
[DOCS] Changes wording to move away from data frame terminology in the ES repo (#47093)
* [DOCS] Changes wording to move away from data frame terminology in the ES repo.
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
2019-10-01 08:04:06 +02:00
István Zoltán Szabó 14227106b0
[DOCS] Adds regression analytics resources and examples to the data frame analytics APIs and the evaluation API (#46176)
* [DOCS] Adds regression analytics resources and examples to the data frame analytics APIs.
Co-Authored-By: Benjamin Trent <ben.w.trent@gmail.com>
Co-Authored-By: Tom Veasey <tveasey@users.noreply.github.com>
2019-09-19 09:10:11 +02:00
István Zoltán Szabó bd4d46c416
[DOCS] Adds outlier detection params to the data frame analytics resources (#46323)
* [DOCS] Adds outlier detection params to the data frame analytics resources.
Co-Authored-By: Tom Veasey <tveasey@users.noreply.github.com>
Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
2019-09-16 14:21:50 +02:00
James Rodewig 5c78f606c2
[DOCS] Change // CONSOLE comments to [source,console] (#46440) 2019-09-09 10:45:37 -04:00
James Rodewig e43be90e6c
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) 2019-09-06 14:05:36 -04:00
James Rodewig 97802d8aff
[DOCS] Change // CONSOLE comments to [source,console] (#46441) 2019-09-06 10:55:16 -04:00
István Zoltán Szabó e39cdd63c3
[DOCS] Adds progress parameter description to the GET stats data frame analytics API doc. (#46434) 2019-09-06 15:17:18 +02:00
James Rodewig 466c59a4a7
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295) 2019-09-05 16:47:18 -04:00
István Zoltán Szabó 626bbccd6e
[DOCS] [PUT DFA] Documents inline the child params of source and dest (#45649)
* [DOCS] [PUT DFA] Documents inline the child params of source and dest.

* [DOCS] Fixes indentation issues and amends dfa definitions.
2019-08-29 14:38:14 +02:00
Benjamin Trent 9f716fbd4c
[ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044)
Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors that are difficult to troubleshoot.

This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.
2019-08-28 15:06:26 -05:00
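The check described in this commit can be sketched roughly as follows. This is not the Elasticsearch code; the function name `check_remote_access` and the error text are hypothetical, and a remote index pattern is assumed to be any pattern of the form `cluster:index`.

```python
def check_remote_access(indices, remote_connect_enabled):
    """Raise if a datafeed needs a remote cluster but the node cannot connect."""
    # Cross-cluster search patterns are written as `<cluster>:<index>`.
    remote_patterns = [pattern for pattern in indices if ":" in pattern]
    if remote_patterns and not remote_connect_enabled:
        raise ValueError(
            "datafeed uses remote indices %s but cluster.remote.connect "
            "is not enabled on this node" % remote_patterns
        )


# A datafeed with only local indices passes regardless of the setting.
check_remote_access(["local-logs-*"], remote_connect_enabled=False)
```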
Dimitris Athanasiou f6a97decac
[ML] Improve progress reportings for DF analytics (#45856)
Previously, the stats API reported a progress percentage
for DF analytics tasks that are running and are in the
`reindexing` or `analyzing` state.

This means that when the task is `stopped` there is no progress
reported. Thus, one cannot distinguish between a task that never
ran and one that completed.

In addition, there are blind spots in the progress reporting.
In particular, we do not account for when data is loaded into the
process. We also do not account for when results are written.

This commit addresses the above issues. It changes progress
to being a list of objects, each one describing the phase
and its progress as a percentage. We currently have 4 phases:
reindexing, loading_data, analyzing, writing_results.

When the task stops, progress is persisted as a document in the
state index. The stats API now reports progress from in-memory
if the task is running, or returns the persisted document
(if there is one).
2019-08-23 17:31:36 +03:00
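A short sketch of how a client might read the per-phase progress list described in this commit. The four phase names come from the commit message; the key names `phase` and `progress_percent`, the sample values, and the helper `describe_progress` are assumptions for illustration.

```python
# Hypothetical progress list for a task that is partway through analysis.
sample_progress = [
    {"phase": "reindexing", "progress_percent": 100},
    {"phase": "loading_data", "progress_percent": 100},
    {"phase": "analyzing", "progress_percent": 40},
    {"phase": "writing_results", "progress_percent": 0},
]


def describe_progress(progress):
    """Summarize a list of phase entries as a single status string."""
    percents = [entry["progress_percent"] for entry in progress]
    if all(percent == 100 for percent in percents):
        return "completed"
    if all(percent == 0 for percent in percents):
        return "not started"
    # Otherwise report the first phase that has not finished yet.
    for entry in progress:
        if entry["progress_percent"] < 100:
            return "in phase: " + entry["phase"]


print(describe_progress(sample_progress))
```

With per-phase entries, a stopped task that never ran (all phases at 0) can be told apart from one that completed (all phases at 100), which is the gap the commit set out to close.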
Przemysław Witek 31f6e78acd
Allow the user to specify 'query' in Evaluate Data Frame request (#45775) 2019-08-22 08:27:38 +02:00
Dimitris Athanasiou 8af319481e
[ML] Add description to DF analytics (#45774) 2019-08-21 19:58:09 +03:00
Przemysław Witek c6a25a818d
Add docs for HLRC for Estimate memory usage API (#45538) 2019-08-21 12:52:17 +02:00
Przemysław Witek 7107c221a7
Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188) 2019-08-13 20:59:35 +02:00
István Zoltán Szabó 78e35c2c3d
[DOCS] Adds supported time units ref to the ML and DF API params. (#45322) 2019-08-08 13:43:55 +02:00
Lisa Cawley 46912c8f3d
[DOCS] Reformats ML update APIs (#45253) 2019-08-06 11:05:01 -07:00
István Zoltán Szabó fbd9c9e2e3
[DOCS] Makes clearer the note under freq_rare. (#45193) 2019-08-05 13:28:22 +02:00
James Rodewig 8b152d6d79
Rename "indices APIs" to "index APIs" (#44863) 2019-08-02 14:09:46 -04:00
Lisa Cawley 53980c6267
[DOCS] Clarifies bucket span in overall buckets API (#45110) 2019-08-02 08:36:39 -07:00
Lisa Cawley 285f2e0625
[DOCS] Updates terms in machine learning get APIs (#44986) 2019-07-30 10:52:23 -07:00
István Zoltán Szabó c22296d0c2
[DOCS] Adds allow no jobs param to the GET, GET stats and Close APIs (#44503) 2019-07-30 14:22:14 +02:00
Lisa Cawley 75999ff83c
[DOCS] Updates anomaly detection terminology (#44888) 2019-07-26 11:07:01 -07:00
Lisa Cawley 3f31859669
[DOCS] Updates terms in machine learning datafeed APIs (#44883) 2019-07-26 10:47:03 -07:00
István Zoltán Szabó 84793476ba
[DOCS] Amends data frame analytics resources, GET, and PUT API docs (#44806)
This PR addresses the feedback in https://github.com/elastic/ml-team/issues/175#issuecomment-512215731.

* Adds an example to `analyzed_fields`
* Includes `source` and `dest` objects inline in the resource page
* Lists `model_memory_limit` in the PUT API page
* Amends the `analysis` section in the resource page
* Removes Properties headings in subsections
2019-07-26 11:39:59 +02:00
Lisa Cawley aefb72040c
[DOCS] Updates terms in machine learning calendar APIs (#44866) 2019-07-25 11:20:42 -07:00
Lisa Cawley 990e037728
[DOCS] Updates terms in anomaly detection job APIs (#44839) 2019-07-25 08:58:16 -07:00
István Zoltán Szabó 5275392b47
[DOCS] Adds allow no datafeeds query param to the GET, GET stats and STOP datafeed APIs (#44499) 2019-07-25 16:45:06 +02:00
James Rodewig ea1adb61c2
[DOCS] Update anchors and links for Elasticsearch API relocation (#44500) 2019-07-19 09:16:35 -04:00
Lisa Cawley dbe7a48e82
[DOCS] Fixes query default value (#44572) 2019-07-18 08:15:28 -07:00
Lisa Cawley 4fd8e34662
[DOCS] Moves content to ML anomaly-detection folder (#44520) 2019-07-17 13:48:12 -07:00
Lisa Cawley 146be77ec3
[DOCS] Separates data frame analytics APIs (#44451)
* [DOCS] Separates data frame analytics APIs

* [DOCS] Adds links between new pages
2019-07-16 13:22:27 -07:00
James Rodewig bd52e148c5
[DOCS] Remove :edit_url: overrides. (#44445)
These overrides do not work in Asciidoctor and are no longer needed.
2019-07-16 15:02:38 -04:00
Lisa Cawley 2316703b93
[DOCS] Removes unnecessary resource definition pages (#44289)
* [DOCS] Removes calendar resource definition page

* [DOCS] Removes scheduled event and filter resource definitions
2019-07-15 09:44:57 -07:00
David Kyle 4402cf38bf
Wait for pending tasks in docs tests cleanup (#44123)
ML and Data Frame tests should wait for pending tasks
2019-07-15 11:58:09 +01:00
James Rodewig e5a3ae97e2 Revert "[DOCS] Fix broken links for ES API docs move (#44279)"
This reverts commit 3bdd2f4432.
2019-07-12 17:06:51 -04:00
James Rodewig 860984536c Revert "[DOCS] Fix broken link reused in Stack Overview"
This reverts commit c08c253432.
2019-07-12 17:06:44 -04:00
James Rodewig f9c09fa7f6 Revert "[DOCS] Fix broken links"
This reverts commit 313030263f.
2019-07-12 17:06:28 -04:00
James Rodewig 313030263f [DOCS] Fix broken links 2019-07-12 14:03:30 -04:00
James Rodewig c08c253432 [DOCS] Fix broken link reused in Stack Overview 2019-07-12 13:15:05 -04:00
James Rodewig 3bdd2f4432
[DOCS] Fix broken links for ES API docs move (#44279)
* [DOCS] Fix broken links for ES API docs move

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
2019-07-12 12:46:22 -04:00
Lisa Cawley b3a7b2221b
[DOCS] Reformats API parameter details (#44194) 2019-07-12 08:26:31 -07:00
Lisa Cawley 727199e398
[DOCS] Removes links to ML tutorial (#44251) 2019-07-12 08:25:23 -07:00
István Zoltán Szabó 74c16efe2a
[DOCS] Adds data frame analytics API and evaluate API resource documentation (#43972)
This PR adds the resource documentation of the data frame analytics APIs and the evaluate API to the ML API doc pool.
2019-07-11 18:05:05 +02:00
lcawl c9a265b092 [DOCS] Fixes formatting in data frame analytics API 2019-07-10 17:58:17 -07:00
Przemysław Witek 1572080a63
[ML] Add DatafeedTimingStats to datafeed GetDatafeedStatsAction.Response (#43045) 2019-07-09 16:07:27 +02:00
David Kyle 071b652874 Mute put job docs test
Relates to #43271
2019-07-09 13:15:25 +01:00
Lisa Cawley 0601aaf621
[DOCS] Enables testing for create job ML API (#44022) 2019-07-08 11:25:21 -07:00
Lisa Cawley f60b35cbcc
[DOCS] Fixes earliest_record_timestamp data type (#44030) 2019-07-08 10:14:37 -07:00
István Zoltán Szabó cccf5bac43
[DOCS] Adds data frame analytics APIs to the ML APIs (#43875)
This PR adds the reference documentation pages of the data frame analytics APIs (PUT, START, STOP, GET, GET stats, DELETE, Evaluate) to the ML APIs pool.
2019-07-05 13:34:05 +02:00
Lisa Cawley f1e3a8fd6c
[DOCS] Adds data frame API response codes for allow_no_match (#43666) 2019-06-27 15:16:24 -07:00
Lisa Cawley c75773745c
[DOCS] Updates ML APIs to use new API template (#43711) 2019-06-27 13:58:42 -07:00
lcawl 66e1853f34 [DOCS] Adds anchors and attributes to ML APIs 2019-06-27 09:43:43 -07:00
Matthew Adams 4c8f089ebd Clarify storage location of ML Snapshots (#43437)
The existing language was misleading about the model snapshots and where they are located. Saying "to disk" sounds like files external to Elasticsearch IMO. It raises the obvious question, where on disk? which node? Is it in the Elasticsearch snapshot repo? The model snapshots are held in an internal index.
2019-06-24 09:13:21 +01:00
Przemysław Witek 13596c807a
Report exponential_avg_bucket_processing_time which gives more weight to recent buckets (#43189) 2019-06-16 20:41:27 +02:00
Ryan Ernst a3f2f4079c
Add native code info to ML info api (#43172)
The machine learning feature of xpack has native binaries with a
different commit id than the rest of code. It is currently exposed in
the xpack info api. This commit adds that commit information to the ML
info api, so that it may be removed from the info api.
2019-06-13 11:38:29 -07:00
lcawl aa4ff855a6 [DOCS] Fix link to ML node description 2019-06-13 11:17:12 -07:00