elasticsearch

Commit Graph

Author	SHA1	Message	Date
David Kyle	bf245e4c07	Make Inference processor field_map and inference_config optional (#58868 ) Relaxes the requirement that the inference ingest processor must have a field_map and inference_config defined even if they are empty.	2020-07-03 08:36:57 +01:00
DeDe Morton	b5e374d958	[DOCS] Change Beats links to refactored getting started docs (#58790 )	2020-07-02 17:10:09 -07:00
Nik Everett	a4d30352c7	Document using stored scripts for ingest (#58783 ) This documents using stored scripts for complex conditionals in ingest.	2020-07-01 13:35:13 -04:00
István Zoltán Szabó	d0042fb791	[DOCS] Updates results_field description in the inference processor docs (#58554 )	2020-06-29 11:28:17 +02:00
Jake Landis	5088ab151a	Update hh to HH in date processor example (#58089 ) (#58142 ) Co-authored-by: Leaf-Lin <39002973+Leaf-Lin@users.noreply.github.com>	2020-06-15 17:03:42 -05:00
bellengao	efc4c9a210	Add ignore_empty_value parameter in set ingest processor (#57030 )	2020-06-15 07:26:57 -05:00
Jake Landis	f5910664b7	Ensure Joni warnings are logged at debug (#57302 ) When Joni, the regex engine that powers grok, emits a warning, it does so by default to System.err. System.err logs are all bucketed together in the server log at WARN level. When Joni emits a warning, it can be extremely verbose, logging a message for each execution against that pattern. For ingest node that means for every document that is run through Grok. Fortunately, Joni provides a callback hook to push these warnings to a custom location. This commit implements Joni's callback hook to push the Joni warnings to the Elasticsearch server logger (logger.org.elasticsearch.ingest.common.GrokProcessor) at debug level. Generally these warnings indicate a possible issue with the regular expression, and upon creation the Grok processor will do a "test run" of the expression and log the result (if any) at WARN level. This WARN level log should only occur on pipeline creation, which is a much lower frequency than every document. Additionally, the documentation is updated with instructions for how to set the logger to debug level.	2020-06-09 13:33:27 -05:00
Lisa Cawley	8b9293b3bf	[DOCS] Replace docdir attribute with es-repo-dir (#57489 )	2020-06-01 15:55:05 -07:00
Adam Locke	d77388f919	[DOCS] Add links to `flattened` datatype (#56794 ) * Changes for #52239. * Incorporating review feedback from Julie T. Also single-sourcing nested options in the Mapping page and referencing them in the Nested page. * Moving tip after the introduction and clarifying limits. * Update docs/reference/mapping.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/mapping/types/nested.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-05-19 13:40:26 -04:00
James Rodewig	2f930f1ec0	[DOCS] Correct `query` datatype in enrich policy definition (#56224 ) Corrects the datatype for the `query` property of an enrich policy object. The `query` property is a query object, not a string.	2020-05-13 08:34:22 -04:00
Thiago Souza	863a883286	[DOCS] Correct get enrich policy API request example (#56207 )	2020-05-05 12:34:50 -04:00
István Zoltán Szabó	ca2f98382f	[DOCS] Changes feature importance links to point to the new page (#55531 ) * [DOCS] Changes feature importance links to point to the new page. * [DOCS] Fixes line breaks.	2020-04-28 09:02:14 +02:00
Benjamin Trent	c1afda4a23	[ML] adding prediction_field_type to inference config (#55128 ) Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. This adds a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). Analytics provides the default `prediction_field_type` when the model is created from the process.	2020-04-15 08:32:48 -04:00
István Zoltán Szabó	a0662399c7	[DOCS] Makes PUT inference API docs collapsible (#54653 ) Co-authored-by: lcawl <lcawley@elastic.co>	2020-04-03 09:45:42 +02:00
Benjamin Trent	4e1ff31c3c	[ML] add new inference_config field to trained model config (#54421 ) A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. The inference processor can still override whatever is set as the default in the trained model config.	2020-04-02 10:34:17 -04:00
lcawl	2641a39fd5	[DOCS] Fixes shared attribute for feature importance	2020-04-01 14:46:38 -07:00
AndyHunt66	ba8253f5ee	[DOCS] Remove redundant sentence in ingest processor docs (#54329 )	2020-03-27 08:23:46 -04:00
István Zoltán Szabó	a65e95e093	[DOCS] Adds feature importance mapping subsection to inference processor docs (#54190 )	2020-03-26 09:22:12 +01:00
bellengao	8ffe5d1f94	Support array for all string ingest processors	2020-03-17 15:22:30 -05:00
Benjamin Trent	970f726c1f	[ML] renaming inference processor field field_mappings to new name field_map (#53433 ) This renames the `inference` processor configuration field `field_mappings` to `field_map`. `field_mappings` is now deprecated.	2020-03-12 12:49:25 -04:00
James Rodewig	bc7643c65b	[DOCS] Reduce content reuse in enrich docs (#53460 ) Restructures the 'Update an enrich policy' section to: * Migrate the content to the section. It was previously stored in the Put Enrich Policy API docs. * Remove the warning tag admonition from the section content. * Replace a reused section earlier in the "Set up an enrich processor" page with a link. No substantive changes were made to the content.	2020-03-12 05:40:57 -04:00
Benjamin Trent	4e1f029b04	[ML][Inference] adds new default_field_map field to trained models (#53294 ) Adds a new `default_field_map` field to trained model config objects. This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data. The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.	2020-03-11 12:23:56 -04:00
Orhan Toy	bce4a3bd4b	[DOCS] Fix formatting of simulate ingest pipeline API docs (#52754 ) Wraps request routes for the simulate ingest pipelines in the API docs. This ensures the routes display in monospace.	2020-03-02 11:41:22 -05:00
David Pilato	e51b8a51aa	[DOCS] Fix typo in CSV processor docs (#52649 ) Corrects an example array in a snippet of the CSV processor docs.	2020-02-25 08:47:58 -05:00
bellengao	21061f7479	[DOCS] Fix typo in ingest node docs (#52671 )	2020-02-25 07:51:02 -05:00
Benjamin Trent	20f54272f0	[ML] Adds feature importance option to inference processor (#52218 ) This adds machine learning model feature importance calculations to the inference processor. The new flag in the configuration matches the analytics parameter name: `num_top_feature_importance_values` Example: ``` "inference": { "field_mappings": {}, "model_id": "my_model", "inference_config": { "regression": { "num_top_feature_importance_values": 3 } } } ``` This will write to the document as follows: ``` "inference" : { "feature_importance" : { "FlightTimeMin" : -76.90955548511226, "FlightDelayType" : 114.13514762158526, "DistanceMiles" : 13.731580450792187 }, "predicted_value" : 108.33165831875137, "model_id" : "my_model" } ``` This is done through calculating the [SHAP values](https://arxiv.org/abs/1802.03888). It requires that models have populated `number_samples` for each tree node. This is not available to models that were created before 7.7. Additionally, if the inference config is requesting feature_importance, and not all nodes have been upgraded yet, it will not allow the pipeline to be created. This is to safe-guard in a mixed-version environment where only some ingest nodes have been upgraded. NOTE: the algorithm is a Java port of the one laid out in ml-cpp: https://github.com/elastic/ml-cpp/blob/master/lib/maths/CTreeShapFeatureImportance.cc usability blocked by: https://github.com/elastic/ml-cpp/pull/991	2020-02-21 16:36:21 -05:00
Russ Cam	94f6f946ef	Specify name on enrich.get_policy as list type (#50217 ) This commit updates the enrich.get_policy API to specify name as a list, in line with other URL parts that accept a comma-separated list of values. In addition, update the get enrich policy API docs to align the URL part name in the documentation with the name used in the REST API specs.	2020-02-20 12:33:06 +11:00
Yang Wang	5c9f79534f	Expose more authentication info to ingest pipeline (#51305 ) The changes add more granularity for identifying the data ingestion user. The ingest pipeline can now be configured to record authentication realm and type. It can also record API key name and ID when one is in use. This improves traceability when data are being ingested from multiple agents and will become more relevant with the incoming support of required pipelines (#46847) Resolves: #49106	2020-02-10 13:56:07 +11:00
Przemko Robakowski	5560135542	Add empty_value parameter to CSV processor (#51567 ) * Add empty_value parameter to CSV processor This change adds `empty_value` parameter to the CSV processor. This value is used to fill empty fields. Fields will be skipped if this parameter is omitted. This behavior is the same for both quoted and unquoted fields. * docs updated * Fix compilation problem Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-05 22:36:00 +01:00
David Kyle	34743bcd6f	[ML] Remove stray field from inference docs (#51870 ) model_info_field is not a valid option	2020-02-05 10:49:36 +00:00
Florian Kelbert	bd52041f92	[DOCS] Remove unneeded comma from CSV processor example (#51859 )	2020-02-04 09:23:43 -05:00
István Zoltán Szabó	4e0e6e83e0	[DOCS] Fixes indentation in inference processor code snippet (#51252 )	2020-01-21 16:21:17 +01:00
Martijn van Groningen	2b2935fd52	Add pipeline name to ingest metadata (#50467 ) This commit adds the name of the current pipeline to ingest metadata. This pipeline name is accessible under the following key: '_ingest.pipeline'. Example usage in pipeline: PUT /_ingest/pipeline/2 { "processors": [ { "set": { "field": "pipeline_name", "value": "{{_ingest.pipeline}}" } } ] } Closes #42106	2020-01-15 16:17:05 +01:00
Igor Motov	7f81467378	Geo: Switch generated GeoJson type names to camel case (#50285 ) (#50400 ) Switches generated GeoJson type names to camel case to conform to the standard. Closes #49568	2019-12-20 04:47:42 -10:00
István Zoltán Szabó	b8cae37374	[DOCS] Adds inference processor documentation (#50204 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2019-12-19 12:19:44 +01:00
Igor Motov	a26e4d1e5e	Geo: Switch generated WKT to upper case (#50285 ) Switches generated WKT to upper case to conform to the standard recommendation. Relates #49568	2019-12-18 07:28:56 -10:00
Przemko Robakowski	64e1a774fc	CSV ingest processor (#49509 ) * CSV Processor for Ingest This change adds new ingest processor that breaks line from CSV file into separate fields. By default it conforms to RFC 4180 but can be tweaked. Closes #49113	2019-12-11 14:52:04 +01:00
Przemko Robakowski	c57032f622	Allow list of IPs in geoip ingest processor (#49573 ) * Allow list of IPs in geoip ingest processor This change lets you use array of IPs in addition to string in geoip processor source field. It will set array containing geoip data for each element in source, unless first_only parameter option is enabled, then only first found will be returned. Closes #46193	2019-12-06 21:57:06 +01:00
Alexander Reelsen	062f9f03bf	Docs: Fix & test more grok processor documentation (#49447 ) The documentation contained a small error, as bytes and duration were not properly converted to a number and thus remained a string. The documentation is now also properly tested by providing a full-blown simulate pipeline example.	2019-12-03 11:47:27 +01:00
James Rodewig	37baa50815	[DOCS] Explicitly document enrich `target_field` includes `match_field` (#49407 ) When the enrich processor appends enrich data to an incoming document, it adds a `target_field` to contain the enrich data. This `target_field` contains both the `match_field` AND `enrich_fields` specified in the enrich policy. Previously, this was reflected in the documented example but not explicitly stated. This adds several explicit statements to the docs.	2019-12-02 09:12:21 -05:00
Martijn van Groningen	88aea2107d	Add templating support to pipeline processor. (#49030 ) This commit adds templating support to the pipeline processor's `name` option. Closes #39955	2019-11-27 13:45:11 +01:00
Martijn van Groningen	4013e814e8	Add templating support to enrich processor (#49093 ) Adds support for templating to `field` and `target_field` options.	2019-11-27 07:52:42 +01:00
Martijn van Groningen	2ba00c8149	Introduce on_failure_pipeline ingest metadata inside on_failure block (#49076 ) In case an exception occurs inside a pipeline processor, the pipeline stack is kept around as a header in the exception. Then in the on_failure processor the id of the pipeline in which the exception occurred is made accessible via the `on_failure_pipeline` ingest metadata. Closes #44920	2019-11-26 14:49:51 +01:00
Lisa Cawley	9cc247d929	[DOCS] Fixes security links (#49563 )	2019-11-25 12:59:59 -08:00
James Rodewig	71ca343874	[DOCS] Clean up example pipeline and enrich policy in docs snippets (#49341 )	2019-11-19 17:02:58 -05:00
James Rodewig	c9e9685bfd	[DOCS] Add high-level docs for enrich processor and policies (#49194 ) * [DOCS] Add high-level docs for enrich policies * fix typos * fix typo * add warning for enrich policy changes * add addtl cross-links to execute API docs * Reword match and geo_match policy example headings	2019-11-19 13:56:51 -05:00
James Rodewig	4ccd3a2b3f	[DOCS] Correct required file ext for user agent ingest processor (#48688 ) For the user agent ingest processor, custom regex files must end with the `.yml` file extension. This corrects the docs which said the `.yaml` extension was required.	2019-10-30 11:10:35 -04:00
Dan Hermann	fcc18dc19b	Add option to split processor for preserving trailing empty fields (#48664 )	2019-10-30 07:23:47 -05:00
Shaunak Kashyap	93ecb9b7ab	[DOCS] Remove extraneous comma in Enrich Stats API's JSON response (#48539 )	2019-10-25 11:35:13 -05:00
James Rodewig	25d3add88a	[DOCS] Remove duplicate links for ingest processor overview (#48394 )	2019-10-23 10:54:53 -05:00
Martijn van Groningen	1ef8dc4030	Also validate source index at put enrich policy time. (#48254 ) This changes tests to create a valid source index prior to creating the enrich policy.	2019-10-21 19:34:57 +02:00
Alexander Reelsen	fd65eec64c	update ingest-user-agent regexes.yml (#47807 ) These new regexes are from: `154eba17f5/regexes.yaml`	2019-10-18 16:14:44 +02:00
James Rodewig	17610e740a	[DOCS] Add `wait_for_completion` parm to execute enrich policy API docs (#48077 )	2019-10-15 13:46:55 -04:00
Martijn van Groningen	ddf3bc25d8	Change how `max_matches` affects `target_field` option. (#47982 ) Prior to this change the `target_field` would always be a json array field in the document being ingested. This was to take into account that multiple enrich documents could be inserted into the `target_field`. However, the default `max_matches` is `1`, meaning that by default only a single enrich document would be added to the `target_field` json array field. This commit changes this; if `max_matches` is set to `1` then the single document is added as a json object to the `target_field`, and if it is configured to a higher value then the enrich documents are added as a json array (even if only a single enrich document happens to match).	2019-10-14 21:04:47 +02:00
Martijn van Groningen	e06598ba56	Merge remote-tracking branch 'es/master' into enrich	2019-10-14 10:17:18 +02:00
Alan Woodward	566e1b7d33	Remove type field from DocWriteRequest and associated Response objects (#47671 ) This commit removes the type field from index, update and delete requests, and their associated responses. Relates to #41059	2019-10-11 10:23:55 +01:00
James Rodewig	17eef81f83	[DOCS] Add docs for `geo_match` enrich policy type (#47745 )	2019-10-09 08:39:11 -04:00
Tal Levy	4d3f6816a7	Merge remote-tracking branch 'elastic/master' into enrich	2019-10-04 13:30:57 -07:00
James Rodewig	6ef5300e13	[DOCS] Reformat simulate pipeline API (#47301 )	2019-10-01 14:29:05 -04:00
James Rodewig	e2b9c1b764	[DOCS] Reformat put pipeline API (#47171 )	2019-10-01 14:19:26 -04:00
James Rodewig	4ebb44ffaf	[DOCS] Reformat delete pipeline API (#47172 )	2019-09-30 09:44:41 -04:00
Martijn van Groningen	a23c7af811	Add config namespace in get policy api response (#47162 ) Currently the policy config is placed directly in the json object of the toplevel `policies` array field. For example: ``` { "policies": [ { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } ] } ``` This change adds a `config` field in each policy json object: ``` { "policies": [ { "config": { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } } ] } ``` This allows us in the future to add other information about policies in the get policy api response. The UI will consume this API to build an overview of all policies. The UI may in the future include additional information about a policy and the plan is to include that in the get policy api, so that this information can be gathered in a single api call. An example of the information that is likely to be added is: * Last policy execution time * The status of a policy (executing, executed, unexecuted) * Information about the last failure if exists	2019-09-30 14:36:53 +02:00
Martijn van Groningen	f676d9730d	Merge remote-tracking branch 'es/master' into enrich	2019-09-27 13:51:17 +02:00
James Rodewig	223110491b	[DOCS] Reformat get pipeline API (#47131 )	2019-09-26 08:26:01 -04:00
Alan Woodward	c1f99e2d75	Remove `_type` from SearchHit (#46942 ) This commit removes the `_type` field from all search hit responses. Relates to #41059	2019-09-23 19:14:54 +01:00
James Rodewig	2d77751716	[DOCS] Minor editorial changes to enrich docs	2019-09-23 13:23:57 -04:00
Martijn van Groningen	1118da0199	fixed tests	2019-09-23 11:08:58 +02:00
Martijn van Groningen	afc16ba518	Merge remote-tracking branch 'es/master' into enrich	2019-09-23 09:34:53 +02:00
Alan Woodward	7c90801aff	Remove types from Get/MultiGet (#46587 ) This commit removes types from the ShardGetService, and propagates this API change up through the Transport and Rest actions for Get and MultiGet Relates to #41059	2019-09-20 14:22:57 +01:00
James Rodewig	a18b587f59	[DOCS] Correct `<enrich-policy>` parm description for comma-sep list (#46682 )	2019-09-18 08:30:22 -04:00
Alexander Reelsen	3cf99cf83f	Expose cache setting in UserAgentPlugin (#46533 ) The setting was not registered. Also documentation has been added.	2019-09-16 11:29:59 +02:00
James Rodewig	2fbaf32412	[DOCS] Change // CONSOLE comments to [source,console] (#46669 )	2019-09-12 10:13:21 -04:00
James Rodewig	2b95e6ac54	[DOCS] Reformat enrich stats API (#46600 )	2019-09-11 13:52:20 -04:00
Martijn van Groningen	534527451b	Add enrich stats api (#46462 ) The enrich api returns enrich coordinator stats and information about currently executing enrich policies. The coordinator stats include per ingest node: * The current number of search requests in the queue. * The total number of outstanding remote requests that have been executed since node startup. Each remote request is likely to include multiple search requests. This depends on how many search requests are in the queue at the time when the remote request is performed. * The number of current outstanding remote requests. * The total number of search requests that `enrich` processors have executed since node startup. The current execution policies stats include: * The name of the policy that is executing * A full-blown task info object for the task that is executing the policy. Relates to #32789	2019-09-11 12:58:46 +02:00
Michael Basnight	53b19af59b	Allow comma separated ids in get enrich policy API (#46351 ) This commit changes the GET REST api so it will accept an optional comma separated list of enrich policy ids. This change also modifies the behavior of the GET API in that it will not error if it is passed a bad enrich id anymore, but will instead just return an empty list.	2019-09-10 10:42:28 -05:00
James Rodewig	a97ed3e92b	[DOCS] Update "Enrich your data" tutorials (#46417 ) * Move enrich docs to separate file * Rewrite enrich processor tutorial	2019-09-09 08:44:56 -04:00
Martijn van Groningen	f97cc7f355	Merge remote-tracking branch 'es/master' into enrich	2019-09-09 08:38:37 +02:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449 )	2019-09-06 14:05:36 -04:00
James Rodewig	97802d8aff	[DOCS] Change // CONSOLE comments to [source,console] (#46441 )	2019-09-06 10:55:16 -04:00
James Rodewig	466c59a4a7	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295 )	2019-09-05 16:47:18 -04:00
James Rodewig	dace374e26	[DOCS] Separate Enrich API Docs (#46286 ) * Add enrich policy common parameter * Add enrich APIs to REST APIs index * Add put enrich policy API docs * Add get enrich policy API docs * Add delete enrich policy API docs * Add execute enrich policy API docs	2019-09-04 14:11:52 -04:00
Martijn van Groningen	43ede36286	Change exact match processor to match processor. (#46041 ) Besides a rename, this change allows the processor to attach multiple enrich docs to the document being ingested. Also in order to control the maximum number of enrich docs to be included in the document being ingested, the `max_matches` setting is added to the enrich processor. Relates #32789	2019-09-04 15:05:27 +02:00
Martijn van Groningen	63fe69fea4	Merge remote-tracking branch 'es/master' into enrich	2019-09-02 08:45:43 +02:00
Tal Levy	e1c060ab43	Add Circle Processor (#43851 ) add circle-processor that translates circles to polygons	2019-08-28 13:01:01 -07:00
Martijn van Groningen	c8436a7a36	Merge remote-tracking branch 'es/master' into enrich	2019-08-28 10:05:14 +02:00
James Rodewig	ad8eb03295	[DOCS] Relocate Ingest API docs to REST API section (#45812 )	2019-08-23 11:54:40 -04:00
Martijn van Groningen	f14874ca47	Change how type is stored in an enrich policy. (#45789 ) A policy type controls how the enrich index is created and the query executed against the match field. Currently there is a single policy type (`exact_match`). In the near future more policy types will be added and different policy may have different configuration options. For this reason type should be a json object instead of a string field: ``` { "exact_match": { ... } } ``` instead of: ``` { "type": "exact_match", ... } ``` This will make streaming parsing of enrich policies easier as in the new format, the parsing code can know ahead what configuration fields to expect. In the latter format that is not possible if the type field appears not as the first field. Relates to #32789	2019-08-23 13:38:12 +02:00
Martijn van Groningen	2879e6717e	Enrich processor configuration changes (#45466 ) Enrich processor configuration changes: * Renamed `enrich_key` option to `field` option. * Replaced `set_from` and `targets` options with `target_field`. The `target_field` option behaves differently from how `set_from` and `targets` worked. The `target_field` is the field that will contain the looked up document. Relates to #32789	2019-08-22 09:22:40 +02:00
Michael Basnight	a7c5925104	Consolidate enrich list all and get by name APIs (#45705 ) The get and list APIs are a single API in this commit. Whether requesting one named policy or all policies, a list of policies is returned. The list API code has all been removed and the GET api is what remains, which contains much of the list response code.	2019-08-20 10:05:45 -05:00
Martijn van Groningen	5707bc7f5d	Merge remote-tracking branch 'es/master' into enrich	2019-08-16 09:42:36 +02:00
Jake Landis	9c388084d5	Fix bug in ingest node documentation (#45589 ) The "Conditionals with the Pipeline Processor" incorrectly documents how to create a pipeline of pipelines with a failure condition. The example as-is will always execute the fail processor. The change here updates the documentation to correctly guard the fail processor with an if condition.	2019-08-15 15:08:42 -05:00
Michael Basnight	9e22fd4db8	Fail delete policy if pipeline exists (#44438 ) If a pipeline that references the policy exists, we should not allow the policy to be deleted. The user will need to remove the processor from the pipeline before deleting the policy. This commit adds a check to ensure that the policy cannot be deleted if it is referenced by any pipeline in the system.	2019-08-14 13:43:41 -05:00
Martijn van Groningen	25599984fe	Improve naming of enrich policy fields. (#45494 ) Renamed `enrich_key` to `match_field` and renamed `enrich_values` to `enrich_fields`. Relates #32789	2019-08-14 11:44:31 +02:00
István Zoltán Szabó	a0ba1a79ea	[DOCS] Reformats cluster node info API (#45446 ) Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-08-13 13:26:57 +02:00
István Zoltán Szabó	5cba5ac01c	[DOCS] Reformats cluster node stats API (#45441 ) Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-08-13 12:46:47 +02:00
Martijn van Groningen	bfa25b4ce0	Add initial version of enrich processor docs. (#45084 ) Relates to #32789	2019-08-12 20:36:10 +02:00
Alexander Reelsen	b7553af720	Add back lowercase processor in docs (#45090 ) This got lost in a refactoring in `9137d92ca6`	2019-08-06 09:20:04 -04:00
Jason Tedor	3c6bc34c72	Fix GeoIP custom database directory in docs (#43383 ) These docs were misleading for package installations of Elasticsearch. Instead, we should refer to $ES_CONFIG/ingest-geoip as the path to place the custom database files. For non-package installations, this is the same as $ES_HOME/config, but for package installations this is not the case as the config directory for package installations is /etc/elasticsearch, and is not relative to $ES_HOME. This commit corrects the docs.	2019-06-19 13:25:02 -04:00
Brandon Morelli	bcb77b4fde	[docs] Add missing comma (#43073 ) Adds a missing comma to a code example	2019-06-17 06:52:48 -07:00
Marios Trivyzas	c8125417dc	[Docs] Add note for date patterns used for index search. (#42810 ) Add an explanatory NOTE section to draw attention to the difference between small and capital letters used for the index date patterns. e.g.: HH vs hh, MM vs mm. Closes: #22322	2019-06-03 22:26:01 +02:00
Jack Conradson	c59fbb3358	Reorganize Painless doc structure (#42303 )	2019-05-21 13:47:47 -04:00
Alexander Reelsen	2a9da80a24	Add HTML strip processor (#41888 ) This processor uses the lucene HTMLStripCharFilter class to remove HTML entities from a field. This adds to the char filter, so that there is possibility to store the stripped version as well. Note that the character filter replaces tags with a newline, so that the produced HTML will look slightly different than the incoming HTML with regards to newlines.	2019-05-09 12:59:45 +02:00
Flavio Pompermaier	ed3e25ae7d	Fix wrong property name (#40636 )	2019-05-09 08:52:36 +02:00
James Rodewig	737b359b94	[DOCS] Escape quotes to avoid smart quotes in Asciidoctor (#41603 )	2019-04-30 16:30:58 -04:00
James Rodewig	adf67053f4	[DOCS] Add anchors for Asciidoctor migration (#41648 )	2019-04-30 10:19:09 -04:00
Jason Tedor	e99bbd4b0b	Fix date index name processor default date_formats (#40915 ) This commit is a correction of a doc bug in the docs for the ingest date-index-name processor. The correct pattern is yyyy-MM-dd'T'HH:mm:ss.SSSXX. This is due to the transition from Joda time to Java time where Z does not mean the same thing between the two.	2019-04-05 17:45:30 -04:00
ajoshbiol	dd01da9f6f	Adding an example in the Set processor documentation to address #30604 (#39941 ) * Added an example of using set to copy values from one field to another. * Modified the document type to match the test.	2019-03-12 10:47:47 -07:00
Jake Landis	66ec35801c	Execute ingest node pipeline before creating the index (#39607 ) Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates #32786 Relates #32758 Closes #36545	2019-03-06 16:18:43 -06:00
Alexander Reelsen	5f7168ea74	Remove joda time mentions in documentation (#38720 ) This is the forward port of #38720 (not containing the 7.0 migration docs)	2019-02-14 10:18:48 +01:00
Jake Landis	431c4fd55e	fix dissect doc "ip" --> "clientip" (#38545 ) Forward port of #38512.	2019-02-08 16:52:33 -06:00
Lee Hinman	645db34e0e	bad formatted JSON object (#38515 ) (#38525 ) It just needs to replace the wrong " , " with " : ". Backport of #38515	2019-02-06 13:02:02 -07:00
Gordon Brown	292e0f6fb7	Deprecate `_type` in simulate pipeline requests (#37949 ) As mapping types are being removed throughout Elasticsearch, the use of `_type` in pipeline simulation requests is deprecated. Additionally, the default `_type` used if one is not supplied has been changed to `_doc` for consistency with the rest of Elasticsearch.	2019-02-04 16:11:44 -07:00
Jake Landis	5b008a34aa	Ingest node - user agent, move device to an object (#38115 ) When the ingest node user agent parses the device field, it will result in a string value. To match the ecs schema this commit moves the value of the parsed device to an object with an inner field named 'name'. There are not any passivity concerns since this modifies an unreleased change. closes #38094 relates #37329	2019-01-31 13:54:34 -06:00
Lee Hinman	cac6b8e06f	Add ECS schema for user-agent ingest processor (#37727 ) (#37984 ) * Add ECS schema for user-agent ingest processor (#37727) This switches the format of the user agent processor to use the schema from [ECS](https://github.com/elastic/ecs). So rather than something like this: ``` { "patch" : "3538", "major" : "70", "minor" : "0", "os" : "Mac OS X 10.14.1", "os_minor" : "14", "os_major" : "10", "name" : "Chrome", "os_name" : "Mac OS X", "device" : "Other" } ``` The structure is now like this: ``` { "name" : "Chrome", "original" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36", "os" : { "name" : "Mac OS X", "version" : "10.14.1", "full" : "Mac OS X 10.14.1" }, "device" : "Other", "version" : "70.0.3538.102" } ``` This is now the default for 7.0. The deprecated `ecs` setting in 6.x is not supported. Resolves #37329 * Remove `ecs` setting from docs	2019-01-30 11:24:18 -07:00
Christoph Büscher	34f2d2ec91	Remove remaining occurrences of "include_type_name=true" in docs (#37646 )	2019-01-22 15:13:52 +01:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Josh Soref	edb48321ba	[DOCS] Various spelling corrections (#37046 )	2019-01-07 14:44:12 +01:00
Adam Thomson	ac4aecc92d	[Docs] Update ingest-node.asciidoc (#37116 )	2019-01-04 19:33:06 +01:00
Jason Tedor	9137d92ca6	Refactor ingest node API docs (#36962 ) This commit is a simple refactoring of the ingest node API docs, breaking each API into a single file for ease of maintaining.	2018-12-23 08:59:18 -05:00
Jason Tedor	7562768bd6	Fix ingest cross-doc links This commit fixes some cross-doc links from the old ingest plugins page to the new ingest processor pages that arose after converting ingest-geoip and ingest-user-agent to modules.	2018-12-22 20:51:18 -05:00
Jason Tedor	e14f27c033	Fix titles of GeoIP and User Agent processor docs This commit makes the titles of the new GeoIP and User Agent processor docs look more like the titles of the docs for other processors.	2018-12-22 20:31:07 -05:00
Jason Tedor	1f574bd17a	Package ingest-user-agent as a module (#36956 ) This commit moves ingest-user-agent from being a plugin to being a module that is packaged with Elasticsearch distributions.	2018-12-22 20:20:53 -05:00
Jason Tedor	434021c3ec	Add placeholder ingest-geoip plugin page (#36958 ) This commit adds a placeholder ingest-geoip plugin page as there are other components in the Elastic Stack that still refer to these pages. These docs would be broken without this placeholder page forcing teams responsible for those docs to scramble to fix the build over the weekend before a holiday period. Instead, we add a placeholder page so the docs build continues to function, and those teams can fix their docs without the constraint of a broken build. We also cleanup a few minor docs issues that were missed during the initial changes to convert ingest-geoip to a module.	2018-12-22 09:49:56 -05:00
Jason Tedor	e1717df0ac	Package ingest-geoip as a module (#36898 ) This commit moves ingest-geoip from being a plugin to being a module that is packaged with Elasticsearch distributions.	2018-12-22 07:21:49 -05:00
Jason Tedor	35911d8dd7	Split the ingest processor docs into multiple files (#36887 ) This commit breaks the single ingest docs file into multiple files, factoring out the processor docs into a documentation file per processor. This will help make this content easier to maintain.	2018-12-20 08:04:54 -05:00
Boaz Leskes	e356b8cb95	Add doc's sequence number + primary term to GetResult and use it for updates (#36680 ) This commit adds the sequence number and primary term of the last operation that has modified a document to `GetResult` and uses it to power the Update API. Relates #36148 Relates #10708	2018-12-17 15:22:13 +01:00
Jake Landis	4b99a663c1	ingest: fix broken doc link	2018-11-26 10:34:42 -06:00
Jake Landis	7f7b31723e	ingest: extended `if` documentation (#35044 ) part of #33188	2018-11-26 09:35:45 -06:00
Chris Cho	e572a21c4b	[Docs] Improve Convert Processor description (#35280 ) Sometimes users are confused about whether they can use the Convert Processor to change an existing field's type to another type even if the existing one is already ingested. This confusion comes from the first line of the description. Changing this and also adding some detail to the code snippet.	2018-11-07 17:01:35 +01:00
Jake Landis	c2766b65cf	ingest: raise visibility of ingest plugin documentation (#35048 ) * move the set security user processor to the main documentation * link to plugin processors part of #33188	2018-11-05 11:44:10 -06:00
Jake Landis	77fab62ebe	ingest: add common options to each processor's documentation (#35091 ) * adds `if`, `on_failure`, `tag`, and `ignore_failure` to table for each processor part of #33188 * added ignore_failure * fix whitespace noise	2018-11-01 11:08:04 -05:00
Armin Braun	f79bdec58a	INGEST: Document Pipeline Processor (#33418 ) * Added documentation for Pipeline Processor * Relates #33188	2018-10-23 15:36:57 -05:00
Jake Landis	a8e1ee34ca	ingest: document fields that support templating (#34536 ) This change also updates many of the examples to use ecs as the example. Some additional minor improvements are also included. Part of #33188	2018-10-23 13:28:44 -05:00
Jake Landis	c447fc258a	ingest: documentation for the drop processor (#34570 )	2018-10-23 12:30:23 -05:00
Armin Braun	f0f732908e	INGEST: Document Processor Conditional (#33388 ) * INGEST: Document Processor Conditional Relates #33188	2018-10-23 17:37:30 +02:00
Jake Landis	79b507dbf5	ingest: Introduce the dissect processor (#32884 ) * ingest: Introduce the dissect processor The ingest node dissect processor is an alternative to Grok to split a string based on a pattern. Dissect differs from Grok in that regular expressions are not used to split the string. Dissect can be used to parse a source text field with a simpler pattern, and is often faster than Grok for basic string parsing. This processor uses the dissect library which does most of the work.	2018-08-28 07:11:20 -07:00
Jake Landis	3d4c84f7ca	ingest: doc: move Dot Expander Processor doc to correct position (#31743 ) No changes to the content.	2018-08-03 07:21:05 -07:00
Armin Braun	7aa8a0a927	INGEST: Extend KV Processor (#31789 ) (#32232 ) * INGEST: Extend KV Processor (#31789) Added more capabilities supported by LS to the KV processor: * Stripping of brackets and quotes from values (`include_brackets` in corresponding LS filter) * Adding key prefixes * Trimming specified chars from keys and values Refactored the way the filter is configured to avoid conditionals during execution. Refactored Tests a little to not have to add more redundant getters for new parameters. Relates #31786 * Add documentation	2018-07-20 22:32:50 +02:00
Armin Braun	e46ed73379	Ingest: Add ignore_missing option to RemoveProc (#31693 ) Added `ignore_missing` setting to the RemoveProcessor to fix #23086	2018-07-09 10:24:34 +02:00
Jake Landis	c0056cddd8	ingest: Introduction of a bytes processor (#31733 ) ingest: Introduction of a bytes processor This processor allows for human readable byte values (e.g. 1kb) to be converted to a value in bytes (e.g. 1024). Internally this processor re-uses "ByteSizeValue.parseBytesSizeValue" which supports conversions up to Long.MAX_VALUE and the following units: "b", "kb", "mb", "gb", "tb", "pb". This change also introduces a generic return type for the AbstractStringProcessor to allow for code reuse while supporting a String -> T conversion. (String -> Long in this case).	2018-07-03 10:40:56 -05:00
Armin Braun	13e1cf6191	ingest: Add ignore_missing property to foreach filter (#22147 ) (#31578 )	2018-06-26 20:04:41 +02:00
Martijn van Groningen	6030d4be1e	[INGEST] Interrupt the current thread if evaluating grok expressions takes too long (#31024 ) This adds a thread interrupter that allows us to encapsulate calls to org.joni.Matcher#search(). This method can hang forever if the regex expression is too complex. The thread interrupter in the background checks every 3 seconds whether there are threads executing the org.joni.Matcher#search() method for longer than 5 seconds and if so interrupts these threads. Joni checks every 30k iterations whether the current thread is interrupted and if so returns org.joni.Matcher#INTERRUPTED. Closes #28731	2018-06-12 07:49:03 +02:00
Tanguy Leroux	42608881b0	[Docs] Remove mention pattern files in Grok processor (#31170 ) Pattern files have been removed in `16fa3e546e`	2018-06-11 09:32:12 +02:00
rzmf	080cefec73	Fix missing comma in ingest-node.asciidoc (#29343 )	2018-04-03 11:33:44 +01:00
Nik Everett	762226bee9	Docs: Support triple quotes (#28915 ) Adds support for triple quoted strings to the documentation test generator. Kibana's CONSOLE tool has supported them for a year but we were unable to use them in Elasticsearch's docs because the process that converts example snippets into tests couldn't handle this. This change adds code to convert them into standard JSON so we can pass them to Elasticsearch.	2018-03-16 12:46:39 -04:00
Jiri Tyr	c713d62f88	[Docs] Fix link to Grok patterns (#29088 )	2018-03-16 14:13:17 +01:00
Devin Young	e8a78df555	Fix markdown formatting (#28392 )	2018-01-26 08:15:16 -07:00
Sian Lerk Lau	5e3ba8a88d	Enable convert processor to support Long and Double. (#27957 ) Closes #23085	2018-01-03 11:27:55 +01:00
Sian Lerk Lau	47eefbe889	Enable grok processor to support long, double and boolean (#27896 )	2017-12-20 11:19:49 -08:00
David Pilato	3ca39186d1	Fix missing comma in examples (#27904 )	2017-12-19 18:28:39 +01:00


337 Commits