elasticsearch

Commit Graph

Author	SHA1	Message	Date
Joe Gallo	dd32cb6439	Document new ip_location processor (#116623 )	2024-11-11 19:55:57 -05:00
Joe Gallo	2302cdbe45	Document new ip_location APIs (#116611 )	2024-11-11 13:52:47 -05:00
Joe Gallo	b517abcb07	Document new ip geolocation fields (#116603 )	2024-11-11 11:13:56 -05:00
Giorgos Bamparopoulos	9ad09b6ee0	Fix a typo in the example for using pre-existing pipeline definitions (#116084 )	2024-11-04 16:06:16 +01:00
István Zoltán Szabó	9394e88c0f	[DOCS] Updates inference processor docs. (#115566 )	2024-10-25 10:18:01 +02:00
Keith Massey	2ff6bb0543	Adding support for additional mapping to simulate ingest API (#114742 )	2024-10-21 17:08:50 -05:00
Quentin Pradet	fc23f2f1c6	[DOCS] Fix User agent processor properties (#112518 )	2024-10-15 17:35:26 +04:00
Pete Gillin	c8c6f5af53	Actually add `terminate` docs page (#114440 ) A docs page for the `terminate` processor was added in https://github.com/elastic/elasticsearch/pull/114157, but the change to include it in the outer processor reference page was omitted. This change corrects that oversight.	2024-10-10 08:34:43 +01:00
Keith Massey	fb482f863d	Adding index_template_substitutions to the simulate ingest API (#114128 ) This adds support for a new `index_template_substitutions` field to the body of an ingest simulate API request. These substitutions can be used to change the pipeline(s) used for ingest, or to change the mappings used for validation. It is similar to the `component_template_substitutions` added in #113276. Here is an example that shows both of those usages working together: ``` ## First, add a couple of pipelines that set a field to a boolean: PUT /_ingest/pipeline/foo-pipeline?pretty { "processors": [ { "set": { "field": "foo", "value": true } } ] } PUT /_ingest/pipeline/bar-pipeline?pretty { "processors": [ { "set": { "field": "bar", "value": true } } ] } ## Now, create three component templates. One provides a mapping enforces that the only field is "foo" ## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting ## that makes "foo-pipeline" the default pipeline. ## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates ## together would cause a validation exception. These could be in the same template, but are provided ## separately just so that later we can show how multiple templates can be overridden. 
PUT _component_template/mappings_template { "template": { "mappings": { "dynamic": "strict", "properties": { "foo": { "type": "keyword" } } } } } PUT _component_template/mappings_template_with_bar { "template": { "mappings": { "dynamic": "strict", "properties": { "foo": { "type": "keyword" }, "bar": { "type": "boolean" } } } } } PUT _component_template/settings_template { "template": { "settings": { "index": { "default_pipeline": "foo-pipeline" } } } } ## Here we create an index template pulling in both of the component templates above PUT _index_template/template_1 { "index_patterns": ["foo"], "composed_of": ["mappings_template", "settings_template"] } ## We can index a document here to create the index, or not. Either way the simulate call ought to work the same POST foo-1/_doc { "foo": "FOO" } ## This will not blow up with validation exceptions because the substitute "index_template_substitutions" ## uses `mappings_template_with_bar`, which adds the bar field. ## And the bar-pipeline is executed rather than the foo-pipeline because the substitute ## "index_template_substitutions" uses a substitute `settings_template`, so the value of "foo" ## does not get set to an invalid type. POST _ingest/_simulate?pretty&index=foo-1 { "docs": [ { "_id": "asdf", "_source": { "foo": "foo", "bar": "bar" } } ], "component_template_substitutions": { "settings_template": { "template": { "settings": { "index": { "default_pipeline": "bar-pipeline" } } } } }, "index_template_substitutions": { "template_1": { "index_patterns": ["foo"], "composed_of": ["mappings_template_with_bar", "settings_template"] } } } ```	2024-10-09 10:15:37 +11:00
Pete Gillin	43e5258b3c	Add a `terminate` ingest processor (#114157 ) This processor simply causes any remaining processors in the pipeline to be skipped. It will normally be executed conditionally using the `if` option. (If this pipeline is being called from another pipeline, the calling pipeline is not terminated.) For example, this: ``` POST /_ingest/pipeline/_simulate { "pipeline": { "description": "Appends just 'before' to the steps field if the number field is present, or both 'before' and 'after' if not", "processors": [ { "append": { "field": "steps", "value": "before" } }, { "terminate": { "if": "ctx.error != null" } }, { "append": { "field": "steps", "value": "after" } } ] }, "docs": [ { "_index": "index", "_id": "doc1", "_source": { "name": "okay", "steps": [] } }, { "_index": "index", "_id": "doc2", "_source": { "name": "bad", "error": "oh no", "steps": [] } } ] } ``` returns something like this: ``` { "docs": [ { "doc": { "_index": "index", "_version": "-3", "_id": "doc1", "_source": { "name": "okay", "steps": [ "before", "after" ] }, "_ingest": { "timestamp": "2024-10-04T16:25:20.448881Z" } } }, { "doc": { "_index": "index", "_version": "-3", "_id": "doc2", "_source": { "name": "bad", "error": "oh no", "steps": [ "before" ] }, "_ingest": { "timestamp": "2024-10-04T16:25:20.448932Z" } } } ] } ```	2024-10-08 17:39:53 +01:00
István Zoltán Szabó	57955cb8d4	[DOCS] Adds DeBERTA v2 to the tokenizers list in API docs (#112752 ) Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>	2024-10-07 10:23:46 +02:00
Liam Thompson	6e400c12a7	[DOCS] Port connector docs from Enterprise Search guide (#112953 )	2024-09-30 10:22:37 +02:00
Sam Xiao	6917f1679a	Tag redacted document in ingest pipeline (#113552 ) Adds a new option trace_redact in redact processor to indicate a document has been redacted in the ingest pipeline. If a document is processed by a redact processor AND any field is redacted, ingest metadata _ingest._redact._is_redacted = true will be set. Closes #94633	2024-09-27 12:24:24 -04:00
kosabogi	6e73c1423b	Adds text_similarity task type to inference processor documentation (#113517 )	2024-09-26 16:12:28 +02:00
Keith Massey	cd950bb2fa	Adding component template substitutions to the simulate ingest API (#113276 )	2024-09-25 15:30:22 -05:00
Stef Nestor	e6b15f4bf7	(Doc+) Inference Pipeline ignores Mapping Analyzers (#112522 ) * (Doc+) Inference Pipeline ignores Mapping Analyzers From internal Dev feedback (will cross-link after), this updates that inference processors within ingest pipelines run before mapping analyzers effectively ignoring them. So if users want analyzers to take effect, they would need to select the analyzer's ingest pipeline process equivalent and run it higher in flow than the inference processor. --------- Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>	2024-09-11 16:05:15 -06:00
Keith Massey	4aa3c3d7ee	Add support for templates when validating mappings in the simulate ingest API (#111161 )	2024-09-05 09:25:53 -05:00
Panos Koutsovasilis	29453cb2ce	fix: support all allowed protocol numbers (#111528 ) * fix(CommunityIdProcessor): support all allowed protocol numbers * fix(CommunityIdProcessor): update documentation	2024-08-26 08:37:40 +03:00
Niels Bauman	e0c1ccbc1e	Make enrich cache based on memory usage (#111412 ) The max enrich cache size setting now also supports an absolute max size in bytes (of used heap space) and a percentage of the max heap space, next to the existing flat document count. The default is 1% of the max heap space. This should prevent issues where the enrich cache takes up a lot of memory when there are large documents in the cache.	2024-08-23 09:26:55 +02:00
István Zoltán Szabó	1ba72e4602	[DOCS] Documents output_field behavior after multiple inference runs (#111875 ) Co-authored-by: David Kyle <david.kyle@elastic.co>	2024-08-15 12:36:59 +02:00
Keith Massey	c6a7537df7	Ingest download databases docs (#111688 ) Co-authored-by: Joe Gallo <joegallo@gmail.com>	2024-08-08 09:23:56 -05:00
Joe Gallo	1aa5b2face	Fix geoip processor isp_organization_name property and docs (#111372 )	2024-07-26 18:28:44 -04:00
Niels Bauman	86727a8741	Add size_in_bytes to enrich cache stats (#110578 ) As preparation for #106081, this PR adds the `size_in_bytes` field to the enrich cache. This field is calculated by summing the ByteReference sizes of all the search hits in the cache. It's not a perfect representation of the size of the enrich cache on the heap, but some experimentation showed that it's quite close.	2024-07-12 08:53:53 +02:00
Matt Culbreth	81b8495388	Mark the Redact processor as Generally Available	2024-07-02 16:58:57 -04:00
Kathleen DeRusso	7a1d532ffb	Pass over Sparse Vector docs for correctness (#110282 ) * Remove legacy mentions of text expansion queries * Add missing query_vector param to sparse_vector query docs * Fix formatting errors in sparse vector query dsl doc * Remove unnecessary test setup block	2024-07-02 13:37:25 -04:00
Joe Gallo	d9941f6285	Ingest geoip new databases release highlight (#109355 )	2024-06-04 12:48:19 -04:00
Joe Gallo	e1b2b599de	Add continent_code support to the geoip processor (#108780 )	2024-05-17 11:48:23 -04:00
Joe Gallo	babab0a8c0	Add support for the 'Connection Type' database to the geoip processor (#108683 )	2024-05-15 17:58:08 -04:00
Keith Massey	639eee577e	Adding user_type support for the enterprise database for the geoip processor (#108687 )	2024-05-15 12:23:52 -05:00
Keith Massey	69ec54d541	Add support for the 'ISP' database to the geoip processor (#108651 )	2024-05-15 09:27:06 -05:00
Joe Gallo	cc6597df23	Add support for the 'Domain' database to the geoip processor (#108639 )	2024-05-14 17:49:05 -04:00
Keith Massey	bcd62e8d03	Adding hits_time_in_millis and misses_time_in_millis to enrich cache stats (#107579 )	2024-04-18 15:19:24 -05:00
Keith Massey	8adc2926a2	Fixed the spelling of the word successful in docs (#107595 )	2024-04-18 08:08:30 -05:00
Liam Thompson	33a71e3289	[DOCS] Refactor book-scoped variables in `docs/reference/index.asciidoc` (#107413 ) * Remove `es-test-dir` book-scoped variable * Remove `plugins-examples-dir` book-scoped variable * Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables - In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed. - In `sql/index.asciidoc`, the `:sql-tests:` path was updated to fuller path - In `esql/index.asciidoc`, the `:esql-tests:` path was updated idem * Replace `es-repo-dir` with `es-ref-dir` * Move `:include-xpack: true` to few files that use it, remove from index.asciidoc	2024-04-17 14:37:07 +02:00
Keith Massey	f5c7938ab8	Adding cache_stats to geoip stats API (#107334 )	2024-04-16 16:57:14 -05:00
Joe Gallo	6ff3a2628a	Add support for the 'Enterprise' database to the geoip processor (#107377 )	2024-04-11 16:45:10 -04:00
Joe Gallo	5266f79b16	Add support for the 'Anonymous IP' database to the geoip processor (#107287 )	2024-04-11 14:05:52 -04:00
Keith Massey	48a88c575c	Renaming GeoIpDownloaderStatsAction (#107290 ) Renaming GeoIpDownloaderStatsAction to GeoIpStatsAction	2024-04-10 09:21:24 -05:00
Jennie Soria	30828a5680	Update geoip.asciidoc (#105908 ) The GeoIP endpoint does not use the xpack http client. The GeoIP downloader uses the JDK's built-in cacerts. If a customer is using a custom https endpoint, they need to provide the CA certificate in the JDK, whether that is our bundled JDK or their own. Otherwise they will see something like ``` ...PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target... ```	2024-03-05 11:26:49 +01:00
Liam Thompson	52aefa59eb	[DOCS] Ingest processors docs improvements (#104384 ) * [DOCS] Categorize ingest processors on overview page, summarize use cases * Add overview info, subheading, links * Apply suggestions from review Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> * Insert space --------- Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>	2024-01-17 11:50:29 +01:00
ShourieG	147484b059	[elasticsearch][processors] - Added support for override flag in rename processor (#103565 ) * added override flag for rename processor along with factory tests * added yaml tests for rename processor using the override flag * updated renameProcessor tests to include override flag as a parameter * updated rename processor tests to incorporate override flag = true scenario * updated rename processor asciidoc with override option * updated rename processor asciidoc with override option * removed unnecessary suppresswarnings tag * corrected formatting errors * updated processor tests * fixed yaml tests * Prefer early throw style here * Whitespace * Move and rewrite this test It's just a simple test of the primary behavior of the rename processor, so put it first and simplify it. * Rename this test It doesn't actually exercise template snippets * Tidy up this test --------- Co-authored-by: Joe Gallo <joegallo@gmail.com>	2024-01-11 16:00:02 +05:30
Adam Demjen	a26ff243f6	[Docs] [Enterprise Search] ML inference pipeline documentation updates (#103022 ) * Remove mapping step, wording and screenshot updates * Notes about pipeline name and model deployment * Address CR comments	2024-01-02 09:56:50 -05:00
Abdon Pijpelink	ac973f0064	[DOCS] Improve enrich policy execute 'wait_for_completion' docs (#102291 ) * [DOCS] Improve enrich policy execute 'wait_for_completion' docs * Update docs/reference/ingest/apis/enrich/execute-enrich-policy.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> --------- Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2023-11-27 17:17:06 +01:00
Abdon Pijpelink	bc59315baa	[DOCS] Examples for ES\|QL DISSECT and WHERE (#102591 ) * DISSECT examples * WHERE examples * Remove references to empty keys * Fix non-deterministic test	2023-11-27 10:56:48 +01:00
Keith Massey	643d825c45	Adding a simulate ingest api (#101409 ) This commit introduces a new _ingest/simulate API that runs any pipelines on the given data that would be executed for a given index, but instead of indexing the data into the index, returns the transformed documents and the list of pipelines that were executed.	2023-11-15 17:25:09 -06:00
Liam Thompson	ddd94446f8	[DOCS] Fix incorrect image paths (#102082 )	2023-11-13 16:00:00 +01:00
Felix Barnsteiner	978a5469ce	Add support for marking component templates as deprecated (#101148 )	2023-11-02 19:28:20 +01:00
István Zoltán Szabó	c34e0c0746	[DOCS] Clarifies that inference input must be single string (#101301 )	2023-10-25 17:18:05 +02:00
Liam Thompson	a6ed18c144	[DOCS] [Enterprise Search] Migrate ingest pipelines/ML docs (#101156 ) * WIP, port docs - Update link syntax - Update ids - Fix n^n build failures :/ - * Fix id for doclink * Let's try this on for size * Idem * Update attributes, Test image rendering * Update image name * Fix typo * Update filename * Add images, cleanup, standardize naming * Tweak heading * Cleanup, rewordings - Modified introduction in `search-inference-processing.asciidoc`. - Changed "Search connector" to "Elastic connector". - Adjusted heading levels in `search-inference-processing.asciidoc`. - Simplified ingest pipelines intro in `search-ingest-pipelines.asciidoc`. - Edited ingest pipelines section for the Content UI. - Reordered file inclusions in `search-ingest-pipelines.asciidoc`. - Formatted inference pipeline creation into steps in `search-nlp-tutorial.asciidoc`. * Lingering erroneousness * Delete FAQ	2023-10-25 17:17:24 +02:00
Abdon Pijpelink	284f81873f	[DOCS] Expand ES\|QL DISSECT and GROK documentation (#101225 ) * Add 'Process data with DISSECT and GROK' page * Expand DISSECT docs * More DISSECT and GROK enhancements * Improve examples * Fix CSV tests * Review feedback * Reword	2023-10-25 13:19:17 +02:00

447 Commits