Commit Graph

248 Commits

Author SHA1 Message Date
James Rodewig ddf5c0a76a
[DOCS] Reformat n-gram token filter docs (#49438)
Reformats the edge n-gram and n-gram token filter docs. Changes include:

* Adds title abbreviations
* Updates the descriptions and adds Lucene links
* Reformats parameter definitions
* Adds analyze and custom analyzer snippets
* Adds notes explaining differences between the edge n-gram and n-gram
  filters

Additional changes:
* Switches titles to use "n-gram" throughout.
* Fixes a typo in the edge n-gram tokenizer docs
* Adds an explicit anchor for the `index.max_ngram_diff` setting
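
For context, here is a minimal `_analyze` request in the style of the snippets this change adds; the sample text and gram sizes are illustrative, not taken from the docs:

```console
# Illustrative example only; gram sizes and text are arbitrary
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "ngram",
      "min_gram": 2,
      "max_gram": 3
    }
  ],
  "text": "fox"
}
```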
2019-11-22 10:38:01 -05:00
Christoph Büscher ed86750fa4
Allow custom characters in token_chars of ngram tokenizers (#49250)
Currently the `token_chars` setting in both `edgeNGram` and `ngram` tokenizers
only allows for a list of predefined character classes, which might not fit
every use case. For example, including underscore "_" in a token would currently
require the `punctuation` class which comes with a lot of other characters.
This change adds an additional "custom" option to the `token_chars` setting,
which requires an additional `custom_token_chars` setting to be present and
which will be interpreted as a set of characters to include in a token.

Closes #25894
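
A rough sketch of the new option, assuming an index named `my-index`; the underscore is the extra character being allowed inside tokens:

```console
# Hypothetical index and tokenizer names
PUT /my-index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 3,
          "token_chars": [ "letter", "digit", "custom" ],
          "custom_token_chars": "_"
        }
      },
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "my_ngram_tokenizer"
        }
      }
    }
  }
}
```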
2019-11-20 10:36:39 +01:00
James Rodewig 3cf6569e0e
[DOCS] Reformat elision token filter docs (#49262) 2019-11-19 10:54:29 -05:00
James Rodewig ee6f80b1de
[DOCS] Reformat fingerprint token filter docs (#49311) 2019-11-19 10:54:16 -05:00
gpaimla d1ea9910c3 Implement Lucene EstonianAnalyzer, Stemmer (#49149)
This PR adds a new analyzer and stemmer for the Estonian language.

Closes #48895
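
A minimal sketch of selecting the new analyzer and stemmer; the index name and filter name are illustrative:

```console
# Illustrative index and filter names
PUT /estonian-example
{
  "settings": {
    "analysis": {
      "filter": {
        "estonian_stemmer": {
          "type": "stemmer",
          "language": "estonian"
        }
      },
      "analyzer": {
        "default": {
          "type": "estonian"
        }
      }
    }
  }
}
```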
2019-11-18 17:19:54 +01:00
James Rodewig 2fe9ba53ec
[DOCS] Note limitations of `max_gram` param in `edge_ngram` tokenizer for index analyzers (#49007)
The `edge_ngram` tokenizer limits tokens to the `max_gram` character
length. Autocomplete searches for terms longer than this limit return
no results.

To prevent this, you can use the `truncate` token filter to truncate
tokens to the `max_gram` character length. However, this could return irrelevant results.

This commit adds some advisory text to make users aware of this limitation and outline the tradeoffs for each approach.

Closes #48956.
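
A sketch of the trade-off described above, assuming a `max_gram` of 10: the index analyzer emits edge n-grams, and the search analyzer uses the `truncate` filter so longer query terms still match. All names here are illustrative:

```console
# Hypothetical index; max_gram of 10 chosen for illustration
PUT /autocomplete-example
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": [ "letter" ]
        }
      },
      "filter": {
        "truncate_10": { "type": "truncate", "length": 10 }
      },
      "analyzer": {
        "autocomplete_index": {
          "tokenizer": "autocomplete",
          "filter": [ "lowercase" ]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase",
          "filter": [ "truncate_10" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete_index",
        "search_analyzer": "autocomplete_search"
      }
    }
  }
}
```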
2019-11-13 14:27:10 -05:00
James Rodewig c4e113ec60
[DOCS] Reformat compound word token filters (#49006)
* Separates the compound token filters doc pages into separate token
  filter pages:
  * Dictionary decompounder token filter
  * Hyphenation decompounder token filter

* Adds analyze API examples for each compound token filter

* Adds a redirect for the removed compound token filters page

Co-Authored-By: debadair <debadair@elastic.co>
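
One of the added analyze API examples, in miniature; the word list and text mirror the usual dictionary decompounder example:

```console
# Word list and text are illustrative
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "dictionary_decompounder",
      "word_list": [ "Donau", "dampf", "meer", "schiff" ]
    }
  ],
  "text": "Donaudampfschiff"
}
```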
2019-11-13 09:35:00 -05:00
James Rodewig 547f30077c
[DOCS] Reformat condition token filter (#48775) 2019-11-11 08:49:01 -05:00
Julian Simioni 05bc46e7e4 [Docs] Consolidate single example into a single line (#48904)
The first example of splitting rules for the `word_delimiter` token filter was spread across two bullet points. This makes it look like they are two separate splitting rules.
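
For reference, the splitting behaviour those bullet points describe can be reproduced with a single `_analyze` request; the sample text is illustrative:

```console
# Illustrative text; the filter splits on hyphens, case changes, and letter/number boundaries
GET /_analyze
{
  "tokenizer": "keyword",
  "filter": [ "word_delimiter" ],
  "text": "Neil's Super-Duper-XL500--42+AutoCoder"
}
```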
2019-11-08 15:13:29 -05:00
James Rodewig 8ce338ee3d
[DOCS] Reformat decimal digit token filter docs (#48722) 2019-11-01 12:37:24 -04:00
Peter Johnson 65700b6940 [DOCS] Fix typo in synonym token filter docs (#48691) 2019-10-31 09:13:15 -04:00
James Rodewig eb9eb927ff
[DOCS] Remove unneeded filter from common grams analyze ex (#48748) 2019-10-31 09:07:27 -04:00
James Rodewig 60f9de543b
[DOCS] Reformat common grams token filter (#48426) 2019-10-30 08:40:11 -04:00
James Rodewig 31fc615381
[DOCS] Reformat ASCII folding token filter docs (#48143) 2019-10-23 15:06:18 -05:00
James Rodewig a0795163a9
[DOCS] Reformat classic token filter docs (#48314) 2019-10-23 09:38:22 -05:00
James Rodewig bb635e5a9e
[DOCS] Reformat CJK bigram and CJK width token filter docs (#48210) 2019-10-21 09:43:59 -04:00
James Rodewig c367c5cf75
[DOCS] Reformat apostrophe token filter docs (#48076) 2019-10-16 08:50:12 -04:00
Wilder Pereira 630bfa1001 [DOCS] Remove unneeded spaces from custom analyzer snippet (#47332) 2019-10-15 15:52:52 -04:00
James Rodewig 59933abb0e
[DOCS] Sort analyzers, tokenizers, and token filters alphabetically (#48068) 2019-10-15 15:46:50 -04:00
Alan Woodward c1f99e2d75
Remove `_type` from SearchHit (#46942)
This commit removes the `_type` field from all search hit responses.

Relates to #41059
2019-09-23 19:14:54 +01:00
James Rodewig de2c8f7231
Fixed sample code for minhash (#46385)
The sample code was wrong: a field type is required for the sample field. The intention was to give the sample field the name `fingerprint`, mapping it as `text` with the custom analyzer `my_analyzer`.
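
The corrected sample would look roughly like this: a `fingerprint` field of type `text` using a custom `my_analyzer` built around the `min_hash` filter. The shingle and min_hash parameters below are typical values, not necessarily those in the docs:

```console
# Hypothetical index name; filter parameters shown are typical, not canonical
PUT /minhash-example
{
  "settings": {
    "analysis": {
      "filter": {
        "my_shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 5,
          "max_shingle_size": 5,
          "output_unigrams": false
        },
        "my_minhash_filter": {
          "type": "min_hash",
          "hash_count": 1,
          "bucket_count": 512,
          "hash_set_size": 1,
          "with_rotation": true
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [ "my_shingle_filter", "my_minhash_filter" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "fingerprint": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
```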
2019-09-12 13:29:07 -04:00
Abhilash Bolla b4c18b9c44 Fixed grammar in pattern replace char filter docs. (#46546)
Minor grammar fix in the pattern replace char filter docs.
2019-09-10 09:46:06 -07:00
James Rodewig 5772c1c7dd
[DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353) 2019-09-09 13:13:41 -04:00
James Rodewig e43be90e6c
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) 2019-09-06 14:05:36 -04:00
James Rodewig 466c59a4a7
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295) 2019-09-05 16:47:18 -04:00
James Rodewig be7b873a43
[DOCS] Correct custom analyzer callouts (#46030) 2019-08-29 10:07:52 -04:00
MK Swanson f47886e44a
[DOCS] Modified section headings, edited text for clarity. (#44988)
* [DOCS] Modified section headings, edited text for clarity.

* [DOCS] Modified section headings, edited text for clarity.

* [DOCS] Modified section headings, edited text for clarity.
2019-07-30 16:03:05 -04:00
James Rodewig ea1adb61c2
[DOCS] Update anchors and links for Elasticsearch API relocation (#44500) 2019-07-19 09:16:35 -04:00
Christoph Büscher 56ee1a5e00
Allow reloading of search time analyzers (#43313)
Currently, changing resources (like dictionaries, synonym files, etc.) of search
time analyzers is only possible by closing an index, changing the underlying
resource (e.g. synonym files) and then re-opening the index for the change to
take effect.

This PR adds a new API endpoint that allows triggering reloading of certain
analysis resources (currently token filters) that will then pick up changes in
underlying file resources. To achieve this we introduce a new type of custom
analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows
swapping out analysis components. Custom analyzers that contain filters that are
marked as "updateable" will automatically choose this implementation. This PR
also adds this capability to `synonym` token filters for use in search time
analyzers.

Relates to #29051
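
A sketch of how the new endpoint is used: a synonym filter marked `updateable` may only appear in a search-time analyzer, and the reload call then picks up changes to its file. Index, filter, and field names are illustrative:

```console
# Hypothetical index and file path
PUT /reload-example
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms_path": "analysis/synonym.txt",
          "updateable": true
        }
      },
      "analyzer": {
        "search_synonyms": {
          "tokenizer": "standard",
          "filter": [ "lowercase", "my_synonyms" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "analyzer": "standard",
        "search_analyzer": "search_synonyms"
      }
    }
  }
}

# After editing analysis/synonym.txt on disk, trigger the reload
POST /reload-example/_reload_search_analyzers
```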
2019-06-27 18:27:11 +02:00
Alan Woodward d2c696d54b
Require [articles] setting in elision filter (#43083)
We should throw an exception at construction time if a list of
articles is not provided, otherwise we can get random NPEs during
indexing.

Relates to #43002
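
With this change, an elision filter must declare its `articles`; omitting them now fails at analyzer construction instead of producing NPEs at index time. The values below are the usual French elision articles, shown for illustration:

```console
# Illustrative index and filter names
PUT /elision-example
{
  "settings": {
    "analysis": {
      "filter": {
        "french_elision": {
          "type": "elision",
          "articles": [ "l", "m", "t", "qu", "n", "s", "j" ],
          "articles_case": true
        }
      },
      "analyzer": {
        "french_text": {
          "tokenizer": "standard",
          "filter": [ "french_elision", "lowercase" ]
        }
      }
    }
  }
}
```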
2019-06-27 08:56:26 +01:00
Sachin Frayne 31a37fbb00 Correct the description of generate_word_parts (#43026) 2019-06-10 11:37:34 +01:00
James Rodewig 8685a7b8d2
[DOCS] Add explicit `articles_case` parameter to Elision Token Filter example (#42987) 2019-06-07 11:22:32 -04:00
Mayya Sharipova 6f12eb168f Fix error with mapping in docs 2019-05-30 10:06:38 -04:00
Peter Dyson 588228816a [DOCS] path_hierarchy tokenizer examples (#39630)
Closes #17138
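
One of the added examples, in its simplest form:

```console
# Emits /one, /one/two, /one/two/three
GET /_analyze
{
  "tokenizer": "path_hierarchy",
  "text": "/one/two/three"
}
```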
2019-05-30 09:19:56 -04:00
Alan Woodward 72c7910299
Improvements to docs around multiplexer and synonyms (#41645)
This commit fixes a multiplexer doc error concerning synonyms, and adds
suggestions on how to combine the two filters.
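
A sketch of one way to combine the two filters, assuming the recommendation is to run synonyms inside a multiplexer branch rather than after it; the filter names and synonym list are illustrative:

```console
# Hypothetical names; synonyms applied within one multiplexer branch
PUT /multiplexer-example
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms": [ "quick, fast" ]
        },
        "my_multiplexer": {
          "type": "multiplexer",
          "filters": [ "lowercase", "lowercase, my_synonyms", "lowercase, porter_stem" ]
        }
      },
      "analyzer": {
        "multiplexed": {
          "tokenizer": "standard",
          "filter": [ "my_multiplexer" ]
        }
      }
    }
  }
}
```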
2019-05-07 09:09:28 +01:00
James Rodewig b33b5fc122
[DOCS] Add attribute to escape minimal pt token link in Asciidoctor (#41613) 2019-04-30 14:11:24 -04:00
James Rodewig adf67053f4
[DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:19:09 -04:00
Guilherme Ferreira 378d74be00 [Docs] Correct default stop list constant (#41342) 2019-04-23 19:14:31 +02:00
Guilherme Ferreira 17463d2be4 [Docs] Correct spelling of "_none_" (#41192) 2019-04-15 15:12:55 +02:00
Guilherme Ferreira 9f74a932eb [Docs] Correct spelling the "_none_" stopwords element (#41191) 2019-04-15 14:17:53 +02:00
Christoph Büscher 5be4827a78
Correct indention in synonym docs (#40711)
The stopword filter should be on the same level as the synonym filter in the
example request. Correcting this for better readability.
2019-04-02 01:43:02 +02:00
Mayya Sharipova aad93977f5 Correct errors in min_hash filter documentation
Related to #39671
2019-03-08 16:16:03 -05:00
Mayya Sharipova 5b852fa184
Add documentation for min_hash filter (#39671)
* Add documentation for min_hash filter

Closes #20757
2019-03-07 08:47:32 -05:00
jimczi 89b80c64ee fix typo in synonym graph filter docs 2019-03-05 18:18:45 +01:00
Jim Ferenczi f3e8d66ffb
Remove beta marker from the synonym_graph docs (#38185) 2019-02-19 10:47:59 +01:00
Christoph Büscher 7bb2da197d
Remove `nGram` and `edgeNGram` token filter names (#38911)
In #30209 we deprecated the camel case `nGram` filter name in favour of `ngram`, and
did the same for `edgeNGram` in favour of `edge_ngram`. These names have been deprecated
since 6.4 and have issued deprecation warnings since then.
I think we can remove these filter names in 8.0. In a backport of this PR I would change what was a
deprecation warning from 6.4 to an error for new indices created in 7.0.
2019-02-15 20:15:05 +01:00
Mayya Sharipova da63ee5252
Correct rebuilt persian analyzer (#38724)
Make substitution of \u200C with a space explicit

The problem is that the symbol `\u200C` in the test string **should** be
substituted with a space by the rebuilt Persian analyzer, but it is not.

Correcting the line `"mappings": [ "\\u200C=> "] <1>` to
`"mappings": [ "\\u200C=>\\u0020"] <1>` solves the problem.
This change explicitly says to substitute ZWNJ with a space.

Closes #38188
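
In the rebuilt analyzer, the corrected char filter looks roughly like this; stopword removal and keyword protection are omitted for brevity, and the index and filter names are illustrative:

```console
# Trimmed-down rebuilt Persian analyzer; names are illustrative
PUT /persian-example
{
  "settings": {
    "analysis": {
      "char_filter": {
        "zero_width_spaces": {
          "type": "mapping",
          "mappings": [ "\\u200C=>\\u0020" ]
        }
      },
      "analyzer": {
        "rebuilt_persian": {
          "tokenizer": "standard",
          "char_filter": [ "zero_width_spaces" ],
          "filter": [ "lowercase", "decimal_digit", "arabic_normalization", "persian_normalization" ]
        }
      }
    }
  }
}
```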
2019-02-11 10:46:18 -05:00
Christoph Büscher 34f2d2ec91
Remove remaining occurances of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Christoph Büscher 3a96608b3f
Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
Christoph Büscher 25aac4f77f
Remove `include_type_name` in asciidoc where possible (#37568)
The "include_type_name" parameter was temporarily introduced in #37285 to facilitate
moving the default parameter setting to "false" in many places in the documentation
code snippets. Most of the places can simply be reverted without causing errors.
In this change I looked for asciidoc files that contained the
"include_type_name=true" addition when creating new indices but didn't look
likey they made use of the "_doc" type for mappings. This is mostly the case
e.g. in the analysis docs where index creating often only contains settings. I
manually corrected the use of types in some places where the docs still used an
explicit type name and not the dummy "_doc" type.
2019-01-18 09:34:11 +01:00