Commit Graph

354 Commits

Author SHA1 Message Date
James Rodewig c0c105c210
[DOCS] Fix typo (#92481) 2022-12-21 10:17:49 +01:00
Abdon Pijpelink 8abd39ab98
Fix typo in stop-tokenfilter.asciidoc (#91128) (#91207)
Since ignore_case is set to true in our custom stop words filter, the matching will be case-insensitive.

(cherry picked from commit a03fba9d77)

Co-authored-by: Siniša Subašić <68671543+sinisuba@users.noreply.github.com>
2022-11-01 15:32:16 +01:00
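For context, a custom stop filter along those lines might look like this sketch (the index name and stop word list are illustrative):

    PUT /my-index-000001
    {
      "settings": {
        "analysis": {
          "filter": {
            "my_custom_stop_words_filter": {
              "type": "stop",
              "ignore_case": true,
              "stopwords": [ "and", "is", "the" ]
            }
          }
        }
      }
    }

With `ignore_case` set to `true`, "The" and "THE" are removed just like "the".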
Elasticsearch addict b5a635cae9
[DOCS] Add note for tokenizers that don't support keep types token filter (#87553)
Closes #85946
2022-06-13 11:28:32 +02:00
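The `keep_types` filter relies on the token type attribute set by the tokenizer, which is why tokenizers that never set it cannot be combined with this filter. A minimal sketch using the `standard` tokenizer, which tags numbers as `<NUM>`:

    POST _analyze
    {
      "tokenizer": "standard",
      "filter": [
        {
          "type": "keep_types",
          "types": [ "<NUM>" ]
        }
      ],
      "text": "1 quick fox 2 lazy dogs"
    }

Only the tokens `1` and `2` survive in the output.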
Elasticsearch addict c3a6190173
Fix docs link to Lucene stop filter (#87037)
Updates the docs with the new Javadoc location for Lucene's stop filter.

Closes #87034.
2022-05-23 11:06:35 -07:00
Abele Mălan 9ecb96fcf3
Fix some typos in plugins & reference docs (#84667)
This pull request removes a few instances of duplicate words or
punctuation and erroneous spelling from the docs.
2022-03-07 12:29:58 -05:00
Tobias Stadler e3deacf547
[DOCS] Fix typos (#83895) 2022-02-15 12:42:17 -05:00
James Rodewig c7917d7996
[DOCS] Remove Hunspell dictionaries location config (#82704) (#82954)
Users can no longer set the location of Hunspell dictionaries. The `<config-dir>/hunspell` directory is silently used every time, no matter what configuration is provided.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
(cherry picked from commit 1a4fd34129)

Co-authored-by: Jan Jíša <jenda.jisa@gmail.com>
2022-01-24 11:17:12 -05:00
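In other words, the dictionaries are expected under the config directory and referenced by locale. A rough sketch of the layout and a matching filter definition (the locale and filter name are illustrative):

    <config-dir>/hunspell/en_US/en_US.aff
    <config-dir>/hunspell/en_US/en_US.dic

    PUT /my-index-000001
    {
      "settings": {
        "analysis": {
          "filter": {
            "my_en_US_stemmer": {
              "type": "hunspell",
              "locale": "en_US"
            }
          }
        }
      }
    }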
nexusalf e04911bf97 [Docs] Update edgengram-tokenizer.asciidoc (#79577)
The original example, "snapped", does not apply to this section, which is about edge n-grams.
The change replaces it with "approximate", which is a valid example.
2021-10-26 13:05:35 +02:00
Adam Locke 59aeb8552c
Fix a typo in the first letter of a user query (#76394) (#76450)
Co-authored-by: Arseni Prokharchyk <2657789+arsen91@users.noreply.github.com>
2021-08-12 14:28:51 -04:00
James Rodewig 0224621423 [DOCS] Fix formatting 2021-05-04 12:29:14 -04:00
James Rodewig 693807a6d3
[DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
James Rodewig e76c229b33
[DOCS] Note you can omit `type` for custom analyzers (#70754) 2021-03-23 11:13:20 -04:00
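That note boils down to the following: when a custom analyzer defines its own `tokenizer`, the `"type": "custom"` line may be omitted, as in this illustrative sketch:

    PUT /my-index-000001
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "standard",
              "filter": [ "lowercase" ]
            }
          }
        }
      }
    }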
Adam Locke aba4422606
[DOCS] Focus scripting docs on Painless (#69748)
* Initial changes for scripting.

* Shorten script examples.

* Expanding types docs.

* Updating types.

* Fixing broken cross-link.

* Fixing map error.

* Incorporating review feedback.

* Fixing broken table.

* Adding more info about reference types.

* Fixing broken path.

* Adding more info an examples for def type.

* Adding more info on operators.

* Incorporating review feedback.

* Adding notconsole for example.

* Removing comments in example.

* More review feedback.

* Editorial changes.

* Incorporating more reviewer feedback.

* Rewrites based on review feedback.

* Adding new sections for storing scripts and shortening scripts.

* Adding redirect for stored scripts.

* Adding DELETE for stored script plus link.

* Adding section for updating docs with scripts.

* Incorporating final feedback from reviews.

* Tightening up a few areas.

* Minor change around other languages.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-03-18 15:58:33 -04:00
James Rodewig 3bac730a50
[DOCS] Fix nori tokenizer link (#70564) 2021-03-18 11:04:46 -04:00
Fabien Caylus 34176844b7
[DOCS] Fix Lucene's stop words links (#70405) 2021-03-16 17:06:12 -04:00
James Rodewig 630604bd45
[DOCS] Fix case sensitivity for elision token filter (#69873) 2021-03-03 09:09:05 -05:00
James Rodewig 9b88ae92e6
[DOCS] Fix typos for duplicate words (#69125) 2021-02-17 10:34:20 -05:00
James Rodewig c65615911f
[DOCS] Expand simple query string query's multi-position token section (#68753) 2021-02-09 16:07:02 -05:00
James Rodewig d5d8be9bff [DOCS] Fix typo 2021-02-03 10:45:16 -05:00
James Rodewig 86814df052
[DOCS] Clean up index template xrefs (#67264) 2021-01-11 12:38:09 -05:00
Toast 966189fa6a
[DOCS] Fix typo (#65912) 2020-12-05 10:05:13 -05:00
James Rodewig fa7c63e6c4
[DOCS] Fix whitespace in pattern replace token filter docs (#64345) 2020-10-29 10:07:10 -04:00
James Rodewig 1ea83359bb
[DOCS] Fix case for 'Boolean' (#64299) 2020-10-29 09:04:43 -04:00
Elasticsearch addict 32c7e08c6d
[DOCS] Fix pattern replace token filter intro (#64189)
Removes an incorrect statement about anchoring regex patterns on tokens.
2020-10-27 09:33:03 -04:00
James Rodewig 39d064d668
[DOCS] Update snowball links (#63351) 2020-10-06 15:29:57 -04:00
James Rodewig 80a828c15f
[DOCS] Update link to Snowball documentation (#63305) (#63347)
The current link points to an obsolete site, which is no longer maintained.

Co-authored-by: Stefan Walter <67258699+rd-stefan-walter@users.noreply.github.com>
2020-10-06 13:40:51 -04:00
James Rodewig b3e8767a35
[DOCS] Clarify that v2.0+ hyphenation files aren't supported (#60579) (#63072)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: jgkirschbaum <juergen.kirschbaum@gmail.com>
2020-09-30 09:28:23 -04:00
James Rodewig a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
James Rodewig 5827d09ba6
[DOCS] Add xref to multiplexer token filter docs (#60431) (#61170)
Co-authored-by: paiboon auengkongkatong <paiboon15721@gmail.com>
2020-08-14 15:10:33 -04:00
James Rodewig 5d9de8ce46
[DOCS] Add missing lang values to snowball token filter (#60489) 2020-08-04 17:26:37 -04:00
Alexander Reelsen c7ac9e7073
[DOCS] http -> https, remove outdated plugin docs (#60380)
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.

Since the instructions used cleartext HTTP to install packages, this
commit replaces HTTP links with HTTPS where possible.

In addition, a few community links have been removed, as they no longer
seem to exist.
2020-07-31 15:58:38 -04:00
James Rodewig 441c3a21b1
[DOCS] Update my-index examples (#60132)
Changes the following example index names to `my-index-000001` for consistency:

* `my-index`
* `my_index`
* `myindex`
2020-07-27 14:46:39 -04:00
James Rodewig 2774cd6938
[DOCS] Swap `[float]` for `[discrete]` (#60124)
Changes instances of `[float]` in our docs to `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 11:48:22 -04:00
James Rodewig 80b674fb25
[DOCS] Reformat snippets to use two-space indents (#59973) 2020-07-21 12:24:26 -04:00
malpani 08de504b44
Support ignore_keywords flag for word delimiter graph token filter (#59563)
This commit allows the word delimiter token filters to be customized to skip
processing of tokens tagged as keywords, via the `ignore_keywords` flag that
Lucene's WordDelimiterGraphFilter already exposes.

Fix for #59491
2020-07-21 16:11:11 +01:00
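Assuming the flag is used as Lucene exposes it, a filter definition might look like this sketch (the filter name is illustrative); tokens previously marked as keywords, for example by a `keyword_marker` filter earlier in the chain, are then passed through unsplit:

    PUT /my-index-000001
    {
      "settings": {
        "analysis": {
          "filter": {
            "my_word_delimiter": {
              "type": "word_delimiter_graph",
              "ignore_keywords": true
            }
          }
        }
      }
    }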
Rui Almeida 2c450214ac
[DOCS] Fix keyword marker docs (#59834) 2020-07-20 08:54:55 -04:00
James Rodewig 8b6e310070
[DOCS] Reformat `predicate_token_filter` tokenfilter (#57705) 2020-07-16 13:07:19 -04:00
James Rodewig 2be9db01c8
[DOCS] Replace `datatype` with `data type` (#58972) 2020-07-07 13:52:10 -04:00
James Rodewig 8439c888b6
[DOCS] Fix headings for simple analyzer docs (#58910) 2020-07-02 09:28:56 -04:00
James Rodewig 05da3e0e48
[DOCS] Fix analyzer page titles (#58362)
Changes the titles for analyzer pages to sentence case.

Also changes the 'Pattern character filter' page title to sentence case.
2020-06-26 09:30:37 -04:00
James Rodewig b2b3599012
[DOCS] Fix tokenizer page titles (#58361)
Changes the titles for tokenizer pages to sentence case.

Also moves the 'Path hierarchy tokenizer examples' page within the
'Path hierarchy tokenizer' page and adds a related redirect.
2020-06-26 09:08:44 -04:00
James Rodewig bb66d594d1
[DOCS] Reformat `pattern_replace` token filter (#57699)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Adds parameter definitions
* Adds custom analyzer example
2020-06-11 12:04:22 -04:00
James Rodewig fd8af38078
[DOCS] Reformat `mapping` charfilter (#57818)
Changes:

* Adds title abbreviation
* Adds Lucene link to description
* Adds standard headings
* Simplifies analyze example
* Simplifies analyzer example and adds contextual text
2020-06-09 12:23:08 -04:00
James Rodewig 06b41614a2 [DOCS] Fix typo in `html_strip` char filter docs 2020-06-08 10:37:16 -04:00
James Rodewig 98a64da87c
[DOCS] Reformat `html_strip` charfilter (#57764)
Changes:

* Converts title to sentence case
* Adds a title abbreviation
* Adds Lucene link to description
* Reformat sections
2020-06-08 08:30:23 -04:00
Tomasz Elendt 66ded59929
Support multiple tokens on LHS in stemmer_override rules (#56113) (#56484)
This commit adds support for rules with multiple tokens on the LHS, also
known as "contraction rules", to the stemmer override token
filter. Contraction rules are handy for translating multiple
inflected words into the same root form. One side effect of this change is
that it brings the stemmer override rules format closer to the synonym rules
format, which makes it easier to translate one into the other.

This change also makes the stemmer override rules parser stricter, so
it should catch more errors that were previously accepted.

Closes #56113
2020-05-29 22:28:41 +02:00
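A rough sketch of what such rules can look like after this change (the rule values themselves are made up for illustration): the left-hand side may now contain several whitespace-separated tokens, similar to synonym rules:

    PUT /my-index-000001
    {
      "settings": {
        "analysis": {
          "filter": {
            "my_stemmer_overrides": {
              "type": "stemmer_override",
              "rules": [
                "running, runs => run",
                "a fleur de peau => a_fleur_de_peau"
              ]
            }
          }
        }
      }
    }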
James Rodewig 16be0e65d3
[DOCS] Reformat `min_hash` token filter docs (#57181)
Changes:

* Rewrites description and adds a Lucene link
* Reformats the configurable parameters as a definition list
* Changes the `Theory` heading to `Using the min_hash token filter for
  similarity search`
* Adds some additional detail to the analyzer example
2020-05-27 14:55:27 -04:00
James Rodewig 00ab16ff97
[DOCS] Reformat `shingle` token filter (#57040)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Rewrites parameter documentation
* Updates custom analyzer and filter examples
* Adds anchor to `index.max_shingle_diff` index-level setting
2020-05-21 13:41:51 -04:00
James Rodewig 2ed91444fe
[DOCS] Reformat `hunspell` token filter (#56955)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Rewrites parameter documentation
* Updates custom analyzer example
* Rewrites related setting documentation
2020-05-20 14:29:08 -04:00
Andrei Balici da31b4b83d
Add `max_token_length` setting to the CharGroupTokenizer (#56860)
Adds the `max_token_length` option to the CharGroupTokenizer.
Also updates the documentation to reflect the change.

Closes #56676
2020-05-20 14:15:57 +02:00
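A minimal sketch of the new option (the values are illustrative); tokens that exceed `max_token_length` are split at that length:

    POST _analyze
    {
      "tokenizer": {
        "type": "char_group",
        "tokenize_on_chars": [ "whitespace", "-" ],
        "max_token_length": 20
      },
      "text": "The quick brown-fox jumps"
    }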