elasticsearch

Commit Graph

Author	SHA1	Message	Date
elasticsearchmachine	c5eb558371	Bump to version 8.16.0	2024-07-04 09:10:43 +00:00
Martijn van Groningen	6eaf171411	Add some information about the impact of index.codec setting. (#110413 )	2024-07-04 09:20:19 +02:00
George Wallace	b6e9860919	Update role-mapping-resources.asciidoc (#110441 ) made it clear that some characters need to be escaped properly Co-authored-by: Jan Doberstein <jan.doberstein@elastic.co>	2024-07-03 13:00:52 -06:00
Lisa Cawley	748dbd51e4	[DOCS] Add serverless details in Elasticsearch security privileges (#109718 )	2024-07-03 09:52:21 -07:00
Tim Grein	406b969c62	[Inference API] Add Google Vertex AI reranking docs (#110390 )	2024-07-03 14:03:12 +02:00
Johannes Fredén	89cd966b24	Add bulk delete roles API (#110383 ) * Add bulk delete roles API	2024-07-03 11:04:53 +02:00
Sylvain Wallez	e78bdc953a	ESQL: add Arrow dataframes output format (#109873 ) Initial support for Apache Arrow's streaming format as a response for ES\|QL. It triggers based on the Accept header or the format request parameter. Arrow has implementations in every mainstream language and is a backend of the Python Pandas library, which is extremely popular among data scientists and data analysts. Arrow's streaming format has also become the de facto standard for dataframe interchange. It is an efficient binary format that allows zero-cost deserialization by adding data access wrappers on top of memory buffers received from the network. This PR builds on the experiment made by @nik9000 in PR #104877 Features/limitations: - all ES\|QL data types are supported - multi-valued fields are not supported - fields of type _source are output as JSON text in a varchar array. In a future iteration we may want to offer the choice of the more efficient CBOR and SMILE formats. Technical details: Arrow comes with its own memory management to handle vectors with direct memory, reference counting, etc. We don't want to use this as it conflicts with Elasticsearch's own memory management. We therefore use the Arrow library only for the metadata objects describing the dataframe schema and the structure of the streaming format. The Arrow vector data is produced directly from ES\|QL blocks. --------- Co-authored-by: Nik Everett <nik9000@gmail.com>	2024-07-03 10:29:57 +02:00
Carlos Delgado	30b32b6a46	semantic_text: Updated copy-to docs (#110350 )	2024-07-03 10:18:40 +02:00
Fang Xing	8abc8857f2	[ES\|QL] weighted_avg (#109993 ) * weighted_avg	2024-07-02 18:29:02 -04:00
Matt Culbreth	81b8495388	Mark the Redact processor as Generally Available	2024-07-02 16:58:57 -04:00
Nik Everett	6fbc52d170	ESQL docs: Push down needs index and doc_values (#110353 ) This adds a `NOTE` to each comparison saying that pushing the comparison to the search index requires that the field have an `index` and `doc_values`. This is unique compared to the rest of Elasticsearch which only requires an `index` and it's caused by our insistence that comparisons only return true for single-valued fields. We can in future accelerate comparisons without `doc_values`, but we just haven't written that code yet.	2024-07-02 14:22:50 -04:00
Kathleen DeRusso	7a1d532ffb	Pass over Sparse Vector docs for correctness (#110282 ) * Remove legacy mentions of text expansion queries * Add missing query_vector param to sparse_vector query docs * Fix formatting errors in sparse vector query dsl doc * Remove unnecessary test setup block	2024-07-02 13:37:25 -04:00
Felix Barnsteiner	cdbe092d90	Update docs now that keyword dimensions support ignore_above (#110385 ) This is a follow-up from https://github.com/elastic/elasticsearch/pull/110337	2024-07-02 17:04:57 +02:00
Johannes Fredén	55476041d9	Add BulkPutRoles API (#109339 ) * Add BulkPutRoles API	2024-07-02 15:45:39 +02:00
Tim Grein	390439ad9f	[Inference API] Add Google Vertex AI text embeddings docs (#110317 )	2024-07-02 14:47:14 +02:00
Mike Pellegrini	d288dbf94e	Fix Semantic Query Parameter Formatting (#110355 )	2024-07-02 08:07:35 -04:00
Iván Cea Fontenla	c89ee3b648	ESQL: Renamed TopList to Top (#110347 ) Rename TopList aggregation to Top, after internal discussions	2024-07-02 03:52:24 +10:00
Jedr Blaszyk	3b827f6a8c	Create `manage_connector` privilege (#110128 ) * Create manage_search_connector privilege * `manage_search_connector` -> `manage_connector` and exclude connector secrets patterns from this privilege * Add `monitor_connector` privilege * Update Kibana system privilege to monitor_connector for telemetry * Rename privilege to 'manage_connector_state' Since privilege names are often namespaced and used with globs, we want to ensure that if there's a future privilege like `manage_connector_secrets`, that it is not implicitly included in this new privilege's <name>. By extending the privilege name to include "_state", we better namespace this distinct from any "_secrets" namespace. Revert "Rename privilege to 'manage_connector_state'" This reverts commit `70b89eee76`. After further discussion with the security team, this name change is not needed after all since the secret management privileges aren't currently prefixed with "manage_" --------- Co-authored-by: Sean Story <sean.j.story@gmail.com>	2024-07-01 12:41:28 -05:00
Tim Grein	99749aa277	[Inference API] Fix wording in Azure AI Studio docs (#110322 )	2024-07-01 14:37:56 +02:00
Tim Grein	6accd6e247	[Inference API] Fix wording in delete-inference docs (#110321 )	2024-07-01 13:37:30 +02:00
Tim Grein	35eae4029a	Fix typo in get-inference docs (retrives -> retrieves) (#110320 )	2024-07-01 10:13:48 +02:00
István Zoltán Szabó	43f5696406	[DOCS] Refactors PUT inference API docs (#109812 )	2024-07-01 10:12:16 +02:00
Nikolaj Volgushev	78c812f845	Fix security index settings docs (#110126 ) Docs tweak with a typo fix and a clarification on how the two available settings interact (essentially https://github.com/elastic/elasticsearch/issues/27871). I'm also open to including this info in the more generic settings API but feels like a simple enough callout to add to the security API.	2024-07-01 18:07:15 +10:00
Kostas Krikellas	6ae652f90e	Support index sorting with nested fields (#110251 ) This PR piggy-backs on recent changes in Lucene 9.11.1 (https://github.com/apache/lucene/pull/12829, https://github.com/apache/lucene/pull/13341/), setting the parent doc when nested fields are present. This allows moving nested documents along with parent ones during sorting. With this change, sorting is now allowed on fields outside nested objects. Sorting on fields within nested objects is still not supported (throws an exception). Fixes #107349	2024-07-01 17:24:17 +10:00
Costin Leau	b906ce3d66	ESQL: change from quoting from backtick to quote (#108395 ) * ESQL: change from quoting from backtick to quote For historical reasons, the source declaration inside FROM command is treated as an identifier, using backticks (`) for escaping the value. This is inconsistent since the source is not an identifier (field name) but an index name which has different semantics. `index` means a field name index while "index" means a literal with said value. In case of FROM, the index name/location is more like a literal (also in unquoted form) than an identifier (that is a reference to a value). This PR tweaks the grammar and plugs in the quoted string logic so that both the single quote (") and triple quote (""") are allowed. * Update grammar * Add more tests * Add a few more tests * Add extra test * Update docs/changelog/108395.yaml * Address review comments * Add doc note * Revert test rename * Fix quoting with remote cluster * Update docs/reference/esql/source-commands/from.asciidoc Co-authored-by: marciw <333176+marciw@users.noreply.github.com> --------- Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co> Co-authored-by: Bogdan Pintea <pintea@mailbox.org> Co-authored-by: marciw <333176+marciw@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-06-30 20:01:31 +03:00
George Wallace	dea593db3f	Update behavioral-analytics-start.asciidoc (#110271 )	2024-06-28 09:01:48 -06:00
Mayya Sharipova	405e39660b	Support k parameter for knn query (#110233 ) Introduce an optional k param for knn query If k is not set, knn query has the previous behaviour: - `num_candidates` docs are collected from each shard. These `num_candidates` docs are used for combining results with other queries and aggregations on each shard. - docs from all shards are merged to produce the top global `size` results If k is set, the behaviour is instead as follows: - `k` docs are collected from each shard. These `k` docs are used for combining results with other queries and aggregations on each shard. - similarly, docs from all shards are merged to produce the top global `size` results. Having `k` param makes it more intuitive for users to address their needs. They also don't need to care and can skip `num_candidates` param for this query as it is more of an internal detail for tuning how knn search operates. Closes #108473	2024-06-28 09:59:28 -04:00
Nick Tindall	8edb3b07e7	Make repository analysis API available to non-operators (#110179 ) Closes #100381	2024-06-28 09:07:20 +10:00
Kathleen DeRusso	19fc0d9cad	Deprecate text_expansion and weighted_tokens queries (#109880 )	2024-06-27 13:24:57 -04:00
Iván Cea Fontenla	fc0313f429	ESQL: Add aggregations testing base and docs (#110042 ) - Added a new `AbstractAggregationTestCase` base class for tests, that shares most of the code of function tests, adapted for aggregations. Including both testing and docs generation. - Reused the `AbstractFunctionTestCase` class to also let us test evaluators if the aggregation is foldable - Added a `TopListTests` example - This includes the docs for Top_list _(Also added a missing include of Ip_prefix docs)_ - Adapted Kibana docs to use `type: "agg"` (@drewdaemon) The current tests are very basic: Consume a page, generate an output, all in Single aggregation mode (No intermediates, no grouping). More complex testing will be added in future PRs Initial PR of https://github.com/elastic/elasticsearch/issues/109917	2024-06-27 21:21:55 +10:00
Jedr Blaszyk	5179b0db29	[Connector API] Update status when setting/resetting connector error (#110192 )	2024-06-27 12:17:33 +02:00
Benjamin Trent	5add44d7d1	Adds new `bit` element_type for dense_vectors (#110059 ) This commit adds `bit` vector support by adding `element_type: bit` for vectors. This new element type works for indexed and non-indexed vectors. Additionally, it works with `hnsw` and `flat` index types. No quantization based codec works with this element type, this is consistent with `byte` vectors. `bit` vectors accept up to `32768` dimensions in size and expect vectors that are being indexed to be encoded either as a hexadecimal string or a `byte[]` array where each element of the `byte` array represents `8` bits of the vector. `bit` vectors support script usage and regular query usage. When indexed, all comparisons done are `xor` and `popcount` summations (aka, hamming distance), and the scores are transformed and normalized given the vector dimensions. Note, indexed bit vectors require `l2_norm` to be the similarity. For scripts, `l1norm` is the same as `hamming` distance and `l2norm` is `sqrt(l1norm)`. `dotProduct` and `cosineSimilarity` are not supported. Note, the dimensions expected by this element_type must always be divisible by `8`, and the `byte[]` vectors provided for indexing must have size `dim/8`, where each byte element represents `8` bits of the vectors. closes: https://github.com/elastic/elasticsearch/issues/48322	2024-06-27 04:48:41 +10:00
István Zoltán Szabó	31f0253b43	[DOCS] Adds link to ES-Cohere notebook and clarifies requirements. (#110195 )	2024-06-26 17:22:40 +02:00
Oleksandr Kolomiiets	b68e7d76c9	Remove obsolete sentence from TSDS docs (#110162 )	2024-06-26 08:21:52 -07:00
Kostas Krikellas	3afd53e26a	Remove `average` from downsampling statistics in documentation (#110189 )	2024-06-26 17:23:06 +03:00
Pius	79623c7609	Update search-application-api.asciidoc (#110113 ) Add a subsection about cross cluster search support (or the lack of).	2024-06-26 12:20:28 +02:00
David Kyle	3c1c8d0f32	[ML] Increase response size limit for batched requests (#110112 ) Increase the default to 50MB and do not retry when the limit is exceeded	2024-06-26 10:31:06 +01:00
Kathleen DeRusso	1f46a94dec	Add documentation for individual query rules (#110006 ) * Add individual query rule API docs * Update docs/reference/query-rules/apis/get-query-rule.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> * Update docs/reference/query-rules/apis/delete-query-rule.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> * Update docs/reference/query-rules/apis/get-query-rule.asciidoc Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> * PR feedback --------- Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2024-06-25 14:35:08 -04:00
Benjamin Trent	1c1733d823	Add some docs explaining filter performance and behavior for HNSW (#110108 )	2024-06-25 08:42:24 -04:00
Martijn van Groningen	851e955181	Remove obsolete information about tsdb dimensions limit. (#110047 )	2024-06-25 11:41:25 +02:00
Martijn van Groningen	1b0e800f5b	Add a note about enabling time series index mode via a component template (#110050 ) Closes #109149	2024-06-25 17:22:31 +10:00
Jedr Blaszyk	a257fed44b	[Connector API] Add metadata to sync job stats endpoint (#109927 )	2024-06-25 08:04:56 +02:00
Mayya Sharipova	5c87eef89d	[DOCS] Vectors with cosine automatically normalized (#110071 ) PR #99445 introduced automatic normalization of dense vectors with cosine similarity. This adds a note about this in the documentation. Relates to #99445	2024-06-22 22:32:25 +10:00
Benjamin Trent	d97cb686a5	Correct positioning for unique token filter (#109395 ) This is an extension of: https://github.com/elastic/elasticsearch/pull/35420 closes: https://github.com/elastic/elasticsearch/issues/35411	2024-06-22 09:44:24 +10:00
Kathleen DeRusso	41a61b069b	Mark Query Rules as GA (#110004 ) * Mark query rules APIs as stable * Remove preview label from docs * Update docs/changelog/110004.yaml	2024-06-21 15:26:51 -04:00
Carlos Delgado	d332ed7d16	Enforce synonyms limit on APIs (#109981 )	2024-06-21 18:16:16 +02:00
Jan Kuipers	13478b2bca	Fix put inference API docs (#110025 ) * Fix put inference API docs * Update docs/changelog/110025.yaml * Delete docs/changelog/110025.yaml	2024-06-21 16:01:08 +02:00
Craig Taverner	536d614694	ES\|QL ST_DISTANCE Function (#108764 ) * WIP Started refactoring in preparation for ST_DISTANCE * Initial evaluators for ST_DISTANCE * Update docs/changelog/108764.yaml * Fix invalid changelog generated by CI * Register function and get unit tests working * Fixed failing meta function description tests, and refined descriptions * Added initial CsvTests and calculate Geo differently to Cartesian * Added more csv-spec tests and changed to arcDistance for accuracy * Added generated docs files * Link to generated docs * Fix examples tag for linking from generated docs * Skip wrapper function And note that we might want to include instead some of the related intelligence from Circle2D::HaversineDistance class * Added ST_DWITHIN and more tests for ST_DISTANCE and ST_DWITHIN * Code style * Added more tests, this time for sorting on distance * Fixes after rebase on main * The ST_DWITHIN cannot use BinarySpatialFunction because it is ternary So we moved the common code to a separate SpatialTypeResolver, and made a simpler TernarySpatialFunction based on a simple TernaryScalarFunction. This had additional consequences, simplifying the points-only cases. The main reason for this change was to support StDWithinTests which need to test a lot of things that involve varying all three input types, generating expected error strings, etc. The original hack of just adding to BinarySpatialFunction worked for the actual integration tests, but clearly did not satisfy all the use cases tested by the unit tests. We also restricted ST_DWITHIN to take only a double as the third argument, because otherwise the number of evaluators would explode, since we need a separate evaluator for each Block type, and Integer and Double use different block types. * Fixed function count after rebasing on main * Update docs/changelog/108764.yaml * Added generated docs for ST_DWITHIN * Connect docs for ST_DWITHIN * Add back issue link * Remove support for ST_DWITHIN * Update docs/changelog/108764.yaml * Bring back link to issue in changelog * Update x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/spatial/StDistance.java Co-authored-by: Ignacio Vera <iverase@gmail.com> * Revert reformatting of function descriptions We should put this into a separate PR * Github merged commit with incorrectly formatted whitespace --------- Co-authored-by: Ignacio Vera <iverase@gmail.com>	2024-06-21 11:59:44 +02:00
David Turner	5662f988b2	Remove trappy timeouts in snapshot APIs (#109828 ) Wholesale fix of every `TRAPPY_IMPLICIT_DEFAULT_MASTER_NODE_TIMEOUT` in `o.e.snapshots` and `o.e.repositories`, just pulling them up to the REST layer (where they become API params), the test suite (where they become `TEST_REQUEST_TIMEOUT`), or some other place where an explicit value is available. Relates #107984	2024-06-21 07:11:12 +10:00
Oleksandr Kolomiiets	8bc5ecdc31	Support synthetic source together with ignore_malformed in histogram fields (#109882 )	2024-06-20 09:09:45 -07:00


11731 Commits