[discrete]
[[esql-stats-by]]
=== `STATS ... BY`

*Syntax*

[source,esql]
----
STATS [column1 =] expression1[, ..., [columnN =] expressionN] [BY grouping_column1[, ..., grouping_columnN]]
----

*Parameters*

`columnX`::
The name by which the aggregated value is returned. If omitted, the name is
equal to the corresponding expression (`expressionX`).

`expressionX`::
An expression that computes an aggregated value.

`grouping_columnX`::
The column containing the values to group by.
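
As a rough illustration of how `columnX` affects the output, the following
sketch assumes an `employees` index with a numeric `salary` field and a
`languages` field (both assumed here for illustration). The first aggregate is
returned as `avg_salary`. The second has no name, so its column is named after
the expression, `MAX(salary)`:

[source,esql]
----
FROM employees
| STATS avg_salary = AVG(salary), MAX(salary) BY languages
----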

*Description*

The `STATS ... BY` processing command groups rows according to a common value
and calculates one or more aggregated values over the grouped rows. If `BY` is
omitted, the output table contains exactly one row with the aggregations applied
over the entire dataset.

The following aggregation functions are supported:

include::../functions/aggregation-functions.asciidoc[tag=agg_list]

NOTE: `STATS` without any groups is much faster than `STATS` with a group.

NOTE: Grouping on a single column is currently much more optimized than grouping
      on many columns. In some tests, grouping on a single `keyword` column was
      five times faster than grouping on two `keyword` columns. Do not try to
      work around this by combining the two columns with something like
      <<esql-concat>> and then grouping on the result. That is not going to be
      faster.
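
To make the advice in the note above concrete, here is a sketch that assumes a
hypothetical `logs` index with two `keyword` fields, `host` and `service` (names
invented for illustration). Group on both columns directly:

[source,esql]
----
FROM logs
| STATS event_count = COUNT(*) BY host, service
----

rather than concatenating them first, which, per the note, is not expected to
be any faster:

[source,esql]
----
FROM logs
| EVAL host_service = CONCAT(host, "/", service)
| STATS event_count = COUNT(*) BY host_service
----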

*Examples*

Calculating a statistic and grouping by the values of another column:

[source.merge.styled,esql]
----
include::{esql-specs}/docs.csv-spec[tag=stats]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/docs.csv-spec[tag=stats-result]
|===

Omitting `BY` returns one row with the aggregations applied over the entire
dataset:

[source.merge.styled,esql]
----
include::{esql-specs}/docs.csv-spec[tag=statsWithoutBy]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/docs.csv-spec[tag=statsWithoutBy-result]
|===

It's possible to calculate multiple values:

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=statsCalcMultipleValues]
----

It's also possible to group by multiple values (only supported for long and
keyword family fields):

[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=statsGroupByMultipleValues]
----