762 lines
25 KiB
Plaintext
Executable File
762 lines
25 KiB
Plaintext
Executable File
[[getting-started]]
|
||
= Getting started with {es}
|
||
|
||
[partintro]
|
||
--
|
||
Ready to take {es} for a test drive and see for yourself how you can use the
|
||
REST APIs to store, search, and analyze data?
|
||
|
||
Follow this getting started tutorial to:
|
||
|
||
. Get an {es} cluster up and running
|
||
. Index some sample documents
|
||
. Search for documents using the {es} query language
|
||
. Analyze the results using bucket and metrics aggregations
|
||
|
||
|
||
Need more context?
|
||
|
||
Check out the <<elasticsearch-intro,
|
||
Elasticsearch Introduction>> to learn the lingo and understand the basics of
|
||
how {es} works. If you're already familiar with {es} and want to see how it works
|
||
with the rest of the stack, you might want to jump to the
|
||
{stack-gs}/get-started-elastic-stack.html[Elastic Stack
|
||
Tutorial] to see how to set up a system monitoring solution with {es}, {kib},
|
||
{beats}, and {ls}.
|
||
|
||
TIP: The fastest way to get started with {es} is to
|
||
https://www.elastic.co/cloud/elasticsearch-service/signup[start a free 14-day
|
||
trial of Elasticsearch Service] in the cloud.
|
||
--
|
||
|
||
[[getting-started-install]]
|
||
== Get {es} up and running
|
||
|
||
To take {es} for a test drive, you can create a
|
||
https://www.elastic.co/cloud/elasticsearch-service/signup[hosted deployment] on
|
||
the {es} Service or set up a multi-node {es} cluster on your own
|
||
Linux, macOS, or Windows machine.
|
||
|
||
[float]
|
||
[[run-elasticsearch-hosted]]
|
||
=== Run {es} on Elastic Cloud
|
||
|
||
When you create a deployment on the {es} Service, the service provisions
|
||
a three-node {es} cluster along with Kibana and APM.
|
||
|
||
To create a deployment:
|
||
|
||
. Sign up for a https://www.elastic.co/cloud/elasticsearch-service/signup[free trial]
|
||
and verify your email address.
|
||
. Set a password for your account.
|
||
. Click **Create Deployment**.
|
||
|
||
Once you've created a deployment, you're ready to <<getting-started-index>>.
|
||
|
||
[float]
|
||
[[run-elasticsearch-local]]
|
||
=== Run {es} locally on Linux, macOS, or Windows
|
||
|
||
When you create a deployment on the {es} Service, a master node and
|
||
two data nodes are provisioned automatically. By installing from the tar or zip
|
||
archive, you can start multiple instances of {es} locally to see how a multi-node
|
||
cluster behaves.
|
||
|
||
To run a three-node {es} cluster locally:
|
||
|
||
. Download the Elasticsearch archive for your OS:
|
||
+
|
||
Linux: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-linux-x86_64.tar.gz[elasticsearch-{version}-linux-x86_64.tar.gz]
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-linux-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
+
|
||
macOS: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-darwin-x86_64.tar.gz[elasticsearch-{version}-darwin-x86_64.tar.gz]
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-darwin-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
+
|
||
Windows:
|
||
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-windows-x86_64.zip[elasticsearch-{version}-windows-x86_64.zip]
|
||
|
||
. Extract the archive:
|
||
+
|
||
Linux:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
tar -xvf elasticsearch-{version}-linux-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
+
|
||
macOS:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
tar -xvf elasticsearch-{version}-darwin-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
+
|
||
Windows PowerShell:
|
||
+
|
||
["source","powershell",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
Expand-Archive elasticsearch-{version}-windows-x86_64.zip
|
||
--------------------------------------------------
|
||
|
||
. Start elasticsearch from the `bin` directory:
|
||
+
|
||
Linux and macOS:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
cd elasticsearch-{version}/bin
|
||
./elasticsearch
|
||
--------------------------------------------------
|
||
+
|
||
Windows:
|
||
+
|
||
["source","powershell",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
cd elasticsearch-{version}\bin
|
||
.\elasticsearch.bat
|
||
--------------------------------------------------
|
||
+
|
||
You now have a single-node {es} cluster up and running!
|
||
|
||
. Start two more instances of {es} so you can see how a typical multi-node
|
||
cluster behaves. You need to specify unique data and log paths
|
||
for each node.
|
||
+
|
||
Linux and macOS:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
./elasticsearch -Epath.data=data2 -Epath.logs=log2
|
||
./elasticsearch -Epath.data=data3 -Epath.logs=log3
|
||
--------------------------------------------------
|
||
+
|
||
Windows:
|
||
+
|
||
["source","powershell",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
.\elasticsearch.bat -E path.data=data2 -E path.logs=log2
|
||
.\elasticsearch.bat -E path.data=data3 -E path.logs=log3
|
||
--------------------------------------------------
|
||
+
|
||
The additional nodes are assigned unique IDs. Because you're running all three
|
||
nodes locally, they automatically join the cluster with the first node.
|
||
|
||
. Use the cat health API to verify that your three-node cluster is up running.
|
||
The cat APIs return information about your cluster and indices in a
|
||
format that's easier to read than raw JSON.
|
||
+
|
||
You can interact directly with your cluster by submitting HTTP requests to
|
||
the {es} REST API. Most of the examples in this guide enable you to copy the
|
||
appropriate cURL command and submit the request to your local {es} instance from
|
||
the command line. If you have Kibana installed and running, you can also
|
||
open Kibana and submit requests through the Dev Console.
|
||
+
|
||
TIP: You'll want to check out the
|
||
https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} language
|
||
clients] when you're ready to start using {es} in your own applications.
|
||
+
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /_cat/health?v
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
+
|
||
The response should indicate that the status of the `elasticsearch` cluster
|
||
is `green` and it has three nodes:
|
||
+
|
||
[source,txt]
|
||
--------------------------------------------------
|
||
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
|
||
1565052807 00:53:27 elasticsearch green 3 3 6 3 0 0 0 0 - 100.0%
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/1565052807 00:53:27 elasticsearch/\\d+ \\d+:\\d+:\\d+ integTest/]
|
||
// TESTRESPONSE[s/3 3 6 3/\\d+ \\d+ \\d+ \\d+/]
|
||
// TESTRESPONSE[s/0 0 -/0 \\d+ -/]
|
||
// TESTRESPONSE[non_json]
|
||
+
|
||
NOTE: The cluster status will remain yellow if you are only running a single
|
||
instance of {es}. A single node cluster is fully functional, but data
|
||
cannot be replicated to another node to provide resiliency. Replica shards must
|
||
be available for the cluster status to be green. If the cluster status is red,
|
||
some data is unavailable.
|
||
|
||
[float]
|
||
[[gs-other-install]]
|
||
=== Other installation options
|
||
|
||
Installing {es} from an archive file enables you to easily install and run
|
||
multiple instances locally so you can try things out. To run a single instance,
|
||
you can run {es} in a Docker container, install {es} using the DEB or RPM
|
||
packages on Linux, install using Homebrew on macOS, or install using the MSI
|
||
package installer on Windows. See <<install-elasticsearch>> for more information.
|
||
|
||
[[getting-started-index]]
|
||
== Index some documents
|
||
|
||
Once you have a cluster up and running, you're ready to index some data.
|
||
There are a variety of ingest options for {es}, but in the end they all
|
||
do the same thing: put JSON documents into an {es} index.
|
||
|
||
You can do this directly with a simple PUT request that specifies
|
||
the index you want to add the document, a unique document ID, and one or more
|
||
`"field": "value"` pairs in the request body:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
PUT /customer/_doc/1
|
||
{
|
||
"name": "John Doe"
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
|
||
This request automatically creates the `customer` index if it doesn't already
|
||
exist, adds a new document that has an ID of `1`, and stores and
|
||
indexes the `name` field.
|
||
|
||
Since this is a new document, the response shows that the result of the
|
||
operation was that version 1 of the document was created:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"_index" : "customer",
|
||
"_type" : "_doc",
|
||
"_id" : "1",
|
||
"_version" : 1,
|
||
"result" : "created",
|
||
"_shards" : {
|
||
"total" : 2,
|
||
"successful" : 2,
|
||
"failed" : 0
|
||
},
|
||
"_seq_no" : 26,
|
||
"_primary_term" : 4
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/]
|
||
// TESTRESPONSE[s/"successful" : \d+/"successful" : $body._shards.successful/]
|
||
// TESTRESPONSE[s/"_primary_term" : \d+/"_primary_term" : $body._primary_term/]
|
||
|
||
|
||
The new document is available immediately from any node in the cluster.
|
||
You can retrieve it with a GET request that specifies its document ID:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /customer/_doc/1
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
The response indicates that a document with the specified ID was found
|
||
and shows the original source fields that were indexed.
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"_index" : "customer",
|
||
"_type" : "_doc",
|
||
"_id" : "1",
|
||
"_version" : 1,
|
||
"_seq_no" : 26,
|
||
"_primary_term" : 4,
|
||
"found" : true,
|
||
"_source" : {
|
||
"name": "John Doe"
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ ]
|
||
// TESTRESPONSE[s/"_primary_term" : \d+/"_primary_term" : $body._primary_term/]
|
||
|
||
[float]
|
||
[[getting-started-batch-processing]]
|
||
=== Indexing documents in bulk
|
||
|
||
If you have a lot of documents to index, you can submit them in batches with
|
||
the {ref}/docs-bulk.html[bulk API]. Using bulk to batch document
|
||
operations is significantly faster than submitting requests individually as it minimizes network roundtrips.
|
||
|
||
The optimal batch size depends a number of factors: the document size and complexity, the indexing and search load, and the resources available to your cluster. A good place to start is with batches of 1,000 to 5,000 documents
|
||
and a total payload between 5MB and 15MB. From there, you can experiment
|
||
to find the sweet spot.
|
||
|
||
To get some data into {es} that you can start searching and analyzing:
|
||
|
||
. Download the https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json?raw=true[`accounts.json`] sample data set. The documents in this randomly-generated data set represent user accounts with the following information:
|
||
+
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"account_number": 0,
|
||
"balance": 16623,
|
||
"firstname": "Bradshaw",
|
||
"lastname": "Mckenzie",
|
||
"age": 29,
|
||
"gender": "F",
|
||
"address": "244 Columbus Place",
|
||
"employer": "Euron",
|
||
"email": "bradshawmckenzie@euron.com",
|
||
"city": "Hobucken",
|
||
"state": "CO"
|
||
}
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
|
||
. Index the account data into the `bank` index with the following `_bulk` request:
|
||
+
|
||
[source,sh]
|
||
--------------------------------------------------
|
||
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
|
||
curl "localhost:9200/_cat/indices?v"
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
+
|
||
////
|
||
This replicates the above in a document-testing friendly way but isn't visible
|
||
in the docs:
|
||
+
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /_cat/indices?v
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[setup:bank]
|
||
////
|
||
+
|
||
The response indicates that 1,000 documents were indexed successfully.
|
||
+
|
||
[source,txt]
|
||
--------------------------------------------------
|
||
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
|
||
yellow open bank l7sSYV2cQXmu6_4rJWVIww 5 1 1000 0 128.6kb 128.6kb
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/128.6kb/\\d+(\\.\\d+)?[mk]?b/]
|
||
// TESTRESPONSE[s/l7sSYV2cQXmu6_4rJWVIww/.+/ non_json]
|
||
|
||
[[getting-started-search]]
|
||
== Start searching
|
||
|
||
Once you have ingested some data into an {es} index, you can search it
|
||
by sending requests to the `_search` endpoint. To access the full suite of
|
||
search capabilities, you use the {es} Query DSL to specify the
|
||
search criteria in the request body. You specify the name of the index you
|
||
want to search in the request URI.
|
||
|
||
For example, the following request retrieves all of the documents in the `bank`
|
||
index sorted by account number:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match_all": {} },
|
||
"sort": [
|
||
{ "account_number": "asc" }
|
||
]
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
By default, the `hits` section of the response includes the first 10 documents
|
||
that match the search criteria:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"took" : 63,
|
||
"timed_out" : false,
|
||
"_shards" : {
|
||
"total" : 5,
|
||
"successful" : 5,
|
||
"skipped" : 0,
|
||
"failed" : 0
|
||
},
|
||
"hits" : {
|
||
"total" : {
|
||
"value": 1000,
|
||
"relation": "eq"
|
||
},
|
||
"max_score" : null,
|
||
"hits" : [ {
|
||
"_index" : "bank",
|
||
"_type" : "_doc",
|
||
"_id" : "0",
|
||
"sort": [0],
|
||
"_score" : null,
|
||
"_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
|
||
}, {
|
||
"_index" : "bank",
|
||
"_type" : "_doc",
|
||
"_id" : "1",
|
||
"sort": [1],
|
||
"_score" : null,
|
||
"_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
|
||
}, ...
|
||
]
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"took" : 63/"took" : $body.took/]
|
||
// TESTRESPONSE[s/\.\.\./$body.hits.hits.2, $body.hits.hits.3, $body.hits.hits.4, $body.hits.hits.5, $body.hits.hits.6, $body.hits.hits.7, $body.hits.hits.8, $body.hits.hits.9/]
|
||
|
||
The response also provides the following information about the search request:
|
||
|
||
* `took` – how long it took {es} to run the query, in milliseconds
|
||
* `timed_out` – whether or not the search request timed out
|
||
* `_shards` – how many shards were searched and a breakdown of how many shards
|
||
succeeded, failed, or were skipped.
|
||
* `max_score` – the score of the most relevant document found
|
||
* `hits.total.value` - how many matching documents were found
|
||
* `hits.sort` - the document's sort position (when not sorting by relevance score)
|
||
* `hits._score` - the document's relevance score (not applicable when using `match_all`)
|
||
|
||
Each search request is self-contained: {es} does not maintain any
|
||
state information across requests. To page through the search hits, you specify
|
||
the `from` and `size` parameters in your request.
|
||
|
||
For example, the following request gets hits 10 through 19:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match_all": {} },
|
||
"sort": [
|
||
{ "account_number": "asc" }
|
||
],
|
||
"from": 10,
|
||
"size": 10
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
Now that you've seen how to submit a basic search request, you can start to
|
||
construct queries that are a bit more interesting than `match_all`.
|
||
|
||
To search for specific _terms_ within a field, you can use a `match` query.
|
||
For example, the following request searches the `address` field to find
|
||
customers whose addresses contain `mill` or `lane`:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match": { "address": "mill lane" } }
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
To perform a phrase search rather than matching individual terms, you use
|
||
`match_phrase` instead of `match`. For example, the following request only
|
||
matches addresses that contain the phrase `mill lane`:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match_phrase": { "address": "mill lane" } }
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
To construct more complex queries, you can use a `bool` query to combine
|
||
multiple query criteria. You can designate criteria as required (must match),
|
||
desirable (should match), or undesirable (must not match).
|
||
|
||
For example, the following request searches the `bank` index for accounts that
|
||
belong to customers who are 40 years old, but excludes anyone who lives in
|
||
Idaho (ID):
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": {
|
||
"bool": {
|
||
"must": [
|
||
{ "match": { "age": "40" } }
|
||
],
|
||
"must_not": [
|
||
{ "match": { "state": "ID" } }
|
||
]
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
Each `must`, `should`, and `must_not` element in a Boolean query is referred
|
||
to as a query clause. How well a document meets the criteria in each `must` or
|
||
`should` clause contributes to the document's _relevance score_. The higher the
|
||
score, the better the document matches your search criteria. By default, {es}
|
||
returns documents ranked by these relevance scores.
|
||
|
||
The criteria in a `must_not` clause is treated as a _filter_. It affects whether
|
||
or not the document is included in the results, but does not contribute to
|
||
how documents are scored. You can also explicitly specify arbitrary filters to
|
||
include or exclude documents based on structured data.
|
||
|
||
For example, the following request uses a range filter to limit the results to
|
||
accounts with a balance between $20,000 and $30,000 (inclusive).
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": {
|
||
"bool": {
|
||
"must": { "match_all": {} },
|
||
"filter": {
|
||
"range": {
|
||
"balance": {
|
||
"gte": 20000,
|
||
"lte": 30000
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
[[getting-started-aggregations]]
|
||
== Analyze results with aggregations
|
||
|
||
Aggregations provide the ability to group and extract statistics from your data. The easiest way to think about aggregations is by roughly equating it to the SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, you have the ability to execute searches returning hits and at the same time return aggregated results separate from the hits all in one response. This is very powerful and efficient in the sense that you can run queries and multiple aggregations and get the results back of both (or either) operations in one shot avoiding network roundtrips using a concise and simplified API.
|
||
|
||
To start with, this example groups all the accounts by state, and then returns the top 10 (default) states sorted by count descending (also default):
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_state": {
|
||
"terms": {
|
||
"field": "state.keyword"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
In SQL, the above aggregation is similar in concept to:
|
||
|
||
[source,sh]
|
||
--------------------------------------------------
|
||
SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC LIMIT 10;
|
||
--------------------------------------------------
|
||
|
||
And the response (partially shown):
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"took": 29,
|
||
"timed_out": false,
|
||
"_shards": {
|
||
"total": 5,
|
||
"successful": 5,
|
||
"skipped" : 0,
|
||
"failed": 0
|
||
},
|
||
"hits" : {
|
||
"total" : {
|
||
"value": 1000,
|
||
"relation": "eq"
|
||
},
|
||
"max_score" : null,
|
||
"hits" : [ ]
|
||
},
|
||
"aggregations" : {
|
||
"group_by_state" : {
|
||
"doc_count_error_upper_bound": 20,
|
||
"sum_other_doc_count": 770,
|
||
"buckets" : [ {
|
||
"key" : "ID",
|
||
"doc_count" : 27
|
||
}, {
|
||
"key" : "TX",
|
||
"doc_count" : 27
|
||
}, {
|
||
"key" : "AL",
|
||
"doc_count" : 25
|
||
}, {
|
||
"key" : "MD",
|
||
"doc_count" : 25
|
||
}, {
|
||
"key" : "TN",
|
||
"doc_count" : 23
|
||
}, {
|
||
"key" : "MA",
|
||
"doc_count" : 21
|
||
}, {
|
||
"key" : "NC",
|
||
"doc_count" : 21
|
||
}, {
|
||
"key" : "ND",
|
||
"doc_count" : 21
|
||
}, {
|
||
"key" : "ME",
|
||
"doc_count" : 20
|
||
}, {
|
||
"key" : "MO",
|
||
"doc_count" : 20
|
||
} ]
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"took": 29/"took": $body.took/]
|
||
|
||
We can see that there are 27 accounts in `ID` (Idaho), followed by 27 accounts
|
||
in `TX` (Texas), followed by 25 accounts in `AL` (Alabama), and so forth.
|
||
|
||
Note that we set `size=0` to not show search hits because we only want to see the aggregation results in the response.
|
||
|
||
Building on the previous aggregation, this example calculates the average account balance by state (again only for the top 10 states sorted by count in descending order):
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_state": {
|
||
"terms": {
|
||
"field": "state.keyword"
|
||
},
|
||
"aggs": {
|
||
"average_balance": {
|
||
"avg": {
|
||
"field": "balance"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
Notice how we nested the `average_balance` aggregation inside the `group_by_state` aggregation. This is a common pattern for all the aggregations. You can nest aggregations inside aggregations arbitrarily to extract pivoted summarizations that you require from your data.
|
||
|
||
Building on the previous aggregation, let's now sort on the average balance in descending order:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_state": {
|
||
"terms": {
|
||
"field": "state.keyword",
|
||
"order": {
|
||
"average_balance": "desc"
|
||
}
|
||
},
|
||
"aggs": {
|
||
"average_balance": {
|
||
"avg": {
|
||
"field": "balance"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
This example demonstrates how we can group by age brackets (ages 20-29, 30-39, and 40-49), then by gender, and then finally get the average account balance, per age bracket, per gender:
|
||
|
||
[source,js]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_age": {
|
||
"range": {
|
||
"field": "age",
|
||
"ranges": [
|
||
{
|
||
"from": 20,
|
||
"to": 30
|
||
},
|
||
{
|
||
"from": 30,
|
||
"to": 40
|
||
},
|
||
{
|
||
"from": 40,
|
||
"to": 50
|
||
}
|
||
]
|
||
},
|
||
"aggs": {
|
||
"group_by_gender": {
|
||
"terms": {
|
||
"field": "gender.keyword"
|
||
},
|
||
"aggs": {
|
||
"average_balance": {
|
||
"avg": {
|
||
"field": "balance"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// CONSOLE
|
||
// TEST[continued]
|
||
|
||
There are many other aggregations capabilities that we won't go into detail here. The {ref}/search-aggregations.html[aggregations reference guide] is a great starting point if you want to do further experimentation.
|
||
|
||
[[getting-started-next-steps]]
|
||
== Where to go from here
|
||
|
||
Now that you've set up a cluster, indexed some documents, and run some
|
||
searches and aggregations, you might want to:
|
||
|
||
* {stack-gs}/get-started-elastic-stack.html#install-kibana[Dive in to the Elastic
|
||
Stack Tutorial] to install Kibana, Logstash, and Beats and
|
||
set up a basic system monitoring solution.
|
||
|
||
* {kibana-ref}/add-sample-data.html[Load one of the sample data sets into Kibana]
|
||
to see how you can use {es} and Kibana together to visualize your data.
|
||
|
||
* Try out one of the Elastic search solutions:
|
||
** https://swiftype.com/documentation/site-search/crawler-quick-start[Site Search]
|
||
** https://swiftype.com/documentation/app-search/getting-started[App Search]
|
||
** https://swiftype.com/documentation/enterprise-search/getting-started[Enterprise Search]
|