
[chapter]
[[getting-started]]
= Quick start

This guide helps beginners learn how to:

* Install and run {es} in a test environment
* Add data to {es}
* Search and sort data
* Extract fields from unstructured content during a search

[discrete]
[[run-elasticsearch]]
=== Run {es}

The simplest way to set up {es} is to create a managed deployment with {ess} on
{ecloud}. If you prefer to manage your own test environment, you can install and
run {es} using Docker.

include::{es-repo-dir}/tab-widgets/quick-start-install-widget.asciidoc[]
[discrete]
[[send-requests-to-elasticsearch]]
=== Send requests to {es}
You send data and other requests to {es} using REST APIs. This lets you interact
with {es} using any client that sends HTTP requests, such as
https://curl.se[curl]. You can also use {kib}'s console to send requests to
{es}.

include::{es-repo-dir}/tab-widgets/api-call-widget.asciidoc[]
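For example, the following request returns basic information about your cluster. This is a minimal connectivity check; it assumes your client is already pointed at the cluster and authenticated:

[source,console]
----
GET /
----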
[discrete]
[[add-data]]
=== Add data
You add data to {es} as JSON objects called documents. {es} stores these
documents in searchable indices.

For time series data, such as logs and metrics, you typically add documents to a
data stream made up of multiple auto-generated backing indices.

A data stream requires an index template that matches its name. {es} uses this
template to configure the stream's backing indices. Documents sent to a data
stream must have a `@timestamp` field.

[discrete]
[[add-single-document]]
==== Add a single document
Submit the following indexing request to add a single log entry to the
`logs-my_app-default` data stream. Since `logs-my_app-default` doesn't exist, the
request automatically creates it using the built-in `logs-*-*` index template.

[source,console]
----
POST logs-my_app-default/_doc
{
  "@timestamp": "2099-05-06T16:21:15.000Z",
  "event": {
    "original": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
  }
}
----
// TEST[s/_doc/_doc?refresh=wait_for/]
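If you want to see which settings the data stream inherited, you can retrieve the matching index template by name. `logs` is the name of the built-in template in recent {es} versions; verify the name in your own cluster before relying on it:

[source,console]
----
GET _index_template/logs
----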
The response includes metadata that {es} generates for the document:

* The backing `_index` that contains the document. {es} automatically generates
the names of backing indices.
* A unique `_id` for the document within the index.

[source,console-result]
----
{
  "_index": ".ds-logs-my_app-default-2099-05-06-000001",
  "_id": "gl5MJXMBMk1dGnErnBW8",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
----
// TESTRESPONSE[s/"_index": ".ds-logs-my_app-default-2099-05-06-000001"/"_index": $body._index/]
// TESTRESPONSE[s/"_id": "gl5MJXMBMk1dGnErnBW8"/"_id": $body._id/]
[discrete]
[[add-multiple-documents]]
==== Add multiple documents
Use the `_bulk` endpoint to add multiple documents in one request. Bulk data
must be newline-delimited JSON (NDJSON). Each line must end in a newline
character (`\n`), including the last line.

[source,console]
----
PUT logs-my_app-default/_bulk
{ "create": { } }
{ "@timestamp": "2099-05-07T16:24:32.000Z", "event": { "original": "192.0.2.242 - - [07/May/2099:16:24:32 +0000] \"GET /images/hm_nbg.jpg HTTP/1.0\" 304 0" } }
{ "create": { } }
{ "@timestamp": "2099-05-08T16:25:42.000Z", "event": { "original": "192.0.2.255 - - [08/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" } }
----
// TEST[continued]
// TEST[s/_bulk/_bulk?refresh=wait_for/]
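As a quick sanity check (not part of the original walkthrough), you can confirm the documents were indexed by counting them with the count API:

[source,console]
----
GET logs-my_app-default/_count
----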
[discrete]
[[qs-search-data]]
=== Search data
Indexed documents are available for search in near real-time. The following
search matches all log entries in `logs-my_app-default` and sorts them by
`@timestamp` in descending order.

[source,console]
----
GET logs-my_app-default/_search
{
  "query": {
    "match_all": { }
  },
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
By default, the `hits` section of the response includes up to the first 10
documents that match the search. The `_source` of each hit contains the original
JSON object submitted during indexing.

[source,console-result]
----
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": ".ds-logs-my_app-default-2099-05-06-000001",
        "_id": "PdjWongB9KPnaVm2IyaL",
        "_score": null,
        "_source": {
          "@timestamp": "2099-05-08T16:25:42.000Z",
          "event": {
            "original": "192.0.2.255 - - [08/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638"
          }
        },
        "sort": [
          4081940742000
        ]
      },
      ...
    ]
  }
}
----
// TESTRESPONSE[s/"took": 2/"took": $body.took/]
// TESTRESPONSE[s/"_index": ".ds-logs-my_app-default-2099-05-06-000001"/"_index": $body.hits.hits.0._index/]
// TESTRESPONSE[s/"_id": "PdjWongB9KPnaVm2IyaL"/"_id": $body.hits.hits.0._id/]
// TESTRESPONSE[s/\.\.\./$body.hits.hits.1,$body.hits.hits.2/]
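If you need more than the first 10 matches, you can raise the `size` parameter or page through results with `from`. For example, this variation of the search above returns up to 20 hits:

[source,console]
----
GET logs-my_app-default/_search
{
  "query": {
    "match_all": { }
  },
  "size": 20,
  "from": 0,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----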
[discrete]
[[get-specific-fields]]
==== Get specific fields
Parsing the entire `_source` is unwieldy for large documents. To exclude it from
the response, set the `_source` parameter to `false`. Instead, use the `fields`
parameter to retrieve the fields you want.

[source,console]
----
GET logs-my_app-default/_search
{
  "query": {
    "match_all": { }
  },
  "fields": [
    "@timestamp"
  ],
  "_source": false,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
// TEST[s/_search/_search?filter_path=hits.hits&size=1/]
The response contains each hit's `fields` values as a flat array.

[source,console-result]
----
{
  ...
  "hits": {
    ...
    "hits": [
      {
        "_index": ".ds-logs-my_app-default-2099-05-06-000001",
        "_id": "PdjWongB9KPnaVm2IyaL",
        "_score": null,
        "fields": {
          "@timestamp": [
            "2099-05-08T16:25:42.000Z"
          ]
        },
        "sort": [
          4081940742000
        ]
      },
      ...
    ]
  }
}
----
// TESTRESPONSE[s/\.\.\.//]
// TESTRESPONSE[s/"_index": ".ds-logs-my_app-default-2099-05-06-000001"/"_index": $body.hits.hits.0._index/]
// TESTRESPONSE[s/"_id": "PdjWongB9KPnaVm2IyaL"/"_id": $body.hits.hits.0._id/]
// TESTRESPONSE[s/4081940742000\n        \]\n      \},\n/4081940742000\]}/]
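The `fields` parameter also accepts wildcard patterns. For example, this variation of the search above retrieves every subfield of `event` without listing each one:

[source,console]
----
GET logs-my_app-default/_search
{
  "query": {
    "match_all": { }
  },
  "fields": [
    "event.*"
  ],
  "_source": false
}
----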
[discrete]
[[search-date-range]]
==== Search a date range
To search across a specific time or IP range, use a `range` query.

[source,console]
----
GET logs-my_app-default/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2099-05-05",
        "lt": "2099-05-08"
      }
    }
  },
  "fields": [
    "@timestamp"
  ],
  "_source": false,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
You can use date math to define relative time ranges. The following query
searches for data from the past day, which won't match any log entries in
`logs-my_app-default`.

[source,console]
----
GET logs-my_app-default/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  },
  "fields": [
    "@timestamp"
  ],
  "_source": false,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
[discrete]
[[extract-fields]]
==== Extract fields from unstructured content
You can extract <<runtime-search-request,runtime fields>> from unstructured
content, such as log messages, during a search.

Use the following search to extract the `source.ip` runtime field from
`event.original`. To include it in the response, add `source.ip` to the `fields`
parameter.

[source,console]
----
GET logs-my_app-default/_search
{
  "runtime_mappings": {
    "source.ip": {
      "type": "ip",
      "script": """
        String sourceip = grok('%{IPORHOST:sourceip} .*').extract(doc["event.original"].value)?.sourceip;
        if (sourceip != null) emit(sourceip);
      """
    }
  },
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2099-05-05",
        "lt": "2099-05-08"
      }
    }
  },
  "fields": [
    "@timestamp",
    "source.ip"
  ],
  "_source": false,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
[discrete]
[[combine-queries]]
==== Combine queries
You can use the `bool` query to combine multiple queries. The following search
combines two `range` queries: one on `@timestamp` and one on the `source.ip`
runtime field.

[source,console]
----
GET logs-my_app-default/_search
{
  "runtime_mappings": {
    "source.ip": {
      "type": "ip",
      "script": """
        String sourceip = grok('%{IPORHOST:sourceip} .*').extract(doc["event.original"].value)?.sourceip;
        if (sourceip != null) emit(sourceip);
      """
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2099-05-05",
              "lt": "2099-05-08"
            }
          }
        },
        {
          "range": {
            "source.ip": {
              "gte": "192.0.2.0",
              "lte": "192.0.2.240"
            }
          }
        }
      ]
    }
  },
  "fields": [
    "@timestamp",
    "source.ip"
  ],
  "_source": false,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
[discrete]
[[aggregate-data]]
==== Aggregate data
Use aggregations to summarize data as metrics, statistics, or other analytics.

The following search uses an aggregation to calculate the
`average_response_size` using the `http.response.body.bytes` runtime field. The
aggregation only runs on documents that match the `query`.

[source,console]
----
GET logs-my_app-default/_search
{
  "runtime_mappings": {
    "http.response.body.bytes": {
      "type": "long",
      "script": """
        String bytes = grok('%{COMMONAPACHELOG}').extract(doc["event.original"].value)?.bytes;
        if (bytes != null) emit(Integer.parseInt(bytes));
      """
    }
  },
  "aggs": {
    "average_response_size": {
      "avg": {
        "field": "http.response.body.bytes"
      }
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2099-05-05",
              "lt": "2099-05-08"
            }
          }
        }
      ]
    }
  },
  "fields": [
    "@timestamp",
    "http.response.body.bytes"
  ],
  "_source": false,
  "sort": [
    {
      "@timestamp": "desc"
    }
  ]
}
----
// TEST[continued]
The response's `aggregations` object contains the aggregation results.

[source,console-result]
----
{
  ...
  "aggregations": {
    "average_response_size": {
      "value": 12368.0
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
[discrete]
[[explore-more-search-options]]
==== Explore more search options
To keep exploring, index more data to your data stream and check out <<common-search-options>>.

[discrete]
[[clean-up]]
=== Clean up
When you're done, delete your test data stream and its backing indices.

[source,console]
----
DELETE _data_stream/logs-my_app-default
----
// TEST[continued]
You can also delete your test deployment.

include::{es-repo-dir}/tab-widgets/quick-start-cleanup-widget.asciidoc[]
[discrete]
[[whats-next]]
=== What's next?
* Get the most out of your time series data by setting up data tiers and
{ilm-init}. See <<use-elasticsearch-for-time-series-data>>.
* Use {fleet} and {agent} to collect logs and metrics directly from your data
sources and send them to {es}. See
{observability-guide}/ingest-logs-metrics-uptime.html[Ingest logs, metrics, and uptime data with {agent}].
* Use {kib} to explore, visualize, and manage your {es} data. See the
{kibana-ref}/get-started.html[{kib} quick start guide].