elasticsearch/docs/reference/troubleshooting/common-issues/disk-usage-exceeded.asciidoc

[[fix-watermark-errors]]
=== Fix watermark errors

++++
<titleabbrev>Watermark errors</titleabbrev>
++++
:keywords: {es}, high watermark, low watermark, full disk, flood stage watermark

When a data node is critically low on disk space and has reached the
<<cluster-routing-flood-stage,flood-stage disk usage watermark>>, the following
error is logged: `Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block`. 

To prevent a full disk, when a node reaches this watermark, {es} <<index-block-settings,blocks writes>>
to any index with a shard on the node. If the block affects related system
indices, {kib} and other {stack} features may become unavailable. For example, 
this could induce {kib}'s `Kibana Server is not Ready yet` 
{kibana-ref}/access.html#not-ready[error message]. 

{es} will automatically remove the write block when the affected node's disk
usage falls below the <<cluster-routing-watermark-high,high disk watermark>>. 
To achieve this, {es} attempts to rebalance some of the affected node's shards 
to other nodes in the same data tier.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[[fix-watermark-errors-rebalance]]
==== Monitor rebalancing

To verify that shards are moving off the affected node until it falls below high 
watermark., use the <<cat-shards,cat shards API>> and <<cat-recovery,cat recovery API>>: 

[source,console]
----
GET _cat/shards?v=true

GET _cat/recovery?v=true&active_only=true
----

If shards remain on the node keeping it about high watermark, use the 
<<cluster-allocation-explain,cluster allocation explanation API>> to get an 
explanation for their allocation status.

[source,console]
----
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}
----
// TEST[s/^/PUT my-index\n/]
// TEST[s/"primary": false,/"primary": false/]

[[fix-watermark-errors-temporary]]
==== Temporary Relief

To immediately restore write operations, you can temporarily increase 
<<disk-based-shard-allocation,disk watermarks>> and remove the 
<<index-block-settings,write block>>.

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.low.max_headroom": "100GB",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.high.max_headroom": "20GB",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": "5GB",
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": "5GB"
  }
}

PUT */_settings?expand_wildcards=all
{
  "index.blocks.read_only_allow_delete": null
}
----
// TEST[s/^/PUT my-index\n/]

When a long-term solution is in place, to reset or reconfigure the disk watermarks:

[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": null,
    "cluster.routing.allocation.disk.watermark.low.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.high": null,
    "cluster.routing.allocation.disk.watermark.high.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.flood_stage": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": null
  }
}
----

[[fix-watermark-errors-resolve]]
==== Resolve

To resolve watermark errors permanently, perform one of the following actions:

* Horizontally scale nodes of the affected <<data-tiers,data tiers>>.

* Vertically scale existing nodes to increase disk space.

* Delete indices using the <<indices-delete-index,delete index API>>, either
permanently if the index isn't needed, or temporarily to later 
<<snapshots-restore-snapshot,restore>>.

* update related <<index-lifecycle-management,ILM policy>> to push indices 
through to later <<data-tiers,data tiers>>

TIP: On {ess} and {ece}, indices may need to be temporarily deleted via
its {cloud}/ec-api-console.html[Elasticsearch API Console] to later
<<snapshots-restore-snapshot,snapshot restore>> in order to resolve
<<cluster-health,cluster health>> `status:red` which will block
{cloud}/ec-activity-page.html[attempted changes]. If you experience issues
with this resolution flow on {ess}, kindly reach out to
https://support.elastic.co[Elastic Support] for assistance.

[discrete]
[[fix-watermark-errors-prevent]]
=== Prevent watermark errors

To avoid watermark errors in future, perform one of the following actions:

* If you're using {ess}, {ece}, or {eck}: Enable <<xpack-autoscaling,autoscaling>>.

* Set up {kibana-ref}/kibana-alerts.html[stack monitoring alerts] on top of
<<monitor-elasticsearch-cluster,{es} monitoring>> to be notified before
the flood-stage watermark is reached.
[DOCS] Improve watermark troubleshooting documentation (#94222) 2023-03-01 21:34:14 +08:00			`[[fix-watermark-errors]]`
			`=== Fix watermark errors`

			`++++`
			`<titleabbrev>Watermark errors</titleabbrev>`
			`++++`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`:keywords: {es}, high watermark, low watermark, full disk, flood stage watermark`
[DOCS] Improve watermark troubleshooting documentation (#94222) 2023-03-01 21:34:14 +08:00
			`When a data node is critically low on disk space and has reached the`
			`<<cluster-routing-flood-stage,flood-stage disk usage watermark>>, the following`
			error is logged: `Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block`.

Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`To prevent a full disk, when a node reaches this watermark, {es} <<index-block-settings,blocks writes>>`
[DOCS] Improve watermark troubleshooting documentation (#94222) 2023-03-01 21:34:14 +08:00			`to any index with a shard on the node. If the block affects related system`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`indices, {kib} and other {stack} features may become unavailable. For example,`
			this could induce {kib}'s `Kibana Server is not Ready yet`
			`{kibana-ref}/access.html#not-ready[error message].`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00
			`{es} will automatically remove the write block when the affected node's disk`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`usage falls below the <<cluster-routing-watermark-high,high disk watermark>>.`
			`To achieve this, {es} attempts to rebalance some of the affected node's shards`
			`to other nodes in the same data tier.`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00
[AutoOps] Reference AutoOps solution on troubleshooting pages (#119630) * Reference AutoOps on troubleshooting pages * Integrate reviewer's feedback 2025-01-09 23:24:20 +08:00			`****`
			`If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].`
			`****`

Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`[[fix-watermark-errors-rebalance]]`
			`==== Monitor rebalancing`

			`To verify that shards are moving off the affected node until it falls below high`
			`watermark., use the <<cat-shards,cat shards API>> and <<cat-recovery,cat recovery API>>:`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00
			`[source,console]`
			`----`
			`GET _cat/shards?v=true`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00
			`GET _cat/recovery?v=true&active_only=true`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00			`----`

Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`If shards remain on the node keeping it about high watermark, use the`
			`<<cluster-allocation-explain,cluster allocation explanation API>> to get an`
			`explanation for their allocation status.`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00
			`[source,console]`
			`----`
			`GET _cluster/allocation/explain`
			`{`
			`"index": "my-index",`
			`"shard": 0,`
(Doc+) Removing "current_node" from Allocation Explain API under Fix Watermark Errors (#111946) 👋 howdy, team! This just simplifies the Allocation Explain API request to not need to include the `current_node` which may not be known when troubleshooting the [Fix Watermark Errors](https://www.elastic.co/guide/en/elasticsearch/reference/current/fix-watermark-errors.html) guide. TIA! Stef 2024-08-20 22:22:22 +08:00			`"primary": false`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00			`}`
			`----`
			`// TEST[s/^/PUT my-index\n/]`
			`// TEST[s/"primary": false,/"primary": false/]`

Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`[[fix-watermark-errors-temporary]]`
			`==== Temporary Relief`

(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00			`To immediately restore write operations, you can temporarily increase`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`<<disk-based-shard-allocation,disk watermarks>> and remove the`
			`<<index-block-settings,write block>>.`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00
			`[source,console]`
			`----`
			`PUT _cluster/settings`
			`{`
			`"persistent": {`
			`"cluster.routing.allocation.disk.watermark.low": "90%",`
Introduce max headroom for disk watermark stages (#88639) Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, addition & subtraction for ByteSizeValue, and min. 2022-09-19 19:59:18 +08:00			`"cluster.routing.allocation.disk.watermark.low.max_headroom": "100GB",`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00			`"cluster.routing.allocation.disk.watermark.high": "95%",`
Introduce max headroom for disk watermark stages (#88639) Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, addition & subtraction for ByteSizeValue, and min. 2022-09-19 19:59:18 +08:00			`"cluster.routing.allocation.disk.watermark.high.max_headroom": "20GB",`
			`"cluster.routing.allocation.disk.watermark.flood_stage": "97%",`
			`"cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": "5GB",`
			`"cluster.routing.allocation.disk.watermark.flood_stage.frozen": "97%",`
			`"cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": "5GB"`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00			`}`
			`}`

			`PUT */_settings?expand_wildcards=all`
			`{`
			`"index.blocks.read_only_allow_delete": null`
			`}`
			`----`
			`// TEST[s/^/PUT my-index\n/]`

Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00			`When a long-term solution is in place, to reset or reconfigure the disk watermarks:`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00
			`[source,console]`
			`----`
			`PUT _cluster/settings`
			`{`
			`"persistent": {`
			`"cluster.routing.allocation.disk.watermark.low": null,`
Introduce max headroom for disk watermark stages (#88639) Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, addition & subtraction for ByteSizeValue, and min. 2022-09-19 19:59:18 +08:00			`"cluster.routing.allocation.disk.watermark.low.max_headroom": null,`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00			`"cluster.routing.allocation.disk.watermark.high": null,`
Introduce max headroom for disk watermark stages (#88639) Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, addition & subtraction for ByteSizeValue, and min. 2022-09-19 19:59:18 +08:00			`"cluster.routing.allocation.disk.watermark.high.max_headroom": null,`
			`"cluster.routing.allocation.disk.watermark.flood_stage": null,`
			`"cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": null,`
			`"cluster.routing.allocation.disk.watermark.flood_stage.frozen": null,`
			`"cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": null`
Split common cluster issues page into separate pages (#88495) 2022-07-18 23:54:02 +08:00			`}`
			`}`
Introduce max headroom for disk watermark stages (#88639) Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Moreover, addition & subtraction for ByteSizeValue, and min. 2022-09-19 19:59:18 +08:00			`----`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00
			`[[fix-watermark-errors-resolve]]`
			`==== Resolve`

(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00			`To resolve watermark errors permanently, perform one of the following actions:`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00
(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00			`* Horizontally scale nodes of the affected <<data-tiers,data tiers>>.`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00
(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00			`* Vertically scale existing nodes to increase disk space.`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00
(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00			`* Delete indices using the <<indices-delete-index,delete index API>>, either`
			`permanently if the index isn't needed, or temporarily to later`
			`<<snapshots-restore-snapshot,restore>>.`
Add link to flood-stage watermark exception message (#111315) Links the exception when hitting the flood-stage watermark to docs about this watermark and how to troubleshoot and resolve the problem. 2024-08-07 00:15:42 +08:00
			`* update related <<index-lifecycle-management,ILM policy>> to push indices`
			`through to later <<data-tiers,data tiers>>`
(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00
			`TIP: On {ess} and {ece}, indices may need to be temporarily deleted via`
			`its {cloud}/ec-api-console.html[Elasticsearch API Console] to later`
			`<<snapshots-restore-snapshot,snapshot restore>> in order to resolve`
			<<cluster-health,cluster health>> `status:red` which will block
			`{cloud}/ec-activity-page.html[attempted changes]. If you experience issues`
			`with this resolution flow on {ess}, kindly reach out to`
			`https://support.elastic.co[Elastic Support] for assistance.`

[DOCS] Fix missing id syntax (#121264) * [DOCS] Fix missing id syntax * Update docs/reference/troubleshooting/common-issues/disk-usage-exceeded.asciidoc * fix id 2025-01-30 19:52:37 +08:00			`[discrete]`
			`[[fix-watermark-errors-prevent]]`
			`=== Prevent watermark errors`
(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00
[DOCS] Fix missing id syntax (#121264) * [DOCS] Fix missing id syntax * Update docs/reference/troubleshooting/common-issues/disk-usage-exceeded.asciidoc * fix id 2025-01-30 19:52:37 +08:00			`To avoid watermark errors in future, perform one of the following actions:`
(Doc+) Expand watermark resolution (#119174) * (Doc+) Expand watermark resolution Relaunch https://github.com/elastic/elasticsearch/pull/116892 since the original one seems to be outdated and hard to update branch. * Apply suggestions from code review Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> 2025-01-30 02:31:50 +08:00
			`* If you're using {ess}, {ece}, or {eck}: Enable <<xpack-autoscaling,autoscaling>>.`

			`* Set up {kibana-ref}/kibana-alerts.html[stack monitoring alerts] on top of`
			`<<monitor-elasticsearch-cluster,{es} monitoring>> to be notified before`
[DOCS] Fix missing id syntax (#121264) * [DOCS] Fix missing id syntax * Update docs/reference/troubleshooting/common-issues/disk-usage-exceeded.asciidoc * fix id 2025-01-30 19:52:37 +08:00			`the flood-stage watermark is reached.`