[DOCS] Rewrite of sizing your shards-rebase #124444

Merged
docs/reference/how-to/size-your-shards.asciidoc: 48 changes (37 additions, 11 deletions)
@@ -1,17 +1,40 @@
[[size-your-shards]]
== Size your shards
[discrete]
[[what-is-a-shard]]
=== What is a shard?

A shard is a basic unit of storage in {es}. Every index is divided into one or more shards to help distribute data and workload across nodes in a cluster. This division allows {es} to handle large datasets and perform operations like searches and indexing efficiently. For more detailed information on shards, see <<nodes-shards, this page>>.
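
For example, you can see how a hypothetical index named `my-index-000001` is split into shards, and which nodes hold them, with the cat shards API:

[source,console]
----
GET _cat/shards/my-index-000001?v=true
----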

[discrete]
[[sizing-shard-guidelines]]
=== General guidelines

Balancing the number and size of your shards is important for the performance and stability of an {es} cluster:

* Too many shards can degrade search performance and make the cluster unstable. This is referred to as _oversharding_.
* Very large shards can slow down search operations and prolong recovery times after failures.

To avoid either of these issues, follow these guidelines:

[discrete]
[[general-sizing-guidelines]]
==== General sizing guidelines

* Aim for shard sizes between 10GB and 50GB.
* Keep the number of documents on each shard below 200 million.
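
As a quick check against these guidelines, you can list shard sizes and document counts, sorted by store size (the column selection and sort shown here are just one convenient view):

[source,console]
----
GET _cat/shards?v=true&h=index,shard,prirep,docs,store&s=store:desc
----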

[discrete]
[[shard-distribution-guidelines]]
==== Shard distribution guidelines

To ensure that each node is working optimally, distribute shards evenly across nodes. Uneven distribution can cause some nodes to work harder than others, leading to performance degradation and instability.

While {es} automatically balances shards, you need to configure indices with an appropriate number of shards and replicas to allow for even distribution across nodes.
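
For example, a hypothetical index intended to spread across three data nodes could be created with three primary shards, each with one replica (adjust these numbers to your own cluster and data volume):

[source,console]
----
PUT /my-index-000001
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    }
  }
}
----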

If you are using <<data-streams>>, each data stream is backed by a sequence of indices, each index potentially having multiple shards.
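
For example, you can list the backing indices of a hypothetical data stream named `my-data-stream` with the get data stream API; the shards of every backing index count toward your cluster totals:

[source,console]
----
GET _data_stream/my-data-stream
----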

In addition to these general guidelines, you should develop a tailored <<create-a-sharding-strategy, sharding strategy>> that considers your specific infrastructure, use case, and performance expectations.

[discrete]
[[create-a-sharding-strategy]]
@@ -208,6 +231,7 @@
index can be <<indices-delete-index,removed>>. You may then consider setting
<<indices-add-alias,Create Alias>> against the destination index for the source
index's name to point to it for continuity.
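
For example, assuming a hypothetical source index `my-source-index` that has been reindexed into `my-destination-index`, the aliases API can remove the source index and add the alias in a single atomic step:

[source,console]
----
POST _aliases
{
  "actions": [
    { "remove_index": { "index": "my-source-index" } },
    { "add": { "index": "my-destination-index", "alias": "my-source-index" } }
  ]
}
----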

See this https://www.youtube.com/watch?v=sHyNYnwbYro[fixing shard sizes video] for an example troubleshooting walkthrough.

[discrete]
[[shard-count-recommendation]]
@@ -571,6 +595,8 @@ PUT _cluster/settings
}
----

See this https://www.youtube.com/watch?v=tZKbDegt4-M[fixing "max shards open" video] for an example troubleshooting walkthrough. For more information, see <<troubleshooting-shards-capacity-issues,Troubleshooting shards capacity>>.

[discrete]
[[troubleshooting-max-docs-limit]]
==== Number of documents in the shard cannot exceed [2147483519]