"The cluster shard limits prevent creation of more than 1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen node. Make sure you have enough nodes of each type in your cluster to handle the number of shards you need."
This is from the latest Elasticsearch documentation, version 8.12. I want to know the rationale behind this design. In my scenario, this limit is very easy to reach. Forgive me, we are using a much older version (2.7.6), but I believe the reasons still apply.
The machine we are using is dedicated to running an Elasticsearch data node, with 64 cores and 256 GB of memory installed. Why does it not support more shards? If it cannot be scaled up as expected, do I need to run more than one instance on a single machine?
There is no 2.7.6 version of Elasticsearch. That being said, the
cluster.max_shards_per_node cluster setting came out around version 6.5 as a safeguard to prevent your nodes from being overloaded. So if you're below that version, you can probably create more shards per node, but it's not necessarily a good thing.

Each shard is a Lucene index, i.e. a full search engine that needs resources (CPU, RAM, file descriptors, etc.). Every shard on a node competes for those resources, both at indexing time and at search time. Creating more and more shards on a node only spreads those resources thinner across all of them: the more shards you add, the fewer resources each one gets, up to the point where your node dies.
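If you want to see how close a cluster is to that limit, here is a minimal sketch (Python with the requests library, assuming a cluster reachable at http://localhost:9200 and version 6.5 or later) that counts shards per node and, only as a last resort, raises cluster.max_shards_per_node. Keep in mind that raising the limit just removes the safety net, it does not add capacity.

```python
# Sketch, not an official tool: inspect per-node shard counts and (optionally)
# raise the cluster-wide limit. Endpoint URL and the new limit are illustrative.
import requests

ES = "http://localhost:9200"

# Count shards currently allocated to each data node.
shards = requests.get(f"{ES}/_cat/shards?format=json").json()
per_node = {}
for s in shards:
    node = s.get("node") or "unassigned"
    per_node[node] = per_node.get(node, 0) + 1
for node, count in sorted(per_node.items()):
    print(f"{node}: {count} shards")

# Raising cluster.max_shards_per_node removes the safeguard rather than adding
# capacity -- prefer fewer, larger shards or more nodes if you can.
resp = requests.put(
    f"{ES}/_cluster/settings",
    json={"persistent": {"cluster.max_shards_per_node": 2000}},
)
print(resp.json())
```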
Given the machine you have, if you're running only a single instance you're not exploiting the full capacity of your hardware and are probably wasting resources. You could run at least four ES instances on it with 64 GB of RAM each (~32 GB for the heap), which means 4000 shards instead of 1000. You can also experiment with running more, smaller instances (e.g. 8 instances with 32 GB of RAM each) in order to increase your maximum shard count.
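As a back-of-the-envelope sketch of that layout (the numbers are illustrative, not an official sizing formula): keep each instance's heap at roughly half its RAM and below ~32 GB so the JVM can still use compressed object pointers, and leave the rest to the filesystem cache that Lucene relies on.

```python
# Rough capacity arithmetic for splitting one big box into several ES instances.
def layout(total_ram_gb: int, ram_per_instance_gb: int, shards_per_node: int = 1000):
    instances = total_ram_gb // ram_per_instance_gb
    # Heap ~50% of instance RAM, capped below ~32 GB (compressed oops threshold).
    heap_gb = min(ram_per_instance_gb // 2, 31)
    return {
        "instances": instances,
        "heap_gb_each": heap_gb,
        "max_shards_total": instances * shards_per_node,
    }

print(layout(256, 64))  # {'instances': 4, 'heap_gb_each': 31, 'max_shards_total': 4000}
print(layout(256, 32))  # {'instances': 8, 'heap_gb_each': 16, 'max_shards_total': 8000}
```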