RevengeFNF Posted May 4, 2018

In Elasticsearch, it's possible to set the number of shards an index will use when the index is created. IPS uses the default of 5 shards, which is not optimal for everyone (probably not for most clients). For example, in my case I only use one node (localhost) and my index is well below 50GB, so my optimal shard count is 1. Five shards in my case is slowing Elasticsearch down. Would it be possible to add a field to the Elasticsearch options to choose the number of shards? Depending on the value chosen, it would add something like this to the index settings:

```json
"settings" : {
    "number_of_shards": 2
}
```
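For reference, a full create-index request with an explicit shard count looks roughly like this body sent to `PUT /content` (a sketch only: the index name `content` and the replica setting are illustrative here, not what IPS actually sends):

```json
{
    "settings" : {
        "number_of_shards"   : 1,
        "number_of_replicas" : 0
    }
}
```

Note that `number_of_shards` can only be set at index creation time (or via a template, as below); changing it afterwards requires reindexing.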
RevengeFNF (Author) Posted May 5, 2018

If anyone wants to set the number of shards manually, you can create an index template and then reindex the search. To create the template, just run this on your server:

```shell
curl -XPUT http://localhost:9200/_template/content -H 'Content-Type: application/json' -d '
{
    "template" : "content*",
    "settings" : { "number_of_shards": 1 },
    "mappings" : { "content" : { } }
}'
```

With this template I set the number of shards to 1, then reindexed the search, and it works fine. Elasticsearch recommends using only 1 shard per node unless the data is larger than 50GB; in that case it's better to split the data across more shards. But more than 1 shard per node will be slower than a single shard.
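The "1 shard unless the data exceeds ~50GB" guideline above can be sketched as a small helper (this is my own illustrative function, not an Elasticsearch tool — the 50GB threshold is the rough rule of thumb from the post):

```shell
# Hypothetical helper: pick a shard count from the index size in GB,
# following the rough "one shard per ~50GB" guideline described above.
recommended_shards() {
    local size_gb=$1
    # ceil(size_gb / 50), with a minimum of one shard
    local shards=$(( (size_gb + 49) / 50 ))
    [ "$shards" -lt 1 ] && shards=1
    echo "$shards"
}

recommended_shards 20    # small index: 1 shard is enough
recommended_shards 120   # ~120GB: split across 3 shards
```

For the 19.5GB and 12.7GB indices discussed later in this thread, this comes out to 1 shard each.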
maddog107_merged Posted July 27, 2018

On 5/5/2018 at 9:59 AM, RevengeFNF said:
> If anyone wants to set the number of shards manually, you can create an index template and then reindex the search. [...]

The only article I saw on the matter was from 2014; do you have any benchmarks? I have a very powerful machine with lots of cores and SSDs, and I'd be curious to know if going from 5 shards to 1 makes a big difference. I run it on a single node (maybe I should disable replication).

```shell
[root@BZ1 ~]# curl 'http://192.168.50.100:9200/_cat/indices?v'
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   images    UKZClgI0TUWFSyKQ0LH9sg   5   1    5412415            0     19.5gb         19.5gb
yellow open   bz_search e_JEGBY4SWSElKt1ovqAig   5   1    4634505      1974140     12.7gb         12.7gb
```
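The `yellow` health in that output is exactly the replication point: both indices are configured with one replica (`rep 1`), and on a single node the replica copies can never be allocated anywhere. Unlike the shard count, `number_of_replicas` can be changed on a live index. A minimal sketch, assuming a one-node cluster: send this body to `PUT /<index>/_settings` (e.g. with curl, as in the template example above, substituting your own index name such as `bz_search`):

```json
{
    "index" : {
        "number_of_replicas" : 0
    }
}
```

With replicas at 0 the cluster health should go `green`, at the cost of having no redundant copy of the data on that node.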
This topic is now archived and is closed to further replies.