Following the recent reindexing, we observed that Icinga has been raising false positives due to segment merges and similar activity. We should reconfigure Icinga to limit these false alerts.
After watching the trend of this check for about a week, I found that shard sizes for wikis like enwiki, wikidatawiki and cebwiki usually grow beyond the warning threshold but never reach the critical threshold, and some of them then drop back below the warning threshold.
The throttling was obviously a good idea, but I suggest we also raise the warning and critical thresholds. Currently, warning is 35GB and critical is 50GB. I suggest we make warning 50GB and critical 60GB, so that if any index hits the warning threshold and stays there for a while (a week), an in-place reindex should follow immediately.
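As a rough sketch of what the proposed rule means in practice (this is not the actual Icinga check; the function names, the sample format and the binary GB unit are assumptions made for illustration):

```python
from datetime import datetime, timedelta

GB = 1024 ** 3                 # assumption: thresholds are binary gigabytes
WARNING_BYTES = 50 * GB        # proposed warning threshold
CRITICAL_BYTES = 60 * GB       # proposed critical threshold
SUSTAINED = timedelta(days=7)  # "a while (a week)"


def classify(shard_bytes):
    """Map a shard size to an alert state (hypothetical helper)."""
    if shard_bytes >= CRITICAL_BYTES:
        return "CRITICAL"
    if shard_bytes >= WARNING_BYTES:
        return "WARNING"
    return "OK"


def needs_reindex(samples, now):
    """Return True if the shard stayed at or above warning for a full week.

    samples: list of (timestamp, shard_bytes) tuples, oldest first
    (hypothetical format; the real data source is not specified here).
    """
    window_start = now - SUSTAINED
    # Without at least a week of history we cannot claim "stayed there".
    if not samples or samples[0][0] > window_start:
        return False
    recent = [size for ts, size in samples if ts >= window_start]
    # A single dip below warning (e.g. after a segment merge) resets the rule.
    return bool(recent) and all(size >= WARNING_BYTES for size in recent)
```

With these numbers, a shard that briefly peaks at 52GB and then drops back after a merge would only show a short-lived warning, while one that sits above 50GB for the whole week would be flagged for reindexing.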
I think the proposal makes sense. This check is here so that we don't forget to reshard when needed, but there isn't a hard limit on the max shard size (well, there is the overall disk space, but we're going to be in trouble well before that). The main goal is to get a low-priority alert when things are climbing too high, and "too high" isn't well defined, so we have some latitude as to what limits we want to set.
The main point is that we should ensure that this check does not flap too much, and does not alert us too early.
In short: I think W=50GB and C=60GB is fine.