Reduce shard count on all wikis in beta cluster to 1
Closed, Resolved | Public | 2 Estimated Story Points

Description

Beta cluster is using the production shard counts but has trivial amounts of data. Currently there are 434 shards across 184 indices in beta cluster. With one shard per index we could bring that down to 184 shards and relieve a bit of memory pressure on Elasticsearch.
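
For reference, the index and shard totals can be tallied with a sketch like the one below. It assumes the input is the two-column output of `_cat/indices?h=index,pri` (index name, primary shard count). The helper name `shard_summary` is hypothetical, and it counts primary shards only; the task does not say whether the 434 figure includes replicas.

```shell
#!/bin/sh
# Hypothetical helper: reads "<index> <primary shard count>" lines on stdin
# and prints "<number of indices> <total primary shards>".
shard_summary() {
    awk '{ n++; s += $2 } END { print n + 0, s + 0 }'
}

# Example against a live cluster (host is an assumption; substitute your own):
# curl -s 'https://ELASTIC_HOST:9243/_cat/indices?h=index,pri' | shard_summary
```

If every index has a single primary shard, the two numbers printed are equal, which is the target state of this task.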

Event Timeline

Gehel triaged this task as High priority. Sep 5 2022, 3:24 PM
Gehel moved this task from needs triage to Current work on the Discovery-Search board.
Gehel edited projects, added Discovery-Search (Current work); removed Discovery-Search.
Gehel set the point value for this task to 2. Sep 5 2022, 3:41 PM

Change 833463 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/mediawiki-config@master] cirrus: Limit shard count to 1 in deployment-prep

https://gerrit.wikimedia.org/r/833463

Change 833463 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: Limit shard count to 1 in deployment-prep

https://gerrit.wikimedia.org/r/833463

Mentioned in SAL (#wikimedia-operations) [2022-09-21T20:20:49Z] <samtar@deploy1002> Started scap: Backport for [[gerrit:833463|cirrus: Limit shard count to 1 in deployment-prep (T316711)]]

Mentioned in SAL (#wikimedia-operations) [2022-09-21T20:21:12Z] <samtar@deploy1002> samtar and ebernhardson: Backport for [[gerrit:833463|cirrus: Limit shard count to 1 in deployment-prep (T316711)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2022-09-21T20:25:08Z] <samtar@deploy1002> Finished scap: Backport for [[gerrit:833463|cirrus: Limit shard count to 1 in deployment-prep (T316711)]] (duration: 04m 19s)

Started reindex from deployment-mwmaint02.

Reindexing finished, but a number of indices still have more shards than expected. Will need to review why the config change didn't do what I expected.

Change 836301 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/mediawiki-config@master] wmgCirrusSearchShardCount: Override prod settings for beta cluster

https://gerrit.wikimedia.org/r/836301

Change 836301 merged by jenkins-bot:

[operations/mediawiki-config@master] wmgCirrusSearchShardCount: Override prod settings for beta cluster

https://gerrit.wikimedia.org/r/836301

Started reindexing on deployment-prep again, after verifying via shell.php that $wgCirrusSearchShardCount is set appropriately.

Change 838272 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/mediawiki-config@master] beta: Set shard count for commonswiki_file to 1

https://gerrit.wikimedia.org/r/838272

This is 99% of the way there; the only remaining index is commonswiki_file. When applying the new shard settings I failed to set an explicit value for the file index suffix, which only exists on commonswiki, so cirrus refuses to create a new index there. Once the next patch is shipped we should be able to close this out.

Change 838272 merged by jenkins-bot:

[operations/mediawiki-config@master] beta: Set shard count for commonswiki_file to 1

https://gerrit.wikimedia.org/r/838272

commonswiki_file is now complete as well. The following check reports no indices with multiple shards:

curl -s https://deployment-elastic09.deployment-prep.eqiad1.wikimedia.cloud:9243/_cat/indices | awk '$5 > 1 { print $0 }'
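
The check above depends on the primary shard count being the fifth column of the default `_cat/indices` output. A slightly more defensive variant, offered only as a sketch, requests the columns explicitly with the `h=index,pri` parameter and filters on the second field. The function name `multi_shard_indices` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "<index> <primary shard count>" lines on stdin
# and prints every index that has more than one primary shard.
# Empty output means all indices are down to a single primary shard.
multi_shard_indices() {
    awk '$2 > 1 { print $1, $2 }'
}

# Example, using the host from the command above:
# curl -s 'https://deployment-elastic09.deployment-prep.eqiad1.wikimedia.cloud:9243/_cat/indices?h=index,pri' | multi_shard_indices
```

Naming the columns keeps the filter correct even if the default column order differs between Elasticsearch versions.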