Drop unneeded empty tables from wikis
Closed, Resolved, Public

Description

Since we want to reduce the number of files opened by MariaDB in s3, we can drop a couple of tables that are not needed in our infra.

For example, here is the list of empty tables on hywiki:

0       cu_useragent
0       interwiki
0       ipblocks_restrictions
0       job
0       l10n_cache
0       objectcache
0       searchindex
0       securepoll_strike
0       securepoll_votes
0       uploadstash
0       user_autocreate_serial

Many of these are needed (e.g. uploadstash), but the following tables can be dropped:

  • job
  • objectcache
  • searchindex
  • l10n_cache
  • interwiki
  • user_autocreate_serial
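
One way to make the selection above repeatable is sketched below in Python. It is only an illustration: the function name `plan_drops` and the hard-coded allowlist are assumptions made for this example, not the tooling the DBAs actually used. It takes `(row_count, table)` pairs like the hywiki listing, keeps only tables that are both empty and on the droppable list, and prints `DROP TABLE IF EXISTS` statements for review.

```python
# Hypothetical helper: names and structure are illustrative, not the DBA tooling.

# Tables listed in the description as safe to drop.
DROPPABLE = {
    "job",
    "objectcache",
    "searchindex",
    "l10n_cache",
    "interwiki",
    "user_autocreate_serial",
}


def plan_drops(table_counts):
    """Return DROP statements for tables that are empty and on the allowlist.

    table_counts is an iterable of (row_count, table_name) pairs.
    """
    statements = []
    for row_count, table in table_counts:
        # Never drop a table that holds data, even if it is on the list.
        if row_count == 0 and table in DROPPABLE:
            statements.append(f"DROP TABLE IF EXISTS {table};")
    return sorted(statements)


# A subset of the hywiki listing from the description.
hywiki = [
    (0, "cu_useragent"),
    (0, "interwiki"),
    (0, "job"),
    (0, "uploadstash"),
]

for statement in plan_drops(hywiki):
    print(statement)
```

Tables such as uploadstash and cu_useragent are empty but not on the allowlist, so the sketch leaves them alone; a droppable table that turns out to contain rows is also skipped.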

Event Timeline

Ladsgroup triaged this task as Medium priority. Jun 18 2025, 8:03 PM
Ladsgroup moved this task from Triage to Ready on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2025-06-19T10:20:49Z] <Amir1> dropping searchindex table in itwiki (T397367)

I will wait until Monday before dropping the table everywhere.

Mentioned in SAL (#wikimedia-operations) [2025-06-24T10:18:09Z] <Amir1> dropping searchindex table everywhere (T397367)

Change #1163323 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[operations/mediawiki-config@master] Clean up EventBus and jobs config

https://gerrit.wikimedia.org/r/1163323

Change #1163323 merged by jenkins-bot:

[operations/mediawiki-config@master] Clean up EventBus and jobs config

https://gerrit.wikimedia.org/r/1163323

Mentioned in SAL (#wikimedia-operations) [2025-06-25T09:37:01Z] <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1163323|Clean up EventBus and jobs config (T397367)]]

Mentioned in SAL (#wikimedia-operations) [2025-06-25T09:39:09Z] <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1163323|Clean up EventBus and jobs config (T397367)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-06-25T09:46:38Z] <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1163323|Clean up EventBus and jobs config (T397367)]] (duration: 09m 36s)

Mentioned in SAL (#wikimedia-operations) [2025-06-25T10:13:46Z] <Amir1> dropping table job in group0 (T397367)

Mentioned in SAL (#wikimedia-operations) [2025-06-26T10:19:44Z] <Amir1> dropping job table on all wikis (T397367)

Dropped l10n_cache in group0 wikis. I'll do the next round on Monday or so.

Mentioned in SAL (#wikimedia-operations) [2025-07-02T06:29:30Z] <Amir1> dropping l10n_cache table everywhere (T397367)

We also need to make sure no new such tables are created in new wikis.

Mentioned in SAL (#wikimedia-operations) [2025-09-03T12:04:14Z] <Amir1> dropping objectcache table in group0 (T397367)

Mentioned in SAL (#wikimedia-operations) [2025-09-08T09:18:35Z] <Amir1> dropping all objectcache table everywhere (T397367)

We use a central provider for temporary user account names, so we may not really need user_autocreate_serial for SUL wikis. (Most non-SUL wikis do not allow IPs to edit, so we can potentially drop it there too, though care is needed not to break things.)

Thanks. I asked the team and, after their confirmation, I am adding it to the list.

We will eventually get around to using cu_useragent (T361139), though that work isn't planned at the moment.

If the DBAs would prefer to not have this table around we could additionally consider:

Actually, I have no problem with keeping cu_useragent for now. What would be much more impactful is getting rid of the three tables that SecurePoll adds to every wiki but doesn't really need; since they have data, I can't drop them: T395928#11046636

Sure. It seems to be handled by Novem Linguae and SD0001 (with thanks to them for taking this on). If you think it could do with some Trust and Safety Product Team input, I can raise it with the team.

Mentioned in SAL (#wikimedia-operations) [2025-09-10T17:15:52Z] <Amir1> dropping user_autocreate_serial on sul wikis where empty (T397367)

These wikis have non-empty interwiki tables:

ladsgroup@stat1009:~$ cat res_interwiki | grep -v "count(" | grep -v 0 
arbcom_zhwiki   49
bewwiktionary   49
idwikivoyage    49
kncwiki 49
madwikisource   49
minwikibooks    49
nupwiki 49
rkiwiki 49
satwiktionary   49
sylwiki 49
tigwiki 49
tlwikisource    49
zghwiktionary   49

Do you notice anything interesting @Zabe?
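
A side note on the filter above: `grep -v 0` drops every line that contains the character `0` anywhere, so a wiki with, say, 10 or 105 rows would be hidden along with the empty ones. A minimal Python sketch of a stricter filter follows. It assumes each data line of `res_interwiki` is a wiki name followed by a whitespace-separated integer count (as in the output above) and that header lines contain `count(`. The function name `non_empty_wikis` is hypothetical.

```python
def non_empty_wikis(lines):
    """Return {wiki: count} for lines whose count is greater than zero."""
    result = {}
    for line in lines:
        # Skip the SQL column header lines, as the grep pipeline does.
        if "count(" in line:
            continue
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # ignore blank or malformed lines
        wiki, count = parts[0], int(parts[1])
        if count > 0:
            result[wiki] = count
    return result


sample = [
    "count(*)",
    "hywiki\t0",
    "kncwiki 49",
    "examplewiki 10",  # made-up row: `grep -v 0` would wrongly hide it
]
print(non_empty_wikis(sample))
```

Because the comparison is done on the parsed number rather than on the raw text, only wikis whose count is exactly zero are excluded.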

Yeah those are new. Maybe the ones created since interwiki got merged into core?

Change #1188741 had a related patch set uploaded (by Zabe; author: Zabe):

[mediawiki/core@master] InstallPreConfigured: Allow subclasses to skip tasks

https://gerrit.wikimedia.org/r/1188741

Change #1188742 had a related patch set uploaded (by Zabe; author: Zabe):

[mediawiki/extensions/WikimediaMaintenance@master] addWiki: Stop populating the interwiki table on new wikis

https://gerrit.wikimedia.org/r/1188742

Change #1188741 merged by jenkins-bot:

[mediawiki/core@master] InstallPreConfigured: Allow subclasses to skip tasks

https://gerrit.wikimedia.org/r/1188741

Change #1188742 merged by jenkins-bot:

[mediawiki/extensions/WikimediaMaintenance@master] addWiki: Stop populating the interwiki table on new wikis

https://gerrit.wikimedia.org/r/1188742

Seems to be fixed.

wikiadmin2023@10.64.0.50(mswikiquote)> select * from interwiki;
Empty set (0.001 sec)

wikiadmin2023@10.64.0.50(mswikiquote)>

This only means the interwiki table would be empty. But it should not exist at all on new wikis.

You're missing the point of why we are doing this in the first place: s5 doesn't have the open-files problem.

Should we also consider dropping the updatelog table? It may be non-empty but has no use in production.

Yeah, sounds good. It would actually be nice in case update.php somehow gets run, which would break the system (we have had major outages caused by update.php running automatically).

It is written to by maintenance scripts that extend LoggedUpdateMaintenance, so we would lose the "skip if already run" functionality of those scripts if this table were dropped. Additionally, the code would need to simply skip in those cases, given that we still need to run these maintenance scripts (even if they don't get run through update.php).
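
To make that dependency concrete, here is a toy Python model of the "skip if already run" pattern described above. It is not MediaWiki code: the class `UpdateLog`, the function `run_logged_update`, its `force` parameter, and the example key are all made up for illustration, and a plain set stands in for the updatelog table.

```python
class UpdateLog:
    """Stand-in for the updatelog table: remembers which keys have run."""

    def __init__(self):
        self._keys = set()

    def has(self, key):
        return key in self._keys

    def add(self, key):
        self._keys.add(key)


def run_logged_update(log, key, do_update, force=False):
    """Run do_update once per key unless forced; return True if it ran."""
    if not force and log.has(key):
        return False  # already recorded, so skip
    do_update()
    log.add(key)
    return True


log = UpdateLog()
runs = []
first = run_logged_update(log, "example-key", lambda: runs.append(1))
second = run_logged_update(log, "example-key", lambda: runs.append(1))
print(first, second, len(runs))
```

Without the log there is nothing for the second call to check, so the update would run again; that is the functionality that would be lost if the table were dropped.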

ugh. Right. I will stick to interwiki only then.

Mentioned in SAL (#wikimedia-operations) [2025-09-30T10:56:48Z] <Amir1> dropping interwiki table on group0 (T397367)

Mentioned in SAL (#wikimedia-operations) [2025-10-06T11:17:13Z] <Amir1> dropping interwiki table on group1 (T397367)

Mentioned in SAL (#wikimedia-operations) [2025-10-06T11:25:49Z] <Amir1> dropping interwiki table on group2 (T397367)

Ladsgroup updated the task description.
Ladsgroup moved this task from In progress to Done on the DBA board.
