Finish removal of ukwikimedia wiki
Closed, ResolvedPublic

Description

Context from IRC:

15:25:54 	<Krinkle>	greg-g: noticed a few minor errors with prod job queue due to ukwikimedia wiki being deleted. Looks like https://wikitech.wikimedia.org/wiki/Delete_a_wiki wasn't followed (specifically, I suspect globalusage wasn't cleared on commons, as deleteWiki.php would do, and other steps may be forgotten as well).
15:26:22 	<Krinkle>	I can't seem to find who led the deletion of that wiki. Seems to have scattered tasks from 2016 through to 2018
15:26:32 	<Krinkle>	https://phabricator.wikimedia.org/T169488  / https://phabricator.wikimedia.org/T168436
15:26:34 	<+greg-g>	Krinkle: huh, crap. I don't know off the top of my head when that wiki was deleted....
15:26:53 	<Krinkle>	but.. someone should go through that list and make sure things are correct to avoid random issues.
15:26:58 	<+greg-g>	yeah....
15:27:14 	<Krinkle>	Now I think the impact of the error I spotted was just that some commons query may be broken for a few rare images that used to be used on that wiki.
15:27:32 	<Krinkle>	e.g. Special:GlobalUsage/<rare image used on that wiki before it was deleted>
15:27:43 	<Krinkle>	But not sure what else would/could happen.
15:27:51 	<+greg-g>	heh: https://tools.wmflabs.org/sal/production?p=0&q=ukwikimedia&d=

See also, these related tasks:

Event Timeline

greg triaged this task as Low priority. Mar 12 2019, 10:47 PM
greg created this task.
Restricted Application added subscribers: Base, Aklapper. Mar 12 2019, 10:48 PM

AIUI the wiki was never formally deleted (just closed); the domain was redirected, and the wiki continued to exist until these errors started cropping up and people began cleaning it up one by one.

It's unfortunate, because moving the wiki just added a lot of extra work on both sides without any clear benefits to either WMF or WMUK.

Peachey88 updated the task description. Mar 24 2019, 3:36 AM
Krinkle added a subscriber: Krinkle.

Still seen in production today:

[XYq8LwpAICoAADtT-UcAAABW] /rpc/RunSingleJob.php   LogicException from line 143 of /srv/mediawiki/php-1.34.0-wmf.23/includes/jobqueue/JobQueueGroup.php: Domain 'ukwikimedia' is not recognized.
mobrovac added a subscriber: mobrovac.

The error comes from the GlobalUsageCachePurgeJob job, itself triggered by a file upload on commons: https://commons.wikimedia.org/w/index.php?title=Special:Upload&wpDestFile=Flag_of_the_Federated_Malay_States_%281895_-_1946%29.svg&wpForReUpload=1, so I guess there are still some references to ukwm somewhere.
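To make the suspected failure mode concrete, here is a toy model in Python. It assumes (inferred from the stack trace, not checked against the GlobalUsage source) that the purge job fans out one job per wiki recorded as using the file, and that pushing to an unconfigured domain raises. All names here (`KNOWN_DOMAINS`, `push_job`, `purge_usage_caches`, the job payload) are hypothetical and do not mirror MediaWiki's real classes.

```python
# Toy model only: every name below is hypothetical, not MediaWiki's real API.

# Hypothetical set of wikis the job queue is configured for.
KNOWN_DOMAINS = {"commonswiki", "enwiki", "dewiki"}

# Stand-in for globalimagelinks rows on Commons: (gil_wiki, gil_to).
GLOBAL_IMAGE_LINKS = [
    ("enwiki", "Example.svg"),
    ("ukwikimedia", "Example.svg"),  # stale row for a wiki that no longer exists
    ("dewiki", "Other.png"),
]


def push_job(domain: str, job: dict) -> None:
    """Reject jobs for domains that are not configured, like the logged error."""
    if domain not in KNOWN_DOMAINS:
        raise LookupError(f"Domain '{domain}' is not recognized.")
    # A real queue would enqueue the job here; the toy model does nothing.


def purge_usage_caches(file_name: str) -> list[str]:
    """Push one purge job per wiki recorded as using file_name; collect failures."""
    errors = []
    wikis = sorted({wiki for wiki, target in GLOBAL_IMAGE_LINKS if target == file_name})
    for wiki in wikis:
        try:
            push_job(wiki, {"type": "purge", "file": file_name})
        except LookupError as exc:
            errors.append(str(exc))
    return errors


if __name__ == "__main__":
    # A re-upload of a file once used on the removed wiki hits the stale row.
    print(purge_usage_caches("Example.svg"))
```

If the model is roughly right, the error only shows up for files that still have a leftover row for the removed wiki, which would match how rarely it appears.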

Okay, so what can we do about it? Who should take the next step?

In the interim, should deployers ignore all fatal errors mentioning ukwikimedia and assume it can't come up in more severe situations? The number of patterns to ignore is getting so high that it's hard to tell how many new errors are getting silenced due to containing a phrase from an "unimportant" old issue.

The job queue is still seeing new jobs for ukwikimedia that it tries (but fails) to run:

Domain 'ukwikimedia' is not recognized.
#0 /srv/mediawiki/php-1.35.0-wmf.15/extensions/GlobalUsage/includes/GlobalUsageCachePurgeJob.php(56): JobQueueGroup->push(array)
#1 /srv/mediawiki/php-1.35.0-wmf.15/extensions/EventBus/includes/JobExecutor.php(70): GlobalUsageCachePurgeJob->run()
#2 /srv/mediawiki/rpc/RunSingleJob.php(76): JobExecutor->execute(array)

Tentatively reducing scope as it seems that all closing formalities for multiversion and wikiconfig etc are done. It's just something (not sure what) in JobQueue land and/or GlobalUsage that needs to be cleared.

Krinkle moved this task from Untriaged to Meta on the WMF-JobQueue board. Mar 6 2020, 11:08 PM
Dzahn removed a subscriber: Dzahn. Apr 29 2020, 9:03 AM

Still seen in prod and affecting error levels.

Requesting re-triage of a live production error still seen one year later. This distracts from health monitoring of the Job Queue and MediaWiki overall, and for most people it's not obvious that this is due to the wiki no longer existing. It's rare enough that when it pops up, it's conceivable something recent may've caused it.

Krinkle renamed this task from Review removal of ukwikimedia wiki to Finish removal of ukwikimedia wiki. Oct 26 2020, 5:54 PM
BPirkle added a subscriber: BPirkle. Nov 5 2020, 2:36 PM

Notes from a verbal discussion regarding how to approach this task:

The Delete_a_wiki page may contain some helpful information. It is possible that those steps were not all followed, or were not completely successful. Warning: that page may be a bit dated, so watch out for things that have changed since it was last updated.

From that page, this seemed a likely step that could help:

DELETE FROM globalimagelinks WHERE gil_wiki='wikidb';

Ran this against commonswiki:

SELECT COUNT(*) FROM globalimagelinks WHERE gil_wiki='ukwikimedia';

And found 4030 rows.

If that doesn't help, it might also be productive to review the GlobalUsage code and see what might be queuing these jobs.
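The count query above checks one wiki at a time. As a rough sketch, the same check can be generalised to list every `gil_wiki` value that is not in a set of known wikis. The example runs against an in-memory SQLite table with a simplified subset of the `globalimagelinks` columns; the `known_wikis` set, the helper names, and the sample rows are assumptions for illustration, not production data or tooling.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for globalimagelinks (simplified columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE globalimagelinks (gil_wiki TEXT, gil_page INTEGER, gil_to TEXT)"
    )
    conn.executemany(
        "INSERT INTO globalimagelinks VALUES (?, ?, ?)",
        [
            ("enwiki", 1, "A.png"),               # made-up row for a live wiki
            ("ukwikimedia", 5, "Info_icon.svg"),
            ("ukwikimedia", 8, "Archives.png"),
        ],
    )
    conn.commit()
    return conn


def stale_wiki_counts(conn: sqlite3.Connection, known_wikis: set) -> dict:
    """Return {gil_wiki: row count} for wikis absent from known_wikis."""
    rows = conn.execute(
        "SELECT gil_wiki, COUNT(*) FROM globalimagelinks "
        "GROUP BY gil_wiki ORDER BY gil_wiki"
    ).fetchall()
    return {wiki: count for wiki, count in rows if wiki not in known_wikis}


if __name__ == "__main__":
    # known_wikis is a hypothetical list of wikis that still exist.
    print(stale_wiki_counts(make_demo_db(), known_wikis={"enwiki"}))
```

Filtering in Python rather than with a `NOT IN (...)` clause keeps the query fixed regardless of how long the list of known wikis is.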

Sounds good to me. I'd say, just do it! Perhaps next week when there are more folks around to watch for any unexpected fallout. Also, it wouldn't hurt to e.g. dump the output of SELECT * for that query to a file before running the DELETE from sql commonswiki on mwmaint1002. Then keep that around for a week or two for a relatively easy restore in case we need it whilst figuring out what to do.
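A minimal sketch of that dump-then-delete idea, again using an in-memory SQLite table as a stand-in rather than the real `sql commonswiki` session. The JSON backup format, the helper names, and the reduced column list are assumptions made for this example.

```python
import json
import os
import sqlite3
import tempfile

# Simplified subset of the globalimagelinks columns, for illustration only.
COLUMNS = ("gil_wiki", "gil_page", "gil_page_title", "gil_to")
COLUMN_LIST = ", ".join(COLUMNS)  # built from constants, never from user input


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table holding two of the rows shown in this task."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE globalimagelinks "
        "(gil_wiki TEXT, gil_page INTEGER, gil_page_title TEXT, gil_to TEXT)"
    )
    conn.executemany(
        "INSERT INTO globalimagelinks VALUES (?, ?, ?, ?)",
        [
            ("ukwikimedia", 5, "Notice", "Info_icon.svg"),
            ("ukwikimedia", 8, "Water_cooler", "Archives.png"),
        ],
    )
    conn.commit()
    return conn


def backup_and_delete(conn: sqlite3.Connection, wiki: str, path: str) -> int:
    """Write the matching rows to a JSON file, then delete them. Returns rows deleted."""
    rows = conn.execute(
        f"SELECT {COLUMN_LIST} FROM globalimagelinks WHERE gil_wiki = ?", (wiki,)
    ).fetchall()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(rows, fh)
    cur = conn.execute("DELETE FROM globalimagelinks WHERE gil_wiki = ?", (wiki,))
    conn.commit()
    return cur.rowcount


def restore(conn: sqlite3.Connection, path: str) -> int:
    """Re-insert rows from a backup file. Returns the number of rows restored."""
    with open(path, encoding="utf-8") as fh:
        rows = json.load(fh)
    conn.executemany(
        f"INSERT INTO globalimagelinks ({COLUMN_LIST}) VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = make_demo_db()
    with tempfile.TemporaryDirectory() as tmp:
        backup_path = os.path.join(tmp, "ukwikimedia_gil.json")
        print("deleted:", backup_and_delete(conn, "ukwikimedia", backup_path))
        print("restored:", restore(conn, backup_path))
```

Writing the backup before the DELETE means a failed dump aborts the run with the table untouched.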

Mentioned in SAL (#wikimedia-operations) [2021-02-05T02:03:18Z] <Krinkle> krinkle@mwmaint1002 Prune globalimagelinks references on s4 database for the deleted ukwikimedia wiki, ref T218170.

Krinkle closed this task as Resolved. Fri, Feb 5, 2:05 AM
Krinkle claimed this task.
Check
krinkle@mwmaint$ sql commonswiki
# Read only

(commonswiki)> SELECT COUNT(*) FROM globalimagelinks WHERE gil_wiki='ukwikimedia';
+----------+
| COUNT(*) |
+----------+
|     4030 |
+----------+
1 row in set (0.00 sec)

wikiadmin@10.64.48.232(commonswiki)> SELECT * FROM globalimagelinks WHERE gil_wiki='ukwikimedia' LIMIT 2;
+-------------+----------+-----------------------+--------------------+----------------+---------------+
| gil_wiki    | gil_page | gil_page_namespace_id | gil_page_namespace | gil_page_title | gil_to        |
+-------------+----------+-----------------------+--------------------+----------------+---------------+
| ukwikimedia |        5 |                    10 | Template           | Notice         | Info_icon.svg |
| ukwikimedia |        8 |                     0 |                    | Water_cooler   | Archives.png  |
+-------------+----------+-----------------------+--------------------+----------------+---------------+
2 rows in set (0.00 sec)

> exit
Delete
krinkle@mwmaint$ sql commonswiki --write

wikiadmin@10.64.48.124(commonswiki)> DELETE FROM globalimagelinks WHERE gil_wiki='ukwikimedia' LIMIT 1;
Query OK, 1 row affected, 1 warning (0.00 sec)

wikiadmin@10.64.48.124(commonswiki)> DELETE FROM globalimagelinks WHERE gil_wiki='ukwikimedia';
Query OK, 4029 rows affected (0.26 sec)