Page MenuHomePhabricator

Spurious completely empty `image` table row on commonswiki
Closed, ResolvedPublic

Description

There is a spurious completely empty image table row in the commonswiki database.

mysql:research@analytics-store.eqiad.wmnet [commonswiki]> select * from image where img_media_type is null;
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
| img_name | img_size | img_width | img_height | img_metadata | img_bits | img_media_type | img_major_mime | img_minor_mime | img_description | img_user | img_user_text | img_timestamp | img_sha1 | img_deleted |
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
|          |        0 |         0 |          0 |              |        0 | NULL           | unknown        | unknown        |                 |        0 |               |               |          |           0 |
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
1 row in set (0.00 sec)

I feel it would be nice to have it deleted. There's a special hack for it in our code, in SpecialMediaStatistics.php (workaround for T74535):

class MediaStatisticsPage extends QueryPage {
	...
	public function getQueryInfo() {
		...
			'conds' => [
				// WMF has a random null row in the db
				'img_media_type IS NOT NULL'
			],
		...

I didn't check if there are any in other databases.

Event Timeline

matmarex created this task.Jan 19 2017, 7:41 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 19 2017, 7:41 PM

Hello,

That is indeed present on all the servers for commonswiki.
It is also present at enwiki for instance.
I have checked other big wikis like:

frwiki
jawiki
ruwiki
eswiki
ukwiki

And it is not there.

Just to try to understand the whole picture, what is the impact of having that empty row there - what will be fixed if we get rid of it? if we delete will it appear again?

Thanks!

Just to try to understand the whole picture, what is the impact of having that empty row there - what will be fixed if we get rid of it? if we delete will it appear again?

It's corrupted and the software doesn't handle it correctly. I don't think anything explodes because of it, but it messed up the interface in a few places – for example, it also appears as the empty table row at the bottom of https://commons.wikimedia.org/w/index.php?title=Special:ListFiles&dir=prev.

It shouldn't reappear (empty string is not a valid file name; MediaWiki shouldn't allow it to be created), although we don't know what caused it to appear there in the first place – might have been a MediaWiki bug years ago, might have been some broken database operation?

Let's just delete it? Seems similar to T96233.

Let's just delete it? Seems similar to T96233.

If you guys consider it is safe to delete, go ahead, but please remember to use mediawiki for this. We shouldn't touch the database records manually as it could create inconsistencies we are not aware of.

If you guys consider it is safe to delete, go ahead, but please remember to use mediawiki for this. We shouldn't touch the database records manually as it could create inconsistencies we are not aware of.

I don't think it's possible to use MediaWiki APIs for this. The title for this image ("File:") is invalid, so MediaWiki will refuse to have anything to do with it (this just shows a bad title error: https://commons.wikimedia.org/wiki/File:?action=delete). And even if it wasn't, the actual file and the actual page don't exist, so it might not be possible to attempt deleting them.

(Also, looking for related things, I found T24227 which maybe offers some hints on how this row came to be – "in my case this was a consequence from a failed/aborted file upload in a very old REL_1.4 MediaWiki from 2005 ...")

If you guys consider it is safe to delete, go ahead, but please remember to use mediawiki for this. We shouldn't touch the database records manually as it could create inconsistencies we are not aware of.

I don't think it's possible to use MediaWiki APIs for this. The title for this image ("File:") is invalid, so MediaWiki will refuse to have anything to do with it (this just shows a bad title error: https://commons.wikimedia.org/wiki/File:?action=delete). And even if it wasn't, the actual file and the actual page don't exist, so it might not be possible to attempt deleting them.
(Also, looking for related things, I found T24227 which maybe offers some hints on how this row came to be – "in my case this was a consequence from a failed/aborted file upload in a very old REL_1.4 MediaWiki from 2005 ...")

Thanks for the explanation!
My comment was more on the way of using some sort of mediawiki middleware (maintenance scripts?) in order to access the DB, to avoid going to the DB and executing the DELETE manually as it can be really dangerous if we simply just delete the row without doing any other consistency/code checks.

Ottomata assigned this task to matmarex.Mar 6 2017, 6:58 PM
Ottomata added a subscriber: Ottomata.

Triaging, feel free to re-assign.

I ran this query: P5085 to find out if any other wikis are affected – and yes, enwiki has the same problem. https://en.wikipedia.org/w/index.php?title=Special:ListFiles&dir=prev

So, I propose to execute the following queries on production databases:

DELETE FROM commonswiki.image WHERE img_name = '';
DELETE FROM enwiki.image WHERE img_name = '';

I do not have production access. Could anyone do this?

jcrespo triaged this task as Low priority.Mar 22 2017, 8:03 PM
jcrespo moved this task from Triage to Backlog on the DBA board.
matmarex removed matmarex as the assignee of this task.Apr 5 2017, 3:32 PM

There is nothing else I can do myself to resolve this. I do not have the access to run the two queries I posted in the last comment. Please, could someone with the access do it?

jcrespo added a subscriber: jcrespo.Apr 5 2017, 3:38 PM

This was classified as a low priority task. It will be eventually done, do not worry, it is not forgotten, but at the cost of other, more important pending tasks :-).

For the record, I checked the "consistency" of that row across s4 (commons) and s1 (enwiki), and to make sure at least it is present on all the servers in a consistent way (I am specially worried for those running row based replication).

demon added a subscriber: demon.Apr 21 2017, 7:29 PM

For the record, I checked the "consistency" of that row across s4 (commons) and s1 (enwiki), and to make sure at least it is present on all the servers in a consistent way (I am specially worried for those running row based replication).

If this is consistent everywhere, let's just do the DELETE and move on (I volunteer)

A one-off maintenance script for this is kinda ugly, it shouldn't have ever happened to begin with, it's just 1 row, and shouldn't come back again.

jcrespo added a comment.EditedApr 21 2017, 7:54 PM

(I volunteer)

Please don't. A null image should be fake-uploaded and then deleted on each wiki.

demon added a comment.Apr 21 2017, 8:23 PM

That doesn't even even make sense? You cannot upload an image with that filename, it would be impossible to delete.

Reedy closed this task as Resolved.Apr 26 2017, 1:44 PM
Reedy claimed this task.
Reedy added a subscriber: Reedy.
mysql:wikiadmin@db1084 [commonswiki]> select * from page where page_title = '';
Empty set (27.45 sec)

There's no page to go with it, so there's no way this row is referenced.

mysql:wikiadmin@db2019 [commonswiki]> select * from image where img_name = '';
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
| img_name | img_size | img_width | img_height | img_metadata | img_bits | img_media_type | img_major_mime | img_minor_mime | img_description | img_user | img_user_text | img_timestamp | img_sha1 | img_deleted |
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
|          |        0 |         0 |          0 |              |        0 | NULL           | unknown        | unknown        |                 |        0 |               |               |          |           0 |
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
1 row in set (0.01 sec)

mysql:wikiadmin@db2019 [commonswiki]> delete from image where img_name = '';
Query OK, 1 row affected (0.01 sec)

Sooo...

mysql:wikiadmin@db2070 [enwiki]> select * from page where page_title = '';
Empty set (2 min 21.67 sec)
mysql:wikiadmin@db2016 [enwiki]> select * from image where img_name = '';
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
| img_name | img_size | img_width | img_height | img_metadata | img_bits | img_media_type | img_major_mime | img_minor_mime | img_description | img_user | img_user_text | img_timestamp | img_sha1 | img_deleted |
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
|          |        0 |         0 |          0 |              |        0 | NULL           | unknown        | unknown        |                 |        0 |               |               |          |           0 |
+----------+----------+-----------+------------+--------------+----------+----------------+----------------+----------------+-----------------+----------+---------------+---------------+----------+-------------+
1 row in set (0.04 sec)

mysql:wikiadmin@db2016 [enwiki]> delete from image where img_name = '';
Query OK, 1 row affected (0.00 sec)

And bang! The dirt is gone!

Probably can clean up SpecialMediaStatistics now...

I checked, and no other WMF wiki has this problem

Change 350434 had a related patch set uploaded (by Bartosz Dziewoński):
[mediawiki/core@master] SpecialMediaStatistics: Remove WMF-specific hack, no longer needed

https://gerrit.wikimedia.org/r/350434

Change 350434 merged by jenkins-bot:
[mediawiki/core@master] SpecialMediaStatistics: Remove WMF-specific hack, no longer needed

https://gerrit.wikimedia.org/r/350434