Files with "UNKNOWN" img_media_type
Closed, Resolved · Public

Description

There are 20 BMP files with "UNKNOWN" img_media_type on enwiki; see https://en.wikipedia.org/wiki/Special:MediaStatistics.

They're all super old (uploaded between September 2007 and January 2008).

Event Timeline

matmarex created this task. Mar 29 2016, 2:49 PM
Restricted Application added projects: Multimedia, Commons. Mar 29 2016, 2:49 PM
Restricted Application added subscribers: Steinsplitter, Aklapper.
mysql:research@analytics-store.eqiad.wmnet [enwiki]> select img_name, img_timestamp from image where img_media_type='UNKNOWN';
+-----------------------------------+----------------+
| img_name                          | img_timestamp  |
+-----------------------------------+----------------+
| Escudokolnsburgrad.bmp            | 20071107235841 |
| Flat_chisel.bmp                   | 20071006011721 |
| Julianity.bmp                     | 20080130185940 |
| Kilcoocrest.bmp                   | 20070928191324 |
| Leave_mes_talk.bmp                | 20080114010558 |
| Logon.bmp                         | 20080115223846 |
| Magneprop.bmp                     | 20071201003812 |
| Moatcclogo.bmp                    | 20071116234455 |
| New_Philanthropy_Capital_logo.bmp | 20071025114753 |
| PaulBrowntree.bmp                 | 20080126032848 |
| Router1.bmp                       | 20071217164542 |
| Router2.bmp                       | 20071217164952 |
| Sgbau_logo.bmp                    | 20071210214013 |
| Signposts.bmp                     | 20080114010123 |
| SpeaksEnglish.bmp                 | 20071013024542 |
| StAndrewsManagement.bmp           | 20071203155709 |
| Timeline-191_map2-2.bmp           | 20070926212447 |
| Timeline-191_map2-4.bmp           | 20071013204657 |
| Total_Control.bmp                 | 20070922204747 |
| Triple_Padlock.bmp                | 20080125011145 |
+-----------------------------------+----------------+
Restricted Application added a subscriber: Matanya. Mar 29 2016, 2:49 PM

There are a bunch more on other wikis: not just BMP, but just about every kind of file type, really. I'm not posting the full list, since some of the wikis are private.

Wiki                Count of UNKNOWN img_media_type
aawiki              1
afwiki              1
arwiki              4
bgwiki              1
boardwiki           1
cawiki              1
collabwiki          15
emlwiki             1
enwiki              20
enwikiversity       3
etwiki              8
fiwiki              6
fiwikibooks         1
frwiki              3
hewiki              9
hewiktionary        1
hrwiki              22
huwiki              21
internalwiki        12
iswiki              1
itwiki              1
jawiki              3
kmwiki              6
kywiki              1
ltwiki              3
metawiki            4
mkwiki              9
mrwiki              1
mswiki              1
nowiki              1
officewiki          27
ptwikibooks         1
rowiki              1
ruwiki              5
ruwiktionary        1
siwiki              9
skwiki              1
slwiki              5
test2wiki           2
thwiki              4
trwiki              6
ukwiki              3
urwiki              1
viwiki              4
viwikibooks         2
wikimania2006wiki   12
zhwiki              5
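
For anyone wanting to reproduce per-wiki counts like these, here is a minimal sketch of the idea. It is not the query that was actually run for this task: it uses in-memory SQLite databases as stand-ins for each wiki's `image` table, and the helper `make_toy_wiki`, the wiki names `wiki_a`/`wiki_b`, and the `Example.png` row are hypothetical. Only the `image` table, the `img_name` and `img_media_type` columns, and the `'UNKNOWN'` filter come from the enwiki query in this task.

```python
import sqlite3

# Same filter as the enwiki query in this task, turned into a count.
COUNT_UNKNOWN_SQL = "SELECT COUNT(*) FROM image WHERE img_media_type = 'UNKNOWN'"


def count_unknown(conn):
    """Return how many rows of `image` have img_media_type = 'UNKNOWN'."""
    (count,) = conn.execute(COUNT_UNKNOWN_SQL).fetchone()
    return count


def count_unknown_per_wiki(connections):
    """Map wiki name -> count, keeping only wikis with at least one such file."""
    counts = {}
    for wiki, conn in sorted(connections.items()):
        n = count_unknown(conn)
        if n > 0:
            counts[wiki] = n
    return counts


def make_toy_wiki(rows):
    """Hypothetical helper: in-memory stand-in for one wiki's `image` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image (img_name TEXT PRIMARY KEY, img_media_type TEXT)"
    )
    conn.executemany("INSERT INTO image VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    # Toy data only; the real tables live on the production replicas.
    toy = {
        "wiki_a": make_toy_wiki(
            [("Router1.bmp", "UNKNOWN"), ("Router2.bmp", "UNKNOWN"),
             ("Example.png", "BITMAP")]
        ),
        "wiki_b": make_toy_wiki([("Example.png", "BITMAP")]),
    }
    for wiki, n in count_unknown_per_wiki(toy).items():
        print(wiki, n)
```

Wikis with a count of zero are dropped, which matches the shape of the list above (only affected wikis appear).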
matmarex claimed this task. Edited · Mar 29 2016, 3:58 PM

There's already a maintenance script which can fix this (refreshImageMetadata.php), but it needs a small tweak.

Change 280246 had a related patch set uploaded (by Bartosz Dziewoński):
refreshImageMetadata: Allow filtering by 'img_media_type' too

https://gerrit.wikimedia.org/r/280246

matmarex renamed this task from BMP files with "UNKNOWN" img_media_type to Files with "UNKNOWN" img_media_type. Mar 29 2016, 4:15 PM

After the above is merged, we should run php maintenance/refreshImageMetadata.php --mediatype=UNKNOWN --force on each of the 47 wikis listed above (T131157#2157759).
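
A rough sketch of what "run it on each wiki" could look like, in case it helps whoever does the run. The script path and the `--mediatype=UNKNOWN --force` options are taken from the comment above; the `--wiki=` option is MediaWiki's generic maintenance-script parameter, and using it this way (rather than whatever wrapper production uses) is an assumption. The function names and the short wiki list in the demo are illustrative only. It defaults to a dry run that just prints the commands.

```python
import subprocess

# Script path as written in the comment above (relative to the MediaWiki root).
SCRIPT = "maintenance/refreshImageMetadata.php"


def build_command(wiki):
    """Build the argv list for one wiki (assumes the generic --wiki option)."""
    return ["php", SCRIPT, "--wiki=" + wiki, "--mediatype=UNKNOWN", "--force"]


def run_on_wikis(wikis, dry_run=True):
    """Print (dry run) or execute the refresh command for each wiki in turn."""
    commands = [build_command(wiki) for wiki in wikis]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first wiki whose run fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Three of the affected wikis as a demo; the full list has 47 entries.
    run_on_wikis(["enwiki", "hrwiki", "huwiki"])
```

Running the wikis one at a time keeps a failure on one wiki from hiding behind output from the others.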

greg added a subscriber: greg. Mar 29 2016, 4:41 PM
greg added a comment. Mar 29 2016, 6:11 PM

> After the above is merged, we should run php maintenance/refreshImageMetadata.php --mediatype=UNKNOWN --force on each of the 47 wikis listed above (T131157#2157759).

Seeing as Matma doesn't have deploy access, and @aaron is a reviewer of the patch, I'd say when it's ready @aaron can deploy/run the main script during a window that makes sense for him (and doesn't conflict with anything else going on). OK, @aaron?

Change 280246 merged by jenkins-bot:
refreshImageMetadata: Allow filtering by 'img_media_type' too

https://gerrit.wikimedia.org/r/280246

matmarex triaged this task as Normal priority. Apr 5 2016, 10:01 PM

@aaron, when can we do this?

aaron added a comment. Apr 11 2016, 5:57 PM

I literally started it 3 minutes ago, should be done soon.

aaron added a comment. Apr 11 2016, 6:17 PM

> I literally started it 3 minutes ago, should be done soon.

Done.

matmarex closed this task as Resolved. Apr 11 2016, 6:25 PM

Thanks! That fixed almost all of them, closing.

I re-ran my query and there are four more unknown media type files: