
Specific revisions of multiple files missing from Swift - 404 Not Found returned
Open, Normal, Public

Description

The previous revision of https://commons.wikimedia.org/wiki/File:Clinton_power_station_1.jpg cannot be accessed or found.

https://upload.wikimedia.org/wikipedia/commons/archive/7/7d/20080122163443%21Clinton_power_station_1.jpg:

404 Not Found
The resource could not be found.
File not found: /v1/AUTH_mw/wikipedia-commons-local-public.7d/archive/7/7d/20080122163443%21Clinton_power_station_1.jpg
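
For anyone comparing public URLs with the 404 messages, here is a minimal sketch of how an upload.wikimedia.org URL appears to map to the Swift path in the error. The layout is inferred only from the error messages quoted in this task, and the function name `swift_path_from_upload_url` is made up for illustration; this is not taken from the MediaWiki or Swift configuration.

```python
from urllib.parse import urlsplit


def swift_path_from_upload_url(url: str, account: str = "AUTH_mw") -> str:
    """Guess the Swift path behind an upload.wikimedia.org URL.

    Assumed layout (inferred from the 404 messages in this task only):
    /v1/<account>/<project>-<site>-local-public.<shard>/<object>
    """
    # urlsplit keeps percent-encoding (e.g. %21) untouched, as in the errors.
    parts = urlsplit(url).path.strip("/").split("/")
    project, site, rest = parts[0], parts[1], parts[2:]
    # Archived revisions carry an extra leading "archive" component.
    shard = rest[2] if rest[0] == "archive" else rest[1]
    container = f"{project}-{site}-local-public.{shard}"
    return f"/v1/{account}/{container}/" + "/".join(rest)


if __name__ == "__main__":
    print(swift_path_from_upload_url(
        "https://upload.wikimedia.org/wikipedia/commons/archive/7/7d/"
        "20080122163443%21Clinton_power_station_1.jpg"
    ))
```

If a reported 404 path differs from what this mapping produces, that difference may itself be worth noting on the task.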

See also:

  • T41615 (404 error for all revisions of some files)

Event Timeline

Josve05a raised the priority of this task from to Needs Triage.
Josve05a updated the task description.
Josve05a added a subscriber: Josve05a.
Restricted Application added a project: Multimedia.Jan 19 2016, 10:05 PM
Restricted Application added subscribers: StudiesWorld, Steinsplitter, Aklapper.
Josve05a updated the task description.Jan 19 2016, 10:06 PM
Josve05a set Security to None.
Restricted Application added a subscriber: Matanya.Jan 19 2016, 10:06 PM
Bawolff added a subscriber: aaron.Jan 19 2016, 10:16 PM
Aklapper renamed this task from File not found (404 - The resource could not be found) to Specific revision of file on Commons triggers 404 (not found).Jan 20 2016, 1:36 PM
Josve05a moved this task from Backlog to Tasks to follow on the User-Josve05a board.
Steinsplitter moved this task from Incoming to Backlog on the Commons board.Jan 22 2016, 5:59 PM
Poyekhali renamed this task from Specific revision of file on Commons triggers 404 (not found) to Specific revision of a file on Commons triggers 404 (not found).Apr 13 2016, 4:00 AM
Poyekhali triaged this task as High priority.
Poyekhali updated the task description.
Poyekhali added a subscriber: Poyekhali.

Same with https://upload.wikimedia.org/wikipedia/en/archive/9/9a/20080218114104%21Internet_Download_Accelerator.png vs https://en.wikipedia.org/wiki/File:Internet_Download_Accelerator.png

404 Not Found

The resource could not be found.

File not found: /v1/AUTH_mw/wikipedia-en-local-public.9a/archive/9/9a/20060321114104%21Internet_Download_Accelerator.png
Josve05a updated the task description.May 21 2016, 1:08 AM
Josve05a assigned this task to aaron.Jul 31 2016, 10:58 PM
aaron removed aaron as the assignee of this task.Aug 2 2016, 12:13 AM
aaron added a project: Multimedia.
Josve05a added a comment.EditedAug 11 2016, 7:03 PM

And first revision of https://en.wikipedia.org/wiki/File:Parisian.PNG is also missing

https://upload.wikimedia.org/wikipedia/en/archive/6/68/20070717000058%21Parisian.PNG;

File not found: /v1/AUTH_mw/wikipedia-en-local-public.68/archive/6/68/20070717000058%21Parisian.PNG

Josve05a renamed this task from Specific revision of a file on Commons triggers 404 (not found) to Specific revision of a file triggers 404 (not found).Aug 11 2016, 7:03 PM

First revision of https://en.wikipedia.org/wiki/File:Greater_London_Authority_logo.png as well

https://upload.wikimedia.org/wikipedia/en/archive/7/78/20080828203602%21Greater_London_Authority_logo.png

File not found: /v1/AUTH_mw/wikipedia-en-local-public.78/archive/7/78/20080828203602%21Greater_London_Authority_logo.png

Josve05a renamed this task from Specific revision of a file triggers 404 (not found) to Specific revision of a files triggers 404 (not found).Aug 11 2016, 8:45 PM
Josve05a renamed this task from Specific revision of a files triggers 404 (not found) to Specific revisions of multiple files triggers 404 (not found).

Image https://commons.wikimedia.org/wiki/File:Cactaceae_(1082183341).jpg is missing (there's only one revision). The file was uploaded 01:25, 7 June 2016.

The 120px thumbnail is there, though.

Platonides renamed this task from Specific revisions of multiple files triggers 404 (not found) to Specific revisions of multiple files missing from Swift - 404 Not Found returned.Sep 16 2016, 4:46 PM

This problem seems to be increasing. There was one thread on the English VP today, and two on the German Forum.

Perhelion updated the task description.Nov 29 2016, 9:55 AM
Fae awarded a token.Nov 29 2016, 1:53 PM
Fae added a subscriber: Fae.
MarkTraceur lowered the priority of this task from High to Normal.Dec 2 2016, 9:57 PM
MarkTraceur moved this task from Untriaged to Triaged on the Multimedia board.
MarkTraceur added a subscriber: MarkTraceur.

I don't see this as "high" priority, but I'm willing to be convinced otherwise. Since these are not current versions of files, it doesn't seem to affect most day-to-day work.

It doesn't seem like these files have anything in common. There are PNGs and JPGs, different file sizes, different resolutions, different upload dates...and different re-upload dates, too. There are even files from at least two different wikis.

Can anyone offer any insight into the similarities between these files?

First revision of https://commons.wikimedia.org/wiki/File:J.J._Burns_NSRW1-0009.jpg as well

https://upload.wikimedia.org/wikipedia/commons/archive/b/b8/20071019085735%21J.J._Burns_NSRW1-0009.jpg

File not found: /v1/AUTH_mw/wikipedia-commons-local-public.b8/archive/b/b8/20071019085735%21J.J._Burns_NSRW1-0009.jpg

To be honest, I find priority "normal" to be worrying. If potentially unrecoverable data loss is not a high priority then what is?

@Srittau: Feel free to elaborate why this task is more urgent than other tasks on the project workboards. "Data loss" in general sounds like a severity categorization. Priority rather depends on how often and how recently things happen, I'd say...

Dzahn added a subscriber: Dzahn.Dec 13 2017, 7:38 PM

Dzahn added a comment.Dec 13 2017, 7:39 PM

14:24 < Dragonfly6-7> https://commons.wikimedia.org/wiki/File:Burbuja_(1496994920).jpg does anyone else see this as broken?
14:25 < mutante> Dragonfly6-7: yea, confirmed. https://upload.wikimedia.org/wikipedia/commons/2/2e/Burbuja_%281496994920%29.jpg is 404

Aklapper added a subscriber: Jojr149.

@Jojr149: Please do not remove user projects from tasks.

AlexisJazz added a subscriber: AlexisJazz.

Copied from my duplicate task:

https://commons.wikimedia.org/wiki/File:Accuracy_International_Arctic_Warfare_-_Psg_90.jpg

First revision is from 23:38, 1 December 2006 but the file does not exist. Current revision from 23:33, 3 November 2008 is fine though. (fine is a big word, the quality is shitty, but Wikimedia can't help that)

I tried 00 to ff for https://upload.wikimedia.org/wikipedia/commons/2/2e/Burbuja_%281496994920%29.jpg (guessing it might be misplaced) but it's just not there. https://commons.wikimedia.org/w/api.php?action=query&list=allimages&aisha1=5c898098c147202d918a63e2b3731b3eb9779051 does work, so the upload was fine.
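
For reference, the two checks described above could be scripted roughly like this. It is only a sketch: it builds the 256 candidate URLs (00 to ff) and the sha1 lookup URL, and leaves the actual HTTP requests to whoever runs it. The function names are made up, and the assumption that a misplaced file would keep the same `<x>/<xy>/<name>` layout is a guess.

```python
from urllib.parse import quote, urlencode

COMMONS_UPLOAD = "https://upload.wikimedia.org/wikipedia/commons"
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def shard_candidates(filename: str, base: str = COMMONS_UPLOAD) -> list[str]:
    """Return one candidate URL per shard directory, 0/00 through f/ff."""
    encoded = quote(filename, safe="")  # "(" and ")" become %28 and %29
    urls = []
    for n in range(256):
        shard = f"{n:02x}"
        urls.append(f"{base}/{shard[0]}/{shard}/{encoded}")
    return urls


def sha1_lookup_url(sha1: str, api: str = COMMONS_API) -> str:
    """Build the allimages query used above to confirm the upload record."""
    query = urlencode({"action": "query", "list": "allimages", "aisha1": sha1})
    return f"{api}?{query}"


if __name__ == "__main__":
    candidates = shard_candidates("Burbuja_(1496994920).jpg")
    print(len(candidates), candidates[0x2E])
    print(sha1_lookup_url("5c898098c147202d918a63e2b3731b3eb9779051"))
```

Requesting all 256 candidates should be done slowly and with a descriptive User-Agent, since these are requests against production servers.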

Here is a list of things; they may not all be related to this issue.

https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Mohit_Manke.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Kleinbardorf_Judenh%C3%BCgel_032.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:User_record_for_the_mangle_in_Zaschendorf.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:TehsilPirmahalz_8.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Sysadmin_emc_unisphere.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Voortrekker_Monument,_Pretoria.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:US_Navy_110607-N-KB666-173_Multination_divers_from_the_Caribbean_conduct_a_hull_search_of_a_suspicious_vessel_during_a_training_exercise_with_Mobile.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Terezinha_Guilhermina_-_2013_IPC_Athletics_World_Championships-2.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Lavoir_08313.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Louis_Payette_Pierre_tombale.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:M%C4%9Bsto_ve_h%C5%99e.png
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Kit_left_arm_palmeiras13c.png
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Trippa_e_patate_in_Calabria.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Psophocarpus_tetragonolobus_in_Hainan_-_02.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Saint_Joseph_and_Jesus_Christ_as_a_baby,_Bas%C3%ADlica_da_Estrela,_Lisbon.jpg

https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:L%27%C3%A9glise_de_Sadirac.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Anne_Paugam,_directrice_g%C3%A9n%C3%A9rale_de_l%27AFD.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/Image:Gauteng_eastrand_map.jpg

https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Burghausen05.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Z%C3%A1smuky,_Pol%C3%A1kova.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:HC-081_(Hex-CruZ)_2014-04-17_22-17.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Rio_limon_venezuela31.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Schladming_wm2013_1881_13-02-07.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:2010-08-04_(27)_Purpur-Fruchtwanze,_Carpocoris_purpureipennis.test.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Charlie_und_die_Schokoladenfabrik_(Film)_Logo.png
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:B02_Ecuador_012_small_village_at_Rio_Misahuall%C3%AD,_February_1985.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Adam-Stegerwald-Haus_in_K%C3%B6nigswinter.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Rio_limon_venezuela27.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Cmd_prompt_Befehle.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Sphagnum_girgensohnii_(a,_150137-481740)_7191.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Olof_Hellstr%C3%B6m_Kluven.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Capuchin_monastery_in_Lubart%C3%B3w,_Poland.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:0_Domus_Augustana_-_Mt_Palatin_-_Rome_(1).JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:A_picture_of_Rachel_Michelle_King-_2014-04-17_16-20.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Bollywood_Poster_Shop,_Chor_Bazaar.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Klagenfurt_Heuplatz_3_14072009_55.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:HSchBtl_108_(V1).jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:CDs-P-10256_Alfio_Giuffrida-AG_Sinnwerke.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Javanese_illumination_of_an_unknown_text.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Nehemiya,_au_stand_de_Wikipedia_au_forum_mondial_de_la_langue_francaise.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Estas_son_las_calles_de_barrio_Mar%C3%ADa_Auxiliadora_Concepcion_del_Uruguay_Entre_Rios.png
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Picking_strawberries_Mr_Pyiiuui%27s_orchard,_Kelow-na,_British_Columbia_(HS85-10-21212)_original.tif
https://commons.wikimedia.org/wiki/Commons:Village_pump/Archive/2016/11 (https://commons.wikimedia.org/wiki/File:Denkmal2016-St.Petersburg-Drache_02.jpg, https://commons.wikimedia.org/wiki/File:Leko_H%C3%BCbner_2000_Dortmund.jpg, https://commons.wikimedia.org/wiki/File:Bamboo_bridge_over_Nam_Khan_LP.jpg)
https://commons.wikimedia.org/wiki/Commons:Forum/Archiv/2016/January (https://commons.wikimedia.org/wiki/File:Brotjacklriegel_FMT.jpg) (T125140 ?)

https://commons.wikimedia.org/wiki/Category:Files_with_404_errors
From this category: https://commons.wikimedia.org/wiki/File:Cimbriani_shield_pattern.svg (png thumbnail works, cached probably, svg missing)

Glitch?
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Long_Beach_Comic_%26_Horror_Con_2011_-_Aquamen_(6301708270).jpg

Possibly different case, "A handful of them got corrupted on upload." (but should that have resulted in 404?)
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Australian_Census_2011_demographic_map_-_New_South_Wales_by_POA_-_BCP_field_0147_Visitor_from_Different_SA2_in_Victoria_Age_0_14_years.svg

Unclear:
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Alfred_Bunge.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/Image:Grossesschauspielhaus.jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:IM0G040002.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:RusEmb_-_Phnom_Penh.JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:0_Combretum_fruticosum_1,_Bs_As_(E_Haene).jpg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Hulluch_-_Cit%C3%A9_n%C2%B0_13_de_Lens_(02).JPG
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Homesteadcommercial.ogg
https://commons.wikimedia.org/wiki/Commons:Deletion_requests/File:Cartogram_of_countries_by_number_of_Nobel_Laureates.svg
https://commons.wikimedia.org/wiki/Commons:Upload_Wizard_feedback/Archive/2014/02 (https://commons.wikimedia.org/wiki/File:WM.Sollnerstr.41.jpg)
https://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_55 (https://commons.wikimedia.org/wiki/File:Temde.JPG)

Dzahn removed a subscriber: Dzahn.Jun 21 2018, 5:18 AM

https://commons.wikimedia.org/wiki/File:Chris_Benoit_in_the_Ring.jpg

7 previous revisions missing, maybe revdel worked like this in 2007?

Original of https://commons.wikimedia.org/wiki/File:ATX-Netzteil.jpg is missing too: File not found: /v1/AUTH_mw/wikipedia-commons-local-public.e9/e/e9/ATX-Netzteil.jpg

I've reuploaded it to https://commons.wikimedia.org/wiki/File:HEC_350W_ATX_power_supply.jpg because it was in use. Commons now reports it as duplicate. It remembers the hash, but the original file is gone.

The most suspicious development in this story is that it is Wikimedia users (some not even sysops on any wiki where a revision is missing) who are trying to trace the causes of the bug, whereas everyone with shell access to Wikimedia servers is silent. Are the files really missing from their directories, or only unreadable for some reason (permissions, file-system damage, I/O error…)? For files which are missing, how recently did they still exist? Which procedures (such as moving files from one storage to another) could potentially cause the data loss?

@Incnis_Mrsi: At T198177 I also found some missing revisions. But the things that take me hours could be done in seconds with shell access.

Tgr added a subscriber: Tgr.Nov 30 2018, 6:04 PM

But the things that take me hours could be done in seconds with shell access.

How do you think that would happen? If you have specific steps in mind, maybe we could expose those to power users. I can't think of any investigation that would be much more complicated without shell access though.

AlexisJazz added a comment.EditedNov 30 2018, 6:23 PM

But the things that take me hours could be done in seconds with shell access.

How do you think that would happen? If you have specific steps in mind, maybe we could expose those to power users. I can't think of any investigation that would be much more complicated without shell access though.

For T198177 I made hundreds of requests for filenames like https://upload.wikimedia.org/wikipedia/commons/archive/4/42/20180626085337%21WikipediaTomeRaider3.png using an educated guess for the approximate timestamp, then bruteforcing it.

Just typing "locate Clinton_power_station_1.jpg" and "locate ATX-Netzteil.jpg" would be way faster. (My revision finder didn't work for those anyway, but the files may still exist elsewhere. I just don't know where.)
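
The timestamp bruteforcing described above can be sketched as follows. This is an illustration rather than the actual revision finder: `archive_candidates` and its parameters are made-up names, the shard ("42" in the example) has to be known already, and the sketch only generates URLs without fetching them.

```python
from datetime import datetime, timedelta
from urllib.parse import quote

COMMONS_UPLOAD = "https://upload.wikimedia.org/wikipedia/commons"
STAMP_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp, as in the archive URLs


def archive_candidates(filename: str, shard: str, guess: str,
                       window_seconds: int,
                       base: str = COMMONS_UPLOAD) -> list[str]:
    """List archive URLs for every second within +/- window_seconds of guess."""
    center = datetime.strptime(guess, STAMP_FORMAT)
    encoded = quote(filename, safe="")
    urls = []
    for offset in range(-window_seconds, window_seconds + 1):
        stamp = (center + timedelta(seconds=offset)).strftime(STAMP_FORMAT)
        # "%21" is the URL-encoded "!" separating timestamp and file name.
        urls.append(f"{base}/archive/{shard[0]}/{shard}/{stamp}%21{encoded}")
    return urls


if __name__ == "__main__":
    for url in archive_candidates("WikipediaTomeRaider3.png", "42",
                                  "20180626085337", 2):
        print(url)
```

A window of a few minutes already means hundreds of requests per file, which is why a server-side lookup would be so much faster.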

Tgr added a comment.Nov 30 2018, 7:35 PM

Swift has filename prefix search but not substring search AFAIK.
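
A rough sketch of what a prefix-based lookup might look like, assuming the container layout seen in the 404 messages on this task. Since archived object names start with the timestamp, the prefix can only narrow the listing down to the shard directory; matching the file name has to happen on the client side. The endpoint `https://swift.example.org` is a placeholder, both function names are hypothetical, and the assumption that stored object names contain a literal "!" (rather than "%21") is unverified. The `prefix` and `format` query parameters are the standard Swift container-listing ones.

```python
from urllib.parse import urlencode


def listing_url(endpoint: str, container: str, prefix: str,
                account: str = "AUTH_mw") -> str:
    """Build a Swift container-listing URL restricted to a name prefix."""
    query = urlencode({"prefix": prefix, "format": "json"})
    return f"{endpoint}/v1/{account}/{container}?{query}"


def filter_by_name(object_names: list[str], filename: str) -> list[str]:
    """Keep archived objects whose name ends in '!<filename>' (assumed form)."""
    return [name for name in object_names if name.endswith("!" + filename)]


if __name__ == "__main__":
    print(listing_url("https://swift.example.org",  # placeholder endpoint
                      "wikipedia-commons-local-public.7d",
                      "archive/7/7d/"))
    # Hypothetical listing result, for illustration only.
    sample = [
        "archive/7/7d/20080122163443!Clinton_power_station_1.jpg",
        "archive/7/7d/20090101000000!Some_other_file.jpg",
    ]
    print(filter_by_name(sample, "Clinton_power_station_1.jpg"))
```

This only finds a file if it is still in the expected shard; a file that ended up in a different container would need one listing per container.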