
Some files had disappeared from Commons after renaming
Open, High · Public

Event Timeline

Wieralee created this task. Sep 8 2015, 6:14 PM
Wieralee raised the priority of this task from to Needs Triage.
Wieralee updated the task description.
Wieralee added a project: Commons.
Wieralee added a subscriber: Wieralee.
Restricted Application added subscribers: Steinsplitter, Aklapper. Sep 8 2015, 6:14 PM

I can only note that I'm seeing the same things in the console as I do with T111815: SVG files larger than 10 MB cannot be thumbnailed.

Restricted Application added a subscriber: Matanya. Sep 8 2015, 10:02 PM
Josve05a triaged this task as Unbreak Now! priority. Sep 8 2015, 10:09 PM

Triaging to Unbreak Now! since this is causing a lot of problems.

Question: Is it happening to all file moves or just some? Did something change recently in the code? Is this a regression?

Tgr added a subscriber: Tgr. Sep 8 2015, 11:17 PM

Probably some kind of Swift error, and the files need to be renamed manually. See T71311.

Steinsplitter moved this task from Incoming to Backlog on the Commons board. Sep 9 2015, 4:44 PM

This is a well known long term problem.

Wondering why this isn't fixed yet (after years and multiple reports)

It is difficult to explain. During renaming I don't see any problems. Everything is O.K.

But after some time, when I wanted to view one of those files, I noticed they had disappeared. Maybe the cache on my computer remembers the file, so I don't see the errors at once.

Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 10 2015, 1:17 AM
Jdforrester-WMF set Security to None.
Wieralee added a comment. Edited Sep 10 2015, 6:40 PM

The uploader of these files is demanding that we "give him his files back". I have found small versions of these photos on Google. I don't know whether I should upload them or wait.

These files are quite old. Do we have any Commons archive to get these files back?

What to do?

Ankry added a subscriber: Ankry. Sep 10 2015, 8:31 PM
Tgr added a comment. Sep 10 2015, 8:47 PM

Often the files still exist on disk, they just have a different name than in the DB so the system cannot find them.
Do you know which files the uploader is talking about?

I wrote it at the beginning...

"Some files had disappeared from Commons after renaming. Purging does not work.

File:Днестровский горсовет.jpg
File:Улица 25 Октября.jpg
File:Парк Победа-детский паровозик.jpg
File:Когенерационная установка ООО ТираспольТрансгаз.jpg
File:Станция Бендеры 1 (вид с пешеходного моста).jpg
File:А.С.Щерба 2 сентября 2011 года.jpg
File:Памятник Синёву В.Г. - Донор.jpg
File:Памятник первой электростанции в Молдавии - Донор.jpg
File:Бендеры 2 расписание поездов.jpg
File:В.И.Атаманюк во время парада.jpg

I don't know what happened..."

Aklapper lowered the priority of this task from Unbreak Now! to High. Sep 11 2015, 12:09 PM

Nobody answers...

Is there any back-up-copy of Commons files?

What should I do in this situation? Upload miniatures I have? Wait?

Nobody answers...

Well, there was an answer in T111838#1618666 I'd say.

Ankry added a comment. Sep 12 2015, 4:54 AM

Renaming the files back does not help, see File:Днестровский горсовет.jpg. They are still broken after that.

Another hint?

Well, there was an answer in T111838#1618666 I'd say.

I'm not a programmer. I don't understand what has been written there.

@Aklapper, the method described in T111838#1618666 does not work in this case, as I mentioned above.

Any other hints?

@Tgr, who is able to find where the missing files are and to retrieve them? Definitely only technical people, not a non-sysop wiki user. Moving the files back DOES NOT WORK in the case of these files. Any other hints?

Tgr added a comment. Sep 28 2015, 9:44 PM

@Tgr, who is able to find where the missing files are and to retrieve them? Definitely only technical people, not a non-sysop wiki user. Moving the files back DOES NOT WORK in the case of these files. Any other hints?

Requires shell access on the production cluster and some experimentation (Swift doesn't allow you to do anything other than query a specific filename, so somebody would have to guess what filename the original is stuck at, if it still exists at all). Is this urgent?

Ankry added a comment. Oct 5 2015, 6:09 AM

A service that causes image loss should be dropped, so knowing whether the images can be restored is URGENT.

Any timeline for the rest? A month? A year? A decade? We need this to tell the uploader when they can expect the files back.

I wouldn't expect the files back at all. Nobody who could do it seems to have time to investigate whether they still exist somewhere, much less where and how to restore them. Maybe we'll have them back one day, but nobody can really tell.

aaron added a subscriber: aaron. Oct 28 2015, 6:28 PM

It would be at either the old or the new name. If it's at either, then it can be accessed via the right URL (the tricky part is the /x/xy/ part of the URL, which is the first chars of md5( <filename> )).
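The /x/xy/ prefix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: that the hash is taken over the UTF-8 file name with spaces replaced by underscores, and that public Commons originals are served under https://upload.wikimedia.org/wikipedia/commons/. The names hashed_path and candidate_url are hypothetical helpers, not part of any existing tool.

```python
import hashlib
from urllib.parse import quote

# Assumed base URL for public Commons originals.
BASE_URL = "https://upload.wikimedia.org/wikipedia/commons"


def hashed_path(filename: str) -> str:
    """Return 'x/xy/<name>' for a file name given without the 'File:' prefix."""
    # Assumption: the name is already in canonical form (first letter
    # capitalised); only spaces are normalised to underscores here.
    key = filename.replace(" ", "_")
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}/{key}"


def candidate_url(filename: str) -> str:
    """Build the URL at which the original would be served if it exists."""
    # quote() percent-encodes non-ASCII characters as UTF-8 and keeps '/'.
    return f"{BASE_URL}/{quote(hashed_path(filename))}"
```

Calling candidate_url once with the old name and once with the new name gives the two URLs worth trying for a file that went missing during a rename.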

I used to have a script to find all orphaned files in Swift (e.g. ones with no DB entry). I can't seem to find it, but there should be one in any case. If it printed out the URLs I could post the listing here.

Change 249494 had a related patch set uploaded (by Aaron Schulz):
Add script to find orphaned LocalRepo files

https://gerrit.wikimedia.org/r/249494

Change 249494 merged by jenkins-bot:
Add script to find orphaned LocalRepo files

https://gerrit.wikimedia.org/r/249494

matmarex assigned this task to aaron. Nov 3 2015, 9:12 PM
matmarex removed a project: Patch-For-Review.

So, what can we do about this now with the shiny new script? :)

aaron added a comment. Nov 5 2015, 4:28 AM

Large 5.5Mb list of ~40K orphaned files in the "public" zone for all of Commons. Files in the "deleted" zone were not scanned.

Many of them seem to be from new uploads that never succeeded beyond the file creation (the DB/page part didn't happen). These could be filtered out by removing files with no upload/move/delete log entries. Some of the orphans are actually from renames/deletes.

Each file has two content lines: the file name and the canonical URL.
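A listing in that shape could be read back with something like the following. This is only a sketch: the exact layout of the listing is not shown in this task, so it assumes each entry is the file name on one line followed by the URL on the next, with blank lines ignored. parse_orphans is a hypothetical helper.

```python
from typing import Iterable, List, Tuple


def parse_orphans(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair up (file name, URL) from a listing with two content lines per file."""
    # Drop blank lines and surrounding whitespace before pairing.
    content = [ln.strip() for ln in lines if ln.strip()]
    if len(content) % 2 != 0:
        raise ValueError("expected an even number of content lines")
    return [(content[i], content[i + 1]) for i in range(0, len(content), 2)]
```

The resulting pairs could then be checked against the upload/move/delete logs to drop entries that come from uploads that never completed.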

Ankry added a comment. Nov 5 2015, 9:58 AM


Large 5.5Mb list of ~40K orphaned files in the "public" zone for all of Commons. Files in the "deleted" zone were not scanned.
Many of them seem to be from new uploads that never succeeded beyond the file creation (the DB/page part didn't happen). These could be filtered out by removing files with no upload/move/delete log entries. Some of the orphans are actually from renames/deletes.
Each file has two content lines: the file name and the canonical URL.

I have checked a few random files from the list, and they were all intentionally deleted files (some as copyvio, some as duplicates), so they should no longer be available to non-admin users.

However, the first "lost" file from the list at the top of this ticket does not seem to be available under any of its names (current and previous):

File:Днестровский горсовет.jpg
File:Днестровский городской совет народных депутатов donor.tiras.biz.jpg
File:Днестровский горсовет donor.tiras.biz.jpg

Note: I was looking for the initial revision of this file, with sha1=f153cf14cedf25550fe3128f7b829588332381f6, which can be found here:

https://commons.wikimedia.org/w/api.php?action=query&titles=File:%D0%94%D0%BD%D0%B5%D1%81%D1%82%D1%80%D0%BE%D0%B2%D1%81%D0%BA%D0%B8%D0%B9%20%D0%B3%D0%BE%D1%80%D1%81%D0%BE%D0%B2%D0%B5%D1%82.jpg&prop=imageinfo&iilimit=10&iiprop=sha1|user|size

{
  "batchcomplete": "",
  "query": {
      "pages": {
          "21009203": {
              "pageid": 21009203,
              "ns": 6,
              "title": "File:\u0414\u043d\u0435\u0441\u0442\u0440\u043e\u0432\u0441\u043a\u0438\u0439 \u0433\u043e\u0440\u0441\u043e\u0432\u0435\u0442.jpg",
              "imagerepository": "local",
              "imageinfo": [
                  {
                      "user": "Wieralee",
                      "size": 123561,
                      "width": 800,
                      "height": 531,
                      "sha1": "59cfe5604f8720c9bfed163a08ee07c934fa923a"
                  },
                  {
                      "user": "\u0414\u043e\u043d\u043e\u0440",
                      "size": 1075948,
                      "width": 1400,
                      "height": 930,
                      "sha1": "f153cf14cedf25550fe3128f7b829588332381f6"
                  }
              ]
          }
      }
  }
}
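For reference, picking the revision with a given sha1 out of a response like the one above can be sketched as follows. find_revision is a hypothetical helper, and the sample dictionary is trimmed from the response shown; nothing here contacts the API.

```python
from typing import Optional


def find_revision(response: dict, sha1: str) -> Optional[dict]:
    """Return the imageinfo entry whose sha1 matches, or None if absent."""
    for page in response.get("query", {}).get("pages", {}).values():
        for info in page.get("imageinfo", []):
            if info.get("sha1") == sha1:
                return info
    return None


# Trimmed copy of the response above, as a Python dict.
sample = {
    "query": {
        "pages": {
            "21009203": {
                "imageinfo": [
                    {"user": "Wieralee", "size": 123561,
                     "sha1": "59cfe5604f8720c9bfed163a08ee07c934fa923a"},
                    {"user": "Донор", "size": 1075948,
                     "sha1": "f153cf14cedf25550fe3128f7b829588332381f6"},
                ]
            }
        }
    }
}

original = find_revision(sample, "f153cf14cedf25550fe3128f7b829588332381f6")
print(original["user"], original["size"])
```

The size and sha1 of the matching entry are what a recovered file would have to match to count as the original upload.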

Any further hints are welcome.

jcrespo added a subscriber: jcrespo. Edited Nov 5 2015, 10:05 AM

Those 3 files and the ones in the description have a space character. Could this be related to T107676, in terms of automatic character substitution?

(sorry, I didn't read the full thread, ignore my comment).

Ankry added a comment. Nov 5 2015, 10:15 AM

Just for the record:

I have received information that on 8 October 2015 another file disappeared while being moved:

https://commons.wikimedia.org/wiki/Special:Undelete/File:Gen_Tellerkappe_AT.jpg

Deletion log

  (change visibility) 17:45, 8 October 2015 Denniss (talk | contribs | block) deleted page File:Gen Tellerkappe AT.jpg (File is corrupt, empty, or in a disallowed format) (view/restore) (global usage; delinker log)
  (change visibility) 17:44, 8 October 2015 Denniss (talk | contribs | block) deleted page File:Gen Tellerkappe AT.jpg (Temporary deletion for history cleaning or revision suppression) (view/restore) (global usage; delinker log)

Page history

  (change visibility) (diff) 17:45, 8 October 2015 . . Denniss (talk | contribs | block) m (561 bytes) (Denniss moved page File:Gen-GenLt OF9-8 Tellerkappe AT.jpg to File:Gen Tellerkappe AT.jpg)
  (change visibility) (diff) 17:43, 8 October 2015 . . Denniss (talk | contribs | block) m (53 bytes) (Updating redirect while processing File:Gen-GenLt OF9-8 Tellerkappe2 AT.jpg)
  (change visibility) (diff) 17:43, 8 October 2015 . . Denniss (talk | contribs | block) m (561 bytes) (Denniss moved page File:Gen-GenLt OF9-8 Tellerkappe2 AT.jpg to File:Gen-GenLt OF9-8 Tellerkappe AT.jpg over redirect)
  (change visibility) (diff) 17:31, 8 October 2015 . . Wieralee (talk | contribs | block) (54 bytes) (Updating redirect while processing File:Gen-GenLt OF9-8 Tellerkappe AT.jpg)
  (change visibility) (diff) 17:31, 8 October 2015 . . Wieralee (talk | contribs | block) (561 bytes) (Removing template; rename done)
  (change visibility) (diff) 17:31, 8 October 2015 . . Wieralee (talk | contribs | block) m (628 bytes) (Wieralee moved page File:Gen-GenLt OF9-8 Tellerkappe AT.jpg to File:Gen-GenLt OF9-8 Tellerkappe2 AT.jpg: File renaming criterion #3: To correct obvious errors in file names, including misspelled [[:en:Noun#Proper_nouns_and_co...)
  (change visibility) (diff) 17:18, 8 October 2015 . . HHubi (talk | contribs | block) (628 bytes) ((Script): Requesting renaming this file to File:Gen-GenLt OF9-8 Tellerkappe2 AT.jpg; Reason: ; Criterion 10)
  (change visibility) (diff) 17:14, 8 October 2015 . . Wieralee (talk | contribs | block) (561 bytes) (Removing template; rename done)
  (change visibility) (diff) 17:14, 8 October 2015 . . Wieralee (talk | contribs | block) (53 bytes) (Wieralee moved page File:Gen Tellerkappe AT.jpg to File:Gen-GenLt OF9-8 Tellerkappe AT.jpg: File renaming criterion #3: To correct obvious errors in file names, including misspelled [[:en:Noun#Proper_nouns_and_common_nouns|pr...)
  (change visibility) (diff) 17:14, 8 October 2015 . . Wieralee (talk | contribs | block) m (627 bytes) (Wieralee moved page File:Gen Tellerkappe AT.jpg to File:Gen-GenLt OF9-8 Tellerkappe AT.jpg: File renaming criterion #3: To correct obvious errors in file names, including misspelled [[:en:Noun#Proper_nouns_and_common_nouns|pr...)
  (change visibility) (diff) 17:14, 8 October 2015 . . HHubi (talk | contribs | block) (627 bytes) ((Script): Requesting renaming this file to File:Gen-GenLt OF9-8 Tellerkappe AT.jpg; Reason: ; Criterion 10)
  (change visibility) (diff) 17:14, 8 October 2015 . . Wieralee (talk | contribs | block) (45 bytes) (Wieralee moved page File:Gen Tellerkappe AT.jpg to File:Gen OF9-Tellerkappe AT.jpg: File renaming criterion #3: To correct obvious errors in file names, including misspelled [[:en:Noun#Proper_nouns_and_common_nouns|proper nou...)

and the admin operations visible above did not restore it.

The file was restored later, while attempting to upload exactly the same file as a new version. The upload failed with an error saying that this file already exists, after which the original file reappeared. It is now available as

https://commons.wikimedia.org/wiki/File:GenLt_OF8-Tellerkappe_AT.jpg

Unfortunately, we are not always able to find exactly the same files elsewhere. But maybe someone will find this information useful for constructing a procedure to restore "lost" images.

It seems that file moves are still dangerous operations...

I have checked a few random files from the list, and they were all intentionally deleted files (some as copyvio, some as duplicates), so they should no longer be available to non-admin users.

T109331?

aaron removed aaron as the assignee of this task. Nov 30 2015, 7:04 PM

Is this still valid?

matmarex updated the task description. Dec 17 2016, 1:29 PM

I'm not sure what you mean by "still valid". The files mentioned in the task description were almost all re-uploaded (except for https://commons.wikimedia.org/wiki/File:Бендеры_2_расписание_поездов.jpg). All of the original versions are still missing, though. And I don't see any signs of the underlying issue being determined and fixed.

Ankry added a comment. Edited Dec 17 2016, 1:35 PM

I think yes, as thumbnails of the initial file revisions are still not available.
But the priority can be lowered IMO, as almost all files seem to have been re-uploaded (I think not exactly the same files: different size, sha1).

aaron added a comment. Dec 17 2016, 2:33 PM

By "valid" I mean "new occurrences" (though at least one of them was earlier in the year). Of course, any missing files will likely not come back by themselves.

T153540 is an identical issue that was just filed today.

It still happens... Not every day, but maybe once a week... The files I listed were fortunately uploaded again by the original uploader, except this one:

https://commons.wikimedia.org/wiki/File:%D0%91%D0%B5%D0%BD%D0%B4%D0%B5%D1%80%D1%8B_2_%D1%80%D0%B0%D1%81%D0%BF%D0%B8%D1%81%D0%B0%D0%BD%D0%B8%D0%B5_%D0%BF%D0%BE%D0%B5%D0%B7%D0%B4%D0%BE%D0%B2.jpg

Aklapper added a subscriber: Liuxinyu970226.

@Liuxinyu970226: This was mentioned as an example why the Community Wishlist item "Commons backup" was proposed. This task is not an item on the Community Wishlist.