
File cache is not updated correctly after move/delete/etc, causing files to disappear until purged
Closed, Resolved · Public · 3 Story Points

Description

File existence status is not updated in the cache after an operation which changes it for a given filename (e.g. move, delete), which causes weird errors (files do not disappear but become broken on delete; the file is displayed on the redirect page but not displayed on the new page after move). Purging the file fixes it.

original description:

During a move, the file is not properly moved to the new filename; only the description page entries are.
The problem appeared today (2015-03-17) at about 17:57 UTC. It concerns, for example, all entries from:

https://commons.wikimedia.org/wiki/Special:Log/Wieralee

moved between 2015-03-17 18:57 UTC and 2015-03-17 19:13 UTC

Moving the image back, or deleting and restoring it, does not recover the file entry (only description page entries are available for restoring).

Event Timeline

Ankry created this task.Mar 17 2015, 8:42 PM
Ankry raised the priority of this task from to Needs Triage.
Ankry updated the task description.
Ankry added a subscriber: Ankry.
Restricted Application added a project: Multimedia.Mar 17 2015, 8:42 PM
Restricted Application added a subscriber: Aklapper.
Didym added a subscriber: Didym.Mar 17 2015, 8:43 PM
Wieralee added a comment.EditedMar 17 2015, 8:54 PM

All of these files come from these uploads: https://commons.wikimedia.org/wiki/User:OgreBot/Uploads_by_new_users/2014_October_22_12:00#MPlauche_.2850_edits.29

They were part of this gallery: https://commons.wikimedia.org/wiki/User:MPlauche/gallery

I was moving them one by one, always waiting until each move had been done before starting the following one.

But some of these files had disappeared.

I tried renaming another file as a check.

And it's the same: look at these two files:
https://commons.wikimedia.org/wiki/File:Glendenning_%288442865569%29.jpg
https://commons.wikimedia.org/wiki/File:Adam_Clendening_AHL_All-Star_Classic_2013.jpg

And at my edits:
https://commons.wikimedia.org/wiki/Special:Contributions/Wieralee

There is NO report of ANY problem. Everything looks fine, as always. But the rename failed and split the file: the description content is in the new place, while the photo is in the old place.

Didym added a comment.Mar 17 2015, 9:10 PM

Looks quite similar to the issue in T42927

Tgr triaged this task as High priority.Mar 17 2015, 9:27 PM
Tgr added a subscriber: Tgr.

Did this affect every move in a certain time period? Does it still affect moves?

I tried to move another file 90 minutes later, with the same result.

I can't move files.

At 22:07 UTC (https://commons.wikimedia.org/wiki/File:Tomb_of_Agnes_and_Clementia_of_Durazzo_-_Santa_Chiara_-_Naples_-_Italy_2015.jpg) the problem still exists
Not observed before 18:57 UTC
Not all file moves affected, but most of them

Tgr raised the priority of this task from High to Unbreak Now!.Mar 17 2015, 11:22 PM

At the moment, this occurs with every deletion on Commons until the page is manually purged.

Tgr added a comment.Mar 18 2015, 1:15 AM

Nothing obviously related in exception.log, fatal.log, dberror.log or hhvm.log around 22:07 (although I might have missed it, the logs are full of OOM and other generic errors).

Tgr lowered the priority of this task from Unbreak Now! to High.Mar 18 2015, 1:20 AM

Looking at https://commons.wikimedia.org/wiki/Special:Log/Wieralee the redirect pages still try to display the image (but can't). The target pages think there is no image by such name (but a purge fixes them). So it seems the cache tracking file existence is poisoned.

Lowering priority as affected pages can be fixed by purging.

Tgr added a comment.Mar 18 2015, 1:21 AM

Deletion is obviously the same issue: the file still believes that it exists.

Tgr added a comment.Mar 18 2015, 1:24 AM

Based on the timing this was caused by the switch to 1.25wmf21.

Tgr added a comment.Mar 18 2015, 2:52 AM

My very uncertain guess is that the loadFromDB call in LocalFile::purgeMetadataCache needs a File::READ_LATEST flag.
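To make the guess concrete, here is a minimal toy model, not MediaWiki code. It assumes the cache is re-populated right after the write, and that a default read goes to a replica that has not yet caught up. `ToyDatabase`, `ToyFileCache`, `move` and the `latest` parameter are hypothetical names; `latest=True` only stands in for the idea behind `File::READ_LATEST`.

```python
class ToyDatabase:
    """Master plus one replica that only catches up when replicate() is called."""

    def __init__(self):
        self.master = {}
        self.replica = {}

    def write(self, name, exists):
        self.master[name] = exists  # the replica does not see this yet

    def replicate(self):
        self.replica = dict(self.master)

    def read(self, name, latest=False):
        # latest=True reads the master; the default reads the (possibly stale) replica.
        source = self.master if latest else self.replica
        return source.get(name, False)


class ToyFileCache:
    """Caches the 'does this file exist' flag per file name."""

    def __init__(self, db):
        self.db = db
        self.entries = {}

    def purge_metadata(self, name, latest):
        # Re-populate the cached flag immediately after a write.
        self.entries[name] = self.db.read(name, latest=latest)

    def exists(self, name):
        if name not in self.entries:
            self.entries[name] = self.db.read(name)
        return self.entries[name]


def move(db, cache, old, new, latest):
    db.write(old, False)
    db.write(new, True)
    cache.purge_metadata(old, latest)
    cache.purge_metadata(new, latest)


def run(latest):
    db = ToyDatabase()
    db.write("Old.jpg", True)
    db.replicate()
    cache = ToyFileCache(db)
    move(db, cache, "Old.jpg", "New.jpg", latest)
    db.replicate()  # the replica catches up, but the cache was already filled
    return cache.exists("Old.jpg"), cache.exists("New.jpg")
```

In this model `run(False)` leaves the old name cached as existing and the new name as missing, which matches the symptoms reported above, while `run(True)` caches the correct state.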

Gilles added a subscriber: Gilles.Mar 18 2015, 7:40 AM

I'll try to revert each of these 3 suspect commits locally to see if we can narrow it down to a single one.

Gilles set Security to None.Mar 18 2015, 7:41 AM
Gilles edited a custom field.

Ugh, can't reproduce locally on 1.25wmf21

Gilles added a subscriber: aaron.Mar 18 2015, 7:46 AM
Tgr added a comment.Mar 18 2015, 8:19 AM

You would need multiple databases, possibly even some sort of artificial replication lag. (Created T93047 about such a vagrant role.)
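A rough sketch of why a single-database setup would not show the bug, assuming the cause really is a stale replica read. `LaggedReplica` and its logical clock are made up for illustration and do not describe how MediaWiki or MySQL replication is implemented.

```python
from collections import deque


class LaggedReplica:
    """Replica that applies shipped writes only after `lag_ticks` logical ticks."""

    def __init__(self, lag_ticks):
        self.lag_ticks = lag_ticks
        self.now = 0
        self.pending = deque()  # (apply_at, key, value), in arrival order
        self.rows = {}

    def enqueue(self, key, value):
        self.pending.append((self.now + self.lag_ticks, key, value))
        self._apply_due()

    def tick(self, n=1):
        self.now += n
        self._apply_due()

    def _apply_due(self):
        while self.pending and self.pending[0][0] <= self.now:
            _, key, value = self.pending.popleft()
            self.rows[key] = value

    def get(self, key, default=None):
        return self.rows.get(key, default)


def read_after_write_is_stale(lag_ticks):
    replica = LaggedReplica(lag_ticks)
    replica.rows["File:A.jpg"] = True     # the file exists before the delete
    replica.enqueue("File:A.jpg", False)  # the delete is shipped to the replica
    return replica.get("File:A.jpg") is True  # does an immediate read see the old row?
```

With `lag_ticks=0` (effectively a single database) the immediate read is already correct, so the bug cannot be reproduced; with any positive lag the read is stale until enough ticks have passed.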

I think the easiest way to test a patch is to just do it live on testwiki.

This is a long-term problem that has existed for years. I can't find the other reports (phabricator search function...)

TheDJ added a subscriber: TheDJ.Mar 18 2015, 10:54 AM

@Steinsplitter, no, it's a recurring symptom of various different problems.

Please see my comment above: this problem (with the missing files) has been known for a while. Sometimes many files are affected, sometimes only a few.

See the last reports @ swift

Gilles added subscribers: Denniss, yuvipanda, Matanya.

@Tgr had already identified a few comments ago that these issues are the same, hence the task merge

Tgr added a comment.Mar 18 2015, 3:39 PM

@Steinsplitter: this is a cache issue. All previous bugs I am aware of messed up the database or file name permanently.

greg added a subscriber: greg.Mar 18 2015, 5:36 PM
Tgr renamed this task from Files disappear after move to File cache is not updated correctly after move/delete/etc, causing files to disappear until purged.Mar 18 2015, 5:41 PM
Tgr updated the task description.
Tgr added a comment.Mar 18 2015, 5:56 PM
In T93009#1127359, @Tgr wrote:

My very uncertain guess is that the loadFromDB call in LocalFile::purgeMetadataCache needs a File::READ_LATEST flag.

Verified on testwiki.

Change 197684 had a related patch set uploaded (by Gergő Tisza):
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197684

Tgr added a comment.Mar 18 2015, 6:20 PM

I'll SWAT this in this evening's PDT window, but even so the bug is going to affect Wikipedias as well for a few hours. As mentioned in the bug description, the fix is to just purge the file.

Change 197684 merged by jenkins-bot:
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197684

Change 197708 had a related patch set uploaded (by Gergő Tisza):
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197708

Change 197709 had a related patch set uploaded (by Gergő Tisza):
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197709

Tgr added a comment.Mar 18 2015, 7:21 PM

Note that the patch won't fix cache issues retroactively - anything that was moved/deleted before the patch was deployed will still be cached incorrectly until it's purged or the cache expires (7 days for normal files, 1 day for deleted files). If you think that's a problem and we need a manual fix, please speak up.
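As a small worked example of the expiry times mentioned above, the sketch below computes when a stale entry would drop out of the cache on its own. The 7-day and 1-day values come from the comment; the function name, the `kind` labels and the example timestamp are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

# TTLs as stated in the comment above; the dictionary keys are hypothetical labels.
CACHE_TTL = {
    "normal": timedelta(days=7),
    "deleted": timedelta(days=1),
}


def stale_until(cached_at, kind):
    """Return the time at which an entry cached at `cached_at` expires."""
    return cached_at + CACHE_TTL[kind]


# Hypothetical example: an entry cached during the affected window on 2015-03-17.
example = datetime(2015, 3, 17, 19, 0, tzinfo=timezone.utc)
normal_expiry = stale_until(example, "normal")
deleted_expiry = stale_until(example, "deleted")
```

Under these assumptions an unpurged normal entry cached at 19:00 UTC on 2015-03-17 would stay wrong until 19:00 UTC on 2015-03-24, which is why manual purging is the practical workaround.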

How can I purge it myself? A null edit doesn't fix the problem. I'm looking at files moved by me and by other people, and they are still broken...

Man77 added a subscriber: Man77.Mar 18 2015, 7:37 PM
Didym added a comment.Mar 18 2015, 7:41 PM

How can I purge it myself? A null edit doesn't fix the problem. I'm looking at files moved by me and by other people, and they are still broken...

You can add ?action=purge to the URL or use the UTCLiveClock gadget.

Thanks.

Purging fixes the file at the new name, but the old one is still incorrect. There should only be a redirect there, but we get the same photo with a description and a redirect, so the file is duplicated.

Didym added a comment.Mar 18 2015, 8:44 PM

Purging fixes the file at the new name, but the old one is still incorrect. There should only be a redirect there, but we get the same photo with a description and a redirect, so the file is duplicated.

The old one also needs to be purged.
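For anyone fixing many moved files, the helper below sketches how the purge URLs for both the old and the new name could be built by appending `action=purge`, as suggested above. The function names are hypothetical and the example URL is a placeholder; the code only constructs the URLs and does not send any request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def purge_url(page_url):
    """Return `page_url` with action=purge set in its query string."""
    parts = urlsplit(page_url)
    # Drop any existing action parameter so the URL carries exactly one.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "action"]
    query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def purge_urls_for_move(old_url, new_url):
    """Both the redirect left at the old name and the new page need purging."""
    return [purge_url(old_url), purge_url(new_url)]
```

Opening each returned URL in a browser while logged in should trigger the purge for that page.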

Tgr claimed this task.Mar 18 2015, 9:41 PM
Tgr moved this task from Backlog to Needs backport on the Multimedia-Sprint-2015-03-11 board.

Change 197709 merged by jenkins-bot:
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197709

Change 197708 merged by jenkins-bot:
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197708

Tgr added a comment.Mar 19 2015, 12:17 AM

Backported. Verified on Commons that deletion works properly now.

Gilles closed this task as Resolved.Mar 25 2015, 3:11 PM