File cache is not updated correctly after move/delete/etc, causing files to disappear until purged
Closed, Resolved · Public · 3 Estimated Story Points

Description

File existence status is not updated in the cache after an operation which changes it for a given filename (e.g. move, delete), which causes weird errors (files do not disappear but become broken on delete; the file is displayed on the redirect page but not displayed on the new page after move). Purging the file fixes it.

original description:

During a move, the file is not properly moved to the new filename; only the description page entries are.
The problem appeared today (2015-03-17) at about 17:57 UTC. It concerns, for example, all entries from:

https://commons.wikimedia.org/wiki/Special:Log/Wieralee

moved between 2015-03-17 18:57 UTC and 2015-03-17 19:13 UTC

Moving the image back, or deleting and restoring it, does not recover the file entry (only description page entries are available for restoring).

Event Timeline

Ankry raised the priority of this task from to Needs Triage.
Ankry updated the task description.
Ankry added a subscriber: Ankry.
Restricted Application added a subscriber: Aklapper.

All of these files come from [https://commons.wikimedia.org/wiki/User:OgreBot/Uploads_by_new_users/2014_October_22_12:00#MPlauche_.2850_edits.29 these uploads].

They were a part of this gallery: [https://commons.wikimedia.org/wiki/User:MPlauche/gallery].

I was moving them one by one, always waiting until the previous move had completed before starting the next.

But some of these files disappeared.

I tried renaming another file as a check.

And it's the same: look at these two files:
https://commons.wikimedia.org/wiki/File:Glendenning_%288442865569%29.jpg
https://commons.wikimedia.org/wiki/File:Adam_Clendening_AHL_All-Star_Classic_2013.jpg

And at my edits:
https://commons.wikimedia.org/wiki/Special:Contributions/Wieralee

There is NO report of ANY problem. Everything looks ok, as always. But the rename failed and split the file: the description content is in the new place, the photo is in the old place.

Looks quite similar to the issue in T42927

Tgr triaged this task as High priority. Mar 17 2015, 9:27 PM
Tgr added a subscriber: Tgr.

Did this affect every move in a certain time period? Does it still affect moves?

I tried to move another file 90 minutes later, with the same result.

I can't move files.

At 22:07 UTC (https://commons.wikimedia.org/wiki/File:Tomb_of_Agnes_and_Clementia_of_Durazzo_-_Santa_Chiara_-_Naples_-_Italy_2015.jpg) the problem still exists
Not observed before 18:57 UTC
Not all file moves affected, but most of them

Tgr raised the priority of this task from High to Unbreak Now!. Mar 17 2015, 11:22 PM

At the moment, this occurs with every deletion on Commons until the page is manually purged.

Nothing obviously related in exception.log, fatal.log, dberror.log or hhvm.log around 22:07 (although I might have missed it, the logs are full of OOM and other generic errors).

Tgr lowered the priority of this task from Unbreak Now! to High. Mar 18 2015, 1:20 AM

Looking at https://commons.wikimedia.org/wiki/Special:Log/Wieralee the redirect pages still try to display the image (but can't). The target pages think there is no image by such name (but a purge fixes them). So it seems the cache tracking file existence is poisoned.

Lowering priority as affected pages can be fixed by purging.

Deletion is obviously the same issue: the file still believes that it exists.

Based on the timing this was caused by the switch to 1.25wmf21.

My very uncertain guess is that the loadFromDB call in LocalFile::purgeMetadataCache needs a File::READ_LATEST flag.

I'll try to revert each of these 3 suspect commits locally to see if we can narrow it down to a single one.

Gilles edited a custom field.

Ugh, can't reproduce locally on 1.25wmf21

You would need multiple databases, possibly even some sort of artificial replication lag. (Created T93047 about such a vagrant role.)

I think the easiest way to test a patch is to just do it live on testwiki.

This is a long-term problem that has existed for years. I can't find the other reports (phabricator search function...)

@Steinsplitter, no, it's a recurring symptom of various different problems.

Please see my comment above: this problem (with the missing files) has been known for a while. Sometimes a lot of files are affected, sometimes only a few.

See the last reports @ swift

Gilles added subscribers: Denniss, yuvipanda, Matanya.

@Tgr had already identified a few comments ago that these issues are the same, hence the task merge

@Steinsplitter: this is a cache issue. All previous bugs I am aware of messed up the database or file name permanently.

Tgr renamed this task from Files disappear after move to File cache is not updated correctly after move/delete/etc, causing files to disappear until purged. Mar 18 2015, 5:41 PM
Tgr updated the task description.
In T93009#1127359, @Tgr wrote:

My very uncertain guess is that the loadFromDB call in LocalFile::purgeMetadataCache needs a File::READ_LATEST flag.

Verified on testwiki.
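
To illustrate why reading from a lagging replica can poison the cache, here is a minimal sketch in Python. It is a toy model, not MediaWiki code: the `Database` class, `purge_metadata_cache`, `delete_file` and the `latest` parameter are all hypothetical stand-ins for the behaviour discussed above (the `latest` flag plays the role that File::READ_LATEST is guessed to play).

```python
# Toy model of the suspected failure mode. Nothing here is MediaWiki code;
# all names are hypothetical and chosen only for illustration.

class Database:
    """A primary plus one replica that only catches up when replicate() runs."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, name, exists):
        # Writes always go to the primary.
        self.primary[name] = exists

    def replicate(self):
        # Simulates replication lag ending: the replica copies the primary.
        self.replica = dict(self.primary)

    def read(self, name, latest=False):
        # latest=True reads the primary; otherwise the (possibly stale) replica.
        source = self.primary if latest else self.replica
        return source.get(name, False)


def purge_metadata_cache(db, cache, name, latest):
    """Drop the cached existence flag and refill it from the database."""
    cache.pop(name, None)
    cache[name] = db.read(name, latest=latest)


def delete_file(db, cache, name, latest):
    db.write(name, False)
    # The cache refill happens before the replica has caught up.
    purge_metadata_cache(db, cache, name, latest)


def run(latest):
    db = Database()
    db.write("Example.jpg", True)
    db.replicate()
    cache = {"Example.jpg": True}
    delete_file(db, cache, "Example.jpg", latest)
    db.replicate()  # the replica catches up, but the cache is already refilled
    return cache["Example.jpg"]


print(run(latest=False))  # True: the cache still claims the deleted file exists
print(run(latest=True))   # False: refilling from the primary gives the right answer
```

In this model the stale entry survives until something refills the cache again, which matches the observation that a manual purge fixes affected pages.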

Change 197684 had a related patch set uploaded (by Gergő Tisza):
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197684

I'll SWAT this this evening PDT, but even so the bug is going to affect Wikipedias as well for a few hours. As mentioned in the bug description, the fix is to just purge the file.

Change 197684 merged by jenkins-bot:
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197684

Change 197708 had a related patch set uploaded (by Gergő Tisza):
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197708

Change 197709 had a related patch set uploaded (by Gergő Tisza):
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197709

Note that the patch won't fix cache issues retroactively - anything that was moved/deleted before the patch was deployed will still be cached incorrectly until it's purged or the cache expires (7 days for normal files, 1 day for deleted files). If you think that's a problem and we need a manual fix, please speak up.
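
As a rough worked example of the expiry times mentioned above, the following Python sketch computes when a stale entry would age out on its own. The two TTL values come from the comment above; the function name and the assumption that the TTL depends on the state recorded in the cache entry (rather than the file's real state) are mine, and the thread does not spell this out.

```python
from datetime import datetime, timedelta, timezone

# TTLs quoted in the thread.
TTL_NORMAL = timedelta(days=7)
TTL_DELETED = timedelta(days=1)


def expires_at(cached_at, cached_as_deleted):
    """Return when a cache entry written at cached_at would expire.

    Assumption: the TTL is chosen by what the cache entry says about the
    file, not by the file's actual state in the database.
    """
    ttl = TTL_DELETED if cached_as_deleted else TTL_NORMAL
    return cached_at + ttl


# Hypothetical entry cached as "exists" at 2015-03-17 18:57 UTC.
cached_at = datetime(2015, 3, 17, 18, 57, tzinfo=timezone.utc)
print(expires_at(cached_at, cached_as_deleted=False).isoformat())
# 2015-03-24T18:57:00+00:00
```

Under that assumption, a deleted file that is wrongly cached as existing would stay broken for up to a week unless it is purged.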

How can I purge it myself? A null edit doesn't fix the problem. I'm looking at files moved by me and by other people - they are still broken...

You can add ?action=purge to the URL or use the UTCLiveClock gadget.

Thanks.

Purging fixes the file at the new name/location, but the old one is still incorrect. It should only be a redirect there -- but we get the same photo with a description and a redirect... So it duplicates the files.

The old one also needs to be purged.
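
For anyone with many affected files, here is a small Python sketch that builds the purge URL for a page by adding action=purge, as suggested above. The helper name `purge_url` is hypothetical; the code only constructs URLs and does not send any request. The two example URLs are the file pages linked earlier in this task.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def purge_url(page_url):
    """Return page_url with action=purge set in its query string."""
    parts = urlsplit(page_url)
    # Keep any existing parameters except a previous "action".
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "action"
    ]
    query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# After a move, both names need purging.
pages = [
    "https://commons.wikimedia.org/wiki/File:Glendenning_%288442865569%29.jpg",
    "https://commons.wikimedia.org/wiki/File:Adam_Clendening_AHL_All-Star_Classic_2013.jpg",
]
for page in pages:
    print(purge_url(page))
```

Opening each printed URL in a browser should have the same effect as appending ?action=purge by hand.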

Change 197709 merged by jenkins-bot:
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197709

Change 197708 merged by jenkins-bot:
Force LocalFile::purgeMetadataCache use the master DB

https://gerrit.wikimedia.org/r/197708

Backported. Verified on Commons that deletion works properly now.