Inconsistent state within the internal storage backends
Closed, Resolved (Public)

bzimport added a project: Wikimedia-Media-storage.Via ConduitNov 22 2014, 1:08 AM
bzimport set Reference to bz39221.
Yann created this task.Via LegacyAug 10 2012, 11:17 AM
RobLa-WMF added a comment.Via ConduitAug 10 2012, 6:32 PM

Aaron is working on this one.

aaron added a comment.Via ConduitAug 10 2012, 7:41 PM

Some of these files don't have previous versions. What were you reverting to for those?

aaron added a comment.Via ConduitAug 10 2012, 8:03 PM

Begbroke_Church_-_geograph.org.uk_-_1386361.jpg and St_Tudno%27s_Church_from_the_Lych_Gate_-_geograph.org.uk_-_1419145.jpg had NFS files with bad permissions. They were fixed.

Yann added a comment.Via ConduitAug 11 2012, 5:40 AM

I was reverting those which have a previous version.

Matanya added a comment.Via ConduitAug 11 2012, 8:16 PM

*** Bug 39260 has been marked as a duplicate of this bug. ***

aaron added a comment.Via ConduitAug 12 2012, 11:32 PM

(In reply to comment #3)

Begbroke_Church_-_geograph.org.uk_-_1386361.jpg and
St_Tudno%27s_Church_from_the_Lych_Gate_-_geograph.org.uk_-_1419145.jpg had NFS
files with bad permissions. They were fixed.

Error messages are gone.

aaron added a comment.Via ConduitAug 13 2012, 12:07 AM

All of these *_geograph.org.uk_* files have wrong UNIX permissions and are owned by "catrope" instead of "apache".

Fastily added a comment.Via ConduitAug 13 2012, 5:28 AM

So, sudo chown apache *_geograph.org.uk_* ??

Catrope added a comment.Via ConduitAug 13 2012, 10:28 PM

(In reply to comment #10)

So, sudo chown apache *_geograph.org.uk_* ??

I'm currently running a find to list all files in the public upload directories (i.e. excluding private wikis and excluding thumbnails, but including archived versions) that aren't owned by the right user. This'll take a few more hours to run, so I'll do a chown based on that list tomorrow.
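
The listing step described above (walk the upload tree, record everything not owned by the expected user) can be sketched as follows. This is a minimal illustration and not the script that was actually run: the function name `files_not_owned_by` is hypothetical, and unlike `find ! -user`, this sketch reports only non-directory entries.

```python
import os


def files_not_owned_by(root, expected_uid):
    """Return sorted paths of non-directory entries under root whose owner
    uid differs from expected_uid (hypothetical helper, for illustration)."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat reports a symlink itself rather than its target
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

Writing the result to a file first, as described in the comment, lets the later ownership fix work from a fixed list instead of re-walking the tree.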

Catrope added a comment.Via ConduitAug 14 2012, 7:54 PM

(In reply to comment #11)

(In reply to comment #10)
> So, sudo chown apache *_geograph.org.uk_* ??
I'm currently running a find to list all files in the public upload directories
(i.e. excluding private wikis and excluding thumbnails, but including archived
versions) that aren't owned by the right user. This'll take a few more hours to
run, so I'll do a chown based on that list tomorrow.

This chown has now been running for an hour:
18:52 RoanKattouw: Running /root/fixownership < /root/badownershipfiles in a screen on ms7

It'll take a while to finish: it's done 152,000 files in the first hour and there are 1.8 million files to fix, so it'll probably take another 11-12 hours.
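
As a rough check of that estimate, the figures quoted above can be plugged into a constant-rate calculation. The constant rate is an assumption of this sketch (real throughput on a busy file server will vary), and `remaining_hours` is a hypothetical helper name.

```python
def remaining_hours(total_files, done_files, elapsed_hours):
    """Estimate hours left, assuming the rate observed so far stays constant."""
    rate = done_files / elapsed_hours  # files per hour
    return (total_files - done_files) / rate


# Figures from the comment: 152,000 of 1.8 million files done in the first hour.
estimate = remaining_hours(1_800_000, 152_000, 1.0)
print(f"{estimate:.1f} hours remaining")
```

Under that assumption the result is a little under 11 hours, broadly in line with the figure given in the comment.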

Fastily added a comment.Via ConduitAug 15 2012, 9:13 PM

Thanks for doing that Roan! I may have found a similar bug affecting undeletions. When I tried undeleting http://commons.wikimedia.org/w/index.php?title=File:Afrikaner_Commandos2.JPG , I received the following error: http://commons.wikimedia.org/wiki/File:FileInconsistentStateCommonsError20120815.png

Will this also be fixed by the mass chown?

Catrope added a comment.Via ConduitAug 15 2012, 9:30 PM

(In reply to comment #12)

This chown has now been running for an hour:

18:52 RoanKattouw: Running /root/fixownership < /root/badownershipfiles in a

screen on ms7

It'll take a while to finish: it's done 152,000 files in the first hour and
there are 1.8 million files to fix, so it'll probably take another 11-12 hours.

This is now done, but there are still some bad files left. Running another find to fix those:

21:19 RoanKattouw: find /export/upload/wik*/*/{0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f,archive,math,temp,timeline} ! -user apache -exec chown apache \{\} \;

Catrope added a comment.Via ConduitAug 15 2012, 9:31 PM

(In reply to comment #13)

Thanks for doing that Roan! I may have found a similar bug affecting
undeletions. When I tried undeleting
http://commons.wikimedia.org/w/index.php?title=File:Afrikaner_Commandos2.JPG ,
I received the following error:
http://commons.wikimedia.org/wiki/File:FileInconsistentStateCommonsError20120815.png

Will this also be fixed by the mass chown?

I *think* so, but I'm not sure. Aaron, could you look into that specific file and see what's wrong with it?

aaron added a comment.Via ConduitAug 15 2012, 11:55 PM

$name = "local-public/archive/6/69/20110119133921!Afrikaner_Commandos2.JPG";

...

var_dump( $nfs->getFileStat( array( 'src' => "mwstore://local-NFS/$name" ) ) );

bool(false)

var_dump( $swift->getFileStat( array( 'src' => "mwstore://local-swift/$name" ) ) );

array(4) {

["mtime"]=>
string(14) "20120721184844"
["size"]=>
int(123148)
["sha1"]=>
string(31) "lox3gcuif55humxbiq3exjhcu6dg1wt"
["latest"]=>
bool(true)

}

So the file is in Swift but not NFS.
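
The comparison done by hand above can be sketched generically. This is only an illustration of the idea and not MediaWiki's own consistency check: `classify` is a hypothetical helper, and each stat result is assumed to be either `False` (file missing) or a dict shaped like the `getFileStat` output shown above.

```python
def classify(stat_nfs, stat_swift):
    """Compare two stat results (False or a dict with 'size' and 'sha1')."""
    if not stat_nfs and not stat_swift:
        return "absent-everywhere"
    if not stat_nfs:
        return "missing-in-nfs"
    if not stat_swift:
        return "missing-in-swift"
    # Both copies exist: treat them as consistent only if size and hash agree.
    same = all(stat_nfs.get(k) == stat_swift.get(k) for k in ("size", "sha1"))
    return "consistent" if same else "content-differs"


# Values copied from the var_dump output above.
swift_stat = {
    "mtime": "20120721184844",
    "size": 123148,
    "sha1": "lox3gcuif55humxbiq3exjhcu6dg1wt",
    "latest": True,
}
print(classify(False, swift_stat))  # prints "missing-in-nfs"
```

The modification time is deliberately left out of the comparison in this sketch, since two backends can hold identical content written at different times.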

aaron added a comment.Via ConduitAug 16 2012, 12:40 AM

I've manually resynced Afrikaner_Commandos2.JPG.

Catrope added a comment.Via ConduitAug 16 2012, 4:54 PM

(In reply to comment #14)

This is now done, but there are still some bad files left. Running another find
to fix those:

21:19 RoanKattouw: find
/export/upload/wik*/*/{0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f,archive,math,temp,timeline}
! -user apache -exec chown apache \{\} \;

This is now done

aaron added a comment.Via ConduitAug 16 2012, 9:21 PM

Any other issues coming up?

bzimport added a comment.Via ConduitAug 17 2012, 3:49 AM

material.scientist wrote:

Please have a look at http://commons.wikimedia.org/wiki/File:D%C4%9B%C4%8D%C3%ADn,_Dlouh%C3%A1_j%C3%ADzda,_pivovarsk%C3%BD_kom%C3%ADn.jpg - this file can't be moved because of: API request failed (unknownerror): Unknown error: "backend-fail-synced". There were more files with the same error a few days ago, but I suppose they became unlocked.

jeremyb added a comment.Via ConduitAug 19 2012, 4:30 PM

Another example in bug 39483.

Yann added a comment.Via ConduitAug 22 2012, 4:59 PM
Nemo_bis added a comment.Via ConduitAug 23 2012, 8:43 AM

(In reply to comment #23)

Again here: http://commons.wikimedia.org/wiki/File:Chordeiles_Gundlachii-.jpg
File can't be deleted.

This is different, it gives a 404 error (completely missing): different bug?
Another example (by esby): http://commons.wikimedia.org/wiki/File:FMLF_2012-4.JPG

McZusatz added a comment.Via ConduitAug 23 2012, 10:41 AM

(In reply to comment #25)

This is different

I also think this is different. Is there already a bug report open?

Nemo_bis added a comment.Via ConduitAug 23 2012, 10:43 AM

(In reply to comment #26)

(In reply to comment #25)
> This is different

I also think this is different. Is there already a bug report open?

I searched various components (and moved a bunch of bugs under file management/media storage) but I couldn't find any. I'm not really able to distinguish issues though...

Yann added a comment.Via ConduitAug 23 2012, 11:00 AM

(In reply to comment #23)

Again here: http://commons.wikimedia.org/wiki/File:Chordeiles_Gundlachii-.jpg
File can't be deleted.

OK now. File deleted.

McZusatz added a comment.Via ConduitAug 24 2012, 11:46 AM

(In reply to comment #27)

(In reply to comment #26)
> (In reply to comment #25)
> > This is different
>
> I also think this is different. Is there already a bug report open?

I searched various components (and moved a bunch of bugs under file
management/media storage) but I couldn't find any. I'm not really able to
distinguish issues though...

opened bug #39615

TheDJ added a comment.Via ConduitAug 28 2012, 9:20 PM

Error deleting file: The file "mwstore://local-multiwrite/local-public/6/60/Miley_Cyrus_interpreta_Miley_Stewart.jpg" is in an inconsistent state within the internal storage backends

aaron added a comment.Via ConduitAug 30 2012, 4:09 AM

(In reply to comment #32)

http://commons.wikimedia.org/w/index.php?title=Special:Undelete&action=submit

This bug is now a real big problem :(

That link doesn't go to any specific file.

bzimport added a comment.Via ConduitAug 30 2012, 3:22 PM

Turelio001 wrote:

http://commons.wikimedia.org/wiki/File:Miley_Cyrus_interpreta_Miley_Stewart.jpg

Guys, this is really a highly critical bug that needs urgent fixing. The workload of the deleting admins on Commons is already unbearable, it doesn't need to be artificially increased by software bugs.

RobLa-WMF added a comment.Via ConduitAug 31 2012, 6:43 PM

Bumping down priority. Aaron has another issue which is higher priority than this one.

bzimport added a comment.Via ConduitSep 1 2012, 2:13 AM

svenmanguard wrote:

Roan told me to report this here.

I tried undeleting File:Rajeev kumar varshney.jpg over on Commons (per OTRS ticket) and got the message

Error undeleting page

Undelete failed; someone else may have undeleted the page first.

Undeletion will not be performed if it will result in the top page or file revision being partially deleted. In such cases, you must uncheck or unhide the newest deleted revision.

Error undeleting file: The file "mwstore://local-multiwrite/local-public/1/10/Rajeev_kumar_varshney.jpg" is in an inconsistent state within the internal storage backends

First line is larger than the rest, fourth line is in red text.

Sven

Yann added a comment.Via ConduitSep 2 2012, 12:39 PM

https://commons.wikimedia.org/wiki/Commons:Bistro#Message_d.27erreur_-_upload
With the WLM going, this needs to be fixed now. Thanks.

McZusatz added a comment.Via ConduitSep 2 2012, 9:13 PM

(In reply to comment #42)
The file is also in an "inconsistent state" if you try to reupload the file. So this is related?

Yann added a comment.Via ConduitSep 3 2012, 5:06 AM

(In reply to comment #42)

(In reply to comment #40)
> https://commons.wikimedia.org/wiki/File:Akong_Rinpoche_and_a_monk.jpg
> Error while creating thumbnails:
> https://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Akong_Rinpoche_and_a_monk.jpg/1280px-Akong_Rinpoche_and_a_monk.jpg

That's very unlikely to be related.

It is exactly the same error message, and it also concerns a media file.
The evidence shows that it is probably related to this bug.

bzimport added a comment.Via ConduitSep 3 2012, 6:38 PM

putevod wrote:

Just to point out that today I had a problem renaming a file which had a previous version. The problem is described [http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard#File:.D0.A6.D0.B5.D1.80.D0.BA.D0.BE.D0.B2.D1.8C_5.JPG here], the file is Церковь5.JPG . I ran into a similar problem earlier today, but now I do not remember the file name (it started with Tha and I wanted to rename it to The).

aaron added a comment.Via ConduitSep 3 2012, 8:09 PM

(In reply to comment #44)

(In reply to comment #42)
> (In reply to comment #40)
> > https://commons.wikimedia.org/wiki/File:Akong_Rinpoche_and_a_monk.jpg
> > Error while creating thumbnails:
> > https://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Akong_Rinpoche_and_a_monk.jpg/1280px-Akong_Rinpoche_and_a_monk.jpg
>
> That's very unlikely to be related.

It is exactly the same error message, and it also concerns a media file.
The evidence shows that it is probably related to this bug.

Probably, since https://upload.wikimedia.org/wikipedia/commons/4/42/Akong_Rinpoche_and_a_monk.jpg loads but http://upload.wikimedia.org/wikipedia/commons/4/42/Akong_Rinpoche_and_a_monk.jpg is a 404.

When I last looked at them I specifically tried purging first to clear the squids, and then viewed the (https) url that was given here. The file loaded fine, which strongly suggested that it existed and the thumbnail errors were unrelated. However, since the regular http url is a 404 (and the https one with ?x=1 added doesn't load either) that means that file really doesn't exist and that the https squid cache purging is also broken (I still can't get that https link to purge).

aaron added a comment.Via ConduitSep 4 2012, 1:43 AM

*** Bug 39952 has been marked as a duplicate of this bug. ***

aaron added a comment.Via ConduitSep 4 2012, 5:11 PM

I completed another commons run of syncFileBackend.php, which should have eliminated all such inconsistencies (except for new ones of course).

Rillke added a comment.Via ConduitSep 5 2012, 4:35 PM

My bot recently uploaded http://commons.wikimedia.org/w/index.php?title=File:Snowcreekwater3web_small_-_West_Virginia_-_ForestWander.jpg

I was unable to delete this file: https://commons.wikimedia.org/w/index.php?title=File:Snowcreekwater3web_small_-_West_Virginia_-_ForestWander.jpg&action=delete
-->

Error deleting file: The file "mwstore://local-multiwrite/local-public/3/36/Snowcreekwater3web_small_-_West_Virginia_-_ForestWander.jpg" is in an inconsistent state within the internal storage backends.

McZusatz added a comment.Via ConduitSep 5 2012, 6:55 PM

(In reply to comment #48)

I completed another commons run of syncFileBackend.php, which should have
eliminated all such inconsistencies (except for new ones of course).

It seems https://commons.wikimedia.org/wiki/File:EnwikipediagrowthGom.PNG was also not cleared by this run. (It is an old file)

liangent added a comment.Via ConduitSep 6 2012, 4:08 AM

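(12:02:00 PM) liangent: Error undeleting file: The file "mwstore://local-multiwrite/local-public/9/9c/Tencent_QQ.png" is in an inconsistent state within the internal storage backends
(12:02:00 PM) liangent: Error undeleting file: The file "mwstore://local-multiwrite/local-deleted/p/h/o/phoiac9h4m842xq45sp7s6u21eteeq1.jpg" is in an inconsistent state within the internal storage backends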

Two more. The second is on [[zh:File:Monferno.jpg]].

matmarex added a comment.Via ConduitSep 9 2012, 9:58 PM

Another file from today: https://commons.wikimedia.org/wiki/File:Bierzgłowo_Windmill.jpg

Margoz said he has gotten this error. He was able to upload a new (working) version of the file. Fun fact: Rotatebot claims to have rotated the image, but he's not in the file history. Fun fact 2: the missing image was (according to the uploader) the same, but rotated by 90 degrees; when visiting the page earlier, I got the thumbnail from that one first, then one refresh later the current one (landscape orientation).

McZusatz added a comment.Via ConduitSep 10 2012, 8:15 AM

(In reply to comment #48)

I completed another commons run of syncFileBackend.php, which should have
eliminated all such inconsistencies (except for new ones of course).

Still another old instance: https://commons.wikimedia.org/wiki/File:Geslau_St._Kilian_018.jpg

Rillke added a comment.Via ConduitSep 10 2012, 10:30 AM

I don't know whether this is related:
https://commons.wikimedia.org/wiki/File:OAB_Neuenbuerg_Ansicht1.png when requesting the full size ( https://upload.wikimedia.org/wikipedia/commons/5/5a/OAB_Neuenbuerg_Ansicht1.png ), I get "404 Not Found" ( File not found: /v1/AUTH_xxxx1b15-ed7a-40b6-b745-47666abf8dxx/wikipedia-commons-local-public.5a/5/5a/OAB_Neuenbuerg_Ansicht1.png )

McZusatz added a comment.Via ConduitSep 10 2012, 11:26 AM

(In reply to comment #57)
I have added the appropriate cat. (see: bug #39615 )

Trijnstel added a comment.Via ConduitSep 10 2012, 12:18 PM

One more which is probably related to this bug:

https://commons.wikimedia.org/wiki/File:Amnapreet_Sokhi_in_India.jpg

Eloquence added a comment.Via ConduitSep 11 2012, 5:02 AM

I'm also encountering this when trying to rename:

https://commons.wikimedia.org/wiki/File:Metrics_9.4.12.theora.ogv

(This was uploaded from the cluster, probably hume, by Roan.)

McZusatz added a comment.Via ConduitSep 12 2012, 3:00 PM

again: 404-bug and inconsistent-bug
https://commons.wikimedia.org/wiki/File:Mahabharata02ramauoft_0022_19.jpg

Maybe it would be helpful to create a category where to put the files. Otherwise the comment section will grow unnaturally big.

MZMcBride added a comment.Via ConduitSep 12 2012, 9:28 PM

(In reply to comment #63)

Maybe it would be helpful to create a category where to put the files.
Otherwise the comment section will grow unnaturally big.

I guess you could create a category called "Category:Bug 39221" or something, but this is kind of absurd. Uploads need to be completely disabled if this is causing data corruption until the problem can be properly addressed.

aaron added a comment.Via ConduitSep 12 2012, 10:27 PM

https://gerrit.wikimedia.org/r/#/c/22494 has been deployed and merged.

Eloquence added a comment.Via ConduitSep 12 2012, 10:53 PM

I'm getting a new error now when trying to rename this file:
https://commons.wikimedia.org/wiki/File:Metrics_9.4.12.theora.ogv

You do not have permission to move this page, for the following reason:
Could not write file "mwstore://local-multiwrite/local-public/e/ec/Metrics_9.4.12.theora.ogv" due to insufficient permissions or missing directories/containers.

Catrope added a comment.Via ConduitSep 12 2012, 11:08 PM

(In reply to comment #66)

I'm getting a new error now when trying to rename this file:
https://commons.wikimedia.org/wiki/File:Metrics_9.4.12.theora.ogv

You do not have permission to move this page, for the following reason:
Could not write file
"mwstore://local-multiwrite/local-public/e/ec/Metrics_9.4.12.theora.ogv" due to
insufficient permissions or missing directories/containers.

That's because I screwed up when importing that file. Fixed the ownership just now.

Eloquence added a comment.Via ConduitSep 12 2012, 11:33 PM

That did the trick, thanks.

aaron added a comment.Via ConduitSep 13 2012, 3:04 AM

These should basically be gone now, aside from permission errors, which require root to fix and for which scripts have been run.

That ogv file was recently manually added by an admin with the wrong permissions (which is rare). dcec58e could also be activated even for that case.

Nemo_bis added a comment.Via ConduitSep 13 2012, 10:16 AM

This bug cannot be at the same time resolved and blocker of bug 39615: reopening for now, please close if it's not the cause.

aaron added a comment.Via ConduitSep 14 2012, 2:07 AM

(In reply to comment #70)

This bug cannot be at the same time resolved and blocker of bug 39615:
reopening for now, please close if it's not the cause.

I don't see how that makes sense. This bug is about people not being able to do
things due to sync errors. Any other problems, like 404s can be separate (which
39615 is already there for).

Nemo_bis added a comment.Via ConduitSep 14 2012, 2:17 AM

Thank you for fixing dependencies.

Billinghurst added a comment.Via ConduitOct 1 2012, 11:53 AM

Still some weird happenings

File:St Clement Eastcheap - sword rest - front, close-up - 1394055.duplicate.jpg

View or restore 11 deleted edits?

https://commons.wikimedia.org/wiki/Special:Undelete/File:St_Clement_Eastcheap_-_sword_rest_-_front,_close-up_-_1394055.duplicate.jpg

File says that it is deleted, yet it still shows as an active link, and shows an image. <shrug>

aaron added a comment.Via ConduitOct 1 2012, 7:12 PM

Please open a separate bug for that (but also try using the "delete all" link first).

Aklapper added a comment.Via ConduitDec 17 2012, 4:07 PM

*** Bug 37213 has been marked as a duplicate of this bug. ***

Tgr added a comment.Via ConduitApr 30 2013, 9:48 PM

Same error when trying to delete [[hu:Fájl:A Qulto felépítése - 003.jpg]] (originally uploaded 3 days ago):

Error deleting file: The file "mwstore://local-multiwrite/local-public/c/c8/A_Qulto_felépítése_-_003.jpg" is in an inconsistent state within the internal storage backends

MZMcBride added a comment.Via ConduitMay 1 2013, 1:22 AM

(In reply to comment #76)

Same error when trying to delete [[hu:Fájl:A Qulto felépítése - 003.jpg]]
(originally uploaded 3 days ago):

Error deleting file: The file
"mwstore://local-multiwrite/local-public/c/c8/A_Qulto_felépítése_-_003.jpg"
is in an inconsistent state within the internal storage backends

Given that it's been a few months, I've split this recent reported issue out into a separate bug: bug 47905. This bug should remain resolved/fixed, as it was fixed in September 2012 (cf. comment 69).

Gilles added a project: Multimedia.Via WebDec 4 2014, 9:26 AM
Gilles moved this task to Closed on the Multimedia workboard.Via WebDec 4 2014, 10:10 AM
