
GWToolset gives error: An unknown error occurred in storage backend "local-swift-codfw"
Closed, Resolved · Public


Reported by @85jesse on the glamtools mailing list:

While trying to use the GlamWiki Toolset:

[70ed1b6b] 2016-03-09 15:15:49: Fatal exception of type "MWException"

Event Timeline

JeanFred created this task. Mar 9 2016, 3:45 PM
Restricted Application added projects: Multimedia, Commons. Mar 9 2016, 3:45 PM
Restricted Application added subscribers: Steinsplitter, Aklapper.

In the metadata detection field I uploaded the xml file. I've tried two xml files: one that previously worked and a new one. After hitting 'submit' both didn't go through and gave the same error message (as mentioned above).

Restricted Application added a subscriber: Matanya. Mar 9 2016, 3:51 PM
Bawolff added a subscriber: Bawolff. Mar 9 2016, 4:12 PM

Error is: <p>Please contact a developer. This issue must be addressed before you can continue. Please add the following text to your report: </p><p>GWToolset\Helpers\GWTFileBackend::saveFile: An unknown error occurred in storage backend "local-swift-codfw". </p>

So something to do with recent data center testing?

Bawolff renamed this task from "GWToolset throws MWException" to "GWToolset gives error: An unknown error occurred in storage backend "local-swift-codfw"". Mar 9 2016, 4:14 PM
Bawolff added a project: media-storage.

I'm doing my job today as a Wikimedian In Residence. I don't often have access to the server which generates the XML, and I've come today to the office from where I have access specially to do some uploads, although I'll have access tomorrow, so for me this is an urgent problem.

I have also tried two different XML files, a new one and an earlier one that has worked before. The error I get each time on uploading the XML is: [0453e274] 2016-03-10 15:00:35: Fatal exception of type "MWException". Thanks in advance for fixing this.

greg added a subscriber: greg. Mar 10 2016, 4:23 PM

From logstash: 70ed1b6b on mw1214 at 2016-03-09T15:15:49.000Z

Bawolff added a subscriber: aaron. Mar 10 2016, 6:02 PM

Which seems to actually point to the following error:

HTTP 404 (Not Found) in 'SwiftFileBackend::doStoreInternal' (given '{"async":false,"op":"store","src":"/tmp/74fkIy","dst":"mwstore://local-swift-codfw/gwtoolset-metadata/85jesse/3/c/4/3c4824f0616b71641e9fa44b15805f7a.xml","overwrite":true}')

I wonder if this is something silly, like someone forgot to create a container for gwtoolset-metadata in codfw, and enabling replication in 5fa77bd751998 broke it.

(ping @aaron )

aaron added a comment. Mar 10 2016, 6:08 PM

A PUT to a non-existing container gives 404s, though code should call prepare() before doOperations() to avoid this problem (as other callers do). It's like making a file without making the directory first.
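The directory analogy can be illustrated with a toy in-memory object store. This is only a sketch in Python, not MediaWiki or Swift code: the `ToySwift` class, its methods, and the file name `example.xml` are hypothetical, and the only behaviour taken from the comment above is that a PUT into a missing container returns 404 while a PUT after the container has been prepared succeeds.

```python
class ToySwift:
    """Hypothetical in-memory stand-in for an object store with containers."""

    def __init__(self):
        self.containers = {}

    def prepare(self, container):
        # Idempotent: preparing an existing container leaves it untouched.
        self.containers.setdefault(container, {})
        return 201

    def put(self, container, name, data):
        # Like writing a file into a directory that was never created.
        if container not in self.containers:
            return 404
        self.containers[container][name] = data
        return 201


store = ToySwift()
status_without_prepare = store.put("gwtoolset-metadata", "example.xml", b"<xml/>")
store.prepare("gwtoolset-metadata")
status_after_prepare = store.put("gwtoolset-metadata", "example.xml", b"<xml/>")
print(status_without_prepare, status_after_prepare)  # 404 201
```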

GWToolset does seem to make such a call prior to doQuickStore():

return $this->FileBackend->prepare( array(
                'dir' => $this->getMWStoreFileDirectory(),
                'noAccess' => true,
                'noListing' => true
) );

Mentioned in SAL [2016-03-10T18:46:23Z] <godog> sync wikipedia-commons-gwtoolset-metadata with swiftrepl eqiad -> codfw T129359

Indeed, that container isn't on the usual list of containers that we synced; I'm syncing it now with swiftrepl.

Hmm, so local-swift-codfw has async writes. Thus ->prepare() should be delayed until the end of the request. But quickStore() is a store op, so it would not be deferred, and quickStore() will always happen first. So quickStore() fails since prepare() hasn't run yet, which causes GWToolset to throw an exception. But that would only explain the issue happening once: even with the exception, the deferred update should still run (I think).
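The ordering hypothesised in this comment can be sketched with a toy model. It is an illustration in Python under stated assumptions, not the MediaWiki implementation: `ToyBackend`, `upload`, and `end_of_request` are hypothetical names, and the model simply assumes that prepare is queued until the end of the request while the store runs immediately. Under those assumptions the first upload fails and later ones succeed, which matches the remark that this would explain only a single failure.

```python
class ToyBackend:
    """Hypothetical backend whose prepare() is deferred, as the comment supposes."""

    def __init__(self):
        self.containers = set()
        self.deferred = []

    def prepare(self, container):
        # Assumption: on an async-write backend this is queued, not run now.
        self.deferred.append(lambda: self.containers.add(container))

    def quick_store(self, container):
        # A store op runs immediately and needs the container to exist.
        return 201 if container in self.containers else 404

    def end_of_request(self):
        # Deferred updates still run, even if the request itself failed.
        for job in self.deferred:
            job()
        self.deferred.clear()


def upload(backend, container="gwtoolset-metadata"):
    backend.prepare(container)
    status = backend.quick_store(container)
    backend.end_of_request()
    return status


backend = ToyBackend()
first_status = upload(backend)
second_status = upload(backend)
print(first_status, second_status)  # 404 201
```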

Bawolff closed this task as Resolved. Mar 14 2016, 3:06 PM
Bawolff claimed this task.

Users report it's fixed on the glamtools mailing list.

Restricted Application added a subscriber: Poyekhali. May 10 2016, 8:47 PM
mmodell changed the subtype of this task from "Task" to "Production Error". Wed, Aug 28, 11:11 PM