FileOperation error "SwiftFileBackend::addMissingMetadata: {path} was not stored with SHA-1 metadata."
Closed, Resolved · Public · PRODUCTION ERROR

Description

Error

Request ID: W5mZegrAIFcAAFZSjTwAAAAD

normalized message
SwiftFileBackend::addMissingMetadata: {path} was not stored with SHA-1 metadata.
example
SwiftFileBackend::addMissingMetadata: mwstore://local-swift-codfw/local-thumb/0/05/Wes_Craven_2010.jpg/140px-Wes_Craven_2010.jpg was not stored with SHA-1 metadata.

channel: FileOperation
level: ERROR

http_method: GET
wiki: commons.wikimedia.org
url: /w/thumb.php?f=Wes_Craven_2010.jpg&w=140

Notes

Past reports that might be related:

Event Timeline

Restricted Application added a subscriber: Aklapper.
Gilles added subscribers: aaron, Gilles.

This probably happens for thumbnails generated with thumbor and then accessed via thumb.php. What's the point of that custom x-object-meta-sha1base36 header for thumbnails, though? The code doesn't say. It could be something that's only really useful for originals and thumbnails just happen to get the same treatment.

@aaron can probably shed some light on this when he's back from his vacation. Assigning the task to him to provide some context on that header and whether it's necessary for MediaWiki to bother populating it for thumbnails (in which case thumbor should generate the same thing).

It's used for originals. I don't think it matters much for thumbnails, but it's hard to cleanly tell that to SwiftFileBackend. It seems like it might be easiest to have thumbor hash the local file and save the metadata in the PUT request to avoid these errors (and the slowness of triggering a GET in order to POST the missing data).
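For illustration, here is a minimal sketch of what "hash the local file and save the metadata in the PUT request" could look like on the thumbnail generator's side. It is not thumbor's actual code. The header name x-object-meta-sha1base36 comes from this task; the base-36 encoding padded to 31 characters is an assumption based on MediaWiki's usual SHA-1 storage format, and the function names are hypothetical.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1_base36(data: bytes, width: int = 31) -> str:
    """Return the SHA-1 of data as a zero-padded base-36 string.

    Assumption: 31 characters is the padded width, since 36**31 > 2**160.
    """
    n = int.from_bytes(hashlib.sha1(data).digest(), "big")
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(BASE36_DIGITS[rem])
    return "".join(reversed(digits)).rjust(width, "0")


def swift_put_headers(data: bytes) -> dict:
    """Build the extra metadata header to send along with the object PUT (hypothetical helper)."""
    return {"x-object-meta-sha1base36": sha1_base36(data)}
```

Sending the header with the initial PUT would mean the later GET-then-POST backfill is never triggered for that object.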

Well, that's a lot of storage space to waste saving useless metadata on thumbnail objects, imho. If it's not needed, we shouldn't store it. Surely we can add an option to SwiftFileBackend or something for the thumb case. I can look into it, I just wanted confirmation that this was only useful for originals.

Is it that much space? If you add an option, you have to have getFileStat return some dummy value for the SHA1 and also not have that mess up the logic in doOperations(), which is why it seemed easier to just include the header.

If you go that route, then something like this might work: have getFileSha1() and the sha1 field be null for certain containers, and have doOperations and friends pass a flag to getFileStat/Sha1 that keeps the current behavior of lazy-loading and never returning null.
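A toy model of that idea, as a sketch only: the class, method, and flag names below are hypothetical and this is not SwiftFileBackend's code. Containers listed as SHA-1-optional return None when the metadata is missing, while callers that really need the hash pass a flag to get the existing lazy-load behavior.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class StoredObject:
    data: bytes
    sha1: Optional[str] = None  # metadata as stored; None means "missing"


class SketchBackend:
    """In-memory stand-in for a file backend (hypothetical, for illustration)."""

    def __init__(self, sha1_optional_containers: Set[str]):
        self.sha1_optional_containers = sha1_optional_containers
        self.objects: Dict[Tuple[str, str], StoredObject] = {}
        self.backfills = 0  # counts how often the expensive path runs

    def put(self, container: str, name: str, data: bytes,
            sha1: Optional[str] = None) -> None:
        self.objects[(container, name)] = StoredObject(data, sha1)

    def get_file_sha1(self, container: str, name: str,
                      require_sha1: bool = False) -> Optional[str]:
        obj = self.objects[(container, name)]
        if obj.sha1 is not None:
            return obj.sha1
        if container in self.sha1_optional_containers and not require_sha1:
            # Hash not needed here: report null instead of backfilling.
            return None
        # Current behavior: read the object, hash it, store the metadata.
        self.backfills += 1
        obj.sha1 = hashlib.sha1(obj.data).hexdigest()
        return obj.sha1
```

With "local-thumb" marked as optional, a plain lookup on a thumbnail returns None and performs no backfill, while a call with require_sha1=True still computes and stores the hash.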

Header name + data is 55 bytes long. We have 1.2 billion thumbnails in Swift. That's about 66 GB of data, which represents 0.02% of our Swift storage space. Not earth-shattering savings, I'll grant you that, but I think we should get rid of data we don't need. It does affect speed a little as well when fetching thumbnails from Swift.
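The arithmetic behind that estimate can be checked with a few lines. The 31-character value length is an assumption (the padded base-36 SHA-1 width); it is consistent with the 55-byte figure, since the header name itself is 24 characters.

```python
HEADER_NAME = "x-object-meta-sha1base36"
VALUE_LEN = 31  # assumed length of the base-36 SHA-1 value

bytes_per_object = len(HEADER_NAME) + VALUE_LEN  # 24 + 31
thumbnail_count = 1_200_000_000  # figure quoted in the comment

total_bytes = bytes_per_object * thumbnail_count
total_gb = total_bytes / 10**9   # decimal gigabytes
total_gib = total_bytes / 2**30  # binary gibibytes

print(f"{bytes_per_object} bytes per object")
print(f"{total_gb:.0f} GB, or {total_gib:.1f} GiB")
```

This counts only the header name and value, not any per-header overhead Swift may add.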

Change 472608 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] filebackend: avoiding computing file SHA-1 hashes unless needed

https://gerrit.wikimedia.org/r/472608

Gilles triaged this task as Low priority.

Change 472608 merged by jenkins-bot:
[mediawiki/core@master] filebackend: avoiding computing file SHA-1 hashes unless needed

https://gerrit.wikimedia.org/r/472608

mmodell changed the subtype of this task from "Task" to "Production Error". · Aug 28 2019, 11:09 PM