Uploads sometimes are incomplete: file is there, but image page is missing
OpenPublic

Description

We've had at the Commons recently several cases where uploads were not done completely. The images showed up, the file history showed even the full upload summary, but the corresponding image description pages were not created. As a result, if the user used the license dropdown to specify the license, that license was lost completely (because it doesn't show up in the upload summary).

An example is http://commons.wikimedia.org/wiki/Image:Palencia_-_San_Pablo_15.JPG the image page was created manually on the day after the upload, see the page history at http://commons.wikimedia.org/w/index.php?title=Image:Palencia_-_San_Pablo_15.JPG&action=history . The uploader was notified and did add the license after that, so all is well for this image now. But there are other such images where the licenses were lost, and where the uploaders did not react.

I've observed this behavior twice so far, once on July 31 from about 10:30 to 11:50 UTC, and now on August 31 from 17:19 to 17:39 UTC.

See also http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_10#Server_hiccups and http://commons.wikimedia.org/w/index.php?title=Commons:Administrators%27_noticeboard/Attention&oldid=14027516#Problems_with_upload


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=61898
https://bugzilla.wikimedia.org/show_bug.cgi?id=32551

bzimport added a project: MediaWiki-Uploading.Via ConduitNov 21 2014, 10:16 PM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz15430.
Lupo created this task.Via LegacySep 1 2008, 8:31 PM
Catrope added a comment.Via ConduitSep 1 2008, 8:33 PM

Assigning to Bryan as he rewrote the image handling code recently.

bzimport added a comment.Via ConduitSep 1 2008, 8:40 PM

Bryan.TongMinh wrote:

The new code is not active yet on Wikimedia.

Catrope added a comment.Via ConduitSep 1 2008, 8:42 PM

(In reply to comment #2)

The new code is not active yet on Wikimedia.

Hmm. In that case, assigned to you because you know a lot about that part of the code ;)

Seriously, though, it would probably be worth investigating/trying whether your rewrite fixes this.

Lupo added a comment.Via ConduitSep 1 2008, 8:43 PM

Here's another example where it's even more evident that there was no image
page:
http://commons.wikimedia.org/wiki/Image:Basketball_29082008666_(5).jpg

The only edit to the image page was my nominating it for deletion:
http://commons.wikimedia.org/w/index.php?title=Image:Basketball_29082008666_(5).jpg&action=history

Platonides added a comment.Via ConduitSep 1 2008, 8:50 PM

http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_10#Server_hiccups mentions "a bunch of wiki has a problem messages around that time"
I think this may be the same problem reported on #wikimedia-tech, with some apache trying to write to the old master (now read only) on external storage.
I recommend adding some check to the upload code for the case where the description can't be saved.

IRC reports were as follows (UTC time):
16:15:56 <[X]> The last attempted database query was: (SQL query hidden) from within function "ExternalStoreDB::store". MySQL returned error "1290: The MySQL server is running with the --read-only option so it cannot execute this statement (10.0.2.102)".
17:18:30 <Mike_lifeguard> ...within function "ExternalStoreDB::store". MySQL returned error "1290: The MySQL server is running with the --read-only option so it cannot execute this statement (10.0.2.107)".

bzimport added a comment.Via ConduitSep 1 2008, 8:55 PM

Bryan.TongMinh wrote:

Apparently LocalFile::recordUpload2 continues unconditionally after Article::doEdit without any error checking whatsoever. Should be fixed.

Catrope added a comment.Via ConduitSep 2 2008, 12:22 PM

(In reply to comment #5)

http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_10#Server_hiccups
mentions "a bunch of wiki has a problem messages around that time"
I think this may be the same problem reported on #wikimedia-tech, with some
apache trying to write to the old master (now read only) on external storage.
I recommend adding some check to the upload code for the case where the
description can't be saved.

IRC reports were as follows (UTC time):
16:15:56 <[X]> The last attempted database query was: (SQL query hidden) from
within function "ExternalStoreDB::store". MySQL returned error "1290: The MySQL
server is running with the --read-only option so it cannot execute this
statement (10.0.2.102)".
17:18:30 <Mike_lifeguard> ...within function "ExternalStoreDB::store". MySQL
returned error "1290: The MySQL server is running with the --read-only option
so it cannot execute this statement (10.0.2.107)".

Related to bug 15391?

bzimport added a comment.Via ConduitSep 2 2008, 3:26 PM

mike.lifeguard+bugs wrote:

(In reply to comment #5)

IRC reports were as follows (UTC time):
17:18:30 <Mike_lifeguard> ...within function "ExternalStoreDB::store". MySQL
returned error "1290: The MySQL server is running with the --read-only option
so it cannot execute this statement (10.0.2.107)".

That one was attempting to save either [[m:Talk:Spam blacklist]] or [[m:Spam blacklist]] (forget which one) not uploading an image. Not sure whether that's relevant or not though.

aaron added a comment.Via ConduitSep 5 2008, 3:48 AM

(In reply to comment #6)

Apparently LocalFile::recordUpload2 continues unconditionally after
Article::doEdit without any error checking whatsoever. Should be fixed.

Yeah, looks like bad transaction wrapping there.

aaron added a comment.Via ConduitSep 5 2008, 4:12 AM

Still, LB should rollback the last transaction if a db failure comes up.

Lupo added a comment.Via ConduitNov 6 2008, 2:26 PM

Two more of these, uploaded 2008-09-12:

http://commons.wikimedia.org/wiki/Image:Fibersensing.JPG

http://commons.wikimedia.org/wiki/Image:Audolici.JPG

Any progress on this issue? Since we do lose licensing information when this occurs, this is a pretty serious failure, even if I haven't seen it anymore recently. Has this been fixed or not?

bzimport added a comment.Via ConduitNov 6 2008, 11:00 PM

Bryan.TongMinh wrote:

Patch

This patch checks the result of Article::doEdit.

I am not sure whether I did the transaction stuff correctly. Somebody review.

attachment patch.txt ignored as obsolete

bzimport added a comment.Via ConduitDec 26 2008, 7:42 PM

Bryan.TongMinh wrote:

Fixed in r45058.

brion added a comment.Via ConduitDec 31 2008, 5:25 PM

Reverted in r45230

Article::doEdit() opens and commits transactions of its own, so the transaction we started earlier will already be committed and cannot be rolled back at this point unless one of two particular failure modes hits (ArticleSave hook aborts or mLatest isn't set). A more general fix needs to be devised...

bzimport added a comment.Via ConduitJan 3 2009, 3:36 PM

Bryan.TongMinh wrote:

Updated patch

Patch that swaps the order around: first the article is editted and if that succeeded the upload is recorded.

Can't test it myself atm because my PHP uploading is broken.

attachment patch.txt ignored as obsolete

brion added a comment.Via ConduitJan 4 2009, 3:47 AM

This seems like it would be more likely to break, and in a less recoverable fashion.

Current order of operations appears to be:

  1. Physically move in the file
  2. Save 'image'/'oldimage' record
  3. Save log entry
  4. Add/update description page

Failing at the page save at least should leave us with a working file record, history, and log update, unless we hit a rare case where the actual image record update gets rolled back [if the log and page saves don't start any transactions, but then fail].

Order of operations for the patch is:

  1. Physically move in the file
  2. Add/update description page
  3. Save 'image'/'oldimage' record
  4. Save log entry

If we fail at 2) now, we have a file sitting around with no record. If this is an update to an existing file, we have a broken history and an untraceable, unexplained, unlogged change -- very bad.

Either we need to accept that failing the page is ok, or we need to be able to roll back the change to the actual file -- or else do it after, and be able to roll back all our DB and page updates if the file move fails. Icky. :(

bzimport added a comment.Via ConduitJan 4 2009, 10:01 PM

Bryan.TongMinh wrote:

(In reply to comment #16)

Order of operations for the patch is:

  1. Physically move in the file
  2. Add/update description page
  3. Save 'image'/'oldimage' record
  4. Save log entry

    If we fail at 2) now, we have a file sitting around with no record. If this is an update to an existing file, we have a broken history and an untraceable, unexplained, unlogged change -- very bad.

In case this is an existing file, we never reach Article::doEdit, so it can never fail at that.

Either we need to accept that failing the page is ok, or we need to be able to
roll back the change to the actual file -- or else do it after, and be able to
roll back all our DB and page updates if the file move fails. Icky. :(

What I think is that we should accept that LocalFile::recordUpload2 does not always succeed. In case it doesn't it should be guaranteed that it did not make any changes and leave it to the caller to clean up the file.

The current situation is that in case it fails, it may or may not leave a log entry depending on what caused the failure. This seems to me less desirable than a clear guarantee that no log entry has been left.

brion added a comment.Via ConduitJan 5 2009, 6:59 PM

(In reply to comment #17)

In case this is an existing file, we never reach Article::doEdit, so it can
never fail at that.

Mmm, true. That should simplify things a bit at least. :)

> Either we need to accept that failing the page is ok, or we need to be able to
> roll back the change to the actual file -- or else do it after, and be able to
> roll back all our DB and page updates if the file move fails. Icky. :(
>
What I think is that we should accept that LocalFile::recordUpload2 does not
always succeed. In case it doesn't it should be guaranteed that it did not make
any changes and leave it to the caller to clean up the file.

The current situation is that in case it fails, it may or may not leave a log
entry depending on what caused the failure. This seems to me less desirable
than a clear guarantee that no log entry has been left.

We might want to consider what happens if various parts fail. Eg, if the edit succeeds, but the log entry fails, then we have to delete the page, the image record, and the file?

Lupo added a comment.Via ConduitFeb 9 2009, 9:58 PM
Lupo added a comment.Via ConduitJun 10 2009, 5:17 PM

And again:

http://commons.wikimedia.org/wiki/File:Rubik%27s_Cube_Beginner%27s_Method_-_after_step_0.svg

uploaded 2009-06-10, 11:54 (UTC), corrected manually. License was known in this case because the uploader told us at

http://commons.wikimedia.org/w/index.php?title=Commons:Forum&oldid=22211254#Fehler_beim_Hochladen

(in German).

Lupo added a comment.Via ConduitJul 9 2009, 11:28 AM

See

http://commons.wikimedia.org/wiki/User:Ilmari_Karonen/Zombie_images

for a list of more such images without description page (and, unfortunately, without licensing info).

Platonides added a comment.Via ConduitDec 2 2009, 3:21 PM

http://commons.wikimedia.org/wiki/File:Arepas_cooked_with_firewood.jpg file and image entry is there but no image page or log entry.
It was uploaded at today's swapdeath of the servers.

Lupo added a comment.Via ConduitMay 17 2010, 9:54 AM

And another one:

http://commons.wikimedia.org/wiki/File:1000_Minin.jpg

Uploaded 2010-05-17, image page was created only when the file was tagged as "no permission".

bzimport added a comment.Via ConduitFeb 12 2011, 5:59 PM

Bryan.TongMinh wrote:

Delete image row on Article::doEdit failure

Cooked up something simpler: since we know that the only thing that we did before editing the article is adding a row to the image table, we can safely delete this row.

Attached: patch.txt

bzimport added a comment.Via ConduitFeb 12 2011, 6:01 PM

Bryan.TongMinh wrote:

(In reply to comment #24)

Created attachment 8136 [details]
Delete image row on Article::doEdit failure

*sigh*

The condition before deleting should obviously be ( !$reupload && !status->isOk() )

Attached: patch.txt

Platonides added a comment.Via ConduitFeb 12 2011, 10:46 PM

Looks acceptable, given that we're not going to get consensus on doing the transaction bigger, which is what really should be done, due to fear of locks. I would perform an explicit commit before doEdit(), though.

Peachey88 added a comment.Via ConduitApr 30 2011, 12:10 AM

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

Mabdul added a comment.Via ConduitJun 21 2012, 1:30 PM

I'm not sure if that belongs into this bug report: https://en.wikipedia.org/wiki/File:Aljazeera.svg was deleted and restored, but while restoring something went wrong - the image was available but not the description page. An user in Commons was aware of this bug and told the workaround is to delete and restore the image again, see log at https://en.wikipedia.org/w/index.php?title=Special:Log&page=File%3AAljazeera.svg

Nemo_bis added a comment.Via ConduitAug 23 2012, 8:49 AM
  • Bug 34228 has been marked as a duplicate of this bug. ***
TheDJ added a comment.Via ConduitAug 25 2012, 10:01 PM
  • Bug 37394 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitSep 23 2012, 9:14 PM

Bryan.TongMinh wrote:

(In reply to comment #24)

Created attachment 8136 [details]
Delete image row on Article::doEdit failure

Cooked up something simpler: since we know that the only thing that we did
before editing the article is adding a row to the image table, we can safely
delete this row.

Getting back to this again, my original reasoning was wrong. There are four possible cases:

  1. File does not exist, page does not exist
  2. File does not exist, page does exist
  3. File does exist, page does not exist
  4. File does exist, page does exist

In cases 3 & 4 we move the image row to oldimage, which is a cumbersome operation to recover. In case 1 & 3 we call Article::doEdit(), which potentially fails. If Article::doEdit() fails, and we are in case 3, this patch leaves a dangling oldimage row with no image row.

Attached: patch.txt

Rillke added a comment.Via ConduitDec 13 2013, 2:39 PM

Created attachment 14083
Screenshot showing the issue. Uploader of the photo this is a derivative of is https://commons.wikimedia.org/wiki/User:Red_Castle - License: unknown

This happened again at https://commons.wikimedia.org/wiki/File:D%C3%A9l%C3%A9gation_g%C3%A9n%C3%A9rale_du_Qu%C3%A9bec_%C3%A0_Londres_2013.JPG and
https://commons.wikimedia.org/wiki/File:Montreal_Place_Londres_2013.JPG

Timeframe: 2013-12-13 01:54:00 - 2013-12-13 01:54:00

Their uploads are also not listed under their contributions: https://commons.wikimedia.org/wiki/Special:Contributions/Red_Castle

(change visibility) 01:46, 13 December 2013 (diff | hist) . . (+1,221)‎ . . N User:Red Castle ‎ (←Created page with 'Bonjour, je s...') (current)
(change visibility) 01:45, 13 December 2013 (diff | hist) . . (0)‎ . . User:Dr Wilson ‎ (→‎Licence) (current) [rollback: 1 edit]
(change visibility) 06:35, 8 November 2013 (diff | hist) . . (+389)‎ . . User talk:Harfang ‎ (→‎Drapeau de l'Acadie: new section) (current) [rollback: 1 edit]

Another user was also affected:
https://commons.wikimedia.org/wiki/File:Ghost_creepy_2013-12-12_17-52.jpg

Attached:

Rillke added a comment.Via ConduitDec 16 2013, 8:00 PM

https://commons.wikimedia.org/wiki/Commons:Village_pump#Upload_wizard_error

I got a weird error when I uploaded [[File:Facebook users by age population pyramid.png]]. When I try to tag it for deletion, I get "edit conflict". Can someone delete it? The correct file is at [[File:P opulation pyramid of Facebook users by age.png]]. --Dennis Bratland (talk) 03:29, 16 December 2013 (UTC)


The revision #0 of the page named "Facebook users by age population pyramid.png" does not exist.

Fabrice_Florin added a comment.Via ConduitDec 18 2013, 7:43 PM

Thank you all for bringing up this issue, which has gone on too long.

This issue will require major re-architecture work. We plan to revisit it in coming weeks as part of next quarter's improvements to the upload pipeline by the multimedia team. We will post here again once we have discussed possible solutions as a team in early January.

Thanks for your patience, and sorry we haven't been able to address this sooner.

Aklapper added a comment.Via ConduitFeb 26 2014, 4:54 PM

(In reply to Fabrice Florin from comment #36)

Thank you all for bringing up this issue, which has gone on too long.

This issue will require major re-architecture work. We plan to revisit it in
coming weeks as part of next quarter's improvements to the upload pipeline
by the multimedia team. We will post here again once we have discussed
possible solutions as a team in early January.

Fabrice: Any news?

Tgr added a comment.Via ConduitFeb 28 2014, 9:02 PM

Unassigning, I have been working exclusively on MediaViewer for the last two months, and that probably won't change for a while.

Gilles added a project: Multimedia.Via WebNov 24 2014, 3:42 PM
Gilles moved this task to Current cycle on the Multimedia workboard.Via WebNov 24 2014, 4:05 PM
KTC added a subscriber: KTC.Via WebDec 13 2014, 9:22 PM
Gilles moved this task to Backlog on the Multimedia workboard.Via WebMon, Apr 6, 9:23 AM

Add Comment

Column Prototype
This is a very early prototype of a persistent column. It is not expected to work yet, and leaving it open will activate other new features which will break things. Press "\" (backslash) on your keyboard to close it now.