From https://lists.wikimedia.org/pipermail/mediawiki-api/2014-December/003402.html
I found similar cases that UPLOAD update is lack of page_id. recentchange page <http://commons.wikimedia.org/w/api.php?action=query&format=xml&rccontinue=20141201163618|142850059&continue=-||&list=recentchanges&rclimit=500&rcnamespace=0%7C2%7C4%7C6%7C10%7C14%7C100%7C828&rcprop=comment%7Cflags%7Cids%7Cloginfo%7Csizes%7Ctimestamp%7Ctitle%7Cuser> <rc type="log" ns="6" title="File:Siegelmarke Der Persönliche Adjutant Seiner Königlichen Hoheit des Prinzen Georg von Preussen W0204563.jpg" pageid="0" revid="0" old_revid="0"rcid="142847103" user="Nicolas Rück (WMDE) " oldlen="0" newlen="0" timestamp="2014-12-01T16:29:40Z" comment="[[Commons:GLAMwiki Toolset Project|GWToolset]]: Creating mediafile for Nicolas Rück (WMDE)." logid="103976391" logtype="upload" logaction="*overwrite*" img_sha1=" swlwd4u41zhuaa0sq0x2pzuhzkr7xna" img_timestamp="2014-12-01T16:29:40Z"/> <rc type="log" ns="6" title="File:Siegelmarke Glogau - Saganer Fürstenthums Landschaft W0204559.jpg" pageid="0" revid="0" old_revid="0" rcid="142847096" user="Nicolas Rück (WMDE)" oldlen="0" newlen="0" timestamp=" 2014-12-01T16:29:40Z" comment="[[Commons:GLAMwiki Toolset Project|GWToolset]]: Creating mediafile for Nicolas Rück (WMDE)."logid=" 103976394" logtype="upload" logaction="*overwrite*" img_sha1=" 3o7mqqobcibbnnwj8x166gmrxj5rbtb" img_timestamp="2014-12-01T16:29:40Z"/> <rc type="log" ns="6" title="File:Siegelmarke Königlich Preussisches Haus - Archiv W0204564.jpg" pageid="0" revid="0" old_revid="0" rcid="142847109" user="Nicolas Rück (WMDE)"oldlen="0" newlen="0" timestamp=" 2014-12-01T16:29:40Z" comment="[[Commons:GLAMwiki Toolset Project|GWToolset]]: Creating mediafile for Nicolas Rück (WMDE)." logid=" 103976393"logtype="upload" logaction="*overwrite*" img_sha1=" sx4uuswlcbh3itwrr4dy67u0s9ca18h" img_timestamp="2014-12-01T16:29:40Z"/> It seems the lack of page_id problem only shows on *logaction="overwrite"* situation. Could you please take a look and see if this is a systematic bug? Thanks, Yijun
Investigated, appears to be a race condition (Same file given to gwtoolset twice, uploaded within 1 second of each other). What I think is happening is:
in LocalFile::recordUpload2 during the second (overwrite) upload:
` $exists = $descTitle->exists(); `
returns false due to slave lag, or perhaps the other image save hasn't even created the page yet
Eventually we get to the create the description page code:
$content = ContentHandler::makeContent( $pageText, $descTitle ); $status = $wikiPage->doEditContent( $content, $comment, EDIT_NEW | EDIT_SUPPRESS_RC, false, $user ); $dbw->begin( __METHOD__ ); // XXX; doEdit() uses a transaction // Now that the page exists, make an RC entry. $logEntry->publish( $logId ); if ( isset( $status->value['revision'] ) ) { $dbw->update( 'logging', array( 'log_page' => $status->value['revision']->getPage() ), array( 'log_id' => $logId ), __METHOD__ ); } $dbw->commit( __METHOD__ ); // commit before anything bad can happen `
At this point the page exists in the db, so $wikiPage->doEditContent fails (due to the EDIT_NEW). Note that in the examples of this bug, there is only one entry in the page history of the file. As a result, $status->value['revision'] is not set, so we do not update the logging table entry's log_page. The recent changes entry also has wrong rc_cur_id, since that comes directly from $descTitle->getArticleId(), which if we're at this code branch, would be cached and 0, and since the doEditContent failed, $this->mTitle->resetArticleID( $newid ); was not executed.
Suggested solution:
In the case that doEditContent fails with "edit-already-exists", we should probably have some code that does something along the lines of $descEdit->getArticleID( Title::GAID_FOR_UPDATE ); before the rc publish line, to force reloading $descTitle from master, and then use that value in the logging table UPDATE code.