GWToolset is flooding logs on Commons with poorly-constructed log entries
Closed, Resolved · Public

Description

Steps to reproduce:
http://commons.wikimedia.org//w/api.php?action=query&list=recentchanges&format=xml&rcstart=20150122000951&rcend=20150123000951&rcdir=newer&rcnamespace=6&rclimit=500&rctype=log&rawcontinue

Expected result:
[...]
<rc type="log" ns="6" title="File:Europe under Nazi domination.png" pageid="14619311" revid="0" old_revid="0" rcid="150700975" timestamp="2015-01-22T00:10:24Z"/>
<rc type="log" ns="6" title="File:Move It - Thinktank Birmingham Science Museum - City of Birmingham locomotive 46235 (8620737880).jpg" pageid="37965992" revid="0" old_revid="0" rcid="150701084" timestamp="2015-01-22T00:10:33Z"/>
[...]

Actual result:
[...]
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700967" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700968" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700969" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700970" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700971" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700972" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700973" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150700974" timestamp="2015-01-22T00:10:23Z"/>
<rc type="log" ns="6" title="File:Europe under Nazi domination.png" pageid="14619311" revid="0" old_revid="0" rcid="150700975" timestamp="2015-01-22T00:10:24Z"/>
[...]

Event Timeline

McZusatz created this task. Feb 6 2015, 8:34 AM
McZusatz raised the priority of this task to Needs Triage.
McZusatz updated the task description.
McZusatz added a project: MediaWiki-API.
McZusatz added a subscriber: McZusatz.
Restricted Application added a subscriber: Aklapper. Feb 6 2015, 8:34 AM
Aklapper triaged this task as Low priority. Feb 6 2015, 1:03 PM
Anomie added a subscriber: Anomie.

Adding 'loginfo' to the query's rcprop parameter reveals that these are far from "void data"; in fact, they all appear to be log entries from MediaWiki-extensions-GWToolset.

<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0" rcid="150701245" timestamp="2015-01-22T00:10:57Z"
logid="111446482" logtype="gwtoolset" logaction="metadata-job" message="<h2>Step 4: Batch upload</h2>Metadata batch job created.
Your metadata file will be analyzed shortly and each item will be uploaded to the wiki in a background process. You can check the
page "<a href="/wiki/Special:NewFiles" title="Special:NewFiles" target="_blank">Special:NewFiles</a>" to see when they have been
uploaded. Started with metadata record 1"/>
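
The check described above can be sketched in Python. This is only an illustration: the helper names `build_query_url` and `count_placeholder_entries` are made up for this example, the sample XML is trimmed from the entries quoted in this task, and `rcprop=title|timestamp|ids|loginfo` is assumed to be the intended way of requesting 'loginfo'. The script does not contact the API; it only builds the URL and parses a local sample.

```python
from collections import Counter
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

API = "https://commons.wikimedia.org/w/api.php"


def build_query_url(rcstart, rcend):
    """Build the recentchanges query from the task, with loginfo requested."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "format": "xml",
        "rcstart": rcstart,
        "rcend": rcend,
        "rcdir": "newer",
        "rcnamespace": 6,
        "rclimit": 500,
        "rctype": "log",
        # The default properties plus loginfo (assumed intent of the comment).
        "rcprop": "title|timestamp|ids|loginfo",
        "rawcontinue": "",
    }
    return API + "?" + urlencode(params)


def count_placeholder_entries(xml_text):
    """Count 'File:No title' entries with pageid 0, grouped by log type."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for rc in root.iter("rc"):
        if rc.get("title") == "File:No title" and rc.get("pageid") == "0":
            counts[rc.get("logtype", "unknown")] += 1
    return counts


# Trimmed sample based on the entries quoted in this task (message omitted).
SAMPLE = """<api><query><recentchanges>
<rc type="log" ns="6" title="File:No title" pageid="0" revid="0" old_revid="0"
    rcid="150701245" timestamp="2015-01-22T00:10:57Z" logid="111446482"
    logtype="gwtoolset" logaction="metadata-job"/>
<rc type="log" ns="6" title="File:Europe under Nazi domination.png"
    pageid="14619311" revid="0" old_revid="0" rcid="150700975"
    timestamp="2015-01-22T00:10:24Z"/>
</recentchanges></query></api>"""

print(build_query_url("20150122000951", "20150123000951"))
print(count_placeholder_entries(SAMPLE))
```

Grouping by `logtype` is what points at the responsible extension: without 'loginfo', the placeholder entries carry no attribute that identifies their origin.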
Restricted Application added a project: Multimedia. Feb 6 2015, 3:16 PM
Anomie set Security to None.

these are far from "void data"; in fact, they all appear to be log entries from MediaWiki-extensions-GWToolset.

Ok, thank you for finding the cause of the entries. I never saw such log entries before and in the last 30 days there are 1756775 such cases. (81.4 % of 2158260; Probably concentrated on a few days.)
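
The quoted share can be checked with a line of arithmetic. In this small sketch the variable names are illustrative, and the larger figure is assumed to be the total number of entries examined over the 30 days:

```python
placeholder_entries = 1756775  # "File:No title" entries in the last 30 days
total_entries = 2158260        # assumed: all entries examined in that period

share = placeholder_entries / total_entries
print(f"{share:.1%}")  # 81.4%
```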

Anomie renamed this task from Result set of recentchanges gets padded with void data to GWToolset is flooding logs on Commons with poorly-constructed log entries. Feb 6 2015, 8:13 PM
Tgr closed this task as Resolved. Feb 6 2015, 10:10 PM
Tgr claimed this task.
Tgr added a subscriber: Tgr.

I never saw such log entries before and in the last 30 days there are 1756775 such cases. (81.4 % of 2158260; Probably concentrated on a few days.)

This was a side effect of T87040, which has been fixed. That includes the "poorly constructed" part: GWT uses the same method to construct log messages and the HTML messages on its special page, and it confused the two formats.
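
To illustrate the kind of format mix-up described here, the following Python sketch keeps the two renderings apart. It is not GWToolset's code (the extension is written in PHP), and the function names `render_html` and `render_log` are hypothetical; it only shows why a message built for an HTML page is a poor fit for a log entry.

```python
import html


def render_html(heading, body):
    """Render a message for display on an HTML page (markup is expected)."""
    return f"<h2>{html.escape(heading)}</h2>{html.escape(body)}"


def render_log(heading, body):
    """Render the same message as plain text suitable for a log entry."""
    return f"{heading}: {body}"


heading = "Step 4: Batch upload"
body = "Metadata batch job created."

print(render_html(heading, body))
print(render_log(heading, body))
```

Sending the output of the HTML renderer to the log would put markup such as `<h2>` into the log message, which is what the entry quoted earlier in this task shows.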