Make file uploads patrollable
Closed, Resolved, Public

Description

New articles are highlighted in yellow in the list
(http://nl.wikipedia.org/wiki/Speciaal:Newpages) until they are marked as
patrolled; can this be done for new uploads as well?


URL: http://nl.wikipedia.org/wiki/Speciaal:Newimages

Details

Reference
bz9501

Related Objects

Event Timeline

bzimport raised the priority of this task from to Low.
bzimport set Reference to bz9501.
bzimport added a subscriber: Unknown Object (MLST).
Ciell created this task. Apr 5 2007, 9:07 AM

robchur wrote:

We'll need to alter things so new uploads generate an "unpatrolled" entry
somewhere in the database...possibly worth rethinking part of the patrol model.

Krinkle added a comment. Edited Apr 27 2010, 10:20 PM

What's the progress on this?

This would be of great help on Commons, where we currently have a manual process:

http://commons.wikimedia.org/wiki/Commons:Recent_uploads_patrol

But it's basically a lot of double work: while wandering around on Commons I come across people's uploads, and being able to patrol those on the spot would reduce the backlog.

With the manual process on the above page, however, I'd have to take on an entire part of a day at once.

Hopefully patrol functionality for uploads will let this be organised much better (e.g. by filtering out patrolled uploads).

I assume this change would mean that anyone with the autopatrol right (bots, autopatrollers, sysops) is of course autopatrolled for uploads as well.

– Krinkle

Krinkle added a comment. Edited Jun 21 2010, 4:11 PM

content hidden in Bugzilla

After a chat on IRC we came to the conclusion that actually this should be
working right now (partially).

There are basically two options:

  • Make the upload action patrollable. This means that theoretically an upload without a description page is patrollable; it also means that re-uploads are patrollable, and that it needs to be integrated into the patrol model as something new (as Rob suggests above).

Another option:

  • Creation of file description pages is currently only patrollable if it was not the result of an upload (i.e. accidental page creation, or for example a local Wikipedia file page for a Commons image, such as
http://en.wikipedia.org/w/index.php?title=Special:Log&page=File:Penile-Clitoral+Structure.JPG&hide_review_log=0&hide_patrol_log=0
). This functionality could be extended to cover all file description pages instead of only the ones not created when uploading.

That the latter is not already the case seems like a bug, since it means a lot of page creations are not patrollable. On the other hand, I
think it's not wise to implement both, because that would cause double work.

Created attachment 7794
rough patch for patrolling uploads

Rough patch to make uploads patrollable.

I haven't tested this very well. Some issues that come to mind:
* It generates patrol log entries of the form "User:foo patrolled revision 0 of file:some_image.png" (probably not too difficult to fix).
* It re-uses a bunch of the new page patrol stuff, so the interface seems to suggest you're patrolling a page instead of a file.
* And the big one: it only works if you follow links from RC (this is the case for new page patrol too). Krinkle pointed out that most people wouldn't do new image patrolling from Special:RecentChanges.

(In reply to comment #6)

*And the big one: It only works if you follow links from RC (this is the case
for new page patrol too).

That's bug 15936 for the record.

I'll check the patch out tonight. A few things come to mind which may or may not be taken care of automatically by MediaWiki if the RC entry is unpatrolled.

  • On Special:RecentChanges the upload log entries should show a red exclamation mark if unpatrolled, and the patrolled ones should be hidden when clicking "Hide patrolled edits" (which may need rephrasing, as it hasn't covered just edits for a long time: log entries have been part of RC for quite some time).
  • NewPage patrolling is rarely done from RecentChanges (it's impractical and doesn't provide the filtering options needed); there is Special:NewPages for that, with options to toggle "logged-in users, patrolled edits and bots". The same goes for files: patrolling files from RecentChanges will most likely not be done, as it doesn't give the info needed (size, thumbnail, etc.). Special:NewFiles is perfectly suited for this. These toggle options have to be added to Special:NewFiles as well.
  • When viewing a diff where the right side is an upload (i.e. re-upload / overwrite), it should show a [mark as patrolled] link in the diff frame just like it does for edits (this was done for edits in diffs in r24607).
  • Although edits are mostly not new pages, the opposite holds for uploads, where the majority are 'new' and not 're-uploads'. This means bug 15936-like situations (where going to the oldid of the first revision of a file would -not- show the [mark] link) are unacceptable in my opinion; without the link it's pretty useless.

Krinkle

Oh, and before I forget:
I was thinking about a variable like $wgUseLogPatrol instead of $wgUserUploadPatrol.
It would be either a boolean or an array. As a boolean it disables or enables patrolling for every log type; as an array it enables patrolling only for some,

i.e. $wgUseLogPatrol = array( 'upload' => true, 'move' => true );
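
A minimal sketch of how such a boolean-or-array setting could be interpreted. It is written in Python purely for illustration; the function name `log_patrol_enabled` and the choice to treat log types missing from the array as disabled are assumptions, not part of MediaWiki or of the attached patch.

```python
from typing import Mapping, Union

# Hypothetical model of the proposed $wgUseLogPatrol value:
# either one switch for all log types, or a per-log-type map.
LogPatrolSetting = Union[bool, Mapping[str, bool]]


def log_patrol_enabled(setting: LogPatrolSetting, log_type: str) -> bool:
    """Return True if patrolling is enabled for the given log type."""
    if isinstance(setting, bool):
        # Boolean form: enables or disables patrolling for every log type.
        return setting
    # Array form: only the listed log types are patrollable.
    # Assumption: a log type that is not listed counts as disabled.
    return bool(setting.get(log_type, False))
```

With the array from the comment above, `log_patrol_enabled({'upload': True, 'move': True}, 'upload')` is true, while the same call for `'delete'` is false.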

(In reply to comment #8)

  • On Special:RecentChanges the upload log entries should show a red exclamation mark if it's unpatrolled, and the patrolled ones should be hidden when clicking

I did that as part of the patch (also made the b marker for bot uploads work too)

  • NewPage patrolling is rarely done from RecentChanges (it's impractical and doesn't provide the filtering options needed); there is Special:NewPages for that, with options to toggle "logged-in users, patrolled edits and bots". The same goes for files: patrolling files from RecentChanges will most likely not be done, as it doesn't give the info needed (size, thumbnail, etc.). Special:NewFiles is perfectly suited for this. These toggle options have to be added to Special:NewFiles as well.

That might be an issue, as Special:NewImages uses the image table, not the recentchanges table. (Special:NewImages also does rather weird stuff when filtering bots that seems inefficient: it checks whether the uploader is currently a "bot", not whether they were a bot when the image was uploaded. But I'm not all that well versed in DB efficiency.) There don't seem to be any indexes on the needed fields in recentchanges, so filtering RC down to only uploads might be inefficient. (Again, I don't really understand the intricacies of DB efficiency, so take what I say here with a grain of salt.)

  • Although edits are mostly not new pages, the opposite holds for uploads, where the majority are 'new' and not 're-uploads'. This means bug 15936-like situations (where going to the oldid of the first revision of a file would -not- show the [mark] link) are unacceptable in my opinion; without the link it's pretty useless.

I don't think that'd be easy to do (at least not in the way I did it in the patch above), since it's patrolling the log actions, not edits, so the oldid (revision ID) has no relation to what we're patrolling.

The other issues you mention are also things that generally apply and need to be worked on.

In response to added keywords.

Note: my patch was meant more as a starting point. I don't really think it's ready to be committed to trunk at this stage.

So what does need doing here, the addition of patrol stuff to [[Special:NewFiles]]? The system is in place, isn't it?

Not really.

What is in place is this:

  • A generic database implementation for patrolling recent changes entries (any entry, be it RC_EDIT, RC_NEW page or RC_LOG). RC_LOG includes upload actions. So this doesn't need anything for uploads, it is already generically in place. rc_patrolled is toggleable for log actions in theory, and uploads are logged and in recentchanges
  • A front-end for patrolling edits
    • Configurable: Through $wgUseRCPatrol [1] // enabled in MediaWiki by default, disabled on most WMF wikis, enabled on nl.wikipedia, commons.wikimedia and dozens of others
    • List: Special:RecentChanges with "Hide patrolled edits"
    • Patrol-interface: On the diff pages with "[mark as patrolled]"
  • A front-end for patrolling new pages
    • Configurable: Through $wgUseNPPatrol [2] // enabled in MediaWiki by default
    • List: Special:NewPages with "Hide patrolled edits"
    • Patrol-interface: On the bottom when viewing a page, with "[mark as patrolled]" (though subject to bug 15936, which causes the link only to be there when visiting the page from Special:NewPages)

What we need is a front-end for patrolling uploads.

  • Configurable: Of course
  • List: We already have two file-list special pages (though plans to merge exist afaik), so it would only need to have a way to indicate the patrol mark and a way to exclude patrolled files from the view.
  • Patrol interface: I would like it to be given first-class treatment like for edits (not like with new pages, where it is just dumped at the bottom through the ugly hack of passing it through a query parameter, making it not really related to the page you're looking at).

And unlike edits and new pages, uploads need something special because they are log actions. Right now those are not patrollable because Log sets rc_patrolled to false by default.

And then there is the issue of uploads (usually?) causing two events: New page creation and a log event. Though the new page creation is usually not emitted afaik.

[1] https://www.mediawiki.org/wiki/Manual:$wgUseRCPatrol
[2] https://www.mediawiki.org/wiki/Manual:$wgUseNPPatrol

I plan on uploading a patch set for this and T98617, since it's the same mechanism.
A checkbox in Special:NewFiles would allow showing only unpatrolled files.

Change 211656 had a related patch set uploaded (by Cenarium):
Allow patrol of page moves and uploads

https://gerrit.wikimedia.org/r/211656

Krinkle renamed this task from List/indication of unpatrolled uploaded media files to Make file uploads patrollable. May 18 2015, 12:30 AM
Krinkle updated the task description.
Krinkle set Security to None.
Krinkle removed subscribers: gerritbot, Unknown Object (MLST).
Restricted Application added a subscriber: Aklapper. Jul 26 2015, 7:41 AM

Change 251795 had a related patch set uploaded (by Cenarium):
Allow patrol of uploads

https://gerrit.wikimedia.org/r/251795

Catrope removed a subscriber: Catrope. Nov 13 2015, 8:21 PM

Change 251795 merged by jenkins-bot:
Allow patrol of uploads

https://gerrit.wikimedia.org/r/251795

matmarex added a subscriber: matmarex.

Whoo! Thanks, @Cenarium!

I'm going to write a release note about it and drop a note somewhere on Commons.

matmarex closed this task as Resolved. Jan 7 2016, 2:05 AM
matmarex removed a project: Patch-For-Review.

Change 263005 had a related patch set uploaded (by Bartosz Dziewoński):
RELEASE-NOTES-1.27: Add a note about file upload patrolling

https://gerrit.wikimedia.org/r/263005

Change 263005 merged by jenkins-bot:
RELEASE-NOTES-1.27: Add a note about file upload patrolling

https://gerrit.wikimedia.org/r/263005

Meno25 removed a subscriber: Meno25. Jan 9 2016, 2:15 PM
Rillke added a subscriber: Rillke. Feb 28 2016, 6:24 PM

Thank you so much for implementing this feature. The query appears to be a little slow. I repeatedly got

A database query error has occurred. This may indicate a bug in the software.

    Function: IndexPager::buildQueryInfo (NewFilesPager)
    Error: 2013 Lost connection to MySQL server during query (10.64.48.19)

Any idea how to speed it up?

Luke081515 added a subscriber: Luke081515.

Forgot to mention, this happens at Wikimedia Commons.

Thank you so much for implementing this feature. The query appears to be a little slow. I repeatedly got

A database query error has occurred. This may indicate a bug in the software.

    Function: IndexPager::buildQueryInfo (NewFilesPager)
    Error: 2013 Lost connection to MySQL server during query (10.64.48.19)

Any idea how to speed it up?

This is T124205.

Note, the page may be a little faster when "Show bots" is checked - https://commons.wikimedia.org/wiki/Special:NewFiles?showbots=1&hidepatrolled=1&limit=50&offset=


Some things I notice about the query. Locally it is:

SELECT /* IndexPager::buildQueryInfo (NewFilesPager) Bawolff */  * 
FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user))
INNER JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_timestamp = img_timestamp)) 
WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0'
ORDER BY img_timestamp DESC LIMIT 51

Using * as the field list means that every column is returned. On some files img_metadata is huge (up to 16 MB). In the worst case this query could return more than 800 MB.

Second, running EXPLAIN on Tool Labs suggests that this filesorts the entire image table on enwiki, and filesorts the recentchanges table on commonswiki. Which is bad.

Maybe we could try to make it at least use the new_name_timestamp index on recentchanges (since for this query rc_new = 0 and rc_namespace = 6, and we're ordering by rc_timestamp).

Otherwise I think we need to add some new indexes (on rc_type, rc_patrolled, rc_timestamp I guess? Or maybe even move rc_patrolled into the image table).
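
The two suggestions above (name the columns instead of using *, and index recentchanges on type, patrol flag and timestamp) can be sketched as follows. This is an illustration in Python with an in-memory SQLite database, not the actual MediaWiki schema or the fix that was merged: the tables are cut down to the columns used in the query, the bot filter on user_groups is left out, and the index name `rc_type_patrolled_timestamp` and the helper names are made up for the example. SQLite's planner also differs from MySQL's, so this shows the shape of the query rather than its performance.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in for the image and recentchanges tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE image (
            img_name TEXT PRIMARY KEY,
            img_user INTEGER,
            img_timestamp TEXT,
            img_metadata BLOB
        );
        CREATE TABLE recentchanges (
            rc_id INTEGER PRIMARY KEY,
            rc_title TEXT,
            rc_user INTEGER,
            rc_timestamp TEXT,
            rc_type INTEGER,
            rc_log_type TEXT,
            rc_patrolled INTEGER
        );
        -- Hypothetical composite index along the lines suggested above.
        CREATE INDEX rc_type_patrolled_timestamp
            ON recentchanges (rc_type, rc_patrolled, rc_timestamp);
        """
    )
    uploads = [
        ("A.png", 1, "20160101000000", 0),
        ("B.png", 2, "20160102000000", 1),  # already patrolled
        ("C.png", 1, "20160103000000", 0),
    ]
    for name, user, ts, patrolled in uploads:
        conn.execute(
            "INSERT INTO image VALUES (?, ?, ?, ?)",
            (name, user, ts, b"x" * 1000),  # stand-in for a large metadata blob
        )
        conn.execute(
            "INSERT INTO recentchanges (rc_title, rc_user, rc_timestamp,"
            " rc_type, rc_log_type, rc_patrolled)"
            " VALUES (?, ?, ?, 3, 'upload', ?)",
            (name, user, ts, patrolled),
        )
    return conn


def unpatrolled_uploads(conn: sqlite3.Connection, limit: int = 51):
    """List unpatrolled uploads, newest first, without fetching img_metadata."""
    return conn.execute(
        """
        SELECT img_name, img_user, img_timestamp
        FROM recentchanges
        JOIN image ON rc_title = img_name
                  AND rc_user = img_user
                  AND rc_timestamp = img_timestamp
        WHERE rc_type = 3 AND rc_log_type = 'upload' AND rc_patrolled = 0
        ORDER BY rc_timestamp DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


rows = unpatrolled_uploads(make_demo_db())
```

Here `rows` holds only the name, uploader and timestamp of the two unpatrolled files, so the metadata blob never leaves the database.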

Using * as the field list means that every column is returned. On some files img_metadata is huge (up to 16 MB). In the worst case this query could return more than 800 MB.

Fixed very recently in https://gerrit.wikimedia.org/r/#/c/269429/.

Second, running EXPLAIN on Tool Labs suggests that this filesorts the entire image table on enwiki, and filesorts the recentchanges table on commonswiki. Which is bad.

I don't think it does any filesorting after T124205.

See also: T86611