Moved pages should be listed in NewPages/RecentChanges with current title, namespace, existence etc.
Open, Public

Assigned To
rohan013
Priority
High
Author
bzimport
Subscribers
gerritbot, Cenarium, Xeno and 27 others
Projects
Tokens
"Like" token, awarded by Nemo_bis.
Security
None
Reference
bz12363
Description

Author: dhwl09

Description:
New articles are sorted by namespace on Special:Newpages, by default displaying only mainspace creations. From personal experience and by observing the marking of articles as "patrolled" it would appear that articles created in userspace are generally not observed by Newpage Patrollers. And while moving an article into mainspace will still show up in recent changes, I presume such articles remain quite unlikely to be patrolled soon after creation. Thus, creating articles in userspace before moving them into mainspace seems to me a sneaky way of avoiding scrutiny from newpage patrollers. A simple solution (in concept, perhaps not in actual implementation, I have no idea on that part) would be to display any pagemove into mainspace alongside mainspace creations in Special:Newpages. Thank you.


Version: 1.22.0
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=36930

bzimport set Reference to bz12363.
bzimport created this task.Via LegacyDec 20 2007, 5:05 AM
siebrand added a comment.Via ConduitFeb 2 2009, 1:32 PM

Changed component to "RecentChanges"

demon added a comment.Via ConduitApr 5 2011, 11:56 PM
  • Bug 28431 has been marked as a duplicate of this bug.
Dcoetzee added a comment.Via ConduitJan 13 2012, 7:10 AM

More recently, some users on enwiki have been subverting new page patrol by taking articles submitted at Articles for Creation which were declined (not accepted for publishing in mainspace), and moving them into mainspace anyway. One user has proposed move-protecting these pages to avoid the problem (see https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Articles_for_creation#Arbitrary_section_break), but clearly moves from user sandboxes, Wikiproject drafts in project space, etc. are just as dangerous in their potential for avoiding scrutiny. This proposed feature is the right way to deal with the problem.

In my opinion the way this feature should work is, when viewing new pages in any particular namespace, it should include moves into that namespace from other namespaces (since they are "new" to that namespace). Newpages over all namespaces should work the same as it does now. This would help it to be more generally useful. This could be made a checkbox option, if there is some reason this behavior is not always desirable.

Peachey88 added a comment.Via ConduitJan 13 2012, 10:55 AM

I feel that this should be a WONTFIX: NewPages, as the page name suggests, should only list new page creations, which page moves aren't.

bzimport added a comment.Via ConduitJan 13 2012, 11:58 AM

paolobenve wrote:

+1 Derrick Coetzee!

Actually, in Wikipedia and most other MediaWiki projects, ns0 ranks higher than other namespaces; creating a page in some namespace is not the same thing as creating an article.

I think that the new pages feature should principally address article creations, and so moving a page into article space should be reported in Special:NewPages.

bzimport added a comment.Via ConduitJan 13 2012, 2:46 PM

snottywong.wiki wrote:

(In reply to comment #4)

I feel that this should be a WONTFIX: NewPages, as the page name suggests,
should only list new page creations, which page moves aren't.

I partially disagree. Page moves within main space are not new page creations, but page moves from a different namespace into main space are most definitely new page creations and need to be patrolled.

Dcoetzee added a comment.Via ConduitJan 14 2012, 10:16 AM

For now, I've implemented a tool on Toolserver called "Recent moves" to address this problem:

http://toolserver.org/~dcoetzee/recentmoves/

I hope I can encourage people to start patrolling it.

bzimport added a comment.Via ConduitJan 14 2012, 1:07 PM

paolobenve wrote:

OK, but it "solves" the problem for Wikipedia, not for all other MediaWiki installations...

Dcoetzee added a comment.Via ConduitJan 14 2012, 2:11 PM

I agree that this should be fixed in MediaWiki eventually. But I know that will take time, so the tool is a temporary stopgap measure.

bzimport added a comment.Via ConduitOct 1 2013, 9:57 PM

kgorman wrote:

This is currently being actively exploited on a large scale by a group of paid editors on ENWP to get their pages into article space without having them scrutinized. More details about their activities are available via email request, but their activity is significant enough to make this worth fixing quickly. (I'm also in email communication with Dan about this, and I'll add him to the CC list if I can figure out how.)

Deskana added a comment.Via ConduitOct 2 2013, 3:31 PM

This bug may be being actively misused on the English Wikipedia. We need to get this fixed.

CCing Fabrice Florin, the product manager for page curation, to get his input.

Bawolff added a comment.Via ConduitOct 2 2013, 11:14 PM

(In reply to comment #11)

This bug may be being actively misused on the English Wikipedia. We need to get this fixed.

CCing Fabrice Florin, the product manager for page curation, to get his input.

Just to clarify, the original request was for Special:newpages, not special:newpagesfeed. Are you saying this is an issue/you want this fixed for both?

Nemo_bis added a comment.Via ConduitOct 3 2013, 5:03 AM

(In reply to comment #12)

Just to clarify, the original request was for Special:newpages, not special:newpagesfeed. Are you saying this is an issue/you want this fixed for both?

That would be a separate bug. I'd focus on Special:NewPages, among other reasons because it's more used (https://toolserver.org/~dartar/pc/).

bzimport added a comment.Via ConduitOct 5 2013, 10:44 PM

kgorman wrote:

I'm not sure if the same issue applies to newpagesfeed, but I would agree with Nemo that getting newpages fixed would be a higher priority. If the same issue does apply with newpagesfeed, it should be fixed there as well, but it would be a lower priority.

Dcoetzee added a comment.Via ConduitOct 6 2013, 5:53 PM

Just a heads up, at least for enwiki, http://toolserver.org/~dcoetzee/recentmoves/ is operational again (it was broken by MW1.19 changes to log_params field and I didn't notice until now). However I still consider this to be an important issue to resolve in MW.

Bawolff added a comment.Via ConduitOct 9 2013, 5:42 PM

Ugh, the way recentchanges is stored isn't the most amenable to fixing this bug. (It's not easy to filter by the target namespace of the move.)

Options:

  1. Totally redo the recentchanges schema (Unlikely to happen)
  2. Move Derrick's http://toolserver.org/~dcoetzee/recentmoves/ into core (aka create a new special page, called Special:Recentmoves that lists all recent moves)
  3. As part of the page move process, retroactively change the rc_title/rc_namespace of the creation entry in recentchanges table.

Option 3 seems rather hacky. (Especially seems hacky if one only changes the page creation entry, and not subsequent edits. This would make all the recent change entries change from "created page at [original title]" to "created page at [final title]")

None of these options seem really good fixes to the problem (Problem being Special:newpages. I'm not familiar with newpagesfeed).
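To make option 3 concrete, here is a minimal sketch of what "retroactively change the rc_title/rc_namespace of the creation entry" could look like, run against a toy SQLite table rather than a real MediaWiki database. The column names come from the comments in this thread; the function name `retarget_creation_entry`, the reduced schema and the sample rows are assumptions made only for illustration.

```python
import sqlite3


def retarget_creation_entry(conn, page_id, new_namespace, new_title):
    """Point the creation row of a moved page at its new namespace and title.

    Only the row with rc_new = 1 is touched; later edits keep the old title,
    which is the inconsistency that makes this option feel hacky.
    """
    cur = conn.execute(
        "UPDATE recentchanges SET rc_namespace = ?, rc_title = ? "
        "WHERE rc_cur_id = ? AND rc_new = 1",
        (new_namespace, new_title, page_id),
    )
    return cur.rowcount


# Toy schema (assumption): only the columns needed for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recentchanges ("
    "rc_id INTEGER PRIMARY KEY, rc_cur_id INTEGER, "
    "rc_namespace INTEGER, rc_title TEXT, rc_new INTEGER)"
)
conn.executemany(
    "INSERT INTO recentchanges (rc_cur_id, rc_namespace, rc_title, rc_new) "
    "VALUES (?, ?, ?, ?)",
    [
        (7, 2, "Example/Draft", 1),  # page creation in the User namespace (2)
        (7, 2, "Example/Draft", 0),  # a later edit to the same page
    ],
)

# Simulate the page being moved into the main namespace (0).
updated = retarget_creation_entry(conn, 7, 0, "Draft")
rows = conn.execute(
    "SELECT rc_namespace, rc_title, rc_new FROM recentchanges ORDER BY rc_id"
).fetchall()
```

After the call, the creation row shows up under namespace 0 while the later edit still carries the old user-space title.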

Dcoetzee added a comment.Via ConduitOct 9 2013, 6:24 PM

I've checked in the updates to recentmoves in case anyone is interested in basing a patch on it. It's worth noting that it is not a single query as written right now - it does some filtering of the query in procedural code. This is largely due to fields which have a format too complex to match with the SQL LIKE operator. I also haven't tested it with databases other than MySQL.

Ironholds added a comment.Via ConduitOct 11 2013, 12:26 AM

Why would NPF be a lower priority? Note that NPF intentionally lists moves.

Ironholds added a comment.Via ConduitOct 11 2013, 12:26 AM

Ah, just saw Nemo's comment; ignore me.

bzimport added a comment.Via ConduitOct 11 2013, 1:05 AM

swalling wrote:

(In reply to comment #16)

Ugh, the way recentchanges is stored isn't the most amenable to fixing this
bug. (It's not easy to filter by the target namespace of the move.)

Options:

  1. Totally redo the recentchanges schema (Unlikely to happen)
  2. Move Derrick's http://toolserver.org/~dcoetzee/recentmoves/ into core (aka create a new special page, called Special:Recentmoves that lists all recent moves)
  3. As part of the page move process, retroactively change the rc_title/rc_namespace of the creation entry in recentchanges table.

    Option 3 seems rather hacky. (Especially seems hacky if one only changes the page creation entry, and not subsequent edits. This would make all the recent change entries change from "created page at [original title]" to "created page at [final title]")

    None of these options seem really good fixes to the problem (Problem being Special:newpages. I'm not familiar with newpagesfeed).

Oliver says NewPagesFeed already lists page moves. If this is the case, then we should be evangelizing use of that tool as a short term solution.

The above solution is only good for enwiki though. If it's inordinately difficult to add recent moves to Special:NewPages, then I think Option 2, and creation of a proper Special:RecentMoves, would be the ideal case for most Wikimedia wikis.

Ideally, a Special:RecentMoves would...

  • Be able to be filtered by moves within any namespace
  • Be able to be filtered down to cross-namespace moves

Brian, I added Dan Garry to this bug, as it seems like it's a part of Platform's remit. Dan, does that sound reasonable?

Bawolff added a comment.Via ConduitOct 16 2013, 3:22 PM

(In reply to comment #20)

Oliver says NewPagesFeed already lists page moves. If this is the case, then we should be evangelizing use of that tool as a short term solution.

Verified that Special:newpagesfeed does not have this bug.

Like Special:NewPages, Special:NewPagesFeed lists moved pages under their original creation date. This makes logical sense; however, when I was testing examples on Wikipedia, I noticed that lots of these pages were created long ago (often several months), so these "new pages" appear at the bottom of the list, when they do appear, and hence perhaps still avoid some scrutiny.

The above solution is only good for enwiki though. If it's inordinately
difficult to add recent moves to Special:NewPages, then I think Option 2, and
creation of a proper Special:RecentMoves, would be the ideal case for most
Wikimedia wikis.

Ideally, a Special:RecentMoves would...

  • Be able to be filtered by moves within any namespace
  • Be able to be filtered down to cross-namespace moves

    Brian, I added Dan Garry to this bug, as it seems like it's a part of Platform's remit. Dan, does that sound reasonable?

Hmm, I wonder what the performance implications would be of SELECT * FROM recentchanges INNER JOIN page ON page_id = rc_cur_id WHERE rc_new = 1 AND rc_namespace != page_namespace AND page_namespace = 0; It seems like that would be a bit scarier than what normally can go on 'pedia. Then again, that's probably similar to the performance of Special:Recentchangeslinked, so maybe... [Saying this without doing any testing whatsoever, probably totally wrong].
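For anyone who wants to see what that join actually returns, here is a small sketch that runs it against an in-memory SQLite database. It says nothing about performance on a production-sized recentchanges table; the reduced schema and the three sample pages are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema and data (assumption): just enough columns for the join.
conn.executescript(
    """
    CREATE TABLE page (
        page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
    CREATE TABLE recentchanges (
        rc_id INTEGER PRIMARY KEY, rc_cur_id INTEGER,
        rc_namespace INTEGER, rc_title TEXT, rc_new INTEGER);

    -- Page 1 was created in user space (ns 2) and now lives in mainspace.
    -- Page 2 was created directly in mainspace.
    -- Page 3 was created in user space and is still there.
    INSERT INTO page VALUES
        (1, 0, 'Moved_article'),
        (2, 0, 'Plain_article'),
        (3, 2, 'Someone/Sandbox');
    INSERT INTO recentchanges (rc_cur_id, rc_namespace, rc_title, rc_new) VALUES
        (1, 2, 'Someone/Moved_article', 1),
        (2, 0, 'Plain_article', 1),
        (3, 2, 'Someone/Sandbox', 1);
    """
)

# The query from the comment: creation entries whose page has since
# ended up in the main namespace from somewhere else.
MOVED_INTO_MAINSPACE = """
    SELECT page_id, rc_namespace, rc_title, page_title
    FROM recentchanges
    INNER JOIN page ON page_id = rc_cur_id
    WHERE rc_new = 1
      AND rc_namespace != page_namespace
      AND page_namespace = 0
"""
moved = conn.execute(MOVED_INTO_MAINSPACE).fetchall()
```

Only page 1 is returned, with its original user-space title alongside its current mainspace title.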

bzimport added a comment.Via ConduitOct 16 2013, 9:58 PM

swalling wrote:

(In reply to comment #21)

> The above solution is only good for enwiki though. If it's inordinately
> difficult to add recent moves to Special:NewPages, then I think Option 2, and
> creation of a proper Special:RecentMoves, would be the ideal case for most
> Wikimedia wikis.
>
> Ideally, a Special:RecentMoves would...
>
> * Be able to be filtered by moves within any namespace
> * Be able to be filtered down to cross-namespace moves
>
> Brian, I added Dan Garry to this bug, as it seems like it's a part of
> Platform's remit. Dan, does that sound reasonable?

Hmm, I wonder what the performance implications would be of SELECT * FROM
recentchanges INNER JOIN page ON page_id = rc_cur_id WHERE rc_new = 1 AND
rc_namespace != page_namespace AND page_namespace = 0; It seems like that
would be a bit scarier than what normally can go on 'pedia. Then again,
that's probably similar to the performance of Special:Recentchangeslinked,
so maybe... [Saying this without doing any testing whatsoever, probably
totally wrong].

Stupid question: since this filtering is done by the user, is it perhaps better to gather the list of all moves, then do a filter for moves where the new/target namespace is mismatched on the client side, rather than in SQL? Since nothing would prevent users without JS from viewing the list, that seems like a nice to have feature.

bzimport added a comment.Via ConduitDec 12 2013, 12:47 AM

swalling wrote:

Poke. Any movement on this?

bzimport added a comment.Via ConduitDec 21 2013, 2:34 AM

cs wrote:

Just a review to ensure that I understand what is wanted:

Moves from the Draft namespace to mainspace should appear in the New Pages Feed.
They should appear in the NPF chronologically on the date the move was effected.

Legoktm added a comment.Via ConduitDec 21 2013, 6:00 AM

(In reply to comment #24)

Moves from the Draft namespace to mainspace should appear in the New Pages Feed.
They should appear in the NPF chronologically on the date the move was effected.

No. NPF is bug 36930 (I added it as a see also, not sure why it wasn't already). Also, what namespace the page is coming from shouldn't matter, it should work for any.

Peachey88 added a comment.Via ConduitDec 21 2013, 6:30 AM

I still feel the same about this as I did back in comment 4: NewPages, as its name suggests, should be for NEW pages, and moves aren't new pages. We should think about another log page (or pages) to display move actions specifically, with appropriate API queries, if we don't have them already.

bzimport added a comment.Via ConduitDec 21 2013, 6:48 AM

paolobenve wrote:

Maybe nobody has thought of this, but I'm proposing something "different":

Why can't we leave Special:NewPages as it is, and implement a Special:NewArticles page which only tracks new entries in ns0, either created or moved from nsX?

Peachey88 added a comment.Via ConduitDec 21 2013, 6:55 AM

Special:MovedPages would probably be better, from the aspect that not everyone uses the "Article" terminology when referring to pages.

Then just have a couple of filters on the page for "Moved from NS X" and "Moved to NS X" along with the standard date filters.

Nemo_bis added a comment.Via ConduitDec 21 2013, 8:43 AM

(In reply to comment #27)

Maybe nobody has thought of this, but I'm proposing something "different":

It's option 2 in comment 16. Option 2 seems the easiest, but it's IMHO clear that not forcing new page patrollers to learn about another special page would be ideal. Pages moved into a namespace are "new pages" for that namespace: from the interface point of view, we would only need a checkbox, activated when a filter by namespace is selected, that also shows pages moved in that namespace.

Unless someone has ideas to discuss about option 1 or some alternatives to the hack of option 3, I think it's time for any interested dev to stop talking and just submit a patch for whatever approach they manage to implement. :)

bzimport added a comment.Via ConduitDec 21 2013, 12:06 PM

paolobenve wrote:

I don't think I expressed my thought well in comment 27.

When I spoke of Special:NewArticles, as distinct from Special:NewPages, I meant a special page which tracks all new ns0 pages, regardless of whether they are created or moved from another ns.

The idea behind recentmoves in solution 2 of comment 16 is different.

If NewArticles isn't the best denomination for this page, let's search for a better expression.

Mattflaschen added a comment.Via ConduitDec 24 2013, 6:35 AM

(In reply to comment #30)

I don't think I expressed my thought well in comment 27.

When I spoke of Special:NewArticle, different from Special:NewPages, I meant
a special page which tracked all new ns0 pages, regardless of whether they are
created or moved from another ns.

However, people will also want to know about pages that appear in other namespaces (e.g. Project or Portal). They can appear in two ways, creation and move.

I am inclined to think it should be added to NewPages.

Nemo_bis added a comment.Via ConduitJan 23 2014, 7:25 AM

Until someone gets to implement one of the options in comment 16, it's worth noting that one simple workaround for WMF would be to raise $wgRCMaxAge considerably. That's OT here, so if someone wants to discuss it please open another bug. :)

Nemo_bis added a comment.Via ConduitJan 23 2014, 7:27 AM

(To be clear, simple but not trivial to approve, and workaround but only for some part.)

bzimport added a comment.Via ConduitJan 23 2014, 8:07 AM

paolobenve wrote:

What about creating a new special page, Special:NewArticles (perhaps a better name could be chosen)?

It could list new pages in ns0, regardless of whether they are created or moved from another ns.

Mattflaschen added a comment.Via ConduitJan 28 2014, 6:42 AM

(In reply to comment #34)

What about creating a new special page, Special:NewArticles (perhaps a better
name could be chosen)?

It could list new pages in ns0, regardless of whether they are created or
moved from another ns.

I don't think ns0 should have any special behavior for this use case. Special:NewPages already lets you filter by target namespace. People may be interested in other target namespaces, e.g. pages created/moved into the project namespace.

Krinkle added a comment.Via ConduitJun 5 2014, 9:45 AM

Classifying as bug. This allows pages to bypass review.

gerritbot added a comment.Via ConduitJul 22 2014, 6:33 AM

Change 148322 had a related patch set uploaded by Rohan013:
Show Page moves on Special:Newpages

https://gerrit.wikimedia.org/r/148322

Bawolff added a comment.Via ConduitJul 24 2014, 4:30 AM

Rohan's patch just gave me a (hacky) idea.

* Add an index on (rc_source, rc_log_type, rc_timestamp)
* Do the normal newpages query, taking note of the first and last timestamp returned.
* Do another query, something like: SELECT rc_title, rc_namespace, rc_params FROM recentchanges WHERE rc_source = 'mw.log' AND rc_log_type = 'move' AND rc_timestamp >= $minTimestamp AND rc_timestamp <= $maxTimestamp ORDER BY rc_timestamp DESC LIMIT max( 2*$whateverTheActualLimitIs, 500 )

And then in php look through rc_params for the target page filtering out those that don't match the namespace. (Originally I was thinking of doing a rc_params NOT LIKE '%:"<namespace>:' clause for each namespace, but seems more sane to do that filtering in php).

This works on the assumption that the total number of moves will be less than the number of page creations in a given namespace for a given time period. This is mostly true for the main namespace, probably not true for others. It has a limit check for sanity, which would potentially cause some page moves not to be shown if the limit is reached. I guess that's better than the current situation. Maybe we could give a warning in that case or something.
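The two-step idea above can be sketched roughly as follows, using an in-memory SQLite table in place of recentchanges. Several things here are assumptions for the sake of a runnable example: `rc_params` is treated as already decoded to a plain prefixed target title (the real field is a serialized structure), `NAMESPACE_IDS` is a hypothetical, shortened prefix table, the extra "source namespace differs" check follows the earlier comments rather than the query itself, and the rows are made up.

```python
import sqlite3

# Hypothetical prefix table (assumption); a real wiki has many more namespaces.
NAMESPACE_IDS = {"": 0, "Talk": 1, "User": 2, "Project": 4}


def target_namespace(prefixed_title):
    """Return the namespace id of a prefixed title such as 'User:Foo'."""
    prefix, sep, _ = prefixed_title.partition(":")
    if sep and prefix in NAMESPACE_IDS:
        return NAMESPACE_IDS[prefix]
    return 0  # no known prefix: treat as main namespace


def moves_into(conn, namespace, min_ts, max_ts, limit):
    """Fetch move log rows in the time window, then filter by target in code."""
    rows = conn.execute(
        "SELECT rc_title, rc_namespace, rc_params, rc_timestamp "
        "FROM recentchanges "
        "WHERE rc_source = 'mw.log' AND rc_log_type = 'move' "
        "AND rc_timestamp >= ? AND rc_timestamp <= ? "
        "ORDER BY rc_timestamp DESC LIMIT ?",
        (min_ts, max_ts, max(2 * limit, 500)),
    ).fetchall()
    # Keep cross-namespace moves whose target is the namespace being viewed.
    return [
        row for row in rows
        if row[1] != namespace and target_namespace(row[2]) == namespace
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recentchanges (rc_id INTEGER PRIMARY KEY, rc_timestamp TEXT, "
    "rc_namespace INTEGER, rc_title TEXT, rc_source TEXT, rc_log_type TEXT, "
    "rc_params TEXT)"
)
# The index suggested in the first bullet.
conn.execute(
    "CREATE INDEX rc_source_type_ts "
    "ON recentchanges (rc_source, rc_log_type, rc_timestamp)"
)
conn.executemany(
    "INSERT INTO recentchanges (rc_timestamp, rc_namespace, rc_title, "
    "rc_source, rc_log_type, rc_params) VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("20140724010000", 2, "Someone/Draft", "mw.log", "move", "Draft_article"),
        ("20140724020000", 0, "Old_name", "mw.log", "move", "New_name"),
        ("20140724030000", 0, "Essay", "mw.log", "move", "User:Someone/Essay"),
        ("20140725000000", 2, "Late/Draft", "mw.log", "move", "Late_article"),
    ],
)

# min/max timestamps would come from the normal newpages query.
found = moves_into(conn, 0, "20140724000000", "20140724235959", 50)
```

Only the user-space draft moved into mainspace inside the window survives the filter; the within-mainspace rename, the move out of mainspace and the move outside the window are dropped.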

Bawolff added a comment.Via ConduitJul 24 2014, 6:55 PM

(In reply to Bawolff (Brian Wolff) from comment #38)

Rohan's patch just gave me a (hacky) idea.

*Add an index on (rc_source, rc_log_type, rc_timestamp)
*Do the normal newpages query, taking note of the first and last timestamp
returned.
*Do another query, something like: SELECT rc_title, rc_namespace, rc_params
FROM recentchanges WHERE rc_source = 'mw.log' AND rc_log_type = 'move' and
rc_timestamp >= $minTimestamp AND rc_timestamp <= $maxTimestamp order by
rc_timestamp desc LIMIT max( 2*$whateverTheActualLimitIs, 500 )

And then in php look through rc_params for the target page filtering out
those that don't match the namespace. (Originally I was thinking of doing a
rc_params NOT LIKE '%:"<namespace>:' clause for each namespace, but seems
more sane to do that filtering in php).

This works on the assumption that the total number of moves will be less
than the number of page creations in a given namespace for a given time
period. This is mostly true for main namespace. Probably not true for
others. Has a limit check for sanity which would potentially cause some page
moves not to be shown if the limit is reached. I guess that's better than
the current situation. Maybe we could give a warning in that case or
something.

Or actually could just use logging table. Maybe add stuff to log_search table too for target look up.

Bawolff added a comment.Via ConduitJul 27 2014, 6:36 PM

So to flesh out comment 38, here's what I would suggest:

* In Title::moveToInternal, if $this->getNamespace() !== $nt->getNamespace(), then do $logEntry->setRelations( array( 'new_page_helper' => sprintf( "%05d|%s", $nt->getNamespace(), $logEntry->getTimestamp() ) ) );

* Then we do a second query, something along the lines of "SELECT log_namespace as rc_namespace, log_title as rc_title, ... FROM log_search INNER JOIN logging ON ls_log_id = log_id where ls_field = 'new_page_helper' and ls_value between " . sprintf( "%05d|%s", $targetNamespace, $firstTimestampReturnedFromMainQuery ) . " AND " . sprintf( "%05d|%s", $targetNamespace, $lastTimestampReturnedFromMainQuery ) . " ORDER BY ls_value LIMIT $limit";

Combine results using FakeResultWrapper in some manner similar to ImageListPager::combineResult.

When namespace is set to all, or to an inverted namespace, we could probably just query the logging table and then join on log_search, with ls_value NOT LIKE sprintf( "%05d|%%", $namespace ). This would potentially be a slightly more expensive query, but still probably fine, especially because people don't look at Special:NewPages for an inverted namespace all that often AFAIK. For the all-namespaces case I wonder if it's even appropriate to do anything, given that all moved pages already "existed" in at least one namespace, in some sense.
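Here is a rough sketch of why the zero-padded "namespace|timestamp" key works as a range filter, again on toy SQLite tables. The helper names `new_page_helper`, `record_move` and `moves_into`, the reduced logging/log_search schema and the sample moves are assumptions for illustration; in MediaWiki the relation would be written by the log entry code, not by hand.

```python
import sqlite3


def new_page_helper(namespace, timestamp):
    """Build the sortable key: zero-padded target namespace, '|', timestamp."""
    return "%05d|%s" % (namespace, timestamp)


conn = sqlite3.connect(":memory:")
# Toy schema (assumption): only the columns the query needs.
conn.executescript(
    """
    CREATE TABLE logging (
        log_id INTEGER PRIMARY KEY, log_namespace INTEGER, log_title TEXT);
    CREATE TABLE log_search (ls_field TEXT, ls_value TEXT, ls_log_id INTEGER);
    """
)


def record_move(log_id, old_ns, old_title, new_ns, timestamp):
    """Log a move; add the helper relation only for cross-namespace moves."""
    conn.execute(
        "INSERT INTO logging VALUES (?, ?, ?)", (log_id, old_ns, old_title)
    )
    if old_ns != new_ns:
        conn.execute(
            "INSERT INTO log_search VALUES ('new_page_helper', ?, ?)",
            (new_page_helper(new_ns, timestamp), log_id),
        )


def moves_into(namespace, first_ts, last_ts, limit):
    """Range-scan the helper keys for one target namespace and time window."""
    return conn.execute(
        "SELECT log_namespace, log_title FROM log_search "
        "INNER JOIN logging ON ls_log_id = log_id "
        "WHERE ls_field = 'new_page_helper' AND ls_value BETWEEN ? AND ? "
        "ORDER BY ls_value LIMIT ?",
        (
            new_page_helper(namespace, first_ts),
            new_page_helper(namespace, last_ts),
            limit,
        ),
    ).fetchall()


record_move(1, 2, "Someone/Draft", 0, "20140727120000")  # user -> main
record_move(2, 0, "Essay", 4, "20140727130000")          # main -> project
record_move(3, 0, "Old_name", 0, "20140727140000")       # rename within main
record_move(4, 2, "Someone/Other", 0, "20140801000000")  # outside the window

result = moves_into(0, "20140727000000", "20140727235959", 50)
```

Because the namespace is padded to a fixed width, all keys for one target namespace sort together and a plain BETWEEN on the string selects exactly one namespace and one time window.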

Bawolff added a comment.Via ConduitAug 9 2014, 4:41 PM
  • Bug 69324 has been marked as a duplicate of this bug.
Nemo_bis awarded a token.Via WebDec 12 2014, 8:05 AM
Qgil lowered the priority of this task from "High" to "Normal".Via WebJan 12 2015, 8:16 AM
Qgil added a subscriber: Qgil.

@rohan013, this is one of the oldest tasks assigned to someone. Are you planning to work on it?

In any case, this doesn't look like a current high priority task.

LuisV_WMF added a comment.Via WebJan 12 2015, 5:10 PM

It was originally marked high because it is actively exploited by paid editors. Assuming that is still going on and the folks who check new pages for advertising have not found a good workaround, I think this should still be high.

rohan013 added a comment.Via WebJan 12 2015, 5:22 PM

@Qgil I have already submitted a patch for review

https://gerrit.wikimedia.org/r/#/c/148322/

gerritbot added a comment.Via ConduitJan 12 2015, 5:30 PM

Change 148322 had a related patch set uploaded (by He7d3r):
Show Page moves on Special:Newpages

https://gerrit.wikimedia.org/r/148322

Patch-For-Review

Nemo_bis raised the priority of this task from "Normal" to "High".Via WebJan 12 2015, 9:28 PM
Deskana removed a subscriber: Deskana.Via WebJan 12 2015, 11:00 PM
Nemo_bis changed the title from "Page moves into mainspace should appear on Special:Newpages for patrol" to "Moved pages should be listed in NewPages/RecentChanges with current title and namespace".Via WebFeb 3 2015, 8:40 PM
Nemo_bis changed the title from "Moved pages should be listed in NewPages/RecentChanges with current title and namespace" to "Moved pages should be listed in NewPages/RecentChanges with current title, namespace, existence etc.".
Nemo_bis added subscribers: Nikerabbit, Aklapper, KuboF and 9 others.
Nemo_bis set Security to None.
JAnD added a comment.Via WebFeb 10 2015, 5:37 AM

Related bug T86491

Aklapper added a comment.Via WebApr 19 2015, 4:32 PM

Patch needs rework, see Gerrit. - @rohan013: Do you plan to rework your patch?

Cenarium added a subscriber: Cenarium.Via WebMay 8 2015, 4:21 PM

The cross-namespace autotag of T73236 might help solve the performance issues. In the meantime, I think I'll also introduce a 'move into mainspace' autotag so we can have this done in recentchanges at least.

gerritbot added a subscriber: gerritbot.Via ConduitMay 25 2015, 1:31 AM

Change 190656 had a related patch set uploaded (by Cenarium):
Allow patrolling of tagged changes with minimalist RC patrol

https://gerrit.wikimedia.org/r/190656
