Page MenuHomePhabricator

Redirects not appearing in New Page Patrol
Closed, ResolvedPublic

Description

Steps to reproduce:

  1. Go to a redirect which has been marked as patrolled (or perhaps an article that was marked as patrolled before it became a redirect)
  2. Start a new article

Actual results
Article is already marked as patrolled

Expected Results
Article shows up in New Page Patrol Queue

Event Timeline

This is an urgent issue for the enwiki community. This allows users to easily circumvent the new page patrol process, on pages that have usually been unwatched but are precisely the ones that are not notable enough for articles (that's why they're redirects in the first place). I hope that we'll see a fix for this as soon as possible. Thanks!

Is the current behavior a regression?

JJMC89 subscribed.

Is the current behavior a regression?

Yes

Yes. Per this document, expected and normal behavior in the past is that redirects converted to articles show in the queue. Also see this discussion for more local info. Thanks!

Adding Community-Tech since this is a regression and the team has been working on the extension.

Niharika added subscribers: Samwilson, MusikAnimal.

@MusikAnimal/@Samwilson Can you look into this? We've worked on redirects as part of T157046: Redirects with RfD tags should display in the feed under 'Nominated for deletion' but that code is not in production yet so that didn't cause this.

Putting this in the sprint as it is high priority for the community.

I think I've tracked it down to Hooks.php#L105. Revision::getPrevious() always returns null. This method lives in MediaWiki core. I reverted back to a version of MediaWiki from several months ago, and it's still returning null.

Has the bug reported here been there for that long? When was the last time we recall seeing an article-from-redirect in Special:NewPagesFeed, @Barkeep49 @Lixxx235 @JJMC89 ? I seem to recall seeing some not *that* long ago, but I'm not as active in NPP these days. It's possible my debugging is flawed, and this is a more recent regression.

Possibly related, the Revision class has been marked as deprecated since MediaWiki 1.31 (June 2018). This is the class that is being given to us by NewRevisionFromEditComplete, which is also in MediaWiki core. Maybe there's some software rot going on.

I regularly patrol from the old end of the queue and this is definitely a newer issue. It started at most only a day or two before May 18 when I became aware of it due to an undoing of my editing at https://en.wikipedia.org/wiki/2019_NCAA_Division_I_Baseball_Tournament. I had noticed that old articles weren't popping up in the queue as frequently but just figured that another patroller was getting to them first and hadn't thought it might be because of a bug.

Hmm okay. The last code change to PageTriage was on May 8 with https://gerrit.wikimedia.org/r/#/q/Ib1a13deb3be907b3224b990b440301f84e42c328, and I'm about 99% sure that isn't the culprit. I don't see any changes in PageTriage over the past month or two that could have caused this regression. So, I'm leaning towards blaming something in core. Something must be wrong with my setup, because I tried reverting back core as far as I could (until I ran into database changes), and I had the same issue. I shall dig deeper!

So in looking through my patrol logs it's been longer than I thought since I can find a patrol I did from a redirect. So I'm going to back-off my initial (strong) statement that it's new as I cannot find any patrol I did more recent than May 7th. I have asked on the NPP talk page to see if anyone has something more recent than that but it's possible that it was tied to that update on the 8th.

Interesting. The thing is, I reverted that change and I still see the problem on my local. But, I'm a bit convinced something is wrong my local anyway...

@Krinkle do you think https://gerrit.wikimedia.org/r/#/q/Ib1a13deb3be907b3224b990b440301f84e42c328 could have caused this? See my assessment at T223828#5202240.

Interesting. The thing is, I reverted that change and I still see the problem on my local. But, I'm a bit convinced something is wrong my local anyway...

@Krinkle do you think https://gerrit.wikimedia.org/r/#/q/Ib1a13deb3be907b3224b990b440301f84e42c328 could have caused this? See my assessment at T223828#5202240.

I agree with your assessment that it's very unlikely. I've double-checked and triple-checked it just now. It only removes return true statements in hook handlers which MW treats identically to void returns. It also does not change any execution flow, such as early returns.

If you find that it behaves correctly before that change, however, feel free to revert and ask questions later. No problem :)

I am not familiar with how patrolled-redirects-that-become-articles are added to the PageTriage queue. This is the first I hear that this feature existed. It sounds pretty cool, but knowing MediaWiki core's logic for this, I would have assumed that this "bug" still existed today.

If that is not the case, then I would think this must be a custom behaviour that PageTriage invented, in which case I defer to the Growth Team and their expertise on the topic. They're already tagged, but some extra pings: @kostajh, @Catrope.

Thanks. Back to the drawing board, I suppose. I can confirm Revision::getPrevious() does by itself work, for instance if I grab an arbitrary revision. It's just within the deferred update the previous revision is apparently unknown.

I've re-enabled https://en.wikipedia.org/wiki/Special:AbuseFilter/342 as an interim solution. You can monitor the log to identify articles created from redirects.

JTannerWMF subscribed.

It appears Community-Tech is working on this.

Is there any update on this? The pages affected by this bug continues to grow. If there is more information you need from the NPP community I am happy to attempt to solicit it as I did above.

No more information needed from you, thank you. We just need to figure out how to fix this :) Sorry for the wait, we got tied up with other un-break-now things. This is back to our top priority. I can't give an estimate on when it will be fixed, but I have reached out to teams to get help, since I'm convinced the issue is in Core and not Page Curation.

I'm hoping https://en.wikipedia.org/wiki/Special:AbuseFilter/342 and https://en.wikipedia.org/w/index.php?title=Special:RecentChanges&tagfilter=mw-removed-redirect are enough to hold you off in the meantime?

Change 514108 had a related patch set uploaded (by MusikAnimal; owner: MusikAnimal):
[mediawiki/extensions/PageTriage@master] Fix check for articles created from redirects

https://gerrit.wikimedia.org/r/514108

Update: I still have not figured out how this broke, but I figured out a way to fix it =p

I added a new test that requires a change be made in Core: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/514105

The PageTriage patch is up for review https://gerrit.wikimedia.org/r/514108 which Depends-On the aforementioned patch.

A quick guess at why this happens:

The relevant extension code is triggered by a hook immediately after the new revision is written to the database. The getPrevious() call however defaults to running queries against a replica. That replica is likely to not yet know the new revision.

RevisionStore::getPrevisousRevision lets you pass query flags to work around this. The old Revision class doesn't have that feature.

Using the revision's parent ID also works, but it should be noted that "previous revision" and "parent revision" are not always the same. Though as far as I know, they are always the same for a newly created revision.

A quick guess at why this happens:

The relevant extension code is triggered by a hook immediately after the new revision is written to the database. The getPrevious() call however defaults to running queries against a replica. That replica is likely to not yet know the new revision.

RevisionStore::getPrevisousRevision lets you pass query flags to work around this. The old Revision class doesn't have that feature.

Using the revision's parent ID also works, but it should be noted that "previous revision" and "parent revision" are not always the same. Though as far as I know, they are always the same for a newly created revision.

Thanks for the pointers!

I was wondering the same thing about DB_REPLICA, but the old code always used the replica, as far as I can tell? I changed Revision::getPrevious() to use READ_LATEST on my local, and I'm still getting null within onNewRevisionFromEditComplete. I'm convinced something changed in Core, because getPrevious() reliably worked up until sometime in early May.

I noticed the only other place where onNewRevisionFromEditComplete is being used by WMF, and looking at the previous revision, is Wikibase. So I just copied that and used Revision::newFromId, and viola, it worked.

I'm content with this workaround but it may still be worthwhile to find out what broke in Core, or rewrite onNewRevisionFromEditComplete to use the newer revision classes.

I was wondering the same thing about DB_REPLICA, but the old code always used the replica, as far as I can tell? I changed Revision::getPrevious() to use READ_LATEST on my local, and I'm still getting null within onNewRevisionFromEditComplete.

I'd love to know why that is. It's really confusing to me, that really shouldn't happen.

Change 514108 merged by jenkins-bot:
[mediawiki/extensions/PageTriage@master] Fix check for articles created from redirects

https://gerrit.wikimedia.org/r/514108

dom_walden subscribed.

For a redirect that has been created either as a result of a page move or someone adding "#REDIRECT [[other_page]]".

If the redirect gets turned into an article (by removing "#REDIRECT...")

  1. it becomes unreviewed, if it was reviewed before
  2. it appears in Special:NewPagesFeed when filtering for "Unreviewed pages" and "All others"

and so should appear to patrollers.

This appears to satisfy the issue raised in the description and the way NPP is documented (https://en.wikipedia.org/wiki/Wikipedia:New_pages_patrol#Redirects_converted_to_articles).

This remains an issue of concern for the community - there was recent discussion about this still being unresolved in the NPP Discord channel. Any update?

This remains an issue of concern for the community - there was recent discussion about this still being unresolved in the NPP Discord channel. Any update?

A fix is on the way! There were no deploys last week, or else it would have gone live by now. We'll try to SWAT deploy as soon as possible, otherwise it'll be on enwiki this Thursday. Sorry for the long wait!

Thanks. The lack of deploys last week was what was confusing to me. Appreciate your work and communication.

Change 517563 had a related patch set uploaded (by Niharika29; owner: MusikAnimal):
[mediawiki/extensions/PageTriage@wmf/1.34.0-wmf.8] Fix check for articles created from redirects

https://gerrit.wikimedia.org/r/517563

This will go out with the train. Trying to SWAT it failed on the test so let's try sending it on the train. To be tested on testwiki Tuesday afternoon (PST).

Change 517563 abandoned by Niharika29:
Fix check for articles created from redirects

Reason:
Decided to go with the train

https://gerrit.wikimedia.org/r/517563

I tested this by creating https://test.wikipedia.org/w/index.php?title=Foobar_redirect&redirect=no using a sysop account, then turning it into an article using an account without any additional rights. The article showed up in the queue as expected. This confirms the system is correctly getting the previous edit, which is what was broken. I did not do any tests on really old redirects; those can be hard to find.

If all goes well you'll start seeing articles-from-redirects in the English Wikipedia queue this Thursday.

Let's not forget to disable https://en.wikipedia.org/wiki/Special:AbuseFilter/342

@MusikAnimal Can I suggest not disabling it for a bit, so that it can be used to verify that the queue is working as expected (i.e. all pages that had the tag added by 342 should have also shown up in the queue)

@MusikAnimal Can I suggest not disabling it for a bit, so that it can be used to verify that the queue is working as expected (i.e. all pages that had the tag added by 342 should have also shown up in the queue)

No problem! Good idea.

An update that the deployment train is apparently blocked due to T226109 (unrelated to our work).

Last week's deployment branch is now on English Wikipedia. Articles-from-redirects appear to be showing up in the feed again. Could you confirm, @Barkeep49, @DannyS712 ?

I can confirm that things appear to be working correctly now.

Great! Sorry for the long wait. I have gone ahead and disabled https://en.wikipedia.org/wiki/Special:AbuseFilter/342

Notifying @Niharika that all seems well.

Awesome! Thanks for your work, @MusikAnimal!