
Unexpected namespaces in pagetriage_page table
Open, Low · Public · 2 Estimated Story Points · BUG REPORT

Description

https://quarry.wmcloud.org/query/68437

Namespaces 0 (main), 2 (user), and 118 (draft) are expected in this table because they are the namespaces PageTriage patrols. However, the table also contains rows from a number of other namespaces. Is this a bug? If so, let's track it down and fix it.


  • Fix the root cause of the bugs
    1. Hooks.php -> onPageMoveComplete, the last conditional should check if page is being moved out of a namespace PageTriage patrols. If so, run $pageTriage->deleteFromPageTriage()
    2. includes/Maintenance/RemoveOldRows.php should delete from all namespaces after 30 days but doesn't. The bug is on line 72, where the $secondaryNamespaces variable is declared. Despite a comment stating otherwise, it contains an incomplete list of namespaces.
  • Re-run this Quarry query after the two patches are deployed and after 30 days elapse (giving the cron job time to delete the old entries). If the counts start swelling again, track down the offending code and fix it. Rinse and repeat.
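
To make fix 1 concrete, here is a rough sketch of the check in Python rather than the extension's PHP. It is a toy model, not the actual Hooks.php code: `PATROLLED_NAMESPACES`, `should_delete_on_move`, `on_page_move`, and the `queue` dict are hypothetical names, and the namespace IDs are simply the three listed in this task.

```python
# Toy model: namespace IDs follow the task description (0 main, 2 user, 118 draft).
PATROLLED_NAMESPACES = frozenset({0, 2, 118})


def should_delete_on_move(old_ns: int, new_ns: int) -> bool:
    """Return True when a move takes a page out of the patrolled namespaces."""
    return old_ns in PATROLLED_NAMESPACES and new_ns not in PATROLLED_NAMESPACES


def on_page_move(queue: dict, page_id: int, old_ns: int, new_ns: int) -> None:
    """Hypothetical stand-in for the move hook; `queue` maps page_id -> namespace."""
    if should_delete_on_move(old_ns, new_ns):
        # Plays the role of $pageTriage->deleteFromPageTriage() in the real hook.
        queue.pop(page_id, None)
    elif page_id in queue:
        # Page stays in the queue; only its recorded namespace changes.
        queue[page_id] = new_ns
```

For example, moving a queued mainspace page to namespace 4 drops it from `queue`, while a draft-to-main move keeps it.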

Event Timeline

Restricted Application added a subscriber: Aklapper.

I checked a few of these at random. They appear to have been moved from one of the valid namespaces.

Good find. Thank you for that MPGuy2824.

Here's a query that counts all pages in pagetriage_page whose ptrp_reviewed_updated was last touched more than one month ago, excluding draftspace. Why one month? It's a common threshold for some of the PageTriage delete cron jobs.

https://quarry.wmcloud.org/query/68541
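
For anyone who wants to try this kind of count locally, here is a minimal sketch using Python's sqlite3. It is not the Quarry query itself. It assumes a cut-down schema with only `page(page_id, page_namespace)` and `pagetriage_page(ptrp_page_id, ptrp_reviewed_updated)`, 14-digit `YYYYMMDDHHMMSS` timestamps, and draft namespace 118; the function name is made up.

```python
import sqlite3
from datetime import datetime, timedelta


def stale_counts_by_namespace(conn, now, draft_ns=118, days=30):
    """Count queue rows older than `days`, grouped by namespace, skipping drafts."""
    # Fixed-width timestamp strings compare correctly as plain text.
    cutoff = (now - timedelta(days=days)).strftime("%Y%m%d%H%M%S")
    rows = conn.execute(
        """
        SELECT page_namespace, COUNT(*)
        FROM pagetriage_page
        JOIN page ON page_id = ptrp_page_id
        WHERE ptrp_reviewed_updated < ?
          AND page_namespace != ?
        GROUP BY page_namespace
        ORDER BY page_namespace
        """,
        (cutoff, draft_ns),
    ).fetchall()
    return dict(rows)
```

The result maps namespace ID to row count, so any key other than 0 or 2 points at rows that probably shouldn't be there.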

And a list of pages, excluding draftspace.

https://quarry.wmcloud.org/query/68542

Some of these are not from page moves, but many are.

These rows just clutter the table: our normal cleanup cron jobs skip them, so all they do is swell the SQL tables. I think the path forward is...

  • Change move hook to delete from pagetriage_page and pagetriage_page_tags when moving a page to an unsupported namespace
  • Ask DBAs to do a one time deletion of these entries, probably by running maintenance/cleanupPageTriage.php
  • Keep an eye on this / query this after the DBA deletion. If it starts swelling again, track down the offending code and fix it. Rinse and repeat.

Looks like the # of buggy rows increased by 600 in 1 month.

I think there are two bugs here:

  1. Hooks.php -> onPageMoveComplete, line 95 should check if page is being moved out of a namespace PageTriage patrols. If so, run $pageTriage->deleteFromPageTriage()
  2. cron job at cron/updatePageTriageQueue.php should delete from all namespaces after 30 days but doesn't. The bug is on line 77, where the $secondaryNamespaces variable is declared. Despite a comment stating otherwise, it contains an incomplete list of namespaces.
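
To illustrate bug 2, here is a small Python sketch of how a hard-coded namespace list can leave stale rows behind. It is not the cron job's real code: `rows_to_delete` and its parameters are hypothetical, and it assumes "all namespaces" means every namespace other than main (0).

```python
from datetime import datetime, timedelta

MAIN_NS = 0


def rows_to_delete(rows, now, days=30, allowlist=None):
    """Pick stale non-main rows from (page_id, namespace, last_updated) tuples.

    allowlist=None selects every non-main namespace; passing a set mimics
    a hard-coded, possibly incomplete, list of secondary namespaces.
    """
    cutoff = now - timedelta(days=days)
    selected = []
    for page_id, ns, updated in rows:
        if ns == MAIN_NS or updated >= cutoff:
            continue  # main namespace is handled elsewhere; recent rows are kept
        if allowlist is not None and ns not in allowlist:
            continue  # this is where an incomplete list silently skips rows
        selected.append(page_id)
    return selected
```

With an allowlist of `{2, 118}`, stale rows in namespaces such as 4 or 10 are never selected, which matches the kind of leftover rows seen in the queries above.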

I think I'll refactor updatePageTriageQueue.php, then we can squash the two bugs with small patches.

Change 865581 had a related patch set uploaded (by Novem Linguae; author: Novem Linguae):

[mediawiki/extensions/PageTriage@master] [WIP] Refactor updatePageTriageQueue.php

https://gerrit.wikimedia.org/r/865581

Change 868635 had a related patch set uploaded (by Novem Linguae; author: Novem Linguae):

[mediawiki/extensions/PageTriage@master] Extract cron job class into its own file

https://gerrit.wikimedia.org/r/868635

Change 868635 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] Extract cron job class into its own file

https://gerrit.wikimedia.org/r/868635

Change 865581 abandoned by Novem Linguae:

[mediawiki/extensions/PageTriage@master] [WIP] Refactor updatePageTriageQueue.php

Reason:

I ended up splitting this into smaller patches

https://gerrit.wikimedia.org/r/865581

Samwalton9-WMF set the point value for this task to 2.

Note that there appears to be a maintenance script that cleans these up: maintenance/cleanupPageTriage.php. But I still think it'd be best to fix the root causes. See above for the 2 bugs I suspect are causing this.

Novem_Linguae updated the task description.

Hi @Novem_Linguae, has this bug been fixed?

Nope. Not yet. I've updated the task to be more accurate about what needs to be fixed.