
Refactor Category::refreshCounts logic to a job and simplify
Open, LowPublic

Description

As of https://gerrit.wikimedia.org/r/506032 we now have four ways of updating category counts:

1. If a non-locking master read says the stale count is zero, we do a full recount.

This is used after an edit to a page, for the categories that were in the previous revision, but not in the new one. (From LinksUpdate, via WikiPage::updateCategoryCounts).

2. If a non-locking master read says the stale count is <= 200, we do a full recount.

This is used for a category after its category description page is deleted.

3. If no row exists yet, or it appears corrupt, we do a full recount.

This can happen through any of the following scenarios:

  • Reading a category page.
  • Viewing "Page information" (action=info) for a category page.
  • Parsing wikitext containing {{pagesincategory}}.
  • Viewing search results on Special:Search for a match that is a category page.
  • UploadWizard/ApiQueryAllCampaigns for querying the file count from a campaign's category.

This is triggered whenever one of these methods is called on a Category object: getPageCount(), getSubcatCount(), getFileCount(), or getTitle(). This then uses the path via Category->initialize( Category::LAZY_INIT_ROW ).

4. Relative increments/decrements (including creation/deletion of the row)

From WikiPage::updateCategoryCounts after edits for categories associated with that page.
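
The four paths above can be read as a single decision on the trigger and the stale count. The following is a minimal Python sketch of that reading, not MediaWiki code: the `Trigger` names and the `decide` function are hypothetical, and it assumes that cases 1 and 2 skip the full recount when the stale count is above their threshold or when no row exists. Only the thresholds (zero and 200) come from the description.

```python
from enum import Enum, auto
from typing import Optional


class Trigger(Enum):
    """Hypothetical labels for the four paths in the description."""
    REMOVED_BY_EDIT = auto()      # case 1: category dropped from a page by an edit
    DESCRIPTION_DELETED = auto()  # case 2: category description page deleted
    LAZY_READ = auto()            # case 3: a Category getter found no usable row
    EDIT_DELTA = auto()           # case 4: relative increment/decrement


def decide(trigger: Trigger, stale_count: Optional[int],
           row_corrupt: bool = False) -> str:
    """Return 'recount', 'delta' or 'none'.

    stale_count is the non-locking read of the stored count,
    or None when no row exists yet.
    """
    if trigger is Trigger.EDIT_DELTA:
        return "delta"
    if trigger is Trigger.LAZY_READ:
        return "recount" if stale_count is None or row_corrupt else "none"
    if stale_count is None:
        # Assumption: cases 1 and 2 leave a missing row to case 3.
        return "none"
    if trigger is Trigger.REMOVED_BY_EDIT:
        return "recount" if stale_count == 0 else "none"
    # Remaining trigger: DESCRIPTION_DELETED
    return "recount" if stale_count <= 200 else "none"
```

Written this way, cases 1 to 3 differ only in the condition that leads to the same full recount, which is what makes a single shared job plausible.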


I'd like to re-explore whether we still need case 3. In theory, at least, it shouldn't be needed. If we can validate that relatively easily, I propose we remove it and instead log a warning with a stack trace, so that we can find out why it happens and whether it is preventable.
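
As an illustration of the "warn instead of recount" idea, here is a minimal Python sketch. It is not the MediaWiki implementation: the logger name, the `read_count` function, the row shape, the rule that a negative count means "corrupt", and the fallback value of 0 are all assumptions made for the example.

```python
import logging
from typing import Optional

logger = logging.getLogger("category_counts")  # hypothetical channel name


def read_count(row: Optional[dict], category: str) -> int:
    """Return the stored page count without ever recounting on read."""
    # Assumption: a missing row or a negative count is what "corrupt" means.
    if row is None or row.get("pages", -1) < 0:
        # stack_info=True attaches the current call stack to the log record,
        # so the log shows which read path hit the bad row.
        logger.warning("Missing or corrupt count row for %s", category,
                       stack_info=True)
        return 0  # assumed fallback
    return row["pages"]
```

The stack trace in each warning would show which of the read paths listed under case 3 still reaches a missing row.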

Alternatively, if it cannot be prevented within reason (e.g. too costly or impossible to get right given scale requirements), then I suggest we move the recount to a job and reduce cases 1, 2 and 3 to queuing that job.
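
A rough Python sketch of what "reduce to queuing a job" could look like. The `RecountQueue` class is a hypothetical in-memory stand-in, not MediaWiki's job queue, and ignoring duplicate pushes for the same category is an assumption of this sketch rather than something the task specifies.

```python
from collections import deque
from typing import Deque, Set


class RecountQueue:
    """Hypothetical in-memory queue that ignores duplicate pending jobs."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self._queued: Set[str] = set()

    def push(self, category: str) -> bool:
        """Queue a recount; return False if one is already pending."""
        if category in self._queued:
            return False
        self._queued.add(category)
        self._pending.append(category)
        return True

    def pop(self) -> str:
        """Take the oldest pending category off the queue."""
        category = self._pending.popleft()
        self._queued.discard(category)
        return category

    def __len__(self) -> int:
        return len(self._pending)


def on_trigger(queue: RecountQueue, category: str) -> bool:
    """Cases 1, 2 and 3 all collapse to the same action: queue a recount."""
    return queue.push(category)
```

With this shape the request that notices a suspicious count only pays for a queue push, and the expensive recount happens later in the job.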

  • Document and/or reference from the code how case 2 is possible.
    • If rare/unlikely:
      • Consider removing in favour of a manual recount admins can trigger via purge of the category page.
    • If common and not easily preventible:
      • Move to job queue as a "validate recount", emit log warning if result turned out different.
  • Determine whether case 3 is still probable.
    • If so:
      • Move refresh logic (recount, auto-create, auto-delete) to a job and queue that for case 1, 2, and 3.
    • If not:
      • Replace recount with a log warning from case 1 and 3.
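
The job described in the checklist above (recount, auto-create, auto-delete, warn if the result differs) could be sketched as follows. This is an illustrative Python model under stated assumptions, not the MediaWiki implementation: `run_recount_job`, the dict used as the counts table, the logger name, and the rule that a row is deleted when the category has no members are all hypothetical.

```python
import logging
from typing import Dict, List

logger = logging.getLogger("category_counts")  # hypothetical channel name


def run_recount_job(category: str, members: List[str],
                    counts: Dict[str, int]) -> bool:
    """Recount `category`, repair its stored row, return True if it had drifted."""
    actual = len(members)
    stored = counts.get(category)  # None when no row exists

    if actual == 0:
        counts.pop(category, None)   # auto-delete (assumed rule: no members)
    else:
        counts[category] = actual    # auto-create or correct the row

    # A missing row is treated as a stored count of zero for comparison.
    drifted = (stored or 0) != actual
    if drifted:
        logger.warning("Category %s: stored count %s, recount %d",
                       category, stored, actual)
    return drifted
```

If such warnings turn out to be rare in practice, that would support dropping the automatic recount in favour of the warning alone.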

Event Timeline

Krinkle created this task. Apr 24 2019, 5:15 PM
Restricted Application added subscribers: Liuxinyu970226, Aklapper. Apr 24 2019, 5:15 PM

The problem persists. Any plans to have this resolved? Any chance to trigger the recount manually?

Huji added a comment. May 25 2019, 1:54 PM

I moved T224321 in the task hierarchy so that the current task is its parent. @Superyetkin, that is essentially what you are asking for. @Krinkle, I think it would be nice to start over from a clean state; this can help with identifying the root causes of mismatches, whether before or after #3 is refactored.

Reedy changed the status of subtask T224321: Run populateCategory.php from Open to Stalled. May 25 2019, 7:34 PM
Reedy updated the task description. May 26 2019, 11:48 AM
Wargo added a subscriber: Wargo. May 29 2019, 8:05 AM

Problems with counting start from T224209.

aaron triaged this task as Low priority. Jun 6 2019, 10:45 AM