Page MenuHomePhabricator

CommonsMetadata extension is triggering a duplicate parse in commons
Closed, ResolvedPublic

Event Timeline

Ladsgroup created this task.
Ladsgroup updated the task description. (Show Details)

The extension itself didn't have any change in the past month, possibly it's triggered by a change in core.

Ladsgroup renamed this task from CommonsMetadata extension is triggering a duplicate parse in commons in every edit to CommonsMetadata extension is triggering a duplicate parse in commons in every page view.Feb 9 2022, 2:55 AM

The hook handler is onSkinAfterBottomScripts meaning it's on every page view...

Ladsgroup renamed this task from CommonsMetadata extension is triggering a duplicate parse in commons in every page view to CommonsMetadata extension is triggering a duplicate parse in commons.Feb 9 2022, 3:03 AM
  • The examples I could find are all from SkinMinerva not Vector or other skins. Is this right?
  • Is this "the same request is now duplicating the parse" or "there are many more requests", e.g. could this just be a cache invalidation semi-stampede?
  • The examples I could find are all from SkinMinerva not Vector or other skins. Is this right?

Nice catch, parsercache entries of mobile views are fragmented using responsiveimages key (don't ask why) and if you make the search minus it with -"responsiveimages", there is only a few from commons.

  • Is this "the same request is now duplicating the parse" or "there are many more requests", e.g. could this just be a cache invalidation semi-stampede?

First for sure, but possibly the latter as well. i.e. it might have been duplicate parses all the time but due to some cache invalidation (which should not be done like this anyway) we are seeing a flood in logs but the first part is for sure.

Change 761993 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] ParserOutputAccess: Cache Parsing inside the class as well

https://gerrit.wikimedia.org/r/761993

Tested in mwdebug1001 confirming this fixes the issue ^

@tstarling We are having trouble understanding the root cause of this, see discussion on the patch. It may be related to PoolCounter, could you have a look?

So I just write brain dump of four hours of work and pulling my hair:

How to reproduce? Go to any image in commons, Disable FormatMetadata cache (in /srv/mediawiki/php-1.38.0-wmf.21/includes/media/FormatMetadata.php, the cache with getExtendedMetadata key). And request a purge

Notes I've seen:

  • There is nothing in configuration of PoolCounter in mediawiki, the class returned is MediaWiki\Extension\PoolCounter\Client. While I see for both PoolCounter work objects, we get two different PoolCounter objects so it can't use its own class cache (I tried to fix it in PoolCounter::factory() by adding a static per-type per-key object holder, it started to be in the same object and yet double parsed it)
  • Both of them get to PoolWorkArticleViewCurrent and not different keys. The keys are the same.
  • PoolCounter considers both of them cacheable() and enters the $this->isFastStaleEnabled() section in PoolCounterWork::execute().
  • The value of isFastStaleEnabled is true in both cases which made me think maybe the only recent change in poolcounter directory in mediawiki caused it: https://github.com/wikimedia/mediawiki/commit/ca71e69fc6fc4b6a90a961136c2c4b4c5e0ac792#diff-493b98036078af4c901fd431b8c6fd5cdf7720f512f5ffce47d0dd032fb5d7e1 which re-throws the timeout error
    • But reverting it didn't fix the issue (maybe I need to double test?) Or maybe we need to remove another re-throw?
  • In both cases PoolCounter responds with LOCKED (value = 1), letting the double parse to go ahead.
  • The other problem is that even if PoolCounter responds with DONE (=cached) instead of LOCKED (lock acquired). PoolWorkArticleViewCurrent::getCachedWork() simply tries to read ParserCache and fails because it's not saved there yet.

That's all I've got.

Change 762914 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@wmf/1.38.0-wmf.21] ParserOutputAccess: Cache Parsing inside the class as well

https://gerrit.wikimedia.org/r/762914

Change 762915 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@wmf/1.38.0-wmf.22] ParserOutputAccess: Cache Parsing inside the class as well

https://gerrit.wikimedia.org/r/762915

Change 761993 merged by jenkins-bot:

[mediawiki/core@master] ParserOutputAccess: Add process cache within the service class

https://gerrit.wikimedia.org/r/761993

Change 762915 merged by jenkins-bot:

[mediawiki/core@wmf/1.38.0-wmf.22] ParserOutputAccess: Cache Parsing inside the class as well

https://gerrit.wikimedia.org/r/762915

Change 762914 merged by jenkins-bot:

[mediawiki/core@wmf/1.38.0-wmf.21] ParserOutputAccess: Cache Parsing inside the class as well

https://gerrit.wikimedia.org/r/762914

Mentioned in SAL (#wikimedia-operations) [2022-02-16T16:06:57Z] <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.22/includes/page/ParserOutputAccess.php: Backport: [[gerrit:762915|ParserOutputAccess: Cache Parsing inside the class as well (T301310)]] (duration: 00m 54s)

Mentioned in SAL (#wikimedia-operations) [2022-02-16T16:10:31Z] <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.21/includes/page/ParserOutputAccess.php: Backport: [[gerrit:762914|ParserOutputAccess: Cache Parsing inside the class as well (T301310)]] (duration: 00m 52s)

Ladsgroup lowered the priority of this task from High to Low.
Ladsgroup removed a project: DBA.

This is back to the previous rate. No increase in memory error (or other types of errors) has been observed.

image.png (203×847 px, 16 KB)

DBA's work on this as a fix to avoid overloading the database is done. I leave it open for finding the root cause.

Ladsgroup claimed this task.
Ladsgroup raised the priority of this task from Low to High.
Ladsgroup added a project: DBA.

On a second thought, it seems it's a different issue. I filed T302008: Improve in-process caching of PoolWorkArticleViewCurrent and close this instead to reflect the reality that duplicate parses are gone now.