
Many `request has exceeded memory limit` fatal errors for wikidata jobrunner
Closed, Resolved (Public)

Description

Error

Request URL: N/A (job runner)
Request ID: XSMHqwpAAEwAAGXdU9AAAADV

message
/rpc/RunSingleJob.php   PHP Fatal Error from line 79 of /srv/mediawiki/php-1.34.0-wmf.11/vendor/wikibase/data-model/src/Entity/ItemId.php: request has exceeded memory limit
trace
#0 /srv/mediawiki/php-1.34.0-wmf.11/vendor/wikibase/data-model/src/Entity/ItemId.php(79): NO_FUNCTION_GIVEN()
#1 [internal function]: Wikibase\DataModel\Entity\ItemId->unserialize(string)
#2 /srv/mediawiki/php-1.34.0-wmf.11/vendor/wikibase/data-model/src/Entity/EntityIdValue.php(50): unserialize(string)
#3 [internal function]: Wikibase\DataModel\Entity\EntityIdValue->unserialize(string)
#4 /srv/mediawiki/php-1.34.0-wmf.11/vendor/wikibase/data-model/src/Snak/PropertyValueSnak.php(67): unserialize(string)
#5 [internal function]: Wikibase\DataModel\Snak\PropertyValueSnak->unserialize(string)
#6 [internal function]: Memcached->getByKey(string, string, NULL, NULL)
#7 /srv/mediawiki/php-1.34.0-wmf.11/includes/libs/objectcache/MemcachedPeclBagOStuff.php(154): Memcached->get(string, NULL, NULL)
#8 /srv/mediawiki/php-1.34.0-wmf.11/includes/libs/objectcache/BagOStuff.php(206): MemcachedPeclBagOStuff->doGet(string, integer)
#9 /srv/mediawiki/php-1.34.0-wmf.11/extensions/Wikibase/lib/includes/Store/EntityRevisionCache.php(74): BagOStuff->get(string)
#10 /srv/mediawiki/php-1.34.0-wmf.11/extensions/Wikibase/lib/includes/Store/CacheRetrievingEntityRevisionLookup.php(104): Wikibase\Lib\Store\EntityRevisionCache->get(Wikibase\DataModel\Entity\ItemId)
#11 /srv/mediawiki/php-1.34.0-wmf.11/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(84): Wikibase\Lib\Store\CacheRetrievingEntityRevisionLookup->getEntityRevisionFromCache(Wikibase\DataModel\Entity\ItemId, integer, string)
#12 /srv/mediawiki/php-1.34.0-wmf.11/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(104): Wikibase\Lib\Store\CachingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#13 /srv/mediawiki/php-1.34.0-wmf.11/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(87): Wikibase\Lib\Store\CachingEntityRevisionLookup->fetchEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#14 /srv/mediawiki/php-1.34.0-wmf.11/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php(39): Wikibase\Lib\Store\CachingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId)
#15 /srv/mediawiki/php-1.34.0-wmf.11/vendor/wikibase/data-model-services/src/Lookup/RedirectResolvingEntityLookup.php(51): Wikibase\Lib\Store\RevisionBasedEntityLookup->getEntity(Wikibase\DataModel\Entity\ItemId)
#16 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/Helper/ExceptionIgnoringEntityLookup.php(37): Wikibase\DataModel\Services\Lookup\RedirectResolvingEntityLookup->getEntity(Wikibase\DataModel\Entity\ItemId)
#17 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/Checker/SymmetricChecker.php(106): WikibaseQuality\ConstraintReport\ConstraintCheck\Helper\ExceptionIgnoringEntityLookup->getEntity(Wikibase\DataModel\Entity\ItemId)
#18 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/DelegatingConstraintChecker.php(566): WikibaseQuality\ConstraintReport\ConstraintCheck\Checker\SymmetricChecker->checkConstraint(WikibaseQuality\ConstraintReport\ConstraintCheck\Context\MainSnakContext, WikibaseQuality\ConstraintReport\Constraint)
#19 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/DelegatingConstraintChecker.php(460): WikibaseQuality\ConstraintReport\ConstraintCheck\DelegatingConstraintChecker->getCheckResultFor(WikibaseQuality\ConstraintReport\ConstraintCheck\Context\MainSnakContext, WikibaseQuality\ConstraintReport\Constraint)
#20 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/DelegatingConstraintChecker.php(375): WikibaseQuality\ConstraintReport\ConstraintCheck\DelegatingConstraintChecker->checkConstraintsForMainSnak(Wikibase\DataModel\Entity\Item, Wikibase\DataModel\Statement\Statement, NULL, array)
#21 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/DelegatingConstraintChecker.php(347): WikibaseQuality\ConstraintReport\ConstraintCheck\DelegatingConstraintChecker->checkStatement(Wikibase\DataModel\Entity\Item, Wikibase\DataModel\Statement\Statement, NULL, array)
#22 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/ConstraintCheck/DelegatingConstraintChecker.php(156): WikibaseQuality\ConstraintReport\ConstraintCheck\DelegatingConstraintChecker->checkEveryStatement(Wikibase\DataModel\Entity\Item, NULL, array)
#23 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/Api/CheckingResultsSource.php(54): WikibaseQuality\ConstraintReport\ConstraintCheck\DelegatingConstraintChecker->checkAgainstConstraintsOnEntityId(Wikibase\DataModel\Entity\ItemId, NULL, array, array)
#24 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/Api/CachingResultsSource.php(247): WikibaseQuality\ConstraintReport\Api\CheckingResultsSource->getResults(array, array, NULL, array)
#25 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/Api/CachingResultsSource.php(179): WikibaseQuality\ConstraintReport\Api\CachingResultsSource->getAndStoreResults(array, array, NULL, array)
#26 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/Job/CheckConstraintsJob.php(90): WikibaseQuality\ConstraintReport\Api\CachingResultsSource->getResults(array, array, NULL, array)
#27 /srv/mediawiki/php-1.34.0-wmf.11/extensions/WikibaseQualityConstraints/src/Job/CheckConstraintsJob.php(79): WikibaseQuality\ConstraintReport\Job\CheckConstraintsJob->checkConstraints(Wikibase\DataModel\Entity\ItemId)
#28 /srv/mediawiki/php-1.34.0-wmf.11/extensions/EventBus/includes/JobExecutor.php(64): WikibaseQuality\ConstraintReport\Job\CheckConstraintsJob->run()
#29 /srv/mediawiki/rpc/RunSingleJob.php(76): JobExecutor->execute(array)
#30 {main}

Impact

Root cause & Suggested Fix

TBD. Timeboxed investigation for 4h.

Notes

  • first occurrence seems to be XRspOwpAMEQAAKZsFwkAAABJ at 2019-07-02 09:52:34
  • this doesn't seem to be restricted to data-model/src/Entity/ItemId.php, even though it happens there most often by far
    • see for example also [XSMCUQpAMD4AAF27XEcAAABY] /rpc/RunSingleJob.php PHP Fatal Error from line 290 of /srv/mediawiki/php-1.34.0-wmf.11/vendor/wikibase/data-model/src/Snak/SnakList.php: request has exceeded memory limit

Details

Related Gerrit Patches:
mediawiki/extensions/WikibaseQualityConstraints : master: Add entity lookup without cache service and use it in SymmetryChecker

Event Timeline

Michael created this task. Jul 8 2019, 9:21 AM
Restricted Application added a subscriber: Aklapper. Jul 8 2019, 9:21 AM

And almost all are SymmetricChecker. It seems it tries to load a huge item and fails. The number of fatal errors is not a big issue: when a job fails, it gets retried many times, so the count is somewhat higher than the number of actual cases. The symmetry check can be done in SPARQL but I'm not sure if this is the direction we want to move towards (maybe we should? I don't know)

The symmetry check can be done in SPARQL but I'm not sure if this is the direction we want to move towards (maybe we should? I don't know)

No, we don’t want to check things in SPARQL unless we absolutely have to – SPARQL checks are slow and use outdated data.

And almost all are SymmetricChecker

Well, a single CheckConstraintsJob job always checks all constraints on an entity, and symmetric constraints are pretty rare, so this is surprising… can we see the job parameters in logstash somehow? Is it possible that this is always the same job? If there were lots of items we can’t deserialize due to memory limits, that would be a problem for more than just WikibaseQualityConstraints.
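One way to answer the "is it always the same job?" question, once the job parameters are available, would be to group the fatal errors by entity ID. The following is only a minimal sketch: the record layout, the field names (`reqid`, `entity_id`) and the item IDs are made-up placeholders, not the actual Logstash schema or real data.

```python
from collections import Counter

# Hypothetical fatal-error records. The field names and values are
# placeholders for illustration, not the real Logstash schema.
records = [
    {"reqid": "req-1", "entity_id": "Q1"},
    {"reqid": "req-2", "entity_id": "Q1"},  # retry of the same job
    {"reqid": "req-3", "entity_id": "Q1"},  # another retry
    {"reqid": "req-4", "entity_id": "Q2"},
]


def count_failures_by_entity(records):
    """Count how many fatal errors were logged per entity ID."""
    return Counter(record["entity_id"] for record in records)


counts = count_failures_by_entity(records)
distinct_entities = len(counts)

print(f"{len(records)} errors, {distinct_entities} distinct entities")
for entity_id, n in counts.most_common():
    print(entity_id, n)
```

If the number of distinct entities is much smaller than the number of errors, the raw error count is mostly retries of a few jobs.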

No idea what changed in our config, but the error message seems to be somewhat different now:

[XSw5rgpAAE0AAICPh8QAAAAL] /rpc/RunSingleJob.php   PHP Fatal Error from line 139 of /srv/mediawiki/php-1.34.0-wmf.13/vendor/wikibase/data-model/src/Entity/EntityId.php: Allowed memory size of 692060160 bytes exhausted (tried to allocate 20480 bytes)
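For reference, the byte counts in that message convert to rounder figures. A quick check, using only the two numbers from the error message itself:

```python
# Values copied from the error message above.
limit_bytes = 692060160
failed_allocation_bytes = 20480

limit_mib = limit_bytes / (1024 * 1024)
failed_alloc_kib = failed_allocation_bytes / 1024

print(limit_mib)         # 660.0
print(failed_alloc_kib)  # 20.0
```

So the request hit a limit of 660 MiB while trying to allocate another 20 KiB.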

However, this still seems to be the dominant Wikibase error with apparently over 1000 occurrences in the last 24 hours.

Addshore moved this task from Incoming to Needs Work on the Wikidata-Campsite board.
Michael triaged this task as High priority. Jul 23 2019, 7:59 AM

Besides resolving the root cause, another sensible way forward could be to log the frequency of this error to Grafana for monitoring and to keep it out of Logstash.

alaa_wmde updated the task description. Jul 23 2019, 9:47 AM
alaa_wmde moved this task from Needs Work to Ready to pick up on the Wikidata-Campsite board.

So we could call MediaWiki's API (wbgetclaims) for the symmetry check, but that doesn't reduce the total amount of memory needed; it just distributes it across two places.

I think I found the underlying problem: it seems that WBQC requests an entity lookup that also caches the entity, in a huge hash cache (loaded into memory). We can avoid that by asking the entity lookup for retrieve-only mode.
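To illustrate the difference, here is a minimal, language-neutral sketch of a caching lookup versus a retrieve-only one. All class and function names below are hypothetical and the entities are plain strings; this is not the Wikibase or WikibaseQualityConstraints code, only the general pattern under the assumption that the caching variant keeps every loaded entity in a per-process hash.

```python
class BackendLookup:
    """Stand-in for the underlying entity store (hypothetical)."""

    def __init__(self, store):
        self.store = store

    def get_entity(self, entity_id):
        return self.store[entity_id]


class CachingLookup:
    """Keeps every entity it has loaded in an in-process dict."""

    def __init__(self, backend):
        self.backend = backend
        self.in_process_cache = {}

    def get_entity(self, entity_id):
        if entity_id not in self.in_process_cache:
            self.in_process_cache[entity_id] = self.backend.get_entity(entity_id)
        return self.in_process_cache[entity_id]


class RetrieveOnlyLookup:
    """Returns the entity to the caller without retaining a reference."""

    def __init__(self, backend):
        self.backend = backend

    def get_entity(self, entity_id):
        return self.backend.get_entity(entity_id)


def check_all(lookup, entity_ids):
    """Load each target entity once, as a symmetry check would, then drop it."""
    checked = 0
    for entity_id in entity_ids:
        entity = lookup.get_entity(entity_id)
        checked += 1 if entity is not None else 0
    return checked


# Placeholder data: three fake "entities".
backend = BackendLookup({"Q1": "entity 1", "Q2": "entity 2", "Q3": "entity 3"})

caching = CachingLookup(backend)
retrieve_only = RetrieveOnlyLookup(backend)

checked = check_all(caching, ["Q1", "Q2", "Q3"])
check_all(retrieve_only, ["Q1", "Q2", "Q3"])

# The caching lookup still holds all three entities after the loop;
# the retrieve-only lookup holds none.
print(len(caching.in_process_cache))
```

With the caching variant, memory grows with the number of distinct entities touched during one job; with the retrieve-only variant, each entity can be freed as soon as the check for it is done.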

Change 525084 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/WikibaseQualityConstraints@master] Set the entity lookup to retrieve the value and do not cache it into memory

https://gerrit.wikimedia.org/r/525084

Change 525084 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Add entity lookup without cache service and use it in SymmetryChecker

https://gerrit.wikimedia.org/r/525084

alaa_wmde changed the subtype of this task from "Task" to "Bug Report". Jul 24 2019, 9:14 AM

Waiting for this to be verified in production. The expectation is that the error rate drops in the few days after this is deployed.
https://logstash.wikimedia.org/app/kibana#/dashboard/AWra4yyim2VjIW0682f0?_g=h@8b5b71a&_a=h@21362d2

alaa_wmde closed this task as Resolved. Aug 12 2019, 2:51 PM
alaa_wmde claimed this task.
mmodell changed the subtype of this task from "Bug Report" to "Production Error". Aug 28 2019, 11:06 PM