
HTTP 500 error for https://www.wikidata.org/wiki/Special:EntityData/Q30.rdf / ttl
Closed, Resolved | Public | 8 Estimated Story Points

Description

Various Special:EntityData calls to .ttl or .rdf time out or run out of memory.
The main suspect on these code paths is the step that collects labels from all entities referred to by the entity whose RDF is being retrieved.
This currently does full entity lookups, rather than just retrieving labels from a faster storage mechanism.
See T243950: RDF output of an entity loads all referenced entities, it shouldn't

This probably means switching from full EntityStore lookups to using TermLookups.
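A minimal sketch of the difference, written in Python rather than Wikibase's PHP and purely for illustration. `FullEntityStore`, `TermStore`, and their methods are hypothetical stand-ins, not Wikibase classes. The only point is that a label-only lookup can be answered in one batch, without deserializing every referenced entity's statements.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical full entity: labels plus (potentially huge) statements."""
    entity_id: str
    labels: dict
    statements: list = field(default_factory=list)


class FullEntityStore:
    """Stand-in for a full entity lookup: every call loads a whole entity."""

    def __init__(self, entities):
        self._entities = {e.entity_id: e for e in entities}
        self.loads = 0  # number of full entities loaded so far

    def get_entity(self, entity_id):
        self.loads += 1
        return self._entities[entity_id]


class TermStore:
    """Stand-in for a term lookup: labels only, fetched in one batch."""

    def __init__(self, entities):
        self._labels = {e.entity_id: dict(e.labels) for e in entities}
        self.queries = 0  # number of batch queries issued so far

    def get_labels(self, entity_ids):
        self.queries += 1
        return {i: self._labels[i] for i in entity_ids if i in self._labels}


def stub_labels_via_entities(store, mentioned_ids):
    """Current behaviour as described above: one full load per mention."""
    return {i: store.get_entity(i).labels for i in mentioned_ids}


def stub_labels_via_terms(store, mentioned_ids):
    """Proposed direction: one batched, label-only query for all mentions."""
    return store.get_labels(list(mentioned_ids))
```

Both helpers return the same mapping of entity ID to labels. The first performs one full load per mentioned entity, the second a single batched query, which is the property the switch to term lookups is hoped to provide.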

Acceptance Criteria 🏕️🌟

  • Features of the RDF endpoints remain the same
  • /wiki/Special:EntityData/Q30.rdf in production no longer times out

Notes & links


Initial Investigation

While looking at logstash for some reason, @Silvan_WMDE & I noticed quite a few errors relating to https://www.wikidata.org/wiki/Special:EntityData/Q30.rdf

  • Timeouts [5649a640-5f42-4d22-aaec-36ffdfb053de] /wiki/Special:EntityData/Q30.rdf Wikibase\DataModel\Services\Lookup\EntityLookupException: The maximum execution time of 60 seconds was exceeded
  • OOMs [0891912f-89b3-4cbc-a675-bf37bff44910] PHP Fatal error: Allowed memory size of 698351616 bytes exhausted (tried to allocate 134217736 bytes) in /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/purtle/src/RdfWriterBase.php:229
  • EntityLookupException [5649a640-5f42-4d22-aaec-36ffdfb053de] 2021-04-27 13:57:05: Fatal exception of type "Wikibase\DataModel\Services\Lookup\EntityLookupException" (due to timeout)

Seemingly this has been occurring for the complete time range of the existing logstash logs.
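To group log lines like the ones listed above by failure type, something along these lines could be used. This is a sketch only: the regular expressions are derived from the three sample messages in this task, and `classify`, `request_id`, and `summarize` are hypothetical helper names, not part of any logstash tooling.

```python
import re
from collections import Counter

# Patterns taken from the sample messages quoted in this task.
REQUEST_ID = re.compile(r"\[([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\]")
TIMEOUT = re.compile(r"maximum execution time of \d+ seconds was exceeded")
OOM = re.compile(r"Allowed memory size of \d+ bytes exhausted")


def classify(line):
    """Return 'oom', 'timeout', or 'other' for a single log line."""
    if OOM.search(line):
        return "oom"
    if TIMEOUT.search(line):
        return "timeout"
    return "other"


def request_id(line):
    """Extract the bracketed request ID, or None if the line has none."""
    match = REQUEST_ID.search(line)
    return match.group(1) if match else None


def summarize(lines):
    """Count log lines per failure type."""
    return Counter(classify(line) for line in lines)


SAMPLE_LINES = [
    r"[5649a640-5f42-4d22-aaec-36ffdfb053de] /wiki/Special:EntityData/Q30.rdf "
    r"Wikibase\DataModel\Services\Lookup\EntityLookupException: "
    r"The maximum execution time of 60 seconds was exceeded",
    r"[0891912f-89b3-4cbc-a675-bf37bff44910] PHP Fatal error: Allowed memory "
    r"size of 698351616 bytes exhausted (tried to allocate 134217736 bytes)",
]
```

Calling `summarize(SAMPLE_LINES)` counts one timeout and one OOM. Lines that match neither pattern fall into `other`, so unexpected messages are kept visible rather than dropped.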


Searching Phabricator, I came across T62003: HTTP 503 error when requesting linked data for large entities, but I do not think this is related, as I also get a 500 error when requesting these pages directly from a mwapp server inside the cluster.

addshore@deploy1002:~$ curl -k -L -H "Host: www.wikidata.org" -I https://mw1405.eqiad.wmnet/wiki/Special:EntityData/Q30.rdf
HTTP/1.1 500 Internal Server Error
Date: Tue, 27 Apr 2021 13:57:23 GMT
Server: mw1405.eqiad.wmnet
X-Powered-By: PHP/7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1+buster1
X-Request-Id: 5e4f152f-8a45-4d31-a77a-b9a112319b25
Backend-Timing: D=53005048 t=1619531843710982
Content-Type: text/html; charset=UTF-8
X-Envoy-Upstream-Service-Time: 53005
Transfer-Encoding: chunked

The stack traces vary.

An example for OOM would be:

#0 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/purtle/src/RdfWriterBase.php(229): unknown()
#1 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/purtle/src/XmlRdfWriter.php(69): Wikimedia\Purtle\RdfWriterBase->write()
#2 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/purtle/src/XmlRdfWriter.php(221): Wikimedia\Purtle\XmlRdfWriter->tag()
#3 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/purtle/src/RdfWriterBase.php(451): Wikimedia\Purtle\XmlRdfWriter->writeText()
#4 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Rdf/TermsRdfBuilder.php(92): Wikimedia\Purtle\RdfWriterBase->text()
#5 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Rdf/TermsRdfBuilder.php(183): Wikibase\Repo\Rdf\TermsRdfBuilder->addLabels()
#6 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Rdf/RdfBuilder.php(480): Wikibase\Repo\Rdf\TermsRdfBuilder->addEntityStub()
#7 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Rdf/RdfBuilder.php(444): Wikibase\Repo\Rdf\RdfBuilder->addEntityStub()
#8 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataSerializationService.php(225): Wikibase\Repo\Rdf\RdfBuilder->resolveMentionedEntities()
#9 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataSerializationService.php(170): Wikibase\Repo\LinkedData\EntityDataSerializationService->rdfSerialize()
#10 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataRequestHandler.php(529): Wikibase\Repo\LinkedData\EntityDataSerializationService->getSerializedData()
#11 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataRequestHandler.php(275): Wikibase\Repo\LinkedData\EntityDataRequestHandler->showData()
#12 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Specials/SpecialEntityData.php(143): Wikibase\Repo\LinkedData\EntityDataRequestHandler->handleRequest()
#13 /srv/mediawiki/php-1.37.0-wmf.1/includes/specialpage/SpecialPage.php(646): Wikibase\Repo\Specials\SpecialEntityData->execute()
#14 /srv/mediawiki/php-1.37.0-wmf.1/includes/specialpage/SpecialPageFactory.php(1381): SpecialPage->run()
#15 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(313): MediaWiki\SpecialPage\SpecialPageFactory->executePath()
#16 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(916): MediaWiki->performRequest()
#17 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(550): MediaWiki->main()
#18 /srv/mediawiki/php-1.37.0-wmf.1/index.php(53): MediaWiki->run()
#19 /srv/mediawiki/php-1.37.0-wmf.1/index.php(46): wfIndexMain()
#20 /srv/mediawiki/w/index.php(3): require()

And an example for timeout could be:

from /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/request-timeout/src/Detail/ExcimerTimerWrapper.php(97)
#0 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikimedia/request-timeout/src/Detail/ExcimerTimerWrapper.php(72): Wikimedia\RequestTimeout\Detail\ExcimerTimerWrapper->onTimeout(integer)
#1 /srv/mediawiki/php-1.37.0-wmf.1/vendor/data-values/serialization/src/Deserializers/DataValueDeserializer.php(115): Wikimedia\RequestTimeout\Detail\ExcimerTimerWrapper->Wikimedia\RequestTimeout\Detail\{closure}(integer)
#2 /srv/mediawiki/php-1.37.0-wmf.1/vendor/data-values/serialization/src/Deserializers/DataValueDeserializer.php(91): DataValues\Deserializers\DataValueDeserializer->getDeserialization(array)
#3 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/SnakDeserializer.php(128): DataValues\Deserializers\DataValueDeserializer->deserialize(array)
#4 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/SnakDeserializer.php(117): Wikibase\DataModel\Deserializers\SnakDeserializer->deserializeDataValue(array)
#5 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/SnakDeserializer.php(100): Wikibase\DataModel\Deserializers\SnakDeserializer->newValueSnak(array)
#6 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/SnakDeserializer.php(82): Wikibase\DataModel\Deserializers\SnakDeserializer->getDeserialized(array)
#7 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/SnakListDeserializer.php(60): Wikibase\DataModel\Deserializers\SnakDeserializer->deserialize(array)
#8 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/SnakListDeserializer.php(41): Wikibase\DataModel\Deserializers\SnakListDeserializer->getDeserialized(array)
#9 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ReferenceDeserializer.php(77): Wikibase\DataModel\Deserializers\SnakListDeserializer->deserialize(array)
#10 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ReferenceDeserializer.php(67): Wikibase\DataModel\Deserializers\ReferenceDeserializer->deserializeSnaks(array)
#11 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ReferenceDeserializer.php(57): Wikibase\DataModel\Deserializers\ReferenceDeserializer->getDeserialized(array)
#12 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ReferenceListDeserializer.php(53): Wikibase\DataModel\Deserializers\ReferenceDeserializer->deserialize(array)
#13 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ReferenceListDeserializer.php(40): Wikibase\DataModel\Deserializers\ReferenceListDeserializer->getDeserialized(array)
#14 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/StatementDeserializer.php(161): Wikibase\DataModel\Deserializers\ReferenceListDeserializer->deserialize(array)
#15 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/StatementDeserializer.php(124): Wikibase\DataModel\Deserializers\StatementDeserializer->setReferencesFromSerialization(array, Wikibase\DataModel\Statement\Statement)
#16 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/StatementDeserializer.php(100): Wikibase\DataModel\Deserializers\StatementDeserializer->getDeserialized(array)
#17 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/StatementListDeserializer.php(60): Wikibase\DataModel\Deserializers\StatementDeserializer->deserialize(array)
#18 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/StatementListDeserializer.php(41): Wikibase\DataModel\Deserializers\StatementListDeserializer->getDeserialized(array)
#19 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ItemDeserializer.php(130): Wikibase\DataModel\Deserializers\StatementListDeserializer->deserialize(array)
#20 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ItemDeserializer.php(85): Wikibase\DataModel\Deserializers\ItemDeserializer->setStatementListFromSerialization(array, Wikibase\DataModel\Entity\Item)
#21 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-serialization/src/Deserializers/ItemDeserializer.php(77): Wikibase\DataModel\Deserializers\ItemDeserializer->getDeserialized(array)
#22 /srv/mediawiki/php-1.37.0-wmf.1/vendor/serialization/serialization/src/Deserializers/DispatchingDeserializer.php(42): Wikibase\DataModel\Deserializers\ItemDeserializer->deserialize(array)
#23 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/internal-serialization/src/Deserializers/EntityDeserializer.php(42): Deserializers\DispatchingDeserializer->deserialize(array)
#24 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/EntityContentDataCodec.php(253): Wikibase\InternalSerialization\Deserializers\EntityDeserializer->deserialize(array)
#25 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/Sql/WikiPageEntityDataLoader.php(82): Wikibase\Lib\Store\EntityContentDataCodec->decodeEntity(string, NULL)
#26 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/Sql/WikiPageEntityRevisionLookup.php(236): Wikibase\Lib\Store\Sql\WikiPageEntityDataLoader->loadEntityDataFromWikiPageRevision(MediaWiki\Revision\RevisionStoreRecord, string, integer)
#27 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/Sql/WikiPageEntityRevisionLookup.php(122): Wikibase\Lib\Store\Sql\WikiPageEntityRevisionLookup->loadEntity(stdClass, string)
#28 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/TypeDispatchingEntityRevisionLookup.php(54): Wikibase\Lib\Store\Sql\WikiPageEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#29 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/data-access/src/ByTypeDispatchingEntityRevisionLookup.php(55): Wikibase\Lib\Store\TypeDispatchingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#30 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/TypeDispatchingEntityRevisionLookup.php(54): Wikibase\DataAccess\ByTypeDispatchingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#31 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(104): Wikibase\Lib\Store\TypeDispatchingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#32 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(87): Wikibase\Lib\Store\CachingEntityRevisionLookup->fetchEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#33 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(104): Wikibase\Lib\Store\CachingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#34 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/CachingEntityRevisionLookup.php(87): Wikibase\Lib\Store\CachingEntityRevisionLookup->fetchEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#35 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/lib/includes/Store/RevisionBasedEntityLookup.php(46): Wikibase\Lib\Store\CachingEntityRevisionLookup->getEntityRevision(Wikibase\DataModel\Entity\ItemId, integer, string)
#36 /srv/mediawiki/php-1.37.0-wmf.1/vendor/wikibase/data-model-services/src/Lookup/RedirectResolvingEntityLookup.php(51): Wikibase\Lib\Store\RevisionBasedEntityLookup->getEntity(Wikibase\DataModel\Entity\ItemId)
#37 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Rdf/RdfBuilder.php(438): Wikibase\DataModel\Services\Lookup\RedirectResolvingEntityLookup->getEntity(Wikibase\DataModel\Entity\ItemId)
#38 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataSerializationService.php(225): Wikibase\Repo\Rdf\RdfBuilder->resolveMentionedEntities(Wikibase\DataModel\Services\Lookup\RedirectResolvingEntityLookup)
#39 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataSerializationService.php(170): Wikibase\Repo\LinkedData\EntityDataSerializationService->rdfSerialize(Wikibase\Lib\Store\EntityRevision, NULL, array, Wikibase\Repo\Rdf\RdfBuilder, NULL)
#40 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataRequestHandler.php(529): Wikibase\Repo\LinkedData\EntityDataSerializationService->getSerializedData(string, Wikibase\Lib\Store\EntityRevision, NULL, array, NULL)
#41 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/LinkedData/EntityDataRequestHandler.php(275): Wikibase\Repo\LinkedData\EntityDataRequestHandler->showData(WebRequest, OutputPage, string, Wikibase\DataModel\Entity\ItemId, integer)
#42 /srv/mediawiki/php-1.37.0-wmf.1/extensions/Wikibase/repo/includes/Specials/SpecialEntityData.php(143): Wikibase\Repo\LinkedData\EntityDataRequestHandler->handleRequest(string, WebRequest, OutputPage)
#43 /srv/mediawiki/php-1.37.0-wmf.1/includes/specialpage/SpecialPage.php(646): Wikibase\Repo\Specials\SpecialEntityData->execute(string)
#44 /srv/mediawiki/php-1.37.0-wmf.1/includes/specialpage/SpecialPageFactory.php(1381): SpecialPage->run(string)
#45 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(313): MediaWiki\SpecialPage\SpecialPageFactory->executePath(Title, RequestContext)
#46 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(916): MediaWiki->performRequest()
#47 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(550): MediaWiki->main()
#48 /srv/mediawiki/php-1.37.0-wmf.1/index.php(53): MediaWiki->run()
#49 /srv/mediawiki/php-1.37.0-wmf.1/index.php(46): wfIndexMain()
#50 /srv/mediawiki/w/index.php(3): require(string)
#51 {main}

While investigating the situation a little more I discovered that:

Expanding my logstash search, it appears this is happening to some other entities too (I'm sure there would be more of these if I looked harder):

Event Timeline

Addshore added a project: Performance Issue.

With this extra hint it indeed seems that most of the traces include this resolveMentionedEntities.

This would explain most of the unknowns in the ticket description.

Addshore updated the task description.
Addshore updated Other Assignee, added: Tarrow.

Change 701887 had a related patch set uploaded (by Tarrow; author: Tarrow):

[mediawiki/extensions/Wikibase@master] Prefetch Item StubEntityData

https://gerrit.wikimedia.org/r/701887

Change 701887 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Prefetch Item StubEntityData

https://gerrit.wikimedia.org/r/701887

Adding the one language feature flag and setup now.

Change 704586 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[mediawiki/extensions/Wikibase@master] [WIP] Use request language for RDF output.

https://gerrit.wikimedia.org/r/704586

So I tried this patch on mwdebug2001.

It shrinks Q30.rdf from 29MB to 7.2MB. I uploaded the result for comparison: https://people.wikimedia.org/~ladsgroup/Q30-new.rdf

Q7251 went from 7MB to 2MB: https://people.wikimedia.org/~ladsgroup/Q7251-new.rdf and https://people.wikimedia.org/~ladsgroup/Q7251-fa-new.rdf (done with uselang=fa)
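The size reduction comes from emitting stub labels only in the request language and its fallbacks instead of in every language. A rough Python sketch of that filtering step follows. `filter_labels` is a hypothetical name, and the default fallback chain of just `en` is an assumption for illustration; the real fallback chains are defined by Wikibase configuration.

```python
def filter_labels(labels, request_language, fallbacks=("en",)):
    """Keep only labels in the request language and its fallback languages.

    `labels` maps language code to label text. The `fallbacks` default is an
    illustrative assumption, not the chain Wikibase actually uses.
    """
    wanted = [request_language, *fallbacks]
    return {lang: labels[lang] for lang in wanted if lang in labels}


# Illustrative data only: placeholder labels for one referenced entity.
example_labels = {
    "en": "label-en",
    "de": "label-de",
    "fa": "label-fa",
    "fr": "label-fr",
}

# With uselang=fa, only the fa and en labels would be emitted.
fa_labels = filter_labels(example_labels, "fa")
```

Applied to every entity stub in the output, this cuts the number of label triples per referenced entity from one per available language to at most the length of the fallback chain.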

Change 704586 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Use request language for RDF output.

https://gerrit.wikimedia.org/r/704586

Moving to Verification so we can see if the prefetching helps once this rolls out with the train this week – I’d hope to see a further reduction of these errors in logstash. After that, this should move to Needs Announcement before we can flip the config switch to only send stubs in the user language and fallbacks. (Though we can also draft, or even send out, that announcement before the verification is done, I think.)

For the specific case we were looking at:

adam@adsh+wsl:/tmp$ time curl -s -o /dev/null https://www.wikidata.org/wiki/Special:EntityData/Q30.rdf

real    0m18.072s
user    0m0.017s
sys     0m0.008s
adam@adsh+wsl:/tmp$ time curl -s -o /dev/null https://www.wikidata.org/wiki/Special:EntityData/Q30.ttl

real    0m14.919s
user    0m0.022s
sys     0m0.000s

Looking at logstash, it looks like EntityData still OOMs in some situations, but I see no timeouts in the past week.

In the last week the only entities and formats to OOM are Q668.nt, Q668.jsonld, Q668.n3, and Q209727.rdf.

For example:

adam@adsh+wsl:/mnt/c/Users/adam$ time curl -s -o /dev/null -w "%{http_code}\n" https://www.wikidata.org/wiki/Special:EntityData/Q668.nt
500

real    0m31.627s
user    0m0.012s
sys     0m0.011s
Addshore updated the task description.

Marking this one as resolved.
T285795: Limit languages on EntityStub rdf builders remains open on the campsite board.
I'm going to file a follow-up task to investigate whether we can also get rid of the OOMs for the entities mentioned above.
It could be that the language task here fixes some of those OOMs; however, one is for the jsonld format, so it probably has nothing to do with RDF languages.