
Wikidata API wbgetentities: Allowed memory size exceeded
Open, Needs Triage, Public

Description

API call (from Reasonator tool):
https://www.wikidata.org/w/api.php?callback=jQuery213049673058853886376_1586243741225&action=wbgetentities&ids=Q12184%7CQ17076801%7CQ3241045%7CP642%7CP3680%7CQ7817%7CP459%7CQ84759798%7CP1480%7CQ37113960%7CQ84759791%7CQ496997%7CQ334401%7CQ1116876%7CQ148%7CQ17%7CQ884%7CQ881%7CQ334%7CQ408%7CQ833%7CQ869%7CQ837%7CQ30%7CQ16%7CQ142%7CQ424%7CQ854%7CQ183%7CQ33%7CQ878%7CQ668%7CQ38%7CQ928%7CQ145%7CQ159%7CQ865%7CQ34%7CQ29%7CQ31%7CQ79%7CQ794%7CQ822%7CQ801%7CQ14773%7CQ39%7CQ774&props=info%7Caliases%7Clabels%7Cdescriptions%7Cclaims%7Csitelinks%7Cdatatype&format=json&_=1586243741226

Result:

<!DOCTYPE html>
<html lang="en" dir="ltr">
<meta charset="utf-8">
<title>Wikimedia Error</title>
<style>
* { margin: 0; padding: 0; }
body { background: #fff; font: 15px/1.6 sans-serif; color: #333; }
.content { margin: 7% auto 0; padding: 2em 1em 1em; max-width: 640px; }
.footer { clear: both; margin-top: 14%; border-top: 1px solid #e5e5e5; background: #f9f9f9; padding: 2em 0; font-size: 0.8em; text-align: center; }
img { float: left; margin: 0 2em 2em 0; }
a img { border: 0; }
h1 { margin-top: 1em; font-size: 1.2em; }
p { margin: 0.7em 0 1em 0; }
a { color: #0645ad; text-decoration: none; }
a:hover { text-decoration: underline; }
code { font-family: inherit; }
.text-muted { color: #777; }
</style>
<div class="content" role="main">
<a href="https://www.wikimedia.org"><img src="https://www.wikimedia.org/static/images/wmf.png" srcset="https://www.wikimedia.org/static/images/wmf-2x.png 2x" alt=Wikimedia width=135 height=135></a>
<h1>Error</h1>
<p>Our servers are currently under maintenance or experiencing a technical problem. Please <a href="" title="Reload this page" onclick="location.reload(false); return false">try again</a> in a few&nbsp;minutes.</p><p>See the error message at the bottom of this page for more&nbsp;information.</p>
</div>
<div class="footer">
<p>If you report this error to the Wikimedia System Administrators, please include the details below.</p>
<p class="text-muted"><code>
  PHP fatal error: <br/> Allowed memory size of 692060160 bytes exhausted (tried to allocate 28068496 bytes)</code></p></div>
</html>

Expected: JSON-P
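Until the underlying memory problem is fixed, one client-side workaround is to split the 46-id request into smaller batches so each response stays within the server's memory limit. A minimal sketch in Python; the batch size of 10 is an arbitrary assumption, not a documented threshold, and the helper name is hypothetical:

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def batched_wbgetentities_urls(ids, batch_size=10):
    """Split a long id list into wbgetentities URLs carrying at most
    batch_size ids each (batch_size=10 is an arbitrary choice)."""
    urls = []
    for i in range(0, len(ids), batch_size):
        batch = ids[i:i + batch_size]
        params = {
            "action": "wbgetentities",
            "ids": "|".join(batch),
            "props": "info|aliases|labels|descriptions|claims|sitelinks|datatype",
            "format": "json",
        }
        urls.append(API + "?" + urlencode(params))
    return urls
```

A caller would then fetch each URL in turn and merge the per-batch `entities` objects into one result.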

Event Timeline

Hi @Magnus,

Are you still seeing this error? If this was from a log, do you have the timestamp? I'm just trying to see whether it was related to last night's outage (https://wikitech.wikimedia.org/wiki/Incident_documentation/20200407-Wikidata%27s_wb_items_per_site_table_dropped).

Otherwise it looks suspiciously similar to T249018

Yes, still seeing it. The request runs for ~20 s, so it is probably not hitting a cache. I changed the JSONP callback(s) and still get the same error, so it is definitely not a content-level cache.

So I tried the query on the debug servers with profiling on, and I hit timeouts instead of the memory limit.
But generally, by the time the requests hit the timeout, memory usage is already quite high.

https://performance.wikimedia.org/xhgui/run/view?id=5ea0c61e3f3dfa135999325b
https://performance.wikimedia.org/xhgui/run/view?id=5ea0c5ab3f3dfa1a59c01fa7
https://performance.wikimedia.org/xhgui/run/view?id=5ea0c7106318cbb47153e294

When running on the debug servers without profiling, I end up hitting the memory limit.

One thing to note in the profiling is that a significant portion of the query's execution time is spent in ApiResult::addValue, specifically in the validateValue and size methods.
I don't think we really need this, but we should consult with the Core Platform Team about potentially removing it from the execution path for our APIs.
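For illustration, the pattern at issue — recursively validating and size-counting every value added to the API result, with flags to opt out (the diff further down shows the real NO_SIZE_CHECK / NO_VALIDATE flags) — can be sketched like this. This is a simplified Python model of the behaviour, not MediaWiki's actual code:

```python
NO_SIZE_CHECK = 1
NO_VALIDATE = 2

def validate_value(value):
    # Recursively walk the structure; in MediaWiki this walk also
    # checks metadata keys and string validity, which is where time goes.
    if isinstance(value, dict):
        return {k: validate_value(v) for k, v in value.items()}
    if isinstance(value, list):
        return [validate_value(v) for v in value]
    return value

def size_of(value):
    # Rough recursive size accounting, analogous in spirit to ApiResult::size.
    if isinstance(value, dict):
        return sum(size_of(v) for v in value.values())
    if isinstance(value, list):
        return sum(size_of(v) for v in value)
    return len(str(value))

def add_value(result, name, value, flags=0):
    if not flags & NO_SIZE_CHECK:
        value = validate_value(value)  # full recursive walk
        flags |= NO_VALIDATE           # flag to not re-validate later
        result["_size"] = result.get("_size", 0) + size_of(value)
    result[name] = value
```

For large entity serializations the recursive walk itself dominates, which is why skipping it (the NO_SIZE_CHECK path) is worth discussing with the Core Platform Team.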

An equal amount of time seems to be spent in Wikibase\Lib\Serialization\SerializationModifier::modifyUsingCallback which could likely do with some optimizations.

Noticeably, property id creation in this call takes quite some time; the bulk of the overhead appears to come from joining and splitting around id prefixes, which we do not actually use.
@Ladsgroup recently filed a ticket about the performance of EntityId, ItemId, etc.
We could probably make an improvement here by not doing so much logic in the constructor when we don't need to.
Perhaps we can also just remove all of this id-prefix handling for now, as we don't use it.
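One direction suggested above — keep the constructor to a cheap validity check and do the prefix split only once, on demand — can be sketched in Python. The class name LazyEntityId and the id pattern are assumptions for illustration, not Wikibase code:

```python
import re

# Assumed shape of a local id part: an uppercase letter plus digits.
ID_PATTERN = re.compile(r"[A-Z][1-9]\d*$")

class LazyEntityId:
    """Hypothetical id wrapper: the constructor only validates the
    final local part; prefix splitting happens lazily, at most once."""

    __slots__ = ("serialization", "_parts")

    def __init__(self, serialization):
        if not ID_PATTERN.search(serialization):
            raise ValueError(f"invalid id: {serialization!r}")
        self.serialization = serialization
        self._parts = None  # computed on first access

    @property
    def parts(self):
        if self._parts is None:
            if ":" not in self.serialization:
                # Fast path: no repository prefix, skip the split entirely.
                self._parts = ("", self.serialization)
            else:
                repo, _, local = self.serialization.partition(":")
                self._parts = (repo, local)
        return self._parts
```

In the common unprefixed case (plain Q/P ids, as in the request above) the split never runs at all.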

It would be good to get a full profile of a completed execution.
To do that, we will have to see if we can temporarily remove the time limit on one of the mwdebug servers.


Thank you for running the performance inspections — very useful — but we should check memory instead of time, since the two actually trade off against each other. For example, if we cache property objects so we don't spend time checking the regex, it would be faster, but it would use even more memory and worsen the situation described in this ticket. Checking the memory explosions:

Wikibase\Lib\Serialization\SerializationModifier::getPathParts
    138,755,064 bytes
Wikibase\DataModel\Entity\EntityId::extractRepositoryNameAndLocalPart
    73,650,136 bytes
Wikibase\DataModel\Entity\EntityId::extractSerializationParts
    71,232,576 bytes
Wikibase\DataModel\Snak\PropertyValueSnak::unserialize
    66,952,800 bytes
Wikibase\Lib\Serialization\CallbackFactory::Wikibase\Lib\Serialization\{closure}
    64,796,608 bytes
Wikibase\DataModel\Snak\SnakList::unserialize

I guess the data model needs a little tuning in how it stores values.

Currently, EntityId stores everything twice: once as $serialization (e.g. commons:M12) and once split into its parts ($repositoryName + $localPart). Fixing this would reduce the memory footprint of Wikidata.
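The double storage can be illustrated with a toy comparison: one class mirroring the current layout (serialization plus split parts), one storing only the serialization and deriving the parts on demand. EagerId and CompactId are hypothetical Python classes making the general point, not measurements of the PHP implementation:

```python
import sys

class EagerId:
    """Stores the serialization and its split parts, mirroring the
    double storage described above."""
    __slots__ = ("serialization", "repository_name", "local_part")

    def __init__(self, serialization):
        self.serialization = serialization
        # rpartition handles both "commons:M12" and unprefixed "Q1".
        repo, _, local = serialization.rpartition(":")
        self.repository_name = repo
        self.local_part = local

class CompactId:
    """Stores only the serialization; parts are derived on demand."""
    __slots__ = ("serialization",)

    def __init__(self, serialization):
        self.serialization = serialization

    @property
    def repository_name(self):
        return self.serialization.rpartition(":")[0]

    @property
    def local_part(self):
        return self.serialization.rpartition(":")[2]
```

With __slots__, the compact variant is smaller per instance just from the slot count, before even counting the extra substrings the eager variant keeps alive — the trade being a string split on each parts access, which the previous comment notes is itself a time/memory trade-off.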

I got the memory and time limits removed on mwdebug1002 so that we could get a full profile.

https://performance.wikimedia.org/xhgui/run/view?id=5eb3c4dbbb8544687a18a8f1
https://performance.wikimedia.org/xhgui/run/view?id=5eb3c579bb8544737a2869da
https://performance.wikimedia.org/xhgui/run/view?id=5eb3c5fcbb8544767a7343ce

These show that the full run is over 2x as costly as the first set of profiles once it actually completes!

Wikibase\Lib\Serialization\SerializationModifier::getPathParts
    175,364,728 bytes
ApiResult::stripMetadataNonRecursive
    123,734,304 bytes
Wikibase\Lib\Serialization\CallbackFactory::Wikibase\Lib\Serialization\{closure}
    81,384,096 bytes
Wikibase\DataModel\Entity\EntityId::extractRepositoryNameAndLocalPart
    80,464,760 bytes
Wikibase\DataModel\Entity\EntityId::extractSerializationParts
    80,320,120 bytes
Wikibase\DataModel\Snak\PropertyValueSnak::unserialize
    67,297,352 bytes
--- a/includes/api/ApiResult.php
+++ b/includes/api/ApiResult.php
@@ -396,9 +396,15 @@ class ApiResult implements ApiSerializable {
        if ( !( $flags & self::NO_SIZE_CHECK ) ) {
            // self::size needs the validated value. Then flag
            // to not re-validate later.
+           $start = microtime( true );
            $value = self::validateValue( $value );
+           $duration = microtime( true ) - $start;
            $flags |= self::NO_VALIDATE;
 
+           if ( @$_GET['krinkle'] ) {
+               wfDebugLog( 'AdHocDebug', 'Spent ' . round( $duration * 1000, 3 ) . ' ms in ' . __METHOD__ );
+           }

Two runs:

  • Run 1
    • Total time in PHP: 97 ms
    • Time in ApiResult::validateValue (from ApiResult::addValue): ~ 4.8ms (5 top-level calls)
  • Run 2
    • Total time in PHP: 94 ms
    • Time in ApiResult::validateValue (from ApiResult::addValue): ~ 4.7ms (5 top-level calls)
Raw data
krinkle@mw1378:$ curl 'http://mw1378.eqiad.wmnet/w/api.php?action=wbgetentities&ids=Q12184|Q17076801&props=info|aliases|labels|descriptions|claims|sitelinks|datatype&format=json&krinkle=1' -i -H 'Host: www.wikidata.org' | grep Backend-Timing
Backend-Timing: D=97086 t=1588895607974196

timestamp:  2020-05-07T23:53:28+00:00
reqId: XrSfdwpAMNwAA5U6DdIAAABU

- Spent 3.655 ms in ApiResult::addValue
- Spent 0.002 ms in ApiResult::addValue
- Spent 1.088 ms in ApiResult::addValue
- Spent 0.001 ms in ApiResult::addValue
- Spent 0.001 ms in ApiResult::addValue


krinkle@mw1378:$ curl 'http://mw1378.eqiad.wmnet/w/api.php?action=wbgetentities&ids=Q12184|Q17076801&props=info|aliases|labels|descriptions|claims|sitelinks|datatype&format=json&krinkle=1' -i -H 'Host: www.wikidata.org' | grep Backend-Timing
Backend-Timing: D=93593 t=1588895609066951

timestamp:  2020-05-07T23:53:29+00:00
reqId: XrSfeQpAMNwAA5U6Df4AAABX

- Spent 3.598 ms in ApiResult::addValue
- Spent 0.002 ms in ApiResult::addValue
- Spent 1.028 ms in ApiResult::addValue
- Spent 0.001 ms in ApiResult::addValue
- Spent 0.001 ms in ApiResult::addValue