Page MenuHomePhabricator

/w/api.php... ErrorException from line 0 of : PHP Notice: Unable to unserialize ... Size of serialized string ... exceeds max
Closed, ResolvedPublic

Description

Error

Request ID: XV-C1wpAEMYAAB5CGRsAAACB
Request URL: en.wikibooks.org/w/api.php...

message
[XV-C1wpAEMYAAB5CGRsAAACB] /w/api.php?hidebots=1&days=7&limit=50&target=Robotics&action=feedrecentchanges&feedformat=atom   ErrorException from line 0 of : PHP Notice: Unable to unserialize: [a:4:{i:0;i:1;i:1;s:68140141:"<tr>
  <td colspan="2" class="diff-lineno"><!--LINE 1--></td>
  <td colspan="2" class="diff-lineno"><!--LINE 1--></td>
</tr>
<tr>
  <td colspan="2" class="diff-empty">&#160;</td>
  <td class="diff-marker">+</td>
  <td class="diff-addedline"><div>gay</div></td>
</tr>
<tr>
  <td class="diff-marker">−</td>
  <td class="diff-deletedline"><div>{{Featured book}}</div></td>
  <td colspan="2" class="diff-empty">&#160;</td>
</tr>
<tr>
  <td colspan="2" class="diff-empty">&#160;</td>
  <td class="diff-marker">+</td>
  <td class="diff-addedline"><div>gay</div></td>
</tr>
<tr>
  <td class="diff-marker">−</td>
  <td class="diff-deletedline"><div>&lt;div style="text-align:center;font-style:italic;"&gt;Welcome to...&lt;/div&gt;</div></td>
  <td colspan="2" class="diff-empty">&#160;</td>
</tr>
<tr>
  <td colspan="2" class="diff-empty">&#160;</td>
  <td class="diff-marker">+</td>
  <td class="diff-addedline"><div>gay</div></td>
</tr>
<tr>
  <td class="diff-marker">−</]. Size of serialized string (68140141) exceeds max.
trace
#0 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array, array)
#1 [internal function]: Memcached->getMultiByKey(string, array, NULL, integer)
#2 /srv/mediawiki/php-1.34.0-wmf.19/includes/libs/objectcache/MemcachedPeclBagOStuff.php(320): Memcached->getMulti(array)
#3 /srv/mediawiki/php-1.34.0-wmf.19/includes/libs/objectcache/MediumSpecificBagOStuff.php(544): MemcachedPeclBagOStuff->doGetMulti(array, integer)
#4 /srv/mediawiki/php-1.34.0-wmf.19/includes/libs/objectcache/wancache/WANObjectCache.php(434): MediumSpecificBagOStuff->getMulti(array)
#5 /srv/mediawiki/php-1.34.0-wmf.19/includes/libs/objectcache/wancache/WANObjectCache.php(359): WANObjectCache->getMulti(array, array, array, integer)
#6 /srv/mediawiki/php-1.34.0-wmf.19/includes/diff/DifferenceEngine.php(1046): WANObjectCache->get(string)
#7 /srv/mediawiki/php-1.34.0-wmf.19/includes/diff/DifferenceEngine.php(987): DifferenceEngine->getDiffBody()
#8 /srv/mediawiki/php-1.34.0-wmf.19/includes/FeedUtils.php(134): DifferenceEngine->getDiff(string, string)
#9 /srv/mediawiki/php-1.34.0-wmf.19/includes/FeedUtils.php(77): FeedUtils::formatDiffRow(Title, string, string, string, string, string)
#10 /srv/mediawiki/php-1.34.0-wmf.19/includes/changes/ChangesFeed.php(124): FeedUtils::formatDiff(stdClass)
#11 /srv/mediawiki/php-1.34.0-wmf.19/includes/api/ApiFeedRecentChanges.php(84): ChangesFeed::buildItems(Wikimedia\Rdbms\ResultWrapper)
#12 /srv/mediawiki/php-1.34.0-wmf.19/includes/api/ApiMain.php(1589): ApiFeedRecentChanges->execute()
#13 /srv/mediawiki/php-1.34.0-wmf.19/includes/api/ApiMain.php(533): ApiMain->executeAction()
#14 /srv/mediawiki/php-1.34.0-wmf.19/includes/api/ApiMain.php(504): ApiMain->executeActionWithErrorHandling()
#15 /srv/mediawiki/php-1.34.0-wmf.19/api.php(86): ApiMain->execute()
#16 /srv/mediawiki/w/api.php(3): include(string)
#17 {main}
Impact

16 hits in the last hour, 832 in the last day.

Notes

Blocking next week's train.

Event Timeline

Restricted Application added a project: Core Platform Team. · View Herald TranscriptAug 23 2019, 11:09 AM
Restricted Application added subscribers: Liuxinyu970226, Aklapper. · View Herald Transcript
zeljkofilipin triaged this task as Unbreak Now! priority.Aug 23 2019, 11:10 AM
CDanis added a subscriber: CDanis.Aug 23 2019, 2:01 PM

Judging from logstash, this seems to only occur on HHVM. Likely, PHP7 is writing some things to memcache that HHVM can't unserialize.

WDoranWMF added subscribers: daniel, WDoranWMF.

@daniel Can you take a look at this as a priority?

Anomie added a subscriber: Anomie.

TL;DR: There doesn't seem to be any bug in MediaWiki here. It seems to be an issue where PHP7 saved something to memcached that HHVM refuses to handle.

What seems to be going on here is that an edit was made to wikibooks:Wikijunior:Biology at 2019-08-23T02:35:01Z (history link) that replaced the page with 479808 lines of the word "gay". It was reverted at 02:53:51Z.

The HTML for the diff of that edit is 68140141 bytes.

When asked to save that, the memcached PHP extension wraps that in an array with a bit of extra metadata, serializes the array (using PHP's native serialization), and saves the serialized string to memcached. PHP 7 seems to handle this fine, and is probably what wrote the memcached entry. But HHVM refuses to unserialize it ("Size of serialized string (68140141) exceeds max"), instead raising an E_NOTICE-level error and returning false. That E_NOTICE is what's getting logged.

The many hits are largely from requests for RSS feeds that would include that diff, so probably people's RSS clients fetching the feed for them (or possibly bots using the feed rather than the normal recentchanges query endpoint). There are also a few attempts to view the diff via the web UI, which in testing seem to show the diff without much issue (presumably regenerating it from scratch due to the cache "miss").

WDoranWMF moved this task from Inbox to Tracking/Watching on the Core Platform Team board.

@Anomie who would be best to pick this up next? Or is it actionable at all given the switch over to PHP 7?

We may want to enforce some size limit on objects we save to memcached...

That's a very good question.

  • This should go away once everything is on PHP7.
  • The immediate issue will go away when that diff (and probably its revert too) drop out of recentchanges.
    • There are a few ways that could be made to happen faster than the default 30 days. Although most of the requests are specifying a limit of 7 days, so the logspam should reduce or disappear by then.
    • OTOH, it's possible someone might make a new similar edit.
  • We could add code to MediaWiki to try to work around this issue, by determining what HHVM's limit is and avoiding saving larger diffs to memcached.
    • Personally I'm not sure that's worthwhile except maybe as a short-term hack.
  • We could revise the way that E_NOTICE messages are logged, if we don't want them going to the "error" channel.

We may want to enforce some size limit on objects we save to memcached...

The value being stored is a single long string, not an object. This doesn't seem to be an issue with the length of the serialized representation itself, just the fact that one item in that representation is a very long string.

@zeljkofilipin of the 3 pathways @Anomie laid out, is it reasonable that this will dissipate as we move over to PHP 7 or is this causing significant issues that would require us to pursue a resolution now?

@WDoranWMF I don't know much about this error. I've noticed it in the logs after train deployment. All new log errors by default block the train.

@zeljkofilipin grand, who can we check with to see what is an acceptable resolution?

@WDoranWMF subscribers of this task should know the best. If not, the rest of Core Platform Team.

Excluding this error as-is from Logstash dashboards is a problem because it's too generic (we don't want to ignore all serialisation errors).

If we (who?) are interested in keeping this error visible in Logstash for investigations (in case it becomes more common), perhaps we can locally suppress the warning in favour of a dedicated message about this issue. We can then exclude that one from the regression dashboards (or use a custom channel, then it won't be included in the first place).

@Anomie wrote

We could revise the way that E_NOTICE messages are logged, if we don't want them going to the "error" channel.

Friendly reminder, this is blocking the train. Will it be resolved by train window tomorrow?

Per @Anomie's comment this isn't caused by any change in Mediawiki code, and thus IMO shouldn't be a train blocker?

Per @Anomie's comment this isn't caused by any change in Mediawiki code, and thus IMO shouldn't be a train blocker?

+1 from me.

This is creating log spam, that's why I've reported it. To resolve this, this error should not get logged. However that's done is fine with me.

WDoranWMF added a comment.EditedAug 27 2019, 12:54 PM

@zeljkofilipin Thanks, I can schedule this into our Clinic Duty to look into it further and reduce the log spam but since it's not a new error message caused by code changes but instead caused by a previously undiscovered bug, it does seem like it shouldn't be a train blocker in this instance. Is that reasonable?

hashar added a subscriber: hashar.EditedAug 27 2019, 1:51 PM

From the HHVM documentation:

hhvm.resource_limit.serialization_size_limitint2146435072 (~2 MB)The maximum size of a serialized string.

Whereas in Zend PHP IIRC there is no limit. I guess MediaWiki could learn to avoid saving in the cache objects which are way too large.

I am not sure though why the error suddenly surfaces though :(

Regardless that does not seem to be a blocker for the train.

Great, thanks, I'll move this into our Clinic Duty

WDoranWMF lowered the priority of this task from Unbreak Now! to Medium.Aug 27 2019, 2:18 PM
mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:06 PM

We have largely migrated to PHP7.2 which does not seem to have a serialization size limit. HHVM had:

hhvm.resource_limit.serialization_size_limitint2146435072 (~2 MB)

Which also mean that previously we would never store a serialized object larger than 2MBytes but PHP 7.2 support would. That might eventually cause caches to hold objects which are way too large.

Krinkle removed a project: MediaWiki-Cache.EditedOct 9 2019, 7:43 PM

(Untagging MediaWiki-Cache as it does not appear to be an issue with the objectcache library.)

Anomie closed this task as Resolved.Oct 9 2019, 8:03 PM

This should be gone now due to T176370, we no longer use HHVM so its limitation on length for serialized string values should no longer cause a problem.