Page MenuHomePhabricator

PHP Warning: Class RawMessage has no unserializer
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error
normalized_message
[{reqId}] {exception_url}   PHP Warning: Class RawMessage has no unserializer
exception.trace
from /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Model/Event.php(329)
#0 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Model/Event.php(329): unserialize(string)
#2 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Model/Event.php(409): MediaWiki\Extension\Notifications\Model\Event->loadFromRow(stdClass)
#3 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Model/Notification.php(152): MediaWiki\Extension\Notifications\Model\Event::newFromRow(stdClass)
#4 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Mapper/NotificationMapper.php(273): MediaWiki\Extension\Notifications\Model\Notification::newFromRow(stdClass)
#5 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Mapper/NotificationMapper.php(102): MediaWiki\Extension\Notifications\Mapper\NotificationMapper->fetchByUserInternal(User, integer, NULL, array, array, integer)
#6 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Api/ApiEchoNotifications.php(306): MediaWiki\Extension\Notifications\Mapper\NotificationMapper->fetchUnreadByUser(User, integer, NULL, array, NULL)
#7 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Api/ApiEchoNotifications.php(136): MediaWiki\Extension\Notifications\Api\ApiEchoNotifications->getPropList(User, array, array, integer, NULL, NULL, NULL, boolean, boolean)
#8 /srv/mediawiki/php-1.40.0-wmf.10/extensions/Echo/includes/Api/ApiEchoNotifications.php(66): MediaWiki\Extension\Notifications\Api\ApiEchoNotifications->getLocalNotifications(array)
#9 /srv/mediawiki/php-1.40.0-wmf.10/includes/api/ApiQuery.php(671): MediaWiki\Extension\Notifications\Api\ApiEchoNotifications->execute()
#10 /srv/mediawiki/php-1.40.0-wmf.10/includes/api/ApiMain.php(1902): ApiQuery->execute()
#11 /srv/mediawiki/php-1.40.0-wmf.10/includes/api/ApiMain.php(877): ApiMain->executeAction()
#12 /srv/mediawiki/php-1.40.0-wmf.10/includes/api/ApiMain.php(848): ApiMain->executeActionWithErrorHandling()
#13 /srv/mediawiki/php-1.40.0-wmf.10/api.php(90): ApiMain->execute()
#14 /srv/mediawiki/php-1.40.0-wmf.10/api.php(45): wfApiMain()
#15 /srv/mediawiki/w/api.php(3): require(string)
#16 {main}
Impact
Notes

11 of these after moving 1.40.0-wmf.10 (T320515) to group1.

Details

Request URL
https://commons.wikimedia.org/w/api.php?action=query&centralauthtoken=*&errorformat=*&errorsuselocal=*&format=*&formatversion=*&meta=*&notfilter=*&notlimit=*&notprop=*&notwikis=*

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

It might be transitionary due to rMW27ee63f8c9d9: Remove pre PHP 7.4 serialize()/unserialize() (changes to Message.php).

If it goes away (and stays gone), I won't be too worried!

Agreed, if these values are only used within caches local to a single wiki, then the transition will be graceful and runtime naturally self-corrects anyway.

However, if they are in cross-wiki caches then we will see it continously for the next few days, which at low volume might still be fine assuming it self-corrects each time at runtime (by discarding the cache value as we do) until group2 where it would converge on the new value.

However, if they are stored in the database, then it might be a bigger issue as it would continue to see old values. It seems from a quick look that these are actually stored in the database given Event->loadFromRow and a direct call to unserialize in Echo's code, i.e. not implicitly via a self-correcting cache layer.

As for whether how big an issue that is and what the long-term user impact is, I defer to Growth-Team to assess how much and whether it matters that $event->extra will become boolean false instead of the metadata array that is normally there (for integrity, the entire array would be discarded, not just the field containing the RawMessage object). Source: https://gerrit.wikimedia.org/g/mediawiki/extensions/Echo/+/a17802dec70dd872b13c4f651f8163d365bdaf7f/includes/Model/Event.php#329

As for actual impact, Logstash "mediawiki" dashboard shows 12 matches for message:"Class RawMessage has no unserializer" in the past 24 hours, all on commonswiki. I note that this might not be a meaningful analysis at this time, given the train is currently rolled back to group0.

Noting that this hasn't gone away yet, at least. 25 in the last hour.

Why is the error specific to RawMessage? It shouldn't be different from Message in any relevant way. It stores the message text in the message key, so any logic that serializes Message should also work for RawMessage.

As for whether how big an issue that is and what the long-term user impact is, I defer to Growth-Team to assess how much and whether it matters that $event->extra will become boolean false instead of the metadata array that is normally there (for integrity, the entire array would be discarded, not just the field containing the RawMessage object).

It contains all the event-specific data like the target revision, reason, in some cases the title; so it would be pretty bad.

In any case, wouldn't the policy-compliant approach to rMW27ee63f8c9d9: Remove pre PHP 7.4 serialize()/unserialize() be to hard-deprecate the interface methods and keep them for another release?

Why is the error specific to RawMessage? It shouldn't be different from Message in any relevant way. It stores the message text in the message key, so any logic that serializes Message should also work for RawMessage.

Maybe the way RawMessage is being used... If it's just Echo?

In any case, wouldn't the policy-compliant approach to rMW27ee63f8c9d9: Remove pre PHP 7.4 serialize()/unserialize() be to hard-deprecate the interface methods and keep them for another release?

Wouldn't hard deprection instead result in deprecated warnings? Which is maybe only marginally better, if something is actually functionally broken (in terms of user experience etc) with code as is.

The methods shouldn't be needed now that we're on PHP 7.4 in production though... As per https://php.watch/versions/8.1/serializable-deprecated

Maybe the way RawMessage is being used... If it's just Echo?

I suppose it's the only place in the codebase where a message gets serialized in long-term storage?

The methods shouldn't be needed now that we're on PHP 7.4 in production though... As per https://php.watch/versions/8.1/serializable-deprecated

The new magic-method-based serialization interface is not backwards-compatible with old serialized strings. 3v4l: old serialization code, new unserialization code. IIRC @tstarling patched PHP temporarily to avoid this issue in short-lived caches, but since apparently Message is present in long-term storage, we'll probably need a migration script.

Wouldn't hard deprection instead result in deprecated warnings? Which is maybe only marginally better, if something is actually functionally broken (in terms of user experience etc) with code as is.

It wasn't broken until the Serializable interface got removed. 3v4l: new unserialization code with compat

That code tries to discard the notification entirely (and references T73489: Echo: Special:Notifications Exception from line of : DateTimeZone::__construct(): Unknown or bad timezone (+00:00)) but fails because PHP issues a warning, not an exception. I wonder if that would be preferable.

Wouldn't hard deprection instead result in deprecated warnings? Which is maybe only marginally better, if something is actually functionally broken (in terms of user experience etc) with code as is.

It wasn't broken until the Serializable interface got removed. 3v4l: new unserialization code with compat

I didn't say it was broken before this patch. I was meaning if it was broken now, in the current state, with the Serializable interface removed...

The user impact isn't seemingly clear.

Logspam of the deprecation is not much better, if we just reinstated and hard deprecated.... Because usually we don't hard deprecate something until all the WMF Production usages are fixed, so we wouldn't do that here either.

Maybe the way RawMessage is being used... If it's just Echo?

I suppose it's the only place in the codebase where a message gets serialized in long-term storage?

Maybe. We also know how php serialisation sucks generally... See T161647: RFC: Deprecate using php serialization inside MediaWiki

If we're going to improve this, we don't want to just move to a newer PHP serialisation that may or may not get broken again in future.

I didn't say it was broken before this patch. I was meaning if it was broken now, in the current state, with the Serializable interface removed...

The user impact isn't seemingly clear.

The user impact is that ~1000 times a day notifications display incorrectly, since the information they are supposed to show (like which revision to give a difflink to) is unavailable.

Logspam of the deprecation is not much better, if we just reinstated and hard deprecated.... Because usually we don't hard deprecate something until all the WMF Production usages are fixed, so we wouldn't do that here either.

For one thing, it gives a fighting chance to third-party extensions which are affected by the same issue to fix their code before breaking. For another, it would have let us learn about this issue without causing user-visible issues.

If we're going to improve this, we don't want to just move to a newer PHP serialisation that may or may not get broken again in future.

We are talking about how to serialize Message objects here, the decision to use new PHP serialization instead of, say, JSON, was already made. Having Message serialization logic that lives inside Echo and not the Message class would be a rather bad idea.

Should Message use a less risky serialization method? Probably, but that's not something to delay fixing this bug for.

Although we should check if we can just get rid of using RawMessage in Echo notifications. It's basically just a string, wrapped in a (terrible) compatibility layer so it can be passed to code that expects a Message; we can probably just make the relevant code take a string instead. Then the migration could just replace the serialized RawMessage with a serialized string.

Anyway, the first step is to add logging to figure out what type of event is doing this.

Change 858323 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/core@master] Partial Revert "Remove pre PHP 7.4 serialize()/unserialize()"

https://gerrit.wikimedia.org/r/858323

Change 858323 merged by jenkins-bot:

[mediawiki/core@master] Partial Revert "Remove pre PHP 7.4 serialize()/unserialize()"

https://gerrit.wikimedia.org/r/858323

Change 860030 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/core@wmf/1.40.0-wmf.10] Partial Revert "Remove pre PHP 7.4 serialize()/unserialize()"

https://gerrit.wikimedia.org/r/860030

Change 860030 merged by jenkins-bot:

[mediawiki/core@wmf/1.40.0-wmf.10] Partial Revert "Remove pre PHP 7.4 serialize()/unserialize()"

https://gerrit.wikimedia.org/r/860030

Mentioned in SAL (#wikimedia-operations) [2022-11-23T21:59:50Z] <reedy@deploy1002> Synchronized php-1.40.0-wmf.10/includes/language/Message.php: T323236 (duration: 04m 35s)

Reedy claimed this task.

@Reedy is there a followup needed, or do we just plan to keep the Serializable interface forever? If we do, a source code comment about why we are doing it might be nice.

I doubt we can keep it forever; PHP will remove it eventually.

This task was about the production error, which was fixed.

If we want to do followup work (such as logging, updating Echo etc), they should probably be done in different tasks

Removing it would break any system which stores serialized data for a long time so I somewhat doubt that. Nevertheless, if you want to re-revert the patch anytime soon, we need a migration script that re-serializes the data using the new storage format. If there are no such plans, we can just leave it as it is (and migrate to JSON instead at some point in the future, I imagine).

Removing it would break any system which stores serialized data for a long time so I somewhat doubt that.

https://wiki.php.net/rfc/phase_out_serializable

In PHP 9.0 the Serializable interface will be removed and unserialize() will reject payloads using the C serialization format. Code needing to support both PHP < 7.4 and PHP >= 9.0 may polyfill the Serializable interface, though it will have no effect on serialization.

The RFC was approved unanimously.

Thanks, that's good to know.

Based on the PHP lifecycle and how much behind it we are, I expect that will become relevant in 3-4 years? Probably makes more sense to wait for Message to become JSON-serializable then.

Filed as T325703: Switch Echo serialization format from PHP to JSON.

Thanks, that's good to know.

Based on the PHP lifecycle and how much behind it we are, I expect that will become relevant in 3-4 years? Probably makes more sense to wait for Message to become JSON-serializable then.

Thanks for filing the task.

I guess we should be expecting PHP 8.3 November/December 2023, PHP 8.4 potentially November/December 2024 etc.

I can't see anything obvious as to when PHP 9 may even come out, so we've definitely got some time (or, one would hope).

Would obviously be nice to get ahead of the curve for a change! But it's "good" to know that only things using the Message heirarchy that was known to be broken at this stage.

We probably need to do some more looking around to see what's lurking, as ConcatenatedGzipHistoryBlob is documented as

WARNING: Objects of this class are serialized and permanently stored in the DB.

Suggesting we're going to have to do something for that longer term... Whether writing new/custom __unserialize() functions...