
TypeError: array_map(): Argument #2 ($array) must be of type array, int given
Closed, Resolved · Public · PRODUCTION ERROR

Description

Error
  • service.version: 1.45.0-wmf.7
  • timestamp: 2025-07-02T09:53:17.070Z
  • labels.phpversion: 8.1.32
  • trace.id: d38baafc-9ad6-4bd0-8f2a-608caf1b9c3e
  • Find trace.id in Logstash
labels.normalized_message
[{reqId}] {exception_url}   TypeError: array_map(): Argument #2 ($array) must be of type array, int given
Frame  Location  Call
from  /srv/mediawiki/php-1.45.0-wmf.7/extensions/EventBus/includes/EventBus.php(469)
#0  /srv/mediawiki/php-1.45.0-wmf.7/extensions/EventBus/includes/EventBus.php(469)  array_map(Closure, int)
#1  /srv/mediawiki/php-1.45.0-wmf.7/extensions/EventBus/includes/Adapters/JobQueue/JobQueueEventBus.php(120)  MediaWiki\Extension\EventBus\EventBus->send(array, int)
#2  /srv/mediawiki/php-1.45.0-wmf.7/includes/jobqueue/JobQueue.php(387)  MediaWiki\Extension\EventBus\Adapters\JobQueue\JobQueueEventBus->doBatchPush(array, int)
#3  /srv/mediawiki/php-1.45.0-wmf.7/includes/jobqueue/JobQueue.php(359)  MediaWiki\JobQueue\JobQueue->batchPush(array, int)
#4  /srv/mediawiki/php-1.45.0-wmf.7/includes/jobqueue/JobQueueGroup.php(164)  MediaWiki\JobQueue\JobQueue->push(array)
#5  /srv/mediawiki/php-1.45.0-wmf.7/extensions/GrowthExperiments/includes/UserImpact/GrowthExperimentsUserImpactUpdater.php(110)  MediaWiki\JobQueue\JobQueueGroup->push(array)
#6  /srv/mediawiki/php-1.45.0-wmf.7/extensions/GrowthExperiments/includes/UserImpact/MediaWikiEventIngress/PageRevisionUpdatedIngress.php(28)  GrowthExperiments\UserImpact\GrowthExperimentsUserImpactUpdater->refreshUserImpactData(MediaWiki\User\User)
#7  /srv/mediawiki/php-1.45.0-wmf.7/includes/DomainEvent/EventDispatchEngine.php(204)  GrowthExperiments\UserImpact\MediaWikiEventIngress\PageRevisionUpdatedIngress->handlePageRevisionUpdatedEvent(MediaWiki\Page\Event\PageLatestRevisionChangedEvent)
#8  /srv/mediawiki/php-1.45.0-wmf.7/includes/DomainEvent/EventDispatchEngine.php(193)  MediaWiki\DomainEvent\EventDispatchEngine->invoke(array, MediaWiki\Page\Event\PageLatestRevisionChangedEvent)
#9  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/MWCallableUpdate.php(52)  MediaWiki\DomainEvent\EventDispatchEngine->MediaWiki\DomainEvent\{closure}(string)
#10  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/DeferredUpdates.php(459)  MediaWiki\Deferred\MWCallableUpdate->doUpdate()
#11  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/DeferredUpdates.php(201)  MediaWiki\Deferred\DeferredUpdates::attemptUpdate(MediaWiki\Deferred\MWCallableUpdate)
#12  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/DeferredUpdates.php(288)  MediaWiki\Deferred\DeferredUpdates::run(MediaWiki\Deferred\MWCallableUpdate)
#13  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/DeferredUpdatesScope.php(243)  MediaWiki\Deferred\DeferredUpdates::MediaWiki\Deferred\{closure}(MediaWiki\Deferred\MWCallableUpdate, int)
#14  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/DeferredUpdatesScope.php(172)  MediaWiki\Deferred\DeferredUpdatesScope->processStageQueue(int, int, Closure)
#15  /srv/mediawiki/php-1.45.0-wmf.7/includes/deferred/DeferredUpdates.php(307)  MediaWiki\Deferred\DeferredUpdatesScope->processUpdates(int, Closure)
#16  /srv/mediawiki/php-1.45.0-wmf.7/includes/MediaWikiEntryPoint.php(670)  MediaWiki\Deferred\DeferredUpdates::doUpdates()
#17  /srv/mediawiki/php-1.45.0-wmf.7/includes/MediaWikiEntryPoint.php(492)  MediaWiki\MediaWikiEntryPoint->restInPeace()
#18  /srv/mediawiki/php-1.45.0-wmf.7/includes/MediaWikiEntryPoint.php(450)  MediaWiki\MediaWikiEntryPoint->doPostOutputShutdown()
#19  /srv/mediawiki/php-1.45.0-wmf.7/includes/MediaWikiEntryPoint.php(207)  MediaWiki\MediaWikiEntryPoint->postOutputShutdown()
#20  /srv/mediawiki/php-1.45.0-wmf.7/api.php(44)  MediaWiki\MediaWikiEntryPoint->run()
#21  /srv/mediawiki/w/api.php(3)  require(string)
#22  {main}
Notes
  • Appeared today Wed 2nd for the first time for 1.45.0-wmf.7
  • English wiki only so far
  • It looks like the issue in the code may have been there for a while and was only now surfaced by something else

Details

Request URL
https://en.wikipedia.org/w/api.php?action=edit&assert=*&format=*&minor=*&title=*&watchlist=*

Event Timeline

Restricted Application added a subscriber: Aklapper.

This lines up with:

2025-07-01
13:29	<urbanecm@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1164979|Growth: Configure higher impact module edit limits for english and test wiki (T341599)]] (duration: 19m 10s)

which is enwiki-specific. Looking.

The error itself comes from this section:

EventBus.php
// We expect the event service to return an array of objects
// in the response body.
// FormatJson::decode will return `null` if the message failed to parse.
// If anything other than an array is parsed we treat it as unexpected
// behaviour, and log the response at error severity.
// See https://phabricator.wikimedia.org/T370428

// $failureInfosByKind should look like:
// {
// 	  "<failure_kind>": [
// 		{ ..., "event": {<failed event here>}, "context": {<failure context here>} },
//		{ ... }
//    ],
// }
$failureInfosByKind = FormatJson::decode( $res['body'], true );
if ( is_array( $failureInfosByKind ) ) {
	foreach ( $failureInfosByKind as $failureKind => $failureInfos ) {
		// $failureInfos should not be null or empty.
		// This is just a guard against what the intake
		// service returns (or the behavior of different json parsing methods - possibly).
		// https://www.mediawiki.org/wiki/Manual:Coding_conventions/PHP#empty()
		if ( $failureInfos === null || $failureInfos === [] ) {
			continue;
		}

		// Get the events that failed from the response.
		$failedEvents = array_map(
			static function ( $failureStatus ) {
				return $failureStatus['event'] ?? null;
			},
			$failureInfos // <-- ERROR: This should be array but is int!
		);

This suggests that the event service is not sending a well-formed error response when, presumably, it has to deal with a request or response that is too large?

The link in the description "Find trace.id in Logstash" reveals that there is also a 500 error coming from EventBus, and that has the error message:

{"status":500,"type":"internal_error","title":"PayloadTooLargeError","detail":"request entity too large","method":"POST","uri":"/v1/events"}

Which would seem to confirm @Urbanecm_WMF's suspicion about what triggered this now.
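
To make the mechanism concrete, here is a minimal sketch of why array_map() ends up with an int. It uses plain json_decode() as a stand-in for FormatJson::decode (an assumption, so that the snippet runs outside MediaWiki) and feeds it the 500 body quoted above. Because the error body is itself a JSON object, the is_array() check passes, and the first value the foreach would see is the integer status code.

```php
<?php
// Body of the 500 response quoted above.
$body = '{"status":500,"type":"internal_error","title":"PayloadTooLargeError",'
	. '"detail":"request entity too large","method":"POST","uri":"/v1/events"}';

// Stand-in for FormatJson::decode( $res['body'], true ).
$decoded = json_decode( $body, true );

// The decoded error body is an associative array, but it is not a map
// of failure kinds to lists of failure infos.
$firstKey = array_key_first( $decoded );
$firstValue = $decoded[$firstKey];

// Same callback as in the quoted EventBus.php section.
$caught = null;
try {
	array_map(
		static function ( $failureStatus ) {
			return $failureStatus['event'] ?? null;
		},
		$firstValue
	);
} catch ( TypeError $e ) {
	$caught = $e->getMessage();
}

echo "$firstKey => " . var_export( $firstValue, true ) . "\n";
echo $caught . "\n";
```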

We certainly do have some impact data updated for enwiki since the deployment:

mysql:research@dbstore1009.eqiad.wmnet [enwiki]> select count(*) from growthexperiments_user_impact where geui_timestamp like '20250702%';
+----------+
| count(*) |
+----------+
|    23448 |
+----------+
1 row in set (0.006 sec)

mysql:research@dbstore1009.eqiad.wmnet [enwiki]>

If I'm reading those 500 events from EventBus correctly, then we're seeing over 500 of them, starting very recently:

image.png (295×818 px, 18 KB)

Probably all of them are caused by this issue.

I think that is enough for us to roll back.

That being said, if all events with an impact of, say, 1100 and up were causing this, I would expect many more of these errors to occur. So maybe the actual threshold is closer to the upper end of 10K here?

> \MediaWiki\MediaWikiServices::getInstance()->get('GrowthExperimentsUserImpactUpdater')
= GrowthExperiments\UserImpact\GrowthExperimentsUserImpactUpdater {#5139}

> $u = \MediaWiki\MediaWikiServices::getInstance()->get('GrowthExperimentsUserImpactUpdater')
= GrowthExperiments\UserImpact\GrowthExperimentsUserImpactUpdater {#5139}

> sudo $userImpactLookup = $u->userImpactLookup
= GrowthExperiments\UserImpact\ComputedUserImpactLookup {#7737}

> $user = User::newFromId(44217690)
= MediaWiki\User\User {#7871
    +mId: 44217690,
    +mName: null,
    +mActorId: null,
    +mRealName: null,
    +mEmail: null,
    +mTouched: null,
    +mEmailAuthenticated: null,
    +mFrom: "id",
    mId: 44217690,
    mName: null,
    mActorId: null,
    mRealName: null,
    mEmail: null,
    mTouched: null,
    mEmailAuthenticated: null,
    mFrom: "id",
  }

> $impact = $userImpactLookup->getExpensiveUserImpact($user, 1, [])
= GrowthExperiments\UserImpact\ExpensiveUserImpact {#21673}

> strlen(json_encode($impact))
= 11143829

> \MediaWiki\MediaWikiServices::getInstance()->getJobQueueGroup()->push(new \MediaWiki\JobQueue\JobSpecification('refreshUserImpactJob', ['impactDataBatch' => [$user->getId() => json_encode($impact)], 'staleBefore' => \MediaWiki\Utils\MWTimestamp::time() + 1]))

   TypeError  array_map(): Argument #2 ($array) must be of type array, int given.

>

I successfully reproduced this issue for a user on enwiki, see above.
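
The size measured in the session can be put into perspective with a small pre-flight check. This is only a sketch: the helper name exceedsPayloadLimit and the 4 MiB limit are hypothetical, since this task does not state the actual limit enforced by the event service.

```php
<?php
/**
 * Hypothetical helper: report whether a serialized payload is larger
 * than a given byte limit.
 */
function exceedsPayloadLimit( string $serialized, int $limitBytes ): bool {
	// strlen() counts bytes, which is what matters for the HTTP request body.
	return strlen( $serialized ) > $limitBytes;
}

// Size reported by strlen( json_encode( $impact ) ) in the session above.
$observedBytes = 11143829;
$observedMiB = round( $observedBytes / ( 1024 * 1024 ), 2 );

// Assumed limit, for illustration only; the real service limit is not given here.
$assumedLimitBytes = 4 * 1024 * 1024;

// Dummy payload of the same size as the observed one.
$payload = str_repeat( 'x', $observedBytes );
$tooLarge = exceedsPayloadLimit( $payload, $assumedLimitBytes );

echo "payload: $observedMiB MiB, too large: " . var_export( $tooLarge, true ) . "\n";
```

A check of this kind would only tell the caller that a payload is likely to be rejected; whether to split, truncate, or skip such a job is a separate decision.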

Nice! Then let's roll it back and figure something out.

Change #1165866 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] [Growth] enwiki: Decrease wgGEUserImpactMaxEdits to 1000

https://gerrit.wikimedia.org/r/1165866

Change #1165866 merged by jenkins-bot:

[operations/mediawiki-config@master] [Growth] enwiki: Decrease wgGEUserImpactMaxEdits to 1000

https://gerrit.wikimedia.org/r/1165866

Mentioned in SAL (#wikimedia-operations) [2025-07-02T12:48:57Z] <urbanecm@deploy1003> Started scap sync-world: Backport for [[gerrit:1165865|[Growth] Move Impact limit configuration to ext-GrowthExperiments (T341599)]], [[gerrit:1165866|[Growth] enwiki: Decrease wgGEUserImpactMaxEdits to 1000 (T398418 T341599)]]

I'm adding Data-Engineering because this is also surfacing an issue in the EventBus code: It should probably not break with a PHP TypeError in this situation. Also, I'm not sure if the service should respond with a 500 error in the first place.
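
One way the EventBus code could avoid the TypeError is sketched below, under assumptions: the helper name extractFailedEvents and the failure kind "invalid" are hypothetical, json_decode() stands in for FormatJson::decode, and this is not the change tracked in the follow-ups. The idea is simply to skip any entry whose value is not an array, so that an error body like the 500 response is ignored by this step rather than crashing it.

```php
<?php
/**
 * Hypothetical helper: collect failed events per failure kind, ignoring
 * entries that do not have the expected list-of-arrays shape.
 *
 * @param mixed $failureInfosByKind Decoded response body
 * @return array Failed events keyed by failure kind
 */
function extractFailedEvents( $failureInfosByKind ): array {
	$failedByKind = [];
	if ( !is_array( $failureInfosByKind ) ) {
		return $failedByKind;
	}
	foreach ( $failureInfosByKind as $failureKind => $failureInfos ) {
		// Scalars (e.g. "status": 500), null and empty lists are skipped.
		if ( !is_array( $failureInfos ) || $failureInfos === [] ) {
			continue;
		}
		$failedByKind[$failureKind] = array_map(
			static function ( $failureStatus ) {
				return is_array( $failureStatus ) ? ( $failureStatus['event'] ?? null ) : null;
			},
			$failureInfos
		);
	}
	return $failedByKind;
}

// An error body like the one seen in this task, and a body in the expected shape.
$errorBody = json_decode( '{"status":500,"title":"PayloadTooLargeError"}', true );
$normalBody = json_decode( '{"invalid":[{"event":{"id":1},"context":{}}]}', true );

$fromError = extractFailedEvents( $errorBody );
$fromNormal = extractFailedEvents( $normalBody );
```

Skipping malformed entries silently would hide the underlying 500, so a real fix would presumably also log the unexpected response.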

Mentioned in SAL (#wikimedia-operations) [2025-07-02T12:51:16Z] <urbanecm@deploy1003> urbanecm: Backport for [[gerrit:1165865|[Growth] Move Impact limit configuration to ext-GrowthExperiments (T341599)]], [[gerrit:1165866|[Growth] enwiki: Decrease wgGEUserImpactMaxEdits to 1000 (T398418 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-07-02T12:58:40Z] <urbanecm@deploy1003> Finished scap sync-world: Backport for [[gerrit:1165865|[Growth] Move Impact limit configuration to ext-GrowthExperiments (T341599)]], [[gerrit:1165866|[Growth] enwiki: Decrease wgGEUserImpactMaxEdits to 1000 (T398418 T341599)]] (duration: 09m 42s)

Urbanecm_WMF triaged this task as High priority.

Let's split the data-engineering portion to a followup that has clear context. Moving to sprint for now.

Filed two follow-ups:

With this, I believe this can be moved to QA, as the error should no longer happen, and follow-up work is tracked elsewhere.

Etonkovidova subscribed.

The errors seem to have stopped. The last timestamp is Jul 2, 2025 @ 12:50:29.022: https://logstash.wikimedia.org/goto/aea2868ce6912470dae4ae4784058c3b