Page MenuHomePhabricator

[ES] Investigate OutOfMemory error in browser test, remove workaround
Open, Needs TriagePublic

Description

In T354179, we added a work-around for a CI-blocker - the Schema Edit Page test suite was failing because the first web request to the mediawiki in CI was returning:

Fatal error: Allowed memory size of 134217728 bytes exhausted...

Acceptance criteria:

  • CI for EntitySchema is green on master
  • The dummy request introduced in 987396 is removed from the test setup.

Event Timeline

ArthurTaylor renamed this task from Investigate OutOfMemory error in browser test, remove workaround to [ES] Investigate OutOfMemory error in browser test, remove workaround.Feb 2 2024, 10:46 AM

Fatal error: Allowed memory size of 134217728 bytes exhausted...

I don't know what is the memory limit on CI, and 128MBytes sounds on the low end. Looking at docker-registry.wikimedia.org/releng/quibble-buster-php74:1.6.0-s4:

$ grep -R memory_limit /etc/php
/etc/php/7.4/cli/php.ini:memory_limit = -1
/etc/php/current/cli/php.ini:memory_limit = -1

The CLI definitely runs without a memory limit but the web requests are hitting Apache / php-fpm and checking the later:

$ php-fpm7.4 -i|grep memory_limit
memory_limit => 128M => 128M

Is it too low? Possibly, then we have apparently been running with that low level limit for years.

What I suspect is the first web request triggers the generation of a large set of data for cache purpose. What comes to mind is the localization cache, even though at least the English one is generated by Quibble after having installed and updated MediaWiki:

  • maybe your request triggers the generation of the localization cache for the German language (I have seen that in the past with eg the language team using Hebrew for development since it is a right to left language)
  • maybe the localization cache generated from the CLI is not used by the php fpm process:
// Localisation Cache to StaticArray (T218207)
$wgLocalisationCacheConf['store'] = 'array';

I am not sure whether T218207 is related, but the config comes from https://gerrit.wikimedia.org/r/c/mediawiki/core/+/529057

That one sounds like the cache is only stored in the local process and vanish upon completion? That would means each web request would trigger a rebuild of the localization? :\

$wgLocalisationCacheConf['store'] = 'array'; that is the format used to store the cache. Looking at the mw-debug-www.log generated by a Selenium, it gets the localization cache from Memcached:

[MessageCache] MessageCache using store MemcachedPhpBagOStuff
[memcached] MemCache: got WANCache:wikidb:messages:en:hash:v1|#|v
[memcached] MemCache: got WANCache:wikidb:messages:en|#|t
[memcached] get(wikidb:messages:en)
[memcached] MemCache: sock i:0; got wikidb:messages:en
[MessageCache] MessageCache::loadUnguarded: Loading en... local cache is empty, got from global cache

If I look at the GET / originating from Cypress (which is the first in the debug log), I don't see anything suspicious beside the recentChangesUpdate job, but maybe that is the one taking a lot of memory.

Start request GET /
IP: 127.0.0.1
HTTP HEADERS:
ACCEPT-ENCODING: gzip, deflate
HOST: 127.0.0.1:9413
ACCEPT: */*
USER-AGENT: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Cypress/13.6.4 Chrome/114.0.5735.289 Electron/25.8.4 Safari/537.36
CONNECTION: keep-alive
(end headers)
[Wikibase] Wikibase\Lib\SettingsArray::getSetting: setting entitySources was given as a closure, resolve it to array (
  'local' => 
  array (
    'entityNamespaces' => 
    array (
      'item' => 120,
      'property' => 122,
      'lexeme' => 146,
      'mediainfo' => '6/mediainfo',
    ),
    'repoDatabase' => false,
    'baseUri' => 'http://127.0.0.1:9413/entity/',
    'rdfNodeNamespacePrefix' => 'wd',
    'rdfPredicateNamespacePrefix' => '',
    'interwikiPrefix' => '',
  ),
)
[session] SessionManager using store MemcachedPhpBagOStuff
[localisation] LocalisationCache using store LCStoreStaticArray
[memcached] MainWANObjectCache using store MemcachedPhpBagOStuff
[memcached] MicroStash using store MemcachedPhpBagOStuff
[SQLBagOStuff] MainObjectStash using store SqlBagOStuff
[session] CentralAuthSessionProvider::provideSessionInfo: Not enabled, falling back to core sessions
[session] SessionBackend "791jiahib6s5edn2jaompkac276h33e0" is unsaved, marking dirty in constructor
[session] SessionBackend "791jiahib6s5edn2jaompkac276h33e0" save: dataDirty=1 metaDirty=1 forcePersist=0
[cookie] already deleted setcookie: "wikidb_session", "", "1676021349", "/", "", "", "1", ""
[cookie] already deleted setcookie: "wikidbUserID", "", "1676021349", "/", "", "", "1", ""
[cookie] already deleted setcookie: "wikidbToken", "", "1676021349", "/", "", "", "1", ""
[session] SessionBackend "791jiahib6s5edn2jaompkac276h33e0" data dirty due to dirty(): MediaWiki\Session\SessionManager->getSessionForRequest/MediaWiki\Session\SessionManager->getInitialSession/MediaWiki\Session\Session->getToken/MediaWiki\Session\Session->set/MediaWiki\Session\SessionBackend->dirty
[session] SessionBackend "791jiahib6s5edn2jaompkac276h33e0" save: dataDirty=1 metaDirty=0 forcePersist=0
[EventLogging] wgEventLoggingBaseUri has not been configured.
[MessageCache] MessageCache using store MemcachedPhpBagOStuff
ParserFactory: using default preprocessor
[memcached] MemCache: got WANCache:wikidb:messages:en:hash:v1|#|v
[memcached] MemCache: got WANCache:wikidb:messages:en|#|t
[memcached] get(wikidb:messages:en)
[memcached] MemCache: sock i:0; got wikidb:messages:en
[MessageCache] MessageCache::loadUnguarded: Loading en... local cache is empty, got from global cache
MediaWiki\MediaWikiEntryPoint::commitMainTransaction: primary transaction round committed
MediaWiki\MediaWikiEntryPoint::commitMainTransaction: pre-send deferred updates completed
MediaWiki\MediaWikiEntryPoint::commitMainTransaction: session changes committed
[rdbms] Wikimedia\Rdbms\ChronologyProtector::persistSessionReplicationPositions: no primary positions data to save
[rdbms] Wikimedia\Rdbms\LBFactory::shutdown: finished ChronologyProtector shutdown
[rdbms] LBFactory shutdown completed
MediaWiki\Output\OutputPage::haveCacheVaryCookies: no cache-varying cookies found
[memcached] add WANCache:global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket|#|m (STORED)
[rdbms] Wikimedia\Rdbms\DatabaseMySQL::open [0.436ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SET group_concat_max_len = 262144, `sql_mode` = 'STRICT_ALL_TABLES,ONLY_FULL_GROUP_BY'
[rdbms] Wikimedia\Rdbms\LoadBalancer::reallyOpenConnection: opened new connection for 0/
[memcached] add WANCache:global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket|#|m (NOT_STORED)
[memcached] fetchOrRegenerate(global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket): busy miss
[rdbms] Wikimedia\Rdbms\DatabaseMySQL::serverIsReadOnly [1.186ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT @@GLOBAL.read_only AS Value
[memcached] client: serializing data as it is not scalar
[memcached] add WANCache:global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket|#|v (STORED)
[memcached] MemCache: delete WANCache:global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket|#|m (DELETED)
[memcached] fetchOrRegenerate(global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket): miss, new value computed
[rdbms] Wikimedia\Rdbms\DatabaseMySQL::doSelectDomain [0.418ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: USE `wikidb`
[rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 0/wikidb
[rdbms] Wikimedia\Rdbms\Database::beginIfImplied (MediaWiki\Page\PageStore::getPageByNameViaLinkCache) [0.139ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: BEGIN
[rdbms] MediaWiki\Page\PageStore::getPageByNameViaLinkCache [0.996ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  page_id,page_namespace,page_title,page_is_redirect,page_is_new,page_touched,page_links_updated,page_latest,page_len,page_content_model  FROM `page`    WHERE page_namespace = 0 AND page_title = 'Main_Page'  LIMIT 1  
[ContentHandler] Registered handler for wikitext: WikitextContentHandler
MediaWiki\Output\OutputPage::sendCacheControl: private caching (config); Sat, 10 Feb 2024 09:29:09 GMT **
[rdbms] MediaWiki\MediaWikiEntryPoint::restInPeace [0.411ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: COMMIT
[DeferredUpdates] DeferredUpdates::run: started MediaWiki\Deferred\TransactionRoundDefiningUpdate_MediaWiki\MediaWikiEntryPoint::schedulePostSendJobs #1305
[memcached] getWithSetCallback(global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket): process cache hit
[rdbms] Wikimedia\Rdbms\DatabaseMySQL::open [0.213ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SET group_concat_max_len = 262144, `sql_mode` = 'STRICT_ALL_TABLES,ONLY_FULL_GROUP_BY'
[rdbms] Wikimedia\Rdbms\LoadBalancer::reallyOpenConnection: opened new connection for 0/wikidb
[memcached] getWithSetCallback(global:rdbms-server-readonly:localhost%3A/workspace/db/quibble-mysql-wa62l0ku/socket): process cache hit
[rdbms] JobQueueDB::doGetSiblingQueuesWithJobs [3.164ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT DISTINCT job_cmd  FROM `job`    WHERE job_cmd IN ('updateBetaFeaturesUserCounts','crosswikiSuppressUser','LocalRenameUserJob','LocalPageMoveJob','CentralAuthCreateLocalAccountJob','CentralAuthUnattachUserJob','checkuserLogTemporaryAccountAccess','checkuserPruneCheckUserDataJob','cirrusSearchDeletePages','cirrusSearchIncomingLinkCount','cirrusSearchLinksUpdate','cirrusSearchLinksUpdatePrioritized','cirrusSearchMassIndex','cirrusSearchOtherIndex','cirrusSearchElasticaWrite','cirrusSearchCheckerJob','cirrusSearchDeleteArchive','EchoNotificationJob','EchoNotificationDeleteJob','EchoPushNotificationRequest','flaggedrevs_CacheUpdate','RecordLintJob','securePollPopulateVoterList','securePollTallyElection','securePollLogAdminAction','securePollArchiveElection','securePollUnarchiveElection','webVideoTranscode','webVideoTranscodePrioritized','CleanTermsIfUnused','PurgeEntityData','UpdateRepoOnMove','UpdateRepoOnDelete','DispatchChanges','DispatchChangeDeletionNotification','DispatchChangeVisibilityNotification','wikibase-addUsagesForPage','EntityChangeNotification','ChangeVisibilityNotification','ChangeDeletionNotification','wikibase-InjectRCRecords','constraintsTableUpdate','constraintsRunCheck','deletePage','refreshLinks','deleteLinks','htmlCacheUpdate','sendMail','enotifNotify','fixDoubleRedirect','AssembleUploadChunks','PublishStashedFile','ThumbnailRender','recentChangesUpdate','refreshLinksPrioritized','refreshLinksDynamic','activityUpdateJob','categoryMembershipChange','clearUserWatchlist','watchlistExpiry','cdnPurge','userGroupExpiry','clearWatchlistNotifications','userOptionsUpdate','revertedTagUpdate','null','userEditCountInit','parsoidCachePrewarm','renameUser')   
[rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 0/wikidb
[rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 0/wikidb
[rdbms] JobQueueDB::claimRandom [0.653ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  job_id,job_cmd,job_namespace,job_title,job_timestamp,job_params,job_random,job_attempts,job_token,job_token_timestamp,job_sha1  FROM `job`    WHERE job_cmd = 'recentChangesUpdate' AND job_token = ''  LIMIT 205,1  
[memcached] get(WANCache:global:jobqueue:wikidb:main:recentChangesUpdate:small|#|v)
[memcached] client: serializing data as it is not scalar
[memcached] add WANCache:global:jobqueue:wikidb:main:recentChangesUpdate:small|#|v (STORED)
[rdbms] JobQueueDB::claimRandom [0.669ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  job_id,job_cmd,job_namespace,job_title,job_timestamp,job_params,job_random,job_attempts,job_token,job_token_timestamp,job_sha1  FROM `job`    WHERE job_cmd = 'recentChangesUpdate' AND job_token = '' AND (job_random >= 1611712207)  ORDER BY job_random ASC LIMIT 1  
[rdbms] JobQueueDB::claimRandom [0.584ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  job_id,job_cmd,job_namespace,job_title,job_timestamp,job_params,job_random,job_attempts,job_token,job_token_timestamp,job_sha1  FROM `job`    WHERE job_cmd = 'recentChangesUpdate' AND job_token = '' AND (job_random <= 1611712207)  ORDER BY job_random DESC LIMIT 1  
[rdbms] JobQueueDB::claimRandom [0.823ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: UPDATE  `job` SET job_token = '1c323f030c8fd7bd4f612ef55bfbee82',job_token_timestamp = '20240210092909',job_attempts = job_attempts+1 WHERE job_cmd = 'recentChangesUpdate' AND job_id = '1' AND job_token = ''
[runJobs] recentChangesUpdate Special:RecentChanges type=cacheUpdate namespace=-1 title=RecentChanges requestId=10ae3e5465dbdd29dfc90672 (id=1,timestamp=20240210092834) STARTING
[rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 0/wikidb
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.405ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT IF(GET_LOCK('wikidb-activeusers',0),UNIX_TIMESTAMP(SYSDATE(6)),NULL) AS acquired
[rdbms] Wikimedia\Rdbms\DatabaseMySQL::setSessionOptions [0.165ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SET net_read_timeout=900, net_write_timeout=900
[rdbms] Wikimedia\Rdbms\Database::beginIfImplied (RecentChangesUpdateJob::updateActiveUsers) [0.084ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: BEGIN
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.825ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  qci_timestamp  FROM `querycache_info`    WHERE qci_type = 'activeusers'  LIMIT 1  
[rdbms] RecentChangesUpdateJob::updateActiveUsers [1.089ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  actor_name,MAX(rc_timestamp) AS `lastedittime`  FROM `recentchanges` JOIN `actor` ON ((actor_id=rc_actor))   WHERE (actor_user IS NOT NULL) AND (rc_type != 5) AND ((rc_log_type IS NULL OR rc_log_type != 'newusers')) AND (rc_timestamp >= '20240111092909') AND (rc_timestamp <= '20240210092909')  GROUP BY actor_name ORDER BY NULL 
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.429ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  qcc_title AS `user_name`  FROM `querycachetwo`    WHERE qcc_type = 'activeusers' AND qcc_namespace = 2 AND qcc_title = 'MediaWiki default' AND (qcc_value >= 1704965349)  
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.834ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: INSERT INTO `querycachetwo` (qcc_type,qcc_namespace,qcc_title,qcc_value,qcc_namespacetwo,qcc_titletwo) VALUES ('activeusers',2,'MediaWiki default',1707557314,0,'')
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.716ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: COMMIT
[rdbms] Wikimedia\Rdbms\LoadBalancer::getAnyOpenConnection: found 'round' connection to #0.
[rdbms] Wikimedia\Rdbms\Database::beginIfImplied (RecentChangesUpdateJob::updateActiveUsers) [0.073ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: BEGIN
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.168ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: REPLACE INTO `querycache_info` (qci_type,qci_timestamp) VALUES ('activeusers','20240210092909')
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.228ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: DELETE FROM `querycachetwo` WHERE qcc_type = 'activeusers' AND (qcc_value < 1704965349)
[rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 0/wikidb
[rdbms] MediaWiki\Deferred\SiteStatsUpdate::cacheUpdate [0.557ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT  COUNT(DISTINCT rc_actor)  FROM `recentchanges` JOIN `actor` ON ((actor_id=rc_actor))   WHERE (rc_type != 5) AND (actor_user IS NOT NULL) AND (rc_bot = 0) AND ((rc_log_type != 'newusers' OR rc_log_type IS NULL)) AND (rc_timestamp >= '20240111092909')  LIMIT 1  
[rdbms] MediaWiki\Deferred\SiteStatsUpdate::cacheUpdate [0.232ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: UPDATE  `site_stats` SET ss_active_users = 1 WHERE ss_row_id = 1
[rdbms] RecentChangesUpdateJob::updateActiveUsers [0.155ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: SELECT RELEASE_LOCK('wikidb-activeusers') AS released
[rdbms] RecentChangesUpdateJob::run [0.284ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: COMMIT
[runJobs] recentChangesUpdate Special:RecentChanges type=cacheUpdate namespace=-1 title=RecentChanges requestId=10ae3e5465dbdd29dfc90672 (id=1,timestamp=20240210092834) t=12 good
[rdbms] Wikimedia\Rdbms\LoadBalancer::reuseOrOpenConnectionForNewRef: reusing connection for 0/wikidb
[rdbms] JobQueueDB::doAck [0.718ms] localhost:/workspace/db/quibble-mysql-wa62l0ku/socket: DELETE FROM `job` WHERE job_cmd = 'recentChangesUpdate' AND job_id = '1'
[DeferredUpdates] DeferredUpdates::run: ended MediaWiki\Deferred\TransactionRoundDefiningUpdate_MediaWiki\MediaWikiEntryPoint::schedulePostSendJobs #1305, processing time: 0.032116174697876
[rdbms] LBFactory shutdown completed
Request ended normally
[session] Saving all sessions on shutdown

$wgLocalisationCacheConf['store'] = 'array'; that is the format used to store the cache. […] it gets the localization cache from Memcached:

[MessageCache] MessageCache using store MemcachedPhpBagOStuff
[memcached] MemCache: got WANCache:wikidb:messages:en|#|t

There does not exist a Memcached backend for LocalisationCache. LCStore has only file-based (CDB or PHP) and DB-based implementations. What you see here is the MessageCache, which is a layer on top of LocalisationCache, where we cache the message overrides from NS_MEDIAWIKI pages.

Start request GET /
…
[localisation] LocalisationCache using store LCStoreStaticArray
…

It uses the static PHP files for LocalisationCache.

In T356896 it looks like we’re hitting a similar issue in QUnit tests – at least the timing seems to match up, as far as I can tell. Looking at this build:

mw-error.log
2024-02-14 16:47:27 17e1c5712357 wikidb: [747f2a954b9433cf796e8b21] /load.php?modules=mediawiki.language%2CmessagePoster%2Cpulsatingdot%2Cqunit-testrunner%2Crouter%2Cstorage%2Ctemplate%2Ctoc%2Cuser%2Cutil%2CvisibleTimeout%2Cwidgets%7Cmediawiki.language.months%2Cnames%2CspecialCharacters%2Ctestdata%7Cmediawiki.libs.guiders%2Cjpegmeta%2Cpluralruleparser%7Cmediawiki.page.gallery.styles%7Cmediawiki.page.ready%7Cmediawiki.page.watch.ajax%7Cmediawiki.rcfilters.filters.ui%7Cmediawiki.skinning.content.parsoid%7Cmediawiki.template.handlebars%2Cmustache%2Cvue%7Cmediawiki.ui.button%2Cicon%7Cmediawiki.widgets.AbandonEditDialog%2CCategoryMultiselectWidget%2CDateInputWidget%2CMediaSearch%2CTable%2CUserInputWidget%2Cstyles%7Cmediawiki.widgets.DateInputWidget.styles%7Cmobile.ooui.icons%7Cmobile.pagelist.styles%7Cmobile.pagesummary.styles%7Cmobile.placeholder.images%7Cmobile.startup%7Cmobile.startup.images%7Cmoment%2Coojs%2Coojs-ui%2Coojs-ui-core%2Coojs-ui-toolbars%2Coojs-ui-widgets%2Coojs-ui-windows%2Cpapaparse%2Crangefix%2Csinonjs%2Cspark-md5%2CtreeDiffer%2Cunicodejs%2CvalueFormatters%2CvalueParsers%7Cmw.config.values.wbDataTypes%2CwbRefTabsEnabled%2CwbRepo%2CwbSiteDetails%7Coojs-ui-core.icons%2Cstyles%7Coojs-ui-toolbars.icons%7Coojs-ui-widgets.icons%7Coojs-ui-windows.icons%7Coojs-ui.styles.icons-accessibility%2Cicons-alerts%2Cicons-content%2Cicons-editing-advanced%2Cicons-editing-citation%2Cicons-editing-core%2Cicons-editing-list%2Cicons-editing-styling%2Cicons-interactions%2Cicons-layout%2Cicons-media%2Cicons-moderation%2Cicons-movement%2Cicons-user%2Cindicators%7Cskins.minerva.messageBox.styles%7Csocket.io%7Ctest.ArticlePlaceholder%2CCentralAuth%2CCite%2CDiscussionTools%2CEcho%2CEventLogging%2CFlow%2CGuidedTour%2CMediaWiki%2CMinervaNeue%2CTemplateData%2CThanks%2CUniversalLanguageSelector%2CVisualEditor%2CWikiEditor%2CWikibaseClient%2CWikibaseLexeme%2CWikibaseRepository%2CWikimediaEvents%7Cutil.ContentLanguages%2CExtendable%2CMessageProvider%2CMessageProviders%2CNotifier%2ChighlightSubstring%2Cinherit%7CvalueParsers.ValueParserStore%2Cparsers&version=1h1ft   PHP Fatal Error from line 71 of /workspace/src/includes/xml/XmlJsCode.php: Allowed memory size of 134217728 bytes exhausted (tried to allocate 21670080 bytes)
#0 [internal function]: MWExceptionHandler::handleFatalError()
#1 {main}
Console Output
16:47:26 Chrome Headless 90.0.4430.212 (Linux x86_64) INFO: 'You are running a development build of Vue.
16:47:26 Make sure to use the production build (*.prod.js) when deploying for production.'
16:47:26 Chrome Headless 90.0.4430.212 (Linux x86_64) WARN: 'This page is using the deprecated ResourceLoader module "jquery.tipsy".'
16:47:27 Chrome Headless 90.0.4430.212 (Linux x86_64) WARN: 'This page is using the deprecated ResourceLoader module "jquery.ui".
16:47:27 Please use OOUI instead.'
16:47:27 14 02 2024 16:48:27.214:WARN [Chrome Headless 90.0.4430.212 (Linux x86_64)]: Disconnected (0 times) , because no message in 60000 ms.
16:48:27 Chrome Headless 90.0.4430.212 (Linux x86_64) ERROR
16:48:27   Disconnected , because no message in 60000 ms.
16:48:27 14 02 2024 16:48:27.217:DEBUG [Chrome Headless 90.0.4430.212 (Linux x86_64)]: CONFIGURING -> DISCONNECTED

I've done some more investigation here over in T356896 - I'm pretty certain that this is nothing to do with an external cache or about localisation. It seems just to be that we have big test suites and a greedy internal cache of objects in the ResourceLoader - multiple copies of 30MBs of strings are created during the module building and Javascript minification process.

I added a patch there and it moved the dial a little bit. I can keep digging and trying to optimise things, but my question would be how much effort it makes sense to spend here to optimise a case that we don't have in production (where we're not trying to serve 30MB of QUnit tests), vs. simply upping the memory limit for the CI containers for QUnit runners.

@hashar, @Lucas_Werkmeister_WMDE - what do you think? On the one hand, I can see that it's good to run the tests in the same constrained environment that we run production in - not doing that will cause some issues to slip through the testing net. On the other hand, QUnit and our current ResourceLoader implementation is a pretty memory-hungry combination and we seem (according to my research at least) just to be pushing up against the limits of that with a 128MB memory constraint.

Prio Notes:

Impact AreaAffectedNotes
production / end usersMemory savings should help in production too.
monitoring
development efforts
onboarding efforts
additional stakeholdersAffects shared CI environment.

Change 1008855 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[mediawiki/extensions/EntitySchema@master] Remove dispensable start-up request introduced as CI workaround

https://gerrit.wikimedia.org/r/1008855

On the one hand, I can see that it's good to run the tests in the same constrained environment that we run production in - not doing that will cause some issues to slip through the testing net.

But are we sure that the CI limit even matches production? Because in mediawiki/php.pp I see 'memory_limit' => '500M', and the actual memory limit in a maintenance script seems to be even larger (1400 MiB), though I don’t know why.

lucaswerkmeister-wmde@mwdebug1002:~$ mwscript shell wikidatawiki --memory-limit=default
Psy Shell v0.12.0 (PHP 7.4.33 — cli) by Justin Hileman
> ini_get('memory_limit')
= "1468006400"
lucaswerkmeister-wmde@mwdebug1002:~$ grep -r memory_limit /etc/php/7.4/
/etc/php/7.4/fpm/php.ini:memory_limit = 500M
/etc/php/7.4/cli/php.ini:memory_limit = 500M
/etc/php/7.4/phpdbg/php.ini:memory_limit = 128M

If that’s true, I think it would be reasonable to bump the CI limit to at least 256M.

Change 1012616 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] dockerfiles: Quibble php-fpm memory_limit to 256M

https://gerrit.wikimedia.org/r/1012616

Change 1012616 merged by jenkins-bot:

[integration/config@master] dockerfiles: Quibble php-fpm memory_limit to 256M

https://gerrit.wikimedia.org/r/1012616

Mentioned in SAL (#wikimedia-releng) [2024-03-19T10:57:30Z] <hashar> Updating Docker images for "dockerfiles: Quibble php-fpm memory_limit to 256M" | https://gerrit.wikimedia.org/r/1012616 | T356402

Change 1012636 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/config@master] jjb: update Quibble jobs to latest image

https://gerrit.wikimedia.org/r/1012636

Change 1012636 merged by jenkins-bot:

[integration/config@master] jjb: update Quibble jobs to latest image

https://gerrit.wikimedia.org/r/1012636

Change #1008855 merged by jenkins-bot:

[mediawiki/extensions/EntitySchema@master] Remove dispensable start-up request introduced as CI workaround

https://gerrit.wikimedia.org/r/1008855

I think we can check in a few days (or more) whether the test stayed stable or became flaky or something.