
Running importImages.php for a long while results in out-of-memory errors
Open, Needs Triage · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Configure the main cache and job queue to use Redis (a sample configuration sketch follows this list).
  • Run importImages.php on a path with many files (>=1000).
  • In another shell, check the memory usage of that PHP process.
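
For the first step, a minimal LocalSettings.php sketch. The host/port, the `'redis'` cache key name, and the TTL value are illustrative assumptions, not required settings:

```php
<?php
// Register a Redis-backed object cache and make it the main cache.
$wgObjectCaches['redis'] = [
	'class' => 'RedisBagOStuff',
	'servers' => [ '127.0.0.1:6379' ], // assumed local Redis instance
];
$wgMainCacheType = 'redis';

// Back the job queue with Redis as well.
$wgJobTypeConf['default'] = [
	'class' => 'JobQueueRedis',
	'redisServer' => '127.0.0.1:6379',
	'redisConfig' => [],
	'claimTTL' => 3600,
	'daemonized' => true,
];
```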

What happens?:

Memory usage keeps growing, and the process eventually runs out of memory.
After memory profiling, I noticed that the main culprit is BufferingStatsdDataFactory->buffer, which keeps growing and is never cleared because $wgStatsdServer was not set.

What should have happened instead?:

Memory usage stays at a reasonable level.

Software version (skip for WMF-hosted wikis like Wikipedia):
MediaWiki 1.39.1

Other information (browser name/version, screenshots, etc.):

Similar to T181385

Event Timeline

Change 887960 had a related patch set uploaded (by Func; author: Func):

[mediawiki/core@master] ServiceWiring: Use NullStatsdDataFactory if StatsdServer is not configured

https://gerrit.wikimedia.org/r/887960

Change 887960 merged by jenkins-bot:

[mediawiki/core@master] Clear the statsd data buffer regardless of StatsdServer config

https://gerrit.wikimedia.org/r/887960
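
For reference, a minimal sketch (not the actual patch) of the behaviour the merged change describes: buffered statsd data is sent only when a server is configured, but the buffer is cleared either way. hasData(), getData(), and clearData() come from MediaWiki's IBufferingStatsdDataFactory; sendToStatsd() is a hypothetical stand-in for the real network emit.

```php
<?php

use MediaWiki\MediaWikiServices;

function sendToStatsd( array $data, string $server ): void {
	// Hypothetical stand-in: in core this goes through a statsd client over UDP.
}

function emitBufferedStats( ?string $statsdServer ): void {
	$stats = MediaWikiServices::getInstance()->getStatsdDataFactory();
	if ( !$stats->hasData() ) {
		return;
	}
	if ( $statsdServer !== null ) {
		// Only attempt a send when $wgStatsdServer points somewhere.
		sendToStatsd( $stats->getData(), $statsdServer );
	}
	// The key point of the fix: clear the buffer unconditionally, so a
	// long-running maintenance script cannot accumulate data forever.
	$stats->clearData();
}
```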

Change 888328 had a related patch set uploaded (by Func; author: Func):

[mediawiki/core@REL1_39] Clear the statsd data buffer regardless of StatsdServer config

https://gerrit.wikimedia.org/r/888328

Change 888328 merged by jenkins-bot:

[mediawiki/core@REL1_39] Clear the statsd data buffer regardless of StatsdServer config

https://gerrit.wikimedia.org/r/888328

I finished the import with the patch applied, but noticed that around 1.15 GB of memory was consumed by ParserObserver->previousParseStackTraces after roughly 151,000 images were imported.

Maybe we should disable this feature in command-line mode or during maintenance script runs, or put a sensible limit on the number of retained records by using MapCacheLRU (see the sketch below).
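
A hedged sketch of the MapCacheLRU idea: the class and field names here only mirror ParserObserver for illustration, and the cap of 1000 entries is an arbitrary assumption. MapCacheLRU is MediaWiki's bounded map utility (global namespace in 1.39).

```php
<?php

class BoundedTraceStore {
	private MapCacheLRU $previousParseStackTraces;

	public function __construct( int $maxEntries = 1000 ) {
		// MapCacheLRU evicts the least-recently-used entry once full,
		// so memory stays bounded no matter how many pages are parsed.
		$this->previousParseStackTraces = new MapCacheLRU( $maxEntries );
	}

	/**
	 * Remember the stack trace for this parse and return the previous one
	 * for the same key, if still cached (used to warn on duplicate parses).
	 */
	public function record( string $key, string $trace ): ?string {
		$previous = $this->previousParseStackTraces->get( $key );
		$this->previousParseStackTraces->set( $key, $trace );
		return $previous;
	}
}
```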