Page MenuHomePhabricator

MediaWiki Core: exceeded the timeout of 300 seconds for export.php
Closed, ResolvedPublic

Description

This happens nearly every export since a few weeks.

The timeout of 300 is defined 3 x in the script and I am unsure which variable I have to increase.

[Symfony\Component\Process\Exception\ProcessTimedOutException]
The process "php /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='*' --group='core' --quiet='1' --threshold='13' --target='/resources/raymond' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,
kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady'" exceeded the timeout of 300 seconds.

Details

Related Gerrit Patches:
translatewiki : masterReduce the export command timeout back to 5 minutes
mediawiki/extensions/Translate : masterFetch author and translation state info in a single database call
mediawiki/extensions/Translate : masterUse null coalescing operator in ThinMessage::getProperty
mediawiki/extensions/Translate : masterImplement maximum query size in MessageCollection
mediawiki/extensions/Translate : masterReuse title conditions built using keys when possible
mediawiki/extensions/Translate : masterExtend TitleValue to avoid validating the arguments
mediawiki/extensions/Translate : masterImprove logging during exports
translatewiki : masterIncrease the timeout for the export command to 15 minutes
mediawiki/extensions/Translate : masterOptimize MessageCollection
mediawiki/extensions/Translate : masterAdd language count and time spent in $ffs->write
mediawiki/extensions/Translate : masterAdd some more logs to identify performance issues with core export
mediawiki/extensions/Translate : masterAdd logging to export process to identify performance bottleneck
translatewiki : masterIncrease the timeout for the export command to 10 minutes

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 8 2019, 7:01 AM
abi_ added a subscriber: abi_.Aug 8 2019, 7:39 AM
Nikerabbit triaged this task as High priority.Aug 8 2019, 10:04 AM
Nikerabbit added a subscriber: Nikerabbit.

Let's investigate this with high priority. It's important to keep translation updates to MediaWiki core working.

Change 529752 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[translatewiki@master] Increase the timeout for the export command to 10 minutes

https://gerrit.wikimedia.org/r/529752

Change 529752 merged by jenkins-bot:
[translatewiki@master] Increase the timeout for the export command to 10 minutes

https://gerrit.wikimedia.org/r/529752

Change 529939 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Add logging to export process to identify performance bottleneck

https://gerrit.wikimedia.org/r/529939

The new timout of 10 minutes is not sufficient :-(

[Symfony\Component\Process\Exception\ProcessTimedOutException]
The process "php /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='*' --group='core' --quiet='1' --threshold='13' --target='/resources/raymond' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,
kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady'" exceeded the timeout of 600 seconds.

Maybe you could try running the listed command without --quiet manually: php /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='*' --group='core' --threshold='13' --target='/resources/raymond' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady' to see if it gives any useful output.

Abijeet is working on adding more logging to see where the time is going.

Yesterday and today the export worked well. I used above command without --quiet. No useful output.

My guess is that this is either disk IO issue and/or related to recalculating message group stats. Let's see when we have the additional debugging available.

Change 529939 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Add logging to export process to identify performance bottleneck

https://gerrit.wikimedia.org/r/529939

Change 531952 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Add some more logs to identify performance issues with core export

https://gerrit.wikimedia.org/r/531952

abi_ added a comment.Aug 23 2019, 5:36 PM

After deployment, ran the exports for the core group and here are the results,

# Core export - 3 Runs - Avg: 159 seconds

[2019-08-23 08:06:05] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-08-23 08:06:20] translation-exports.INFO: Finished calculating stats for group core. Time: 15.427733898163 ms {"private":false} []
[2019-08-23 08:08:35] translation-exports.INFO: Finished export process. Time: 150.77363085747 ms {"private":false} []
---
[2019-08-23 14:41:17] translation-exports.INFO: Starting exports... {"private":false} []
[2019-08-23 14:41:17] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-08-23 14:41:33] translation-exports.INFO: Finished calculating stats for group core. Time: 15.466252803802 ms {"private":false} []
[2019-08-23 14:44:16] translation-exports.INFO: Finished export process. Time: 178.7745411396 ms {"private":false} []
---
[2019-08-23 14:44:34] translation-exports.INFO: Starting exports... {"private":false} []
[2019-08-23 14:44:34] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-08-23 14:44:49] translation-exports.INFO: Finished calculating stats for group core. Time: 15.422734975815 ms {"private":false} []
[2019-08-23 14:47:02] translation-exports.INFO: Finished export process. Time: 148.66956615448 ms {"private":false} []

Calculating stats doesn't seem to take much time actually. I ran another export process from another screen when the core export was running, but that only affected the performance very slightly. I am fairly confident though that when repong.php forks multiple process to run the exports, there will be a slowdown.

I also noticed that the core has many more languages (308) that it has to export and this is the process that seems to take more time.

I have added some more logging in order to identify what happens during the actual export process when multiple exports are being run together.

Is the unit right? 150ms sounds very little, should it be seconds instead?

Change 531952 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Add some more logs to identify performance issues with core export

https://gerrit.wikimedia.org/r/531952

abi_ added a comment.Aug 30 2019, 4:49 AM

Is the unit right? 150ms sounds very little, should it be seconds instead?

It is indeed seconds. The unit needs to be fixed.

# Core export - 3 Runs - Avg: 159 seconds

Today was the first time since ~ 10 days that the export of core exceeds the limit of 600 seconds. Maybe the new log shows relevant information?

abi_ added a comment.Sep 9 2019, 6:40 AM

The latest logging functionality was deployed last Wednesday - 6th Sept, 2019. Here are the logs since then,

[2019-09-06 06:42:55] translation-exports.INFO: Starting exports for groups - core... {"private":false} []                        
[2019-09-06 06:42:55] translation-exports.INFO: Calculating stats for group core {"private":false} []                             
[2019-09-06 06:42:57] translation-exports.INFO: Finished calculating stats for group core. Time: 1.8658871650696 ms {"private":false} []                                                                                                                            
[2019-09-06 06:42:57] translation-exports.INFO: Exporting languages for group - core. {"private":false} []                        
[2019-09-06 06:49:17] translation-exports.INFO: Done exporting languages for group - core. Time taken - 379.61304593086 {"private":false} []     
------                                                                                                                   
[2019-09-06 06:49:17] translation-exports.INFO: Finished export process for groups - core. Time: 382.09534597397 ms {"private":false} []                                                                                                                            
[2019-09-06 06:49:22] translation-exports.INFO: Starting exports for groups - core... {"private":false} []                        
[2019-09-06 06:49:23] translation-exports.INFO: Exporting languages for group - core. {"private":false} []                        
[2019-09-06 06:49:26] translation-exports.INFO: Done exporting languages for group - core. Time taken - 3.6765029430389 {"private":false} []                                                                                                                        
[2019-09-06 06:49:26] translation-exports.INFO: Finished export process for groups - core. Time: 4.2184460163116 ms {"private":false} []

Note that the units are in seconds.

Couple of things,

  1. Calculating group stats is quite fast, and the exporting the languages is what takes time.
  2. The first export at 6:42 on the 6th Sept, 2019 actually took 380 seconds. Much higher than my manual run timings of 159 seconds. This could be due to,
    • Running multiple simultaneous exports slows down the speed of each individual one, possibly due to IO constraints.
    • Larger amount of translations performed.

I'll wait this week and keep monitoring.

Couple of things,

  1. Calculating group stats is quite fast, and the exporting the languages is what takes time.
  2. The first export at 6:42 on the 6th Sept, 2019 actually took 380 seconds. Much higher than my manual run timings of 159 seconds. This could be due to,
    • Running multiple simultaneous exports slows down the speed of each individual one, possibly due to IO constraints.
    • Larger amount of translations performed.

I'll wait this week and keep monitoring.

Thank you for the report. 2 notes:

  1. I can exclude that running multiple simultaneous exports are the reason. Every day I do export I run the export for core in a window as first project and waits until core is finished. 2nd MediaWiki skins, 3rd extensions.
  2. Maybe a larger amount of translations. But I would not bet on it...

I think we need additional debugging information to drill down where the time is spent.

abi_ added a comment.Sep 17 2019, 11:57 AM

Below are the exports log for the core group since the 10th of Sept, 2019.

NOTE: that timings are in seconds
# 10th Sept, Run #1
[2019-09-10 06:36:17] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-10 06:36:17] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-10 06:36:19] translation-exports.INFO: Finished calculating stats for group core. Time: 1.5243098735809 ms {"private":false} []
[2019-09-10 06:36:19] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-10 06:45:05] translation-exports.INFO: Done exporting languages for group - core. Time taken - 526.02917504311 {"private":false} []
[2019-09-10 06:45:05] translation-exports.INFO: Finished export process for groups - core. Time: 527.76807403564 ms {"private":false} []
-----
# 10th Sept, Run #2
[2019-09-10 06:45:05] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-10 06:45:05] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-10 06:45:10] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4.5413930416107 {"private":false} []
[2019-09-10 06:45:10] translation-exports.INFO: Finished export process for groups - core. Time: 4.6728317737579 ms {"private":false} []
-----
# 12th Sept, Run #1
[2019-09-12 06:22:13] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-12 06:22:13] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-12 06:22:15] translation-exports.INFO: Finished calculating stats for group core. Time: 1.6597821712494 ms {"private":false} []
[2019-09-12 06:22:15] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-12 06:29:27] translation-exports.INFO: Done exporting languages for group - core. Time taken - 432.49624300003 {"private":false} []
[2019-09-12 06:29:27] translation-exports.INFO: Finished export process for groups - core. Time: 434.64558386803 ms {"private":false} []
-----
# 12th Sept, Run #2
[2019-09-12 06:29:28] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-12 06:29:28] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-12 06:29:33] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4.8211569786072 {"private":false} []
[2019-09-12 06:29:33] translation-exports.INFO: Finished export process for groups - core. Time: 5.4264628887177 ms {"private":false} []
-----
# 13th Sept, Run #1
[2019-09-13 07:20:19] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-13 07:20:19] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-13 07:20:22] translation-exports.INFO: Finished calculating stats for group core. Time: 2.2407040596008 ms {"private":false} []
[2019-09-13 07:20:22] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-13 07:29:35] translation-exports.INFO: Done exporting languages for group - core. Time taken - 552.99668598175 {"private":false} []
[2019-09-13 07:29:35] translation-exports.INFO: Finished export process for groups - core. Time: 555.58241581917 ms {"private":false} []
-----
# 13th Sept, Run #2
[2019-09-13 07:29:36] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-13 07:29:37] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-13 07:29:41] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4.454293012619 {"private":false} []
[2019-09-13 07:29:41] translation-exports.INFO: Finished export process for groups - core. Time: 5.0152490139008 ms {"private":false} []
-----
# 16th Sept, Run #1
[2019-09-16 06:59:16] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-16 06:59:16] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-16 06:59:18] translation-exports.INFO: Finished calculating stats for group core. Time: 1.8320798873901 ms {"private":false} []
[2019-09-16 06:59:18] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-16 07:08:49] translation-exports.INFO: Done exporting languages for group - core. Time taken - 570.88018512726 {"private":false} []
[2019-09-16 07:08:49] translation-exports.INFO: Finished export process for groups - core. Time: 572.98943996429 ms {"private":false} []
-----
# 16th Sept, Run #2
[2019-09-16 07:08:50] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-16 07:08:50] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-16 07:08:54] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4.5552070140839 {"private":false} []
[2019-09-16 07:08:54] translation-exports.INFO: Finished export process for groups - core. Time: 4.68323802948 ms {"private":false} []
-----
# 17th Sept, Run #1
[2019-09-17 06:13:51] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-17 06:13:52] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-17 06:13:54] translation-exports.INFO: Finished calculating stats for group core. Time: 1.7621870040894 ms {"private":false} []
[2019-09-17 06:13:54] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-17 06:23:04] translation-exports.INFO: Done exporting languages for group - core. Time taken - 549.55101895332 {"private":false} []
[2019-09-17 06:23:04] translation-exports.INFO: Finished export process for groups - core. Time: 552.04158997536 ms {"private":false} []
-----
# 17th Sept, Run #2
[2019-09-17 06:23:04] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-17 06:23:05] translation-exports.INFO: Exporting languages for group - core. {"private":false} []
[2019-09-17 06:23:10] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4.9948379993439 {"private":false} []
[2019-09-17 06:23:10] translation-exports.INFO: Finished export process for groups - core. Time: 5.5005939006805 ms {"private":false} []
-----
abi_ added a comment.EditedSep 17 2019, 12:56 PM

The exports seem to be run twice daily though one after the other, here are the CLI logs from the server,

# 16th Sept, 2019

[16/Sep/2019 06:59:16 +0000] raymond/raymond (CommandlineExport) /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang=* --group=core --quiet=1 --threshold=13 --target=/resources/raymond --skip=test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady

[16/Sep/2019 07:08:50 +0000] raymond/raymond (CommandlineExport) /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang=qqq,en-ca,en-gb,es-formal,de-formal,de-at,de-ch,hu-formal,nl-informal,zh-hk --group=core --quiet=1 --target=/resources/raymond --skip=test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady
------------------------------------------------------------
# 17th Sept, 2019

[17/Sep/2019 06:13:51 +0000] raymond/raymond (CommandlineExport) /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang=* --group=core --quiet=1 --threshold=13 --target=/resources/raymond --skip=test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady

[17/Sep/2019 06:23:04 +0000] raymond/raymond (CommandlineExport) /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang=qqq,en-ca,en-gb,es-formal,de-formal,de-at,de-ch,hu-formal,nl-informal,zh-hk --group=core --quiet=1 --target=/resources/raymond --skip=test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady
abi_ added a comment.Sep 17 2019, 4:04 PM

Since the core does not have any $definitionFile, I'm fairly certain that, this is the part of the code in export.php is slow,

// Don't export ignored, unless it is the source language
// or message documentation
global $wgTranslateDocumentationLanguageCode;
if ( $lang !== $wgTranslateDocumentationLanguageCode
	&& $lang !== $sourceLanguage
) {
	$collection->filter( 'ignored' );
}

if ( $noFuzzy ) {
	$collection->filter( 'fuzzy' );
}

$ffs->write( $collection );

For the core it runs 308 times (once for each language) and performs disk IO.

Change 537479 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Add language count and time spent in $ffs->write

https://gerrit.wikimedia.org/r/537479

Change 537479 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Add language count and time spent in $ffs->write

https://gerrit.wikimedia.org/r/537479

I did this:

time php -dzend_extension=xdebug.so -d xdebug.profiler_enable=1 /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='*' --group='core' --quiet='1' --threshold='13' --target='/resources/nike' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady'
^C

real    12m15.294s
user    7m35.524s
sys     3m45.868s

As you can see, I killed it in the middle of execution. Even this produced a cachegrind file that is 5 GiB.

Some quick observations looking at this file:

  • 31 % of the time is spend in Title::makeTitleSafe (called from MessageCollection::fixKeys). This is not surprising to me and avoiding the use of Title methods has been an optimization in the past. I guess this might be slightly overstated if xdebug increasing function call overhead, which Title::makeTitle has a lot.
  • 10 % of memory is taken by explode calls (called from Wikimedia\Assert\Assert::parameterType. For some reason I cannot see callers of this function though.
  • 22 % of the time is taken by JsonFFS->write, which is plausible expect that 15.5 % of time is actually spend on Revision::getRevisionText, which is supposed to just return the text already present in the database result row!

So looking at how much time is spend externally, I see 5.5 % in mysqli::query, 1,5% in file_get_contents and 0.1% in file_exist and 0.0 % in file_put_contents. Of course the script is even more CPU bound when running with xdebug, but still the amount of time spend on IO seems very little, unless that doesn't actually show in the profile (the large sys vs. user above might mean this is the case).

I'll try to do this again with only a handful of languages (maybe 10) to see if I get a better profile, I think some things are missing here.

Trying again:

\time -v php -dzend_extension=xdebug.so -d xdebug.profiler_enable=1 /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='ace,ady-cyrl,aeb-arab,af,ais,aln,am,ami,an,ang,anp,ar,arc,arn,arq,ary,arz,as,ase,ast,atj,avk,awa,az,azb' --group='core' --quiet='1' --threshold='13' --target='/resources/nike' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady'
        Command being timed: "php -dzend_extension=xdebug.so -d xdebug.profiler_enable=1 /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang=ace,ady-cyrl,aeb-arab,af,ais,aln,am,ami,an,ang,anp,ar,arc,arn,arq,ary,arz,as,ase,ast,atj,avk,awa,az,azb --group=core --quiet=1 --threshold=13 --target=/resources/nike --skip=test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady"
        User time (seconds): 112.81
        System time (seconds): 55.00
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:58.91
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 114272
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 2
        Minor (reclaiming a frame) page faults: 26152
        Voluntary context switches: 1293
        Involuntary context switches: 555
        Swaps: 0
        File system inputs: 9144
        File system outputs: 2476760
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

For comparison, without debugging:

\time -v php /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='ace,ady-cyrl,aeb-arab,af,ais,aln,am,ami,an,ang,anp,ar,arc,arn,arq,ary,arz,as,ase,ast,atj,avk,awa,az,azb' --group='core' --quiet='1' --threshold='13' --target='/resources/nike' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady'
        Command being timed: "php /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang=ace,ady-cyrl,aeb-arab,af,ais,aln,am,ami,an,ang,anp,ar,arc,arn,arq,ary,arz,as,ase,ast,atj,avk,awa,az,azb --group=core --quiet=1 --threshold=13 --target=/resources/nike --skip=test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady"
        User time (seconds): 4.01
        System time (seconds): 0.28
        Percent of CPU this job got: 27%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:15.54
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 89364
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 2
        Minor (reclaiming a frame) page faults: 21992
        Voluntary context switches: 1260
        Involuntary context switches: 27
        Swaps: 0
        File system inputs: 9176
        File system outputs: 8344
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Visualization:

abi_ added a comment.Sep 24 2019, 9:44 AM

A bit of additional info that seems to backup what Niklas said,

# 23rd Sept, Run #1
[2019-09-23 06:14:33] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-23 06:14:33] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-23 06:14:35] translation-exports.INFO: Finished calculating stats for group core. Time: 1.5250749588013 secs. {"private":false} []
[2019-09-23 06:14:35] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-09-23 06:25:50] translation-exports.INFO: Time taken by FFS Write for group core - 43.727477073669 secs. {"private":false} []
[2019-09-23 06:25:50] translation-exports.INFO: Done exporting languages for group - core. Time taken - 675.24437499046 secs. {"private":false} []
[2019-09-23 06:25:50] translation-exports.INFO: Finished export process for groups - core. Time: 676.93374013901 secs. {"private":false} []
----------
# 23rd Sept, Run #2
[2019-09-23 15:17:39] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-23 15:17:39] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-23 15:17:44] translation-exports.INFO: Finished calculating stats for group core. Time: 4.9582350254059 secs. {"private":false} []
[2019-09-23 15:17:46] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []

<< Logs are not complete: Was run by Niklas for profiling maybe? >>

----------
# 24th Sept, Run #1
[2019-09-24 06:12:46] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-24 06:12:46] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-24 06:12:48] translation-exports.INFO: Finished calculating stats for group core. Time: 2.1529099941254 secs. {"private":false} []
[2019-09-24 06:12:48] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-09-24 06:22:47] translation-exports.INFO: Time taken by FFS Write for group core - 40.452238798141 secs. {"private":false} []
[2019-09-24 06:22:47] translation-exports.INFO: Done exporting languages for group - core. Time taken - 598.57009887695 secs. {"private":false} []
[2019-09-24 06:22:47] translation-exports.INFO: Finished export process for groups - core. Time: 600.89104986191 secs. {"private":false} []
----------
# 24th Sept, Run #2
[2019-09-24 06:25:27] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-24 06:25:27] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-24 06:25:28] translation-exports.INFO: Finished calculating stats for group core. Time: 1.3119900226593 secs. {"private":false} []
[2019-09-24 06:25:28] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-09-24 06:27:28] translation-exports.INFO: Time taken by FFS Write for group core - 11.814153194427 secs. {"private":false} []
[2019-09-24 06:27:28] translation-exports.INFO: Done exporting languages for group - core. Time taken - 119.86421084404 secs. {"private":false} []
[2019-09-24 06:27:28] translation-exports.INFO: Finished export process for groups - core. Time: 121.32757091522 secs. {"private":false} []
----------
# 24th Sept, Run #3
[2019-09-24 06:27:28] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-24 06:27:28] translation-exports.INFO: Exporting languages (10) for group - core. {"private":false} []
[2019-09-24 06:27:32] translation-exports.INFO: Time taken by FFS Write for group core - 0.52151417732239 secs. {"private":false} []
[2019-09-24 06:27:32] translation-exports.INFO: Done exporting languages for group - core. Time taken - 3.6715960502625 secs. {"private":false} []
[2019-09-24 06:27:32] translation-exports.INFO: Finished export process for groups - core. Time: 3.7837700843811 secs. {"private":false} []
----------
# 24th Sept, Run #4
[2019-09-24 07:10:18] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-24 07:10:18] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-24 07:10:24] translation-exports.INFO: Finished calculating stats for group core. Time: 6.0391070842743 secs. {"private":false} []
[2019-09-24 07:10:27] translation-exports.INFO: Exporting languages (25) for group - core. {"private":false} []
[2019-09-24 07:13:17] translation-exports.INFO: Time taken by FFS Write for group core - 38.490405321121 secs. {"private":false} []
[2019-09-24 07:13:17] translation-exports.INFO: Done exporting languages for group - core. Time taken - 169.79854893684 secs. {"private":false} []
[2019-09-24 07:13:17] translation-exports.INFO: Finished export process for groups - core. Time: 178.42178106308 secs. {"private":false} []
----------
# 24th Sept, Run #5
[2019-09-24 07:14:11] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-24 07:14:11] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-24 07:14:12] translation-exports.INFO: Finished calculating stats for group core. Time: 1.3317229747772 secs. {"private":false} []
[2019-09-24 07:14:12] translation-exports.INFO: Exporting languages (25) for group - core. {"private":false} []
[2019-09-24 07:14:26] translation-exports.INFO: Time taken by FFS Write for group core - 5.6575934886932 secs. {"private":false} []
[2019-09-24 07:14:26] translation-exports.INFO: Done exporting languages for group - core. Time taken - 13.772558927536 secs. {"private":false} []
[2019-09-24 07:14:26] translation-exports.INFO: Finished export process for groups - core. Time: 15.254414081573 secs. {"private":false} []
----------
# 24th Sept, Run #6
[2019-09-24 07:16:14] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-09-24 07:16:14] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-09-24 07:16:19] translation-exports.INFO: Finished calculating stats for group core. Time: 5.0755708217621 secs. {"private":false} []
[2019-09-24 07:16:21] translation-exports.INFO: Exporting languages (25) for group - core. {"private":false} []
[2019-09-24 07:19:13] translation-exports.INFO: Time taken by FFS Write for group core - 38.512563705444 secs. {"private":false} []
[2019-09-24 07:19:13] translation-exports.INFO: Done exporting languages for group - core. Time taken - 171.53884792328 secs. {"private":false} []
[2019-09-24 07:19:13] translation-exports.INFO: Finished export process for groups - core. Time: 179.00376701355 secs. {"private":false} []

Change 538892 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[mediawiki/extensions/Translate@master] Optimize MessageCollection

https://gerrit.wikimedia.org/r/538892

Change 538892 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Optimize MessageCollection

https://gerrit.wikimedia.org/r/538892

Raymond said this happened again. Do we have any additional insights from logs? It feels like all low hanging fruit has been eaten and we just need to bump the limit higher. Would still be interesting to know what causes the large variability in export times.

Change 550720 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Improve logging during exports

https://gerrit.wikimedia.org/r/550720

abi_ added a comment.Nov 13 2019, 5:14 PM

Following are some logs from the server since 11th,

[2019-11-11 07:23:24] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-11 07:23:24] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-11-11 07:23:26] translation-exports.INFO: Finished calculating stats for group core. Time: 1.6655039787292 secs. {"private":false} []
[2019-11-11 07:23:26] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-11-11 07:32:25] translation-exports.INFO: Time taken by FFS Write for group core - 37.914349079132 secs. {"private":false} []
[2019-11-11 07:32:25] translation-exports.INFO: Done exporting languages for group - core. Time taken - 539.45387101173 secs. {"private":false} []
[2019-11-11 07:32:25] translation-exports.INFO: Finished export process for groups - core. Time: 541.28293800354 secs. {"private":false} []

---

[2019-11-11 07:32:26] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-11 07:32:26] translation-exports.INFO: Exporting languages (10) for group - core. {"private":false} []
[2019-11-11 07:32:32] translation-exports.INFO: Time taken by FFS Write for group core - 1.1811378002167 secs. {"private":false} []
[2019-11-11 07:32:32] translation-exports.INFO: Done exporting languages for group - core. Time taken - 6.1518042087555 secs. {"private":false} []
[2019-11-11 07:32:32] translation-exports.INFO: Finished export process for groups - core. Time: 6.2725520133972 secs. {"private":false} []

---

[2019-11-12 07:27:27] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-12 07:27:28] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-11-12 07:27:30] translation-exports.INFO: Finished calculating stats for group core. Time: 2.0788280963898 secs. {"private":false} []
[2019-11-12 07:27:30] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []

---

[2019-11-12 07:35:09] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-12 07:35:09] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-11-12 07:35:11] translation-exports.INFO: Finished calculating stats for group core. Time: 1.5970890522003 secs. {"private":false} []
[2019-11-12 07:35:11] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-11-12 07:42:37] translation-exports.INFO: Time taken by FFS Write for group core - 28.356355428696 secs. {"private":false} []
[2019-11-12 07:42:37] translation-exports.INFO: Done exporting languages for group - core. Time taken - 446.36051607132 secs. {"private":false} []
[2019-11-12 07:42:37] translation-exports.INFO: Finished export process for groups - core. Time: 448.12100791931 secs. {"private":false} []

---

[2019-11-12 07:42:37] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-12 07:42:37] translation-exports.INFO: Exporting languages (10) for group - core. {"private":false} []
[2019-11-12 07:42:42] translation-exports.INFO: Time taken by FFS Write for group core - 0.54296946525574 secs. {"private":false} []
[2019-11-12 07:42:42] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4.3659307956696 secs. {"private":false} []
[2019-11-12 07:42:42] translation-exports.INFO: Finished export process for groups - core. Time: 4.51722407341 secs. {"private":false} []

---

[2019-11-13 07:23:05] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-13 07:23:05] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-11-13 07:23:07] translation-exports.INFO: Finished calculating stats for group core. Time: 1.8197929859161 secs. {"private":false} []
[2019-11-13 07:23:07] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
<<Ended abruptly: Timeout?>>
---

[2019-11-13 07:26:02] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-11-13 07:26:02] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-11-13 07:26:03] translation-exports.INFO: Finished calculating stats for group core. Time: 1.7429850101471 secs. {"private":false} []
[2019-11-13 07:26:03] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-11-13 07:38:37] translation-exports.INFO: Time taken by FFS Write for group core - 48.402160167694 secs. {"private":false} []
[2019-11-13 07:38:37] translation-exports.INFO: Done exporting languages for group - core. Time taken - 753.21978807449 secs. {"private":false} []
[2019-11-13 07:38:37] translation-exports.INFO: Finished export process for groups - core. Time: 755.10288095474 secs. {"private":false} []

----

I've submitted a patch and added some additional logging. Will also do some profiling to see what's happening.

Change 550796 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[translatewiki@master] Increase the timeout for the export command to 15 minutes

https://gerrit.wikimedia.org/r/550796

abi_ added a comment.Nov 14 2019, 7:57 AM

I've added a new patch to add some additional logging. Running this on my local, I noticed that,

  1. On testing on my local certain groups take a long time to perform ffsWrite,
  2. There are a few that take a long time to load / filter collections.

These additional logs what is taking time for the core group export.

In the mean time I've also submitted a patch to increase the timeout for exports to 15 minutes.

Change 550796 merged by jenkins-bot:
[translatewiki@master] Increase the timeout for the export command to 15 minutes

https://gerrit.wikimedia.org/r/550796

Change 550720 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Improve logging during exports

https://gerrit.wikimedia.org/r/550720

Even on the new translatewiki server I am getting timeouts after 900 seconds.

abi_ added a comment.EditedThu, Dec 26, 10:06 AM

Here are some of the logs that I saw,

Old server

[2019-12-17 07:38:39] translation-exports.INFO: Finished calculating stats for group core. Time: 1.4666860103607 secs. {"private":false} []
[2019-12-17 07:38:39] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-12-17 07:56:59] translation-exports.INFO: Done exporting languages for group - core. Time taken - 1099.6536538601 secs. {"private":false} []
[2019-12-17 07:56:59] translation-exports.INFO: Time taken by 'collection' for group core - 1031.1946969032 secs. {"private":false} []
[2019-12-17 07:56:59] translation-exports.INFO: Time taken by 'ffs' for group core - 68.456996202469 secs. {"private":false} []
[2019-12-17 07:56:59] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2019-12-17 07:56:59] translation-exports.INFO: Finished export process for groups - core. Time: 1101.3122811317 secs. {"private":false} []                 <------------
[2019-12-18 07:11:32] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-12-18 07:11:32] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-12-18 07:11:34] translation-exports.INFO: Finished calculating stats for group core. Time: 1.4848539829254 secs. {"private":false} []
[2019-12-18 07:11:34] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-12-18 07:27:34] translation-exports.INFO: Done exporting languages for group - core. Time taken - 960.18704390526 secs. {"private":false} []                
[2019-12-18 07:27:34] translation-exports.INFO: Time taken by 'collection' for group core - 917.7689101696 secs. {"private":false} []
[2019-12-18 07:27:34] translation-exports.INFO: Time taken by 'ffs' for group core - 42.416127920151 secs. {"private":false} []
[2019-12-18 07:27:34] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2019-12-18 07:27:34] translation-exports.INFO: Finished export process for groups - core. Time: 961.85516214371 secs. {"private":false} []                 <------------
[2019-12-19 06:57:06] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-12-19 06:57:07] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-12-19 06:57:08] translation-exports.INFO: Finished calculating stats for group core. Time: 1.4674031734467 secs. {"private":false} []
[2019-12-19 06:57:08] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-12-19 07:14:24] translation-exports.INFO: Done exporting languages for group - core. Time taken - 1035.5577049255 secs. {"private":false} []             
[2019-12-19 07:14:24] translation-exports.INFO: Time taken by 'collection' for group core - 990.12688708305 secs. {"private":false} []
[2019-12-19 07:14:24] translation-exports.INFO: Time taken by 'ffs' for group core - 45.428552150726 secs. {"private":false} []
[2019-12-19 07:14:24] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2019-12-19 07:14:24] translation-exports.INFO: Finished export process for groups - core. Time: 1037.2086029053 secs. {"private":false} []                 <------------

New server

[2019-12-22 15:20:35] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-12-22 15:20:35] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-12-22 15:20:35] translation-exports.INFO: Finished calculating stats for group core. Time: 0.025161981582642 secs. {"private":false} []
[2019-12-22 15:20:35] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-12-22 15:46:29] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-12-22 15:46:29] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-12-22 15:46:29] translation-exports.INFO: Finished calculating stats for group core. Time: 0.024173974990845 secs. {"private":false} []
[2019-12-22 15:46:29] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-12-22 16:42:03] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4887.4669239521 secs. {"private":false} []                  
[2019-12-22 16:42:03] translation-exports.INFO: Time taken by 'collection' for group core - 4872.488956213 secs. {"private":false} []
[2019-12-22 16:42:03] translation-exports.INFO: Time taken by 'ffs' for group core - 14.972135543823 secs. {"private":false} []
[2019-12-22 16:42:03] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2019-12-22 16:42:03] translation-exports.INFO: Finished export process for groups - core. Time: 4887.7183151245 secs. {"private":false} []                                <------------
[2019-12-22 17:06:50] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4820.9848978519 secs. {"private":false} []                  
[2019-12-22 17:06:50] translation-exports.INFO: Time taken by 'collection' for group core - 4809.6903316975 secs. {"private":false} []
[2019-12-22 17:06:50] translation-exports.INFO: Time taken by 'ffs' for group core - 11.288955211639 secs. {"private":false} []
[2019-12-22 17:06:50] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2019-12-22 17:06:50] translation-exports.INFO: Finished export process for groups - core. Time: 4821.2387390137 secs. {"private":false} []               <------------
[2019-12-24 07:48:48] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2019-12-24 07:48:49] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2019-12-24 07:48:49] translation-exports.INFO: Finished calculating stats for group core. Time: 0.026618957519531 secs. {"private":false} []                     
[2019-12-24 07:48:49] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2019-12-24 09:09:52] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4862.9278309345 secs. {"private":false} []
[2019-12-24 09:09:52] translation-exports.INFO: Time taken by 'collection' for group core - 4842.6469516754 secs. {"private":false} []
[2019-12-24 09:09:52] translation-exports.INFO: Time taken by 'ffs' for group core - 20.276430368423 secs. {"private":false} []
[2019-12-24 09:09:52] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2019-12-24 09:09:52] translation-exports.INFO: Finished export process for groups - core. Time: 4863.1723568439 secs. {"private":false} []                     <------------
abi_ added a comment.Mon, Jan 6, 4:20 PM

Did a couple of manual runs on the server,

time php /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='*' --group='core' --threshold='13' --target='/resources/abi' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady'

Run 1

Logs

[2020-01-06 13:07:36] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-06 13:07:36] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-06 13:07:36] translation-exports.INFO: Finished calculating stats for group core. Time: 0.021353006362915 secs. {"private":false} []
[2020-01-06 13:07:36] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-06 14:35:44] translation-exports.INFO: Done exporting languages for group - core. Time taken - 5288.1440048218 secs. {"private":false} []
[2020-01-06 14:35:44] translation-exports.INFO: Time taken by 'collection' for group core - 5245.4004302025 secs. {"private":false} []
[2020-01-06 14:35:44] translation-exports.INFO: Time taken by 'ffs' for group core - 42.738322734833 secs. {"private":false} []
[2020-01-06 14:35:44] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-06 14:35:44] translation-exports.INFO: Finished export process for groups - core. Time: 5288.3449759483 secs. {"private":false} []

Output

Exporting 304 languages for group core
real    88m8.758s
user    0m53.057s
sys     0m1.670s

Run 2

Logs

[2020-01-06 14:40:20] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-06 14:40:21] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-06 14:40:21] translation-exports.INFO: Finished calculating stats for group core. Time: 0.026159048080444 secs. {"private":false} []
[2020-01-06 14:40:21] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-06 16:07:51] translation-exports.INFO: Done exporting languages for group - core. Time taken - 5250.1348438263 secs. {"private":false} []
[2020-01-06 16:07:51] translation-exports.INFO: Time taken by 'collection' for group core - 5237.3980917931 secs. {"private":false} []
[2020-01-06 16:07:51] translation-exports.INFO: Time taken by 'ffs' for group core - 12.732119560242 secs. {"private":false} []
[2020-01-06 16:07:51] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-06 16:07:51] translation-exports.INFO: Finished export process for groups - core. Time: 5250.3526918888 secs. {"private":false} []

Output

Exporting 304 languages for group core
real    87m30.677s
user    0m52.493s
sys     0m1.403s

Change 562585 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Reuse title conditions built using keys when possible

https://gerrit.wikimedia.org/r/562585

abi_ added a comment.Tue, Jan 7, 8:06 PM

Change 562585 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Reuse title conditions built using keys when possible
https://gerrit.wikimedia.org/r/562585

Before

After

Its called 818 because 409 languages, loadTranslations is called twice per run,

  1. CommandlineExport::execute
  2. MessageCollection::getAuthors

Change 562695 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Use null coalescing operator in ThinMessage::getProperty

https://gerrit.wikimedia.org/r/562695

Change 562707 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Extend TitleValue to avoid validating the arguments

https://gerrit.wikimedia.org/r/562707

Change 562707 abandoned by Abijeet Patro:
Extend TitleValue to avoid validating the arguments

Reason:
Too little improvement, but way too ugly.

https://gerrit.wikimedia.org/r/562707

abi_ added a comment.Fri, Jan 10, 6:38 PM

I ran some profiling on the translatewiki.net server today,

time php -dzend_extension=xdebug.so -d xdebug.profiler_enable=1 -d xdebug.profiler_output_dir=/home/abi /srv/mediawiki/targets/production/extensions/Translate/scripts/export.php --lang='ace,ady-cyrl,aeb-arab,af,ais,aln,am,ami,an,ang,anp,ar,arc,arn,arq,ary,arz,as,ase,ast,atj,avk,awa,az,azb' --group='core' --quiet='1' --threshold='13' --target='/resources/abi' --skip='test,aeb,be-x-old,crh,dk,en,fiu-vro,gan,gom,hif,kbd,kk,kk-cn,iu,kk-kz,kk-tr,ko-kp,ku,ku-arab,no,ruq,simple,sr,tg,tp,tt,ug,zh,zh-classical,zh-cn,zh-sg,zh-min-nan,zh-mo,zh-my,zh-tw,zh-yue,bbc,ady

Here's the output from KCachegrind,

Callee Map

Call Stack & Function profile

Change 563771 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[mediawiki/extensions/Translate@master] Fetch author and state information in a single db call

https://gerrit.wikimedia.org/r/563771

It seems that the number of items in the query (>4k) makes the MariaDB query planner to do something which I don't really understand:

Normal query plan with 1 item:

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra
1SIMPLEbw_pageconstPRIMARY, name_titlename_title261const, const1
1SIMPLEbw_revisionconstPRIMARY, rev_id, page_timestamp, page_user_timestampPRIMARY8const, const1
1SIMPLEtemp_rev_commentconstPRIMARY, revcomment_revrevcomment_rev4const1Using index
1SIMPLEcomment_rev_commentconstPRIMARYPRIMARY8const1
1SIMPLEtemp_rev_userconstPRIMARY, revactor_rev, actor_timestamprevactor_rev4const1Using index
1SIMPLEactor_rev_userconstPRIMARYPRIMARY8const1

Query plan for MediaWiki core message group:

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra
1PRIMARY<subquery2>ALLdistinct_keyNULLNULLNULL4275
1PRIMARYbw_pageeq_refPRIMARY, name_titlename_title261const, tvc_0._col_11Using index condition
1PRIMARYbw_revisionrefPRIMARY, rev_id, page_timestamp, page_user_timestamppage_timestamp4translatewiki_net.bw_page.page_id1Using index condition
1PRIMARYtemp_rev_commentrefPRIMARY, revcomment_revPRIMARY4translatewiki_net.bw_page.page_latest1Using index
1PRIMARYcomment_rev_commenteq_refPRIMARYPRIMARY8translatewiki_net.temp_rev_comment.revcomment_comment_id1
1PRIMARYtemp_rev_userrefPRIMARY, revactor_rev, actor_timestampPRIMARY4translatewiki_net.bw_page.page_latest1Using index
1PRIMARYactor_rev_userALLPRIMARYNULLNULLNULL20889Using where; Using join buffer (flat, BNL join)
2MATERIALIZED<derived3>ALLNULLNULLNULLNULL4275
3DERIVEDNULLNULLNULLNULLNULLNULLNULLNo tables used

The query format is:

SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,temp_rev_user.revactor_actor AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len  FROM `bw_revision` JOIN `bw_revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `bw_comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `bw_revision_actor_temp` `temp_rev_user` ON ((temp_rev_user.revactor_rev = rev_id)) JOIN `bw_actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor)) JOIN `bw_page` ON ((page_id = rev_page))   WHERE (page_latest = rev_id) AND ((page_namespace = 8 AND page_title IN ('Sidebar/en', ....) ) );

Is there a way to force it to use the normal query plan?

Hmm the "normal" query plan was not very helpful, here is one with two items which looks more "normal":

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra
1SIMPLEbw_pagerangePRIMARY, name_titlename_title261NULL2Using index condition
1SIMPLEbw_revisioneq_refPRIMARY, rev_id, page_timestamp, page_user_timestampPRIMARY8translatewiki_net.bw_page.page_id, translatewiki_net.bw_page.page_latest1
1SIMPLEtemp_rev_commentrefPRIMARY, revcomment_revPRIMARY4translatewiki_net.bw_page.page_latest1Using index
1SIMPLEcomment_rev_commenteq_refPRIMARYPRIMARY8translatewiki_net.temp_rev_comment.revcomment_comment_id1
1SIMPLEtemp_rev_userrefPRIMARY, revactor_rev, actor_timestampPRIMARY4translatewiki_net.bw_page.page_latest1Using index
1SIMPLEactor_rev_usereq_refPRIMARYPRIMARY8translatewiki_net.temp_rev_user.revactor_actor1

So one more, with just a one(!) items less than full query:

1PRIMARY<subquery2>ALLdistinct_keyNULLNULLNULL4274
1PRIMARYbw_pageeq_refPRIMARY,name_titlename_title261const,tvc_0._col_11Using index condition
1PRIMARYbw_revisionrefPRIMARY,rev_id,page_timestamp,page_user_timestamppage_timestamp4translatewiki_net.bw_page.page_id1Using index condition
1PRIMARYtemp_rev_commentrefPRIMARY,revcomment_revPRIMARY4translatewiki_net.bw_page.page_latest1Using index
1PRIMARYcomment_rev_commenteq_refPRIMARYPRIMARY8translatewiki_net.temp_rev_comment.revcomment_comment_id1
1PRIMARYtemp_rev_userrefPRIMARY,revactor_rev,actor_timestampPRIMARY4translatewiki_net.bw_page.page_latest1Using index
1PRIMARYactor_rev_usereq_refPRIMARYPRIMARY8translatewiki_net.temp_rev_user.revactor_actor1
2MATERIALIZED<derived3>ALLNULLNULLNULLNULL4274
3DERIVEDNULLNULLNULLNULLNULLNULLNULLNo tables used

Here we can clearly see that the problem happens actor_rev_user doesn't choose PRIMARY as key, and it seems core message group is just at the magic limit of 4275 items where it happens (maybe the limit is not items but total length of query or something).

One possible solution is split the query into multiple parts, say at most 1k or 2k items at most.

Change 564055 had a related patch set uploaded (by Nikerabbit; owner: Nikerabbit):
[mediawiki/extensions/Translate@master] Implement maximum query size in MessageCollection

https://gerrit.wikimedia.org/r/564055

Change 562585 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Reuse title conditions built using keys when possible

https://gerrit.wikimedia.org/r/562585

Change 564055 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Implement maximum query size in MessageCollection

https://gerrit.wikimedia.org/r/564055

Change 562695 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Use null coalescing operator in ThinMessage::getProperty

https://gerrit.wikimedia.org/r/562695

The changes made for this have been deployed on translatewiki.net and we're seeing very promising improvements (340% improvement for the core group).

Core export time on 15th, Jan, 2019,

[2020-01-15 08:14:45] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-15 08:14:45] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-15 08:14:45] translation-exports.INFO: Finished calculating stats for group core. Time: 0.020501136779785 secs. {"private":false} []
[2020-01-15 08:14:45] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-15 09:37:17] translation-exports.INFO: Done exporting languages for group - core. Time taken - 4951.9618310928 secs. {"private":false} []
[2020-01-15 09:37:17] translation-exports.INFO: Time taken by 'collection' for group core - 4928.6116998196 secs. {"private":false} []
[2020-01-15 09:37:17] translation-exports.INFO: Time taken by 'ffs' for group core - 23.346065759659 secs. {"private":false} []
[2020-01-15 09:37:17] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-15 09:37:17] translation-exports.INFO: Finished export process for groups - core. Time: 4952.1584560871 secs. {"private":false} []

Core export time on 16th, Jan, 2019 after deployment,

[2020-01-16 08:07:06] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-16 08:07:06] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-16 08:07:06] translation-exports.INFO: Finished calculating stats for group core. Time: 0.022905111312866 secs. {"private":false} []
[2020-01-16 08:07:06] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-16 08:09:29] translation-exports.INFO: Done exporting languages for group - core. Time taken - 143.01634192467 secs. {"private":false} []
[2020-01-16 08:09:29] translation-exports.INFO: Time taken by 'collection' for group core - 128.89460325241 secs. {"private":false} []
[2020-01-16 08:09:29] translation-exports.INFO: Time taken by 'ffs' for group core - 14.110049009323 secs. {"private":false} []
[2020-01-16 08:09:29] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-16 08:09:29] translation-exports.INFO: Finished export process for groups - core. Time: 143.24112606049 secs. {"private":false} []

These improvements have been confirmed by Raymond.

Similarly non-mediawiki exports are taking 50% of the time they were taking previously (From taking around 33 minutes, down to 17 minutes now).

I'll monitor this for a couple of more days before resolving this task.

abi_ added a comment.Mon, Jan 20, 11:19 AM

Here are some further logs from core exports from the 17th and 20th of Jan, 2020. The time has been below 3 minutes.

[2020-01-17 07:02:22] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-17 07:02:22] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-17 07:02:22] translation-exports.INFO: Finished calculating stats for group core. Time: 0.024768114089966 secs. {"private":false} []
[2020-01-17 07:02:22] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-17 07:05:01] translation-exports.INFO: Done exporting languages for group - core. Time taken - 158.96392416954 secs. {"private":false} []
[2020-01-17 07:05:01] translation-exports.INFO: Time taken by 'collection' for group core - 128.82819843292 secs. {"private":false} []
[2020-01-17 07:05:01] translation-exports.INFO: Time taken by 'ffs' for group core - 30.132598400116 secs. {"private":false} []
[2020-01-17 07:05:01] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-17 07:05:01] translation-exports.INFO: Finished export process for groups - core. Time: 159.18489003181 secs. {"private":false} []
-----
[2020-01-20 06:51:52] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-20 06:51:52] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-20 06:51:52] translation-exports.INFO: Finished calculating stats for group core. Time: 0.014782190322876 secs. {"private":false} []
[2020-01-20 06:51:52] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-20 06:54:46] translation-exports.INFO: Done exporting languages for group - core. Time taken - 173.67638707161 secs. {"private":false} []
[2020-01-20 06:54:46] translation-exports.INFO: Time taken by 'collection' for group core - 142.27746868134 secs. {"private":false} []
[2020-01-20 06:54:46] translation-exports.INFO: Time taken by 'ffs' for group core - 31.395657062531 secs. {"private":false} []
[2020-01-20 06:54:46] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-20 06:54:46] translation-exports.INFO: Finished export process for groups - core. Time: 173.83417582512 secs. {"private":false} []

Non-mediawiki exports took 19 minutes on Jan 20, 2020.

Change 566006 had a related patch set uploaded (by Abijeet Patro; owner: Abijeet Patro):
[translatewiki@master] Reduce the export command timeout back to 5 minutes

https://gerrit.wikimedia.org/r/566006

abi_ closed this task as Resolved.Mon, Jan 20, 11:25 AM

I've submitted a patch to reduce the timeout back to the original 5 minutes. Resolving this task for now.

Change 566006 merged by jenkins-bot:
[translatewiki@master] Reduce the export command timeout back to 5 minutes

https://gerrit.wikimedia.org/r/566006

abi_ added a comment.Tue, Jan 21, 9:07 AM

Export stats from core on 21st Jan, 2020 - 144 seconds (~2.5 minutes)

[2020-01-21 06:52:45] translation-exports.INFO: Starting exports for groups - core... {"private":false} []
[2020-01-21 06:52:45] translation-exports.INFO: Calculating stats for group core {"private":false} []
[2020-01-21 06:52:45] translation-exports.INFO: Finished calculating stats for group core. Time: 0.019475936889648 secs. {"private":false} []
[2020-01-21 06:52:45] translation-exports.INFO: Exporting languages (304) for group - core. {"private":false} []
[2020-01-21 06:55:08] translation-exports.INFO: Done exporting languages for group - core. Time taken - 143.55314898491 secs. {"private":false} []
[2020-01-21 06:55:08] translation-exports.INFO: Time taken by 'collection' for group core - 123.1598918438 secs. {"private":false} []
[2020-01-21 06:55:08] translation-exports.INFO: Time taken by 'ffs' for group core - 20.390076875687 secs. {"private":false} []
[2020-01-21 06:55:08] translation-exports.INFO: Time taken by 'definitionFile' for group core - 0 secs. {"private":false} []
[2020-01-21 06:55:08] translation-exports.INFO: Finished export process for groups - core. Time: 143.77037096024 secs. {"private":false} []