[Task] Add Scribunto to extension-gate in CI
Open, HighPublic

Description

Add Scribunto to extension-gate in CI.

In T96264 Wikidata was added which has tests that are skipped without Scribunto.

We have tried to add it but the test runtime skyrocket causing gate delay (T126274). Will need to tweak the Scribunto config before being able to add it again.

JanZerebecki updated the task description. (Show Details)
JanZerebecki raised the priority of this task from to Needs Triage.
JanZerebecki added a subscriber: JanZerebecki.
Restricted Application added subscribers: StudiesWorld, Aklapper. · View Herald TranscriptJan 28 2016, 7:50 AM

Change 267458 had a related patch set uploaded (by Paladox):
[Scribunto] Add template extension-gate to Scribunto

https://gerrit.wikimedia.org/r/267458

Change 267458 merged by jenkins-bot:
[Scribunto] Add template extension-gate to Scribunto

https://gerrit.wikimedia.org/r/267458

JanZerebecki closed this task as Resolved.Feb 6 2016, 6:31 PM
Paladox reopened this task as Open.Feb 8 2016, 10:54 PM
Paladox added a subscriber: Paladox.

Re opening since this extension will need some performance improvements such as testing since it is causing the test to slow down a lot.

Paladox removed a subscriber: gerritbot.

The fact that Scribunto is missing now also causes https://gerrit.wikimedia.org/r/#/c/279367/ to fail, as ZeroPortal binds to classes held within it!

The fact that Scribunto is missing now also causes https://gerrit.wikimedia.org/r/#/c/279367/ to fail, as ZeroPortal binds to classes held within it!

Hi ive uploaded a patch at https://gerrit.wikimedia.org/r/#/c/279400/

The fact that Scribunto is missing now also causes https://gerrit.wikimedia.org/r/#/c/279367/ to fail, as ZeroPortal binds to classes held within it!

@Addshore please file a bug.

Paladox added a subscriber: Yurik.Mar 24 2016, 6:30 PM

The fact that Scribunto is missing now also causes https://gerrit.wikimedia.org/r/#/c/279367/ to fail, as ZeroPortal binds to classes held within it!

@Yurik per ^^

Paladox removed a subscriber: Yurik.Mar 24 2016, 6:30 PM
Lydia_Pintscher renamed this task from Add Scribunto to extension-gate in CI to [Task] Add Scribunto to extension-gate in CI.Apr 3 2016, 11:22 AM
Lydia_Pintscher triaged this task as Normal priority.
aude added a subscriber: aude.EditedMar 29 2017, 3:43 AM

now tests are failing:

03:32:45 53) LuaStandalone: LuaWikibaseEntityLibraryTests[39]: mw.wikibase.entity.getDescription integration
03:32:45 proc_open(): fork failed - Cannot allocate memory
03:32:45 
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/LuaStandaloneEngine.php:267
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/LuaStandaloneEngine.php:115
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaCommon/LuaCommon.php:95
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/LuaStandaloneEngine.php:9
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaCommon/LuaCommon.php:198
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaCommon/LuaCommon.php:884
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/tests/engines/LuaCommon/LuaDataProvider.php:21
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/tests/engines/LuaCommon/LuaEngineTestBase.php:237
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/tests/engines/LuaCommon/LuaEngineTestBase.php:249
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/tests/phpunit/MediaWikiTestCase.php:400
03:32:45 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/maintenance/doMaintenance.php:111

https://integration.wikimedia.org/ci/job/mwext-testextension-php55-composer-trusty/151/console

hashar added a subscriber: hashar.Mar 29 2017, 2:58 PM

So nothing specific has changed. T126670 has been closed as a cleanup task since we have $wgScribuntoDefaultEngine = 'luasandbox'; since Feb 27 2016.

For Wikibase tests, the job mwext-testextension-php55-composer-trusty fails. The end of debug log shows a spam of:

1016.5270 1799.0M  Scribunto_LuaStandaloneInterpreter::terminate: terminating

The HHVM equivalent has 610MBytes

We have the tests running on instances with 4GBytes of memory. An empty one has 3.3 GBytes free and I have confirmed that on Trusty we have:

/etc/php5/cli/php.ini
memory_limit = -1

I noticed the luastandalone process are invoked with lua_ulimit.sh which set the maximum virtual memory to 48828. Maybe we have reach that limit, but the tests shows they can't fork so I am a bit puzzled.

Also the jobs run ALL tests of ALL extensions added as dependencies, so a Wikibase patch also run:

LuaStandalone: Scribunto_LuaUriLibraryTests1 mn 24 s
LuaStandalone: Scribunto_LuaUstringLibraryTests43 s

In short most probably the suite uses too much memory and ends up reaching the 4GB limit. Or alternatively, maybe the ulimit kicks in and prevents the fork from completing.

Anomie added a subscriber: Anomie.Mar 29 2017, 3:19 PM

For Wikibase tests, the job mwext-testextension-php55-composer-trusty fails. The end of debug log shows a spam of:

1016.5270 1799.0M  Scribunto_LuaStandaloneInterpreter::terminate: terminating

I note that the CLI log for #149 (which passed) shows the same thing at the end.

It seems we can tell the difference in the CLI log between the tests that succeed and the tests that fail:

mw-debug-clu.log for 149
472.9353 1693.0M  Parser: using preprocessor: Preprocessor_DOM
[gitinfo] Computed cacheFile=/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/gitinfo.json for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
[gitinfo] Cache incomplete for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
472.9565 1693.0M  Parser: using preprocessor: Preprocessor_DOM
472.9585 1693.0M  Scribunto_LuaStandaloneInterpreter::__construct: creating interpreter: 'exec' '/bin/sh' '/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/lua_ulimit.sh' '30' '31' '48828' ''\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/binaries/lua5_1_5_linux_64_generic/lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/mw_main.lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto'\'' '\''1330'\'''
[gitinfo] Computed cacheFile=/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/gitinfo.json for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
[gitinfo] Cache incomplete for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
473.0784 1693.0M  Scribunto_LuaStandaloneInterpreter::terminate: terminating
473.0808 1693.0M  Parser: using preprocessor: Preprocessor_DOM
473.0828 1693.0M  Scribunto_LuaStandaloneInterpreter::__construct: creating interpreter: 'exec' '/bin/sh' '/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/lua_ulimit.sh' '30' '31' '48828' ''\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/binaries/lua5_1_5_linux_64_generic/lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/mw_main.lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto'\'' '\''1331'\'''
[gitinfo] Computed cacheFile=/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/gitinfo.json for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
[gitinfo] Cache incomplete for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
473.2470 1693.0M  Scribunto_LuaStandaloneInterpreter::terminate: terminating
473.2498 1693.0M  Parser: using preprocessor: Preprocessor_DOM
mw-debug-clu.log for 152
617.9261 1692.8M  Parser: using preprocessor: Preprocessor_DOM
[gitinfo] Computed cacheFile=/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/gitinfo.json for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
[gitinfo] Cache incomplete for /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src
617.9486 1692.8M  Parser: using preprocessor: Preprocessor_DOM
617.9512 1692.8M  Scribunto_LuaStandaloneInterpreter::__construct: creating interpreter: 'exec' '/bin/sh' '/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/lua_ulimit.sh' '30' '31' '48828' ''\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/binaries/lua5_1_5_linux_64_generic/lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/mw_main.lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto'\'' '\''1330'\'''
617.9587 1692.8M  Parser: using preprocessor: Preprocessor_DOM
617.9608 1692.8M  Scribunto_LuaStandaloneInterpreter::__construct: creating interpreter: 'exec' '/bin/sh' '/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/lua_ulimit.sh' '30' '31' '48828' ''\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/binaries/lua5_1_5_linux_64_generic/lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/mw_main.lua'\'' '\''/home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto'\'' '\''1331'\'''
617.9664 1692.8M  Parser: using preprocessor: Preprocessor_DOM

Note the second lacks the "[gitinfo]" and "terminating" lines, because the proc_open failed and terminated the test. Interestingly, the passing test #149 here was reported as using more memory than the failing #152 at this point.

The problem reported in T161698 blocks merging any of Wikibase patches, so this is becoming more pressing for Wikidata folk. Is there anything on the Wikibase side we could do to help fixing the issue?

hoo raised the priority of this task from Normal to High.Apr 3 2017, 2:12 PM
hoo added a subscriber: Lydia_Pintscher.

We should try and make scribuntu more performant i.e. performance wise. Or at least skip the tests that cause the tests to take so long.

hoo added a comment.Apr 3 2017, 2:27 PM

We should try and make scribuntu more performant i.e. performance wise. Or at least skip the tests that cause the tests to take so long.

Not the performance is the problem but the fact that the tests are failing is.

Anomie added a comment.Apr 3 2017, 3:00 PM

We should try and make scribuntu more performant i.e. performance wise. Or at least skip the tests that cause the tests to take so long.

As I said earlier, it seems this Scribunto failure is a symptom of a problem, not the cause. The high-level cause seems to be that the test suite as a whole is exceeding a memory limitation of some sort. What limitation that is, exactly, and whether there's a configuration setting that could be changed or a leak somewhere that could be fixed is a matter for further investigation.

Mentioned in SAL (#wikimedia-releng) [2017-04-03T20:39:03Z] <hashar> Nodepool: holding instance ci-trusty-wikimedia-597386 in an attempt debug Wikibase/Scribunto memory usage exploding T125050

hashar added a comment.Apr 3 2017, 9:01 PM

I manually rebuild mwext-testextension-php55-composer-trusty for Wikibase. Instructed nodepool to not delete the instance (ssh labnodepool1001.eqiad.wmnet nodepool hold <ID of instance) and connected to it while the test was running.

The vmstat at the end of the build. Memory metrics are in megabytes, lines one second apart:

procs ---------------memory-------------- ---swap-- -----io---- -system-- ------cpu-----
 r  b     swpd     free     buff    cache   si   so    bi    bo   in   cs us sy id wa st
 2  0        0      602       47      957    0    0     0   340 2777 6007 49  3 48  1  0
 1  0        0      591       47      957    0    0     0   288 2806 6929 48  4 48  1  0
 1  0        0      571       47      957    0    0    48    32 4147 9573 48  4 47  1  0
 2  0        0      527       47      957    0    0     0    36 6498 8514 50  5 45  0  0
 1  0        0      562       47      959    0    0     0   372 7061 11920 46 10 43  1  0
 1  0        0      562       47      959    0    0     0    20 1248 2536 48  3 49  0  0
 1  0        0      562       47      959    0    0     0     0  365  116 48  2 50  0  0
 1  0        0      562       47      959    0    0     0  7296  483  954 49  3 48  1  0
 1  0        0      562       47      959    0    0     0  9748  488  889 50  2 48  1  0
 1  0        0     2405       47      959    0    0     4   404 1136 1913 36  8 55  0  0
 0  0        0     2396       47      962    0    0   152  2792 3631 6457 35  5 59  1  0
 1  0        0     2389       47      963    0    0  1332    60 1429 2521 42  4 54  1  0
 1  0        0     2387       47      964    0    0    32   956 1643 2917 73  3 24  0  0
 1  0        0     2379       47      966    0    0    52  1468 2949 5027 42  6 52  1  0
 0  0        0     2379       47      966    0    0     4     0  834 1181 51  1 48  0  0
 0  0        0     2379       47      966    0    0     0     0  119  134  0  0 100  0  0
 0  0        0     2379       47      966    0    0     0     0  126  343  1  0 99  0  0
 0  0        0     2380       47      966    0    0     0     0  105  116  0  0 100  0  0
 0  0        0     2380       47      966    0    0     0   368  101  118  0  0 100  0  0
 0  0        0     2380       47      966    0    0     0    56  271  364  0  1 99  0  0

The instance had at worth 500MB of free memory and ~ 950MB of cache (eg file system cache which can be reclaimed by the OS).

A snapshot of top:

top - 20:46:30 up  1:03,  3 users,  load average: 2.03, 1.78, 0.90
Tasks: 142 total,   3 running, 139 sleeping,   0 stopped,   0 zombie
%Cpu(s): 39.8 us, 11.6 sy,  0.0 ni, 48.3 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
KiB Mem:   4048100 total,  3320404 used,   727696 free,    48352 buffers
KiB Swap:        0 total,        0 used,        0 free.   979012 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                      
 3910 jenkins   20   0 2123364 1.619g  14476 R  81.1 41.9   6:25.79 php5 -dzend.enable_gc=0 phpunit.php ...
 1482 mysql     20   0  624248 169096  10064 S   1.0  4.2   0:20.91 /usr/sbin/mysqld  ...
24324 jenkins   20   0    7868   1156    740 R   1.0  0.0   0:00.03 /home/jenkins/workspace/mwext-testextension-php55-composer-trusty/src/extensions/Scribunto/engines/LuaStandalone/binaries/l+ 
 2149 jenkins   20   0 2419876 212784  15152 S   0.7  5.3   0:25.05 java -jar slave.jar                                                                                                          
23810 jenkins   20   0  105776   2044   1044 S   0.7  0.1   0:03.00 sshd: jenkins@pts/1                                                                                                          
    7 root      20   0       0      0      0 S   0.3  0.0   0:03.43 [rcu_sched]

So I think that rules out the lack of memory on the instance. Unless the fork of Scribunto actually syphon up all memory somehow OR is limited somehow.

hashar added a comment.Apr 3 2017, 9:06 PM

Next thing. On Jenkins we have a monitoring system named JavaMelody which can be used to inspect an instance, specially the process list. Eg via a link like https://integration.wikimedia.org/ci/monitoring/nodes/ci-trusty-wikimedia-597386?part=processes which is really the same as ps.

Attached are the PDF output sorted by pid and resident memory (rss):

What this show is that at the point I took the capture, there were several Scribunto/engines/LuaStandalone/binaries/lua5_1_5_linux_64_generic/lua process running. I guess they are on hold somehow. Though the lua_ulimit.sh does not appear.

Last thing, I confirmed that the memory_usage() shown in the debug log is consistent with the value shown in top.

Krinkle added a subscriber: Krinkle.EditedApr 4 2017, 3:04 AM

https://gerrit.wikimedia.org/r/#/c/345628/1
https://integration.wikimedia.org/ci/job/mwext-testextension-php55-composer-trusty/259/console

153) LuaStandalone: LuaWikibaseLibraryTests[38]: ...
proc_open(): fork failed - Cannot allocate memory

T161698: LuaStandalone: LuaWikibaseLibraryTests - proc_open(): fork failed - Cannot allocate memory

Is failing 5x in a row in this and other commits. Has made merging in WikibaseQualityConstraints repo impossible.

hoo added a comment.Apr 4 2017, 8:03 AM

What this show is that at the point I took the capture, there were several Scribunto/engines/LuaStandalone/binaries/lua5_1_5_linux_64_generic/lua process running. I guess they are on hold somehow. Though the lua_ulimit.sh does not appear.

lua_ulimit.sh uses exec, so it's expected to appear in the process list just very briefly.

@hashar Could we try raising the limit and see whether that makes a difference?

hashar added a comment.Apr 4 2017, 8:49 AM

Gave it a try directly on the instance running solely the Scribunto tests after the memory limit got raised to 70M which ends up being ulimit -v 68359.

WORKSPACE=/home/jenkins/workspace/mwext-testextension-php55-composer-trusty \
  php tests/phpunit/phpunit.php extensions/Scribunto/

The debug log shows 131.0M and PHPUnit says Memory: 149.50MB. There is no spam of "Scribunto_LuaStandaloneInterpreter::terminate: terminating" at the end.

hashar added a comment.Apr 4 2017, 9:30 AM

Ran the whole suite again with the ulimit in place but without --log-junit which is memory heavy. The whole suite pass just fine and there are no fork() issues.

debug log reports 912.2M and PHPUnit reports 924.50MB

So my best guess is that:

  • our code invokes proc_open()
  • Zend invokes fork()
  • the memory is copied somehow which is denied by the system due to overflow
hoo added a comment.Apr 4 2017, 11:47 AM

@hashar If we can't allocate more resources, can we disable --log-junit for just this job for now?

hoo added subscribers: Legoktm, greg.Apr 4 2017, 11:48 AM

--log-junit is what fails the build so no.

What fork() is doing is that it tries to allocate the whole virtual memory from the caller. In a trace above the PHP process had 1.8G and the system had 562M free + 959 cached. .That is not enough reclaimable memory to fit the whole 1.8G memory, even if it is never going to be used.

The low level fixes would be:

  • add a swap large enough which would trick Linux in thinking there is enough memory. The CI instances have no swap enabled and I haven't found out how to enable in the utility we use to generate the images (diskimage-builder)
  • raise instances RAM from 4GB to 6GB to allow for MediaWiki fork() to fit in (lame)
hashar added a comment.Apr 4 2017, 4:18 PM

Note to self, diskimage-builder wipe out /etc/fstab:

elements/debootstrap/install.d/15-cleanup-debootstrap
cat << EOF | tee /etc/fstab > /dev/null
proc /proc proc nodev,noexec,nosuid 0 0
LABEL=${DIB_ROOT_LABEL} / ${FS_TYPE} errors=remount-ro 0 1
EOF

Eg there is no swap defined at all, eg on Trusty:

/etc/fstab
LABEL=cloudimg-rootfs	/	 ext4	defaults	0 0

On Jessie:

/etc/fstab
proc /proc proc nodev,noexec,nosuid 0 0
LABEL=cloudimg-rootfs / ext4 errors=remount-ro 0 1

I have no idea what to mount as swap. A /dev a label or whatever else. So we can probably mkswap to a flat file.

Addshore removed a subscriber: Addshore.Apr 4 2017, 6:46 PM

Change 346634 had a related patch set uploaded (by Hashar):
[integration/config@master] dib: allow Linux memory overcommit on Trusty

https://gerrit.wikimedia.org/r/346634

Change 346634 merged by jenkins-bot:
[integration/config@master] dib: allow Linux memory overcommit on Trusty

https://gerrit.wikimedia.org/r/346634

hashar added a comment.EditedApr 6 2017, 12:42 PM

The Trusty instances that run the php5 jobs now allows Linux to over commit the memory. So a fork() should work now. I triggered a job and it managed to pass just fine. Guess you can check php5 or CR+2 again patches that were falling.

hashar added a comment.Apr 6 2017, 2:10 PM

Seems the Wikibase related jobs that use Scribunto are passing just fine. That does not close that task though which is about adding Scribunto to extension-gate.