
Scribunto_LuaInterpreterNotFoundError in production
Closed, ResolvedPublic

Description

We get one of these every 10 minutes:

2015-02-08 07:05:21 mw1177 enwiki: [6454a01d] /w/wiki.phtml?title=College_of_Arms&action=edit   Scribunto_LuaInterpreterNotFoundError from line 233 of /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/engines/LuaSandbox/Engine.php: The luasandbox extension is not present, this engine cannot be used.
#0 /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/engines/LuaSandbox/Engine.php(207): Scribunto_LuaSandboxInterpreter->__construct(Object(Scribunto_LuaSandboxEngine), Array)
#1 /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(95): Scribunto_LuaSandboxEngine->newInterpreter()
#2 /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(197): Scribunto_LuaEngine->load()
#3 /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(847): Scribunto_LuaEngine->getInterpreter()
#4 /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/engines/LuaCommon/LuaCommon.php(864): Scribunto_LuaModule->getInitChunk()
#5 /srv/mediawiki/php-1.25wmf15/extensions/Scribunto/common/Hooks.php(115): Scribunto_LuaModule->invoke('main', Object(PPTemplateFrame_DOM))
#6 [internal function]: ScribuntoHooks::invokeHook(Object(Parser), Object(PPTemplateFrame_DOM), Array)
#7 /srv/mediawiki/php-1.25wmf15/includes/parser/Parser.php(3768): call_user_func_array('ScribuntoHooks:...', Array)
#8 /srv/mediawiki/php-1.25wmf15/includes/parser/Parser.php(3502): Parser->callParserFunction(Object(PPTemplateFrame_DOM), '#invoke', Array)
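
To see which request paths trigger this exception and how often, the log can be tallied with a small script. This is a minimal sketch in Python; the line layout is inferred only from the single entry quoted above, and the names `LOG_LINE`, `parse_exception_line`, and `count_by_path` are hypothetical helpers, not part of any existing tooling.

```python
import re
from collections import Counter

# Assumed layout, inferred from the one log entry quoted above:
# "<date> <time> <host> <wiki>: [<reqid>] <url>   <Exception> from line <n> of <file>: <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<wiki>\S+): \[(?P<reqid>[0-9a-f]+)\] "
    r"(?P<url>\S+)\s+(?P<exception>\w+) from line (?P<line>\d+) "
    r"of (?P<file>[^:]+): (?P<message>.*)$"
)


def parse_exception_line(line):
    """Return the fields of an exception header line, or None for other lines
    (for example the '#0 ...' stack frames)."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def count_by_path(lines, exception_name):
    """Count occurrences of one exception class, grouped by URL path
    (the query string is dropped)."""
    counts = Counter()
    for line in lines:
        record = parse_exception_line(line)
        if record and record["exception"] == exception_name:
            counts[record["url"].split("?", 1)[0]] += 1
    return counts
```

Grouping by path rather than by full URL makes an odd entry point such as `/w/wiki.phtml` stand out even when the page titles differ.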

get_loaded_extensions() shows that all three of luasandbox, pcre_zend_compat, and standard_zend_compat are loaded.

pcre_zend_compat and standard_zend_compat look unfamiliar to me (I think that EZC had a different name before -- ext_zend_compat). I think they were recently introduced as part of the new HHVM package which includes Tim's PCRE cache work. The deployment of the new package was accompanied by configuration changes to specify the cache type. I suspect that this bug crept in with one of those changes.

As an aside: in the future, when we change HHVM configuration settings, let us require a minimum quorum of one operations engineer and one MediaWiki core engineer.

Event Timeline

ori raised the priority of this task to High.
ori updated the task description.
ori added a project: Scribunto.
ori added subscribers: ori, tstarling, Joe.

note that the url is invalid anyway:

we do accept

/w/index.php?

or

/wiki.phtml?

but not

/w/wiki.phtml

This should be investigated anyways.

The change in config was suggested by Tim to go with the new package, although not on Phabricator or on the patch itself.

In T88942#1023792, @Joe wrote:

note that the url is invalid anyway:

we do accept

/w/index.php?

or

/wiki.phtml?

but not

/w/wiki.phtml

This should be investigated anyways.

This ends up being the key to the bug. It happens only on invalid URLs that fail to match the rewriting rules that route requests to the fcgi backend. Those requests are handled by PHP rather than HHVM, due to some looseness in the Apache configs that we haven't yet isolated. This can be confirmed by comparing the X-Powered-By header values on https://en.wikipedia.org/wiki/College_of_Arms?action=edit (header: X-Powered-By: HHVM/3.3.1) and https://en.wikipedia.org/w/wiki.phtml?title=College_of_Arms&action=edit (header: X-Powered-By: HHVM/3.3.0-static).

The header X-Powered-By: HHVM/3.3.0-static is misleading; HHVM does not appear to be involved at all. It is set by:

modules/mediawiki/files/apache/configs/hhvm_mark_engine.conf:3:    Header always setifempty X-Powered-By "HHVM/3.3.0-static"
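
Because the Apache directive uses `setifempty`, the static value appears only when the backend did not set X-Powered-By itself. A monitoring check could use that to flag responses that bypassed HHVM. The following is a rough sketch; `backend_set_powered_by` is a hypothetical helper, and it assumes the real HHVM never reports exactly the string hard-coded in the Apache config.

```python
# Value hard-coded in hhvm_mark_engine.conf, as quoted above.
APACHE_FALLBACK_VALUE = "HHVM/3.3.0-static"


def backend_set_powered_by(headers):
    """Return True if X-Powered-By appears to come from the backend itself.

    Returns False when the header is missing or equals the Apache fallback
    value. Assumption: a genuine HHVM response never carries the fallback
    string.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get("x-powered-by")
    return value is not None and value != APACHE_FALLBACK_VALUE
```

For the two URLs compared above, the first response (`HHVM/3.3.1`) would be classified as served by the backend and the second (`HHVM/3.3.0-static`) would not.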

@Joe discovered that it is caused by this rule:

# Early phase 2 compatibility URLs
RewriteRule ^/wiki\.phtml$ %{ENV:RW_PROTO}://%{SERVER_NAME}/w/index.php [R=301,L]

This rule shouldn't match.

He'll fix it during his regular hours.
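
The pattern in the rule quoted above is anchored at both ends, so it can be checked in isolation against the two paths in question. This is a small sketch using Python's `re` module, which agrees with Apache's PCRE for a pattern this simple; `rule_matches` is a hypothetical helper, and it assumes the pattern is applied to the URL path without the query string.

```python
import re

# Pattern copied from the RewriteRule quoted above.
LEGACY_RULE = re.compile(r"^/wiki\.phtml$")


def rule_matches(path):
    """Return True if the legacy compatibility rule matches the URL path."""
    return LEGACY_RULE.search(path) is not None
```

With this helper, `rule_matches("/wiki.phtml")` is true and `rule_matches("/w/wiki.phtml")` is false, so the rule leaves requests for `/w/wiki.phtml` untouched.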

ori set Security to None.

In fact, on closer inspection, we saw that the culprit is that

/w/wiki.phtml

is a file on the filesystem. This means that it is not caught by our rewrite rules or FilesMatch directives, and gets interpreted by mod_php (so the Zend engine) rather than by HHVM, at least on the servers that still use mpm_prefork (the vast majority).

I have a hotfix that I'll apply this morning.

https://gerrit.wikimedia.org/r/189440 should solve this, given that no other .phtml file is present in our repository and that FilesMatch catchalls are processed AFTER RewriteRules.

I merged the patch, and in early testing I found that HHVM takes ~30 seconds to respond to any request to that page. The response is also very different from the one you get from Zend; see https://phabricator.wikimedia.org/P276

So, I uploaded a new patch that just rewrites /w/wiki.phtml to /w/index.php:

https://gerrit.wikimedia.org/r/#/c/189697/