Page MenuHomePhabricator

mwscript importDump.php fails to properly import some valid xml exports
Closed, ResolvedPublic

Description

several importDump commands fail.

steps to reproduce

  1. setup vagrant as mentioned here https://www.mediawiki.org/wiki/MediaWiki-Vagrant#Quick_start
  2. vagrant enable-role gwtoolset
  3. vagrant provision

expected results

==> default: Notice: /Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_de]/Exec[import_dump_template_de]/returns: executed successfully

actual results

==> default: Error: mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/De.xml returned  instead of one of [0]
==> default: Error: /Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_de]/Exec[import_dump_template_de]/returns: change from notrun to 0 failed: mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/De.xml returned  instead of one of [0]

Details

Reference
bz71354

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 3:43 AM
bzimport set Reference to bz71354.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to dan from comment #0)

actual results

> default: Error: mwscript importDump.php --wiki=wiki

/vagrant/puppet/modules/role/files/gwtoolset/template/De.xml returned
instead of one of [0]

> default: Error:

/Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_de]/
Exec[import_dump_template_de]/returns: change from notrun to 0 failed:
mwscript importDump.php --wiki=wiki
/vagrant/puppet/modules/role/files/gwtoolset/template/De.xml returned
instead of one of [0]

Can you run the import command directly in the VM to get a better idea of why the maintenance script is failing? vagrant ssh; mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/De.xml

Alternately, run vagrant provision again but with verbose logging: PUPPET_DEBUG=1 vagrant provision

Created attachment 16604
selected output using the recommended debug provision methods

see attached debug output

Attached:

(In reply to dan from comment #2)

Created attachment 16604 [details]
selected output using the recommended debug provision methods

see attached debug output

"Exception from line 220 of /vagrant/mediawiki/extensions/Scribunto/engines/LuaSandbox/Engine.php: The luasandbox extension is not present, this engine cannot be used."

Template:Autotranslate from that dump file is using lua and you apparently don't have the scribunto role enabled in your VM. So guess this is a bug in the gwtoolset role. It should either include role::scrubunto or it should alter the templates it is importing to avoid the use of lua.

Attached:

the role is requiring scribunto - http://git.wikimedia.org/blob/mediawiki%2Fvagrant.git/dd9408b6c1814e66787e007b256738b33dc75555/puppet%2Fmodules%2Frole%2Fmanifests%2Fgwtoolset.pp

i have also tried re-working the role by separating out the mediawiki::import_dump commands. running vagrant provision and then running a role that does the mediawiki::import_dump. in that case there is no error about scribunto, but the importDump.php error persists.

additional investigation

hopefully this helps narrow down the issue ...

  1. start from scratch => $rm -rf vagrant
  2. setup vagrant as mentioned here https://www.mediawiki.org/wiki/MediaWiki-Vagrant#Quick_start
  3. vagrant up
  4. vagrant reload

    no errors, all is well
  1. removed the mediawiki::import_dump statements from the gwtoolset role
  2. vagrant enable-role gwtoolset
  3. vagrant provision
  4. verified that scribunto and lua are present via Special:Version a. LuaSandbox 2.0-7 b. Lua 5.1.5 c. Scribunto – (ffe6757) 21:07, 25 September 2014

    no errors, all is well
  1. vagrant ssh
  2. cd /vagrant/mediawiki/maintenance
  3. php importDump.php --wiki=wiki --debug=true /vagrant/puppet/modules/role/files/gwtoolset/template/De.xml Done! You might want to run rebuildrecentchanges.php to regenerate RecentChanges
  4. http://localhost:8080/wiki/Template:De

    Lua error in Module:Fallback at line 73: attempt to index a nil value. Backtrace: Module:Fallback:73: in function "fallbackpage" Module:Fallback:99: in function "chunk" mw.lua:497: ?
  1. cd
  2. mwscript importDump.php --wiki=wiki --debug=true /vagrant/puppet/modules/role/files/gwtoolset/template/Fr.xml Done! You might want to run rebuildrecentchanges.php to regenerate RecentChanges
  3. http://localhost:8080/wiki/Template:De

    Lua error in Module:Fallback at line 73: attempt to index a nil value. Backtrace: Module:Fallback:73: in function "fallbackpage" Module:Fallback:99: in function "chunk" mw.lua:497: ?
  1. exit
  2. added back one mediawiki::import_dump statement, template_nl, to the gwtoolset role
  3. PUPPET_DEBUG=1 vagrant provision

    ==> default: Debug: Exec[import_dump_template_nl](provider=posix): Executing check 'mwscript pageExists.php --wiki=wiki Testwiki:wiki/Template:Nl' ==> default: Debug: Executing 'mwscript pageExists.php --wiki=wiki Testwiki:wiki/Template:Nl' ==> default: Debug: /Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_nl]/Exec[import_dump_template_nl]/unless: Testwiki:wiki/Template:Nl doesn't exist. ==> default: Debug: Exec[import_dump_template_nl](provider=posix): Executing 'mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml' ==> default: Debug: Executing 'mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml' ==> default: Error: mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml returned instead of one of [0] ==> default: Error: /Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_nl]/Exec[import_dump_template_nl]/returns: change from notrun to 0 failed: mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml returned instead of one of [0]
  1. http://localhost:8080/wiki/Template:Nl page does not exist

(In reply to dan from comment #5)

  1. added back one mediawiki::import_dump statement, template_nl, to the

gwtoolset role

  1. PUPPET_DEBUG=1 vagrant provision

    ==> default: Debug: Exec[import_dump_template_nl](provider=posix):

Executing check 'mwscript pageExists.php --wiki=wiki
Testwiki:wiki/Template:Nl'

==> default: Debug: Executing 'mwscript pageExists.php --wiki=wiki

Testwiki:wiki/Template:Nl'

==> default: Debug:

/Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_nl]/
Exec[import_dump_template_nl]/unless: Testwiki:wiki/Template:Nl doesn't
exist.

==> default: Debug: Exec[import_dump_template_nl](provider=posix):

Executing 'mwscript importDump.php --wiki=wiki
/vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml'

==> default: Debug: Executing 'mwscript importDump.php --wiki=wiki

/vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml'

==> default: Error: mwscript importDump.php --wiki=wiki

/vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml returned
instead of one of [0]

==> default: Error:

/Stage[main]/Role::Gwtoolset/Mediawiki::Import_dump[template_nl]/
Exec[import_dump_template_nl]/returns: change from notrun to 0 failed:
mwscript importDump.php --wiki=wiki
/vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml returned
instead of one of [0]

It's strange that puppet isn't logging the output of the failed import command when you enable the debug flag. What output do you get if you ssh into the vm and manually run:

mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/Nl.xml

Created attachment 16612
another debug session with complete debug output and steps taken

Attached:

Created attachment 16737
latest vagrant provision output

if i create a new puppet method, e.g. php::import_dump, and use the maintenance script directly, php maintenance/importDump.php, the import works. i haven’t had a chance to take a closer look at mwscript, but it looks like something may be wrong with how it runs importDump.php

Attached:

From the latest log file:

Error: mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/gwtoolset/template/En.xml returned instead of one of [0]

That lack of a return code indicates a crash in the import script. Is there anything in common between the various dumps that fail to load that differentiates them from the dumps that do load? Maybe some dependency on an installed extension or puppet module that is missing at the time you provision the VM?

yes, i agree.

i did try to find a common denominator, by trying to narrow down a common template in the included De.xml and Nl.xml dumps and removing it, but wasn’t able to narrow down a situation where removing a combination of the same templates allowed for the importDump.php script to succeed properly.

i also tried troubleshooting the importDump.php script directly, but it calls many methods, which don’t provide any insight if there’s an issue.

i’ll continue to troubleshoot when i get time. thanks for your feedback and thinking with me!

I've managed to track down the crash, it's coming from the first clock_gettime in timer.c of luasandbox being called (I can't tell which clock_gettime call, though, strace doesn't give me enough info).

The crash only happens when the import command is run inside puppet. The exact same mwscript command running from the vagrant vm's command line works fine.

It's a segfault coming from writing to the wrong pointer. I.e. the pointer passed as the second argument to clock_gettime probably isn't malloced properly. A typical day in C, I guess :)

Repro steps would be to import a template that needs the luasandbox via "mwscript importDump.php" inside puppet.

The strace of a crash: https://phabricator.wikimedia.org/P14

I guess the one value that helps narrow down which clock_gettime is the culprit is "2" for the first parameter.

(In reply to Gilles Dubuc from comment #12)

Repro steps would be to import a template that needs the luasandbox via
"mwscript importDump.php" inside puppet.

It would be really nice if you could figure out repro steps that don't require a complex puppet run.

(In reply to Gilles Dubuc from comment #11)

It's a segfault coming from writing to the wrong pointer. I.e. the pointer
passed as the second argument to clock_gettime probably isn't malloced
properly. A typical day in C, I guess :)

That doesn't seem supported by the strace you posted. If it segfaulted in the middle of clock_gettime, it wouldn't be indicating that clock_gettime returned successfully.

It's not impossible that such an strace could be output by a bad pointer causing clock_gettime to clobber the stack or heap, but it seems unlikely here considering that every call is taking the address of an already-existing struct timespec.

(In reply to Gilles Dubuc from comment #13)

I guess the one value that helps narrow down which clock_gettime is the
culprit is "2" for the first parameter.

That "2" isn't the first parameter, it's the tv_sec member of the struct timespec that was written by the clock_gettime call.

Change 166585 had a related patch set uploaded by Gilles:
[DONTMERGE] Role to reproduce 71354

https://gerrit.wikimedia.org/r/166585

These are the shortest repro steps I was able to put together.

Apply the patch I've just added above.

  • Make sure that the "zend" and "scribunto" roles are both provisioned (this has to be provisioned beforehand in order to avoid #71901 ). Zend might not be necessary, but since hhvm tends to be guilty until proven otherwise these days...
  • Enable and provision the "luacrash" role

-> The crash should happen during the provision run and you should find the strace in /tmp/stracetemplate_isodate

I'm not sure if the repro steps can be simplified further, considering that the command will only crash when run inside puppet. The exact same command run from vagrant ssh runs fine.

Enjoy! You clearly know more about making sense of that strace than I do :)

Hah, it turns out that the apparent success in the regular command line was misleading: it segfaults in there just the same and doesn't import the template.

Thus, the repro steps are a lot easier and don't involve puppet at all.

Get the XML dump of a template using lua from my patch posted above and do this:

mwscript importDump.php --wiki=wiki /vagrant/puppet/modules/role/files/multimedia/template/ISOdate.xml
echo $?
139

I'm pretty sure that this would happen with any dump of a template using lua.

Same for running directly without the mwscript wrapper:

vagrant@mediawiki-vagrant:/vagrant/mediawiki/maintenance$ php importDump.php /vagrant/puppet/modules/role/files/multimedia/template/ISOdate.xml
Segmentation fault (core dumped)

The crash is identical as before, strace segfaults at the exact same spot.

I think it's mildly concerning that mwscript doesn't report anything other than its exit code, because the segfault is invisible to someone running mwscript manually from the command line.

Ok, I managed to downgrade vagrant far enough to avoid bug 71878 and reproduce this. It turns out the bug is already fixed by Gerrit change 159757,[1] but the version of php-luasandbox available to vagrant doesn't include that change.

So the action needed here is that the php-luasandbox package used by vagrant needs updating.

[1]: Specifically, the lack of the #include <pthread.h> is causing the compiler to assume pthread_self() returns an int rather than an unsigned long int, so the value gets truncated and pthread_getcpuclockid() crashes on the incorrect value.

What's the process to package a new version?

Oddly enough it seems like your fix predates the time the latest package (the one being used by vagrant) was put together: http://apt.wikimedia.org/wikimedia/pool/main/p/php-luasandbox/

Change 166585 abandoned by Gilles:
[DONTMERGE] Role to reproduce 71354

Reason:
No need for that to repro the bug anymore

https://gerrit.wikimedia.org/r/166585

@Gilles is this still reproducible? If so I would suggest adding Ops to the ticket and asking for update of the php5 luasandbox package for trusty.

bd808 triaged this task as Medium priority.Dec 30 2014, 1:21 AM

@Gilles is this still reproducible? If so I would suggest adding Ops to the ticket and asking for update of the php5 luasandbox package for trusty.

Yes.

@Gilles is this still reproducible? If so I would suggest adding Ops to the ticket and asking for update of the php5 luasandbox package for trusty.

Yes.

Not sure who to ask on this, other than can you guys make a ticket to this effect that is distinct from this one?

@akosiaris would be the person I would ask about php packages so maybe he has an idea of the difficulty

Gilles claimed this task.

Updating to the latest php-luasandbox with apt solved the issue.