Jobs are no longer executed
Closed, Resolved (Public)

Description

Since the upgrade to MediaWiki 1.22, jobs can only be executed with the maintenance script runJobs.php. Jobs are still being added to the database table "jobs", but although $wgJobRunRate is set to "1", no jobs are executed during normal page requests.

The jobs_attempts field in the DB is 0, meaning MediaWiki has not tried even once to execute them, and they are not executed over time either.

This still worked correctly in MediaWiki 1.21. It would be great if it could be fixed again!


Version: 1.22.0
Severity: major
Whiteboard: needs-release-notes
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=60208
https://bugzilla.wikimedia.org/show_bug.cgi?id=60698
https://bugzilla.wikimedia.org/show_bug.cgi?id=60844

Details

Reference
bz58719
bzimport raised the priority of this task from to High.

Aaron / Tim: Any idea how Joerg could debug this problem further?

aaron added a comment. Jan 3 2014, 4:10 AM

Apache and PHP error logs would be helpful.

There are no errors in the logs.

Let's for example say I have a job from ReplaceText: When I click the "do these replacements" button, the jobs are added to the database table. Then they are _not_ executed. And when I manually run the maintenance script runJobs.php from the shell, then they _are_ executed. And at none of these points in time do I get an error message.

In ccabd0efb05e, MediaWiki was changed to start runJobs.php in the
background at the end of the request (instead of running jobs in
the same process), using the PHP binary specified as $wgPhpCli.
(The default is "/usr/bin/php".)

If you put something like $wgPhpCli = false; in your LocalSettings.php
file, are jobs executed?
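As a generic illustration of why a failed background launch can leave no trace (this is not MediaWiki's actual code; `launch_detached` is a hypothetical helper, and discarding the output is an assumption about how such a launch is typically done), consider a command started with `&` and its output thrown away:

```shell
#!/bin/sh
# Hypothetical helper: start a command in the background and discard
# everything it prints, the way a "fire and forget" launch often does.
launch_detached() {
    "$@" >/dev/null 2>&1 &
}

# The child fails, but the caller sees neither its message nor its
# exit code: starting a background job always reports status 0.
launch_detached sh -c 'echo "cannot start interpreter" >&2; exit 127'
echo "launch status: $?"
wait
```

If the launch works this way, running the same command in the foreground from a shell is the only way to see its real exit status and messages.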

Setting $wgPhpCli = false; makes the jobs run again. However, $wgPhpCli defaults to "/usr/bin/php" and I do have PHP 5.3 available at that location. So changing $wgPhpCli should not be necessary at all; things should be working with the default just fine.

Looking at ccabd0efb05e, it seems to me like one of the shell functions, wfShellWikiCmd() or wfShellExec(), does not work properly... What now?

A probable cause is an open_basedir restriction that prevents /usr/bin/php from being accessed. See bug 60208.

I think this is not my problem: open_basedir (amongst other paths) contains "/usr/bin/". That should allow access to /usr/bin/php.

Ahh, open_basedir with /usr/bin/ is set when calling the script via the webserver.

In CLI mode I have: open_basedir => no value => no value.
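The reasoning above (a path is reachable if it lies under one of the directories listed in open_basedir, and an empty value means no restriction) can be sketched as a small shell check. This is a simplified approximation, not PHP's implementation: `path_allowed` is a hypothetical helper, and unlike PHP it does not resolve symbolic links, so a /usr/bin/php that is a symlink to a location outside the list could still be refused by PHP.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 lies under one of the
# colon-separated directories in $2. An empty list means
# "no restriction", as in the CLI output quoted above.
path_allowed() {
    path=$1
    list=$2
    [ -z "$list" ] && return 0
    old_ifs=$IFS
    IFS=:
    for dir in $list; do
        IFS=$old_ifs
        dir=${dir%/}            # treat "/usr/bin/" and "/usr/bin" alike
        case "$path" in
            "$dir"|"$dir"/*) return 0 ;;
        esac
    done
    IFS=$old_ifs
    return 1
}

# Example values only; the real list comes from the web server's PHP config.
if path_allowed /usr/bin/php "/var/www/:/usr/bin/"; then
    echo "/usr/bin/php is inside the listed directories"
fi
```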

Another problem may be that /usr/bin/php is the wrong php interpreter. For example, if your host has several versions of the php CLI installed, and /usr/bin/php is an unsupported one, it may fail to execute.

The PHP interpreter basically seems to work. E.g. when I run "/usr/bin/php -v" from the shell, this returns the expected version information. Also something like /usr/bin/php -r 'phpinfo();' executes just fine.
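A minimal sketch of that interpreter check, wrapped so it can be pointed at any candidate binary. `php_cli_version` is a hypothetical helper name; `-r` and the `PHP_VERSION` constant are standard PHP CLI features.

```shell
#!/bin/sh
# Hypothetical helper: print the version of the PHP CLI at $1,
# or fail with a message if nothing executable is there.
php_cli_version() {
    bin=$1
    if [ ! -x "$bin" ]; then
        echo "not executable: $bin" >&2
        return 1
    fi
    "$bin" -r 'echo PHP_VERSION, "\n";'
}

# /usr/bin/php is the $wgPhpCli default mentioned in this thread.
php_cli_version /usr/bin/php || echo "check the path set in \$wgPhpCli"
```

Running this as the web server's user rather than as your own login would be the closer test, since that is the account the background launch runs under.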

I just realized that when I produce syntax errors on the shell, these are NOT written to the error log. (I guess that is why my error log is still empty after unsuccessfully running runJobs.php.) However, I do see them on the shell, and when I execute runJobs.php on the shell, I do NOT see any error.

Change 108740 had a related patch set uploaded by Aaron Schulz:
Various fixes to job running code in Wiki.php

https://gerrit.wikimedia.org/r/108740

*** Bug 60533 has been marked as a duplicate of this bug. ***

Change 108740 merged by jenkins-bot:
Various fixes to job running code in Wiki.php

https://gerrit.wikimedia.org/r/108740

The patch https://gerrit.wikimedia.org/r/108740 is not correct for all cases.

When the core code is shared (the core code, and especially /maintenance, being shared by more than one wiki via the symbolic link method), the present patch is wrong because it does not indicate which wiki's job queue is to be processed.

The present code only runs "php runJobs.php"; it is missing something like

"php runJobs.php --conf=<path-to-localsettings-of-the-present-wiki>"

I hope you understand what I mean.
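The point above can be sketched as follows: with a shared maintenance directory, each wiki needs its own `--conf` argument. `runjobs_cmd` is a hypothetical helper and all paths are made-up examples; only the `--conf` option itself comes from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: build the runJobs.php command line for one wiki.
#   $1 = PHP CLI binary, $2 = shared maintenance directory,
#   $3 = that wiki's LocalSettings.php
runjobs_cmd() {
    php_bin=$1
    maint_dir=$2
    conf=$3
    printf '%s %s/runJobs.php --conf=%s\n' "$php_bin" "$maint_dir" "$conf"
}

# Two wikis sharing one core checkout (example paths only).
for wiki in wiki-a wiki-b; do
    runjobs_cmd /usr/bin/php /srv/mediawiki/maintenance "/srv/$wiki/LocalSettings.php"
done
```

The helper only prints the command lines, so they can be inspected before anything is actually run.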

(In reply to comment #14)

I detected this problem when running E:ReplaceText (latest git version) together with the latest git version of core, i.e. dated 2014-01-31.

T. Gries: You should probably open a new bug for this

(In reply to comment #16)

I also thought so but wanted to avoid this.

The backport to the 1.22 branch is still missing.

Change 110923 had a related patch set uploaded by Nemo bis:
Various fixes to job running code in Wiki.php

https://gerrit.wikimedia.org/r/110923

Change 110923 abandoned by Nemo bis:
Various fixes to job running code in Wiki.php

https://gerrit.wikimedia.org/r/110923

Change 110923 restored by MarkAHershberger:
Various fixes to job running code in Wiki.php

Reason:
I want it.

https://gerrit.wikimedia.org/r/110923

Change 110923 merged by jenkins-bot:
Various fixes to job running code in Wiki.php

https://gerrit.wikimedia.org/r/110923

Thanks for taking care, Mark!