Running dispatchChanges as cronjob doesn't close down as expected
Closed, Resolved (Public)

Description

When dispatchChanges.php is run as a cronjob, it seems that it (at least sometimes) doesn't shut down as expected and therefore doesn't release its database connections. After a while MySQL runs out of free connections, and MediaWiki and other processes on the server grind to a halt.

Either a PID file should be used to stop new processes from starting, or a proper shutdown should be done, or a proper daemon should be made. Starting new processes from crontab without knowing whether the previous ones have exited properly is bad.
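
To illustrate the PID-file option, here is a minimal sketch in Python (not the dispatcher's own PHP code). It assumes a POSIX system where `fcntl.flock` is available; the function name and lock path are hypothetical. A new process that cannot take the lock would simply exit instead of opening further database connections.

```python
import fcntl
import os


def acquire_single_instance_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if the lock is taken.

    Hypothetical helper for illustration; the caller must keep the returned
    file object open for as long as the process should hold the lock.
    """
    handle = open(lock_path, "a+")
    try:
        # Non-blocking exclusive lock: fails at once if another holder exists.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    # Record our PID so an operator can see which process holds the lock.
    handle.seek(0)
    handle.truncate()
    handle.write(str(os.getpid()))
    handle.flush()
    return handle
```

Because the kernel drops the lock when the holding process exits, even after a crash, a stale PID file left on disk does not block the next run.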


Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=46643

Details

Reference
bz46476

Event Timeline

bzimport raised the priority of this task to Unbreak Now! (Nov 22 2014, 1:41 AM)
bzimport set Reference to bz46476.
bzimport added a subscriber: Unknown Object (MLST).

Setting this to highest and major until it can be verified whether this script is indeed expected to be run from crontab. If so, and that is the intended use at WMF, then it should be critical or a blocker.

I don't know if it's related, but I kept getting system errors and system reboots during the weekend while running the dispatcher.

When I stopped the dispatcher cronjob, I had no more issues:

Mar 23 13:30:01 hathor suhosin[5998]: ALERT - script tried to disable memory_limit by setting it to a negative value -1 bytes which is not allowed (attacker 'REMOTE_ADDR not set', file 'unknown')

For reference:

  • I didn't change anything about the way the dispatcher script is launched, kept alive, or terminated. It should behave as before. Can you please provide the exact setup that caused problems?
  • By default, the number of passes before quitting is the number of client wikis. So, if you have 6 client wikis configured, the dispatcher will do 6 passes, then exit. When the process exits, the database connection is closed. Does the process somehow lock up and stay around? Also, please run with the --verbose option for more helpful output.
  • When run from cron, the --max-time option should be set to the period at which the cron job is run, so the old dispatcher terminates when the new one is started. Ideally, we'd never need to do this, and this could be set to "once a year". In practice, something between 10 minutes and 1 hour seems sensible, because it allows for a swift recovery should the dispatcher die.
  • dispatchChanges itself doesn't mess with the memory limit; I think MediaWiki does so by default. Even if it did, I have no idea how that could trigger a reboot.
  • Since ops requested a proper daemon, I guess that's the way to go. I'd be happy if I didn't have to code that, though.
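
The --max-time idea from the list above can be sketched as a loop bounded by a deadline. This is an illustrative Python model, not the dispatcher's actual code: the function name, the `do_pass` callback, the idle delay, and the use of seconds as the unit are all assumptions made for the example.

```python
import time


def run_until_deadline(do_pass, max_time, clock=time.monotonic,
                       sleep=time.sleep, idle_delay=1.0):
    """Call do_pass() repeatedly until max_time seconds have elapsed.

    do_pass is a hypothetical callback returning the number of changes it
    dispatched. Returns the number of passes made. The clock and sleep
    functions are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + max_time
    passes = 0
    while clock() < deadline:
        dispatched = do_pass()
        passes += 1
        if not dispatched:
            # Nothing to do: wait a bit rather than spin. Time keeps
            # advancing, so the deadline is still reached when idle.
            sleep(idle_delay)
    return passes
```

With the deadline set to the cron period, the old process exits at about the time cron starts its successor, whether or not any changes arrived.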

Confirmed:

When running dispatchChanges without the --max-time parameter (and thus relying on --max-passes, possibly with its default value), the script would loop until the given number of passes were successfully completed. However, if there were no new changes, there was nothing to dispatch, and "idle" passes were not counted as completed. So the process would be "stuck" until a new change was recorded.

The solution is to count idle passes against --max-passes as well.
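
The difference can be modelled with a small Python simulation. This is a sketch of the counting logic as described in this comment, not the code of the actual patch; the function and its parameters are hypothetical, and the input list stands in for the number of changes found on each poll.

```python
def run_passes(pending_batches, max_passes, count_idle=True):
    """Simulate dispatcher passes over a list of per-poll change counts.

    Returns (completed_passes, polls). With count_idle=False, only polls
    that found changes count toward max_passes (the behaviour described
    as the bug); with count_idle=True, idle polls count as well (the fix).
    """
    completed = 0
    polls = 0
    for changes in pending_batches:
        if completed >= max_passes:
            break
        polls += 1
        if changes > 0 or count_idle:
            completed += 1
    return completed, polls


# One busy poll followed by five idle ones, with max_passes = 3.
batches = [3, 0, 0, 0, 0, 0]
print(run_passes(batches, 3, count_idle=True))   # (3, 3): exits after 3 polls
print(run_passes(batches, 3, count_idle=False))  # (1, 6): still short of 3 passes
```

In the second case the simulation stops only because the input list ends; a real process would keep polling and holding its database connection until new changes arrived.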

patch: I525152d3

(In reply to comment #4)

patch: I525152d3

Aude / jeblad / Daniel: The patch has been merged. What work is left here? Or can this bug report be closed as FIXED?

I think this can be closed as fixed.