Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T99268 RfC: Create a proper command-line runner for MediaWiki maintenance tasks
Open | | None | T314231 Instantiate MaintenanceScripts after extensions have registered with the autoloader
Event Timeline
With regard to update.php, we sometimes need to run maintenance tasks from in the middle of update.php's updating of the database schema.
The main requirement this puts on a maintenance task under this RFC is that one task (update.php) needs to be able to launch another and get back some indication of success or failure; the child task can't just exit() on failure.
The classes that power maintenance scripts will be moved out of the entrypoints, probably into includes/Maintenance. This will probably happen gradually. The entry points will continue to function, but eventually will emit deprecation notices.
They'll still extend the Maintenance class, and continue to be included in the autoloader.
I note the existing Maintenance class has both definitions of parameters and such, and also code for setting up the environment and parsing command lines. Somehow or other we'd need to split those two concerns, the new backend classes likely don't need all the environment setup and command line parsing code in their base class.
By default, when you ask the runner for the list of scripts, it'll provide just the sysadmin ones, and there will be a flag so you can ask for the developer ones as well.
A command line flag, a LocalSettings.php flag, or both?
It'll be called mwcmd.php (please, come up with a better name) in the root directory.
maintenance.php?
Syntax: php mwcmd.php update (for update.php), php mwcmd.php MassMessage:sendMessages (for extension scripts).
We could probably also use a syntax to somehow use a class from an out-of-tree file that's not in the registry. For example, if I copy a maintenance script to my home directory and change it somehow (for debugging or a quick hack). Or maybe an extension's "cleanup" script that will fix stuff in the database after the extension has been uninstalled without having to reinstall it first.
Right. We'd probably want fatalError() to do something else for children. Maybe throw an exception or return a StatusValue?
The classes that power maintenance scripts will be moved out of the entrypoints, probably into includes/Maintenance. This will probably happen gradually. The entry points will continue to function, but eventually will emit deprecation notices.
They'll still extend the Maintenance class, and continue to be included in the autoloader.
I note the existing Maintenance class has both definitions of parameters and such, and also code for setting up the environment and parsing command lines. Somehow or other we'd need to split those two concerns, the new backend classes likely don't need all the environment setup and command line parsing code in their base class.
Good point. I think for nearly all scripts we don't need to maintain PHP interface compatibility (there might be a few that get extended?). We could probably extract a smaller "MaintenanceTask" (or another name) base class from Maintenance that doesn't include that stuff, but keeps a similar interface for most maintenance scripts to use.
By default, when you ask the runner for the list of scripts, it'll provide just the sysadmin ones, and there will be a flag so you can ask for the developer ones as well.
A command line flag, a LocalSettings.php flag, or both?
I was intending for a command line flag.
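A minimal sketch of the listing behaviour discussed above, written in Python purely for illustration rather than in MediaWiki's PHP. The `--developer` flag name, the `audience` labels, and the `exampleDevTool` entry are assumptions made for this example; the thread only settles that the switch would be a command line flag.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class ScriptInfo:
    name: str
    audience: str  # "sysadmin" or "developer" (hypothetical labels)


# Hypothetical registry contents; only the first two names appear in the thread.
REGISTRY = [
    ScriptInfo("update", "sysadmin"),
    ScriptInfo("MassMessage:sendMessages", "sysadmin"),
    ScriptInfo("exampleDevTool", "developer"),
]


def list_scripts(registry, include_developer=False):
    """Return sysadmin scripts by default; add developer ones on request."""
    return [
        s.name
        for s in registry
        if s.audience == "sysadmin" or include_developer
    ]


def list_from_argv(argv, registry=REGISTRY):
    """Parse a hypothetical --developer flag and list the matching scripts."""
    parser = argparse.ArgumentParser(prog="maintenance.php")
    parser.add_argument("--developer", action="store_true")
    args = parser.parse_args(argv)
    return list_scripts(registry, include_developer=args.developer)
```

Calling `list_from_argv([])` returns only the sysadmin entries, while `list_from_argv(["--developer"])` returns all three.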
It'll be called mwcmd.php (please, come up with a better name) in the root directory.
maintenance.php?
I like!
Syntax: php mwcmd.php update (for update.php), php mwcmd.php MassMessage:sendMessages (for extension scripts).
We could probably also use a syntax to somehow use a class from an out-of-tree file that's not in the registry. For example, if I copy a maintenance script to my home directory and change it somehow (for debugging or a quick hack). Or maybe an extension's "cleanup" script that will fix stuff in the database after the extension has been uninstalled without having to reinstall it first.
Hmm, so you'd need to specify both an extra file name to include (since it wouldn't be in the autoloader), and then the class name to run. php maintenance.php --extra-include=~/foo.php class:MyFoo. (Can we just look up the name in the registry, and then check if that name is a class that implements MaintenanceTask? Or should we require a class: prefix to force it into a class?).
Probably throw an exception, which would be caught by maintenance.php or runChild() and handled appropriately.
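The idea of having fatalError() throw instead of exiting could look roughly like the sketch below. It is written in Python for illustration only; `MaintenanceFatalError`, the `Task` base class, and the return shape of `run_child()` are all assumptions for this example, not existing MediaWiki API.

```python
import sys


class MaintenanceFatalError(Exception):
    """Hypothetical exception raised instead of calling exit()."""

    def __init__(self, message, exit_code=1):
        super().__init__(message)
        self.exit_code = exit_code


class Task:
    def fatal_error(self, message, exit_code=1):
        # Raise rather than exit, so a parent task can observe the failure.
        raise MaintenanceFatalError(message, exit_code)

    def execute(self):
        raise NotImplementedError

    def run_child(self, task_cls):
        """Run another task and report (success, error message)."""
        try:
            task_cls().execute()
        except MaintenanceFatalError as err:
            return False, str(err)
        return True, ""


class OkChild(Task):
    def execute(self):
        pass


class FailingChild(Task):
    def execute(self):
        self.fatal_error("table missing")


class Update(Task):
    def execute(self):
        # The parent keeps running and collects each child's outcome.
        return [self.run_child(OkChild), self.run_child(FailingChild)]


def main(task_cls):
    """Top-level runner: only here is the error turned into an exit code."""
    try:
        task_cls().execute()
    except MaintenanceFatalError as err:
        print(err, file=sys.stderr)
        return err.exit_code
    return 0
```

In this arrangement the process exit code is decided in one place (the runner), while a parent such as the hypothetical `Update` task can decide per child whether a failure is fatal for the whole run.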
Hmm, so you'd need to specify both an extra file name to include (since it wouldn't be in the autoloader), and then the class name to run. php maintenance.php --extra-include=~/foo.php class:MyFoo. (Can we just look up the name in the registry, and then check if that name is a class that implements MaintenanceTask? Or should we require a class: prefix to force it into a class?).
There should be the option to still go via the registry, in case ~/foo.php is just a hacked copy of a registered class. We probably shouldn't allow naming classes directly at all unless --extra-include is given. I don't have an opinion on whether a "class:" prefix should be required.
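One way the name lookup discussed above might work, sketched in Python for illustration. It assumes a required `class:` prefix for direct class names (the thread leaves that open) and only permits it when `--extra-include` was given; the registry contents and class names are hypothetical.

```python
# Hypothetical registry mapping command names to task class names.
TASKS = {
    "update": "UpdateTask",
    "MassMessage:sendMessages": "MassMessageSendMessagesTask",
}

CLASS_PREFIX = "class:"


def resolve_task(name, registry, extra_include_given=False):
    """Return the class name to run for a command name.

    Registered names always go via the registry. Naming a class directly
    is only allowed when an extra include file was supplied.
    """
    if name.startswith(CLASS_PREFIX):
        if not extra_include_given:
            raise ValueError(
                "naming a class directly requires --extra-include"
            )
        return name[len(CLASS_PREFIX):]
    if name in registry:
        return registry[name]
    raise KeyError(f"unknown task: {name}")
```

With this sketch, `php maintenance.php --extra-include=~/foo.php update` would still resolve `update` through the registry, which covers the case where ~/foo.php is a hacked copy of a registered class.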
TechCom is hosting an IRC meeting to discuss this RFC tomorrow. The meeting is scheduled for Tuesday 11 September at 2pm PDT (21:00 UTC, 23:00 CEST) in #wikimedia-office. NOTE: this meeting is one day earlier than TechCom IRC discussions normally are.
I updated the RfC text with the points that Anomie raised: https://www.mediawiki.org/w/index.php?diff=2876495&oldid=2859545&title=Requests_for_comment%2FProper_command-line_runner_for_maintenance_tasks&type=revision :)
I noted this on the RFC talk page, but let me also put it here: Why limit this to maintenance tasks?
Why not a general architecture to run (command line) tools, of which the maintenance tool is just a more specific set/scope (maybe with its own registry)?
I advise people to take a look at something like https://laravel.com/docs/5.6/artisan
which also allows you to document the command and its arguments, to call command line programs programmatically, to run them using a scheduler or to queue the command itself as job.
I've been enjoying using that a lot and I hope it can inspire some of the work that would go into a tool like this.
We tend to use "maintenance script" to refer to any "command line tool".
I advise people to take a look at something like https://laravel.com/docs/5.6/artisan
which also allows you to document the command and its arguments,
MediaWiki's existing Maintenance class does that too.
to call command line programs programmatically,
If you look at T99268#4530407, I was talking about much the same thing.
Note that in general it's better architecture to have shared backend logic than to "call command line programs programmatically" with the attendant serialization of parameters to $argv-style string arrays.
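To illustrate the point about shared backend logic, here is a small Python sketch in which a command line entry point and a job-style entry point both call the same typed function, so nothing has to be serialized into an $argv-style string array between them. The `purge_pages` function and its parameters are invented for this example.

```python
import argparse


def purge_pages(titles, dry_run=False):
    """Shared backend logic: takes typed parameters, not strings from argv."""
    verb = "would purge" if dry_run else "purged"
    return [f"{verb} {title}" for title in titles]


def cli_entry(argv):
    """Command line wrapper: the only place that parses string arguments."""
    parser = argparse.ArgumentParser(prog="purgePages")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("titles", nargs="+")
    args = parser.parse_args(argv)
    return purge_pages(args.titles, dry_run=args.dry_run)


def job_entry(params):
    """Job-style wrapper: passes structured parameters straight through."""
    return purge_pages(params["titles"], dry_run=params.get("dry_run", False))
```

Both wrappers stay thin; a caller inside the application would call `purge_pages()` directly instead of building an argument list for `cli_entry()`.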
to run them using a scheduler
I don't know what you mean here, unless you're trying to say we should have some sort of cron reimplementation built into MediaWiki (which would probably use the job queue?).
Although really if you want scheduled runs of a "command line tool", you're probably best served by actually using cron.
or to queue the command itself as job.
See above regarding shared backend logic.
Based on the log, I think this was supposed to go to last call, but I'm not sure that ever happened (I don't see it in the TechCom notes...). Can we send it to last call now? I don't believe anything has significantly changed since it was last discussed.
Per the TechCom meeting on July 10, this goes on Last Call until July 24. If no relevant objections remain unaddressed by that time, this RFC will be approved as proposed.
This RFC has been approved as proposed per the TechCom meeting on 2019-08-07.
Implementation will probably be taken on by the Platform Engineering team at some point, but it's currently not high priority.
@CCicalese_WMF, do you have an idea how to fit this into our processes? Writing the framework shouldn't take long, it's not much work. Should be doable in a week or two of focused work. Converting existing scripts would take longer, but would be trivial.
I'm an outreachy applicant. Can I take up this task? How should I get started with this?
Hi @Dikshagupta99 and thanks for your interest! Have you taken a look at https://www.mediawiki.org/wiki/Requests_for_comment/Proper_command-line_runner_for_maintenance_tasks already? If yes, do you have a more specific question?
@WDoranWMF @kchapman @daniel Would you all be interested in promoting/mentoring this project via Google Summer of Code 2020 or Outreachy Round 20?
Hi! This project seems interesting to me and I would like to contribute to it via GSoC'20. But before that I have some queries:
- Is the command runner we are planning similar to the way we use Git? For example, we enter git in the terminal or command prompt and see all the commands that can be run, with a one-line description of each.
- Can you please link the source of the maintenance scripts we are planning to include?
Thanks in advance!
@Soumyaa1804 Thanks for your interest! FYI, we are not planning to promote this project via GSoC and might end up promoting via Outreachy. We will finalize in the next few days.
Okay! I would be interested to work on this even in Outreachy if it gets promoted there and my initial application gets selected. :)
(As per the program rules, we need to restrict access to this project task to the Outreachy Mentors group until the application period begins.)
Hi! @daniel I would like to contribute, can you please assign me some task to get started with?
@Akansha99: See https://phabricator.wikimedia.org/maniphest/query/rDGv.ANXQ4ce/#R (if Daniel does not correct me)
@srishakatux: I am wondering if step 9 on https://www.mediawiki.org/wiki/Outreachy/Participants should link to https://www.mediawiki.org/wiki/Good_first_bugs
@Aklapper Yes, that's a good point; I've added the link from the Outreachy/Participants page.
We are unlisting this project from Outreachy (Round 20). If you are a potential intern, please explore other projects here: https://www.mediawiki.org/wiki/Outreachy/Round_20#Ideas_for_projects.
Change 693134 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):
[mediawiki/core@master] WIP: Symfony Console based CLI for maintenance scripts
I'll be doing some more exploration around the patch above during the hackathon, if anyone's interested to code / critique / test, help is welcome :)
I noticed today that one of the first commits in the Maintenance system actually had a class-based registry, and while we lost that at some point, it connected the dots for me to something we still do today: the maintenance scripts are in the autoloader, and each script has logic to avoid execution if loaded outside its own entry point. This was, I believe, designed to be run via a wrapper one day.
From the original commit:
$wgMaintenanceScripts = array();
Register their maintenance scripts [class name] with the system
$wgMaintenanceScripts for extensions to add their scripts to the default list.
public static function getMaintenanceScripts() {
    global $wgMaintenanceScripts;
    return $wgMaintenanceScripts + self::getCoreScripts();
}
http://mediawiki.org/wiki/Special:Code/MediaWiki/54225
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/a1c51e18af85a9ac464c5b555921e58ec422cd11%5E%21/
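For readers less familiar with PHP, the `+` in the excerpt above is the array union operator: when a key exists in both arrays, the entry from the left operand is kept. A rough Python equivalent is sketched below, assuming the arrays are keyed by class name; the file paths and the extension entries are made up for illustration.

```python
def get_core_scripts():
    """Hypothetical core registry: class name -> file path."""
    return {
        "UpdateMediaWiki": "maintenance/update.php",
        "ExampleCoreScript": "maintenance/exampleCoreScript.php",
    }


def get_maintenance_scripts(extension_scripts):
    """Mimic `$wgMaintenanceScripts + self::getCoreScripts()`.

    On a key collision the extension-provided entry wins, matching
    PHP's array union, where the left-hand operand takes precedence.
    """
    merged = dict(get_core_scripts())
    merged.update(extension_scripts)
    return merged
```

Under that assumption, an extension registering a class name that core also uses would override the core entry rather than being ignored.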
If someone is willing to make small patches doing this step-by-step, I'd be happy to review it.
Last time I tried, I got lost in the way this is doing setup and loading config. Perhaps with SettingsBuilder, it can now be done in a saner way...
@Ladsgroup, @Krinkle, @kostajh: I have an experimental chain of patches up, starting at https://gerrit.wikimedia.org/r/c/mediawiki/core/+/782569. It still needs some polishing, testing, and bikeshedding about names, but it seems to work. I'd love to hear your thoughts on the approach I'm taking there, before I invest more time.
Change 818574 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):
[mediawiki/core@master] ExtensionRegistry: split exportExtractedData
Change 783428 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):
[mediawiki/core@master] Introduce runner entry point