
Make it possible to use code from an external repository for editor-controlled Javascript/CSS
Open, MediumPublic

Assigned To
None
Authored By
Tgr
Feb 19 2018, 9:15 PM

Description

There are several large problems with how editor-controlled JavaScript/CSS (user/site scripts/styles and gadgets) is used in MediaWiki.

Moving the code to some external version control platform should be a good solution for some of these issues and at least a foundation for a good solution for the rest:

  • code sharing would be trivial and easy to track
  • many public code repositories (most notably Github) also serve as development platforms and provide excellent code review and continuous integration support (and issue management and various other things)
  • while testing/debugging support would still have to be implemented on the MediaWiki side, pull requests / feature branches at least provide a sane basis for it (as opposed to the current state of the art for code review, which is pasting suggested changes to the talk page)

(frwiki is already working on moving their gadget code to git and might be interested in this. The 2018 Montpellier pre-hackathon also has a related theme.)

Implementation proposal:

  • Provide a configuration setting to associate a wiki with a list of repository providers.
  • Provide a web configuration interface for associating wiki pages (MediaWiki:*.js/css/json, maybe User:*.js/css/json) with files in a git repository (on some specified branch/tag).
  • Lock those pages from editing.
  • Modify the appearance of these pages to give some hint of what's going on.
  • Provide some kind of form for updating the pages associated to a certain repository to the latest version. This would work much like a normal edit (appear in recent changes etc.) so monitoring such code with the usual set of wiki tools would still be possible.
  • Optionally, provide a webhook for listening to updates; when the code is updated in git, notify the maintainers (how would they be identified though)? Or even update automatically, although that might be a bad security-convenience tradeoff.
  • Provide some TemplateSandbox-like tool to override which branch the files are loaded from (for the purposes of testing pull requests). gerrit 340768 will make this fairly easy.
  • Probably provide some sort of monitoring/reporting so that security engineers can get an easy overview of what repos to watch.
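To make the proposal above concrete, here is a minimal sketch of what a page-to-repository association record and the provider allow-list check could look like. All field and variable names here are hypothetical illustrations, not an existing MediaWiki schema:

```javascript
// Hypothetical shape of one page-to-repository association record.
// Every field name is illustrative, not an actual MediaWiki schema.
const association = {
  page: 'MediaWiki:Gadget-popups.js',
  provider: 'gerrit',          // must appear in the wiki's provider list
  repo: 'mediawiki/gadgets/popups',
  path: 'src/popups.js',
  ref: 'refs/heads/master',    // branch or tag the page is pinned to
};

// The wiki-level list of allowed repository providers (the "configuration
// setting" from the first bullet of the proposal).
const allowedProviders = ['gerrit', 'phabricator', 'github'];

// Reject associations that point at a disallowed provider or at a page
// type other than JS/CSS/JSON.
function validateAssociation(assoc, providers) {
  if (!providers.includes(assoc.provider)) {
    throw new Error(`provider "${assoc.provider}" is not allowed on this wiki`);
  }
  if (!/\.(js|css|json)$/.test(assoc.page)) {
    throw new Error('only JS/CSS/JSON pages can be associated with a repo');
  }
  return true;
}
```

A record like this would be created through the proposed web configuration interface, and the associated page would then be locked from normal editing.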

Event Timeline

  • Provide a configuration interface for associating wiki pages with files in a git repository (on some specified branch)
  • Lock those pages from editing

I don't think the code would depend much on the specifics of which external version control system is used. That choice could probably be left to the code maintainers.

I'd be skeptical of having development of on-wiki pages splintered across a multitude of external sites, with attendant PI disclosure and account management issues.

I'd be skeptical of having development of on-wiki pages splintered across a multitude of external sites, with attendant PI disclosure and account management issues.

The list of repository providers should be configurable; beyond that, IMO it's up to individual wiki communities whether they want a whitelist of providers (possibly one consisting only of gerrit and/or Phabricator), or to the Wikimedia community as a whole if it wants to enforce one on all communities. I doubt using e.g. GitHub is too much of a problem in practice though (some communities have already opted to do that, copying gadgets manually). PI and account issues only affect the gadget maintainers, who might well consider that a good tradeoff for the improved productivity / maintainability.

Just a stupid question: why not utilize phab:Diffusion as the repository?

That would reuse existing Phabricator/SUL nicks and keep messages and watching within one environment.

A top-level project might be established, ExternalResource or whatever.

Within that, there would be three directories:

  • user for individual nicks, avoiding ID conflicts for multiple tableSort ideas, referring to SUL identities.
  • group for working groups as desired on Labs/Tools, maintaining several tools by one group of people, with similar application area.
  • global for independent things named according to T117540

Then things might be accessed without conflicting paths.

And hosted in a wikimedia.org domain with WMF control over server, access and logfiles.

That wouldn't work for non-WMF wikis, unless the WMF suddenly decided to allow third-party wikis to use the WMF Phabricator for such things.

That wouldn't work for non-WMF wikis, unless the WMF suddenly decided to allow third-party wikis to use the WMF Phabricator for such things.

Why not?

That is a configurable issue. Imagine some $wgExternalResourceRepository string or object.

  • WMF would use phab:Diffusion for our purpose.
  • Non-WMF installations might use any Git host, a local repository, or whatever.

A separate configuration record for adapting such things is used everywhere, and is not supposed to be hardcoded somewhere deep in the guts.
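As a sketch of what such a configuration record could look like in LocalSettings.php — the variable name comes from the comment above, and neither it nor the array shape exists in MediaWiki today:

```php
// Hypothetical setting; $wgExternalResourceRepository does not exist in
// MediaWiki, and this array shape is purely illustrative.
$wgExternalResourceRepository = [
    // WMF wikis could point at Diffusion...
    'provider' => 'diffusion',
    'baseUrl'  => 'https://phabricator.wikimedia.org/source/',
    // ...while a third-party wiki might instead use e.g.:
    // 'provider' => 'github',
    // 'baseUrl'  => 'https://github.com/my-org/',
];
```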

Just a stupid question: why not utilize phab:Diffusion as the repository?

There is no reason why someone couldn't do that. I can think of some reasons why someone wouldn't want to, but that decision is best left to the gadget maintainer and/or the wiki community and/or the larger community that operates the wiki. So the issue of which provider to support/allow is not really relevant to this task.

That would re-use existing nicks of Phab/SUL and connect messages and watching within one environment.

That level of integration is unrealistic for a side project, and I doubt it's really desired either. Repo providers have their own tracking/notification systems and those work just fine, there is no point in reinventing them. (And in any case, it wouldn't be any harder to do for other providers than for Phabricator.)

I don't think the code would depend much on the specifics of which external version control system is used. That choice could probably be left to the code maintainers.

I'd be skeptical of having development of on-wiki pages splintered across a multitude of external sites, with attendant PI disclosure and account management issues.

Sure. This could potentially be mitigated by having a configurable whitelist for sources, I think. For Wikimedia purposes, we'd perhaps only whitelist phabricator.wikimedia.org and wmflabs.org, while other wikis could whitelist github.com, for example. Though if it were the MediaWiki installation doing the fetching and then "caching" the pages locally, a lot of these privacy concerns could be mitigated. It's not totally clear to me what the implementation being proposed here is. I'm not sure we're discussing direct client access to foreign resources.

I'm beginning to agree that integrating Git repositories as a substitute for on-wiki JavaScript and CSS pages may make sense, but I think it quickly becomes a question of how much integration with MediaWiki would be desirable.

I plan to work on this in my free time. My initial plan (haven't really looked at feasibility yet) is to write a new extension which can manage User:*.js/css and MediaWiki:*.js/css pages:

  • Provide a configuration interface for associating wiki pages with files in a git repository (on some specified branch)
  • Lock those pages from editing

Does lock here mean pinning? One idea, drawing on past work that Ori and Yuri and others have done, would be to have wiki pages use a bit of structured data. Individual wikis could pin a script by referencing its URL and hash, similar to how pip works. When the page gets edited to include a new URL and hash, it could even provide a diff or a link to the release notes or similar in the edit summary.

  • Provide a webhook for listening to updates; when the code is updated in git, update the local page. This would work much like a normal edit (appear in recent changes etc.) so monitoring such code with the usual set of wiki tools would still be possible.

I would consider keeping this part manual. Maybe provide a drop-down menu of available versions of a repo/script, and have that drop-down menu be automatically populated on a regular basis. This would allow wiki editors to deliberately update rather than getting surprise updates. And in the future, you could theoretically add a drop-down menu option of "auto-update" if people really want that.

Though if it were the MediaWiki installation doing the fetching and then "caching" the pages locally, a lot of these privacy concerns could be mitigated.

What I was getting at was that if you want to edit the code in this scheme, you'd have to go to the external repository, create an account there, and submit edits/pull requests/whatever there. Thus you have to manage having that external account and deal with whatever privacy policies that external site has. None of that is mitigated by caching pages locally, unless you'd have MediaWiki allowing submission of on-wiki edits that get pushed anonymously to the remote repo instead of being saved locally.

With a carefully-controlled whitelist that's mitigated somewhat, although the original proposal didn't include one.

Looking at the task again, it seems to me that the appointed repo is a per-gadget association, not one unique repo per site or per WMF farm.

There might be gadget code residing on a Phab@WMF/Diffusion branch, integrated into regular business here, some on Git, and some 3rd-party gadget could live at 3rdpartygadget.com or wherever.

Does lock here mean pinning?

The workflow I had in mind is that gadget maintainers just use the latest version of master. I guess that only makes sense for in-house code though, not third-party dependencies. And supporting a build process is not really realistic and we want gadgets to be able to share dependencies, so pulling in third-party code as standalone gadgets will be necessary sometimes. It's not something for the MVP version though - it raises interesting legal questions, amongst other things.

I would consider keeping this part manual. Maybe provide a drop-down menu of available versions of a repo/script, and have that drop-down menu be automatically populated on a regular basis. This would allow wiki editors to deliberately update rather than getting surprise updates. And in the future, you could theoretically add a drop-down menu option of "auto-update" if people really want that.

That's a bit inconvenient, but then it won't happen frequently. I suppose it's worth the security benefits (we already have trouble with too many people being able to deploy sitewide JavaScript; we probably shouldn't extend that to contributors of random GitHub repos).

Change 427622 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GitGadgets@master] [WIP] Create the extension

https://gerrit.wikimedia.org/r/427622

Broadly speaking, I've seen three different approaches proposed on this task and related tasks. I've tried to summarise them below.

1. Single repository for gadgets (Installed as extension)

Accommodated use cases:

  • Code review and edit requests.
  • Linting, unit testing and integration testing.
  • Very easy to install and update for third-party administrators.
  • Easy to get the compatible version of a gadget for a specific MediaWiki version.
  • Changes are live on Beta Cluster within 5 minutes.
  • Gadget maintainer can test the changes on real wikis. (During deployment via WikimediaDebug).
  • Changes can be live on Wikimedia wikis within 1 hour or 24 hours (depending on the time and day).

Limitations:

  • The deployed version of a gadget is controlled via Gerrit. If a revert is needed, this must be done via SWAT.

Example implementation:

  • The "Gadgets" extension would gain the ability to register gadgets as ResourceLoader modules (in addition to gadgets from local wiki pages).
  • mediawiki/extensions/WikimediaGadgets
    • This repository would contain all files for global Gadgets.
    • Gadget maintainers can write tests and add linters in any way they want via package.json (For example, ESLint and QUnit, or something else).
    • This extension would contain generic logic that automatically registers each module in this extension as a Gadget on the wiki.

Requirements for proposing changes to global gadgets (creating new ones, or editing existing ones):

  • Have a Wikimedia Developer account.
  • Basic familiarity with Git (either using a GUI, like "GitHub Desktop", "Atom", "Git Tower" etc.; or via command-line).

Workflow for proposing changes:

  • Have a git-clone of the WikimediaGadgets repository.
  • Make the changes to the files on your computer.
  • Create a commit and send to Gerrit.

Requirements for deploying changes:

2. Multiple repositories for gadgets (Installed as extensions)

Accommodated use cases:

  • Code review and edit requests from anyone.
  • Linting, unit testing and integration testing.
  • Easy to install and update for third-party administrators.
  • Easy to get correct version for specific MediaWiki stable versions.
  • Changes are live on Beta Cluster within 5 minutes.
  • Gadget maintainer can test the changes on real wikis.
  • Changes can be live on Wikimedia wikis within 1 hour or 24 hours.

Limitations:

  • The deployed version of a gadget is controlled via Gerrit. (Revert via SWAT).
  • When creating a new global gadget, there is one-time setup involved:
    • Gadget maintainer requests creation of Gerrit repository.
    • Gadget maintainer add new repository to the list of extensions in wmf-config.
    • Gadget maintainer sets up continuous integration.

Example implementation:

  • The "Gadgets" extension would gain the ability to register gadgets as ResourceLoader modules.
  • mediawiki/gadget/<MyGadget>
    • This repository would contain the files for one global Gadget.
    • This repository would also contain a small MediaWiki extension file that just registers the Gadget. (e.g. an attribute in extension.json that adds the module name to a list of gadget modules.)
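For that registration step, the extension.json fragment might look roughly like this. The `GadgetModules` attribute name is made up for illustration; it is not a real Gadgets extension attribute, though the `attributes` mechanism itself exists in manifest version 2:

```json
{
	"name": "MyGadget",
	"attributes": {
		"Gadgets": {
			"GadgetModules": [ "ext.gadget.MyGadget" ]
		}
	},
	"manifest_version": 2
}
```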

Requirements for proposing changes:

Workflow for proposing changes:

  • Have a git-clone of the Gadget<MyGadget> repository.
  • Make the changes to the files on your computer.
  • Create a commit and send to Gerrit.

Requirements for deploying changes:

3. Manager extension (Fetches from Git at run-time)

Accommodated use cases:

  • Code review and edit requests.
  • Linting, unit testing and integration testing. (Possibly to set-up manually, depending on git host)
  • Changes (including reverts) can be live on Wikimedia wikis within 5 minutes.

Limitations:

  • When creating a new global gadget, there is one-time setup involved:
    • Gadget maintainer creates a repository somewhere (e.g. request new Gerrit repo on mediawiki.org).
    • Gadget maintainer adds (or requests addition) of the repository to the list of GitGadget sources (e.g. on Meta-Wiki, Test Wiki, or Beta Cluster).
    • Gadget maintainer sets up continuous integration.

Example implementation:

  • The "Gadgets" extension would gain the ability to register gadgets as ResourceLoader modules.
  • A new "GitGadgets" extension would provide databases, APIs, special pages and user rights for managing external git-based sources of gadgets.

Comparison

1. Single repository for gadgets:
  • Third-party admins: Easier and faster to install than proposals 2 and 3. Installing the extension provides all the same gadgets as Wikimedia wikis automatically.
  • Gadget maintainers: Easier and faster to create more gadgets than proposals 2 and 3. (No new repository needed, and no MediaWiki extension needed.)
  • Gadget maintainers: Reverting changes requires SWAT deployment.
2. Multiple repositories for gadgets:
  • Third-party admins: Harder to install than proposal 1, because requires installing each gadget separately.
  • Gadget maintainers: Harder to create new gadgets than proposal 1. (Requires a new repository, extension and continuous integration.)
  • Gadget maintainers: Reverting changes requires SWAT deployment.
3. Manager extension
  • Gadget maintainers: Less time required to revert changes. (A gadget manager can revert the version of the git gadget using a special page on the wiki.)
  • Gadget maintainers: Harder to create new gadgets than proposal 1. (Requires a new repository and continuous integration.)
  • Third-party admins: Harder to install than proposals 1 and 2. Requires setting up external git sources. (This may be simplified if we provide a way to "export" sources from Wikimedia wikis and then "import" them locally. But either way, it adds complexity for administrators to maintain and keep in sync.)
  • Third-party admins: Harder to get the right version for MediaWiki compatibility (compared to proposals 1 and 2). The version (branch or commit id) of each gadget repository needs to be configured manually. This means site admins have to learn and use a new system, separate from extension versioning.

Thank you for this exhaustive analysis. However, I would like it to be complemented with three more issues, if they turn out to show differences:

  • Security: How is this protected against undesired gadget types and attacks on the code? By whom, and under which authority, will a review be performed before code takes effect on wikis?
  • Accounts: Where, and with what authorization and connection to trusted MediaWiki SUL accounts, will contributors show up?
  • Discussion workflow: Does all communication, watching and management take place on MediaWiki Phabricator, with one single account and MediaWiki Phabricator workboards, or separately on various repo platforms?

Altogether in one point: how strongly are well-known MediaWiki procedures and communication channels integrated and reused?

Being able to hack Lua modules elsewhere might be nice also.

Being able to hack Lua modules elsewhere might be nice also.

Strong support for the Module namespace too.

chasemp triaged this task as Medium priority.Dec 9 2019, 5:09 PM
chasemp added a project: Security-Team.
Aklapper removed Tgr as the assignee of this task.Jul 2 2021, 5:21 AM

Removing task assignee due to inactivity, as this open task has been assigned for more than two years (see emails sent to assignee on May 26 and Jun 17, and T270544). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be very welcome!

(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.)

So I kind of actually implemented this myself 🙂

There are two tools:

  • Wiki2git is a command-line Node tool to export existing gadgets from MediaWiki. This creates a Git repository preserving all contributions (authors, revisions).
  • Wikiploy is an npm module that makes it easy to deploy gadgets back to MediaWiki. This comes in two flavors:
    • Full Wikiploy uses Puppeteer, which controls e.g. Chrome Canary. This is fun and easy to set up, but rather slow. I might even remove it in the future, or maybe move it to a separate tool.
    • WikiployLite uses the bot API to deploy scripts. You have to set up a token for the bot, but it's super fast. The whole process of building and deployment is faster than editing the wiki.
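The WikiployLite flow boils down to MediaWiki's standard token-then-edit API dance. Here is a rough sketch, not Wikiploy's actual code: session handling via a bot-password login is elided, and Node 18+'s global fetch is assumed.

```javascript
// Build the POST body for MediaWiki's action=edit API module.
function buildEditParams(page, text, token, summary) {
  return new URLSearchParams({
    action: 'edit',
    title: page,
    text,
    summary,
    token,       // CSRF token; MediaWiki tokens end in "+\"
    bot: '1',
    format: 'json',
  });
}

// Deploy one script: fetch a CSRF token, then submit the edit.
// Assumes an already-authenticated (cookie-carrying) session, which a real
// tool would establish first via action=login with a bot password.
async function deploy(apiUrl, page, text, summary) {
  const tokenRes = await fetch(
    `${apiUrl}?action=query&meta=tokens&type=csrf&format=json`);
  const token = (await tokenRes.json()).query.tokens.csrftoken;
  const res = await fetch(apiUrl, {
    method: 'POST',
    body: buildEditParams(page, text, token, summary),
  });
  return (await res.json()).edit;
}
```

Because this is a plain page edit through the API, the deployment shows up in recent changes and page history like any manual edit.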

This is an example setup I have in one of my user scripts:

[screenshot: obraz.png]

I have a one-click build & deploy link in my VSCode for this script. For the curious, the bar with green links comes from a VSC extension Commandbar.

In a different gadget I deploy to Wikipedia and Wikisource all in one script. That gadget is actually originally by @matmarex as you can see in commit history 🙂

[screenshot: obraz.png]

Anyway, Wiki2git and Wikiploy solve my problems working on gadgets 🙂. I can now use tools like Mocha/Jest to test and Less/Sass + Browserify to build. I could also use Babel to compile to ES5... but honestly, I don't care about IE anymore. Maybe I would still use Babel for default gadgets.

So, I'm not sure if there is anything left to be done within this task. I might add some new features to the tools. Please let me know if you see obvious things missing or things that could work better. It would be great to hear, for example, whether you think using Puppeteer is good, fun, or bad. As mentioned, I'm thinking of removing Puppeteer support...

Links:

FWIW and in case anyone's interested, on bgwiki we have a Python daemon that does the job for us for a few years now: https://github.com/wikimedia-bg/git-sync

The JS, Module, Spam-black/whitelist, and Titleblacklist pages are transparently synchronized to https://github.com/wikimedia-bg/ -- the wikipedia-{ui,lua,spam,tbl} repos respectively.

There's also a small gadget that provides proper links to GH in the page histories: https://bg.wikipedia.org/wiki/MediaWiki:Gadget-ParsePhabLinks.js
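The core of such a sync daemon is just: poll recentchanges, pick out the watched pages, write them into a working tree, and commit. A rough Node sketch of the selection step follows; the real bgwiki tool is a Python daemon, and the namespace and title rules here are illustrative:

```javascript
// Namespaces the daemon mirrors into git (8 = MediaWiki, 828 = Module).
const WATCHED_NAMESPACES = new Set([8, 828]);

// Given a batch of entries from action=query&list=recentchanges, keep the
// edits worth mirroring: watched namespace, and within the MediaWiki
// namespace only JS/CSS pages plus the blacklist pages.
function selectSyncable(changes) {
  return changes.filter(rc =>
    WATCHED_NAMESPACES.has(rc.ns) &&
    (rc.ns !== 8 ||
      /\.(js|css)$/.test(rc.title) ||
      /blacklist/i.test(rc.title)));
}
```

Each selected page would then be fetched, written to the repository checkout, and committed with the on-wiki author and edit summary, giving a Git history that mirrors the page history.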

FWIW and in case anyone's interested, on bgwiki we have a Python daemon that does the job for us for a few years now: https://github.com/wikimedia-bg/git-sync

The JS, Module, Spam-black/whitelist, and Titleblacklist pages are transparently synchronized to https://github.com/wikimedia-bg/ -- the wikipedia-{ui,lua,spam,tbl} repos respectively.

There's also a small gadget that provides proper links to GH in the page histories: https://bg.wikipedia.org/wiki/MediaWiki:Gadget-ParsePhabLinks.js

Thanks for sharing this :) Interesting idea!

I don't think there are hackathon-sized improvements to MediaWiki core specifically that would help here.