
[RFC] Content bundler
Closed, Declined (Public)

Description

MediaWiki needs a standard way to version, bundle, transport and sync on-wiki data (e.g. pages and files). We propose extending extension.json's schema and update.php process and providing tools to accommodate this functionality.

Use cases

Help Pages for MediaWiki

As currently distributed, MediaWiki does not come with any end-user help documentation. The Help namespace is present but empty. Bundling documentation with the MediaWiki distribution would allow new installations to point to on-wiki documentation.

This sort of bundling would also help wikis whose users cannot access MediaWiki.org.

Properly implemented synchronization would allow upgrades to update the documentation without overwriting local modifications by merging where possible and using conflict resolution where necessary.

Extension Distribution

Extension developers will, of course, be able to distribute help pages as described above.

In addition, extensions like UploadWizard will be able to bundle other on-wiki content that they depend upon, such as images, when other solutions like InstantCommons are not feasible.

Standard Dev→Staging→Prod workflow

When updating a local wiki, often it is necessary to modify local wiki content and then move that content to the staging and then production sites. Providing tools to move that content will make the jobs of administrators easier.

Why not just use import?

Using the current infrastructure doesn't provide a standard, well documented set of tools for the use cases we are looking to accommodate.

Implementation

TBD

extension.json schema updates

At least two fields need to be added to the schema. One will point to the enclosed bundle's URI and the other will point to an endpoint that will be used to perform as-needed updates.
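
As a rough illustration of the two fields, the sketch below builds an extension.json fragment and checks that both are present. The key names `ContentBundle` and `ContentBundleUpdateEndpoint`, the extension name, the bundle path and the endpoint URL are all hypothetical placeholders; this RFC does not name the fields.

```python
# Hypothetical key names: the RFC only says "at least two fields".
BUNDLE_URI_KEY = "ContentBundle"
UPDATE_ENDPOINT_KEY = "ContentBundleUpdateEndpoint"

# An illustrative extension.json fragment, expressed as a Python dict.
# "name" and "manifest_version" are existing extension.json keys; the
# two bundle keys and their values are assumptions made for this sketch.
EXAMPLE_EXTENSION_JSON = {
    "name": "ExampleExtension",
    "manifest_version": 2,
    BUNDLE_URI_KEY: "bundles/help.xml.gz",
    UPDATE_ENDPOINT_KEY: "https://example.org/w/api.php",
}


def missing_bundle_fields(extension_json: dict) -> list:
    """Return the bundle keys that are absent or not non-empty strings."""
    missing = []
    for key in (BUNDLE_URI_KEY, UPDATE_ENDPOINT_KEY):
        value = extension_json.get(key)
        if not isinstance(value, str) or not value:
            missing.append(key)
    return missing
```

An extension that ships no bundle would simply omit both keys, so a check like this would only apply to extensions that opt in.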

Tools

We anticipate providing the following tools to help create and manage these content bundles.

Special:Bundler

Lists known bundles and marks any that have updates available. Allows the administrator to perform updates.

Special:Bundler/conflicts

This page will allow the administrator to handle any edit conflicts that arise from bundle updates.

API endpoints

Provide a way to publish bundle updates, list and import available updates to known bundles, and list and resolve conflicts.

CLI

Similar to the API. Conflicts will probably not be handled on the CLI.

update.php

Performs a hands-free update of any bundles, whether included in core or in extensions. Any conflicts will be handled via Special:Bundler/conflicts.
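
One possible way to decide, page by page, whether a hands-free update is safe is a three-way comparison between the text installed by the previous bundle version, the text currently on the wiki, and the text in the new bundle. The sketch below is only an assumption about how that decision might look; the `Action` names and `plan_page_update` are hypothetical, and it compares whole pages rather than attempting a textual merge.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    INSTALL = "install"    # page does not exist locally yet
    SKIP = "skip"          # wiki already has the bundled text
    UPDATE = "update"      # no local edits; safe to take the bundled text
    KEEP_LOCAL = "keep"    # only the wiki changed; leave it alone
    CONFLICT = "conflict"  # both sides changed; needs an administrator


def plan_page_update(base: Optional[str], local: Optional[str], bundled: str) -> Action:
    """Decide what to do with one page.

    base    -- text installed by the previous bundle version (None if never installed)
    local   -- text currently on the wiki (None if the page does not exist)
    bundled -- text shipped in the new bundle version
    """
    if local is None:
        # A local deletion is not distinguished from "never installed" here.
        return Action.INSTALL
    if local == bundled:
        return Action.SKIP
    if base is None:
        # The page exists but did not come from a bundle: do not overwrite it.
        return Action.CONFLICT
    if local == base:
        return Action.UPDATE
    if bundled == base:
        return Action.KEEP_LOCAL
    return Action.CONFLICT
```

Under this sketch, only pages that come back as `Action.CONFLICT` would need to be queued for Special:Bundler/conflicts; everything else could proceed without administrator input.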

Event Timeline

MarkAHershberger raised the priority of this task from to Needs Triage.
MarkAHershberger updated the task description.
MarkAHershberger changed the visibility from "Public (No Login Required)" to "Subscribers".
MarkAHershberger changed the edit policy from "All Users" to "MediaWiki-Stakeholders-Group (Project)".
MarkAHershberger changed the visibility from "Subscribers" to "Public (No Login Required)". Nov 8 2015, 5:27 PM
MarkAHershberger changed the visibility from "Public (No Login Required)" to "Subscribers". Nov 8 2015, 5:29 PM
MarkAHershberger changed the visibility from "Subscribers" to "Public (No Login Required)". Nov 8 2015, 5:38 PM

Made this visible to the public. Currently this is still in draft, but I welcome your comments.

DanBolser writes:

I'm new to phabricator, so sorry if I'm doin it rong:

This works for me. I've spec'd out a bit of how I think this should be
implemented. I'll add that later today.

This feature should be in core. Help pages, for example, should be
available at installation time.

These tools are there and can definitely be used, but there needs to be
a way to automate all this.

Yup, I'm wondering what we can take from those existing tools, or even if
some could be obsoleted by this project (perhaps a way to get contribution
would be to contact each of the authors of those tools).

DanBolser writes:

I'm wondering what we can take from those existing tools, or even if
some could be obsoleted by this project (perhaps a way to get
contribution would be to contact each of the authors of those tools).

I see.

In that vein, I thought that one thing we could do is use some of the
tools from the Collection extension to create and/or manage bundles.
Make the core functionality minimal but installing the Collection
extension would give you a really nice set of tools for managing bundles.

Hi,

A few notes from our perspective:

  • Bundling should include (a selection of) wiki pages from all namespaces and images/files, but perhaps also other resources, e.g. logo.
  • One should be able to manually (from a special page) make a selection and start the bundling, like now with Special:Export
  • It should also be possible to select and bundle automatically, i.e. from a script or command line
  • There should also be an "unbundling" option, to deploy bundled content, again both manually from a special page and automatically from a script or command line
  • The automatic process is useful for transferring content and resources from an editor instance to a public instance of a wiki, where the public instance serves an approved subset of the pages in the editor instance
  • The manual process is useful for deployment of content and resources in a dev/test/staging/production environment. In many situations, security directives prohibit automatic deployment in a production environment.
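
As a small illustration of the scripted selection mentioned in the notes above, the sketch below builds a request URL for the existing `export` parameter of the Action API's `action=query`, which returns the same XML as Special:Export for the listed titles. The wiki URL and page titles are placeholders, `build_export_url` is a hypothetical helper, and the XML export covers page text only, so file uploads would still need separate handling.

```python
from urllib.parse import urlencode


def build_export_url(api_url: str, titles: list) -> str:
    """Build an api.php URL that exports the given pages as XML (wrapped in JSON)."""
    params = {
        "action": "query",
        "titles": "|".join(titles),  # the API takes pipe-separated titles
        "export": 1,                 # include the XML export in the result
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


# Placeholder wiki and selection, for illustration only.
EXAMPLE_URL = build_export_url(
    "https://example.org/w/api.php",
    ["Help:Contents", "File:Logo.png"],
)
```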

I agree that most of the program code has already been written; we just need to assemble it.

Not sure how to reply there, so I'll add my thoughts here... Does the "Include templates" option on Special:Export recursively get all templates?

Does the "Include templates" option on Special:Export recursively get all templates?

Yes, it does.

I said I would write up my ideas for this, but I haven't gotten around to it. So, here we go.

I thought about using the Collection extension and then, just now, looked at the Gather project. I think there is an opportunity to build on one or both of these projects for this. Obviously, these are my muddled thoughts, but they give me something to aim for and get feedback on.

  • Table for bundle names
    table: bundle
    col: name varchar(255)
    col: id int
  • Table to list pages in a bundle with revisions
    table: bundle_contents
    col: sourcewiki -- null or interwiki id
    col: page -- points to title
    col: revid -- specific revision of the title; falls back to the current revision if no id is specified
  • API offers list of bundles on the wiki
    • File pages mean bundle file and page description.
    • MW Help pages are first bundle instance
  • action=listbundles
    returns
    [ { bundle: "name", id: int }, ... ]
  • action=getbundle&id=???
    returns
    { url: URL-to-post, request: 'request data to send' }
  • CLI creates artifact to put under $IP/bundle
  • getBundle.php --list URL
    List the bundles available from the wiki at URL.
  • getBundle.php --fetch=NAME URL
    Get the NAME bundle from the wiki at URL and put it under $IP/bundles as compressed XML.
  • getBundle.php --fetch=NAME --install URL
    Get the NAME bundle from the wiki at URL and install it into this wiki.
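
To make the two tables and the action=listbundles response a little more concrete, here is a minimal sketch using Python's sqlite3 module rather than MediaWiki's own database layer. The `bundle_id` column linking contents to a bundle is an assumption (the outline above does not say how rows are tied to a bundle), and `list_bundles` is a hypothetical helper that only mimics the response shape.

```python
import sqlite3

# Column names follow the outline above; bundle_id is an assumed addition.
SCHEMA = """
CREATE TABLE bundle (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(255) NOT NULL UNIQUE
);
CREATE TABLE bundle_contents (
    bundle_id  INTEGER NOT NULL REFERENCES bundle(id),  -- assumed link column
    sourcewiki TEXT,             -- NULL or interwiki id
    page       TEXT NOT NULL,    -- points to title
    revid      INTEGER           -- NULL means "use the current revision"
);
"""


def list_bundles(conn: sqlite3.Connection) -> list:
    """Return bundles in the shape sketched for action=listbundles."""
    rows = conn.execute("SELECT name, id FROM bundle ORDER BY id")
    return [{"bundle": name, "id": bundle_id} for name, bundle_id in rows]


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one example bundle."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO bundle (name) VALUES (?)", ("MediaWiki Help",))
    conn.execute(
        "INSERT INTO bundle_contents (bundle_id, sourcewiki, page, revid) "
        "VALUES (1, NULL, 'Help:Contents', NULL)"
    )
    return conn
```

Leaving `revid` NULL is how this sketch expresses the "fallback to current" rule; a real implementation would resolve it against the wiki's revision data when the bundle is built.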

After seeing @Legoktm's talk on shadow namespaces, there is some overlap, but I just ran into Template:Ping needed by Flow. I think this could be a baby step.

RobLa-WMF mentioned this in Unknown Object (Event). May 4 2016, 7:33 PM
Krinkle subscribed.

Closing old RFC that has not yet moved to our 2020 process and does not appear to have an active owner. Feel free to re-open with our template or file a new one when that changes.