Make it easy to fork, branch, and merge pages (or more)
Open, Needs Triage · Public

Description

It used to be conventional wisdom that forking was the death of an open source project. We all remembered the emacs -vs- xemacs -vs- Lucid emacs wars, and nobody wanted to repeat that. So we took great care to keep our repos centralized and singular.

Then git arrived, and shortly after, github. Suddenly, forking wasn't evil! Instead, creating a fork was the *very first* thing that you did when you wanted to contribute.

There are a lot of benefits to the forking model. Nobody has to ask permission! Instead of "they reverted my edit!" new editors would instead just see "they didn't immediately merge my edit" -- which is less immediately offputting, and allows for additional refinement of the contribution before merge. And further, groups can potentially build a collection of articles over a long period of time, without the need to make their initial work immediately public. (The "diff" and "merge" steps are just as important as the "fork", in order to make this model work well!)

There are a number of ways to experiment with "fork-and-merge" models for editing wikimedia projects. Some concrete suggestions are fleshed out below. Add your own, let's discuss, and we can figure out the best way to start experimenting with fork-and-merge.

Related tasks (thanks, @Tgr & @awight): T108664: Provide an interactive edit conflict resolution tool; T26617: Implement diff on sentence level (instead of per paragraph block); T91137: RFC: Support a branching content history model; T40795: History should support branches ("at this revision there was a merge/split with that revision").

This is also mentioned on the 2015 Community Wishlist Survey: Support for version branching for pages (where the idea drew considerable opposition).

SUMMIT GOALS

  • The primary goal of the summit is to agree on an actionable "next step" (or "steps", if we're ambitious), so that work on improved revision models doesn't continue to stagnate. We should leave the summit with at least one implementable feature or proof-of-concept which will advance or inform the broad goal and can be implemented before the next dev summit. Some concrete suggestions raised in this thread include:
  • A UX roadmap. "How should users experience branches/forks/merges?" Since this impacts the community, I expect this to be a set of *experiments* rather than a fixed set of UX decisions. For example:
    • Prototype lightweight branches with a JavaScript gadget that forks the page into user space, deploy it on mediawiki.org, and get user feedback.
    • Have our design team mock up user-facing branches from T91137 (see wireframes at https://www.mediawiki.org/wiki/Requests_for_comment/Branching) into a form suitable for public comment.
    • Re-envision UX for a "saved drafts" feature, which might use branching support semi-invisibly as an implementation mechanism. Perhaps mobile could be the guinea pig here, letting mobile users save edits-in-progress and continue them on their desktop devices.
  • A final goal should be broad agreement on a technical roadmap. How are branched revisions going to be stored in core; how that interacts with RESTBase, etc; how to represent a branching revision history. This roadmap shouldn't dive into UX considerations or overly-specific implementation details (those are part of the "concrete next step" planning, if necessary) but we should have an envelope-sketch plan that we all agree makes sense, and which should help guide future RFC discussions on specific implementation proposals.

Note that the technical and UX roadmaps may be initially somewhat at-odds, as @Pginer-WMF rightfully points out in the comments below. The goal of this summit is just to write a blind first draft for both, so that we can then start a dialog between them. Otherwise both design and implementation get hung up waiting for the other. We don't expect to actually implement the drafts as-is, but we *will* publish and talk about them with the community and start working on reconciliation.

This card tracks a proposal from the 2015 Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey

This proposal received 1 support vote, and was ranked #99 out of 107 proposals. https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Miscellaneous#Support_for_version_branching_for_pages

Isarra added a subscriber: Isarra. Oct 25 2015, 7:25 PM

Forking Wikipedia is easy. It's merging back that's hard.

Forking is hard too. It's not small, and just dealing with a database that cumbersome can be problematic, and when all you have is an xml dump (apparently the preferred way to get copies?), there may not even be any reliable tools around to even turn it back into a database.

Qgil added a comment. Oct 28 2015, 10:17 AM

This Summit proposal is still a bit vague, don't you think? I mean, there are several related tasks trying to address basically the same root problem, but none of them seems to be driving a discussion with clear Summit objectives.

Related comment:

Considering that T113004: Make it easy to fork, branch, and merge pages (or more) is blocking this task and it is also being proposed as a Summit proposal, do you think it would be better to focus efforts on that task? I just fear spreading the attention too much, and not reaching any conclusions at the end.

Restricted Application added a subscriber: StudiesWorld. Nov 4 2015, 9:18 PM

@tstarling
I'm happy for this to be the canonical bug, but yeah I think we were both after the same feature. I'm not sure how to collapse the two.

@tstarling @awight, please clarify the situation of this proposal.

Today is November 6, and this proposal is basically not on track. Unless the situation suddenly changes and/or @RobLa-WMF and the Architecture Committee really want to schedule it, it will be removed as a Wikimedia-Developer-Summit-2016 proposal.

cscott added a comment. Nov 6 2015, 6:13 PM

I think this is one of the topics which everyone agrees we "should" have, but would benefit from focused discussion about "how we're going to do it". It would be a shame to see it drop by the wayside.

Let me make a stab at adding a "Summit Plan" section to the task description, and hopefully @tstarling and @awight can review/improve it.

cscott renamed this task from Making it easy to fork (and merge!) wikipedia to Making it easy to fork, branch, and merge wikipedia. Nov 6 2015, 6:32 PM
cscott updated the task description.
Neil_P._Quinn_WMF renamed this task from Making it easy to fork, branch, and merge wikipedia to Make it easy to fork, branch, and merge pages. Nov 6 2015, 7:20 PM
cscott added a comment. Nov 6 2015, 8:12 PM

@Neil_P._Quinn_WMF changed the title from "Making it easy to ... fork Wikipedia" to "Making it easy to ... fork *pages*".

I guess "fork, branch, and merge pages" is a more accurate description of the concrete tasks currently proposed. But the original title was following original ideas by Erik Moeller and others about forking the entire wiki, or at least large chunks of it. WRT T91137, for example, the question is "are branch names specific to a certain page, or are they (could they be) global to an entire wiki?".

Say you wanted to create a "Friendly Wiki" space, with its own community-driven code of conduct which promised constructive criticism, not reverts. There might be a "friendly wiki" global branch, and a "friendly wiki" community who policed that branch, reviewed edits, then suggested merges back to the "master branch" when the "friendly editors" were happy.

Or (to raise a specter), an "Inclusionist Wiki" branch, that held some articles deleted from the master branch of the wiki due to "insignificance". Or a "Deletionist Wiki" branch that had *more* deletions, and no "In Popular Culture" sections.

These uses of a branching content model are closer to "branching wikipedia" than they are to "branching pages".

Without taking a position necessarily on whether global branching was desirable, I was trying to use a title inclusive of discussion of the possibility.

saper added a subscriber: saper. Nov 6 2015, 8:49 PM
saper added a comment. Nov 6 2015, 8:51 PM

We need to learn from Ward Cunningham's Federated Wiki a bit. They have a very nice JSON paragraph-like revision protocol to exchange information between wikis. It seems to me that having things on a paragraph level made Federated Wiki much simpler, with paragraph-level visual editing included.

saper added a comment. Nov 6 2015, 8:54 PM

An idea for a small step forward:

One use case which we already have in MediaWiki today is merging histories of two articles via delete/undelete. The process is as follows:

  • Delete article A
  • Move article B onto name A
  • Undelete revisions of A and B

It would be great if the non-linear history created this way were presented on the diff page accordingly.
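
A minimal sketch of why the merged history is non-linear, assuming a simplified model in which a revision is just a timestamp, the page it was originally saved under, and a summary. The `Revision` type and both helpers are hypothetical illustrations, not MediaWiki APIs. After the undelete step the two histories interleave by timestamp, so adjacent revisions in the list can come from different pages:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    timestamp: int   # simplified: an integer instead of a real timestamp
    origin: str      # page the revision was originally saved under
    summary: str


def merge_histories(a, b):
    """Interleave two linear histories by timestamp, as a history merge does."""
    return sorted(a + b, key=lambda rev: rev.timestamp)


def origin_switches(history):
    """Return index pairs where adjacent revisions come from different pages."""
    return [
        (i, i + 1)
        for i in range(len(history) - 1)
        if history[i].origin != history[i + 1].origin
    ]


history_a = [Revision(1, "A", "create"), Revision(4, "A", "expand")]
history_b = [Revision(2, "B", "create"), Revision(3, "B", "copyedit")]

merged = merge_histories(history_a, history_b)
switches = origin_switches(merged)
```

The pairs in `switches` are the places where a diff between neighbouring revisions compares texts from two different lineages; those are the points a branch-aware history view would want to mark.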

phuedx added a subscriber: phuedx. Nov 6 2015, 9:19 PM

@cscott, sorry, perhaps I didn't understand the full scope of the task. Feel free to change the title back!

jayvdb added a subscriber: jayvdb. Nov 7 2015, 12:40 AM
awight added a comment. Edited Nov 7 2015, 1:39 AM

@cscott
The summit plan looks great! It's clear that there's tension between actually supporting branching and simulating it in some cheaper form. Since we're talking about a potentially huge performance impact, and currently unknown use cases, perhaps it's best to focus the next steps on how to explore this feature, finding inexpensive ways to link branches (T107595?) and putting together infrastructure to support experimentation.

There are a *lot* of radical things we could build on top of this feature, it would be fun to write a very general system which would even encompass things like "this article A was extracted from paragraphs 12-14 of article B, at revision N. Synced again at revision M."
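
To make the "extracted from ... synced again" idea concrete, here is one possible record shape, purely as a sketch. The `ExtractLink` class, its field names, and the example page names and revision numbers are all assumptions for illustration; nothing like this exists in MediaWiki today.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractLink:
    """Hypothetical provenance record: a page derived from part of another."""
    source_page: str
    paragraphs: range                 # e.g. range(12, 15) for paragraphs 12-14
    extracted_at: int                 # source revision the text was taken from
    synced_at: list = field(default_factory=list)

    def last_synced(self):
        """Most recent source revision the extract is known to match."""
        return self.synced_at[-1] if self.synced_at else self.extracted_at

    def record_sync(self, revision):
        """Note that the extract was re-synced against a later source revision."""
        if revision <= self.last_synced():
            raise ValueError("sync revision must be newer than the last one")
        self.synced_at.append(revision)

    def describe(self, target_page):
        first, last = self.paragraphs[0], self.paragraphs[-1]
        return (
            f"{target_page} was extracted from paragraphs {first}-{last} "
            f"of {self.source_page} at revision {self.extracted_at}; "
            f"last synced at revision {self.last_synced()}"
        )


link = ExtractLink(source_page="B", paragraphs=range(12, 15), extracted_at=100)
link.record_sync(140)
```

Storing only the link, rather than a full branch, is one of the cheaper ways to "simulate" branching that the comment above alludes to.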

There are a lot of benefits to the forking model. Nobody has to ask permission! Instead of "they reverted my edit!" new editors would instead just see "they didn't immediately merge my edit"

There is nothing new in this idea, it was already discussed around 2003 way before GitHub. See http://meatballwiki.org/wiki/ViewPoint, http://meatballwiki.org/wiki/ViewPointComments plus http://meatballwiki.org/wiki/TimStarling and http://meatballwiki.org/wiki/DefendAgainstPassion . My opinion aligns with ChrisPurcell's.

Although the ticket talks about "pages", I think it is relevant to consider which are the units of knowledge that are most useful to fork and merge.

We may consider supporting the creation of alternative paragraphs and sections, or at least presenting it to the user at that granularity level. That would create a working area for each page with pieces that editors can easily move into the page (merge) or back to the working area, as well as creating multiple versions (forking) of those pieces in order to improve them and discuss them.

By encouraging smaller granularity, the purpose of changes can be more clear (e.g., improve the "Etymology" section of the "Rice" article), and the concepts simpler to communicate ("proposed sections/paragraphs that anyone can suggest an improvement for and can be eventually added to the article"), and the process of merging may become less painful by reducing the chances of processing big changes containing some parts worth merging and others that are not.

All this depends on the content contribution patterns that we want to support (e.g., users creating a full article individually vs editors collaborating on smaller parts). Although the ticket description indicates that "This roadmap shouldn't dive into UX considerations", I think that a more detailed definition of the issues we try to solve, the intended benefits for the user and the design goals we expect to support with any given solution would be very useful before jumping into discussing the specifics of a solution.

a working area for each page with pieces that editors can easily move into the page (merge) or back to the working area

Like a wikitext talk page? :)

cscott renamed this task from Make it easy to fork, branch, and merge pages to Make it easy to fork, branch, and merge pages (or more). Nov 7 2015, 1:59 PM
cscott added a comment. Nov 7 2015, 2:16 PM

Although the ticket description indicates that "This roadmap shouldn't dive into UX considerations", I think that a more detailed definition of the issues we try to solve, the intended benefits for the user and the design goals we expect to support with any given solution would be very useful before jumping into discussing the specifics of a solution.

The "shouldn't dive into UX" is *only* for the tertiary goal, and is just to try to unblock it and determine if we can achieve technical consensus on broad-stroke implementation details. I totally agree that the consensus reached may well be modified due to either of the first two goals: feedback or lessons learned by the "first step" project, or a more detailed UX design. But I still think it's useful to sit down at the summit with folks who know mediawiki-core inside and out and talk turkey: lots of places currently depend on a linear history. What's the best way to broaden that without breaking things?

It's worth noting that the parsoid team has had "stable ids" on its roadmap for a while, which would provide persistent hash-like identifiers for paragraph-scale blocks of content. This could help content translation, for example. But (to play devil's advocate) when git was designed Linus specifically *rejected* fine-grained tracking of this sort, claiming that any need for such information can just as efficiently be generated after-the-fact. It may well be the case here as well; but it's another possible arrow in our quiver.
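
The two approaches can be contrasted in a small sketch. This is not Parsoid's design: `content_id` is a stand-in that hashes paragraph text, and `match_after_the_fact` is a hypothetical helper that recovers paragraph correspondence between two revisions by similarity, using Python's standard `difflib`. The 0.6 threshold is an arbitrary assumption.

```python
import difflib
import hashlib


def paragraphs(text):
    """Split text into paragraphs on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def content_id(paragraph):
    """Hash-like identifier derived purely from the paragraph's text."""
    return hashlib.sha1(paragraph.encode("utf-8")).hexdigest()[:8]


def match_after_the_fact(old, new, threshold=0.6):
    """Map each new paragraph index to the most similar old index (or None)."""
    mapping = {}
    for j, new_par in enumerate(new):
        best_i, best_score = None, threshold
        for i, old_par in enumerate(old):
            score = difflib.SequenceMatcher(None, old_par, new_par).ratio()
            if score >= best_score:
                best_i, best_score = i, score
        mapping[j] = best_i
    return mapping


old = paragraphs("Rice is a cereal grain.\n\nThe word comes from Old French ris.")
new = paragraphs("The word rice comes from Old French ris.\n\nRice is a cereal grain.")

mapping = match_after_the_fact(old, new)
```

In this example the moved paragraph keeps its content hash, but the edited one gets a new hash, so a purely content-derived ID loses track of it. The similarity pass still pairs it with its predecessor, which is the "generate it after the fact" argument in miniature. A persistent ID stored alongside the paragraph would survive the edit without any matching step, at the cost of having to store and maintain it.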

a working area for each page with pieces that editors can easily move into the page (merge) or back to the working area

Like a wikitext talk page? :)

Yes, but...

A talk page is associated with the whole content page. Users participating in them need to describe which piece of content they are referring to when proposing an improvement. This indirection gap requires an effort for those creating proposals and those trying to understand how those proposals map to the article content. For example, collecting all proposals ever made in a talk page about a given article section is not an easy task at the current granularity level.

So I agree that talk pages are currently used for this purpose (and many more due to the flexibility a blank page provides), but the fact that people using them refer to pieces of content also suggests that providing tools for operating at that level may be useful for some use cases. Having said that, this is not intended to replace a global view for the article which is still relevant in many ways. So we'll need to consider how those views should work together when exploring these possibilities.

The "shouldn't dive into UX" is *only* for the tertiary goal, and is just to try to unblock it and determine if we can achieve technical consensus on broad-stroke implementation details.

Ok. What I was trying to convey is that the technical direction will be affected by the answer of "How should users experience branches/forks/merges?" since it may lead to many different possibilities (with many different technical implications).

But (to play devil's advocate) when git was designed linus specifically *rejected* fine-grained tracking of this sort, claiming that any need for such information can just as efficiently be generated after-the-fact.

My main concern is which building blocks (i.e., concepts, operations, etc.) are the most adequate for users in our context.

Let's imagine that facilitating the revision of content at the paragraph level is a good idea. In that case, I'm interested in how the model is presented to the users, and I'm fine with any underlying implementation to support it (storing paragraph ids or getting them after the fact). The key question is whether limiting a change to the scope of a paragraph helps focus the collaboration or hinders it by fragmenting potentially broader changes to the whole page.

I think Git is a good example of a very powerful tool that is not very intuitive (and even alternative terminology has been proposed to try to add some clarity). That could be a result of its general-purpose nature, and that is why I think that understanding what versioning means in our context is key to providing users with the right tools, so that they don't feel that they have been given a powerful but unintelligible alien time machine.

@Pginer-WMF I agree with your points. I still think it's worthwhile to discuss the implementation landscape so we have a rough idea what sorts of things are "easy" and which are "hard" (or even "impossible"). Hopefully the result will be a dialog between design saying "here's what we want" and implementation saying "here's what we can do" and then we find some middle ground. But we need to start the discussion on both sides first.

cscott updated the task description. Nov 10 2015, 8:22 PM

A different proposal on the 2015 Community Wishlist Survey is "Allow copy of pages", which is really talking about forking books AFAICT. Better fork/join mechanisms might be useful here, especially if the "forked" content could live in its own namespace.

Github allows this, for instance: I can fork project "foo/bar" into "cscott/some-other-project" and still preserve the history and linkage back to the original "foo/bar".

As I understand the user, they want to start a book on "Java 8" (say) starting from the book on "Java 7". Ideally we wouldn't lose the links back to the original book, in case there were errors fixed which would be common to both versions of the book.

I think one way of thinking about this is to build a list of viable approaches that we would like proof-of-concept implementations of. The prep work for this session could be to build a clear list of these, and also try to find people who offer to build (or lead building) a proof of concept for that approach. The meeting itself could be to take the amount of time, divide a block of it evenly(?) between the different options, and then let the people planning to build proof of concept implementations explain how they plan to do it. If no one offers to build it by/at the summit, and no one is excited to build it, then it gets removed from the list of considered options.

Does that seem like a reasonable approach?

fbstj awarded a token. Nov 18 2015, 8:50 AM
Izno added a subscriber: Izno. Nov 20 2015, 3:36 PM
Tgr added a comment. Nov 20 2015, 10:32 PM

The top ask from the dewiki/WMDE community wishlist is better edit conflict resolution, which has a lot of overlap with this (an edit conflict being basically a mini fork/merge).
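
Seen as a mini fork/merge, an edit conflict has three inputs: the revision both editors started from, and each editor's result. A minimal three-way merge sketch follows, under the loud assumption that text is a list of paragraphs and that neither edit adds or removes paragraphs; `merge3` is a hypothetical helper, not how MediaWiki resolves conflicts.

```python
def merge3(base, mine, theirs):
    """Three-way merge of equal-length paragraph lists.

    Returns (merged, conflicts), where conflicts lists the paragraph
    indexes that both sides changed in different ways.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("sketch only handles edits that keep the paragraph count")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t or t == b:
            merged.append(m)      # both sides agree, or only my side changed it
        elif m == b:
            merged.append(t)      # only their side changed it
        else:
            merged.append(b)      # both changed it differently: keep base, flag it
            conflicts.append(i)
    return merged, conflicts


base = ["intro", "history", "usage"]
mine = ["intro (reworded)", "history", "usage, expanded by me"]
theirs = ["intro", "history with dates", "usage, expanded by them"]

merged, conflicts = merge3(base, mine, theirs)
```

Only the third paragraph needs human attention here; the other two edits combine cleanly, which is the case an interactive resolution tool could handle silently.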

AS added a subscriber: AS. Dec 3 2015, 6:28 PM
Nemo_bis updated the task description. Dec 7 2015, 7:37 AM
GWicke added a subscriber: GWicke. Dec 29 2015, 7:59 PM

I also think that a more concrete proposal is needed for the summit, considering the difficulty of the topic.

In the past, we have discussed HTML diffs like those on localwiki. Those are desirable for VisualEditor users in general, and would provide a fairly intuitive UX basis for an interactive merge tool.

In the past, we have discussed HTML diffs like those on localwiki. Those are desirable for VisualEditor users in general, and would provide a fairly intuitive UX basis for an interactive merge tool.

Add T105173 to blockers if you think that's closely related?

Ejegg added a subscriber: Ejegg. Jan 9 2016, 1:01 AM
Krenair added a subscriber: Krenair. Jan 9 2016, 1:03 AM

We discussed this at an unconference session at the WMF All Hands meeting. One of us will need to publish the notes (assuming they are useful, which I hope they are)

phuedx removed a subscriber: phuedx. Jan 11 2016, 5:24 AM

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

Restricted Application added a subscriber: JEumerus. Jan 20 2016, 6:15 PM
IMPORTANT: If you are a community developer interested in working on this task: The Wikimedia Hackathon 2016 (Jerusalem, March 31 - April 3) focuses on #Community-Wishlist-Survey projects. There is some budget for sponsoring volunteer developers. THE DEADLINE TO REQUEST TRAVEL SPONSORSHIP IS TODAY, JANUARY 21. Exceptions can be made for developers focusing on Community Wishlist projects until the end of Sunday 24, but not beyond. If you or someone you know is interested, please REGISTER NOW.
DannyH updated the task description. Feb 6 2016, 12:33 AM
Qgil removed a subscriber: Qgil. Feb 11 2016, 12:31 PM
Zppix moved this task from Unsorted to Working on on the Contributors-Team board. Apr 26 2016, 2:31 PM

Restating the consensus from the meeting cited above: the suggestion was made that I prototype these tools as part of a "better support for the Draft namespace" task, before making more fundamental changes to article editing or mediawiki. The goal would be to create a set of tools (gadgets, maybe an extension) that would allow one-button "fork to Draft namespace" and "merge from Draft namespace" with a similar UX to how user-friendly fork/merge works in github. Then additional tools can be built to manage edit conflicts between the copy stored in the draft namespace and the current "master".
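
A rough sketch of the one-button flow, using an in-memory toy rather than any real MediaWiki or gadget API. `ToyWiki`, its methods, and the `Draft:` title prefix handling are all assumptions for illustration. The key idea is that the fork remembers which revision of "master" it started from, so the merge step can tell a clean fast-forward from a case that needs conflict tooling:

```python
class ToyWiki:
    """In-memory stand-in for a wiki: each title maps to a list of revisions."""

    def __init__(self):
        self.pages = {}       # title -> list of revision texts
        self.fork_base = {}   # draft title -> (source title, revision index forked from)

    def save(self, title, text):
        self.pages.setdefault(title, []).append(text)

    def latest(self, title):
        return self.pages[title][-1]

    def fork_to_draft(self, title):
        """Copy the current text into the Draft namespace and remember the base."""
        draft = "Draft:" + title
        self.fork_base[draft] = (title, len(self.pages[title]) - 1)
        self.save(draft, self.latest(title))
        return draft

    def merge_from_draft(self, draft):
        """Fast-forward master to the draft, or return False if master moved on."""
        title, base_index = self.fork_base[draft]
        if len(self.pages[title]) - 1 != base_index:
            return False      # master was edited since the fork: needs conflict tools
        self.save(title, self.latest(draft))
        self.fork_base[draft] = (title, len(self.pages[title]) - 1)
        return True


wiki = ToyWiki()
wiki.save("Rice", "v1")
draft = wiki.fork_to_draft("Rice")
wiki.save(draft, "v1 plus etymology")
clean_merge = wiki.merge_from_draft(draft)

wiki.save(draft, "v1 plus etymology and history")
wiki.save("Rice", "someone else's edit")
second_merge = wiki.merge_from_draft(draft)
```

The second merge is refused because the article changed after the fork; that refusal is where the "additional tools to manage edit conflicts" mentioned above would take over.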

Apologies if my terminology is a little bit vague; I haven't spent much time playing around with the Draft namespace yet, so this is just my recap of the approach more experienced editors suggested. Help eagerly accepted! Otherwise this is on my list of "big projects to do before the year is up".

Restricted Application added a subscriber: Luke081515. Apr 27 2016, 3:56 PM

Apologies if my terminology is a little bit vague; I haven't spent much time playing around with the Draft namespace yet, so this is just my recap of the approach more experienced editors suggested. Help eagerly accepted! Otherwise this is on my list of "big projects to do before the year is up".

That makes sense. Currently, the Draft namespace is primarily oriented around new articles that aren't ready for mainspace. However, I think the idea to also use it for a fork/merge workflow is interesting and worth considering.

Arlolra added a subscriber: Arlolra. Nov 8 2016, 5:55 PM