RFC: Implement ArchCom-affiliated working groups (process inspired by Rust's "subteams")
Closed, Declined (Public)

Description

(2016-06-04: renamed from "subteam" to "affiliated working group")

Problem

The Wikimedia engineering community has traditionally made decisions using a mix of do-ocracy and rough consensus. In recent years, we have augmented this with a formal RFC process driven by the architecture committee. This has helped to broaden discussions, and resulted in a better shared understanding of a range of topics.

However, we are also seeing issues with the current process:

  • Lack of clarity on the overall technical direction of MediaWiki and the Wikimedia platform.
  • Difficulty of scaling the decision making process.
  • Stakeholder involvement & legitimacy.
  • Clarity and transparency of decision making process.

Proposal

(abstract): The Rust community's governance process seems to be especially suitable for our needs and philosophy. In the last year, they faced similar challenges to ours, and managed to scale to 331 RFCs by 127 community members, of which 161 were accepted and merged[1]. While heavily drawing on prior art (Python and IETF in particular), their process seems to be very streamlined for modern online collaboration, and is well documented in a thoughtful RFC.

Full RFC

mw:Requests_for_comment/Governance

Event Timeline

GWicke raised the priority of this task from to Medium.
GWicke updated the task description.
GWicke added a project: Architecture.
GWicke added subscribers: GWicke, RobLa, daniel and 5 others.
GWicke updated the task description.
GWicke added subscribers: bd808, ssastry.
GWicke added a subscriber: Tgr.

Thanks for starting this @GWicke! I've started drafting the Wikimedia Architecture Team budget for FY16-17 (July-June), which has elements we should consider incorporating in this doc. In particular, I'm basically stating my high-level aspirations for next year as:

  • Helping identify existing technical debt in Wikimedia software
  • Avoiding creating new technical debt unnecessarily
  • Ensuring that any new technical debt is incurred as thoughtfully as practical
  • Helping Wikimedia engineers develop better habits to achieve the above
  • Helping build alignment between different engineering groups (encouraging better communication habits between groups and discouraging walls between engineering silos)

@Krinkle and I had a great conversation about this earlier. We are developing better habits in specific domains (e.g. Performance and Security), but what habits should we mutually encourage to avoid general technical debt?

Trialing the RFC process part makes a lot of sense.

I support trialling the full Rust decision-making system (sub-teams, RFC process, ...)

The Rust process looks good enough to at least give it a trial run. One thing I would add is that, to pass, an RFC must evaluate the resources needed to achieve it and where they would come from (e.g. having a patch ready, needing WMF help, having non-WMF volunteers committed to implementing it, etc.). One of the functions of the shepherds would be to assist in that.

The Rust process looks good enough to at least give it a trial run.

Excellent! My instinct is that "make our ArchCom decision process look more like Rust's" is good directional guidance, while being conservative about adopting specific policies until we have a plausible rationale for them. If we all generally agree that "we're more-or-less following Rust's lead" rather than explicitly copying their rules word-for-word, that will help us adapt their rules to our environment in a flexible way.

One thing I would add is that, to pass, an RFC must evaluate the resources needed to achieve it and where they would come from (e.g. having a patch ready, needing WMF help, having non-WMF volunteers committed to implementing it, etc.). One of the functions of the shepherds would be to assist in that.

Hrm... that doesn't seem right. Embedding staffing requirements in the process seems to overcomplicate RFCs. Our engineering community should be realistic about what it comes to consensus on (rather than "the consensus is: everyone gets a pony! \o/"), but we shouldn't allow TechCom to abdicate their responsibility to set a direction and/or propose a goal, regardless of their power in making staffing decisions for WMF or other entities. Certainly, if ArchCom gets into a habit of approving ornate designs and violating the YAGNI principle repeatedly, they'll ruin their credibility, but we also don't want to have a circular dependency between ArchCom decisions and investment decisions when those decisions may rest with different people. Furthermore, we need to give ArchCom the latitude to approve things that are "nice to have" to make them good candidates for internships and other new developer opportunities.

We should strive to make faster ArchCom decisions and maybe have some failsafes to back out decisions that were made before their time. For example, one thing we could do is to give many (all?) RFCs "revisit" dates. A "revisit" date isn't an expiration; it is a date after which we should invite comment and scrutiny whenever someone notices that an RFC is past it. RFCs that rely on timely implementation to prevent vaporware blockage should have reasonably short "revisit" dates on them.

There are a bunch of WikiDev '16 tasks that I've blocked on T124504: Transition WikiDev '16 working areas into working groups. In particular, the working areas (T119018) we had there suggest initial thoughts on working groups. Let's figure out whether the problems identified in each working area are ones we agree make sense as working groups.

Note, there is a significant mismatch of terminology between this task and the current RfC text. "Working groups", "teams" and "areas of focus" were used, but are undefined. The RfC text seems focused on finding "leaders" that would be delegated to decide about RfCs in their own fiefdom.

The RfC talks about groups and sub-groups led by someone. That someone should be like an assignee by default, not a fief ruling an area giving orders to their serfs. Compared to the current situation in the ArchCom where the group doesn't scale and nothing seems to be owned by anyone by default, this change looks advantageous to me.

That someone should be like an assignee by default, not a fief ruling an area

Are you proposing to remove the sentence "Sub-teams are empowered to decide on RFCs within their scope"?

Perhaps we should rephrase that. We definitely want to avoid creating fiefdoms and competing decision making structures. We also want to avoid a situation where ArchCom is expected to micromanage everything, since that doesn't scale.

I think the sensible middle ground is to aspire for ArchCom to largely be a rubberstamping body 99.9% of the time. That 0.1% where a working area isn't living up to the community norms for achieving consensus is where ArchCom would be expected to intervene and assert community norms. We then should treat those cases as ones where we maybe even do a retrospective email to wikitech-l, possibly change or clarify a policy, and move on (hopefully having learned and improved our practice). I say "possibly change" to avoid rule/process bloat.

Note, there is a significant mismatch of terminology between this task and the current RfC text. "Working groups", "teams" and "areas of focus" were used, but are undefined.

I don't object to defining these terms in the RFC if helpful, but they basically use the standard English meaning.

The RfC text seems focused on finding "leaders" that would be delegated to decide about RfCs in their own fiefdom.

This is an inaccurate description of the RFC text.

The RFC is quite clear. The leader does three things:

  1. Keeps the subteam on topic.
  2. Determines initial membership of the subteam (only initial; later membership changes are by consensus within the subteam).
  3. Elevates issues with unclear or broad scope to the core team.

Decisions about actual RFCs are by the subteam, not the subteam leader alone.

I am curious what wording in the RFC text led you to say "'leaders' that would be delegated to decide".

I think the sensible middle ground is to aspire for ArchCom to largely be a rubberstamping body 99.9% of the time.

That is a different model from the current RFC text. The current text says the subteam makes the decision. It doesn't even go across the ArchCom desk. (I think this is right, because it's not scalable for ArchCom to even look at minor RFCs).

It's the subteam leader's responsibility (see above) to elevate RFCs that ArchCom *does* need to act on. The subteam leader is also a core team member, so these communication lines should hopefully be pretty smooth.

Are you proposing to remove the sentence "Sub-teams are empowered to decide on RFCs within their scope"?

Why would we remove that sentence? It says nothing about individual decision-making ability. It says something about team decision-making ability.

That someone should be like an assignee by default, not a fief ruling an area giving orders to their serfs.

I don't think this is quite what the RFC is proposing either. It says the subteam works on a particular area, which doesn't mean the subteam leader is assignee by default.

I think the sensible middle ground is to aspire for ArchCom to largely be a rubberstamping body 99.9% of the time.

That is a different model from the current RFC text. The current text says the subteam makes the decision. It doesn't even go across the ArchCom desk. (I think this is right, because it's not scalable for ArchCom to even look at minor RFCs).

It's the subteam leader's responsibility (see above) to elevate RFCs that ArchCom *does* need to act on. The subteam leader is also a core team member, so these communication lines should hopefully be pretty smooth.

In fairness to @Nemo_bis, I inserted this part after my reply to him:

The core team (ArchCom) serves as the "appeals court" in cases where a subteam has made a decision that hasn't yet achieved wider consensus and is disputed by the wider Wikimedia engineering community. Appeals are raised by means of RFC.

This keeps the administrative load for ArchCom down, but also gives people a recourse if one of the subgroups is running too far afield of their original mandate and/or they aren't living up to larger community standards.

@GWicke: your last edit on wiki takes out the section I wrote about consensus. Here's what it used to say:

Achieving consensus

Sub-teams are still asked to work toward consensus among interested parties with the shared goal of finding the best technical solution to a problem. Sub-teams are encouraged to enlist experts to ensure the sub-team has the right people to set the best direction. The core team (ArchCom) serves as the "appeals court" in cases where a subteam has made a decision that hasn't yet achieved wider consensus and is disputed by the wider Wikimedia engineering community. Appeals are raised by means of RFC.

The current version makes the ArchCom role less clear. It basically suggests that we need to agree on subgroup process up front, and then doesn't make clear what intervention options ArchCom has if the subgroup follows process but fails to get wider consensus. Am I missing something?

Also, the current version removes the link to mw:Consensus. Was that intentional?

@RobLa-WMF: I folded your point on the ArchCom overseeing the working groups into the last point of the previous section:

In case of conflicts, the ArchCom can intervene to ensure that decisions are made in line with the overall project principles and priorities, and the agreed upon processes are followed.

It also states that it is responsible for shutting down working groups, which I think makes it clear that the ArchCom has authority over working groups in case of conflict.

The decision making process explicitly aims for an ideally broad consensus. To quote from the "consensus" section:

One important question is: consensus among which people, exactly? Of course, the broader the consensus, the better. But at the very least, consensus within the members of the subteam should be the norm for most decisions. If the core team has done its job of communicating the values and priorities, it should be possible to fit the debate about the RFC into that framework and reach a fairly clear outcome.

It also spells out what to do in case of disagreement:

For the "deep" case, the subteam leader is empowered to make a final decision, but should consult with the rest of the core team before doing so.

All this is part of the process referenced in the working group section of the RFC.

Also, the current version removes the link to mw:Consensus. Was that intentional?

No, that was not intentional. I added a section spelling out what to do in case of disagreements, and linked to mw:Consensus from there:

The goal is to reach as wide consensus among discussion participants and working group as possible. In case of lack of consensus in the working group, the group leader can make a decision in consultation with the core team (ArchCom).

Since Swift (the language, not the file store) and Rust are competing for the same developer mindshare, it's not surprising that Swift also has a consensus-oriented model: Swift Evolution Process. I'm curious if there is any significant difference between the Swift process and the Rust process, and if so, which is better? I haven't read the two closely enough to be able to spot the significant differences. Thoughts?

Similarities:

  • Core team
  • Consensus-based process
  • Review or Final Comment Period, both typically lasting a week.

Differences:

  • Swift has only a core team that ultimately decides everything. Rust has sub-teams that decide most things (limited to their scope).
  • Swift explicitly asks you to discuss the idea publicly before proposing an RFC.
  • Swift encourages prototyping the implementation and uses of it when submitting or developing a proposal.
  • Instead of established sub-teams and sub-team leaders, Swift has a review manager (also from the core team). This review manager assesses consensus, but among the core team.

The Swift model has some bits and pieces we could consider using (e.g. proposal states, list of commonly rejected proposals).

But Rust addresses scalability, Swift does not (it just formalizes a similar process to the one we already have).

Rob, MZMcBride and All,

And what's a shepherd in relation to the facilitator of any specific team or subteam (if this is relevant)?

Thanks, Stas (above), for the summary of the Rust decision-making process - https://github.com/rust-lang/rfcs/blob/master/text/1068-rust-governance.md#decision-making. (Is this also possibly relevant - https://oqi.wisc.edu/resourcelibrary/uploads/resources/Project_Prioritization_Guide_v_1.pdf?)

Thank you,
Scott

In T123606#1996092, @Mattflaschen wrote:

But Rust addresses scalability, Swift does not (it just formalizes a similar process to the one we already have).

Perhaps, but I don't yet see momentum gathering around any one particular working group. People seem to love them in theory, but so far in the discussions about actually forming working groups, there doesn't seem to be a lot of support. See T124504, T119030, and T119162.

My fear is that what people really want from ArchCom is a place where we can officially bless solutions looking for a problem, rather than forming working groups tasked with taking an open-minded approach to solving a problem.

In T123606#1996092, @Mattflaschen wrote:

But Rust addresses scalability, Swift does not (it just formalizes a similar process to the one we already have).

Perhaps, but I don't yet see momentum gathering around any one particular working group.

I think Front-end-Standards-Group essentially is a subteam by another name, Content Format seems like a plausible subteam, and we'll see regarding other topics.

I think the scaling (not having ArchCom do everything) is the most important part of this proposed change.

RobLa-WMF renamed this task from "WIP RFC: Improving and scaling our technical decision making process" to "RFC: Implement ArchCom subteams (process inspired by Rust's)". Mar 16 2016, 11:42 PM
RobLa-WMF updated the task description.
In T123606#2016852, @Mattflaschen wrote:

Perhaps, but I don't yet see momentum gathering around any one particular working group.

I think Front-end-Standards-Group essentially is a subteam by another name, Content Format seems like a plausible subteam, and we'll see regarding other topics.

That's effectively true. However, I haven't yet figured out how to make this official, since there isn't consensus in TechCom that the Front-end-Standards-Group is a subteam by the Rust-inspired standard. @Krinkle and @Catrope do fantastic work shepherding appropriate RFCs through, and @Volker_E has been very supportive of the process. The last time @Volker_E, @GWicke, and I met on the subject, @GWicke was skeptical. T119162#1980692 is where I documented the last conversation we had about it.

I think the scaling (not having ArchCom do everything) is the most important part of this proposed change.

Absolutely. The way that I'm thinking about it now, I think TechCom is the best place for a relative outsider to ask the question "what does the Wikimedia developer community think of X?", where X can be a pretty broad set of things. The answer that TechCom strives for (when an answer is needed) is "the community consensus is Y". That consensus is often IETF-style "rough consensus" which may not constitute a unanimous or even majority view.

RobLa-WMF mentioned this in Unknown Object (Event). May 4 2016, 7:33 PM
RobLa-WMF renamed this task from "RFC: Implement ArchCom subteams (process inspired by Rust's)" to "RFC: Implement ArchCom-affiliated working groups (process inspired by Rust's "subteams")". Jun 4 2016, 8:47 PM
RobLa-WMF updated the task description.

Z362, Z411, and Z425 are attempts to use Phab's Conpherence feature as a means of having focused conversations around a topic (rather than a task).

The challenge this quarter has been finding someone with the bandwidth, inclination and capability to lead a group. We haven't yet figured out how to franchise the TechCom model. ;-) We've been focused on Security in the past quarter due to staffing necessity.

@mmodell just made me aware of the ZeroMQ 42/C4 process (thanks Mukunda!). I haven't read this to see what distinguishes it from the others, but at first glance, it appears worth adding to the "Prior art" list.

Krinkle edited projects, added TechCom; removed TechCom-RFC, Developer-notice.