The 2017 revision to the RFC process (diff) simplified things enormously. The barrier to entry is now low: the only requirement is to create a task, and there are no unnecessary formalities on the author's part.
However, I also believe this change has not helped RFC authors and stakeholders anticipate what will happen between now and approval.
Objective and scope
Take what we've learned in the past three years, and add a bit of structure back to our process.
- It should be naturally clear what the stakeholders likely want to know about an RFC, who these stakeholders are for a given RFC, and at what point they need to know it. (For example, a solid plan and proof of concept are not needed when the RFC is filed.)
- RFC authors should gain greater autonomy and a quicker overall process by knowing what is happening now and what will happen next, without needing to ask or be told by someone.
This is meant to be a retrospective that informs an iterative change to our process. To constrain ourselves:
- No new requirements for what's needed at initial RFC task creation.
- No new process overhead for RFC authors to follow.
Initial thinking points
- What are the questions that participants (esp. TechCom members) ask on almost every RFC? These are good things to consider adding to a boilerplate, FAQ, or other self-discoverable process.
- What are the steps that participants expect to be followed on most RFCs (e.g. define and consult stakeholders before scheduling meetings)? We should make it hard to mistakenly think that a step has already happened. We should also try to make triage work easier to rotate and transfer within TechCom.
- What steps have we forgotten during past RFCs? What steps should have happened earlier or in a different order? We can amend the process in small ways to optimise against such inefficiencies.
- Remove the Inbox and Backlog stages.
The Inbox was blocking authors early on while they awaited a response from a weekly triage. Instead, there is now a task template that covers what TechCom would typically ask for, as well as upfront documentation of what happens when and in what order.
Our previous process also did not account well for cases where the initial pieces of information (problem statement, resources) are not yet specified. It handled this with a Backlog column placed to the left of the Inbox on the workboard, which made it effectively a negative (-1) phase.
Instead, when the task is filed (using the template), the process starts in "Phase 1: Define", where the focus is on these bits of information. Once they are filled out, the author moves the task forward (no need for TechCom involvement). The same applies to "Phase 2: Resource". Authors are encouraged to proceed through the phases on their own whenever possible.
- Replace "Under discussion" stage with "Explore" and "Tune" phases.
The exploration phase is when stakeholders are notified and asked for input. It is also where proposals should be drafted. Drafting them earlier is still possible, but this is the point where drafting is the explicit focus and where a proposal is needed before moving forward. In the old process it was not clear when proposals were expected, and that uncertainty sometimes led to delays when RFCs were not filed until after the proposal(s) had already been iterated upon elsewhere.
The tuning phase is where the authors and other participants collaborate and iterate on the proposal(s) until the requirements and raised concerns are addressed. This phase now also explicitly mentions the Architecture Principles.
- Last Call changed from "at least one week" to "two weeks".
Two weeks is what we've used in practice, and I suggest we codify this to set a clearer expectation. If there is a reason to wait longer, I suggest we not start the Last Call yet. I've also added that no more than 2 RFCs should be on Last Call at any given time, to ensure stakeholders have enough time to review them and consider their impact.
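To make the Last Call rules concrete, here is a minimal sketch in Python of how they could be checked. It is purely illustrative and not part of any existing tooling: the `Phase` enum, the function names, and the dictionary of RFC names are all hypothetical, and the phase order is only what the description above implies.

```python
from datetime import date, timedelta
from enum import Enum


class Phase(Enum):
    # Phase names taken from the text above; this ordering is an assumption.
    DEFINE = "Define"
    RESOURCE = "Resource"
    EXPLORE = "Explore"
    TUNE = "Tune"
    LAST_CALL = "Last Call"


# Values proposed above: a fixed two-week Last Call, at most 2 at a time.
LAST_CALL_LENGTH = timedelta(weeks=2)
MAX_CONCURRENT_LAST_CALLS = 2


def can_start_last_call(rfc_phases):
    """Return True if another RFC may enter Last Call.

    rfc_phases maps an RFC name (hypothetical identifiers) to its Phase.
    """
    active = sum(1 for phase in rfc_phases.values() if phase is Phase.LAST_CALL)
    return active < MAX_CONCURRENT_LAST_CALLS


def last_call_end(start):
    """Return the date on which a Last Call started on `start` closes."""
    return start + LAST_CALL_LENGTH


if __name__ == "__main__":
    phases = {"rfc-a": Phase.LAST_CALL, "rfc-b": Phase.TUNE}
    print(can_start_last_call(phases))
    print(last_call_end(date(2020, 3, 4)))
```

The cap is checked before a Last Call starts rather than extending one already running, which matches the suggestion to delay the start when more time is needed.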