
GSoD 2020 Proposal: Improving Wikimedia's onboarding processes and documentation standards
Closed, Resolved, Public


Profile Information

Name: Bello Gbadebo
IRC nick: Gbahdeyboh
Location: Lagos, Nigeria
Typical working hours: Between 6pm and 2am WAT (UTC+01:00)

I was introduced to open source a few years ago, around the same time my journey in tech began. Ever since, I've been interested in contributing and giving back to the community through open source. It can be a bit overwhelming at times, since you need fairly strong technical knowledge to make code contributions to certain open source projects. But writing code isn't the only way to give back to the community or contribute to open source. I figured this out early on and started contributing my own quota by building communities around open source and fostering their growth.

When this year's GSoD was announced, I realized that one way of contributing to open source I hadn't really explored, aside from building communities and writing code, is writing technical documentation. Although I've written several technical articles and some documentation for projects I've previously worked on, none of which are open source, I strongly feel that participating in a program like GSoD would not only improve my technical writing skills but also give me an opportunity to explore new ways of contributing to open source while learning at the same time.


Wikimedia's documentation is one of the first points of contact for new contributors to Wikimedia projects. In fact, even after getting familiar with the codebase, engineers working on these projects still have to revisit the documentation at regular intervals. The importance of reliable, rich, well-structured, and uniform documentation across all projects therefore cannot be overemphasized.

Over the years, Wikimedia engineers have made several successful attempts to improve the existing documentation in many ways.
Here's a snippet from T203131:

There are a plethora of open tickets requesting new and improved Technical documentation for Toolforge. Improvements to documentation have also been consistently requested in the annual Toolforge survey. Gaps in consistency and documentation have been identified by users and collaborators. Many aspects of Toolforge are well-documented and some are not documented at all. This can lead to an inconsistent experience for technical collaborators.

The above emphasizes the inconsistent nature of the documentation. One reason is that some aspects of Toolforge are documented while others are not documented at all.

Wikimedia has a rich documentation guide, which may be overwhelming for new contributors. I've observed a few loopholes that might cause inconsistencies across the new documentation that documentarians write:

  • The documentation guide provides technical documentation templates that are structured by genre. The examples for these genres mostly link to external documentation resources such as Google, Heroku, GitHub, etc. These external resources follow different standards for writing technical documentation, some of which do not align with Wikimedia's. This might lead documentarians to adopt the styles of the linked external resources, or to mix them with Wikimedia's, which can lead to inconsistent documentation formats.
  • The existing documentation for technical writers might be a bit overwhelming for new documentarians because it contains a lot of links to other resources. I constantly had to open new tabs for the links found on every documentation page I visited.
  • In a chat I had with a Wikimedia engineer, he said: "Some of the docs are written by people who are immersed in the project hence, they can be non-beginner friendly in many cases because the person who wrote the docs are the ones who built the tool." Engineers writing technical documentation for a tool they built isn't a problem in itself, but it can become one if they always assume contributors have the same degree of expertise they do. To further support this point, here is what a GSoC student said about the current Gerrit documentation: "For example the current guide to Gerrit: will take a first-time committer at least 30-45 minutes even if they want to do very simple patches, however a GitHub repository would take 5-10 minutes at most: forking takes a few seconds, a few minutes to commit and another few to submit a pull request." The rest of that thread describes other problems with this documentation, along with some clearly successful attempts to fix them.
  • Another cause of inconsistencies is that it is really difficult to keep track of some of Wikimedia's projects. The New Developers guide clearly states: "The maintainers of each software project are pretty free to choose the infrastructure they prefer. In general, basically all software projects have......." While this allows maintainers to use the tools they are most comfortable with, it makes projects difficult to track, which can lead to inconsistencies in documentation. This discussion thread, read from this point downwards, clearly shows the sort of problems this poses, as maintainers keep some of Wikimedia's projects on their personal accounts.

Project Outline

The few observations highlighted above make it evident that a lot can still be done to improve the experience that new and existing contributors have with existing and newly created documentation.
A standardized documentation style across all of Wikimedia's projects would create uniformity and make the structure of any new documentation contributors visit feel familiar.


  • Improving the technical onboarding process by creating onboarding Micro Tasks for some projects.
  • Improving documentation standards by creating a default starting template for all genres.
  • Based on feedback, identify non-beginner-friendly documentation and help improve it.
  • Improve technical documentation review processes.

Here is my detailed approach to achieving the deliverables above:

Based on feedback, identify non-beginner-friendly documentation and help improve it

The consumers of documentation should be the most important factor we consider when writing or improving technical documentation. It is of utmost importance to understand one's audience and its experience level. In fact, if you aren't sure what sort of audience you will have, it is best to write documentation that takes all experience levels into consideration, beginners included.

Some documentation is written by people who are deeply immersed in the project and is therefore not friendly to beginners and new contributors. I would identify such documentation and work closely with its original author and other engineers to improve it.
These improvements would be based on the feedback gathered. My approach to understanding which documentation isn't beginner friendly, and how best to improve some of it, would involve the following steps:

  • Have discussions with existing users of the documentation.
  • Onboard new contributors and observe/record their experience with the existing documentation (optional).
  • Understand the challenges they faced and the concepts they found difficult to understand.
  • Suggest, discuss, and make changes based on the feedback received.

Improving the technical onboarding process by creating onboarding Micro Tasks for some projects.

I was previously a Google Summer of Code aspirant and made a few contributions. One of the things that brought me up to speed with contributing to Wikimedia was the Micro tasks my mentor created (see T247835). They were very helpful and laid out a clear path to what needs to be learned as a prerequisite to contributing to the codebase. Currently, an attempt is being made to improve the technical onboarding process using this method (see T250830). This is a really good way for new contributors to get their hands dirty quickly and get up to speed with the contribution requirements of specific projects. I am going to take the same approach for certain projects that might be a bit difficult to get started with. I would create Micro tasks that allow new contributors to get started with these projects, with clearly defined success criteria for each task. This would, in turn, improve their onboarding experience with these projects.

Improving documentation standards by creating a default starting template for all genres.

Uniformity across all documentation gives the audience a familiar feel on every Wikimedia documentation page, which is important. The current guidelines for documenters are rich in information and are a good place to start, but a few improvements can still be made. The writing process and genre table define what is expected of each document based on its genre, but the examples linked to are not Wikimedia's documentation and follow separate style guides that might not conform to Wikimedia's.

I plan to create a default template for each documentation genre that documentarians can use to get started. A very good example of such a template can be found here. With a template provided for each genre, documentarians don't have to remember all the style guidelines and structure or keep referring back to them; they just pick a template that already follows the guidelines and edit it to suit their content. This would come in very handy for new documentarians too, as they don't have to worry too much about the existing style guides and can focus on editing the template for the genre they are writing in.
A list of all the genres that templates would be created for can be found here.

Improve technical documentation review processes.

Currently, it's hard to say precisely how technical documentation is reviewed at Wikimedia, as most of the reviewing is done informally. This isn't surprising, since the documentation lives on wikis that anyone can edit. However, this doesn't mean a standardized review process for all documentation shouldn't be put in place.

Standardizing the review process and reviewing documentation properly would greatly improve the quality of the technical documentation written. I aim to create a standard documentation review process that enforces a more organized way of getting feedback on and improving newly created documentation.

My approach has four basic steps, briefly discussed below:

1) Prototype/Self review

The writer of a document is always its first reviewer. In this step, the writer goes thoroughly through their own document and makes sure it follows Wikimedia's documentation style guides, formatting, methodologies, and use of language. This helps the writer identify their own mistakes and makes sure the document conforms to the style guide. Additionally, new documentation is to be created by strictly following this guide.

2) Peer review (Optional)

In this step, the writer has a colleague review the document informally. This enables the documentarian to get feedback from peers (who might well be users of the documentation) before submitting it for a proper review.

3) Technical review

In this phase, a senior colleague goes through the document and checks the technical accuracy and completeness of its content, and how easy it would be for new contributors to follow. Comprehensive feedback would be gathered here.

4) Approval

At this point, the document has gone through all the review steps and is published.

The four steps above are not described in full detail, as this is a proposal. A document would be created that explains these processes extensively (they are subject to change for now). It would serve as a reference for documentarians on what the review process is, how to get started, and how to submit a documentation draft for review.

Summary (TL;DR)

The above proposal aims to achieve the following:

  • Based on feedback, identify non-beginner-friendly documentation and help improve it.
  • Create onboarding Micro Tasks for some projects, which in turn improves the onboarding process for new contributors.
  • Improve documentation standards by creating a default starting template for all genres. This would, in turn, make sure documentarians follow the style guide and can easily get started with projects.
  • Improve technical documentation review processes, define new processes, and find a way to make sure they are implemented.

The four deliverables are focused on two major things:

  1. Improving onboarding processes (Micro tasks, survey).
  2. Improving documentation standards (starting templates, review processes).


Community bonding period (August 17 - September 13)

Week 1 (September 14 - 18)
  • Analyze the project in detail with my mentors.
  • Discuss:
    • How often the tasks should be reviewed.
    • Schedules and a weekly/daily workflow.
    • Tools and resources that can be used.
    • Bi-weekly and daily project reports.
  • Prioritize which genres to work on first during weeks 2-4.
Weeks 2 - 4 (September 21 - October 9)
  • Carefully study Wikimedia's style guide.
  • Define the expectations for the template to be created for each genre.
  • Create templates for the different documentation genres that can help documentarians get started easily.
  • Create specific templates for each genre that conform to the style guide.
  • Discuss methods that can help standardize these guides across all projects.
Weeks 5 - 7 (October 12 - 30)
  • Discuss with documentarians and find out how documentations are currently being reviewed.
  • Propose a documentation review format and discuss it with mentors and documentarians.
  • Create a guide that explains the finalized format in depth.
  • Discuss how the format documented can be enforced on documentarians and across all Wikimedia documentations.
Week 8 (November 2 - 6)
  • Investigate causes of inconsistencies in existing documentation.
  • Interact with documenters and engineers to identify non-beginner-friendly documentation.
Weeks 9 - 10 (November 9 - 20)
  • Suggest, discuss, and make changes based on the feedback received.
Week 11 (November 23 - 27)
  • Work on pending/outstanding tasks.
  • Review all documentation and processes that were improved.
Week 12 (November 30 - December 5)
  • Write a final report.
  • Submit the final report for review and evaluation.
  • Finalize the project and submissions.

Past Contributions

While I'm yet to complete documentation-centric tasks, I've made some attempts at code contributions to MediaWiki.
These contributions were not merged, but sharing them shows that I'm familiar with the technicalities of some of Wikimedia's projects.


Event Timeline

Looking at this as an outsider, I'd say it's mostly a correct summary of our challenges, however I don't understand yet how this task is supposed to improve it (and what exactly "it" is). :P There is always documentation to improve as it's a wiki that anyone can edit (which can make any page better or worse), but the outline and scope above make me think of work for five years instead of a few months.

The Gerrit workflow is less common than the GitHub workflow (and the comparison to GitHub seems to ignore steps such as setting up a GitHub account).
The provided visual example for the Gerrit workflow confuses me: Where am I, as a developer? It looks like I have to start reading from the bottom middle, which does not seem common? What is the difference between a "patch", a "patchset", and a "proposed change"? What do the "no" and "yes" arrows mean which come from the line "git commit --amend /path/to/file"? Why is the "review ?" line only part of the very last patchset, does it mean previous patchsets received no review? Why should I have to create yet another new "patch" (or "proposed change" if there is some difference) if I have not received a +2 review when my previous patchset received a +1 or no review yet at all? Things like these... It's very complicated to explain, as you correctly realized. :)

Also note that every new screenshot increases maintenance costs for years to come. Creating additional docs is one part, keeping them up-to-date another one. :)

Hi @Aklapper, thanks a lot for the feedback.

This proposal is subject to quite a lot of changes. Some things are still unclear to me at this time and will need a lot of research and interaction with my prospective mentors and Wikimedia engineers/contributors to refine. For example, I wasn't considering the maintenance cost that adding more screenshots would incur over the years. There are likely quite a lot of other things I didn't consider as well, which is why the first week of the tentative timeline I included is mainly for discussing and analyzing the project in detail with my prospective mentors, while the second week is for interacting with Wikimedia engineers and contributors.

I would work on making the outline and scope more concise.

The Gerrit workflow diagram I added was not intended to explain Gerrit, but rather to add visual context to the What is Gerrit section here. Readers would still have to read the documentation to understand the workflow, and as you correctly stated, it needs a lot of improvement. I would work on making it better and a bit more comprehensive. The idea was to show how adding diagrams could add more meaning to certain written concepts and make them easier to grasp.

Nice work, @Gbahdeyboh! I think you've put together a very thorough proposal here. My concern is that there's too much in scope to complete in the given timeframe. For example, creating, distributing, and analyzing a survey typically takes longer than one week. I'd like to see you narrow down the scope of the proposal to those deliverables that you think you can achieve during the timeframe, giving you time to explore existing patterns and get feedback.

Additionally, I see that you've called out "Improve technical documentation review processes." as a deliverable, but I don't see any other information about this piece. Can you elaborate on how this fits in to your goals and timeline?

Hi, @apaskulin @Aklapper,

Thanks a lot for the feedback you gave; it was extremely helpful.

I made some changes to the proposal and I would love to know your thoughts on them.

Hi @Gbahdeyboh, I really enjoyed reading your proposal and have a few thoughts.

In terms of deliverables, I'm not sure we are looking so much for what is inconsistent or to concentrate on creating or improving more resources for technical writers. We've covered this ground a fair bit in the last few years -- so I'd probably deemphasize those as the main deliverables.

These following deliverables are really interesting:

  • Based on feedbacks, Identify non-beginner friendly documentation and help improve them.
  • Create Onboarding Micro Tasks for some projects.
  • Create a default starting template for all genres

In particular, I think that creating onboarding micro tasks as part of improving the onboarding process would be really useful. I think a lot of people who are newer to the projects could benefit from beginning with concrete tasks that will help them orient in Wikimedia's Open Source environment.

I was also interested in this "Improve technical documentation review processes," which Alex pointed out. If you plan on working on this, could you tell us a little more in the proposal?

Hi @srodlund, Thanks a lot for the feedback. It was really helpful.
I would work on improving the documentation as highlighted.

Hi @srodlund @apaskulin, I made some updates to my proposal. I would appreciate it if you could kindly review it once again.

So you know what new things to look out for, here are the deliverables I added/edited to narrow the proposal down to what is achievable within the given timeframe.

There are four deliverables:

  • Based on feedback, identify non-beginner-friendly documentation and help improve it.
  • Improving the technical onboarding process by creating onboarding Micro Tasks for some projects.
  • Improving documentation standards by creating a default starting template for all genres.
  • Improve technical documentation review processes.

The first focuses on understanding the challenges that the other three deliverables are trying to solve, getting feedback from engineers and contributors, and using this feedback as action items for the other deliverables.

The second aims to improve the onboarding process for new contributors (for a few projects) by creating engaging Micro Tasks that help them get up to speed with what needs to be learned to make contributions. I understand that I can't create Micro Tasks for all projects, so I plan to create them for just a few, based on the feedback gathered in the first deliverable. This would, in turn, encourage other documentarians to create Micro Tasks for their respective projects.

The third deliverable focuses on creating a universal starting template for all of Wikimedia's documentation, which would enforce style guides and consistency.

The last focuses on defining documentation review processes that help documentarians formally get feedback on their documentation, and hence improve the overall quality of newly created documentation.

The four deliverables are focused on two major things:

  1. Improving onboarding processes (Micro tasks, survey).
  2. Improving documentation standards (starting templates, review processes).

Hence, I am changing the title of this proposal to Improving Wikimedia's onboarding processes and documentation standards.

Please kindly read through the proposal again; I would really appreciate a final review. Thanks!

Gbahdeyboh renamed this task from GSoD 2020 Proposal: Improving existing Wikimedia documentations and defining standards for new and existing ones to GSoD 2020 Proposal: Improving Wikimedia's onboarding processes and documentation standards. Jul 8 2020, 6:10 PM
Gbahdeyboh updated the task description.

@Gbahdeyboh Thank you for the revisions and additions. You did an excellent job synthesizing our feedback and integrating it, making this a very comprehensive proposal! @apaskulin is away from email right now, so she may not answer you before the deadline, but we are both looking forward to discussing this proposal together.

Thanks @srodlund 😊. Your comment made me elated! 😊😊

apaskulin closed subtask T262918: GSoD 2020 as Resolved.
apaskulin closed subtask T262915: GSoD 2020: Week 1 as Resolved.

GSoD 2020 is complete!