Migrate and consolidate Research teams' code to Gitlab
Closed, Resolved (Public)

Description

Continue migrating code to research gitlab repositories to increase reusability and discoverability. Where applicable, add gitlab CI pipelines (e.g. for testing, packaging). Assist new team members (former GDI) integrating with research tooling and processes.

Migrated projects

Note: creating a repo for the Organizer Lab has been moved to its own task (T344625). Also, research-mwaddlink was wrongly included in this list; that repo is only mirrored from Gerrit.

Event Timeline

leila triaged this task as High priority.

Weekly updates:

  • The new team members from the former GDI team have been added to the research group on Gitlab
  • The ML Platform team was added to the research group. T341856

Weekly updates:

  • Started collecting a list of existing code / projects that need to be migrated, with a special emphasis on code that is intertwined with PII data

Weekly updates:

  • The creation of a Gitlab repo for T344625 is targeted for October
  • Set up a meeting to collect PII requirements for T342914

Weekly updates

  • Met with Yu-Ming regarding the Organizer Lab codebase and the PII data intertwined with its code. The meeting was very informative, especially regarding PII data handling (T342914)

Updates:

Thanks for preparing this task for Q2, Fabian.

For context for others who may read the changes: Fabian and I discussed the scoping of this task as part of the prioritization discussion for Q2. Two considerations were at the core of the scoping:

  1. All code developed by Research that is actively being used (in production or by the team in other places as part of our WMF work) should be on Gitlab before we can call this task Done.
  2. We want to remain mindful of the research scientist time that we ask to be put on this front to bring legacy work to Gitlab.

We concluded that the scope of this task does not include all past code by the Research team. Instead, it will focus on present or past code that will go to production, is in production, or will require maintenance. There will be some things that we won't touch, and if we run into issues in the future, we will have to do the migration at that point and address those issues. (We remain committed to not doing maintenance and feature work outside of Gitlab.)

@fkaelin if you have done an accounting of code that remains outside of Gitlab when this task is done, it would be very helpful if you linked it from this task description, so that we at least know what remains outside and may need work in the future.

Updates:

  • Work on T344625 will start in Q3 due to other higher priority tasks

@fkaelin acknowledged. This means your goal for the quarter will not be completed by the end of the quarter as we planned. That is fine given the priorities communicated in T344625. However, can you work on an updated timeline with Miriam on this front and ask her/Yu-Ming to update the deadline for T344625? I'd like us to have clarity around when we can consider your work done. (For now I have set the deadline to the end of January, as the previous deadline had already passed and it is clear that this won't happen in Q2.)

fkaelin updated the task description.
fkaelin added subscribers: MGerlach, Isaac.

The remaining known repos have been migrated to gitlab and the github repos archived (cc @Isaac @MGerlach).

The new locations have also been added to the task description.

Regarding the Organizer Lab randomization: as this is not a migration (i.e. there is no existing repo for this new project) and there is already a Phab task for it, I will remove it from this Q2-focused task and close this one as resolved.