Remove pipeline training code from research/mwaddlink
Open, Low, Public

Description

Historically, the research/mwaddlink repository on Gerrit contained two related-but-distinct codebases:

  1. The training code to generate Add-Link-Structured-Task models
  2. The Link Recommendation service to generate recommendations based on those models

With the migration to Airflow, this has changed. The repository is now used only for the Link Recommendation service itself. The model training code lives in the Airflow DAGs repository on GitLab (more specific link).

In a discussion, @Urbanecm_WMF and @OKarakaya-WMF (Ozge) decided to remove the pipeline training code from research/mwaddlink and replace it with a link to the new location, to avoid confusion (example patch from a confused author).

Event Timeline

Change #1237917 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[research/mwaddlink@main] [WIP] cleanup: Remove training code

https://gerrit.wikimedia.org/r/1237917

Let's start with removing the code itself. I'll work on cleaning up requirements.txt (and hopefully the overall size of the image) once we know the code removal doesn't break anything.

Change #1237917 merged by jenkins-bot:

[research/mwaddlink@main] cleanup: Remove training code

https://gerrit.wikimedia.org/r/1237917

Change #1240045 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/deployment-charts@master] linkrecommendation: Bump version

https://gerrit.wikimedia.org/r/1240045

Change #1240045 merged by jenkins-bot:

[operations/deployment-charts@master] linkrecommendation: Bump version

https://gerrit.wikimedia.org/r/1240045

@Urbanecm_WMF, was this work concluded? I remember we had considered working on it in a certain sprint.