Migrate Data Engineering Pipelinelib repos to GitLab
Closed, Resolved | Public | 8 Estimated Story Points

Description

Repos using the pipeline still on Gerrit:

Once these are migrated:


Migrating Data Engineering maintained NodeJS non PipelineLib repos will be tracked in T366611: Migrate Data Engineering NodeJS library repos to GitLab

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Remove Gerrit gitreview and URl from Gitlab repo | repos/data-engineering/eventstreams!4 | ebysans | version_bump | master
Add Eventstream and Eventgate-Wikimedia project repos to trusted runners | repos/releng/gitlab-trusted-runner!82 | ebysans | repo_to_tested_runners | main
Remove Gerrit gitreview and URl from Gitlab repo | repos/data-engineering/eventstreams!3 | ebysans | Remove_Gerrit | master
Migrate Repository CI from Github to Gitlab | repos/data-engineering/eventgate!1 | ebysans | gitlab_migration | master
Migrate Gerrit PipelineLib to Gitlab Kokkuri | repos/data-engineering/node-rdkafka-factory!1 | ebysans | gitlab_migration | master
Migrate Gerrit PipelineLib to Gitlab Kokkuri | repos/data-engineering/eventstreams!1 | ebysans | gitlab_ci | master
Migrate CI from Gerrit PipelineLib to Gitlab Kokkuri | repos/data-engineering/eventgate-wikimedia!1 | ebysans | gitlab_ci | master

Event Timeline

lbowmaker set the point value for this task to 8. (Mar 25 2024, 5:18 PM)
lbowmaker moved this task from Backlog to Estimated (To be planned) on the Data-Engineering board.

It would be nice to get confirmation on archiving node-rdkafka-statsd, since that would move T349118 forward.

I have imported the following repos to gitlab:

The next step will be to set up the required CI checks on the repos.

@Snwachukwu, for libraries (node-rdkafka-factory and eventgate), we should think about how we want to publish and distribute them. It looks like node-rdkafka-factory is using npm (e.g. here is eventgate depending on it). Publishing to npm is a bit of a pain (and I'd have to dig to find docs...if there are any).

Can we publish and depend on npm packages from GitLab, like we do for Python wheels?

Yes, that's actually what I do for service-utils

Very cool!

@tchin should we make that a reusable gitlab_ci template job in workflow_utils?
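For reference, a minimal sketch of what consuming npm packages from a GitLab project registry could look like, based on GitLab's project-level npm endpoint (`/api/v4/projects/<id>/packages/npm/`). The project ID `1234` and the scope `example` are hypothetical placeholders, and nothing here is confirmed as what service-utils actually does. The helper only renders the `.npmrc` lines so the layout is easy to see:

```python
def gitlab_npmrc(host: str, project_id: int, scope: str) -> str:
    """Render .npmrc lines that point an npm scope at a GitLab project registry.

    host, project_id, and scope are illustrative inputs, not values taken
    from this task.
    """
    registry_path = f"{host}/api/v4/projects/{project_id}/packages/npm/"
    lines = [
        # Route every package under @scope to the GitLab project registry.
        f"@{scope}:registry=https://{registry_path}",
        # The token is left as a literal ${CI_JOB_TOKEN} placeholder so that
        # npm reads it from the environment (for example inside a CI job).
        f"//{registry_path}:_authToken=${{CI_JOB_TOKEN}}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical project ID and scope, for illustration only.
    print(gitlab_npmrc("gitlab.wikimedia.org", 1234, "example"), end="")
```

Keeping the token as an environment placeholder means the same file could be committed without embedding a secret; whether a reusable template job should generate it this way is still an open design question.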

Sure, we can. I will change it right away. I will also change the project name to Eventstreams for uniformity.

Regarding node-rdkafka-statsd: @thcipriani, what does "archive" mean for Release Engineering? Will the repo be deleted, or will it be kept as read-only?

Standard protocol is to blank the repo with a README pointing to this task, and set the repo to read-only, but not delete.

@Jdforrester-WMF thanks for the response. How do we go about this? Do we submit a ticket to Release Engineering, or do we do it ourselves? Would you mind pointing me to a guide for the process while I confirm with the Data Engineering team that this is what we want?

@tchin should we make that a reusable gitlab_ci template job in workflow_utils?

Sure, I'll take a stab at it

Don't forget that any repo whose CI has a production deployment pipeline needs to be added to the trusted runners and also needs its tags protected (Slack thread on protecting tags)
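The thread does not say how tags were protected. As one possible approach, and purely as an assumption, the GitLab REST API exposes `POST /projects/:id/protected_tags`. The sketch below only builds the request and does not send it. The tag pattern `v*`, the access level `40` (Maintainer), and the token value are illustrative choices, not settings confirmed for these repos:

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def protect_tags_request(host: str, project_path: str,
                         tag_pattern: str, token: str) -> Request:
    """Build (but do not send) a GitLab API request that protects tags.

    All argument values used with this helper are hypothetical examples.
    """
    # The API accepts the URL-encoded project path in place of a numeric ID.
    project = quote(project_path, safe="")
    body = urlencode({
        "name": tag_pattern,          # tag name or wildcard to protect
        "create_access_level": 40,    # 40 = Maintainer in GitLab
    }).encode()
    return Request(
        f"https://{host}/api/v4/projects/{project}/protected_tags",
        data=body,
        headers={"PRIVATE-TOKEN": token},
        method="POST",
    )


if __name__ == "__main__":
    req = protect_tags_request(
        "gitlab.wikimedia.org",
        "repos/data-engineering/eventstreams",
        "v*",
        "<token>",  # placeholder; a real token would come from a secret store
    )
    print(req.get_method(), req.full_url)
```

The same setting is also available in the project's repository settings in the GitLab UI, which may be simpler for a one-off change.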

Thanks for the reminder, @tchin. Is there some restriction on the gitlab-trusted-runner repo? I am unable to push to it. @tchin @dancy

Yes, the gitlab-trusted-runner has restricted access, but you can fork the repo, make your changes in the fork, and then create a merge request from your fork into the main repo.

ebysans updated https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/merge_requests/82

Add Eventstream and Eventgate-Wikimedia project repos to trusted runners

@dancy Thank you! I have just done that. See merge request here.

Hm, you know, event schema repos are not 'pipeline lib' repos, but it would be really nice to migrate them to gitlab. cc @Ahoelzl @gmodena

Added jsonschema-tools to list, as it is similar to node-rdkafka-factory and eventgate.

Should I merge and deploy this MR before it gets migrated?

Hm, looks like CI checks for that are failing?

Let's migrate to GitLab first, then move that GitHub PR into a GitLab MR and review it there.

It will take work in gitlab schema repos to make the CI use it anyway.

I have removed the non PipelineLib repos from this task. Migrating those will be tracked in T366611: Migrate Data Engineering NodeJS library repos to GitLab.

@Ottomata Nice! That's a better way to track it.
The 2 repos are now migrated with working pipelines. Next steps will be to update docs and deployment-charts.

@Snwachukwu, can you 'archive' / blank(?) the migrated repos asap? And maybe update their descriptions in gerrit to say they have moved to GitLab with pointers?

SREs are sending code reviews to gerrit still:

https://gerrit.wikimedia.org/r/c/eventgate-wikimedia/+/1040163

cc @elukey

Sure. I will try to do that right away.

Change #1040862 had a related patch set uploaded (by Snwachukwu; author: Snwachukwu):

[operations/deployment-charts@master] Update Eventgate-Wikimedia and Eventstreams repository to Gitlab source and version

https://gerrit.wikimedia.org/r/1040862

I drafted a plan for deploying the Eventgate-Wikimedia and Eventstreams services to the staging and production clusters.

Change #1040862 merged by jenkins-bot:

[operations/deployment-charts@master] Update Eventgate-Wikimedia and Eventstreams repository to Gitlab source and version

https://gerrit.wikimedia.org/r/1040862

Change #1043135 had a related patch set uploaded (by Snwachukwu; author: Snwachukwu):

[integration/config@master] Archive and Remove CI/Zuul config for Eventgate-Wikimedia and Eventstreams

https://gerrit.wikimedia.org/r/1043135

Change #1043135 merged by jenkins-bot:

[integration/config@master] Zuul: Archive and remove CI for eventgate-wikimedia and eventstreams

https://gerrit.wikimedia.org/r/1043135

Change #1043187 had a related patch set uploaded (by Snwachukwu; author: Snwachukwu):

[integration/config@master] Drop custom jobs defined in Jenkins jjb

https://gerrit.wikimedia.org/r/1043187

Change #1043187 merged by jenkins-bot:

[integration/config@master] jjb: remove eventgate-wikimedia and eventstreams

https://gerrit.wikimedia.org/r/1043187

The repos have been archived. Here are the steps I took to archive them:

  1. Add [ARCHIVED] to the eventgate-wikimedia and eventstreams descriptions
  2. Change state to "READ ONLY"
  3. Archive CI/Zuul
  4. Remove CI/Jenkins