
Move integration-castor03.integration.eqiad.wmflabs to a newer cloudvirt machine
Closed, Resolved · Public

Description

The CI jobs rsync some cached materials from a central instance integration-castor03.integration.eqiad.wmflabs and there might be several jobs fetching from it at the same time.

The rsync compression turned out to be a bottleneck (found by Tim in T188375#5378612) and has been removed. However, we are now bandwidth throttled, so I would like to reapply rsync compression.
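The trade-off described above can be sketched with a rough model. It assumes compression and network transfer are pipelined, so the slower of the two stages dominates the total time. The function name and all numbers are hypothetical, chosen only for illustration, and are not measurements from this instance.

```python
def transfer_seconds(size_mb, bandwidth_mb_s, compress_mb_s=None, ratio=1.0):
    """Estimate the time needed to transfer size_mb of data.

    bandwidth_mb_s: available network bandwidth in MB/s.
    compress_mb_s:  compressor throughput in MB/s of input (None = no compression).
    ratio:          compressed size divided by original size.
    """
    if compress_mb_s is None:
        return size_mb / bandwidth_mb_s
    cpu_time = size_mb / compress_mb_s           # time spent compressing
    net_time = size_mb * ratio / bandwidth_mb_s  # time spent sending compressed data
    # Pipelined stages: the slower one bounds the total.
    return max(cpu_time, net_time)


# Hypothetical figures: 1000 MB of cache, 25 MB/s of throttled bandwidth.
print(transfer_seconds(1000, 25))                                # no compression
print(transfer_seconds(1000, 25, compress_mb_s=20, ratio=0.4))   # slow CPU
print(transfer_seconds(1000, 25, compress_mb_s=40, ratio=0.4))   # CPU twice as fast
```

With these made-up figures, compression on the slow CPU is worse than no compression at all, while a CPU twice as fast makes compression a net win.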

However, the instance is on cloudvirt1002, which has slow CPUs, an issue I found this summer (T223971).

So I am wondering whether we could move that instance to a newer cloudvirt with faster CPUs. From some benchmarks I did on T223971, some cloudvirts are at least twice as fast, which would benefit rsync --compress.
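A quick way to compare the compression speed of two hosts is a small zlib micro-benchmark like the sketch below. This is not the benchmark used on T223971. The function name, payload mix, and compression level 6 (zlib's own default) are assumptions made for illustration.

```python
import os
import time
import zlib


def zlib_throughput_mib_s(size_mib=16, level=6, repeats=3):
    """Return the best observed zlib compression throughput in MiB/s."""
    # 1 MiB chunk: half incompressible random bytes, half highly repetitive bytes.
    chunk = os.urandom(512 * 1024) + b"a" * (512 * 1024)
    data = chunk * size_mib
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        zlib.compress(data, level)
        best = min(best, time.perf_counter() - start)
    return size_mib / best


if __name__ == "__main__":
    print(f"zlib level 6: {zlib_throughput_mib_s():.1f} MiB/s")
```

Running the same script on an instance hosted on each cloudvirt gives a rough single-core comparison. Real rsync throughput also depends on disk I/O and on how many jobs fetch at the same time.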

Ideally we would keep the same instance IP, if that is at all possible. Otherwise I will have to prepare some CI configuration changes.

The operation must be done when CI is not too busy, since it is rather disruptive, ideally during the European morning.

Event Timeline

hashar created this task. Sep 11 2019, 6:11 PM
Restricted Application added a subscriber: Aklapper. Sep 11 2019, 6:11 PM
hashar added a project: Cloud-VPS.
hashar added a subscriber: aborrero.

I think last time I synced with @aborrero to have the instance moved.

aborrero triaged this task as Normal priority. Tue, Sep 24, 11:38 AM

Would you mind proposing a destination cloudvirt for this migration?

Other than that, just ping me on IRC when you want this done in the EU morning.

hashar added a comment (edited). Tue, Sep 24, 7:24 PM

> Would you mind proposing a destination cloudvirt for this migration?

I would guess any recent cloudvirt would work. I am just trying to avoid the old cloudvirts, which have mysteriously slow CPUs.

> Other than that, just ping me on IRC when you want this done in the EU morning.

Guess we can do that this Thursday.

Mentioned in SAL (#wikimedia-releng) [2019-10-01T11:13:24Z] <hashar> shutting down integration-castor03 # T232646

Mentioned in SAL (#wikimedia-cloud) [2019-10-01T12:19:10Z] <arturo> migrating integration-castor03 to cloudvirt1021 (T232646)

Change 540108 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] disable castor temporarily

https://gerrit.wikimedia.org/r/540108

Change 540108 merged by jenkins-bot:
[integration/config@master] disable castor temporarily

https://gerrit.wikimedia.org/r/540108

Change 540110 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Revert "disable castor temporarily"

https://gerrit.wikimedia.org/r/540110

hashar closed this task as Resolved. Tue, Oct 1, 12:47 PM

The instance has been moved by @aborrero (thx!)

It is now running on cloudvirt1021, which has fast CPUs.

Change 540110 merged by jenkins-bot:
[integration/config@master] Revert "disable castor temporarily"

https://gerrit.wikimedia.org/r/540110