Quibble should clone repositories in parallel
Closed, Resolved · Public

Description

Quibble clones each of the repositories serially. It does so by invoking zuul-cloner, which clones them one at a time in a plain foreach loop.

Potentially we could use a thread pool, or use git submodules and delegate the parallelization to git.

GOTCHA: there might be a name clash when cloning. E.g. if one clones mediawiki/extensions/Foo first, that creates the directory ./extensions, which then prevents us from cloning mediawiki/core into the parent directory (iirc). So the order of cloning does matter.
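
One way to combine the thread pool idea with the ordering constraint is sketched below. It is only an illustration, not the code that ended up in Quibble: `clone_all`, its `clone` callable and the `first` parameter are hypothetical names, and the sketch assumes that cloning mediawiki/core before everything else is enough to avoid the directory clash.

```python
from concurrent.futures import ThreadPoolExecutor


def clone_all(repos, clone, workers=4, first="mediawiki/core"):
    """Clone `first` serially, then the remaining repositories in parallel.

    `clone` is a hypothetical callable that takes a repository name; in
    real use it would run `git clone` into the matching directory.
    """
    remaining = list(repos)
    done = []
    if first in remaining:
        # The repository owning the parent directory has to exist before
        # anything gets cloned underneath it.
        clone(first)
        done.append(first)
        remaining.remove(first)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every clone and re-raises any worker exception.
        list(pool.map(clone, remaining))
    done.extend(remaining)
    return done
```

Passing the clone function in as a parameter keeps the ordering logic testable without touching git or the network.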

Event Timeline

hashar created this task. Dec 11 2018, 4:28 PM
Restricted Application added a subscriber: Aklapper. Dec 11 2018, 4:28 PM

Change 493764 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/quibble@master] Support to clone repositories in parallel

https://gerrit.wikimedia.org/r/493764

hashar moved this task from Backlog to In progress on the Quibble (marble) board. Mar 1 2019, 10:24 PM
greg assigned this task to hashar. Mar 5 2019, 12:39 AM
greg moved this task from Backlog to In-progress on the Release-Engineering-Team (Kanban) board.

Change 493764 merged by jenkins-bot:
[integration/quibble@master] Support to clone repositories in parallel

https://gerrit.wikimedia.org/r/493764

@hashar thanks for adding this feature! Have you considered making the default number of workers 8/16/whatever?

It definitely should default to some higher number of workers. The exact value is left to be defined.

I have let it default to 1 for now so that it still relies on the old code:

    if workers == 1:
        return zuul_cloner.execute()
    # else: slightly reimplemented code with parallelism

So when doing the upgrade on CI, no change is introduced. Then I can later enable it by turning on the feature flag --git-parallel. If that works fine, it should definitely get a useful default (8 sounds good) :]

So I am playing it safe :]
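
For illustration, the opt-in behaviour described above could be wired up roughly as follows. This is a minimal sketch and not Quibble's actual argument parsing: only the `--git-parallel` flag and its default of 1 come from this thread, while `parse_args` and `pick_strategy` are hypothetical helpers.

```python
import argparse


def parse_args(argv):
    """Parse the command line; only --git-parallel is modelled here."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--git-parallel",
        type=int,
        default=1,  # 1 keeps the old serial zuul-cloner code path
        help="number of workers used to clone repositories",
    )
    return parser.parse_args(argv)


def pick_strategy(workers):
    """Return which code path would be taken for a given worker count."""
    return "serial" if workers == 1 else "parallel"
```

With no flag the serial path is kept, so upgrading changes nothing until a job passes `--git-parallel 8` explicitly.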

very sensible :)

hashar added a comment. Apr 2 2019, 9:54 AM

So this is possible since Quibble 0.0.31 (0.0.30 is broken). Still have to update the Jenkins jobs to pass --git-parallel 8.

Change 500689 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Clone repos in parallel in wmf-quibble

https://gerrit.wikimedia.org/r/500689

Change 500689 merged by jenkins-bot:
[integration/config@master] Clone repos in parallel in wmf-quibble

https://gerrit.wikimedia.org/r/500689

I have updated the wmf-quibble jobs for starters. I will look later at generalizing the parameter to all jobs, and will probably just make it the default in a future version of Quibble.

Change 501398 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Add --git-parallel=8 to most Quibble jobs

https://gerrit.wikimedia.org/r/501398

hashar added a comment. Apr 4 2019, 8:12 PM

What is left to do, I guess, is to have the command default to 8 worker threads, which should be fine?! Maybe.

Change 501398 merged by jenkins-bot:
[integration/config@master] Add --git-parallel=8 to most Quibble jobs

https://gerrit.wikimedia.org/r/501398

It seems to be working fine. I will make it the default in a future Quibble version.

Change 503003 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/quibble@master] Default to use 4 git workers

https://gerrit.wikimedia.org/r/503003

hashar closed this task as Resolved. Apr 11 2019, 7:40 PM

I think that is good enough for now. We can raise the --git-parallel value later on.

Change 503003 merged by jenkins-bot:
[integration/quibble@master] Default to use 4 git workers

https://gerrit.wikimedia.org/r/503003

Mentioned in SAL (#wikimedia-releng) [2019-06-24T10:09:48Z] <hashar> Tag Quibble 0.0.32 @ c0fe6eb # T211701 T218357 T220199 T223752