Quibble should clone repositories in parallel
Closed, Resolved (Public)

Description

Quibble clones each of the repositories serially. It does so by invoking zuul-cloner, which clones them one after another in a very naive foreach loop.

Potentially we could use a threadpool or use git submodules and delegate parallelization to it.

GOTCHA: there might be a name clash when cloning. E.g. if one clones mediawiki/extensions/Foo first, that creates the directory ./extensions and then prevents us from cloning mediawiki/core in it (iirc). So the order of cloning does matter.
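A minimal sketch of the thread pool idea, taking the ordering gotcha into account. It assumes each repository maps to a destination path and that a destination containing another one (such as the mediawiki/core checkout containing ./extensions/Foo) must be cloned first. The names `split_by_nesting`, `clone_all` and the `clone` callback are hypothetical and are not taken from Quibble or zuul-cloner.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePosixPath


def split_by_nesting(dests):
    """Return (parents, leaves) for a {repo name: destination} mapping.

    A destination that contains another destination is a "parent" and
    has to be cloned before the repositories nested inside it.
    """
    paths = {name: PurePosixPath(dest) for name, dest in dests.items()}
    parents, leaves = [], []
    for name, path in paths.items():
        contains_other = any(
            path != other and path in other.parents
            for other in paths.values()
        )
        (parents if contains_other else leaves).append(name)
    # Shallowest first, so an outer checkout exists before an inner one.
    parents.sort(key=lambda name: len(paths[name].parts))
    return parents, leaves


def clone_all(dests, clone, workers=4):
    """Clone parents serially, then the remaining repos in a thread pool.

    `clone(name, dest)` is supplied by the caller (hypothetical here);
    in practice it would run something like `git clone`.
    """
    parents, leaves = split_by_nesting(dests)
    for name in parents:
        clone(name, dests[name])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so worker exceptions are re-raised.
        list(pool.map(lambda name: clone(name, dests[name]), leaves))
```

Passing the clone step in as a callback keeps the ordering logic separate from the git invocation, so the ordering can be exercised with a fake function.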

Event Timeline

Change 493764 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/quibble@master] Support to clone repositories in parallel

https://gerrit.wikimedia.org/r/493764

Change 493764 merged by jenkins-bot:
[integration/quibble@master] Support to clone repositories in parallel

https://gerrit.wikimedia.org/r/493764

@hashar thanks for adding this feature! Have you considered making the default number of workers 8/16/whatever?

It definitely should default to some higher number of workers. The exact value is left to be defined.

I have let it default to 1 for now, so that it still relies on the old code:

    if workers == 1:
        return zuul_cloner.execute()
    # else: slightly reimplemented code with parallelism

So when doing the upgrade on CI, no change is introduced. Then I can later enable it by turning on the feature flag --git-parallel. If that works fine, it should definitely have a useful default (8 sounds good) :]

So I am playing it safe :]
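A rough sketch of how such a feature flag could be wired up, for illustration only. The flag name --git-parallel and the default of 1 come from the comment above; the use of argparse, the helper names `parse_args` and `clone`, and the two callbacks are assumptions, not Quibble's actual code.

```python
import argparse


def parse_args(argv):
    """Parse a --git-parallel option that defaults to 1 (serial)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--git-parallel", type=int, default=1, metavar="N",
        help="number of workers used to clone repositories",
    )
    return parser.parse_args(argv)


def clone(workers, serial_clone, parallel_clone):
    """Dispatch to the old code path unless parallelism was requested.

    Both callbacks are hypothetical stand-ins: `serial_clone` for the
    existing zuul-cloner call, `parallel_clone` for the new code.
    """
    if workers == 1:
        return serial_clone()
    return parallel_clone(workers)
```

With a default of 1, upgrading alone changes nothing; only jobs that pass the flag explicitly exercise the new path.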

very sensible :)

So this is possible since Quibble 0.0.31 (0.0.30 is broken). Still have to update the Jenkins jobs to pass --git-parallel 8.

Change 500689 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Clone repos in parallel in wmf-quibble

https://gerrit.wikimedia.org/r/500689

Change 500689 merged by jenkins-bot:
[integration/config@master] Clone repos in parallel in wmf-quibble

https://gerrit.wikimedia.org/r/500689

I have updated the wmf-quibble jobs for a start. I will look later at generalizing the parameter to all jobs, and will probably just make it the default in a future version of Quibble.

Change 501398 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Add --git-parallel=8 to most Quibble jobs

https://gerrit.wikimedia.org/r/501398

I guess what is left to do is to have the command default to 8 worker threads, which should be fine?! Maybe.

Change 501398 merged by jenkins-bot:
[integration/config@master] Add --git-parallel=8 to most Quibble jobs

https://gerrit.wikimedia.org/r/501398

It seems to be working fine; I will make it the default in a future Quibble version.

Change 503003 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/quibble@master] Default to use 4 git workers

https://gerrit.wikimedia.org/r/503003

I think that is good enough for now. We can raise the --git-parallel value later on.

Change 503003 merged by jenkins-bot:
[integration/quibble@master] Default to use 4 git workers

https://gerrit.wikimedia.org/r/503003