ToolLabs git clone is really slow
Closed, Resolved · Public

Description

I created a Tool Labs account on 1 May, which was approved by Tim Landscheidt.

I've created a tool (catimages) for my GSoC project and was trying to set up pywikibot as a test bed. When I try to clone pywikibot or mediawiki-core, the clone hangs at "Cloning into 'mw-core'..." for a long time and then proceeds at ~10-20 KB/s.

Sometimes it has taken 2 hours for the clone to start, other times 30 minutes. Sometimes the clone also hangs again partway through.

Is that expected? I found it quite difficult to clone pywikibot: I had to restart the clone 3-4 times and wait ~1-2 hours each time before I got one successful attempt.

Event Timeline

AbdealiJK created this task. May 3 2016, 4:54 AM
Restricted Application added a project: Cloud-Services. May 3 2016, 4:54 AM
Restricted Application added subscribers: Zppix, Aklapper.
jayvdb added a subscriber: jayvdb.

Just noting that https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation/Labs does suggest doing a clone.

Over at https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Developing#Using_the_shared_Pywikibot_files_.28recommended_setup.29 there are instructions for using a shared version, but that isn't helpful if modifications to the code are needed.

Restricted Application added a subscriber: pywikibot-bugs-list. May 3 2016, 6:43 AM

The above was done using https://. I was not planning on pushing from Tool Labs and was only planning on pulling whenever I make a change on my local system.

Over on IRC (Cloud-Services), mutante mentioned that this may be because of the migration of the software hosting the git repos.

I finally got mediawiki-core to clone successfully. I timed it using the time command, and here is the data:

$ time git clone --recursive https://gerrit.wikimedia.org/r/p/mediawiki/core.git mw-core
Cloning into 'mw-core'...
remote: Counting objects: 95071, done
remote: Finding sources: 100% (22048/22048)
remote: Getting sizes: 100% (2724/2724)
remote: Compressing objects:  99% (79996/79997)
remote: Total 627104 (delta 13522), reused 624884 (delta 12608)
Receiving objects: 100% (627104/627104), 239.90 MiB | 618.00 KiB/s, done.
Resolving deltas: 100% (519784/519784), done.
Checking connectivity... done.
Checking out files: 100% (6201/6201), done.

real	173m37.029s
user	9m37.520s
sys	29m47.641s

It took nearly 3 hours (173 minutes).

I have had the same experience: on a normal medium Labs instance, git clone is about two or three times faster than on the toollabs bastion.

bd808 added a subscriber: bd808. May 22 2016, 4:37 AM

Git operations are fairly file-system intensive. Many files are created, deleted, read, and/or stat'ed when working with a large repository. When these operations happen on a Tool Labs host, they happen over NFS so that the resulting files are available on all grid hosts. Most other Labs projects do not use NFS and instead rely on host-local storage, which is much faster. File stat operations are especially slow over NFS, which can have a particularly large impact on git status and git add.

Labs admins are constantly working to improve the speed and reliability of the NFS servers, but there is only so much that can be done. There are some tips on StackOverflow for improving some git operations over NFS by tuning your git configuration.
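
As a rough illustration of that kind of tuning, here is a minimal sketch of per-repository settings that are often suggested for slow or networked filesystems. These are not necessarily the settings from the StackOverflow tips mentioned above, and whether they help on the Tool Labs NFS setup is an assumption that would need measuring. The function name tune_git_for_nfs is hypothetical.

```shell
# Hypothetical helper: apply a few commonly suggested settings to one repository.
# Usage: tune_git_for_nfs [PATH_TO_REPO]   (defaults to the current directory)
tune_git_for_nfs() {
    repo="${1:-.}"
    # Preload the index in parallel for operations like git diff and git status
    # (already the default in recent Git versions).
    git -C "$repo" config core.preloadindex true
    # Cache the result of the untracked-file scan between runs (Git 2.8 or newer).
    git -C "$repo" config core.untrackedCache true
    # Skip the untracked-file scan in git status. Trade-off: untracked files
    # are no longer listed unless you run `git status -unormal`.
    git -C "$repo" config status.showUntrackedFiles no
}
```

The settings are written to the repository's own config, so they can be reverted individually with `git config --unset`.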

A trick that you can use to speed up cloning is to make your initial clone on a host-local filesystem (like /tmp) and then move the whole git repo to its final home on an NFS partition.

$ time git clone --recursive https://gerrit.wikimedia.org/r/p/mediawiki/core.git mw-core-bd808-test
Cloning into 'mw-core-bd808-test'...
remote: Counting objects: 94277, done
remote: Finding sources: 100% (21800/21800)
remote: Getting sizes: 100% (2537/2537)
remote: Compressing objects:  99% (79440/79441)
remote: Total 631706 (delta 13485), reused 629664 (delta 12586)
Receiving objects: 100% (631706/631706), 251.83 MiB | 9.90 MiB/s, done.
Resolving deltas: 100% (522984/522984), done.
Checking connectivity... done.
Checking out files: 100% (6321/6321), done.

real    3m14.461s
user    5m6.477s
sys     1m5.384s
tools-bastion-02.tools:/tmp
bd808$ time mv mw-core-bd808-test ~bd808

real    4m17.739s
user    0m0.668s
sys     0m28.939s

About 7.5 minutes in total (3m14s to clone plus 4m18s to move) instead of the horrible 173 minutes (nearly 3 hours) that @AbdealiJK saw when operating directly on NFS.
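
The two manual steps in the transcript above could be wrapped in a small shell function. This is only a sketch: the name clone_via_local is hypothetical, and it assumes that /tmp (or $TMPDIR) is host-local storage with enough free space for the whole repository.

```shell
# Hypothetical helper: clone on host-local storage, then move the result to NFS.
# Usage: clone_via_local URL DEST [extra git-clone options...]
clone_via_local() {
    if [ "$#" -lt 2 ]; then
        echo "usage: clone_via_local URL DEST [git-clone options...]" >&2
        return 2
    fi
    url="$1"
    dest="$2"
    shift 2
    if [ -e "$dest" ]; then
        echo "clone_via_local: $dest already exists" >&2
        return 1
    fi
    # Assumption: this scratch directory lives on a host-local filesystem.
    scratch="$(mktemp -d "${TMPDIR:-/tmp}/clone.XXXXXX")" || return 1
    if git clone "$@" "$url" "$scratch/repo" && mv "$scratch/repo" "$dest"; then
        rmdir "$scratch"
    else
        # Remove the partial clone so that a retry starts from scratch.
        rm -rf "$scratch"
        return 1
    fi
}
```

For example, `clone_via_local https://gerrit.wikimedia.org/r/p/mediawiki/core.git ~/mw-core --recursive` would mirror the commands shown above. The move still has to copy every file onto NFS, so it accounts for a large share of the total time.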

@bd808 That is awesome!
It would be useful to add this to the Gotchas section: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs#Gotchas

AbdealiJK closed this task as Resolved. May 22 2016, 4:58 AM
AbdealiJK claimed this task.

I've added the suggestion mentioned by @bd808 to the wiki: https://wikitech.wikimedia.org/w/index.php?title=Help%3ATool_Labs&type=revision&diff=551995&oldid=493235

I'm closing this as it's the expected behaviour and not a bug per se.