
Clean up the deployment host
ClosedPublic

Authored by thcipriani on Dec 9 2016, 1:14 AM.
Subscribers
None

Details

Maniphest Tasks
T112509: scap3 should repack / pack-refs git repos under /srv/deployment
Reviewers
dduvall
mmodell
demon
hashar
Group Reviewers
Release-Engineering-Team
Commits
rMSCA38289c69e15a: Clean up the deployment host
Patch without arc
git checkout -b D503 && curl -L 'https://phabricator.wikimedia.org/D503?download=true' | git apply
Summary

I've been watching beta since we started committing to git for every
deployment, and the git directory gets out of hand pretty
quickly. Leaving loose objects lying around when we flatten the
/srv/mediawiki directory makes for a large /srv/mediawiki/.git
directory.

The flattening of the mediawiki directory has been running for a week on
beta. The size of /srv/mediawiki/.git yesterday was 829 MB; after a
git gc it was 334 MB.

This commit runs git gc --auto for every deployment on
/srv/mediawiki. It also cleans up tags for all scap3 repos,
since a lot of them are hanging around.
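
As a rough illustration of the two cleanup steps described above (not the code from this revision), here is a minimal Python sketch. The function names, the keep count of 10, and the assumption that tag names sort chronologically as strings are all hypothetical.

```python
import subprocess


def gc_auto_command():
    """Return the argv for an opportunistic garbage collection."""
    # `git gc --auto` only does work once git's own thresholds are exceeded.
    return ["git", "gc", "--auto"]


def tags_to_delete(tags, keep=10):
    """Return the tags to remove, keeping the `keep` newest.

    Assumption: tag names sort chronologically as plain strings.
    """
    ordered = sorted(tags)
    if keep <= 0:
        return ordered
    return ordered[:-keep]


def clean_repo(repo_dir, keep=10):
    """Run gc and delete stale tags in repo_dir (hypothetical helper)."""
    subprocess.check_call(gc_auto_command(), cwd=repo_dir)
    out = subprocess.check_output(["git", "tag", "--list"], cwd=repo_dir)
    stale = tags_to_delete(out.decode("utf-8").split(), keep)
    if stale:
        # `git tag -d` accepts several tag names in one call.
        subprocess.check_call(["git", "tag", "-d"] + stale, cwd=repo_dir)
    return stale
```

Keeping the tag selection in a pure function makes the retention rule easy to check without touching a real repository.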

Should fix T112509

Diff Detail

Repository
rMSCA Scap
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

thcipriani retitled this revision from to Clean up the deployment host.
thcipriani updated this object.
thcipriani edited the test plan for this revision.
thcipriani added reviewers: demon, mmodell, dduvall, hashar.

Couldn't we set this in ./.git/config for a given repo so it happens more frequently/aggressively?

On further thought...we probably want to run this frequently, but tune some of the settings so --auto does really useful things each time. Default retention periods for unreferenced stuff (and reflog) could be shorter.

Also I'm finding some rumblings of a gc.autodetach config that runs it in the background so we'd be non-blocking.

These two settings would probably help:

gc.auto = 1000
gc.pruneexpire = now

It seems that gc.autodetach defaults to true when it's available so there is no need to set it.

We might want to experiment with gc.autopacklimit, since that could have a large impact on rsync efficiency (as well as git fetch efficiency once we switch to using git transport).
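
One way the two settings proposed above could be written into a repository's config is sketched below. This is only a sketch: the helper names are hypothetical, the values are the ones floated in this discussion rather than anything confirmed as committed, and gc.autopacklimit is left out because no value was proposed for it.

```python
import subprocess

# Values suggested in the review comments; treat them as a starting point.
GC_SETTINGS = {
    "gc.auto": "1000",
    "gc.pruneexpire": "now",
}


def gc_config_commands(settings=GC_SETTINGS):
    """Return one `git config` argv per setting, in a stable order."""
    return [["git", "config", key, value]
            for key, value in sorted(settings.items())]


def apply_gc_config(repo_dir, settings=GC_SETTINGS):
    """Write the settings into repo_dir/.git/config (hypothetical helper)."""
    for cmd in gc_config_commands(settings):
        subprocess.check_call(cmd, cwd=repo_dir)
```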

mmodell edited edge metadata.
This revision is now accepted and ready to land. Dec 9 2016, 7:18 AM
scap/git.py:204

Maybe also run git reflog expire and set gc.reflogExpireUnreachable=now.
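
The reflog suggestion above could look roughly like the following. It is an assumption-laden sketch, not part of the committed change: the helper names are hypothetical, and expiring unreachable entries with `--expire-unreachable=now --all` is one reading of the comment.

```python
import subprocess


def reflog_cleanup_commands():
    """Return the argv lists for expiring unreachable reflog entries."""
    return [
        # Make later `git gc` runs drop unreachable reflog entries at once.
        ["git", "config", "gc.reflogExpireUnreachable", "now"],
        # Expire them right away for every ref.
        ["git", "reflog", "expire", "--expire-unreachable=now", "--all"],
    ]


def clean_reflog(repo_dir):
    """Run the cleanup commands inside repo_dir (hypothetical helper)."""
    for cmd in reflog_cleanup_commands():
        subprocess.check_call(cmd, cwd=repo_dir)
```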

This revision was automatically updated to reflect the committed changes.