
Source revision is in Phabricator, but can't be found by deployment tools
Closed, Resolved, Public

Description

In T181661#3837059 we discovered that a deployment failure was caused by a missing source revision, rOEQ15d5283b7422 in the editquality repo.

It is present in Phabricator: https://phabricator.wikimedia.org/source/editquality/browse/master/;15d5283b7422919d85203b5ba907027f9356e421

Event Timeline

OK, I'm going to try to summarize what we learned. After quite a lot of manual poking at the ores1001 target we formed a hypothesis, attempted a workaround based on it, and finally re-ran a deploy, which succeeded.

  1. On tin, rOEQ15d5283b7422 existed only at the remote-tracking HEAD (refs/remotes/origin/HEAD) and was not on any local branch of the submodule's deployment repo.
  2. On the target, after scap has already updated .gitmodules and run git submodule sync, we see the following refspec in the submodule's .git/config:
[remote "origin"]
        url = http://tin.eqiad.wmnet/ores/deploy/.git/modules/submodules/editquality
        fetch = +refs/heads/*:refs/remotes/origin/*

So fetching from origin only fetches tin's local branches (refs/heads/*); it will not see tin's remote-tracking branches (refs/remotes/origin/*).
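To make the refspec behaviour concrete, here is a minimal Python sketch of how a fetch refspec maps refs on the source (tin) to refs on the target. The helper name map_ref is hypothetical, and the matching is deliberately simplified to a single trailing `*` on the source side; it is an illustration, not git's actual implementation.

```python
def map_ref(refspec, source_ref):
    """Return the local ref that source_ref maps to, or None if the refspec ignores it.

    Simplified: only a single trailing '*' wildcard is handled.
    """
    spec = refspec.lstrip("+")          # a leading '+' only means "allow non-fast-forward"
    src, dst = spec.split(":", 1)
    if src.endswith("*"):
        prefix = src[:-1]
        if not source_ref.startswith(prefix):
            return None                 # ref is outside the pattern, so it is not fetched
        return dst.replace("*", source_ref[len(prefix):], 1)
    return dst if source_ref == src else None


REFSPEC = "+refs/heads/*:refs/remotes/origin/*"

print(map_ref(REFSPEC, "refs/heads/master"))         # refs/remotes/origin/master
print(map_ref(REFSPEC, "refs/remotes/origin/HEAD"))  # None
```

Only refs under refs/heads/ on tin match the pattern, so anything that tin knows solely through its own remote-tracking refs is not covered by this refspec.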

git for-each-ref output on tin:

f1c517c35a1a6bfc852b74cd6f4c6c6eb8f367ed commit	refs/heads/master
d4bd3e922a4141ae82d21caef35d37d4464336b7 commit	refs/remotes/origin/CoC
15d5283b7422919d85203b5ba907027f9356e421 commit	refs/remotes/origin/HEAD
2203c6ae71fdec1a0b919ff8ebb21f5b41fa2ad6 commit	refs/remotes/origin/Pix1234-patch-2
fb2c5a3e815a61141257757ffd4cd5d71f55a7d3 commit	refs/remotes/origin/adds_urdu

Note that, from the perspective of the repo on tin, the commit in question (rOEQ15d5283b7422) exists only at refs/remotes/origin/HEAD. For that repo, origin is ssh://vcs@git-ssh.wikimedia.org/source/editquality.git
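The same check can be sketched in Python against the listing above. The function name refs_pointing_at is hypothetical. The sketch only compares ref tips with the commit hash; it does not walk history, so a commit that is an ancestor of a branch tip would not be detected this way.

```python
# The for-each-ref output quoted above, as "<sha> <type>\t<ref>" lines.
FOR_EACH_REF = (
    "f1c517c35a1a6bfc852b74cd6f4c6c6eb8f367ed commit\trefs/heads/master\n"
    "d4bd3e922a4141ae82d21caef35d37d4464336b7 commit\trefs/remotes/origin/CoC\n"
    "15d5283b7422919d85203b5ba907027f9356e421 commit\trefs/remotes/origin/HEAD\n"
    "2203c6ae71fdec1a0b919ff8ebb21f5b41fa2ad6 commit\trefs/remotes/origin/Pix1234-patch-2\n"
    "fb2c5a3e815a61141257757ffd4cd5d71f55a7d3 commit\trefs/remotes/origin/adds_urdu\n"
)


def refs_pointing_at(listing, sha_prefix):
    """Return the refs whose tip starts with sha_prefix (tips only, no ancestry walk)."""
    hits = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        sha, _objtype, ref = line.split()
        if sha.startswith(sha_prefix):
            hits.append(ref)
    return hits


hits = refs_pointing_at(FOR_EACH_REF, "15d5283b7422")
local_branches = [r for r in hits if r.startswith("refs/heads/")]

print(hits)            # ['refs/remotes/origin/HEAD']
print(local_branches)  # []
```

An empty local_branches list is the situation described here: the commit is known on tin, but not through any ref that the target's refspec will fetch.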

I advised @awight of the above and suggested the following (IRC log slightly edited for clarity):

<twentyafterfour> awight: I think what needs to happen is that you check out the branch in the submodule on tin so that there is a local ref that can be fetched
I can probably fix scap's refspec to also fetch the remotes, but that'll take an update to the scap package, which will take time to build and deploy

<awight> I’m pretty sure that detached head is the default for submodules tho
So it’s really surprising that this didn’t hit us until today

<twentyafterfour> yeah it is ok if the submodule is in a detached head but the commit that it's pointing to needs to also exist on a branch

<awight> Awesome debugging, that gives me a simple workaround at least, of going into the submodules, checking out master, then back to the root and updating to the correct snapshot.

<twentyafterfour> right that _should_ work

<awight> yep lemme try that

<awight> :D works like a charm.

<twentyafterfour> this may be a change in behavior in scap due to the --reference magic
I think that the old way that we cloned submodules might have ended up with a different refspec (treating the repo as bare vs non-bare is the only thing I can think of)

<awight> When was this change deployed to production scap?
\o/ parallel deployment to ores* just completed perfectly, and in 58 seconds
That’s gonads to the wall.

<twentyafterfour> monday the 11th is when it got published

<awight> That’s gotta be it, then.

So I'm going to figure out what needs to change to make git use the right refspec in the submodule update in deploy-local. Probably going to have to force-update the refspec in git config ...
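As a rough sketch of what adding a refspec through git config could look like (this is an assumption for illustration, not the change that was made to scap), the Python below only builds a `git config --add` command line and does not run it. The destination namespace refs/remotes/tin-origin/* and the submodule path submodules/editquality are hypothetical choices. The destination is kept separate from refs/remotes/origin/* so that the two refspecs never map different refs on tin onto the same local ref.

```python
# Hypothetical extra refspec: also copy tin's remote-tracking refs, into their own namespace.
EXTRA_REFSPEC = "+refs/remotes/origin/*:refs/remotes/tin-origin/*"


def add_fetch_refspec_cmd(repo_dir, remote="origin", refspec=EXTRA_REFSPEC):
    """Build (but do not run) a git command that appends a fetch refspec for `remote`."""
    return ["git", "-C", repo_dir, "config", "--add", f"remote.{remote}.fetch", refspec]


# Assumed submodule checkout path on the target; adjust to the real layout.
cmd = add_fetch_refspec_cmd("submodules/editquality")
print(" ".join(cmd))
```

Passing the resulting list to subprocess.run would apply it. Whether fetching the extra refs is the right fix, rather than ensuring the commit sits on a local branch on tin, is a judgment this sketch does not settle.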

mmodell triaged this task as High priority.
mmodell added a revision: Restricted Differential Revision. (Dec 16 2017, 8:29 AM)