I outlined the situation which this addresses in T182865#3837272:
This is what we (@mmodell and @awight) learned from quite a lot of manual poking at the ores1001 target: we formed a hypothesis, attempted a workaround based on it, and finally re-ran a deploy, which succeeded.
- On tin, 15d5283b7422919d85203b5ba907027f9356e421 only existed at HEAD and was not on any local branch of the submodule's deployment repo.
- On the target, after scap has already updated .gitmodules and run git submodule sync, we see the following refspec in the submodule's .git/config:
[remote "origin"]
    url = http://tin.eqiad.wmnet/ores/deploy/.git/modules/submodules/editquality
    fetch = +refs/heads/*:refs/remotes/origin/*
So fetching from origin will only fetch tin's local branches; it will not see tin's remote-tracking branches.
git for-each-ref of tin:
f1c517c35a1a6bfc852b74cd6f4c6c6eb8f367ed commit refs/heads/master
d4bd3e922a4141ae82d21caef35d37d4464336b7 commit refs/remotes/origin/CoC
15d5283b7422919d85203b5ba907027f9356e421 commit refs/remotes/origin/HEAD
2203c6ae71fdec1a0b919ff8ebb21f5b41fa2ad6 commit refs/remotes/origin/Pix1234-patch-2
fb2c5a3e815a61141257757ffd4cd5d71f55a7d3 commit refs/remotes/origin/adds_urdu
Note that the commit in question (15d5283b) only exists in refs/remotes/origin/HEAD from the perspective of the repo on tin. origin is ssh://vcs@git-ssh.wikimedia.org/source/editquality.git
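To make the refspec argument concrete, here is a minimal sketch in Python of how a fetch refspec selects refs. It is a simplified model, not git's implementation: it ignores symrefs, tags and negative refspecs. The helper names (map_ref, fetchable) and the second, wider refspec are hypothetical and used only for illustration; the wider refspec is not the fix that went into scap.

```python
# Simplified model of how a fetch refspec selects and renames advertised refs.
# Assumption: one optional '*' wildcard per side; symrefs and tags are ignored.

def map_ref(refspec, ref):
    """Return the local name `ref` would be stored under, or None if unmatched."""
    # A leading '+' only permits non-fast-forward updates; it does not affect matching.
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":", 1)
    if "*" not in src:
        return dst if ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    stem = ref[len(prefix):len(ref) - len(suffix)]
    return dst.replace("*", stem, 1)


def fetchable(refspec, advertised):
    """Map each advertised ref matched by the refspec to its commit id."""
    result = {}
    for ref, sha in advertised.items():
        local = map_ref(refspec, ref)
        if local is not None:
            result[local] = sha
    return result


# Refs on tin, copied from the for-each-ref output above.
TIN_REFS = {
    "refs/heads/master": "f1c517c35a1a6bfc852b74cd6f4c6c6eb8f367ed",
    "refs/remotes/origin/CoC": "d4bd3e922a4141ae82d21caef35d37d4464336b7",
    "refs/remotes/origin/HEAD": "15d5283b7422919d85203b5ba907027f9356e421",
    "refs/remotes/origin/Pix1234-patch-2": "2203c6ae71fdec1a0b919ff8ebb21f5b41fa2ad6",
    "refs/remotes/origin/adds_urdu": "fb2c5a3e815a61141257757ffd4cd5d71f55a7d3",
}
WANTED = "15d5283b7422919d85203b5ba907027f9356e421"

# The refspec found in the submodule's .git/config on the target.
TARGET_REFSPEC = "+refs/heads/*:refs/remotes/origin/*"
# Hypothetical wider refspec, for illustration only.
HYPOTHETICAL_REFSPEC = "+refs/remotes/origin/*:refs/remotes/tin-origin/*"

for spec in (TARGET_REFSPEC, HYPOTHETICAL_REFSPEC):
    got = fetchable(spec, TIN_REFS)
    print(spec, "->", sorted(got), "| wanted commit reachable by name:", WANTED in got.values())
```

Under this model the target's refspec matches only refs/heads/master, so the commit that exists solely under refs/remotes/origin/HEAD on tin is never requested.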
I advised @awight of the above and suggested the following (IRC log slightly edited for clarity):
<twentyafterfour> awight: I think what needs to happen is that you check out the branch in the submodule on tin so that there is a local ref that can be fetched
<twentyafterfour> I can probably fix scap's refspec to also fetch the remotes but that'll take an update to the scap package which will take time to build and deploy
<awight> I’m pretty sure that detached head is the default for submodules tho
<awight> So it’s really surprising that this didn’t hit us until today
<twentyafterfour> yeah it is ok if the submodule is in a detached head but the commit that it's pointing to needs to also exist on a branch
<awight> Awesome debugging, that gives me a simple workaround at least, of going into the submodules, checking out master, then back to the root and updating to the correct snapshot.
<twentyafterfour> right that _should_ work
<awight> yep lemme try that
<awight> :D works like a charm.
<twentyafterfour> this may be a change in behavior in scap due to the --reference magic
<twentyafterfour> I think that the old way that we cloned submodules might have ended up with a different refspec (treating the repo as bare vs non-bare is the only thing I can think of)
<awight> When was this change deployed to production scap?
<awight> \o/ parallel deployment to ores* just completed perfectly, and in 58 seconds
<awight> That’s gonads to the wall.
<twentyafterfour> monday the 11th is when it got published
<awight> That’s gotta be it, then.