Puppet repo not being updated on github
Closed, Resolved (Public)

Description

I just realised that the puppet repo mirror on github isn't being updated. I don't know whether the issue is on our end or on github's side.
An example:
https://github.com/wikimedia/puppet/commits/production/manifests/site.pp

The last commit to that specific file on Gerrit is: d4ee2ffafee5c9c9eb901b1cbb9390b74e8a3a76 (https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+log/production/manifests/site.pp)
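
One way to confirm a lagging mirror without browsing both web UIs is to compare what each remote reports for the same ref. The following is a minimal sketch, not something taken from this task: the function names are hypothetical, and the clone URLs in the example call are assumptions.

```shell
#!/bin/sh
# Print the commit a ref points to on a remote (empty if the ref is missing).
remote_head() {
    # $1 = repository URL or local path, $2 = full ref name
    git ls-remote "$1" "$2" | cut -f1
}

# Succeed only if the source and the mirror agree on the given ref.
mirror_in_sync() {
    # $1 = source repository, $2 = mirror repository, $3 = full ref name
    src=$(remote_head "$1" "$3")
    dst=$(remote_head "$2" "$3")
    [ -n "$src" ] && [ "$src" = "$dst" ]
}

# Example call (both clone URLs are assumptions):
# mirror_in_sync https://gerrit.wikimedia.org/r/operations/puppet \
#     https://github.com/wikimedia/puppet refs/heads/production \
#     && echo "in sync" || echo "mirror is behind"
```

Comparing ref heads only shows whether the branch tip matches; it says nothing about why replication stopped.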

Event Timeline

MarcoAurelio added a project: Gerrit.
MarcoAurelio subscribed.

Can someone with access check the Gerrit replication plugin logs (if any) to see if there's any trace? Puppet is not the only repo that stopped being mirrored 5 days ago; as far as I can see, many more are not being updated. If there are no logs, maybe the replication config changed? Could it be related to the /r/p/ -> /r/ updates being carried out? Thanks.

From gerrit show-queue --wide --by-queue

Queue: ReplicateTo-slaves
88880f41              13:49:15.987      [a88b933a] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/operations/puppet.git
eda75d84 waiting .... 13:49:37.453      [8daee15e] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/skins/Nostalgia.git
e2d68cf7 waiting .... 13:49:52.409      [82cd5061] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/core.git
397dbf5b waiting .... 13:50:28.091      [b951afe2] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/extensions/ContentTranslation.git
93bfed88 waiting .... 13:52:13.012      [13cc5d31] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/extensions/AbuseFilter.git
3d14d0ff waiting .... 13:52:39.973      [bd28c042] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/extensions/WikimediaEditorTasks.git
5d4bc4a5 waiting .... 13:52:47.814      [dd37b432] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/extensions/WikibaseSchema.git
5d8ae422 waiting .... 13:52:48.682      [dd96d44b] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/mediawiki/extensions.git
bbad3665 waiting .... 13:58:00.957      [3bc2a606] push gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/operations/puppet.git
------------------------------------------------------------------------------
  9 tasks, 1 worker threads

So it is at least replicating to the hot spare gerrit2001, though one of the replication tasks may be slow or stalled.
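
To put a number on that backlog, the queue listing can be piped through a small filter. This is only a sketch: the function name is hypothetical, and it assumes the column layout shown above, where the second field reads "waiting" for tasks that have not started yet.

```shell
#!/bin/sh
# Count push tasks still in the "waiting" state.
# Reads `gerrit show-queue --wide --by-queue` output on stdin.
count_waiting_pushes() {
    awk '$2 == "waiting" && /\] push / { n++ } END { print n + 0 }'
}

# Example call:
# ssh -p 29418 gerrit.wikimedia.org gerrit show-queue --wide --by-queue \
#     | count_waiting_pushes
```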

The last replication to github is from yesterday:

[2019-03-25 21:54:52,076] [5dd7ad8e] Replication to git@github.com:wikimedia/mediawiki-services-parsoid started...
[2019-03-25 21:54:54,462] [5dd7ad8e] Replication to git@github.com:wikimedia/mediawiki-services-parsoid completed in 2386ms, 128253ms delay, 0 retries
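
For reference, the most recent completed push to GitHub can be pulled out of the replication log with a filter along these lines. It is a sketch that assumes the log line format quoted above; the function name is hypothetical.

```shell
#!/bin/sh
# Print the timestamp and repository of the last completed GitHub replication.
# Reads the replication log on stdin.
last_github_completion() {
    awk '/Replication to git@github\.com:/ && / completed in / {
             ts = $1 " " $2                 # e.g. "[2019-03-25 21:54:54,462]"
             gsub(/\[|\]/, "", ts)          # drop the surrounding brackets
             repo = $6
             sub(/^git@github\.com:/, "", repo)
             last = ts " " repo             # keep only the latest match
         }
         END { if (last != "") print last }'
}

# Example call:
# last_github_completion < replication_log
```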

And github is apparently still listed:

replication list
Remote: github
Url: git@github.com:wikimedia/${name}

Remote: slaves
Url: gerrit2@gerrit2001.wikimedia.org:/srv/gerrit/git/${name}.git

I tried to manually force the replication for a repository at 14:04:22 UTC with:

ssh -p 29418 gerrit.wikimedia.org replication start operations/puppet --now --wait

But nothing shows up in the logs, so I don't know what is broken.

Hrm. I tried the same thing, but waited a long time (11 minutes):

~
(/^ヮ^)/*:・゚✧ ssh -p 29418 gerrit.wikimedia.org -- replication start operations/puppet --wait                                                        
2019-03-26 13:53:41,743 sshecret SSH_AUTH_SOCK=/run/user/1000/194f906110c5681bc2ac8c5c340ca5dd.sock                                                   
Host key fingerprint is SHA256:j7HQoQ6fIuEgDHjONjI2CZ+2Iwxqgo2Ur5LbPqBgxOU
+---[RSA 1024]----+
|                 |
|.  .             |
|= +     .        |
|+BoE   o .       |
|BBX . o S        |
|@@o+ + o =       |
|X=*.. + o .      |
|*ooo .           |
|o+o.             |
+----[SHA256]-----+
Replicate operations/puppet ref ..all.. to gerrit2001.wikimedia.org, Succeeded! (OK)                                                                  
Replication of operations/puppet ref ..all.. completed to 1 nodes,
----------------------------------------------
Replication completed successfully!
~ 11m 13s
(/^ヮ^)/*:・゚✧

but github is still not up-to-date :\

Just a random idea, but can somebody check that SSH auth between the Gerrit machine and git@github.com still works? I encountered an error a few days ago when I was suddenly unable to access GitHub; removing and re-adding my SSH key in my GitHub user profile fixed the issue.

Just tested, it does indeed still work.

Also, some repos are being updated; for example, MobileFrontend was updated 9 minutes ago according to https://github.com/wikimedia

https://github.com/wikimedia shows activity for some repos so if it were an authentication issue I think no repo would be updating, right?

thcipriani claimed this task.

Seems to be working now (manually at least)

(/^ヮ^)/*:・゚✧ ssh -p 29418 gerrit.wikimedia.org -- replication start operations/puppet --wait
2019-03-26 15:37:14,169 sshecret SSH_AUTH_SOCK=/run/user/1000/194f906110c5681bc2ac8c5c340ca5dd.sock
Host key fingerprint is SHA256:j7HQoQ6fIuEgDHjONjI2CZ+2Iwxqgo2Ur5LbPqBgxOU
+---[RSA 1024]----+
|                 |
|.  .             |
|= +     .        |
|+BoE   o .       |
|BBX . o S        |
|@@o+ + o =       |
|X=*.. + o .      |
|*ooo .           |
|o+o.             |
+----[SHA256]-----+
Replicate operations/puppet ref ..all.. to github.com, Succeeded! (OK)

Yesterday I removed mediawiki-replication from Read on refs/* since the description of the group is:

[ARCHIVED] Group just for setting permissions for replication purposes.

The group has no members and has only ever had one member who was removed the day they were added.

Today I noticed that group mentioned in our replication.config as the authGroup. After restoring the permission I removed yesterday, replication is working again.
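
For context, the relevant part of replication.config would look roughly like the fragment below. This is a sketch, not a copy of the production file: only the remote name, the url, and the authGroup name come from this task. The remoteNameStyle line is an assumption that would explain names such as operations-puppet on GitHub.

```ini
# Sketch of one remote in replication.config (not the production file).
[remote "github"]
    url = git@github.com:wikimedia/${name}
    # The replication plugin only pushes projects that this group can read,
    # so removing the group's Read permission on refs/* stops replication.
    authGroup = mediawiki-replication
    # Assumption: turns operations/puppet into operations-puppet.
    remoteNameStyle = dash
```

This is why a group with no members can still matter: the plugin uses its read access, not its membership, to decide what gets replicated.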

/me updates group description.

It does indeed seem to be working again:

[2019-03-26 21:37:14,820] [] scheduling replication operations/puppet:..all.. => git@github.com:wikimedia/operations-puppet

@thcipriani can you potentially copy-paste your discovery about mediawiki-replication / replication.config authGroup to https://wikitech.wikimedia.org/wiki/Gerrit#Replication ? ;]

That is a good finding for sure.

@hashar @thcipriani What about repos with no recent commits? Will they still "lag behind" until a commit is made to them, or does replication regularly update all of them? I know there's an --all option, which should not be used, to avoid messing with the renamed repos.

To clarify, all repositories are replicating just fine. The last ten are:

$ grep github.*started replication_log|tail -n10
[2019-03-27 11:51:01,870] [fe6023bf] Replication to git@github.com:wikimedia/operations-puppet started...
[2019-03-27 11:57:39,519] [223b081a] Replication to git@github.com:wikimedia/operations-dns started...
[2019-03-27 12:27:41,423] [d99bc71f] Replication to git@github.com:wikimedia/operations-mediawiki-config started...
[2019-03-27 12:30:15,923] [f02a6766] Replication to git@github.com:wikimedia/wikibase-termbox started...
[2019-03-27 12:45:39,646] [db4d36b1] Replication to git@github.com:wikimedia/operations-mediawiki-config started...
[2019-03-27 13:09:17,935] [f5aea2e4] Replication to git@github.com:wikimedia/mediawiki-extensions-UploadWizard started...
[2019-03-27 13:09:21,214] [15257698] Replication to git@github.com:wikimedia/mediawiki-extensions started...
[2019-03-27 13:09:51,780] [b5d90afb] Replication to git@github.com:wikimedia/mediawiki-extensions-GrowthExperiments started...
[2019-03-27 13:09:54,685] [bf62498f] Replication to git@github.com:wikimedia/operations-mediawiki-config started...
[2019-03-27 13:17:20,752] [bb5a8b6c] Replication to git@github.com:wikimedia/performance-docroot started...

:-]

Reedy subscribed.

Doesn't look to be resolved to me...

https://github.com/wikimedia/mediawiki-extensions-WikimediaMaintenance/commits/master

Numerous commits on gerrit that aren't there

The replication does work; that extension got replicated to GitHub:

[2019-03-28 12:45:32,563] [19650510] Replication to git@github.com:wikimedia/mediawiki-extensions-WikimediaMaintenance completed in 2571ms, 15000ms delay, 0 retries

And it is not missing anything as far as I can tell.

It isn't missing anything now that my latest commit forced a push...