Gerrit replication does not replicate tags deletion to GitHub
Open, Needs Triage, Public

Description

Split from T413213. When a tag is deleted in Gerrit, the deletion is properly replicated to the Gerrit replicas but not to GitHub.

Summary

The GitHub replica has mirror = false, in which case the ref deletion is not pushed because it does not match the push refspec. However, if we set mirror = true, the push refspec is not looked at, and that would delete other refs that might be present on GitHub, such as pull requests (which are under refs/pull/*).


The replication logs for the other Gerrit replicas:

[2026-01-07 11:14:56,125] Push to gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/mediawiki/core.git references: RemoteRefUpdate{refSpec=null:refs/tags/list, status=NOT_ATTEMPTED, id=(null)..AnyObjectId[0000000000000000000000000000000000000000], force=yes, delete=yes, ffwd=no} [CONTEXT PLUGIN="replication" project="mediawiki/core" pushOneId="82254395" request="REST /r/*/mediawiki%2Fcore/*/list" ]
[2026-01-07 11:14:56,131] Push to gerrit@gerrit2003.wikimedia.org:/srv/gerrit/git/mediawiki/core.git references: RemoteRefUpdate{refSpec=null:refs/tags/list, status=NOT_ATTEMPTED, id=(null)..AnyObjectId[0000000000000000000000000000000000000000], force=yes, delete=yes, ffwd=no} [CONTEXT PLUGIN="replication" project="mediawiki/core" pushOneId="e2fd1ffa" request="REST /r/*/mediawiki%2Fcore/*/list" ]

They push the refspec null:refs/tags/list, which did delete the tag (I have verified this on the bare repo on each of the hosts).

However, for GitHub the list tag was still present! The replication log states:

[2026-01-07 11:15:05,984] Replication to git@github.com:wikimedia/mediawiki-core started... [CONTEXT PLUGIN="replication" project="mediawiki/core" pushOneId="a24de74a" request="REST /r/*/mediawiki%2Fcore/*/list" ]
[2026-01-07 11:15:06,123] Replication to git@github.com:wikimedia/mediawiki-core completed in 139ms, 15000ms delay, 0 retries [CONTEXT PLUGIN="replication" project="mediawiki/core" pushOneId="a24de74a" request="REST /r/*/mediawiki%2Fcore/*/list" ]

Note that there is no "Push to ... references" entry for GitHub: nothing was pushed, and the tag was still present. The replication to GitHub is configured to push heads and tags:

/etc/gerrit/replication.config
[remote "github"]
  authGroup = mediawiki-replication
  createMissingRepositories = false
  maxRetries = 50
  mirror = false
  projects = ^(?:(?!apps\\/).)*$
  push = +refs/heads/*:refs/heads/*   # <------ for branches
  push = +refs/tags/*:refs/tags/*     # <------ for tags
  remoteNameStyle = dash
  replicatePermissions = false
  rescheduleDelay = 15
  threads = 2
  url = git@github.com:wikimedia/${name}

But the tag deletion has clearly NOT been replicated since the tag was still shown and I went to delete it manually.
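To rule out the projects filter as the cause, the pattern from the config can be checked against the project name. This is a minimal sketch, assuming the config parser unescapes `\\/` to `\/` and that the whole project name must match the pattern; the class name `ProjectsFilterCheck` and the `apps/example` project are illustrative only.

```java
import java.util.regex.Pattern;

final class ProjectsFilterCheck {
  // Pattern from the `projects` line, after config unescaping (assumed).
  // It matches any name that does not contain "apps/" at any position.
  static final Pattern PROJECTS = Pattern.compile("^(?:(?!apps\\/).)*$");

  static boolean isReplicated(String project) {
    return PROJECTS.matcher(project).matches();
  }

  public static void main(String[] args) {
    // "apps/example" is a hypothetical project name used for contrast.
    for (String p : new String[] {"mediawiki/core", "apps/example"}) {
      System.out.println(p + " -> " + isReplicated(p));
    }
  }
}
```

Under these assumptions mediawiki/core passes the filter, which is consistent with the "Replication to git@github.com:wikimedia/mediawiki-core started" log entry.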

There is something off in the replication config. The reference is https://gerrit.wikimedia.org/r/plugins/replication/Documentation/config.md and it might be related to:

remote.NAME.mirror

If true, replication will remove remote branches that are absent locally or invisible to the replication (for example read access denied via authGroup option).

By default, false, do not remove remote branches.

The GitHub configuration has mirror = false while the two Gerrit replicas have mirror = true. I would need to read the replication plugin code to see whether that affects tags as well.

Event Timeline

The only code in the replication plugin (at v3.10.4) which matches /mirror/i is in src/main/java/com/googlesource/gerrit/plugins/replication/PushOne.java. It is used in two places:

  • doPushAll(), which pushes all refs for a full replication
  • doPushDelta(), which pushes a set of references

The doPushDelta method is:

private List<RemoteRefUpdate> doPushDelta(Map<String, Ref> local) throws IOException {
  List<RemoteRefUpdate> cmds = new ArrayList<>();
  boolean noPerms = !pool.isReplicatePermissions();
  Set<String> refs = flattenRefBatchesToPush();
  for (String src : refs) {
    RefSpec spec = matchSrc(src);
    if (spec != null) {
      // If the ref still exists locally, send it, otherwise delete it.
      Ref srcRef = local.get(src);

      // Second try to ensure that the ref is truly not found locally
      if (srcRef == null) {
        srcRef = git.exactRef(src);
      }

      if (srcRef != null && canPushRef(src, noPerms)) {
        push(cmds, spec, srcRef);
      } else if (config.isMirror()) {
        delete(cmds, spec);
      }
    }
  }
  return cmds;
}

Given that the deletion has a null OID, srcRef would be null (the ref no longer exists locally), and delete() is only invoked for mirrors. That seems to apply to any reference, not just branches. I think it is a shortcoming of the doc: it should mention that this applies to any ref configured for pushing (in our case refs/heads/* and refs/tags/*).
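The branch structure of doPushDelta() can be reduced to a small decision function. This is only a model of the code quoted above, not the plugin itself; the class `DeltaDecision`, the `Action` enum and the boolean parameters are names invented for illustration.

```java
final class DeltaDecision {
  enum Action { PUSH, DELETE, SKIP }

  // Models the if/else chain of doPushDelta():
  //   matchesPushRefSpec ~ matchSrc(src) != null
  //   existsLocally      ~ srcRef != null
  //   canPush            ~ canPushRef(src, noPerms)
  //   mirror             ~ config.isMirror()
  static Action decide(
      boolean matchesPushRefSpec, boolean existsLocally, boolean canPush, boolean mirror) {
    if (!matchesPushRefSpec) {
      return Action.SKIP;
    }
    if (existsLocally && canPush) {
      return Action.PUSH;
    }
    return mirror ? Action.DELETE : Action.SKIP;
  }

  public static void main(String[] args) {
    // A deleted tag: it matches refs/tags/* but no longer exists locally.
    System.out.println("mirror = true  -> " + decide(true, false, true, true));
    System.out.println("mirror = false -> " + decide(true, false, true, false));
  }
}
```

In this model a deleted tag yields DELETE for the Gerrit replicas (mirror = true) and SKIP for GitHub (mirror = false), which matches what the logs show.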

The delete() method is:

private void delete(List<RemoteRefUpdate> cmds, RefSpec spec) throws IOException {
  String dst = spec.getDestination();
  boolean force = spec.isForceUpdate();
  cmds.add(new RemoteRefUpdate(git, (Ref) null, dst, force, null, null));
}

The second parameter ((Ref) null) is the null oid which indicates a deletion.

The replication push does log the references being pushed at INFO level to replication_log:

private PushResult pushVia(Transport tn) throws IOException, PermissionBackendException {
  tn.applyConfig(config);
  tn.setCredentialsProvider(credentialsProvider);

  List<RemoteRefUpdate> todo = generateUpdates(tn);
  if (todo.isEmpty()) { 
    // If we have no commands selected, we have nothing to do.
    // Calling JGit at this point would just redo the work we
    // already did, and come up with the same answer. Instead
    // send back an empty result.
    return new PushResult();
  }

  if (replConfig.getMaxRefsToLog() == 0 || todo.size() <= replConfig.getMaxRefsToLog()) {
    repLog.atInfo().log("Push to %s references: %s", uri, lazy(() -> refUpdatesForLogging(todo)));
  } else {
    repLog.atInfo().log(
        "Push to %s references (first %d of %d listed): %s",
        uri,
        replConfig.getMaxRefsToLog(),
        todo.size(),
        lazy(() -> refUpdatesForLogging(todo.subList(0, replConfig.getMaxRefsToLog()))));    
  }     
  
  return pushInBatches(tn, todo);
}

And in the logs I can find those Push to * references messages for the deletion of mediawiki/core ref refs/tags/list:

$ grep 'Push to.*mediawiki/core.*references:.*refs/tags/list' replication_log
[2026-01-07 11:14:56,125] Push to gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/mediawiki/core.git
references: RemoteRefUpdate{refSpec=null:refs/tags/list, status=NOT_ATTEMPTED, id=(null)..AnyObjectId[0000000000000000000000000000000000000000], force=yes, delete=yes, ffwd=no}
[CONTEXT PLUGIN="replication" project="mediawiki/core" pushOneId="82254395" request="REST /r/*/mediawiki%2Fcore/*/list" ]
[2026-01-07 11:14:56,131] Push to gerrit@gerrit2003.wikimedia.org:/srv/gerrit/git/mediawiki/core.git
references: RemoteRefUpdate{refSpec=null:refs/tags/list, status=NOT_ATTEMPTED, id=(null)..AnyObjectId[0000000000000000000000000000000000000000], force=yes, delete=yes, ffwd=no}
[CONTEXT PLUGIN="replication" project="mediawiki/core" pushOneId="e2fd1ffa" request="REST /r/*/mediawiki%2Fcore/*/list" ]
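For checking more than one tag, the same information can be extracted programmatically. The following is a rough sketch that assumes each entry is on a single physical line and uses the RemoteRefUpdate{...} rendering seen in the entries above (other plugin versions may format it differently); the class `ReplicationLogDeletes` is a hypothetical helper, not part of Gerrit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplicationLogDeletes {
  // Captures source, destination ref and the delete flag of each RemoteRefUpdate.
  private static final Pattern UPDATE =
      Pattern.compile("RemoteRefUpdate\\{refSpec=([^:,]*):([^,]+),[^}]*?delete=(yes|no)");

  /** Returns the destination refs that one "Push to ... references:" line marks as deletions. */
  static List<String> deletedRefs(String logLine) {
    List<String> refs = new ArrayList<>();
    Matcher m = UPDATE.matcher(logLine);
    while (m.find()) {
      if ("yes".equals(m.group(3))) {
        refs.add(m.group(2));
      }
    }
    return refs;
  }

  public static void main(String[] args) {
    String line =
        "[2026-01-07 11:14:56,125] Push to gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/mediawiki/core.git"
            + " references: RemoteRefUpdate{refSpec=null:refs/tags/list, status=NOT_ATTEMPTED,"
            + " id=(null)..AnyObjectId[0000000000000000000000000000000000000000],"
            + " force=yes, delete=yes, ffwd=no}";
    System.out.println(deletedRefs(line));
  }
}
```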

And therefore for Wikimedia-GitHub we should enable mirror = true. I need to think about the consequences of deleting heads and tags that are in GitHub but not in Gerrit.

xref https://github.com/wikimedia/mediawiki/activity?activity_type=branch_deletion, for (I think) a list of (manual) tag/branch deletions that have taken place for the GitHub repo

> And therefore for Wikimedia-GitHub we should enable mirror = true.

FWIW, it looks like doing this might effectively be a manual revert of https://gerrit.wikimedia.org/r/c/operations/puppet/+/528259 from August 2019 (which, from a brief look, seems like it may itself have been a manual revert of https://gerrit.wikimedia.org/r/c/operations/puppet/+/43239 from January 2013).

https://gerrit.wikimedia.org/r/c/operations/puppet/+/528259 / bdc6e091c22efe7fb24dbe12a7492ec2a211c6f2 states:

gerrit: do not treat github as a mirror

From the gerrit documentation:

> remote.NAME.mirror : If true, replication will remove remote branches
> that are absent locally or invisible to the replication (for example
> read access denied via authGroup option).

We don't want to do that on github as it results in closing lots of pull
requests.

GitHub stores pull requests under refs/pull/*. Our replication config for GitHub has:

push = +refs/heads/*:refs/heads/*
push = +refs/tags/*:refs/tags/*

At the time of that commit, there was the same push config:

hieradata/role/common/gerrit.yaml
gerrit::jetty::replication:
    github:
        push:
            - '+refs/heads/*:refs/heads/*'
            - '+refs/tags/*:refs/tags/*'

I trust @thcipriani that mirror = true indeed caused refs to be deleted. My guess is that it happens when doing a complete replication. That is done by the doPushAll() method:

private List<RemoteRefUpdate> doPushAll(Transport tn, Map<String, Ref> local) throws IOException {
  List<RemoteRefUpdate> cmds = new ArrayList<>();
  boolean noPerms = !pool.isReplicatePermissions();
  Map<String, Ref> remote = listRemote(tn);
  for (Ref src : local.values()) {
    if (!canPushRef(src.getName(), noPerms)) {
      repLog.atFine().log("Skipping push of ref %s", src.getName());
      continue;
    }

    RefSpec spec = matchSrc(src.getName());
    if (spec != null) {
      Ref dst = remote.get(spec.getDestination());
      if (dst == null || !src.getObjectId().equals(dst.getObjectId())) {
        // Doesn't exist yet, or isn't the same value, request to push.
        push(cmds, spec, src);
      }
    }
  }

  if (config.isMirror()) {
    for (Ref ref : remote.values()) {
      if (Constants.HEAD.equals(ref.getName())) {
        repLog.atFine().log("Skipping deletion of %s", ref.getName());
        continue;
      }
      RefSpec spec = matchDst(ref.getName());
      if (spec != null && !local.containsKey(spec.getSource())) {
        // No longer on local side, request removal.
        delete(cmds, spec);
      }
    }
  }
  return cmds;
}

It gets the list of remote (GitHub) references and, when they are not on the local side (our Gerrit), requests their deletion. This only occurs when mirror = true.

To sum up, reference deletions are not replicated to GitHub, whether for tags or for branches. I did some work to have the release branches and wmf branches converted to tags (T303828 and T351341).

If I pick mediawiki/extensions/Cite in Gerrit it has the REL1_39 to REL1_45 branches and the last 3 wmf/ branches: https://gerrit.wikimedia.org/r/admin/repos/mediawiki/extensions/Cite,branches

GitHub, at https://github.com/wikimedia/mediawiki-extensions-Cite, shows the repository has 373 branches. That is the wmf/* branch deletion (T303828), which did not get replicated.


What I don't get is that when doPushAll() mirrors, the remote refs are passed through matchDst(), which checks whether the remote reference matches the push refspecs (refs/heads/* and refs/tags/*). If it does not match, matchDst() returns null, which should short-circuit the deletion at:

if (config.isMirror()) {
  for (Ref ref : remote.values()) {
    if (Constants.HEAD.equals(ref.getName())) {
      repLog.atFine().log("Skipping deletion of %s", ref.getName());
      continue;
    } 
    RefSpec spec = matchDst(ref.getName());
    if (spec != null && !local.containsKey(spec.getSource())) {
      // No longer on local side, request removal. 
      delete(cmds, spec);
    } 
  }   
}

Thus I do not understand how refs under refs/pull/* might have been deleted. I guess I will need to write a test to verify the behavior. :sigh:
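Before testing against the plugin itself, the mirror phase as I read it can be modelled in plain Java. This is a sketch under loud assumptions: it does not use JGit, it only handles refspecs of the form prefix/*:prefix/*, and `MirrorPhaseModel`, `sourceFor` and `refsToDelete` are made-up names. It says nothing about what v2.15.14 actually did in 2019.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class MirrorPhaseModel {
  /** Maps a remote ref back to its local source ref, or returns null if no push refspec matches. */
  static String sourceFor(String remoteRef, List<String> pushSpecs) {
    for (String spec : pushSpecs) {
      String body = spec.startsWith("+") ? spec.substring(1) : spec;
      String[] parts = body.split(":", 2);
      // Assumption: both sides end with "*"; drop it to get a prefix.
      String srcPrefix = parts[0].substring(0, parts[0].length() - 1);
      String dstPrefix = parts[1].substring(0, parts[1].length() - 1);
      if (remoteRef.startsWith(dstPrefix)) {
        return srcPrefix + remoteRef.substring(dstPrefix.length());
      }
    }
    return null;
  }

  /** Remote refs the modelled mirror phase would delete: matched by a refspec and absent locally. */
  static List<String> refsToDelete(Set<String> local, List<String> remote, List<String> pushSpecs) {
    List<String> deletions = new ArrayList<>();
    for (String ref : remote) {
      if ("HEAD".equals(ref)) {
        continue; // HEAD is never deleted
      }
      String src = sourceFor(ref, pushSpecs);
      if (src != null && !local.contains(src)) {
        deletions.add(ref);
      }
    }
    return deletions;
  }

  public static void main(String[] args) {
    List<String> specs = List.of("+refs/heads/*:refs/heads/*", "+refs/tags/*:refs/tags/*");
    Set<String> local = Set.of("refs/heads/master");
    List<String> remote =
        List.of("HEAD", "refs/heads/master", "refs/tags/list", "refs/pull/1/head");
    System.out.println(refsToDelete(local, remote, specs));
  }
}
```

In this model only refs/tags/list is selected for deletion; refs/pull/1/head matches no push refspec and is left alone. That is what I expect from v3.10.4, and it still needs to be confirmed with a real test.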

Of course I am reading the replication plugin code at v3.10.4. When Tyler encountered the issue, back in 2019, we obviously used a different version of Gerrit.

Looking at our history of deployment (in https://gerrit.wikimedia.org/g/operations/software/gerrit/+log/refs/heads/deploy/wmf/stable-3.10 ):

commit 40d88dc46992e015c404295643cbfef15e025496
Author: paladox <thomasmulhall410@yahoo.com>
Date:   Thu Jul 11 15:58:43 2019 +0000

    Gerrit v2.15.14
    
    Change-Id: I9398884d76d557d86e661f223a42d0d44b1fff4f

Notes (review):
    Verified+2: Thcipriani <tcipriani@wikimedia.org>
    Code-Review+2: Thcipriani <tcipriani@wikimedia.org>
    Submitted-by: Thcipriani <tcipriani@wikimedia.org>
    Submitted-at: Mon, 15 Jul 2019 19:04:51 +0000
    Reviewed-on: https://gerrit.wikimedia.org/r/522133
    Project: operations/software/gerrit
    Branch: refs/heads/deploy/wmf/stable-2.15

So in July 2019 we ran v2.15.14. We then upgraded to v3.2.2 in June 2020.

I revisited the implementation of PushOne and eventually found 29875954b9e766620f31991f425020ae22ade485 https://gerrit-review.googlesource.com/c/plugins/replication/+/230458:

Author: Marcin Czech <maczech@gmail.com>
Date:   Fri Jul 5 13:03:48 2019 +0100

    Add replication refs-filtering before push
    
    Add ability to filter out refs before being pushed for replication to
    remote instance. This will help to prevent split-brain issue by
    allowing us to create filter in multi-site plugin to stop the out of sync
    instance from overriding the changes of the instance that is up to date.

But that adds an extension point.

So I have to verify whether a deletion of refs/pull/1 is attempted when the remote has mirror = true and push = +refs/heads/*:refs/heads/*. As I understand the code, it should be skipped on a full replication. For a delta, refs/pull/1 does not exist in the local Gerrit repo, so there is no reason for it to be passed to doPushDelta().

Mystery. I'll try to reproduce.
The refs-filtering change (29875954b9e766620f31991f425020ae22ade485) has been in the replication plugin since tag v2.16.11.1, which is after v2.15.14, so we did not have that code at the time.

When a PushOne instance is created, it is injected with two configs:

  • RemoteConfig assigned to config
  • ReplicationConfig assigned to replConfig

replConfig thus has the push = +refs/heads/*:refs/heads/* config.

The doPushAll mirroring grabs all the remote references (on GitHub) and passes them through matchDst():

@Nullable
private RefSpec matchDst(String ref) {
  for (RefSpec s : config.getPushRefSpecs()) {
    if (s.matchDestination(ref)) {
      return s.expandFromDestination(ref);
    }
  }
  return null;
}

On the other hand, Destination is created with a DestinationConfiguration and has a .wouldPushRef():

for (RefSpec s : config.getRemoteConfig().getPushRefSpecs()) {
  if (s.matchSource(ref)) {
    return true;
  }
}

I suspect that either:

  • PushOne is injected with the wrong RemoteConfig, so that config.getPushRefSpecs() ends up with a default which matches anything?
  • matchSrc/matchDst should be updated to use replConfig.getRemoteConfig()?

I don't know what is in RemoteConfig config :/

> I don't know what is in RemoteConfig config :/

And that is injected in src/main/java/com/googlesource/gerrit/plugins/replication/Destination.java. The class has:

  @Inject
  protected Destination(
...
      @Assisted DestinationConfiguration cfg) {
    config = cfg;
    if (!cfg.getAuthGroupNames().isEmpty()) {

That config is the replication config for the remote replica (e.g. GitHub). It is then bound for dependency injection to RemoteConfig:

  bind(RemoteConfig.class).toInstance(config.getRemoteConfig());

Thus the RemoteConfig injected into PushOne should have the push stanza, and PushOne should skip refs that do not match it.