
Cleanup unused refs in All-Users under refs/starred-changes/xx/yyyyxx/zz
Closed, Resolved, Public


While looking at All-Users.git I found a bunch of obsolete references under refs/starred-changes/xx/yyyyxx/zz which were used to track which files had been reviewed. The last change with such an entry is 709572, made in August 2021. That seems to match our upgrade to Gerrit 3.3 ( T262241#7256043 ).

My guess is that the tracking of reviewed files got moved out of the git repository. We should be able to garbage collect any reference pointing to an object containing reviewed/.

git fetch origin --prune +refs/starred-changes/*:refs/remotes/origin/starred-changes/*
git ls-remote . refs/remotes/origin/starred-changes/*|sort -g -t/ -k6|grep -E -v '^(8485e986e458a566e6f6160f71d704edc10c57fc|ce7b81997cf51342dedaeccb071ce4ba3ed0cf52)'|wc -l

The two checksums excluded above are those of git blobs:

$ echo -en 'blob 4\0star'|shasum 
ce7b81997cf51342dedaeccb071ce4ba3ed0cf52  -
$ echo -en 'blob 6\0ignore'|shasum 
8485e986e458a566e6f6160f71d704edc10c57fc  -
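
The same checksums can be reproduced without a shell: git names a blob by the SHA-1 of the header `blob <size>\0` followed by the content. A minimal Python sketch, where the helper name `git_blob_sha1` is mine (not part of git or Gerrit) and the label blobs are assumed to have no trailing newline, as in the `echo -en` calls above:

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object id git assigns to a blob with exactly this content."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Assumption: the label blobs contain the bare word, no trailing newline.
    for label in (b"star", b"ignore"):
        print(git_blob_sha1(label), label.decode())
```

If the assumption holds, the two printed ids should match the shasum output above.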

Though some entries have both star AND reviewed/XXX, so it is not that straightforward :]

Event Timeline

I think the ignore star label was removed entirely in Gerrit 3.7 via Change 343774; the release notes say:

Delete ignored state of changes and ‘star:’ queries
The query predicates star:ignored, is:ignored and star:star are not supported anymore. The latter is identical to is:starred or has:star.

Thus I guess they can be dropped as well.

I think a way to achieve it is to:

  • crawl through the references
  • show the object each one points to, then
    • if it contains star then
      • update the reference to point to ce7b81997cf51342dedaeccb071ce4ba3ed0cf52 (star)
    • else delete it
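
The decision in the list above can be sketched as a small pure function. This is only an illustration: the name `plan_for_ref` and the example ref name are hypothetical, and it assumes the blob holds one label per line.

```python
# Blob id of the bare `star` label, as computed earlier in this task.
STAR_BLOB = "ce7b81997cf51342dedaeccb071ce4ba3ed0cf52"


def plan_for_ref(refname: str, blob_text: str) -> str:
    """Return the git command that applies the steps above to one ref.

    blob_text is the content of the object the ref points to; it is
    assumed to hold one label per line, e.g. "star" or "reviewed/3".
    """
    labels = blob_text.splitlines()
    if "star" in labels:
        # Keep the star only: repoint the ref at the plain `star` blob.
        return f"git update-ref {refname} {STAR_BLOB}"
    # No star label at all: the ref is obsolete.
    return f"git update-ref -d {refname}"


if __name__ == "__main__":
    # Hypothetical ref name, for illustration only.
    ref = "refs/starred-changes/72/709572/1000"
    print(plan_for_ref(ref, "reviewed/3\nstar\n"))
    print(plan_for_ref(ref, "reviewed/3\n"))
```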

I also note @thcipriani mentioned the starred changes issue in a September 2020 blog post which stated we had 22 000 references there.

Mentioned in SAL (#wikimedia-releng) [2024-01-26T13:44:54Z] <hashar> gerrit: deleting obsolete references in All-Users.git under refs/starred-changes/* to only keep the one having a star label # T355794

I used a script to process the references: it keeps the references pointing to star, and updates the refs whose blob contains a star so that they only carry the star label.

I tested that locally with a local copy of All-Users.git and fetching the refs with:

git fetch -pP origin  +refs/starred-changes/*:refs/starred-changes/*

The script emits git update-ref commands:

#!/usr/bin/env python3

import subprocess

# replace with:
# git ls-remote ssh:// refs/starred-changes/*/*/*
p = subprocess.run(
    ['git', 'for-each-ref',
     '--format=%(objectname)%09%(refname)',
     'refs/starred-changes/*/*/*',
     ],
    check=True,
    text=True,
    capture_output=True,
)

known_for_deletion = set([
    # `ignore` blob
    '8485e986e458a566e6f6160f71d704edc10c57fc'
])

# splitlines() yields nothing when there are no refs at all
for line in p.stdout.splitlines():

    # <blob sha> <refs/starred-changes/*/*/*>
    (objectname, refname) = line.split("\t")

    if objectname == 'ce7b81997cf51342dedaeccb071ce4ba3ed0cf52':
        # We keep `star`
        continue

    if objectname in known_for_deletion:
        print(f"git update-ref -d {refname}")
        continue

    # Read the labels stored in the blob, one per line
    shown = subprocess.run(
        ['git', 'show', objectname],
        check=True, text=True, capture_output=True
    )
    stars = shown.stdout.rstrip().split("\n")

    if 'star' in stars:
        # convert it
        print(f"git update-ref {refname} ce7b81997cf51342dedaeccb071ce4ba3ed0cf52")
        continue
    else:
        # Remember the blob so later refs to it skip `git show`
        known_for_deletion.add(objectname)
        print(f"git update-ref -d {refname}")
        continue
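
The task does not say how the emitted commands were then run. With tens of thousands of refs, one option is to feed them to a single `git update-ref --stdin` process rather than spawning one git per ref. The sketch below is an assumption on my part, not what was actually done: the helper names `to_stdin_instructions` and `apply_instructions` are hypothetical, and it only understands the two command shapes the script prints.

```python
import subprocess


def to_stdin_instructions(commands):
    """Translate `git update-ref ...` lines into `update-ref --stdin` syntax."""
    out = []
    for command in commands:
        words = command.split()
        if len(words) == 4 and words[:3] == ["git", "update-ref", "-d"]:
            # git update-ref -d <ref>  ->  delete <ref>
            out.append(f"delete {words[3]}")
        elif len(words) == 4 and words[:2] == ["git", "update-ref"]:
            # git update-ref <ref> <newvalue>  ->  update <ref> <newvalue>
            out.append(f"update {words[2]} {words[3]}")
        else:
            raise ValueError(f"unexpected command: {command!r}")
    return "".join(line + "\n" for line in out)


def apply_instructions(instructions, git_dir):
    """Apply all instructions with one git process (git_dir: path to the repository)."""
    subprocess.run(
        ["git", "--git-dir", git_dir, "update-ref", "--stdin"],
        input=instructions, text=True, check=True,
    )
```

Usage would be along the lines of piping the script output through `to_stdin_instructions` and passing the result to `apply_instructions` on a local copy first.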

Eventually we are left with solely the star labels:

$ git for-each-ref refs/starred-changes/*/*/*|cut -d\  -f1|sort|uniq -c
  27656 ce7b81997cf51342dedaeccb071ce4ba3ed0cf52
hashar claimed this task.

We are left with only the star labels, which arguably should be removed once a change has been merged.

Change 997270 had a related patch set uploaded (by Hashar; author: Hashar):

[All-Users@refs/meta/config] Grant Administrators delete right on refs/starred-changes

Change 997270 merged by Hashar:

[All-Users@refs/meta/config] Grant Administrators delete right on refs/starred-changes