
Review entries in https://github.com/Bitergia/mediawiki-repositories/ to exclude/include and find out if still needed
Closed, Resolved (Public)

Description

...as that was last updated two years ago.

https://phabricator.wikimedia.org/T146135#3176718 (basically a copy as a SPARQL query) might also be outdated.

Event Timeline

Well, https://github.com/Bitergia/mediawiki-repositories/ is not used anymore in our stack so the summary is slightly misleading, but the situation is:

(The topic reminds me of bits and pieces again in T103292#1693361.)

Trying to gather understanding by running diff -pu gerrit-repo-list-from-ssh-T187711.txt gerrit-repo-list-from-bitergia-T187711.txt:
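The same comparison can also be expressed as a set difference, which ignores ordering and duplicates. The following is a minimal sketch, assuming both files hold one repository name per line. The helper names `read_repo_list` and `compare_repo_lists` are hypothetical and not part of any tooling mentioned in this task.

```python
from pathlib import Path


def read_repo_list(path):
    """Return the set of non-empty, stripped lines in a text file."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def compare_repo_lists(gerrit_repos, bitergia_repos):
    """Split two repository sets into the entries unique to each side."""
    return {
        # Repositories Gerrit knows about that the Bitergia list lacks.
        "only_in_gerrit": sorted(gerrit_repos - bitergia_repos),
        # Repositories in the Bitergia list that Gerrit no longer reports.
        "only_in_bitergia": sorted(bitergia_repos - gerrit_repos),
    }
```

It could be called as `compare_repo_lists(read_repo_list("gerrit-repo-list-from-ssh-T187711.txt"), read_repo_list("gerrit-repo-list-from-bitergia-T187711.txt"))`. Unlike `diff -pu`, this does not depend on both files being sorted the same way.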

Aklapper moved this task from Backlog to March on the Developer-Advocacy (Jan-Mar-2018) board.

Using grep -r "organization" wikimedia-affiliations.json | sort | uniq -c, I created the query author_org_name:"Canonical" OR author_org_name:"Deviantart" OR author_org_name:"Etsy" OR author_org_name:"Facebook" OR author_org_name:"Google" OR author_org_name:"HP" OR author_org_name:"IBM" OR author_org_name:"Intel" OR author_org_name:"Phacility" OR author_org_name:"RedHat" OR author_org_name:"Suse". These are affiliations I intentionally added to our database to be able to find potentially pulled-in third-party repos. With that query, the Gerrit stats look as empty as I had expected and hoped.
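Assembling such a query by hand is error-prone, so here is a small sketch of how the OR clauses could be generated from a list of organization names. The function name `build_org_query` is hypothetical. The sketch takes the names as a plain list, because the internal structure of wikimedia-affiliations.json is not shown here, and it assumes the names contain no double quotes.

```python
def build_org_query(org_names, field="author_org_name"):
    """Join organization names into a single OR query string.

    Duplicates are dropped and names are sorted so the result is stable.
    Assumes names contain no double quotes (no escaping is done).
    """
    unique = sorted(set(org_names))
    return " OR ".join(f'{field}:"{name}"' for name in unique)
```

For example, `build_org_query(["Etsy", "Canonical"])` yields `author_org_name:"Canonical" OR author_org_name:"Etsy"`.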

Which leaves us with the (less important) Git stats: applying the same query, we get 22 repositories to take a closer look at. So I think we can mostly discard https://phabricator.wikimedia.org/T146135#3176718 nowadays.

On a related note, I am puzzled why there is a GitHub repository in there: https://github.com/wikimedia/jquery.ime

Aklapper renamed this task from Review entries in https://github.com/Bitergia/mediawiki-repositories/ to exclude/include to Review entries in https://github.com/Bitergia/mediawiki-repositories/ to exclude/include and find out if still needed. Mar 28 2018, 10:19 AM
Aklapper closed this task as Resolved.

Well. For Git in https://wikimedia.biterg.io/goto/32ce4404dcba6ba93efb2aee2c9dd752, picking a random repo from the bottom, like repo_name:"https://gerrit.wikimedia.org/r/operations/debs/kafka", you of course still get 171 authors. Just checking via author_org_name, as in the previous comment, does not help, because most contributors to that upstream project are classified as Independent by definition (no company email address used). We do have three Wikimedia contributors in that upstream project, though.

So nothing new: this boils down to the old "cannot identify repositories pulled from upstream" problem.

I'll ignore Git for the rest of this comment, as we concentrate on Gerrit in our metrics anyway.
I took the one-year-old Gerrit query from T146135#3176718 and inverted it to show only those Gerrit repos excluded in T146135, to check their contributors and whether the exclusion still makes sense. I restricted the time range to the last 6 months to avoid the 30s visualization timeout. The listed author names don't show pollution either.
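To illustrate what inverting the query means, here is a rough sketch. It assumes the exclusion is written as a chain of NOT repo_name clauses, which may not match the exact form of the query in T146135. Both function names are hypothetical.

```python
def exclusion_query(repos, field="repo_name"):
    """Build a query that hides every listed repository."""
    return " AND ".join(f'NOT {field}:"{repo}"' for repo in repos)


def inverted_query(repos, field="repo_name"):
    """Build the inverse query: show only the listed repositories."""
    return " OR ".join(f'{field}:"{repo}"' for repo in repos)
```

Running the inverted form over a short time range shows who contributed to the excluded repositories, which is the check described above.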

I updated https://www.mediawiki.org/w/index.php?title=New_Developers%2FQuarterly%2FHow_to_Create_a_Report&type=revision&diff=2747355&oldid=2718239 accordingly.