
Automated generation of (Gerrit) repositories for Korma
Closed, Resolved, Public

Description

The goal of this task is an automated way to generate the list of repositories of interest to the Foundation.

As seen in T103984: Exclude certain repositories (upstream / inactive) from Gerrit metrics by blacklisting, there are some repositories to be excluded.

  • Known issues in the current process:
    • The list of repositories is not kept up to date: new repos in Gerrit are not added to Korma
    • Missing Git repositories in the list of Gerrit projects (some Git repos are out of the current review process)
    • Not all Gerrit projects are of interest for Korma (e.g. upstream projects such as Phabricator may contain extra info)
  • Expected Outcome:
    • List of repositories per data source to be analyzed.
    • For each data source, one file would be created, with one repository per line. Each entry should be the direct link to where the information is found, or at least contain enough information to be parsed later and used to build the corresponding URL automatically.
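
To illustrate the one-repository-per-line format, here is a minimal Python sketch of how such a file could be read and turned into URLs. It is not part of Automator or Octopus. The base URL `GERRIT_BASE`, the function names, and the convention of skipping blank lines and `#` comments are all assumptions made for the example.

```python
from urllib.parse import quote

# Assumption: base URL used only for illustration; the task does not specify one.
GERRIT_BASE = "https://gerrit.wikimedia.org/r"


def read_repository_list(text):
    """Return one entry per non-empty line, skipping '#' comment lines (assumed convention)."""
    repos = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            repos.append(line)
    return repos


def repository_url(entry, base=GERRIT_BASE):
    """Return the entry unchanged if it is already a URL, else build one from the project name."""
    if "://" in entry:
        return entry
    # quote() keeps '/' by default, so nested project names stay readable.
    return f"{base}/{quote(entry)}"


if __name__ == "__main__":
    sample = "mediawiki/core\n\n# a comment\noperations/puppet\n"
    for repo in read_repository_list(sample):
        print(repository_url(repo))
```

Accepting either a full URL or a bare project name covers both options described above: a direct link, or enough information to build one.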

See Also:
T101777: Remove deprecated repositories from korma.wmflabs.org code review metrics
T110678: Automated generation of (Git) repositories for Korma

Event Timeline

Missing Git repositories in the list of Gerrit projects (some Git repos are out of the current review process)

I would remove this problem from the scope of this task. Let's count Gerrit repos for the time being.

Qgil triaged this task as Medium priority. Jul 7 2015, 11:17 AM

(Assigning to Daniel because he'll work on this at some point)

Hi,

I've started working on this. First, I modified the Automator code to accept external config files with lists of repositories [1], and then I added a new temporary repository to store the MediaWiki lists [2].

I say temporary because I still do not know whether it's a good idea to keep this repo on GitHub or whether it would be better to store it in the Wikimedia infrastructure. But let's start with this :).

The good point is that we can all follow the evolution of this file and easily keep track of its changes.

[1] https://github.com/MetricsGrimoire/Automator/commit/8646b6587cbdd6a56588047305c01031ba52102b
[2] https://github.com/Bitergia/mediawiki-repositories

Aklapper raised the priority of this task from Medium to High. Aug 11 2015, 9:16 AM

Discussed in our meeting:

  • Bitergia provides the automated process that gathers data from Git/Gerrit and keeps the list updated every month.
  • Wikimedia is responsible for maintaining the manual list of repositories used to filter out upstream repositories etc.

Some tasks done within this task:

  • Octopus, the tool that automatically retrieves data source information, already supports Gerrit.
  • Automator, the tool that runs the whole machinery, already supports that Octopus option.
  • Octopus can now export the repositories that will feed Automator when retrieving info.

Still in progress:

Once this is done, we should be able to easily replace that file with the full list of repositories. I'm starting with Gerrit only, since it seems to be the most important data source.

Some updates: there's a new tool in Metrics Grimoire named 'rremoval' [1] (Repository Removal Tool). It will help to remove the repositories that are not of interest for the analysis.

We still have to integrate these tools into Automator to automate the whole process.

[1] https://github.com/MetricsGrimoire/rremoval

Aklapper set Security to None.

Some extra steps:

  • Octopus now reliably retrieves information from Gerrit.
  • Automator is able to deal with the information coming from Octopus, removing the projects that are in the blacklist.
  • Rremoval now removes projects that are in the blacklist, as well as projects that are found in the database but no longer in the original list of projects (deprecated).
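
The filtering described in these steps can be sketched with plain set operations. This is only an illustration of the logic, not the actual code of Automator or rremoval. The function name `plan_repository_update` and the shape of its result are hypothetical.

```python
def plan_repository_update(upstream, blacklist, in_database):
    """Work out what to analyze and what to remove.

    upstream:    projects currently listed by Gerrit (as retrieved by a tool such as Octopus)
    blacklist:   manually maintained list of projects to exclude
    in_database: projects already present in the metrics database
    """
    upstream, blacklist, in_database = set(upstream), set(blacklist), set(in_database)
    return {
        # Everything Gerrit lists, minus the blacklisted projects.
        "analyze": sorted(upstream - blacklist),
        # Already stored, but now blacklisted.
        "remove_blacklisted": sorted(in_database & blacklist),
        # Already stored, but no longer listed by Gerrit (deprecated).
        "remove_deprecated": sorted(in_database - upstream - blacklist),
    }


if __name__ == "__main__":
    plan = plan_repository_update(
        upstream=["mediawiki/core", "phabricator", "new/repo"],
        blacklist=["phabricator"],
        in_database=["mediawiki/core", "phabricator", "old/repo"],
    )
    for key, repos in plan.items():
        print(key, repos)
```

Keeping the blacklisted and deprecated removals in separate lists makes it easier to see why a given project disappeared from the metrics.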

Work to be done:

  • Automator should 'git push' the file generated when querying Gerrit each time it runs, and 'git pull' your changes to the blacklist.

Aklapper renamed this task from Automated generation of repositories for Korma to Automated generation of (Gerrit) repositories for Korma. Aug 28 2015, 2:00 PM

And finally, Automator is now able to fetch and push changes to the git repository. So, as mentioned, it is just a matter of updating the blacklist to your specific needs; the rest of the info should be updated automatically by Automator (removal of blacklisted projects and deprecated ones).
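
For reference, a fetch-and-push cycle of this kind could look roughly like the Python sketch below. It is not Automator's actual implementation. The function name, the file name passed in, and the commit message are hypothetical, and the `run` parameter exists only so that the git calls can be replaced in a dry run.

```python
import subprocess
from pathlib import Path


def sync_repository_list(repo_dir, list_file, new_lines, run=subprocess.run):
    """Pull manual changes, rewrite the list file, and push it if it changed.

    Returns True when a new commit was pushed, False when nothing changed.
    """
    # Fetch manual edits (e.g. to the blacklist) before writing anything.
    run(["git", "pull", "--ff-only"], cwd=repo_dir, check=True)

    path = Path(repo_dir) / list_file
    new_text = "\n".join(new_lines) + "\n"
    old_text = path.read_text() if path.exists() else ""
    if new_text == old_text:
        return False  # avoid an empty commit

    path.write_text(new_text)
    run(["git", "add", list_file], cwd=repo_dir, check=True)
    # Hypothetical commit message.
    run(["git", "commit", "-m", "Update repository list"], cwd=repo_dir, check=True)
    run(["git", "push"], cwd=repo_dir, check=True)
    return True
```

Pulling before writing means manual edits to the blacklist are picked up first, and comparing the file contents avoids creating a commit when the list has not changed.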

Closing this task :).

As an example of this behaviour, the latest commit made by Automator in this respect is available at [1]. There you'll see that three new repositories were added compared with the previous version.

[1] https://github.com/Bitergia/mediawiki-repositories/commit/6201c98564e567603f8b65fe51a153790aad666a