
Sort out a process to add new GitHub repositories to wikimedia.biterg.io Git indexing
Open, Stalled, Lowest, Public

Description

...as I am afraid that this does not happen automagically.

This is about the "git" section in https://gitlab.com/Bitergia/c/Wikimedia/sources/raw/master/projects.json for entries that start with https://github.com/.

Similar to T218519.

Event Timeline

Aklapper lowered the priority of this task from Low to Lowest. (Mar 20 2019, 11:22 AM)
Aklapper moved this task from To triage to Apr-Jun 2019 on the Developer-Advocacy board.

There is nothing in the JSON output of https://api.github.com/orgs/wikimedia/repos that indicates whether a GitHub repository has been forked, but I excluded those forked repos in T186736. There is a lot of noise, so identifying new repos will not be easy.

Code, basically:

: > githubreposall.tmp
# Repeat this loop for every GitHub organization of interest.
org=wikimedia
page=1
while wget -q "https://api.github.com/orgs/$org/repos?page=$page&per_page=100" -O "github-$org-$page.json"; do
  # Stop at the first empty (or unreadable) page.
  [ "$(jq length "github-$org-$page.json" 2>/dev/null)" -gt 0 ] 2>/dev/null || break
  # Skip repos whose description mentions "mirror"/"Mirror"; a missing description counts as no mention.
  jq -r '.[] | select((.description // "") | contains("irror") | not).html_url' "github-$org-$page.json" >> githubreposall.tmp
  page=$((page + 1))
done
sort -u githubreposall.tmp > githubrepos.txt
curl -s https://gitlab.com/Bitergia/c/Wikimedia/sources/raw/master/projects.json | jq -r '.Wikimedia.git[]' | grep "https://github.com/" | sort -u > bitergiarepos.txt
echo "=== New GitHub repos to potentially add to Bitergia (if not Gerrit mirrors):"
comm -23 githubrepos.txt bitergiarepos.txt

Maybe it's way easier to just index any and all repos on GitHub under the organizations we're interested in.

Aklapper changed the task status from Open to Stalled. (Dec 12 2019, 3:08 AM)
Aklapper removed Aklapper as the assignee of this task.

Currently not actionable due to dependency tasks, hence removing it from my workboard.

There is nothing in the JSON output of https://api.github.com/orgs/wikimedia/repos that indicates whether a GitHub repository has been forked ...

Just wondering about that statement: can't the fork property, which is present for each repository, be used to determine whether the repository is a fork or not? 🤔
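A minimal sketch of that idea, assuming the boolean `fork` field in each entry of the repository list is good enough for this purpose. The function name `non_fork_urls` and the sample data are hypothetical and only serve as illustration. Note that this only filters out forks; repositories mirrored from Gerrit are not forks, so the description-based mirror check would still be a separate step.

```shell
# Hypothetical helper: reads one page of the repository list JSON on stdin
# and prints the html_url of every repository whose "fork" field is not true.
non_fork_urls() {
  jq -r '.[] | select(.fork | not) | .html_url'
}

# Example with made-up sample data (not real API output):
printf '%s' '[{"html_url":"https://github.com/example/a","fork":false},{"html_url":"https://github.com/example/b","fork":true}]' | non_fork_urls
```

With a real page, this could be fed as `wget -q -O - "https://api.github.com/orgs/wikimedia/repos?page=1&per_page=100" | non_fork_urls`.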

For the record (and for added complexity), T213246#6770339 has a list of repos mirrored from GitHub to Gerrit as of 2021-01-22...