...as I am afraid that this does not happen automagically.
This is about the "git" section in https://gitlab.com/Bitergia/c/Wikimedia/sources/raw/master/projects.json for entries that start with https://github.com/.
Similar to T218519.
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T218529 Investigate indexing Issues and PRs of Github repositories canonical on wikimedia.biterg.io
Stalled | | None | T218528 Sort out a process to add new Github repositories to wikimedia.biterg.io Git indexing
Resolved | | thcipriani | T237470 Create and maintain a list of organization repos that are maintained on Gerrit, GitHub, and Diffusion
There is nothing in the JSON output of https://api.github.com/orgs/wikimedia/repos that indicates whether a Github repository has been forked, but I excluded those forked repos in T186736. There is a lot of noise, so identifying new repos will not be easy.
Code, basically:
wget -q "https://api.github.com/orgs/wikimedia/repos?page=1&per_page=100" -O githubwmf01.json
cat githubwmf*.json | jq -r '.[] | select(.description|values | contains("irror") | not).html_url' > githubrepos.tmp
cat githubrepos.tmp >> githubreposall.tmp
# (repeat for all pages and all groups)
cat githubreposall.tmp | sort > githubrepos.txt
curl -s https://gitlab.com/Bitergia/c/Wikimedia/sources/raw/master/projects.json | jq -r '.Wikimedia.git[]' | grep "https://github.com/" | sort > bitergiarepos.txt
echo "=== New Github repos to potentially add to Bitergia (if not Gerrit mirrors):"
comm -23 githubrepos.txt bitergiarepos.txt
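The "repeat for all pages" step could be automated. Below is a minimal Python sketch (not the shell pipeline above) that requests page 1, 2, 3, ... of an organization's repository list until an empty page comes back. The function names `fetch_page` and `collect_repos` are hypothetical. The sketch assumes unauthenticated access is enough and that the API returns an empty list once the page number is past the last page.

```python
import json
import urllib.request

# Same endpoint and query parameters as the wget call; {org} and {page} are filled in per request.
API = "https://api.github.com/orgs/{org}/repos?page={page}&per_page=100"


def fetch_page(org, page):
    """Download one page of an organization's repository list (hypothetical helper)."""
    with urllib.request.urlopen(API.format(org=org, page=page)) as resp:
        return json.load(resp)


def collect_repos(org, fetch=fetch_page):
    """Collect repositories from page 1 onwards until an empty page is returned.

    `fetch` can be swapped for a stub, so the loop can be tried without network access.
    """
    repos = []
    page = 1
    while True:
        batch = fetch(org, page)
        if not batch:  # assumption: an empty list marks the end of the listing
            break
        repos.extend(batch)
        page += 1
    return repos
```

Calling `collect_repos("wikimedia")` would then cover one group; the other groups would need one call each.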
Maybe it's way easier to just index any and all repos on Github under the groups we're interested in.
There is nothing in the JSON output of https://api.github.com/orgs/wikimedia/repos that indicates whether a Github repository has been forked ...
Just wondering about that statement, can't the fork property, which is present for each repository, be used to determine whether the repository is a fork or not? 🤔
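To illustrate the suggestion, here is a minimal Python sketch that filters an already downloaded repository list on the boolean `fork` property, and also skips repos whose description contains "irror". The function name `candidate_urls` and the sample entries are hypothetical. Unlike a filter that requires a description, this sketch keeps repositories that have none; that is a choice made here, not something the thread settles.

```python
def candidate_urls(repos):
    """Return sorted html_url values for repos that are neither forks nor described as mirrors."""
    urls = []
    for repo in repos:
        if repo.get("fork"):  # `fork` is true for forked repositories
            continue
        description = repo.get("description") or ""  # description may be null
        if "irror" in description:  # matches both "Mirror" and "mirror"
            continue
        urls.append(repo["html_url"])
    return sorted(urls)


# Hypothetical sample data shaped like entries of the repository list.
sample = [
    {"html_url": "https://github.com/example-org/tool", "fork": False, "description": "A tool"},
    {"html_url": "https://github.com/example-org/forked-lib", "fork": True, "description": "Some library"},
    {"html_url": "https://github.com/example-org/core", "fork": False, "description": "Github mirror of core"},
    {"html_url": "https://github.com/example-org/nodesc", "fork": False, "description": None},
]

for url in candidate_urls(sample):
    print(url)
```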
For the record (and complexity), T213246#6770339 has a list of repos mirrored from GitHub to Gerrit as of 20210122...