
Gitlab Merge Requests count looks way too low
Open, Needs Triage | Public | Bug Report

Description

Looking at the wikimedia.biterg.io dashboard for Gitlab Merge Requests, over the last 90 days it shows only 71 merge requests, 15 submitters, and 6 repositories. That seems far too low. I guess the crawler is not indexing all the GitLab repositories?

Dashboard link: https://wikimedia.biterg.io/app/dashboards#/view/b2218fd0-bc11-11e8-8aac-ef7fd4d8cbad

Event Timeline

Aklapper renamed this task from "Biterg Gitlab Merge Requests count looks way too low" to "Gitlab Merge Requests count looks way too low". Edited May 15 2024, 9:51 AM
Aklapper changed the subtype of this task from "Task" to "Bug Report".

https://gitlab.com/Bitergia/c/Wikimedia/sources/blob/master/projects.json has not been updated since 2024-02-05 (not sure if that's still the canonical source to pull from for indexing though), and lists 71 repositories.

Regarding https://gitlab.com/Bitergia/c/Wikimedia/sources/blob/master/projects.json, according to https://support.bitergia.com/support/tickets/1251:

This repo was abandoned after February's migration. The new one isn't public. We are discussing whether we can update it on demand or commit to keeping the public ones updated.

I updated https://www.mediawiki.org/w/index.php?title=Community_metrics&diff=6556755&oldid=6425860 accordingly.

The current non-public config file indexes 100 code repositories which contain "gitlab" in their URLs.
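For anyone who wants to reproduce this kind of count against a copy of the config, here is a minimal sketch. It assumes the file follows the usual GrimoireLab-style projects.json layout (project name, then data source name, then a list of URL strings), which I have not verified for the non-public file. The function name `gitlab_repo_urls` and the sample data are hypothetical; only the releng/cli URL comes from this task.

```python
import json


def gitlab_repo_urls(projects):
    """Return the set of unique repository URLs containing "gitlab".

    Assumes a mapping of project -> data source -> list of URL strings.
    Anything that does not match that shape is skipped.
    """
    urls = set()
    for sources in projects.values():
        if not isinstance(sources, dict):
            continue
        for entries in sources.values():
            if not isinstance(entries, list):
                continue
            for entry in entries:
                if not isinstance(entry, str):
                    continue
                tokens = entry.split()
                # Assumption: if options follow the URL, the URL is the first token.
                if tokens and "gitlab" in tokens[0].lower():
                    urls.add(tokens[0])
    return urls


# Hypothetical sample data; the real file would be loaded with json.load().
sample = json.loads("""
{
  "Example project": {
    "git": [
      "https://gitlab.wikimedia.org/repos/releng/cli",
      "https://example.org/not-matching/repo.git"
    ],
    "gitlab:merge": [
      "https://gitlab.wikimedia.org/repos/releng/cli",
      "https://gitlab.example.org/group/other-repo"
    ]
  }
}
""")

print(len(gitlab_repo_urls(sample)))
```

Note that this deduplicates URLs across data sources, so a repository listed under both a code and a merge-request source is counted once. Whether the count of 100 above was computed the same way is not known.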

We need to go for a weekly view to find gaps, and while we have some recently, they look normal compared to the last 3 years
[...]
Most of Wikimedia Deutschland's MR activity in the last 3 years has been focused on https://gitlab.wikimedia.org/repos/releng/cli, so I've asked to double-check whether this repo has shown warnings in the collection logs.
If you have other suspects or observations that could help narrow the search for lost data, please let us know. You have more context [...] within the Wikimedia community and what to expect.

There is a separate ticket about indexing more GitLab repositories at https://support.bitergia.com/support/tickets/1274 which seems to be the actual culprit here.

I am reopening this task, which was filed on the assumption that the crawler is not indexing all the GitLab repositories; that is what ticket 1274 tracks now :)