Add wikimedia-research/* GitHub repos to Codesearch
Closed, Declined (Public)

Description

https://github.com/wikimedia-research/

Adding the repos in the above GitHub organisation would ease searching for uses of data collected via EventLogging/the Event Platform.

Note well that the repos are only available on GitHub.

Event Timeline

On a related note, are there plans to migrate from GitHub to Wikimedia GitLab?

We generally recommend not having code paths outside of the wikimedia/ org on GitHub either. For example, the GitHub org has rules that enforce 2FA for owners. If you have to, sure. But it's not recommended.

Legoktm changed the task status from Open to Stalled. Jan 21 2023, 7:08 PM
Legoktm subscribed.

There is no support for codesearch to auto-discover repositories in an org on GitHub because it would require using their proprietary API.

Your options are to either provide a manual list of all repositories you want indexed or move to Wikimedia GitLab/Gerrit.

We generally recommend not having code paths outside of the wikimedia/ org on GitHub either. For example, the GitHub org has rules that enforce 2FA for owners. If you have to, sure. But it's not recommended.

wikimedia-research also enforces 2FA for everyone in the org.

There is no support for codesearch to auto-discover repositories in an org on GitHub because it would require using their proprietary API.

I mean, we're swimming in a big ocean of proprietary API & services usage already. If what you mean is that between rate limiting, authentication, and engineering the hook to the API it's just not worth all that extra effort, then sure, I agree.

Your options are to either provide a manual list of all repositories you want indexed or move to Wikimedia GitLab/Gerrit.

For what it's worth it doesn't look like Codesearch indexes Wikimedia GitLab either.

Compare https://codesearch.wmcloud.org/search/?q=NamedHivePartitionSensor&i=nope&files=&excludeFiles=&repos= to https://gitlab.wikimedia.org/search?search=NamedHivePartitionSensor&search_code=true&repository_ref=main&group_id=189&project_id=93

On a related note, are there plans to migrate from GitHub to Wikimedia GitLab?

We have some repos using GitHub Pages. If Wikimedia GitLab ever gets Pages enabled we can consider migrating. According to Kate Chapman on Slack as of yesterday that's not on the roadmap.

@phuedx: At the moment you should probably just use GitHub's search e.g. https://github.com/search?o=desc&q=org%3Awikimedia-research+event&s=indexed&type=Code
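For anyone who wants to run that search for other terms, here is a minimal sketch that rebuilds the same kind of URL. The function name is made up for illustration, and the query parameters (`o`, `q`, `s`, `type`) are simply copied from the link above rather than taken from any GitHub documentation, so treat them as an assumption.

```python
from urllib.parse import urlencode


def github_org_code_search_url(org: str, term: str) -> str:
    """Build a GitHub code-search URL scoped to a single organisation.

    Hypothetical helper: the parameter names mirror the link quoted in
    this thread and are not guaranteed to stay stable on GitHub's side.
    """
    params = {
        "o": "desc",                 # sort order, as in the quoted link
        "q": f"org:{org} {term}",    # restrict the search to one org
        "s": "indexed",              # sort by most recently indexed
        "type": "Code",              # search code rather than repos/issues
    }
    # urlencode percent-encodes ":" and turns spaces into "+".
    return "https://github.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(github_org_code_search_url("wikimedia-research", "event"))
```

Calling it with `"wikimedia-research"` and `"event"` should give the link quoted above; swap in a schema or stream name to look for uses of a particular dataset.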

Should we just decline this ticket? I doubt it'll ever be resolved in a satisfactory way.

For what it's worth it doesn't look like Codesearch indexes Wikimedia GitLab either.

See T268196: Figure out the future of codesearch in a GitLab world for that...

We have some repos using GitHub Pages. If Wikimedia GitLab ever gets Pages enabled we can consider migrating. According to Kate Chapman on Slack as of yesterday that's not on the roadmap.

I'm not sure which "roadmap" this is about (and where to find it). I could not find a Phab ticket requesting enabling pages (or any discussion). Public links welcome!

There was some discussion about it in the original GitLab consultation but it seems when the results of the consultation were summarized that feature never made it to the roadmap.

We did explicitly ask to have private repos on GitLab a while back (T305082) and honestly that would actually be the single greatest motivator to switching to GL entirely, but there's been no movement on that.

There is no support for codesearch to auto-discover repositories in an org on GitHub because it would require using their proprietary API.

I mean, we're swimming in a big ocean of proprietary API & services usage already. If what you mean is that between rate limiting, authentication, and engineering the hook to the API it's just not worth all that extra effort, then sure, I agree.

codesearch was created as a replacement for proprietary services, has never used any proprietary APIs and never will.

For what it's worth it doesn't look like Codesearch indexes Wikimedia GitLab either.

It does.

Compare https://codesearch.wmcloud.org/search/?q=NamedHivePartitionSensor&i=nope&files=&excludeFiles=&repos= to https://gitlab.wikimedia.org/search?search=NamedHivePartitionSensor&search_code=true&repository_ref=main&group_id=189&project_id=93

Repositories are added upon request. I'm not aware of anyone asking for these repositories to be indexed.

codesearch was created as a replacement for proprietary services, has never used any proprietary APIs and never will.

I never said codesearch did.

Repositories are added upon request. I'm not aware of anyone asking for these repositories to be indexed.

Ah, on request. Got it. I see that Timo only suggested all GitLab repos also be indexed in Codesearch in T268196.

Anyway, there's nothing left to discuss on this ticket so I'm closing it.