Page MenuHomePhabricator

Automatically close Pull Requests in repos mirrored on Github
Closed, ResolvedPublic

Description

Situation

Github currently does not allow disabling creation of Pull Requests. And some people

Hence there are PRs by new contributors rotting in Github: Authors do not know that noone will ever notice these PRs, and maintainers/stewards don't know about these PRs.

I cannot quantify how many patches have been ignored so far, as I have no idea how to make the search query identify and ignore canonical repos. See T237470: Create and maintain somehow a list of repos mastered in GitHub (and in Phabricator Diffusion) and T109939: For mirrored GitHub repositories, actually give the canonical source Gerrit URL in the repo description.

Per-repo options

To quote Thomas from 2016,

One way to deal with this is to add a pull request template to github telling the user that this is not the correct place for submitting patches and a link to the relevant bugzilla page.
The way to add a pull request template is by creating a file .github/PULL_REQUEST_TEMPLATE in the repository. As the repos on github are just mirrors such a file would have to go into upstream git.

There are also CONTRIBUTING.md files, of course.
But all of this is on a per-repo level which is cumbersome with thousands of repos.

Proposal

  • Have a bot which automatically closes pull requests and tells people what to do.
  • Make the message include similar stuff which is posted on Gerrit patches by Gerrit-newcomer-bot
  • Next-level bonus points for somehow getting these patches into Gerrit when closing the Github PR. (Gerrit Patch Uploader, API, etc)
Implementation

See an example at https://github.com/GNOME/gimp/pull/24#issuecomment-578557470
According to https://wiki.gnome.org/Sysadmin/GitHub ,

Behind the automatic closures of Pull Requests is the close_pull_requests.py script that runs daily at midnight. The script was originally announced on the desktop-devel mailing list in November 2016.

Event Timeline

Aklapper created this task.Apr 8 2020, 11:12 AM

Here's what I'm thinking:

  • A GitHub app can provide more instant feedback than a daily script
  • The logic could look something like this: a PR is created -> app living on toolforge is triggered via a webhook -> app checks if .gitreview exists in repo (to not act in non-mirrors) -> (app creates a patch in Gerrit?) -> app closes PR with a helpful message

I can implement this if needed.

@Majavah: Yay, thanks for your interest and for sharing your knowledge about a better approach! Help is definitely welcome as I am not a developer and don't use Github very often. :)

In my understanding the problem to solve before this one is to find a reliable way how to identify those repositories on Github which are *not* mirrored from Wikimedia Gerrit (random example: https://github.com/wikimedia/wikipedia-ios/ ). T109939 and T237470 basically.
We don't want to interrupt the workflow of those developers who use Github instead of Wikimedia Gerrit. :-/

Majavah claimed this task.Apr 8 2020, 3:10 PM
Majavah moved this task from ❔ Unsorted to 📝 Backlog on the User-Majavah board.

In my understanding the problem to solve before this one is to find a reliable way how to identify those repositories on Github which are *not* mirrored from Wikimedia Gerrit (random example: https://github.com/wikimedia/wikipedia-ios/ ). T109939 and T237470 basically.
We don't want to interrupt the workflow of those developers who use Github instead of Wikimedia Gerrit. :-/

I believe this can be achieved very simply by just checking if a file named .gitreview is present in the repository root - that file is used by git-review which is the tool used for submitting patches to Gerrit.

Please not that initial setup will require someone with admin access to the organization to create and install the app. The app itself after that can be limited to only being able to modify pull requests.

Process update: it's working on my test repository!

The code is at rLTPRC tool-github-pr-closer. I'm not a writer, so I'd appreciate if someone could help with writing the closing message.

ping @Aklapper

Reedy added a subscriber: Reedy.EditedApr 9 2020, 12:04 PM

While it’s very much not a common thing, I think we might still have some repos that are canonically on phabricator/diffusion which use an .arcconfig (off the top of my head)

We should check with Release-Engineering-Team about that

Regarding the message, it should definitely link to one of the gerrit getting starred guides etc. Can definitely find people to help write that (we have a few lovely tech writers who I'm sure would be more than happy to help)

Can I ask where you found the basis for your instructions?

Go to organization settings -> github apps -> new github apps

That doesn't seem to be right in our org:

And on "Installed GitHub Apps", there is no "New" type button (AFAIK they want you to install them via https://github.com/marketplace)

And the items on the WebHook page don't match the options you're saying to set

Thanks!

Aklapper added a comment.EditedApr 9 2020, 12:25 PM

I'd probably change message_template.md to something like this:

:wave: Hi @{{author}},
	
Thank you for contributing to Wikimedia! This Github repository is a read-only mirror.
The project uses [Gerrit](https://gerrit.wikimedia.org) for code review.
Please [submit your pull request to Gerrit](https://www.mediawiki.org/wiki/Gerrit/Tutorial), so your code changes can be reviewed.

If you are interested in sticking around within the Wikimedia community, please take a look at https://www.mediawiki.org/wiki/New_Developers
Thanks in advance!

But let me ping a technical writer first! :)

I'd probably change message_template.md to something like this:

[...]

This sounds great, thanks! I think that at least in the beginning it would be nice to have some "come yell here if this is broken" link, do you think this should have its own project here or should it just eg. link to this task?

Reedy added a comment.Apr 9 2020, 12:40 PM

Is there a way we can run this on demand/interactively/similar? There's potentially some PRs in some less viewed github repos that haven't already been manually closed, and would be nice to get back to zero on them

Is there a way we can run this on demand/interactively/similar? There's potentially some PRs in some less viewed github repos that haven't already been manually closed, and would be nice to get back to zero on them

This could be run as an one-off script: running a search for is:pr is:open org:wikimedia currently yields 332 results, which is an amount that can be just run in a go (with some delays to stay within rate limits).

Reedy added a comment.Apr 9 2020, 12:59 PM

Is there a way we can run this on demand/interactively/similar? There's potentially some PRs in some less viewed github repos that haven't already been manually closed, and would be nice to get back to zero on them

This could be run as an one-off script: running a search for is:pr is:open org:wikimedia currently yields 332 results, which is an amount that can be just run in a go (with some delays to stay within rate limits).

Cool! I note some repos do use github as their canonical repo, so not all of those open PR will want closing, but as the script will be following the same logic to leave comments and close PR, this shouldn't be an issue :)

Majavah added a comment.EditedApr 9 2020, 1:02 PM

Can I ask where you found the basis for your instructions?
[...]
That doesn't seem to be right in our org:

For me "GitHub apps" is under "Developer settings", could you check again?

Also it looks like it's not possible to give permission to only 2 different files, it's one file or all files. That means that this will require to have read-only contents permission, does this cause any issues?

Reedy added a comment.Apr 9 2020, 2:43 PM

Can I ask where you found the basis for your instructions?
[...]
That doesn't seem to be right in our org:

For me "GitHub apps" is under "Developer settings", could you check again?

Might be worth clarifying in your documentation a little, it wasn't immediately obvious :)

Also it looks like it's not possible to give permission to only 2 different files, it's one file or all files. That means that this will require to have read-only contents permission, does this cause any issues?

I honestly don't think so. All our repos are public, so the bot/script isn't going to be able to access anything that anyone (the script or otherwise) couldn't just access directly via the web etc, or by cloning the repo.

If we had private repos (with for example passwords), I could understand there being a need to have a restriction. But for our use case, I don't see this being an issue at all

Reedy added a comment.Apr 9 2020, 3:03 PM

While it’s very much not a common thing, I think we might still have some repos that are canonically on phabricator/diffusion which use an .arcconfig (off the top of my head)

We should check with Release-Engineering-Team about that

T191182: Stop using Differential for code review - It's definitely deprecated

While it’s very much not a common thing, I think we might still have some repos that are canonically on phabricator/diffusion which use an .arcconfig (off the top of my head)

We should check with Release-Engineering-Team about that

T191182: Stop using Differential for code review - It's definitely deprecated

If support for Diffusion is not added the only downside is that PRs in repositories hosted there will not get automally closed - so not false positives here, just false negatives. Supporting adds an extra API call to GitHub, so any objections if that's not supported (initially, at least)?

Can I ask where you found the basis for your instructions?
[...]
That doesn't seem to be right in our org:

For me "GitHub apps" is under "Developer settings", could you check again?

Might be worth clarifying in your documentation a little, it wasn't immediately obvious :)

{{done}}

@Aklapper the text of that message looks good!

Thanks @srodlund. @Reedy can you do the configuration on GitHub side or should I contact someone else?

I created the script for closing down existing pull requests. AFAIK this is now ready for deployment.

Reedy added a comment.Apr 11 2020, 4:38 PM

Thanks!

As always, some code review is a good idea

I note https://phabricator.wikimedia.org/source/tool-github-pr-closer/browse/master/close-old-prs.py seems to have a lot of duplicated code compared to https://phabricator.wikimedia.org/source/tool-github-pr-closer/browse/master/github.py

If you've structured the code like you have to have functions in github.py, why not make use of them in close-old-prs.py? I know the code isn't all exactly the same, but it's mostly the same, doing the same token requests etc

Thanks!

As always, some code review is a good idea

[...]

`Thanks, I dryed the code. Anything else?

@Aklapper @Reedy What would be the next steps in getting this deployed?

Majavah triaged this task as Medium priority.Apr 13 2020, 8:10 AM
Majavah moved this task from Backlog to config on the Wikimedia-GitHub board.Apr 14 2020, 11:19 AM

This sounds great, thanks! I think that at least in the beginning it would be nice to have some "come yell here if this is broken" link, do you think this should have its own project here or should it just eg. link to this task?

Hmm, for now I'd say link to this task should be fine if as I cannot imagine too many requests? I might be wrong in hindsight. :) If you want a project that's fine to of course, see https://www.mediawiki.org/wiki/Phabricator/Creating_and_renaming_projects :)

Aklapper added a subscriber: thcipriani.

@Aklapper @Reedy What would be the next steps in getting this deployed?

As I have no clue about Github, I'd love Release-Engineering-Team to take a look at this. (Or maybe even @thcipriani as I just stumbled across T245826?) :)

I don't mind setting it up

Webhook: active=true, url=https://(base-url)/github, secret=(generate random string here)

What's the URL for the bot/tool?

GitHub app name, Description, Homepage URL: these are displayed to the user, so please put something helpful here. app name is used for when naming the bot account.

We should write a description and decide what URL to use for the homepage

App name? "Wikimedia PR Closer"

Or keep something more generic?

@Aklapper have we filed a task for doing something similar for issues?

I understand the app Name will be the bot name, so rather than "Wikimedia PR Closer" I would prefer something more user-friendly, such as "Wikimedia Welcome Bot for GitHub users"

Well, it's not really welcoming anyone ;)

"Wikimedia Helper Bot for GitHub users"

Heh, calling it "Welcome Bot" sounded nicer, even if it would still be saying go away from GitHub ;)

T249703#6042877 seemed like a welcome to me, but "Wikimedia Helper Bot for GitHub users" is good to go, too.

bd808 added a subscriber: bd808.EditedApr 20 2020, 12:37 AM

Webhook: active=true, url=https://(base-url)/github, secret=(generate random string here)

What's the URL for the bot/tool?

The webhook on the Toolforge side should probably use the new --canonical flag to webservice and its *.toolforge.org hostname so that the GitHub side of the config doesn't need to be changed once the toolforge.org migration reaches the forced migration stage.

https://github-pr-closer.toolforge.org/

(edit: forgot to add the URL)

Majavah added a comment.EditedApr 20 2020, 5:22 AM

I don't mind setting it up

Great, thanks!

Webhook: active=true, url=https://(base-url)/github, secret=(generate random string here)

What's the URL for the bot/tool?

Webhook URL should be https://github-pr-closer.toolforge.org/github

GitHub app name, Description, Homepage URL: these are displayed to the user, so please put something helpful here. app name is used for when naming the bot account.

We should write a description and decide what URL to use for the homepage

App name? "Wikimedia PR Closer"

Or keep something more generic?

+1 for naming it "Wikimedia Helper Bot for GitHub users". How about using https://www.mediawiki.org/wiki/Gerrit as the homepage URL?

have we filed a task for doing something similar for issues?

Issues can easily be disabled from repository settings.

Reedy added a comment.May 9 2020, 4:03 PM

Wikimedia Helper Bot for GitHub users

is too long, so going with

Wikimedia Helper Bot for GitHub

Reedy closed this task as Resolved.May 9 2020, 5:01 PM

Big Thanks to everybody who worked on this (especially Majavah for diving into this with a speed and a passion that I unfortunately could not always follow and support myself because too many things on the to-do list).
@Majavah: I really appreciate that you made this happen and I owe you a drink once there are physical Hackathons and conferences again.

@Majavah: I read this entire ticket again wondering which logic identifies (and excludes) repos which are canonical on Github and not mirrored to Github.
Asking because we've been wondering in T109939 how to identify them, so any knowledge sharing is welcome in T109939! :)

@Majavah: I read this entire ticket again wondering which logic identifies (and excludes) repos which are canonical on Github and not mirrored to Github.
Asking because we've been wondering in T109939 how to identify them, so any knowledge sharing is welcome in T109939! :)

It looks for .gitreview and possibly .arcreview I think

@Majavah: I read this entire ticket again wondering which logic identifies (and excludes) repos which are canonical on Github and not mirrored to Github.
Asking because we've been wondering in T109939 how to identify them, so any knowledge sharing is welcome in T109939! :)

It looks for .gitreview and possibly .arcreview I think

This tool only looks for .gitreview due to T191182: Stop using Differential for code review. That file looks like this:

[gerrit]
host=gerrit.wikimedia.org
port=29418
project=wikimedia/meet-accountmanager.git
track=1
defaultrebase=0

where project is the repository name in Gerrit with a .git suffix. I'm happy to write a script for T109939 if you can decide the format and landing page.

Tgr added a subscriber: Tgr.May 22 2020, 11:32 AM

It would be nice to link to or at least name the Gerrit repository name in the closing message. (It would be even cooler to provide specific instructions on how to submit the pull request to Gerrit, not sure how feasible that is though.)

This tool only looks for .gitreview

Ah, thanks! :) Alright, that means checking for existence of a file, so looks like we cannot use that approach in T109939 because that's not API metadata...

hashar added a subscriber: hashar.Sep 30 2020, 8:13 AM

I am slightly hijacking this closed task. We got a notification from Github warning about the removal of the events integration_installation and integration_installation_repositories:

We're contacting you regarding upcoming changes to GitHub Apps owned by your organization:

  • Wikimedia Helper Bot for GitHub

We no longer support two events which your GitHub Apps may rely on, integration_installation and integration_installation_repositories.

These events can be replaced with the installation and installation_repositories events respectively.

The integration_installation and integration_installation_repositories events will be removed after October 1st, 2020.

Please visit https://developer.github.com/changes/2020-04-15-replacing-the-installation-and-installation-repositories-events for more information about suggested changes, brownouts, and removal dates.

From a quick look at the code, nothing matches integration so maybe we are not affected?

Guess I should have done a search first. thank you!