Connect WikiBugs IRC bot to Wikimedia GitLab
Open, Medium, Public, 3 Estimated Story Points

Description

Otherwise we won't see GitLab activity in IRC.

See https://www.mediawiki.org/wiki/GitLab

Event Timeline

Oooh, yes, that's going to be fun. A few thoughts.

https://gitlab.wikimedia.org/api/v4/events?target_type=merge_request seems to be pretty much what we need, except that the after={date} filter only accepts a date, not a date and time, so it's not particularly convenient for polling. We may have to go the inverse route and have GitLab push to us (via either emails or webhooks).

Unfortunately it seems global webhooks are only available in the GitLab premium tiers:

(screenshot: image.png)

Some thoughts on architecture when using a webhook

  • We could use a tiny PHP script to receive the webhook and push it into Redis (which means we don't have to deal with uWSGI deployment etc). This should be so simple that it doesn't really need any tests.
  • The 'new grrrrit' then reads events from Redis (or a mock Redis for testing), processes them, and pushes them into the IRC Redis queue.
  • The existing IRC component then handles the messaging.
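
To make the three steps above concrete, here is a rough Python sketch of the flow. Everything in it is an assumption for illustration: the queue names, the `FakeRedis` stand-in (which only mimics the `lpush`/`rpop` list commands), and the shape of the 'raw' message are hypothetical and not taken from the wikibugs code.

```python
import json
from collections import deque

# Hypothetical queue names; the real wikibugs keys may differ.
GITLAB_QUEUE = "gitlab-events"
IRC_QUEUE = "irc-messages"


class FakeRedis:
    """In-memory stand-in exposing the two list commands used below."""

    def __init__(self):
        self._lists = {}

    def lpush(self, name, value):
        self._lists.setdefault(name, deque()).appendleft(value)

    def rpop(self, name):
        queue = self._lists.get(name)
        return queue.pop() if queue else None


def receive_webhook(conn, body):
    """Step 1: store the raw webhook body without interpreting it."""
    conn.lpush(GITLAB_QUEUE, body)


def process_pending(conn):
    """Step 2: turn queued events into 'raw' messages for the IRC component."""
    handled = 0
    while (body := conn.rpop(GITLAB_QUEUE)) is not None:
        event = json.loads(body)
        attrs = event.get("object_attributes", {})
        text = "{}: {}".format(event.get("object_kind", "event"), attrs.get("title", ""))
        # Hypothetical message shape; step 3 (the IRC component) would consume it.
        conn.lpush(IRC_QUEUE, json.dumps({"raw": True, "msg": text}))
        handled += 1
    return handled
```

Keeping the receiver this dumb is what makes the "no tests needed" argument plausible: all interpretation of the payload happens in the second step, which can be tested against the fake queue.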

We should probably move handle_useful_info from redis2irc to wikibugs.py (and make the 'raw' message type the only message type), but that doesn't really block anything here.

Change 710665 had a related patch set uploaded (by Merlijn van Deen; author: Merlijn van Deen):

[labs/tools/wikibugs2@master] Add GitLab to Redis webhook

https://gerrit.wikimedia.org/r/710665

A few further notes after fiddling with a fresh GitLab instance.

There are three main ways to get push events from GitLab:

  • Webhooks
  • Integrations
  • System hooks

Webhooks can only be enabled at the project level or (if you pay ;-)) the group level. Alternatively, you can enable them per project using a big for loop: https://docs.gitlab.com/ee/raketasks/web_hooks.html

System hooks are always global. These do give notifications on merge requests, but not when comments are left, so they are not useful for our use case. In addition, there's a privacy issue, as the system hook also reports e.g. failed logins.

_Integrations_ can be enabled globally, webhooks cannot (?!). There are a few existing ones that use a webhook: Teams/Google Chat/Webex/Mattermost/Discord/Slack/Unify Circuit. However:

  1. that will block per-project use of those integrations.
  2. all the integrations send preformatted content....

e.g.

Unify:

{u'markdown': True,
           u'subject': u'Merlijn van Deen / My awesome project',
           u'text': u'Merlijn van Deen (MerlijnvanDeen) opened merge request [!2 *Update TestFile*](https://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project/-/merge_requests/2) in [Merlijn van Deen / My awesome project](https://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project)

Discord:

{u'avatar_url': None,
           u'content': u'',
           u'embeds': [{u'author': {u'icon_url': None,
                                    u'name': u'MerlijnvanDeen',
                                    u'url': None},
                        u'color': None,
                        u'description': u'Merlijn van Deen (MerlijnvanDeen) opened merge request [!2 *Update TestFile*](https://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project/-/merge_requests/2) in [Merlijn van Deen / My awesome project](https://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project)\n',
                        u'fields': [],
                        u'footer': None,
                        u'image': None,
                        u'provider': None,
                        u'thumbnail': None,
                        u'timestamp': None,
                        u'title': None,
                        u'url': None,
                        u'video': None}],
           u'tts': False,
           u'username': None}

Mattermost:

u'payload={"username":"","fallback":"Merlijn van Deen (MerlijnvanDeen) opened merge request \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project/-/merge_requests/2|!2 *Update TestFile*\\u003e in \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project|Merlijn van Deen / My awesome project\\u003e","text":"Merlijn van Deen (MerlijnvanDeen) opened merge request \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project/-/merge_requests/2|!2 *Update TestFile*\\u003e in \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project|Merlijn van Deen / My awesome project\\u003e"}'

Slack:

{u'data': u'payload={"username":"","fallback":"Merlijn van Deen (MerlijnvanDeen) opened merge request \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project/-/merge_requests/2|!2 *Update TestFile*\\u003e in \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project|Merlijn van Deen / My awesome project\\u003e","text":"Merlijn van Deen (MerlijnvanDeen) opened merge request \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project/-/merge_requests/2|!2 *Update TestFile*\\u003e in \\u003chttps://gitlab.wikimedia.org/MerlijnvanDeen/my-awesome-project|Merlijn van Deen / My awesome project\\u003e"}'
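
The Mattermost and Slack dumps above both start with `payload=`, which suggests a form-encoded body carrying a JSON document. Assuming that is what goes over the wire (the dumps don't show the content type), a receiver could unpack it roughly like this; `decode_chat_payload` is a made-up helper name.

```python
import json
from urllib.parse import parse_qs


def decode_chat_payload(body):
    """Return the JSON object carried in a form-encoded 'payload' field."""
    # Assumption: the body is application/x-www-form-urlencoded.
    fields = parse_qs(body, keep_blank_values=True)
    return json.loads(fields["payload"][0])
```

Even after decoding, the result only contains the preformatted `text`/`fallback` strings, so this does not get around the lack of structured event data.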

Teams/Google Chat/Webex don't seem to work.

There is also an 'irker' (basic TCP-to-IRC gateway) integration, but it seems to ignore everything that doesn't go to master. So it's OK for a stream of merges, but not so useful for providing info on reviews etc. Plus it's not very information-dense...

{"to":["irc://arctus.nl/#chan"],"privmsg":"[\u000304My awesome project\u000f] Administrator pushed \u00021\u000f new commit to \u000305master\u000f: \u000302\u001fhttp://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project/-/compare/47d73eae...47d73eae\u000f"}
{"to":["irc://arctus.nl/#chan"],"privmsg":"\u000304My awesome project\u000f/\u000305master\u000f \u00031447d73eae\u000f GitLab (\u0003126 files\u000f): Initialized from '.NET Core' project template"}
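
For reading those irker lines, the `\u0003`, `\u0002`, `\u001f` and `\u000f` escapes are IRC formatting codes (colour, bold, underline, reset). A small sketch for stripping them, purely to make the sample output legible; `plain_privmsg` is a hypothetical helper, not part of irker.

```python
import json
import re

# mIRC-style control codes: \x03 colour (optional fg[,bg] digits),
# \x02 bold, \x1f underline, \x0f reset.
IRC_FORMATTING = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?|[\x02\x0f\x1f]")


def plain_privmsg(line):
    """Parse one irker JSON line and return its message without formatting."""
    return IRC_FORMATTING.sub("", json.loads(line)["privmsg"])
```

Applied to the second sample line, this leaves the project, branch, short commit hash, file count and commit message, which is about all the information irker provides.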

With a little bit of hacking, I was able to make the Mattermost integration act like a regular webhook:

{
  "username": "",
  "object_kind": "merge_request",
  "event_type": "merge_request",
  "user": {
...
  },
  "project": {
...
  },
  "object_attributes": {
    "assignee_id": null,
    "author_id": 1,
    "created_at": "2021-08-07 17:07:26 UTC",
    "description": "",
    "head_pipeline_id": 2,
    "id": 2,
    "iid": 2,
    "last_edited_at": "2021-08-07 19:05:37 UTC",
    "last_edited_by_id": 1,
    "merge_commit_sha": null,
    "merge_error": null,
    "merge_params": {
      "force_remove_source_branch": "1"
    },
    "merge_status": "can_be_merged",
    "merge_user_id": null,
    "merge_when_pipeline_succeeds": false,
    "milestone_id": null,
    "source_branch": "webhook",
    "source_project_id": 2,
    "state_id": 1,
    "target_branch": "master",
    "target_project_id": 2,
    "time_estimate": 0,
    "title": "Update Program.cs",
    "updated_at": "2021-08-07 19:05:37 UTC",
    "updated_by_id": 1,
    "url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project/-/merge_requests/2",
    "source": {
      "id": 2,
      "name": "My awesome project",
      "description": null,
      "web_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project",
      "avatar_url": null,
      "git_ssh_url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
      "git_http_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project.git",
      "namespace": "My Awesome Group",
      "visibility_level": 0,
      "path_with_namespace": "my-awesome-group/my-awesome-project",
      "default_branch": "master",
      "ci_config_path": null,
      "homepage": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project",
      "url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
      "ssh_url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
      "http_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project.git"
    },
    "target": {
      "id": 2,
      "name": "My awesome project",
      "description": null,
      "web_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project",
      "avatar_url": null,
      "git_ssh_url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
      "git_http_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project.git",
      "namespace": "My Awesome Group",
      "visibility_level": 0,
      "path_with_namespace": "my-awesome-group/my-awesome-project",
      "default_branch": "master",
      "ci_config_path": null,
      "homepage": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project",
      "url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
      "ssh_url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
      "http_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project.git"
    },
    "last_commit": {
      "id": "4da1b25be1f69639d8e7d085aff7aa8dda02f858",
      "message": "Update Program.cs",
      "title": "Update Program.cs",
      "timestamp": "2021-08-07T17:07:19+00:00",
      "url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project/-/commit/4da1b25be1f69639d8e7d085aff7aa8dda02f858",
      "author": {
        "name": "Administrator",
        "email": "admin@example.com"
      }
    },
    "work_in_progress": false,
    "total_time_spent": 0,
    "time_change": 0,
    "human_total_time_spent": null,
    "human_time_change": null,
    "human_time_estimate": null,
    "assignee_ids": [],
    "state": "opened",
    "action": "update"
  },
  "labels": [],
  "changes": {
    "last_edited_at": {
      "previous": "2021-08-07 18:03:43 UTC",
      "current": "2021-08-07 19:05:37 UTC"
    },
    "title": {
      "previous": "Draft: Update Program.cs",
      "current": "Update Program.cs"
    },
    "updated_at": {
      "previous": "2021-08-07 19:05:11 UTC",
      "current": "2021-08-07 19:05:37 UTC"
    }
  },
  "repository": {
    "name": "My awesome project",
    "url": "git@valhallasw-gitlab-test.wmflabs.org:my-awesome-group/my-awesome-project.git",
    "description": null,
    "homepage": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project"
  },
  "project_url": "http://valhallasw-gitlab-test.wmflabs.org/my-awesome-group/my-awesome-project",
  "project_name": "My Awesome Group / My awesome project"
}

Specifically, this required the following changes:

In any case, this proves that, yes, an integration does have access to the full event stream. However, given that group-level webhooks are only available in EE, I don't know if GitLab will be interested in getting a global-level webhook integration back (it seems to have been moved recently from integrations to a separate webhook component).
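
With a structured payload like the merge request event above, building the IRC line becomes straightforward. A rough sketch, using only fields that appear in that sample; the output format and the function name `describe_merge_request` are made up for illustration.

```python
def describe_merge_request(event):
    """Build a one-line summary from a merge_request webhook payload."""
    attrs = event["object_attributes"]
    summary = "{project}: merge request !{iid} {action} ({state}): {title} {url}".format(
        project=event.get("project_name", "unknown project"),
        iid=attrs["iid"],
        action=attrs.get("action", "changed"),
        state=attrs["state"],
        title=attrs["title"],
        url=attrs["url"],
    )
    # The payload lists changed fields with their previous and current values.
    title_change = event.get("changes", {}).get("title")
    if title_change:
        summary += " (was: {})".format(title_change["previous"])
    return summary
```

The `changes` block is what the preformatted chat integrations don't expose, and it is what would let the bot report things like a merge request leaving draft status.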

@Legoktm mentioned that Debian's GitLab instance (Salsa) has some IRC integration as well. It seems to be the built-in Irker plus something based on webhooks & https://salsa.debian.org/kgb-team/kgb/-/blob/master/script/kgb-bot. This is something we could also deploy in principle (with wikibugs just handling Phab, and KGB handling merge requests etc).

The issue of per-project vs global config does remain.

If we go the per-project route, I think we should go for Irker/KGB - just let repo owners figure out their needs/config. If we go for a global route I think we need to look into building a global Integration that is effectively just publishing a webhook.

... or we can poll /events once a minute and keep track of the last seen event ourselves. Not the most efficient option, but nice in terms of security and lack of global permissions.
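
A sketch of what "keep track of the last seen event ourselves" could look like. It assumes each event carries a numeric `id` that increases over time, which I have not verified; `fetch` is a caller-supplied function that performs the actual GET on /events, and pagination is ignored here.

```python
def unseen_events(events, last_seen_id):
    """Return events newer than last_seen_id, oldest first, plus the new marker."""
    fresh = sorted(
        (event for event in events if event["id"] > last_seen_id),
        key=lambda event: event["id"],
    )
    new_marker = fresh[-1]["id"] if fresh else last_seen_id
    return fresh, new_marker


def poll_once(fetch, last_seen_id, since_date):
    """One polling round; `fetch` is a caller-supplied function doing the HTTP GET."""
    # `after` filters by date only, so each poll re-reads a window of events
    # and the ID filter above discards what was already reported.
    events = fetch({"target_type": "merge_request", "after": since_date})
    return unseen_events(events, last_seen_id)
```

The marker would need to be persisted between runs (e.g. in Redis) so that a restart neither repeats nor drops events.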

Sorry, still catching up. I do think we want wikibugs to be globally watching all projects (maybe ignore user-owned projects by default?).

I think we need to look into building a global Integration that is effectively just publishing a webhook

by "building" do you mean patching the GitLab code?

... or we can poll /events once a minute and keep track of the last seen event ourselves. Not the most efficient option, but nice in terms of security and lack of global permissions.

Well, that is how the Phabricator reporter works :), though it polls every second.

Also I'm not sure if we're going to run into the problem again that production services cannot talk to WMCS/Toolforge (only goes in the other direction). But I assume a hole is being opened for CI purposes, so we could piggyback on that.

by "building" do you mean patching the GitLab code?

Effectively, yes. The different integrations are reasonably decoupled although it's not exactly a 'plugin architecture'.

Well, that is how the Phabricator reporter works :), though it polls every second.

Sort of. The main difference is that Phabricator keeps track of the events for us, so there's no duplication / no risk of missing events.

I poked around the /events endpoint a bit to see how pagination is handled... and learned a few annoying things:

:-(

:S

Do you have a patch, or an idea of how many changes would be needed to add the integration? That does seem like the best way forward. cc'ing @brennen since I'm not really sure what our capacity for local patches is.

I'm also curious how the GitLab-->Phabricator bot is going to work, since presumably that also needs to watch all projects.

It's not exactly a patch, but...

Specifically, this required the following changes:

In any case, this proves that, yes, an integration does have access to the full event stream. However, given that group-level webhooks are only available in EE, I don't know if GitLab will be interested in getting a global-level webhook integration back (it seems to have been moved recently from integrations to a separate webhook component).

That's not a full implementation yet, though, as it doesn't add a new webhook (it just patches an existing one). I wasn't able to figure out where exactly the list of webhooks gets instantiated (and was testing things by hacking a 'production install' rather than a dev version; I'll play around with the Gitpod GitLab environment at some point to see if that makes life easier).

I think my main concern would be the handling of confidential issues/commits/repositories; implementing the hook correctly would require a good understanding of how the rights system in GitLab works (...and good tracking of any changes therein).

Are we even allowing private repos? Gerrit didn't, which is why we could avoid implementing rights checking even though stream-events gives out non-private events. And issues are going to be globally disabled AIUI.

The approach of having a bot iterate over all projects and configure the webhook individually doesn't sound that bad tbh.
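
Roughly, such a bot would issue one POST /projects/:id/hooks call per project. The sketch below only builds the request descriptions and sends nothing; the receiver URL and secret are placeholders, and the field names are the ones I remember from the GitLab project hooks API, so they should be checked against the docs for the deployed version.

```python
def hook_requests(project_ids, receiver_url, secret):
    """Return (path, form_fields) pairs for POST /projects/:id/hooks calls."""
    fields = {
        "url": receiver_url,
        "token": secret,
        "push_events": False,
        "merge_requests_events": True,
        "note_events": True,
    }
    # Copy the dict per project so callers can adjust one request safely.
    return [("/projects/{}/hooks".format(pid), dict(fields)) for pid in project_ids]
```

A real bot would also need to page through the project list, skip projects that already have the hook, and rerun periodically to pick up new projects.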

Are we even allowing private repos?

Probably not very many of them, and not by default, but I think there are probably a handful of use cases for exceptions.

thcipriani triaged this task as Medium priority.
thcipriani set the point value for this task to 3.
thcipriani added subscribers: mmodell, thcipriani.

Assigning @mmodell to help shepherd patch to completion.

I'm not sure how to move this forward. I think some guidance from RelEng would be helpful on whether 1) we can get access to a full event stream, which appears to need some patching of GitLab, or 2) we can have a global admin account that can configure webhooks for all projects (not necessarily run by Wikibugs maintainers; it could operate on the GitLab side).

And also whether GitLab will even be able to go through the firewall and send webhooks to Toolforge/Cloud VPS (neither Phab nor Gerrit could).