[GOAL] Make it possible to send notifications for bot-made actions
Open, Needs Triage · Public

Description

NOTE: This task exists to document the issues related to sending notifications for actions made by bots, and to group existing Phabricator tasks on that topic. Unless the task is claimed by a developer, it should not be interpreted as a commitment by anyone to do the work. Sending notifications for bot-made actions is technically complex and needs to be done carefully.
Context

Previously, some bot-made actions resulted in notifications being sent. This feature was sending a large volume of mail (~200,000 per day). In 2024, due to the high volume of outgoing Wikimedia mail, major e-mail providers (such as Gmail) temporarily stopped accepting all Wikimedia mail. This was deemed a major outage, as it affected any and all of our mail. To avoid that breakage, emails for bot-made actions were disabled in T356984: Stop sending change notification email if edit is done by a bot, following prior discussion in a security Phabricator ticket (T354378).

More details about the issue were described in this message by @Ladsgroup:

Now I can explain about this a bit more (cleared by the security team). Yes. When I said the discussion happened privately, I meant it happened in a security ticket.

This feature was sending ~200,000 emails a day on a normal day, but that in itself wasn't the issue. The issue was that bots don't get rate limited. A couple of weeks ago, a bot started editing fast enough that it brought down all of our mail infrastructure, so badly that important mails such as login notifications, password resets and email address confirmations were getting dropped. So this is not just a concern or a hypothetical scenario. It happened in January this year. Most importantly, this is not just an accidental issue: it can easily be weaponized to bring our mail servers down or to use our infra to DoS another mail provider. The only thing you need is a bot account (or a compromised one). We don't go around breaking important user features due to hypothetical scenarios; this case literally brought us down before.

You might say "buy more hardware", and that would fix the mail servers being overloaded. But one reason the outage got worse was that we were sending so many mails that Gmail stopped accepting email from us altogether. We can't fix that by adding more hardware. One way to tackle this is to set up a dedicated second lane of mail servers (call them "high volume" or something like that) and teach MediaWiki to send notification emails through them, so that if Gmail or others start throttling or rejecting them, security-sensitive emails wouldn't be affected.

I know this is disruptive, and I apologize for that. On top of that, the current state of the watchlist code is not helping either (especially since the highlighting of unseen watchlist entries is tied to how emails are sent...), but we really didn't have a choice. I hope to make improvements to the watchlist infra later.

In the meantime, I suggested creating a tool on Toolforge to do the work for you. You could potentially create a generalized one that users could "sign up" for by providing their watchlist token, so one person would need to create and maintain the tool and others could just use it (and wouldn't need to reinvent the wheel).

My apologies again for this but we don't have any other choice.
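
The Toolforge suggestion in the quoted message could start from something like the sketch below. It only builds a MediaWiki Action API request for `list=watchlist` using a user's watchlist token and filters a decoded response for bot edits; it does not send mail or make network calls. The function names, the example wiki URL and the chosen `wlprop` fields are illustrative assumptions, not part of any existing tool.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_watchlist_url(api_url, owner, token, only_bot_edits=True, limit=50):
    """Build an Action API URL that reads `owner`'s watchlist via their token.

    `api_url`, `owner` and `token` are supplied by the caller; nothing here
    is specific to a particular wiki.
    """
    params = {
        "action": "query",
        "list": "watchlist",
        "format": "json",
        "wlowner": owner,      # whose watchlist to read
        "wltoken": token,      # that user's watchlist token
        "wlprop": "title|user|timestamp|comment|flags",
        "wllimit": str(limit),
    }
    if only_bot_edits:
        params["wlshow"] = "bot"  # restrict results to bot-flagged changes
    return f"{api_url}?{urlencode(params)}"


def bot_edits(response):
    """Return watchlist entries flagged as bot edits from a decoded response.

    Assumption: the flag appears either as an empty-string "bot" key
    (formatversion 1) or as a boolean (formatversion 2).
    """
    entries = response.get("query", {}).get("watchlist", [])
    return [e for e in entries if e.get("bot", False) is not False]


# Hypothetical usage: the URL and credentials below are placeholders.
url = build_watchlist_url("https://example.org/w/api.php", "ExampleUser", "secret-token")
```

A tool built this way would fetch the URL on a schedule and decide for itself how (and how often) to notify each signed-up user, which keeps the volume off the shared Wikimedia mail servers.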

Problem

Phabricator contains several open tasks that ask to introduce and/or re-enable notifications for some bot-made actions (edits, mentions in edit summaries, ...). Before implementing any such feature, there needs to be a solution that ensures our outgoing mail volume cannot grow without bound (similar to what led to T356984). The most important problem is that bots are exempt from all rate limiting rules, allowing them to operate quickly (much more quickly than humans). This also means that a single bot can be more costly to Wikimedia infrastructure than a single human.
Unfortunately, this problem also affects introducing this setting as a user property, as that would also allow the outgoing mail to grow quickly (eventually to the levels that caused T356984).
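
One generic way to keep bot-triggered mail bounded is a shared token bucket that every bot-triggered notification must pass before it is queued. The sketch below illustrates the idea only; it is not how MediaWiki or Echo work today, and the class name, function name and rates are all hypothetical. The budget is deliberately shared rather than per recipient, since a per-user opt-in alone would not cap total volume.

```python
import time


class MailBudget:
    """Token bucket: allows bursts of up to `burst` mails, refilled at `per_second`."""

    def __init__(self, per_second, burst, clock=time.monotonic):
        self.per_second = per_second
        self.burst = burst
        self.clock = clock            # injectable so the logic can be tested
        self.tokens = float(burst)    # start with a full bucket
        self.last = clock()

    def try_send(self):
        """Consume one token if available; return whether the mail may go out."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.per_second)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def should_send_notification(edit_by_bot, budget):
    """Hypothetical policy: only bot-triggered mail is charged to the budget.

    Human edits are assumed to be covered by the existing rate limits.
    """
    if not edit_by_bot:
        return True
    return budget.try_send()
```

With a budget like `MailBudget(per_second=1.0, burst=100)`, a bot editing at any speed could trigger at most the initial burst of 100 mails plus roughly one more per second; anything beyond that would be dropped or deferred by the caller.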

Related tasks

Event Timeline

Previously, some bot-made actions resulted in notifications being sent.

Which (Echo) notifications used to be sent? Can we rely on the bot flag to tell spammy and non-spammy cases apart (as I proposed for watchlist mails in T356984), i.e. are they the result of edits?

Hi! I'm trying to fix T329573 as one of the tasks in a Rapid Fund grant, but apparently it's blocked by this issue. Are there any plans to fix this? Thank you!

Hi! I'm trying to fix T329573 as one of the tasks in a Rapid Fund grant, but apparently it's blocked by this issue. Are there any plans to fix this? Thank you!

I'm not aware of any plans for my team to address this. It seems like a substantial amount of work, so I do not think we can pick it up without affecting our other priorities.

I do not know if any other teams have this on their roadmap.

Hi, @Michael! Thanks for your reply.

I have just noticed there is a way to get watchlist notifications on-wiki through the Echo extension: T203941.

However, from my understanding of T309855, it hasn't been deployed to production wikis yet.

Since the reason for stopping email notifications for bot edits seems to be the large number of emails sent, I assume it wouldn't be a problem to show these notifications on-wiki.

Do you think it would make sense, then, to try to move forward with T309855 while email notifications for bot edits remain disabled?