Page MenuHomePhabricator

Herald rules causing delays to task edit saves - getting worse
Open, MediumPublic

Description

Herald rules are run on every change to every task. With the current number of herald rules we are already experiencing significant delays to each task edit. The proliferation of herald rules is going to be a problem soon.

Do you have an idea of the scale? How much time is being added by them?

@greg: Currently, about 80ms for 64 herald rules.

That doesn't sound great.....

Updates:

2015-08-0880ms
2016-03-14~150ms
2018-09-20~300ms
2024-06-27~270ms

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

You can start with that as an example, since you're being added to nearly all tasks.

@Danny_B: Do you know a way how I could add myself to watch all projects, become a watcher of all future projects to be created, and get automatically subscribed to all newly created tasks that do not have any projects associated? If that is possible, I could, but I'm currently not aware of such functionality.

You can simply watch all relevant projects.
Creation of projects is not so often, so you can watch newly created projects or T103700: LOG: Phabricator project creation.
You can also help to elaborate the needs in upstream report to have the needed functionality added.
Unassociated tasks: Use {H38}.

You can simply watch all relevant projects.

Practically speaking, how can I add myself to watch nearly all existing projects without spending hours of clicking?

Creation of projects is not so often, so you can watch newly created projects or T103700: LOG: Phabricator project creation.

Can I automatically become a watcher of all future projects to be created, or would I have to spend time and do this manually?

You can also help to elaborate the needs in upstream report to have the needed functionality added.

https://secure.phabricator.com/T11312#187893 describes the upstream position and I don't (yet) see why to not follow that advice.
As requested here I have replied to Evan's questions in https://secure.phabricator.com/T11312#188004

Unassociated tasks: Use {H38}.

For H38 I get "You do not have permission to view this object."

Unassociated tasks: Use {H38}.

For H38 I get "You do not have permission to view this object."

Hm, that's my rule... I haven't been able to figure out how to edit the policies on it, so I hope this is helpful instead:

My H38.png (662×1 px, 54 KB)

You can edit policys at private rules, and you can't change the permissions for other rules too.

Update: It's now up to 140ms-150ms for each task (with 91 herald rules)

Do these numbers come from DarkConsole's Services tab? Wondering how/where to compare.

Not sure if this is the original source of the numbers, but the "Herald Transcripts" view shows how long each evaluation took (at least, roughly):

https://phabricator.wikimedia.org/herald/transcript/

For example, "93 ms" means that rule evaluation took 93 milliseconds.

Is it the number of Rules that is the problem, or more the number of Conditions? I.e. would it help if we asked users who have 2 rules that could easily be merged, to do so? E.g. H274 and H275 could easily be merged into a single Rule, but it would have the same number of Conditions so merging might not help...

Can we [easily] add custom text to the top of the page https://phabricator.wikimedia.org/herald/create/ giving a bold warning?
Something like: Please do not use Herald if you can avoid it, because each new rule causes longer delays (see T108586).
and then if merging tasks would help, then also mention, If you must use it, then try to only use a single rule with multiple conditions.

I see some patterns, like

  • "inform me when natural language foo was mentioned" (such as H188, H92, H81),
  • "email me explicitly when tasks in project XYZ get the UBN priority" (H154, H294, H236, H230, H231),
  • to set flags,

Rules like H241 H252 H219 all do the same but don't have a 'shared maintainer'. As we've allowed personal rules (and I do not remember why) it's probably more cumbersome to ping a rule author than jfdi by creating a rule yourself.

I personally don't understand the use cases of some rules such as H133 (can be done by changing the personal notification settings?) or H204 (as I don't believe in single default assignees due to cookie-licking) or H170 (edit the user profile name to make clear that it's a private account?).

Is it the number of Rules that is the problem, or more the number of Conditions? I.e. would it help if we asked users who have 2 rules that could easily be merged, to do so? E.g. H274 and H275 could easily be merged into a single Rule, but it would have the same number of Conditions so merging might not help...

I think the number of conditions is probably the biggest factor so combining them into fewer rules might not help very much.

I was curious how much antivandalism was slowing things down so I disabled the rule and then checked transcripts on transactions that happened while the rule was disabled. It didn't seem to make a significant different in herald runtime. So although antivandalism does a couple of expensive sql queries, it appears that I've optimized that code sufficiently - the difference was only a couple of milliseconds at most.

I see some patterns, like

  • "inform me when natural language foo was mentioned" (such as H188, H92, H81),
  • "email me explicitly when tasks in project XYZ get the UBN priority" (H154, H294, H236, H230, H231),
  • to set flags,

Rules like H241 H252 H219 all do the same but don't have a 'shared maintainer'. As we've allowed personal rules (and I do not remember why) it's probably more cumbersome to ping a rule author than jfdi by creating a rule yourself.

I personally don't understand the use cases of some rules such as H133 (can be done by changing the personal notification settings?) or H204 (as I don't believe in single default assignees due to cookie-licking) or H170 (edit the user profile name to make clear that it's a private account?).

The reason for H133 was, that I set all preferences to notification only, not email, but I still got mails for some types of activity where there was no setting, so this was the only way to change this. I turned it off for now, I hope that I don't get mails now.... I've set all available settings from mail to notify now, let's see if the behaviour of phab changed in the meantime.

Just looking manually at enabled global rules that target tasks, the following rules appear to all do the same thing for different cases

Group 1: add the owner's tag if assigned to the owner[1]

  • {H370}
  • {H336}
  • {H318}
  • {H317}
  • {H311}
  • {H306}
  • {H302}
  • {H268}
  • {H243}
  • {H212}
  • {H182}
  • {H128}
  • {H125}
  • {H91}

Group 2: Add the owner's tag if assigned to or authored by the owner

  • {H322}
  • {H271}
  • {H267}
  • {H261}
  • {H248}
  • {H233}

Group 3: Add tags for spaces

  • {H270}
  • {H264}
  • {H256}
  • {H255}
  • {H237}
  • {H178}
  • {H177}
  • {H171}
  • {H116}

Is there a more efficient way to be doing this?

[1] Owner meaning the specific user; these are global herald rules

Is there a more efficient way to be doing this?

I could probably build a custom action that's something like

"add tag #user-$u when assigned to user: $u" and apply that globally, once, eliminating most of the first group.

Group 4: Umbrella projects: if tagged as one of a few other projects, add an extra project (once):

  • {H337}
  • {H314}
  • {H285}
  • {H216}
  • {H187}
  • {H143}
  • {H131}
  • {H109}

Group 5: Umbrella or mirror projects: if tagged as one of a few other projects (or just one other project), add an extra project (always):

  • {H232}
  • {H193}
  • {H175}
  • {H174}
  • {H24}
  • {H15}
  • {H14}
  • {H10}

Since group 5 rules are activated each time they match, it may be worth investigating if @Maintenance_bot would be able to take over those functions (@Ladsgroup?) Group 4 would be similar, but likely harder to implement

Also, not really a group, but

  • {H315}
  • {H329}

both email a list if a task is UBN and has a specific tag

If you want to move things to the maintenance bot, PR is very welcome: https://github.com/Ladsgroup/Phabricator-maintenance-bot

I made a PR to the maintenance bot that could replace umbrella project adding Herald rules (group 5 in T108586#5669368).

https://github.com/Ladsgroup/Phabricator-maintenance-bot/pull/22

@Aklapper Hey, is it fine if we move the group 4 herald rules to maintenance bot?

@Aklapper Hey, is it fine if we move the group 4 herald rules to maintenance bot?

the pull request is for group 5, which run {always} instead of, as group 4 does, {once}

Thanks. Making it work with group 4 shouldn't be too hard.

@Aklapper Hey, is it fine if we move the group 4 herald rules to maintenance bot?

Is there any general info available, or a dedicated task / thread? I lack context what that bot does and why.

Maybe moving some actions to an external place might be workaround to improve performance, but does this potentially complicate making/reverting changes (which are currently Herald rules) when it comes to how quickly changes get deployed, how many folks can make such changes, etc? I honestly don't know...

Maybe moving some actions to an external place might be workaround to improve performance, but does this potentially complicate making/reverting changes (which are currently Herald rules) when it comes to how quickly changes get deployed, how many folks can make such changes, etc? I honestly don't know...

Yeah IMO it's a trade-off, I don't think we should move all Herald rules but some clearly would benefit from being done by the bot. To mitigate the usability of the bot, I can work on its documentation but also I want to mention that the source code is linked on the bot's profile page, it has seven contributors (so it's not just me) and it's configuration is rather straightforward, for example: https://github.com/Ladsgroup/Phabricator-maintenance-bot/blob/master/column_mover.py#L33

I personally think we should automate even more work (not Herald work that's already automated) to the bot (that's why it has a rather general name) to reduce the toil our tech people has to do (I even worked on machine learning for triaging but that's too early for making it prioritize automatically).

The patch is merged and working as expected. I disabled cases in group 5. It'll be done by @Maintenance_bot

I still think doing group 4 is also pretty easy. You just should go through transactions to make sure the project has not been added before.

now it supports group 4 type of cases as well: https://github.com/Ladsgroup/Phabricator-maintenance-bot/commit/131808cd06078eb4542165450ee2d32b9b32925b

I migrated H30 and I'm disabling it. I also wrote a quick js to make migrating herald rule to maint bot easier. I think we should slowly take the data out of python file into a json for when someone have time.

I migrated and disabled several herald rules of group 4. I'm going to add more later (plus other groups like personal rules).

Do we have some numbers on how much is the median for saving now? its grafana board doesn't have anything like that.

Aklapper lowered the priority of this task from High to Medium.May 30 2021, 9:45 AM

FYI last week I went through most Herald rules and disable 10 due to not being useful anymore.
More relevant though:

https://phabricator.wikimedia.org/herald/transcript/ lists all recent Herald activity.
The Profiler for each run under a URI like https://phabricator.wikimedia.org/herald/transcript/12345678/profile/ shows the costs of each analyzed Herald rule in μs.

Jdforrester-WMF added a subscriber: greg.
Jdforrester-WMF removed a subscriber: greg.

Although I already said before at https://www.mediawiki.org/wiki/Topic:V9yllpjfjpennzcl, let's say again: a "job" system like T277285: Run non-blocking filters after the edit, as a job is needed.

Noting that after asking upstream (https://we.phorge.it/Q207) about the possibility of Herald rules running asynchronously using daemons, I've created https://we.phorge.it/T16290 to allow for discussion upstream about this possibility (if it turns out to be possible).

Noting that after asking upstream (https://we.phorge.it/Q207) about the possibility of Herald rules running asynchronously

To understand potential performance benefits of upstream very cautious assumptions to do not break any workflow... a small database exploration would be useful:

The question is: in phabricator.wikimedia.org, how many active Maniphest Herald rules only use one of the following actions and nothing else? (without "actions affecting content")

  • "Send me an email" - email.self
  • "Send an email to" - email.other
  • "Mark with flag" - flag
  • "Remove flag" - unflag
  • "Do nothing" (lol) - nothing

so, without any of these "actions affecting content":

  • "Assign task to" - maniphest.assign.other
  • "Change priority to" - maniphest.priority
  • "Change status to" - maniphest.status
  • "Change subtype to" - maniphest.subtype
  • "Add projects" - projects.add
  • "Add subscribers" - subscribers.add
  • "Remove projects" - projects.remove
  • "Remove subscribers" - subscribers.remove
  • "Assign task to me" - maniphest.assign.self
  • "Add me as subscriber" - subscribers.self.add
  • "Remove me as subscriber" - subscribers.self.remove
  • "Add comment" - comment
  • "Webhook" - webhook
  • "Require secure email" - email.must-encrypt

The above exploration can be translated to the following SQL:

USE phabricator_herald;

-- Any active herald rule about Maniphest (personal and global) without actions affecting content.
SELECT COUNT(*) FROM herald_rule
  WHERE contentType = 'HeraldManiphestTaskAdapter'
  AND isDisabled = 0
  AND NOT EXISTS (
    SELECT * FROM herald_action
    WHERE herald_rule.id = herald_action.ruleID
    AND action NOT IN ('email.self', 'email.other', 'flag', 'unflag', 'nothing'));

I would be surprised to see a big number. I expect ~10 results, so no real benefit probably.

Ah, to get a list, to praise these people:

USE phabricator_herald;

SELECT herald_rule.id, herald_rule.name FROM herald_rule
  WHERE contentType = 'HeraldManiphestTaskAdapter'
  AND isDisabled = 0
  AND NOT EXISTS (
   SELECT * FROM herald_action
   WHERE herald_rule.id = herald_action.ruleID
   AND action NOT IN ('email.self', 'email.other', 'flag', 'unflag', 'nothing')) ORDER BY herald_rule.id ASC;

But again, I would be very surprised to see more than 50 candidates here.

If we have a small number of results: I'm afraid this won't be automatically speculated by Phorge.

So, in case of few results, an option may be needed (yuck) in the individual Herald rules, with text to be designed as much confusing as possible as usual.

e.g. "When your Herald butler must run these actions?" Option 1: "URGENT!1 (default)" Option 2: "Chill..." - or something like that.

P.S. in phabricator.wikimedia.org - has it ever been necessary to manually disable Herald rules for de-activated ("disabled") users? Just to know whether to open another upstream feature request (yum).

I would be surprised to see a big number. I expect ~10 results, so no real benefit probably.

P.S. in phabricator.wikimedia.org - has it ever been necessary to manually disable Herald rules for de-activated ("disabled") users?

I've sometimes done that for Personal Rules. I don't remember if Phorge code is smart enough to ignore them. But I like to filter the Herald Rules on active rules.

I would be surprised to see a big number. I expect ~10 results, so no real benefit probably.

Can we look at the list of monograms? So I can manually sum their herald transcript costs

(I will do that looking at the link "View Herald Transcript" e.g. https://phabricator.wikimedia.org/herald/transcript/6901672/profile/ - which can be found in the task page, when that task is created)

I don't remember if Phorge code is smart enough to ignore them

Eh. Me too.

@valerio.bozzolan Admittedly I'm out of my depth here, but I'm not sure if async Herald rules would necessarily have to be limited to non-content-affecting actions? I included this in the upstream task I filed:

One thing @avivey mentioned under that question was regarding notifications: if Herald rules were asynchronous, then Phorge might end up sending double-notifications within a short space of time (one for the original edit, and one for Herald's reaction to that edit). However, another potential thought that occurred to me was: to prevent double-notifications in a case where Herald rules are executed by a daemon, why not move notification-sending to also be executed by a daemon (after any Herald rules have ran)?

Yep @A_smart_kitten I agree with you that I wouldn't invest days of work in a partial solution that speeds up just 20 rules. I evaluated that speculation first because it was at first glance the easiest and safer.

Let's continue in the upstream task.