@phaultfinder should look in subprojects/milestones when checking whether a given alert already has an open task
Open, Needs Triage · Public · Feature

Description

Teams that deal with automatically-created Alertmanager tasks will sometimes want to move the task to a subproject/milestone of their main team board - e.g., a board for a team's current sprint - in order to be able to deal with the task more effectively within Phabricator (and within teams' established task-management workflows).

Currently, though, the software behind @phaultfinder (https://github.com/knyar/phalerts/, if I'm not mistaken) doesn't consider subprojects/milestones when working out whether a task for a given alert already exists. As a result, duplicate tasks get created for the same issue and then have to be merged into the originally-created task, which increases task noise and the work required (see e.g. T395892#10880018).

This task therefore proposes that the software behind @phaultfinder be modified to also search any milestones/subprojects of a base project when checking whether a task for a given alert already exists.


In terms of implementing this, looking at https://github.com/knyar/phalerts/blob/master/phalerts.py, there seem to be a few possible options (assuming that the repo owner is happy to accept a PR to enable this behaviour). The simplest that I can think of (right now, off the top of my head) would be to remove the lines phalerts.py#L166-L169, which appear to limit the results of the maniphest.search API query to tasks that are tagged with the exact (base) project provided. Phabricator's Maniphest search already seems to include subprojects & milestones when searching by a base project (e.g., this search that filters by Language and Product Localization includes tasks that are tagged with LPL Essential). So if this filtering were removed from phalerts upstream, @phaultfinder might be able to find tasks within milestones/subprojects with no further changes needed.
(Disclaimer: there might be side effects to this idea that I haven't immediately thought of!)
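
To make the idea above a bit more concrete, here is a minimal sketch of the two behaviours. It is not the actual phalerts code: the function name `find_open_task`, the `strict` flag and the sample PHIDs are hypothetical, and the result dicts only imitate the shape of a maniphest.search response with the projects attachment requested.

```python
from typing import Optional

# Hypothetical PHIDs, for illustration only.
BASE_PHID = "PHID-PROJ-base"
MILESTONE_PHID = "PHID-PROJ-sprint"


def find_open_task(results: list[dict], base_phid: str, title: str,
                   strict: bool = True) -> Optional[dict]:
    """Return the first search result whose title matches, or None.

    `results` is assumed to come from a maniphest.search query constrained
    to `base_phid`, which (per the observation above) also returns tasks
    tagged with its subprojects/milestones.
    """
    for task in results:
        if task["fields"]["name"] != title:
            continue
        tagged = task["attachments"]["projects"]["projectPHIDs"]
        # strict=True imitates the existing check: the task must carry the
        # base project tag itself, so tasks moved to a milestone are skipped.
        if strict and base_phid not in tagged:
            continue
        return task
    return None


# A task that a team moved from the base board to a sprint milestone.
moved_task = {
    "id": 1,
    "fields": {"name": "ExampleAlert"},
    "attachments": {"projects": {"projectPHIDs": [MILESTONE_PHID]}},
}

with_check = find_open_task([moved_task], BASE_PHID, "ExampleAlert", strict=True)
without_check = find_open_task([moved_task], BASE_PHID, "ExampleAlert", strict=False)
```

With the check in place the moved task is not found, so a duplicate would be filed; without it, the existing task is returned. Whether dropping the check has other consequences is exactly the open question in the disclaimer above.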

Event Timeline

Aklapper changed the subtype of this task from "Task" to "Feature Request". Jun 3 2025, 6:10 PM
Aklapper edited projects, added observability; removed SRE Observability.

Thank you for reaching out @A_smart_kitten. You are correct: phalerts is the software powering @phaultfinder. Upstream is responsive and happy to accept PRs. I don't know the history behind the check you have highlighted, though I'm sure an upstream issue describing that behavior and the problems we're running into will help!

Thanks for the comment! To be honest, I was going to leave that to members of SRE Observability, as the people that (I believe) have previously communicated with upstream, and might therefore have more knowledge about it / more of a working relationship with the upstream dev/s than (e.g.) myself. (Unless you'd prefer for someone else to raise the upstream ticket? In which case, I can't promise that I'd be able to do so myself, but potentially someone else might be able to if not me.)

I wonder if phalerts could create the tasks with a reference attached (eg: alert_<unique reference>) [example of our old bz migrated with references T2001] and then check if that exists to find duplicates.

I suspect that would also fix the issue of duplicates when people rename tasks as well.

Indeed, we have interacted with upstream before. To be honest, I would prefer that someone with more Phabricator knowledge than me approach the issue; specifically, I can't really articulate the reason for the check in the upstream code or what the consequences of removing or changing it would be.

Good question. I think that would be technically possible, because alerts and alert groups have fingerprints in Alertmanager. What I don't know is how much it would complicate the logic: what phalerts does today may be suboptimal in some cases, but it is quite easy to understand what it is going to do with an alert / alert group.
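
For illustration, a rough sketch of the reference idea, assuming the `groupKey` from the Alertmanager webhook payload is used as the stable identifier. The `alert_` marker format, the helper names and the in-memory task list are all hypothetical; phalerts does not do this today.

```python
import hashlib
from typing import Optional


def alert_reference(group_key: str) -> str:
    """Derive a short, stable marker from an alert group key (hypothetical format)."""
    digest = hashlib.sha256(group_key.encode("utf-8")).hexdigest()
    return f"alert_{digest[:16]}"


def find_by_reference(tasks: list[dict], reference: str) -> Optional[dict]:
    """Return the first task whose description contains the marker, or None."""
    for task in tasks:
        if reference in task["description"]:
            return task
    return None


# Example group key; the exact value is made up for this sketch.
ref = alert_reference('{}:{alertname="ExampleAlert"}')

# The task was renamed by a human, but the marker in the description survives.
renamed_task = {
    "title": "Renamed by a human",
    "description": f"Filed automatically.\n\nReference: {ref}",
}
match = find_by_reference([renamed_task], ref)
```

Matching on the marker rather than the title means a renamed or re-tagged task is still found, at the cost of one more piece of logic to reason about.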

lmata moved this task from Inbox to Radar on the Observability-Alerting board.