
Using AI to improve tagging of Phabricator tasks
Closed, Declined · Public · Feature

Description

Feature summary:
When filing a task, an AI that has been trained on the existing database of tasks looks at the task and adds/removes tags it suspects to be appropriate, or it makes suggestions while filing. Allow users to opt out.

Use case(s):
Tags don't always have logical names here. Even when they do, many users who aren't developers may not know which tags are appropriate. If you have a problem searching, you may enter "search" in the tags to find an appropriate tag but "CirrusSearch" won't pop up when you do that. If images aren't loading, you may need SRE, but who'd know that?

Benefits:
Developers:

  • Find relevant tasks faster.
  • Spend less time correcting tags.

Users:

  • Spend less time figuring out which tags might be appropriate.
  • Get attention for your task more quickly.

While generative AI is often useless or even dangerous, AI may be appropriate for adding tags:

  • It's not mission critical.
  • Errors in tagging are probably not a huge issue. If it's correct 80% of the time, it's likely a net positive. The AI could be instructed to never remove tags, or be given a blacklist of tags to never remove (e.g. SRE).
  • It's not a creative process.
  • There are already humans around who fix erroneous tagging; erroneous tags are already expected.
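
The "never remove" idea in the list above could look roughly like the following. This is only a minimal sketch: the function name, the protected set, and the way suggestions arrive are assumptions for illustration, not an existing Phabricator feature.

```python
from typing import Iterable

# Hypothetical protected set; the proposal only names SRE as an example.
PROTECTED_TAGS = frozenset({"SRE"})


def apply_suggestions(
    current: Iterable[str],
    to_add: Iterable[str],
    to_remove: Iterable[str],
    protected: frozenset = PROTECTED_TAGS,
) -> list[str]:
    """Return the new tag list, silently ignoring removals of protected tags."""
    # De-duplicate the current tags while keeping their order.
    result = list(dict.fromkeys(current))
    # Only tags outside the protected set may be removed.
    removable = {tag for tag in to_remove if tag not in protected}
    result = [tag for tag in result if tag not in removable]
    # Append suggested additions that are not already present.
    for tag in to_add:
        if tag not in result:
            result.append(tag)
    return result
```

Passing an empty `to_remove` gives the stricter "never remove anything" variant.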

Event Timeline

Aklapper moved this task from To Triage to External on the Phabricator board.

After years of watching a bot trying to set or correct project tags in Mozilla Bugzilla, I am not sure that has created more benefit than additional correction work.
See also https://newsroom.workday.com/2026-01-14-New-Workday-Research-Companies-Are-Leaving-AI-Gains-on-the-Table
"Allow developers to find relevant tasks faster. Spend less time correcting tags" is the bright and so far rather unrealistic theory.
"Errors in tagging are probably not a huge issue"? Oh, they are, based on what I've seen so far.

If you have a problem searching, you may enter "search" in the tags to find an appropriate tag but "CirrusSearch" won't pop up when you do that.

Then we should add an additional hashtag for that project tag in order to fix that specific problem.

If images aren't loading, you may need SRE, but who'd know that?

A potential Herald rule could automatically add SRE based on certain specified criteria.
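
As a rough sketch of what "specified criteria" could mean, here is a keyword-based rule written in plain Python. It is not Herald's own configuration format, and the keywords are made-up assumptions chosen to match the image-loading example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    keywords: tuple  # every keyword must appear in the task text
    add_tags: tuple  # tags to add when the rule matches


# Hypothetical criteria, for illustration only.
RULES = (
    Rule(keywords=("images", "not loading"), add_tags=("SRE",)),
)


def tags_from_rules(title: str, description: str, rules=RULES) -> list:
    """Return the tags that the matching rules would add to a task."""
    text = f"{title}\n{description}".lower()
    tags = []
    for rule in rules:
        if all(keyword in text for keyword in rule.keywords):
            for tag in rule.add_tags:
                if tag not in tags:
                    tags.append(tag)
    return tags
```

A rule like this only fires on the exact wording it anticipates, so a report phrased differently would still go untagged.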

Maybe this proposal makes sense in a few years though.
In any case, I cannot stop anyone from writing a bot that attempts this, whether this task exists or not.

"Errors in tagging are probably not a huge issue"? Oh, they are, based on what I've seen so far.

I think an AI could outperform most non-regulars, those who only come here when something breaks. You'll be lucky if they use tags at all. The benchmark isn't the experts here; of course the experts will do better.

If you have a problem searching, you may enter "search" in the tags to find an appropriate tag but "CirrusSearch" won't pop up when you do that.

Then we should add an additional hashtag for that project tag in order to fix that specific problem.

You'll need thousands of additional hashtags. Even if you do, there's a disconnect between how problems manifest and the components that are affected.

If images aren't loading, you may need SRE, but who'd know that?

A potential Herald rule could automatically add SRE based on certain specified criteria.

I don't know how Herald works, but how do you imagine that? When I filed T415473 I had forgotten the appropriate tags as I don't file tasks weekly. After filing I looked up an older similar report I filed to copy the tags. I had missed "SRE" and "Traffic".

Maybe this proposal makes sense in a few years though.
In any case, I cannot stop anyone from writing a bot that attempts this, whether this task exists or not.

Out of curiosity, I made up some tasks and asked Bing Copilot to suggest tags. Note that Copilot isn't specifically trained to suggest tags, though it obviously has processed Phabricator:

You said:
I need to file some reports on Wikimedia Phabricator. Can you suggest some tags for each one?

Copilot said:
Absolutely — I can help you choose the right Phabricator tags, but I need to know what each report is about. Phabricator tagging on Wikimedia is very task‑specific, and the best tags depend on:

  • Which product or site is affected (MediaWiki core, Commons, Wikidata, etc.)
  • Which component is involved (API, parser, caching, UI, extensions…)
  • Whether it’s a bug, feature request, performance issue, or configuration problem
  • Which platform(s) it affects (desktop, mobile web, apps)

If you share the text or a short description of each report, I can propose accurate tags that match Wikimedia’s usual conventions and workflows.

Whenever you’re ready, just drop the first one and we’ll go through them one by one.

You said:
First report: "Please enable mandatory 2FA for everyone on enwikivoyage. The community has discussed this and voted in favor with unanimous support: https://en.wikivoyage.org/wiki/Wikivoyage:Travellers%27_pub#Require_2FA"

Copilot said:
Below is a clean, Phabricator‑appropriate tagging set for this kind of request. This type of task is essentially a configuration change to a Wikimedia wiki, with Security/Authentication implications, and it requires SRE + Security + Product Security review.
🏷️ Recommended Phabricator Tags
Core tags

  • Wikimedia-Site-requests
    • All configuration changes for individual Wikimedia wikis go here.
  • SRE
    • Operations team that implements config changes.

Security / Authentication

  • Security
    • Because it affects authentication policy and requires security review.
  • Security-Team or Product-Security
    • Depending on how granular you want to be; both are commonly used for auth‑related changes.
  • Authentication-Requests
    • Used for changes to login, 2FA, and account‑security requirements.

Project-specific

  • Wikivoyage
    • Optional but helpful for visibility by the Wikivoyage community.

📝 Optional tags (use only if relevant)

  • User-notice
    • If the change will require user communication or announcements.
  • Community-consensus
    • Sometimes used when a task is explicitly backed by a local vote.

If you want, I can also draft the task description in Phabricator style, or help with the next report.

You said:
Second report:
"No images are loading on Wikipedia. Full size images (e.g. https://upload.wikimedia.org/wikipedia/commons/f/f2/DSCN2146_Medlands_Beach.jpg) work but the images in articles (e.g. https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/DSCN2146_Medlands_Beach.jpg/250px-DSCN2146_Medlands_Beach.jpg) are not loading. I'm using Opera."

Copilot said:
For this kind of issue — thumbnails failing while originals load — the problem almost always sits in Thumbor, Swift, CDN caching, or browser‑specific delivery paths. The tags should reflect that.
🏷️ Recommended Phabricator Tags
Core tags

  • Thumbor
    • All Wikimedia thumbnail generation issues belong here.
  • Wikimedia-Image-Scaling
    • Covers problems with thumbnailing infrastructure and scaling pipelines.
  • Traffic
    • Because CDN or caching layers (Varnish/ATS) may be involved when thumbnails fail but originals load.
  • SRE
    • Operations team responsible for debugging infrastructure‑level outages.

Browser-specific

  • Browser-Support-Opera
    • Since the user reports the issue only in Opera, this helps route it to the browser‑compatibility maintainers.

Project-wide

  • Commons
    • Optional but often added because the images originate from Commons.

📝 Optional tags (use only if relevant)

  • User-notice
    • If the issue appears widespread and might require communication.
  • Performance-Team
    • If you suspect degraded performance rather than outright failure.

If you want, I can also help you draft the task description in a clean Phabricator style or move on to the next report.

You said:
Next report:
"When I try to register an account it says "The password you entered is already being used by Henry125. Please choose a different password". As I now know Henry125's password, I'm concerned this might be a security vulnerability."

Copilot said:
This one should be treated as a potential security vulnerability, because MediaWiki should never reveal whether a password matches any existing account. Even if it turns out to be a browser extension, gadget, or local script causing the message, the report must be routed through the proper security channels.

Below is the correct tagging set for Phabricator.
🏷️ Recommended Phabricator Tags
Security / Sensitive

  • Security
    • Mandatory. This ensures the task is handled privately and not exposed publicly.
  • Security-Team
    • Routes the issue to the team responsible for authentication‑related vulnerabilities.
  • Product-Security
    • Often added for issues involving credential handling or password validation logic.

Authentication / Account creation

  • Authentication-Requests
    • Relevant because the bug appears during account creation and involves password validation.
  • MediaWiki-User-management
    • Optional but appropriate if the issue is ultimately in MediaWiki’s account‑creation workflow.

Platform-wide

  • MediaWiki-General
    • Optional catch‑all if the root cause is unclear.

⚠️ Important process note

This task must be filed as a security‑sensitive report, not a public one.
If you file it publicly, Phabricator maintainers will immediately convert it to a security task anyway, but it’s better to start it in the right place.

If you want, I can help you phrase the report in a way that avoids exposing the password in the task body while still giving the security team everything they need.

You said:
Next report:
"When I click "expand" on the table at https://en.wikipedia.org/wiki/List_of_Labubus nothing happens. This is using my laptop (Toshiba L14 running W11, if it matters). It works on my phone, though."

Copilot said:
For this kind of issue — a collapsible table not expanding on desktop but working on mobile — the likely causes fall into Gadgets, ResourceLoader, JavaScript errors, or browser‑specific issues. The tagging should reflect that.
🏷️ Recommended Phabricator Tags
Core functionality

  • MediaWiki-Interface
    • Covers collapsible elements, UI behaviors, and core JS modules.
  • MediaWiki-ResourceLoader
    • If the expand/collapse script isn’t loading or is being blocked.
  • JavaScript
    • Useful when the issue is clearly JS‑related but the root cause is unknown.

Browser / platform

  • Browser-Support-Windows
    • Since the issue appears only on your Windows 11 laptop.
  • Browser-Support-Opera or Browser-Support-Chromium
    • If you’re using Opera or another Chromium‑based browser on that laptop.

Project-specific

  • English‑Wikipedia‑Site‑requests (optional)
    • Only if the issue turns out to be caused by a local gadget or site script.

📝 Optional tags (use only if relevant)

  • Performance-Team
    • If the expand action is delayed or partially working rather than failing outright.
  • User-notice
    • If the issue becomes widespread and needs communication.

If you want, I can also help you write the task description in a clean Phabricator format or move on to the next report.

You said:
Next report:
"I'm trying to change my skin to MonoBook, but after saving my preferences it has reverted to Vector. I can change the other settings, only the skin preference can't be changed."

Copilot said:
For a skin‑preference issue that won’t save, the likely causes fall into MediaWiki preferences handling, user‑options storage, skin availability/config, or local wiki overrides. The tags should reflect that.
🏷️ Recommended Phabricator Tags
Core functionality

  • MediaWiki-Preferences
    • Handles bugs where user settings fail to save or revert unexpectedly.
  • MediaWiki-User-management
    • Relevant because user options (including skin choice) are stored as part of user account data.
  • MediaWiki-Skins
    • Appropriate when the issue specifically concerns switching between skins.

Skin-specific

  • Skin-MonoBook
    • Since the issue is specifically with selecting MonoBook.
  • Skin-Vector
    • Optional, but sometimes helpful when the fallback skin is involved.

Platform-wide

  • MediaWiki-General
    • Optional catch‑all if the root cause is unclear.

Project-specific (optional)

  • English-Wikipedia-Site-requests
    • Only if you suspect a local gadget, script, or configuration override is preventing MonoBook from being selected.

📝 Optional tags (use only if relevant)

  • JavaScript
    • If the skin selector UI itself appears broken.
  • User-notice
    • If the issue becomes widespread and affects many users.

If you want, I can also help you phrase the task description in a clean Phabricator style or move on to the next report.

This is honestly better than I expected. It's far from perfect (for example, browser-support-windows doesn't exist), but it's getting a lot right. In the future I'm actually going to ask Copilot for tag suggestions. This is helpful!

If anyone wants to try reproducing this, note that I use custom instructions to make Copilot hallucinate less (instead just admitting when it hasn't got a clue) and be less sycophantic.

You'll need thousands of additional hashtags. Even if you do, there's a disconnect between how problems manifest and the components that are affected.

And so far I believe that a bot won't solve that disconnect (yet). My opinion may well change over the next few years.

I'm not saying that some trained bot (what some folks call "AI") would not be useful in the long run.
I am trying to express that so far the bot results which I have seen have not convinced me that the gain outweighs all the mistakes.

You'll need thousands of additional hashtags. Even if you do, there's a disconnect between how problems manifest and the components that are affected.

And so far I believe that a bot won't solve that disconnect (yet). My opinion may well change over the next few years.

I'm not saying that some trained bot (what some folks call "AI") would not be useful in the long run.
I am trying to express that so far the bot results which I have seen have not convinced me that the gain outweighs all the mistakes.

Would you also oppose something that merely suggests tags while creating a task?

First report: "Please enable mandatory 2FA for everyone on enwikivoyage. The community has discussed this and voted in favor with unanimous support: https://en.wikivoyage.org/wiki/Wikivoyage:Travellers%27_pub#Require_2FA"

The only tag this should have is Wikimedia-Site-requests. Maybe Product Safety and Integrity or Trust-and-Safety would also want to be informed, given the likely cascading impacts on them (via 2FA reset requests, for example). And maybe also MediaWiki-extensions-OATHAuth, since I don't think requiring 2FA for all users is supported in that repo, so it would need code changes too.

  • Definitely not SRE (SRE doesn't handle config change deploys).
  • I don't think this would need Security or Security-Team. It isn't really in the scope of those teams. #product-security doesn't even exist.
  • #Authentication-Requests also doesn't exist.
  • #Wikivoyage also doesn't exist.
  • It shouldn't have User-notice either; config changes to one wiki almost never belong in tech news.
  • #Community-consensus doesn't exist. It shouldn't have Community-consensus-needed either since (in that fanciful hypothetical) the community has already come to a consensus.

Second report: "No images are loading on Wikipedia. Full size images (e.g. https://upload.wikimedia.org/wikipedia/commons/f/f2/DSCN2146_Medlands_Beach.jpg) work but the images in articles (e.g. https://upload.wikimedia.org/wikipedia/commons/thumb/f/f2/DSCN2146_Medlands_Beach.jpg/250px-DSCN2146_Medlands_Beach.jpg) are not loading. I'm using Opera."

  • #Wikimedia-Image-Scaling doesn't exist and Performance-Team is archived. I wonder if there is a knowledge-cutoff issue here. The rest seem fine.

Next report: "When I try to register an account it says "The password you entered is already being used by Henry125. Please choose a different password". As I now know Henry125's password, I'm concerned this might be a security vulnerability."

(And finally, I would never add MediaWiki-General to a task that already has more specific tags, but others seem to disagree with me on that.)

Next report: "I'm trying to change my skin to MonoBook, but after saving my preferences it has reverted to Vector. I can change the other settings, only the skin preference can't be changed."

TLDR I'm seeing a lot more noise than signal in that copilot example.

TLDR I'm seeing a lot more noise than signal in that copilot example.

To me it's roughly 50/50, which is still impressive to me, but maybe I'm being generous. I'm using the "smart" profile, which "thinks deeply or quickly based on the task"; perhaps the "think deeper" profile would perform better.

Consider also that a system tailored to Phabricator task tagging specifically could do some fuzzy matching to convert tags like Skin-MonoBook to MonoBook. Any nonexistent or archived tags that can't be matched like that could be dropped. Some of the remaining mistakes could then be fixed using Herald, for example if a task is redundantly tagged with MediaWiki-Core-Preferences and MediaWiki-User-management. To me, redundant tags are less of a problem than a lack of tags or wrong tags.
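
To illustrate the fuzzy-matching step, here is a minimal sketch using Python's standard `difflib`. The tag sample, the function name, and the 0.7 cutoff are assumptions; a real system would load the actual list of active tags and would need its cutoff tuned, since a loose cutoff can map a suggestion to the wrong tag.

```python
import difflib

# Hypothetical sample of active tags; a real system would load the full list.
EXISTING_TAGS = ["MonoBook", "SRE", "Traffic", "Thumbor", "Wikimedia-Site-requests"]


def normalize_tags(suggested, existing=EXISTING_TAGS, cutoff=0.7):
    """Map suggested tags onto existing ones; drop anything without a close match."""
    by_lower = {tag.lower(): tag for tag in existing}
    kept = []
    for name in suggested:
        key = name.lower()
        match = by_lower.get(key)  # exact match, ignoring case
        if match is None:
            # Fall back to the single closest existing tag, if similar enough.
            close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
            match = by_lower[close[0]] if close else None
        if match is not None and match not in kept:
            kept.append(match)
    return kept
```

With this sample, Skin-MonoBook resolves to MonoBook, while a suggestion with no similar counterpart such as Community-consensus is dropped.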

The initial prompt could also be adjusted to make it more conservative. This being said, it would always make some mistakes. If anyone were to decide to attempt this, it should (at least until it proves itself) stick to suggestions.

[edit] I would argue that a project making a change as impactful as the first imaginary report could actually warrant an entry in tech news; it seems at least debatable.