Phabricator should suggest possible duplicates when creating a new task
Open, Low, Public

Assigned To

None

Authored By

	Qgil
	Apr 11 2014, 5:18 AM

Description

When we created a new bug report in Bugzilla, we got a list of possible duplicates. I find it useful. It might be even more useful for new users filing what is probably going to be an obvious duplicate.

Upstream ticket: https://secure.phabricator.com/T4828

Context:

It seems that about 6.57% of tickets in Wikimedia Phabricator are marked as duplicates (see the comment from aklapper).

What is needed:

A "language pattern" or a similar approach that reduces that 6.57 percent without slowing down the other 93.43 percent of reporters.

Details

Reference: fl74

Related Objects

Status	Assigned	Task
Resolved	Qgil	T553 Engineering Community team goals for October 2014
Resolved	Qgil	T174 Launch Wikimedia Phabricator Day 1
Resolved	Qgil	T175 Nominate a team in charge of deploying and maintaining Wikimedia Phabricator code
Resolved	• RobLa-WMF	T17 Allocate resources for the migration and maintenance
Resolved	Qgil	T19 Define which features existing in our current tools are really missing in Phabricator
Resolved	Qgil	T15 Migrate Bugzilla to Phabricator
Resolved	Aklapper	T22 Identify features Bugzilla users would miss in Phabricator
Duplicate	Aklapper	T972 Allow for anonymous task logging with special template
Open	None	T45 Phabricator should suggest possible duplicates when creating a new task

Event Timeline


• flimport mentioned this in T72: Overview of what's happening in a project. Sep 12 2014, 1:26 AM

Qgil edited projects, added Phabricator; removed Wikimedia Phabricator Maintenance. Sep 19 2014, 10:32 AM

Qgil moved this task from Backlog to Wikimedia requests on the Phabricator (Upstream) board. Sep 30 2014, 10:10 PM

• flimport assigned this task to Aklapper. Oct 1 2014, 11:01 PM

• flimport added a subscriber: Qgil. Oct 2 2014, 9:47 PM

• flimport added a subscriber: scfc. Oct 7 2014, 3:00 AM

• flimport added a subscriber: Ragesoss. Oct 7 2014, 8:00 PM

Aklapper removed Aklapper as the assignee of this task. Oct 10 2014, 7:13 PM

Aklapper updated the task description.

Aklapper set Security to None.

• flimport added a subscriber: revi. Oct 10 2014, 9:00 PM

Liuxinyu970226 subscribed. Oct 18 2014, 4:13 AM

Qgil removed a project: Phabricator. Oct 22 2014, 10:40 PM

Qgil merged tasks: T908: Phabricator doesn't prompt for duplicate tasks when creating a task, T886: Task creation doesn't suggest possible duplicate tasks. Oct 26 2014, 5:21 AM

Qgil added subscribers: • MZMcBride, Nemo_bis.

• Quiddity subscribed. Oct 29 2014, 6:17 PM

Aklapper mentioned this in T972: Allow for anonymous task logging with special template. Oct 29 2014, 8:06 PM

Screenshot_2014-10-29_13.09.43.png (880×1 px, 158 KB)

Screenshot of how Quora handles this.

Screenshot_2014-10-29_13.10.48.png (652×2 px, 229 KB)

Screenshot of how Bugzilla handles this.

• Elitre subscribed. Nov 14 2014, 12:08 PM

Imagine a world where James Forrester spends his day just marking my tasks here as duplicate.

Nemo_bis awarded a token. Nov 20 2014, 5:10 PM

He7d3r subscribed. Nov 24 2014, 2:25 AM

Gryllida awarded a token. Dec 8 2014, 3:00 AM

Kozuch awarded a token. Dec 17 2014, 8:35 PM

Glaisher merged a task: T87650: Search for duplicates after a bug title is typed in. Jan 27 2015, 4:25 PM

Glaisher added subscribers: Aklapper, liangent.

Glaisher subscribed.

Qgil merged a task: T90348: Add feature to show similar tasks when trying to type a title for new task. Feb 21 2015, 6:24 PM

Qgil added subscribers: • mmodell, Mjbmr.

Liuxinyu970226 unsubscribed. Mar 5 2015, 3:39 AM

Ragesoss awarded a token. Apr 9 2015, 10:29 PM

Qgil merged a task: T98088: Phabricator should search for similar tasks when filing a new task. May 5 2015, 3:49 AM

Qgil added a subscriber: Gryllida.

Bene awarded a token. May 30 2015, 4:52 PM

Nemo_bis unsubscribed. Jun 16 2015, 12:52 PM

He7d3r awarded a token. Aug 12 2015, 6:15 PM

Ofbeaton subscribed. Apr 13 2016, 12:11 AM

Restricted Application removed a subscriber: Mjbmr. Apr 13 2016, 12:11 AM

Qgil mentioned this in T135327: Decide Phabricator improvements to be funded by WMF Technical Collaboration. May 20 2016, 9:27 AM

Danny_B added a project: Upstream. May 23 2016, 6:07 PM

Restricted Application added a subscriber: TerraCodes. May 23 2016, 6:07 PM

Nemo_bis added a project: Developer-Wishlist (2017). Jan 22 2017, 7:58 AM

RandomDSdevel awarded a token. Jan 31 2017, 12:47 AM

srishakatux moved this task from To Be Triaged to Frameworks on the Developer-Wishlist (2017) board. Feb 1 2017, 2:25 AM

srishakatux moved this task from Frameworks to Tools (Phabricator, Gerrit, etc.) on the Developer-Wishlist (2017) board.

This project is selected for the Developer-Wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) of the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header with a title "Description," to your content. Please do so before February 5th, 12:00 pm UTC.

Tgr mentioned this in T158149: Find an owner for top 10 Developer Wishlist 2017 proposals. Feb 15 2017, 2:11 AM

zhuyifei1999 subscribed. Feb 15 2017, 12:22 PM

Last update in https://secure.phabricator.com/T4828#106500 from Apr 11 2015:

If we did https://secure.phabricator.com/T7805 first and got a generic ApplicationSearch endpoint out of it, I'd be open to writing this as an extension CustomField and then disavowing all knowledge of it. The results UI wouldn't be custom, but maybe that's fine. We might need to pay down some infrastructure debt to let installs put this immediately underneath the "Title" field, I think a couple of the fields are still hard-coded.

Tgr mentioned this in T158148: Promote Developer Wishlist 2017 proposals. Feb 28 2017, 1:31 AM

This suggestion got the most votes in the Developer Wishlist (with a quite confident lead). @Qgil @greg is either of you interested in making this an official WMF goal?

I'm already working on Phabricator search stuff, so this is not that far out of scope...

@mmodell @greg I am assuming that you have this ball in your court. If you think that sponsoring the development of this feature might help, please let me know. Depending on the cost, we might be able to cover it during this fiscal year (before the end of June).

(I still would like to see the completion of T136213 before jumping on new funded tasks, though).

Thinking through the ways of addressing this. Will post more when we're ready to commit :)

Liuxinyu970226 awarded a token. Mar 27 2017, 3:48 PM

Qgil raised the priority of this task from Lowest to Low. Apr 20 2017, 8:12 AM

Mainframe98 mentioned this in T165222: Blacklist to block an user from editing a list of pages. May 14 2017, 6:32 PM

David_Hedlund awarded a token. May 14 2017, 6:33 PM

David_Hedlund subscribed.

In relation to T158149: Find an owner for top 10 Developer Wishlist 2017 proposals, I dare to ask: what is the current status? :)

Open, Low :P

But seriously, this is not on RelEng's radar (and our Q2 goals are already fleshed out and too many ;) ). It does not look like it is in upstream's current plans either (based on the lack of updates there).

Cirdan subscribed. Jul 2 2018, 6:59 AM

• Tbayer awarded a token. Dec 12 2018, 7:48 AM

• Tbayer subscribed.

kostajh awarded a token. Feb 22 2019, 6:58 PM

kostajh subscribed.

We really need this. I just merged the seventh duplicate task of T259565, and they were all created in a span of a few hours, some just minutes apart. It would be really good if Phabricator were smarter about this. (I was about to create a duplicate of this task too, before recalling that I had seen something like it.)

Ladsgroup awarded a token. Aug 4 2020, 4:20 PM

Ladsgroup subscribed.

Is now a good time to reconsider the Low priority? The lack of this feature creates a lot of work for triaging bugs and managing fragmented conversations, in particular for user-facing products.

As long as I see neither a good NLP / AI algorithm for the English language (more relevant to me) nor much research showing that suggesting potential duplicates significantly lowers the number of created duplicates (less relevant to me), this feels low/lowest priority to me. Maybe it's just my usually disappointing personal (anecdotal) experience with such "proposals" in GitLab and Bugzilla instances that makes me reluctant.
An algorithm might be far more successful if it gave much more weight to recently created (or edited?) tickets, I guess?
In any case, regarding WMF, I doubt that there are currently resources to tackle such a huge project. :-/ Feels like upstream territory.
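
The recency idea in the comment above could be prototyped roughly as follows. This is a minimal sketch under stated assumptions, not anything Phabricator or Wikimedia ships: the function names, the word-overlap (Jaccard) measure, the seven-day half-life, and the sample task labels are all hypothetical and chosen only for illustration.

```python
import re
from datetime import datetime, timedelta


def tokens(title):
    """Lowercased set of words in a title; punctuation is dropped."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a, b):
    """Share of distinct words that two titles have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def recency_weight(created, now, half_life_days=7.0):
    """Exponential decay: a ticket half_life_days old counts half as much."""
    age_days = max((now - created).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def suggest(new_title, existing, now, limit=5):
    """Rank (task_id, title, created) tuples by similarity times recency."""
    scored = []
    for task_id, title, created in existing:
        score = jaccard(new_title, title) * recency_weight(created, now)
        if score > 0:
            scored.append((score, task_id))
    scored.sort(reverse=True)
    return [task_id for _, task_id in scored[:limit]]


if __name__ == "__main__":
    # Hypothetical data: two identically titled tickets of different age
    # and one unrelated ticket.
    now = datetime(2020, 8, 5)
    existing = [
        ("recent", "Page history log bug", now - timedelta(days=1)),
        ("old", "Page history log bug", now - timedelta(days=70)),
        ("other", "Unrelated gadget crash", now),
    ]
    print(suggest("History log bug on page", existing, now))
    # prints ['recent', 'old']
```

With equal word overlap, the one-day-old ticket outranks the seventy-day-old one, and tickets sharing no words are dropped. Whether such a ranking would actually reduce duplicates is exactly the open question raised in the comment.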

In T45#6358610, @Ammarpad wrote:

I just merged the seventh duplicate task of T259565

Looking at the task summaries, I have trouble finding language-based patterns that would have allowed proposing the "right" existing ticket.
I see the root "parse" three times, the word "flow" in two ticket summaries, "mobile" four times, and "history" five times. Hmm. Some cross-matching, but still quite mixed for having 10 items in the pool.

T259565: [Regression] Unparsed wikitext in various JavaScript messages
T259696: Footnote in Flow messages in not parsed
T259602: Last edit indicator is broken on Minerva skin
T259601: History box error on Mobile Web for enwiki
T259584: Link to history broken
T259583: Revision History not accessible on mobile
T259581: Mobile page history "footer" showing raw URL
T259575: [regression -wmf.2] Homepage - SE filter "Create a new article" description displays ulr -encoded text not a link
T259580: "flow-wikitext-editor-help-and-preview" message is broken on flow pages on all wikis
T259571: Page history log bug
T259579: "Last modified" footer on mobile unparsed date and user links
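
The word counts quoted above can be reproduced with a short script. This is only an illustrative sketch: the helper name is made up, and a case-insensitive substring match is assumed as the way to count a "root" such as "parse" inside "Unparsed".

```python
# Task summaries copied verbatim from the list above.
titles = [
    "[Regression] Unparsed wikitext in various JavaScript messages",
    "Footnote in Flow messages in not parsed",
    "Last edit indicator is broken on Minerva skin",
    "History box error on Mobile Web for enwiki",
    "Link to history broken",
    "Revision History not accessible on mobile",
    'Mobile page history "footer" showing raw URL',
    '[regression -wmf.2] Homepage - SE filter "Create a new article" '
    "description displays ulr -encoded text not a link",
    '"flow-wikitext-editor-help-and-preview" message is broken on flow '
    "pages on all wikis",
    "Page history log bug",
    '"Last modified" footer on mobile unparsed date and user links',
]


def count_titles_containing(root, titles):
    """Number of titles that contain the root, ignoring case."""
    root = root.lower()
    return sum(1 for title in titles if root in title.lower())


counts = {
    root: count_titles_containing(root, titles)
    for root in ("parse", "flow", "mobile", "history")
}
print(counts)
# prints {'parse': 3, 'flow': 2, 'mobile': 4, 'history': 5}
```

No single root appears in more than five of the eleven summaries, which matches the observation that the overlap is too mixed to point reliably at T259565.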

DannyS712 subscribed. Aug 5 2020, 4:38 PM

Thank you for the honesty!

Maybe we could do something as simple as showing a list of all the most recently submitted tasks on the submission page? That might catch some things.

In T45#6364677, @mmodell wrote:

Maybe we could do something as simple as showing a list of all the most recently submitted tasks on the submission page? That might catch some things.

I don't think many people want a list of 50 tasks thrown in their face, and then to spend time reading that list every single time.
It might catch a few things.
It will also condition basically everybody to scream and quickly scroll down.

Looking at the last 10000 tickets created, 4.19% of them are marked as duplicates.
That figure might be biased (too recently created to have been triaged?), so looking at all tickets created since launching Phab instead, 6.57% of tickets are marked as duplicates.

SELECT t.status,COUNT(t.id) FROM phabricator_maniphest.maniphest_task t WHERE t.id > 249776 GROUP BY t.status;
+-----------+-------------+
| status    | COUNT(t.id) |
+-----------+-------------+
| declined  |         213 |
| duplicate |         419 |
| invalid   |         324 |
| open      |        5158 |
| resolved  |        3800 |
| stalled   |          86 |
+-----------+-------------+

SELECT t.status,COUNT(t.id) FROM phabricator_maniphest.maniphest_task t WHERE t.id > 75682 GROUP BY t.status;
+-----------+-------------+
| status    | COUNT(t.id) |
+-----------+-------------+
| declined  |       12500 |
| duplicate |       12088 |
| invalid   |        9638 |
| open      |       36926 |
| resolved  |      112026 |
| stalled   |         901 |
+-----------+-------------+
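
For reference, the two percentages follow directly from the status counts in the query results above. The snippet below is a small worked check; the variable and function names are illustrative only.

```python
# Status counts copied from the two query results above.
last_10000 = {
    "declined": 213, "duplicate": 419, "invalid": 324,
    "open": 5158, "resolved": 3800, "stalled": 86,
}
since_launch = {
    "declined": 12500, "duplicate": 12088, "invalid": 9638,
    "open": 36926, "resolved": 112026, "stalled": 901,
}


def duplicate_share(counts):
    """Percentage of tasks whose status is 'duplicate'."""
    total = sum(counts.values())
    return 100.0 * counts["duplicate"] / total


print(sum(last_10000.values()), round(duplicate_share(last_10000), 2))
# prints 10000 4.19
print(sum(since_launch.values()), round(duplicate_share(since_launch), 2))
# prints 184079 6.57
```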

• mmodell awarded a token. Aug 24 2020, 6:40 AM

danshick-wmde subscribed. Nov 24 2020, 1:37 PM

Any changes that would make this possible? ^_^

See my previous comments here; has some situation changed, or have new arguments arisen?

Bugreporter mentioned this in T283980: Phacility (Maintainer of Phabricator) is winding down. Upstream support ending.. May 30 2021, 6:42 AM

Sj awarded a token. Apr 9 2023, 8:03 PM

R4356th subscribed. Apr 29 2023, 6:19 PM

valerio.bozzolan updated the task description. May 24 2023, 6:55 AM

Another root problem: it's difficult, if not impossible, for newcomers to get a "big picture" after landing on a bug reporting form. That of course causes some of these duplicates.

I really like the idea of "minimal forms", but I'd like to reduce this feeling of «Welcome to this form, put your complaint/request here in this box, we'll find duplicates for you».

In the case of forms with at least one Tag, it would make sense if there was a way to visit that Tag. I'm definitely taking it very far, but it's a problem. A side-effect is to help people to be curious and discover other things and help each other "finding many friends on this journey" or stuff like that.

danshick-wmde unsubscribed. May 24 2023, 7:18 AM

valerio.bozzolan moved this task from Wikimedia requests to Need Discussion on the Phabricator (Upstream) board. Jun 14 2023, 1:55 PM

Aklapper moved this task from Need Discussion to Backlog on the Phabricator (Upstream) board. Jun 14 2023, 5:39 PM

Reedy moved this task from Backlog to To upstream/missing upstream link on the Upstream board. Aug 28 2023, 1:23 AM

	F560: Screenshot_2014-10-29_13.09.43.png
	Oct 29 2014, 8:11 PM

	F562: Screenshot_2014-10-29_13.10.48.png
	Oct 29 2014, 8:11 PM
