Convert Bugzilla's "Bug NNNNN" links to "TNNNNN" links in Phabricator
Open, Lowest, Public

Tokens
"Mountain of Wealth" token, awarded by Liuxinyu970226.
"Love" token, awarded by zhuyifei1999.
"Mountain of Wealth" token, awarded by Nemo_bis.
"Mountain of Wealth" token, awarded by Quiddity.
Assigned To
None
Authored By
Aklapper, Oct 18 2014

Description

In comments, [Bb]ug #?<N> should be parsed as T<N+2000>: that is, "bug 323" would either be rendered as T2323 or converted to "bug 323 (T2323)". Ideally, the full Bugzilla syntax would be parsed/converted; see https://bzr.mozilla.org/bugzilla/4.4/view/head:/Bugzilla/Template.pm#L234

Potentially incomplete proposal:

string in comment → to become
[Bb]ug 9123000 → bug 9123000 (T9125000)
[Bb]ug #9123000 → bug 9123000 (T9125000)
[Bb]ug #?9123000,? [Cc]omment #?123 → bug 9123000 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=9123000#c123 | comment 123 ]] (T9125000)
[Bb]ug 9123000#c123 → bug 9123000 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=9123000#c123 | comment 123 ]] (T9125000)
http://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 → http://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
https://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 → https://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
http://bugzilla.wikimedia.org/9123000 → http://bugzilla.wikimedia.org/9123000 (T9125000)
https://bugzilla.wikimedia.org/9123000 → https://bugzilla.wikimedia.org/9123000 (T9125000)
http://bugs.wikimedia.org/show_bug.cgi?id=9123000 → http://bugs.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
https://bugs.wikimedia.org/show_bug.cgi?id=9123000 → https://bugs.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
http://bugs.wikimedia.org/9123000 → http://bugs.wikimedia.org/9123000 (T9125000)
https://bugs.wikimedia.org/9123000 → https://bugs.wikimedia.org/9123000 (T9125000)
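
To make the proposal above easier to check, here is a minimal Python sketch of the listed conversions. It is only an illustration, not the code of any existing bot or extension: the names OFFSET, BUG_RE, URL_RE and annotate are made up for this example, and it handles only the forms listed in the proposal. It follows the proposal in writing a lowercase "bug" in the output.

```python
import re

OFFSET = 2000  # task number = Bugzilla bug number + 2000, as stated in the description
OLD_BZ = "https://old-bugzilla.wikimedia.org/show_bug.cgi?id={bug}#c{comment}"

# "bug 123" / "Bug #123", optionally followed by ", comment #4" or "#c4".
BUG_RE = re.compile(
    r"\b[Bb]ug #?(?P<bug>\d+)"
    r"(?:(?:,? [Cc]omment #?|#c)(?P<comment>\d+))?"
)

# Plain Bugzilla URLs on either host name, with or without show_bug.cgi.
URL_RE = re.compile(
    r"https?://(?:bugzilla|bugs)\.wikimedia\.org/"
    r"(?:show_bug\.cgi\?id=)?(?P<bug>\d+)"
)


def to_task(bug: str) -> str:
    """Map a Bugzilla bug number to the corresponding task name."""
    return f"T{int(bug) + OFFSET}"


def _bug_repl(match: re.Match) -> str:
    bug, comment = match.group("bug"), match.group("comment")
    out = f"bug {bug}"
    if comment:
        # Comment references keep pointing at the old Bugzilla instance.
        url = OLD_BZ.format(bug=bug, comment=comment)
        out += f" [[ {url} | comment {comment} ]]"
    return f"{out} ({to_task(bug)})"


def _url_repl(match: re.Match) -> str:
    # Keep the URL untouched and append the task name.
    return f"{match.group(0)} ({to_task(match.group('bug'))})"


def annotate(text: str) -> str:
    """Apply the URL rules first, then the "bug NNN" rules."""
    text = URL_RE.sub(_url_repl, text)
    return BUG_RE.sub(_bug_repl, text)


print(annotate("See Bug #9123000, comment #123 and https://bugs.wikimedia.org/9123000"))
```

Note that this sketch is not safe to run twice on the same text; a guard against re-annotating already converted references would be needed for that.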

Use case:

@chasemp: For sentences imported from Bugzilla tickets in comments like

  • Bug 12345 has been marked as a duplicate of this bug. ***

(yes, those are three stars at the start of the line turned into a bullet point, meh, another cosmetic thingy)

I know that some folks like to look at duplicates because they might provide more information about a bug and clicking instead of manual copy and paste fiddling sounds easier...

@chasemp: If a volunteer (or me) wanted to work on this, what would be needed? Write code with regexes and set up a bot for it?
Could a volunteer work on this task (or if not fully, how much of this task)?

I'm not sure how the bot fits in, but I am also thinking of this as a historical one-time fixup. I never had any plans for an ongoing translator. This is a straight mysql job in the first case which means simpler and more dangerous :)

This really doesn't require me / privs to write the job, only to validate / run it. It could easily be worked on on phab-01 or whatever.

The off-the-cuff break down to achieve this for

comments:

  1. Find all issues that were historically BZ using the custom reference field
  2. use the task PHID to find all comment type transactions
  3. use the comment transaction to find the actual comment itself
  4. retrieve the comment content from the db (assuming we only want to mangle comments from before migration)
  5. Run some kind of transform on the content (regex based changing Bug: Foo to Bug: Bar or whatever), and write the output back as the content.
  6. Repeat for every comment on every BZ imported task
  7. Wipe remarkup cache

descriptions

  1. Find all issues that were historically BZ using custom reference field
  2. retrieve the task description (is this stored as a transaction now or not, I'm not sure anymore?)
  3. Perform same regexy type logic as above and update description
  4. Repeat for every task from BZ
  5. Wipe remarkup cache

The good news is a basic framework for this would handle all cases, and it's not too hard to write this, just very time consuming to make sure it's not doing something awkward and untoward.
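
Since most of the effort is in checking that the transform does nothing untoward, a dry run that only reports what would change may help. The following Python sketch shows the idea under loud assumptions: Comment, transform, plan_updates and show_diff are hypothetical names, the comments are plain in-memory records rather than rows from the real Phabricator tables, and the transform is a simplified placeholder that only appends the task number (bug number + 2000).

```python
import difflib
import re
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Comment(NamedTuple):
    comment_id: int  # hypothetical identifier, not a real Phabricator column
    content: str


def transform(text: str) -> str:
    """Placeholder transform: append "(T<N+2000>)" after "bug N".

    The lookahead skips references that already carry a task number.
    """
    return re.sub(
        r"\b[Bb]ug #?(\d+)\b(?! \(T\d+\))",
        lambda m: f"{m.group(0)} (T{int(m.group(1)) + 2000})",
        text,
    )


def plan_updates(
    comments: Iterable[Comment], fn: Callable[[str], str]
) -> List[Tuple[int, str, str]]:
    """Return (id, old, new) for every comment the transform would change."""
    changes = []
    for comment in comments:
        new = fn(comment.content)
        if new != comment.content:
            changes.append((comment.comment_id, comment.content, new))
    return changes


def show_diff(changes: List[Tuple[int, str, str]]) -> str:
    """Render the planned changes as unified diffs for manual review."""
    lines: List[str] = []
    for cid, old, new in changes:
        lines.extend(
            difflib.unified_diff(
                old.splitlines(),
                new.splitlines(),
                fromfile=f"comment {cid} (before)",
                tofile=f"comment {cid} (after)",
                lineterm="",
            )
        )
    return "\n".join(lines)


sample = [
    Comment(1, "*** Bug 12345 has been marked as a duplicate of this bug. ***"),
    Comment(2, "No reference here."),
]
print(show_diff(plan_updates(sample, transform)))
```

Nothing is written back in this sketch; the write step and the remarkup cache wipe would only follow once the diffs have been reviewed.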

Related Objects

Event Timeline

That could happen for phrases such as "Mozilla bug 123456" or wikilinks

Let's please not introduce/discuss functionality in Phabricator here that never existed in Bugzilla. "Mozilla bug 1234" links to "bug 1234" in the local Bugzilla instance. Other example: I've also seen people writing "#1234" in Bugzilla but that never linked to bug report 1234 either.

I did not mean we should add autolinking functionality that did not previously exist. What I meant is that somehow, it should be possible to recover the original text of each migrated comment, in cases where it was messed up by the conversion. Though each migrated task will include the original bug number in a "Reference" field, if, for example, "bug 1234" happens to be a security bug that you don't have access to, you won't be able to see its bug number.

Hopefully, there aren't many security bugs, which would mean such a case would be unlikely. However, at the very least, nonexistent bug numbers should not be replaced with task numbers. That should be tested and verified.

the original bug number should be retained somehow. If some bug number shouldn't have been auto-linked, it should be possible to find out what it is.

old-bugzilla.wikimedia.org will still exist, cf. T40.

Will it exist for quite a while, or only for a very limited time? I wouldn't think the ops team would want to keep it online forever.

Qgil added a comment. Oct 28 2014, 7:46 AM

Quite a while, see T366#16004.

This could be done by a bot after the BZ and RT migration.

Qgil edited projects, added Phabricator; removed Bugzilla-Preview. Oct 29 2014, 9:21 PM

This task is not essential, it has a risk of overcomplicating the migration process, and it can be done afterward either with direct changes to the database or a bot.

Qgil lowered the priority of this task from Normal to Low. Nov 7 2014, 10:49 AM
TTO added a subscriber: TTO. Nov 25 2014, 8:51 AM

Anyone willing to take this up? I suppose it has to be done by a Phabricator admin person/someone who can edit everyone's comments.

Personally I would support a replacement of all occurrences of the form

bug 1234

into

bug 1234 (T3234)

That way, in the rare case of referring to a Mozilla, PHP etc. bug number, the original text is not lost.

In T687#783662, @TTO wrote:

Personally I would support a replacement of all occurrences of the form

bug 1234

into

bug 1234 (T3234)

That way, in the rare case of referring to a Mozilla, PHP etc. bug number, the original text is not lost.

This sounds reasonable to me, and if that bug number is incorrect it could still be fixed manually. An additional suggestion is to first look for texts which are generated by Bugzilla like the duplication notice. I find it unlikely that someone had written *** This bug has been marked as a duplicate of bug ##### *** in one line so we could change it there with pretty high certainty. Unfortunately I don't know if there are any autogenerated texts, and it would be beneficial if that is handled before.

After that is done, the bot could add the TXXX link after [Bb]ug *( #)?(\d+)(?! \(T\d+\))\D (this regex would prevent it from adding the Phabricator number to numbers that were already changed, as in your quote).
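
A small Python sketch of how such a guarded replacement could behave. It is not the exact regex from the comment: as an assumption of this example, the trailing \D is replaced by a (?!\d) lookahead so that a reference at the very end of a comment also matches and no following character is consumed. PATTERN and add_task_number are names invented for the example.

```python
import re

OFFSET = 2000  # task number = bug number + 2000

# Variant of the regex above: (?!\d) stops the engine from backtracking into
# a shorter number, and (?! \(T\d+\)) skips references that were already
# annotated with a task number.
PATTERN = re.compile(r"([Bb]ug *(?: #)?(\d+))(?!\d)(?! \(T\d+\))")


def add_task_number(text: str) -> str:
    """Append "(T<N+2000>)" to every bug reference not yet annotated."""
    return PATTERN.sub(
        lambda m: f"{m.group(1)} (T{int(m.group(2)) + OFFSET})", text
    )


once = add_task_number("see bug 1234 and Bug #42.")
twice = add_task_number(once)
print(once)
print(once == twice)
```

Running the function on its own output leaves the text unchanged, which is the property the negative lookahead is meant to provide.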

jayvdb added a subscriber: jayvdb. Dec 2 2014, 3:35 AM
Nemo_bis updated the task description. Dec 4 2014, 9:23 AM
Nemo_bis updated the task description.
Nemo_bis updated the task description. Dec 4 2014, 9:26 AM
revi added a subscriber: revi. Dec 4 2014, 12:32 PM
Tgr added a subscriber: Tgr. Dec 5 2014, 11:18 PM
In T687#15275, @Qgil wrote:

No, linking directly to comments is just too hard, because Phabricator doesn't give them an attribute based on a position, but on an ID. For instance, @He7d3r's comment above is #10, but look at the actual ID: T687#15271

Not worth the fight.

An ugly but easy solution is to convert them to T12345 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=10345#c123 | #123 ]]. IMO still preferable to leaving broken comment references around.

scfc added a subscriber: scfc. Dec 6 2014, 5:05 AM
chasemp claimed this task. Dec 30 2014, 7:42 PM

Potentially incomplete proposal added to initial description.

If someone understands the regexes in http://bzr.mozilla.org/bugzilla/4.4/view/head:/Bugzilla/Template.pm#L234 better and if there are cases missing, please edit/correct the initial description.

Aklapper updated the task description. Dec 31 2014, 5:15 PM
Aklapper updated the task description. Dec 31 2014, 5:20 PM
scfc updated the task description. Dec 31 2014, 7:47 PM
Qgil added a comment. Jan 2 2015, 4:01 PM
In T687#822815, @Tgr wrote:

An ugly but easy solution is to convert them to T12345 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=10345#c123 | #123 ]]. IMO still preferable to leaving broken comment references around.

If we do this, maybe it would be better to have a more explicit link for the comment part, such as T12345 ([[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=10345#c123 | comment #123 ]]) i.e. T12345 (comment #123). We would need to make sure that these links work in the static version in the works (T1198).

MC8 added a subscriber: MC8. Jan 12 2015, 3:32 PM
chasemp lowered the priority of this task from Low to Lowest. Jan 30 2015, 7:59 PM
chasemp removed chasemp as the assignee of this task.
TTO added a comment. Mar 22 2015, 9:13 AM

I'm sad to see no progress here :( This is probably my biggest pain point with Phabricator, as most of the tasks I deal with are ones imported from Bugzilla.

Would I be right in guessing that the few people who have the access to perform this task, are too busy to deal with it?

I dispute the priority being set to "Lowest"; this should have been resolved from Day 0 of Phabricator being the official issue tracker.

scfc added a comment. Mar 22 2015, 4:09 PM

@TTO: The priority reflects the order in which the assignee or "the project" intends to resolve this; cf. "Setting task priorities".

@chasemp: If a volunteer (or me) wanted to work on this, what would be needed? Write code with regexes and set up a bot for it?
Could a volunteer work on this task (or if not fully, how much of this task)?

I'm not sure how the bot fits in, but I am also thinking of this as a historical one-time fixup. I never had any plans for an ongoing translator. This is a straight mysql job in the first case which means simpler and more dangerous :)

This really doesn't require me / privs to write the job, only to validate / run it. It could easily be worked on on phab-01 or whatever.

The off-the-cuff break down to achieve this for

comments:

  1. Find all issues that were historically BZ using the custom reference field
  2. use the task PHID to find all comment type transactions
  3. use the comment transaction to find the actual comment itself
  4. retrieve the comment content from the db (assuming we only want to mangle comments from before migration)
  5. Run some kind of transform on the content (regex based changing Bug: Foo to Bug: Bar or whatever), and write the output back as the content.
  6. Repeat for every comment on every BZ imported task
  7. Wipe remarkup cache

descriptions

  1. Find all issues that were historically BZ using custom reference field
  2. retrieve the task description (is this stored as a transaction now or not, I'm not sure anymore?)
  3. Perform same regexy type logic as above and update description
  4. Repeat for every task from BZ
  5. Wipe remarkup cache

The good news is a basic framework for this would handle all cases, and it's not too hard to write this, just very time consuming to make sure it's not doing something awkward and untoward.

bz9123000 should be part of the list too.

Change 236417 had a related patch set uploaded (by Seb35):
Phabricator extension to add links on Bugzilla "bug NNN" syntax

https://gerrit.wikimedia.org/r/236417

Seb35 added a subscriber: Seb35. Sep 6 2015, 2:41 PM

I worked on this feature for the last two days and just got a working PoC (see figure below and linked Gerrit change #236417). But it comes with two warnings:

  • I didn’t previously know Phabricator from a developer's point of view, so perhaps there are better ways to achieve the desired result,
  • the extension contains a very hacky part that tricks an internal part of the class. Without it, the extension would only work in smaller contexts such as the comment preview, but not while displaying an entire page with the "bug NNN" syntax. The next step to improve it is to contact the Phabricator community.

I didn’t carefully read this task and related tasks about what syntax should be used for one behaviour or another. The work done on this PoC can be easily adapted to support various syntaxes (e.g. Gerrit links, RT links, Bugzilla comment links, etc.)

Qgil added a comment. Sep 15 2015, 7:44 AM

Thank you @Seb35! I see that there is some additional discussion in the Gerrit change.

@Seb35, @Aklapper, @mmodell, @chasemp, what do you think about adding this task to DevRel-October-2015, aiming to merge this feature during the next month?

We are not running strict sprints and nothing would happen if we cannot make it during October, but then we would push it for November and at least we would keep our attention until it's done. It's not that we get patches to improve Wikimedia Phabricator every week, and we should support @Seb35 in his first phab-contribution. :)

I appreciate what @Seb35 has accomplished, but I wonder how much value there is in this, considering that Bugzilla references are used in a diminishingly small portion of our tasks.

I'm not terribly opposed to merging this; however, I want to make sure it's not going to have much performance impact. I suspect that could be an issue given how much remarkup is used in Phabricator (especially in the live previews of comments, which repeatedly render the text to HTML via the remarkup engine).

Qgil added a comment. Sep 15 2015, 8:10 AM

I have no technical knowledge to recommend a specific solution. I'm just proposing that we address @Seb35's contribution in the best way.

I originally thought that tasks like this one would be solved by a bot that would change all the references, although that would create a wave of notifications that would surely upset users. If we decide that converting "Bug NNN" on the fly is not an option either because of performance issues, then what we need to do is to decline these tasks to clarify expectations and not make anybody invest their time in cul-de-sacs.

@Qgil: it is possible to go directly to the database and update the markup without generating any alerts, and that would achieve the desired result without a runtime performance impact. However, like I said, I'm not entirely opposed to this solution, and at least it's mostly working now, so it'll be a lot less work to merge this versus developing something that would update all the old content.

I still update bug NNN whenever I see them in task descriptions.
If I'm reading old comments, looking for "See also" etc, it's very frustrating to have to manually do these steps for all these old bug numbers:

  • copy
  • paste into search
  • add 2000
  • add the prefix "T"

(E.g. the 10 instances I had to do just now, whilst rereading T54817)

It would be very very helpful, (for all of the people who try to make sense of pre-2014 content), if we could implement either of the above proposed solutions.

I usually replace bug NNN with {TNNN} for autoexpansion, but the hover-functionality of TNNN (via database-level-replacement, or via Seb35's code) would also work fine.
Please please please and thank you!

I wish I had the time to look at this. I'm going to keep it in mind for a hackathon since it seems to still be a pain point. I would also direct anyone who was interested in how I would go about it.

Bawolff added a subscriber: Bawolff. Dec 1 2015, 7:03 PM

I still update bug NNN whenever I see them in task descriptions.
If I'm reading old comments, looking for "See also" etc, it's very frustrating to have to manually do these steps for all these old bug numbers:

  • copy
  • paste into search
  • add 2000
  • add the prefix "T" (E.g. the 10 instances I had to do just now, whilst rereading T54817)

That's a lot of steps.

I go to https://bugzilla.wikimedia.org/<bug number here>, and let my browser redirect me

Tgr added a comment. Dec 11 2015, 8:48 AM

I wish I had the time to look at this. I'm going to keep it in mind for a hackathon since it seems to still be a pain point. I would also direct anyone who was interested in how I would go about it.

Maybe you could write the directions here so people can see the amount of effort / types of skill it requires and decide interest based on that.

When I add a T-number by hand, a backlink is created in the target article. Is there a way to do the conversion in such a way that that still happens? It would be quite useful.

In T687#1872421, @Tgr wrote:

I wish I had the time to look at this. I'm going to keep it in mind for a hackathon since it seems to still be a pain point. I would also direct anyone who was interested in how I would go about it.

Maybe you could write the directions here so people can see the amount of effort / types of skill it requires and decide interest based on that.

When I add a T-number by hand, a backlink is created in the target article. Is there a way to do the conversion in such a way that that still happens? It would be quite useful.

@chasemp: If a volunteer (or me) wanted to work on this, what would be needed? Write code with regexes and set up a bot for it?
Could a volunteer work on this task (or if not fully, how much of this task)?

I'm not sure how the bot fits in, but I am also thinking of this as a historical one-time fixup. I never had any plans for an ongoing translator. This is a straight mysql job in the first case which means simpler and more dangerous :)

This really doesn't require me / privs to write the job, only to validate / run it. It could easily be worked on on phab-01 or whatever.

The off-the-cuff break down to achieve this for

comments:

  1. Find all issues that were historically BZ using the custom reference field
  2. use the task PHID to find all comment type transactions
  3. use the comment transaction to find the actual comment itself
  4. retrieve the comment content from the db (assuming we only want to mangle comments from before migration)
  5. Run some kind of transform on the content (regex based changing Bug: Foo to Bug: Bar or whatever), and write the output back as the content.
  6. Repeat for every comment on every BZ imported task
  7. Wipe remarkup cache

descriptions

  1. Find all issues that were historically BZ using custom reference field
  2. retrieve the task description (is this stored as a transaction now or not, I'm not sure anymore?)
  3. Perform same regexy type logic as above and update description
  4. Repeat for every task from BZ
  5. Wipe remarkup cache

The good news is a basic framework for this would handle all cases, and it's not too hard to write this, just very time consuming to make sure it's not doing something awkward and untoward.

Tgr added a comment. Dec 11 2015, 10:04 PM

Thanks @chasemp! I guess this wouldn't create the <person> mentioned this in <task> backlinks, right?

Restricted Application added a subscriber: TerraCodes. May 16 2016, 9:34 PM
Poyekhali moved this task from To Triage to Doing on the Phabricator board. Jun 10 2016, 3:44 AM
Aklapper moved this task from Doing to Misc on the Phabricator board. Oct 7 2016, 5:07 PM
zhuyifei1999 added a subscriber: zhuyifei1999.

This project is selected for the Developer-Wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) of the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header with a title "Description," to your content. Please do so before February 5th, 12:00 pm UTC.

Is there any progress on this?

No (if there was progress it could be seen in this task)