Convert Bugzilla's "Bug NNNNN" links to "TNNNNN" links in Phabricator
Open, Lowest, Public

Description

In comments, [Bb]ug #?<N> should be parsed as T<N+2000>: that is, "bug 323" would be rendered as T2323; or converted to "bug 323 (T2323)". Ideally, the full bugzilla syntax would be parsed/converted, see https://bzr.mozilla.org/bugzilla/4.4/view/head:/Bugzilla/Template.pm#L234

Potentially incomplete proposal:

string in comment → to become
[Bb]ug 9123000 → bug 9123000 (T9125000)
[Bb]ug #9123000 → bug 9123000 (T9125000)
[Bb]ug #?9123000,? [Cc]omment #?123 → bug 9123000 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=9123000#c123 | comment 123 ]] (T9125000)
[Bb]ug 9123000#c123 → bug 9123000 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=9123000#c123 | comment 123 ]] (T9125000)
http://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 → http://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
https://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 → https://bugzilla.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
http://bugzilla.wikimedia.org/9123000 → http://bugzilla.wikimedia.org/9123000 (T9125000)
https://bugzilla.wikimedia.org/9123000 → https://bugzilla.wikimedia.org/9123000 (T9125000)
http://bugs.wikimedia.org/show_bug.cgi?id=9123000 → http://bugs.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
https://bugs.wikimedia.org/show_bug.cgi?id=9123000 → https://bugs.wikimedia.org/show_bug.cgi?id=9123000 (T9125000)
http://bugs.wikimedia.org/9123000 → http://bugs.wikimedia.org/9123000 (T9125000)
https://bugs.wikimedia.org/9123000 → https://bugs.wikimedia.org/9123000 (T9125000)
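
For anyone who wants to experiment, the rows above could be sketched in Python roughly like this. It is only an illustration, not a tested migration tool: the names BUG_OFFSET, OLD_BZ, PATTERN and convert are made up for the example, and matching any whitespace after "bug" is an assumption. Everything is done in a single substitution pass so that text produced by one rule is not rescanned by another.

```python
import re

BUG_OFFSET = 2000  # per the description: Bugzilla bug N corresponds to T(N + 2000)
OLD_BZ = "https://old-bugzilla.wikimedia.org/show_bug.cgi?id="

# One combined pattern: either a Bugzilla URL, or a textual "bug N" reference
# that may be followed by ", comment M", " comment #M" or "#cM".
PATTERN = re.compile(
    r"(?P<url>https?://(?:bugzilla|bugs)\.wikimedia\.org/"
    r"(?:show_bug\.cgi\?id=)?(?P<ubug>\d+))"
    r"|\b[Bb]ug\s+#?(?P<bug>\d+)"
    r"(?:(?:,?\s+[Cc]omment\s+#?|#c)(?P<comment>\d+))?"
)


def _replace(match):
    if match.group("url"):
        # URLs are kept as they are; only the task number is appended.
        task = int(match.group("ubug")) + BUG_OFFSET
        return f"{match.group('url')} (T{task})"
    bug = int(match.group("bug"))
    text = f"bug {bug}"  # the table writes the result in lowercase
    comment = match.group("comment")
    if comment:
        # Comment references become a remarkup link to the old Bugzilla.
        text += f" [[ {OLD_BZ}{bug}#c{comment} | comment {comment} ]]"
    return f"{text} (T{bug + BUG_OFFSET})"


def convert(text):
    """Apply the proposed conversions to one comment or description."""
    return PATTERN.sub(_replace, text)
```

Running convert() a second time on already converted text would append the T-number again, so this sketch assumes each comment is processed exactly once.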

Use case:

@chasemp: For sentences imported from Bugzilla tickets in comments like

  • Bug 12345 has been marked as a duplicate of this bug. ***

(yes, those are three stars at the start of the line turned into a bullet point, meh, another cosmetic thingy)

I know that some folks like to look at duplicates because they might provide more information about a bug and clicking instead of manual copy and paste fiddling sounds easier...

@chasemp: If a volunteer (or me) wanted to work on this, what would be needed? Write code with regexes and set up a bot for it?
Could a volunteer work on this task (or if not fully, how much of this task)?

I'm not sure how the bot fits in, but I am also thinking of this as a historical one-time fixup. I never had any plans for an ongoing translator. This is a straight mysql job in the first case which means simpler and more dangerous :)

This really doesn't require me / privs to write the job, only to validate / run it. It could easily be worked on on phab-01 or whatever.

The off-the-cuff breakdown to achieve this for

comments:

  1. Find all issues that were historically BZ using the custom reference field
  2. use the task PHID to find all comment type transactions
  3. use the comment transaction to find the actual comment itself
  4. retrieve the comment content from the db (assuming we only want to mangle comments from before migration)
  5. Run some kind of transform on the content (regex based changing Bug: Foo to Bug: Bar or whatever), and write the output back as the content.
  6. Repeat for every comment on every BZ imported task
  7. Wipe remarkup cache

descriptions:

  1. Find all issues that were historically BZ using the custom reference field
  2. retrieve the task description (is this stored as a transaction now or not, I'm not sure anymore?)
  3. Perform same regexy type logic as above and update description
  4. Repeat for every task from BZ
  5. Wipe remarkup cache

The good news is a basic framework for this would handle all cases, and it's not too hard to write this, just very time consuming to make sure it's not doing something awkward and untoward.
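
As a rough illustration of the breakdown above, here is a minimal Python sketch that runs the same steps against a toy SQLite database. The table and column names (task, txn, txn_comment, markup_cache, bz_reference), the cutoff parameter and the simplified transform are all made up for this example; Phabricator's real MySQL schema is different and would have to be checked before writing the actual job.

```python
import re
import sqlite3


def transform(text):
    """Append " (T<N + 2000>)" to "bug N" references that lack a T-number."""
    return re.sub(
        r"\b[Bb]ug #?(\d+)(?!\d)(?! \(T\d+\))",
        lambda m: f"{m.group(0)} (T{int(m.group(1)) + 2000})",
        text,
    )


def fix_imported_tasks(db, cutoff):
    """Rewrite descriptions and pre-migration comments of BZ-imported tasks.

    `db` is a sqlite3 connection using the hypothetical schema described
    above; `cutoff` is a Unix timestamp standing in for the migration date.
    """
    # Step 1: a non-empty reference field marks a task imported from Bugzilla.
    tasks = db.execute(
        "SELECT phid, description FROM task WHERE bz_reference IS NOT NULL"
    ).fetchall()
    for task_phid, description in tasks:
        # Descriptions: same transform, written straight back.
        if description is not None:
            db.execute(
                "UPDATE task SET description = ? WHERE phid = ?",
                (transform(description), task_phid),
            )
        # Steps 2-4: task PHID -> comment transactions -> comment content,
        # limited to comments created before the cutoff.
        comments = db.execute(
            "SELECT c.phid, c.content FROM txn t "
            "JOIN txn_comment c ON c.phid = t.comment_phid "
            "WHERE t.object_phid = ? AND c.date_created < ?",
            (task_phid, cutoff),
        ).fetchall()
        # Steps 5-6: transform every comment and write the result back.
        for comment_phid, content in comments:
            db.execute(
                "UPDATE txn_comment SET content = ? WHERE phid = ?",
                (transform(content), comment_phid),
            )
    # Step 7: drop cached renderings so the new text is rendered again.
    db.execute("DELETE FROM markup_cache")
    db.commit()
```

Trying the logic on a throwaway copy like this first keeps the "simpler and more dangerous" part away from production data until the output has been validated.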

Event Timeline

old-bugzilla.wikimedia.org will still exist, cf. T40.

Will it exist for quite a while, or only for a very limited time? I wouldn't think the ops team would want to keep it online forever.

Quite a while, see T366#16004.

This could be done by a bot after the BZ and RT migration.

This task is not essential, it has a risk of overcomplicating the migration process, and it can be done afterward either with direct changes to the database or a bot.

Qgil lowered the priority of this task from Medium to Low. Nov 7 2014, 10:49 AM

Anyone willing to take this up? I suppose it has to be done by a Phabricator admin person/someone who can edit everyone's comments.

Personally I would support a replacement of all occurrences of the form

bug 1234

into

bug 1234 (T3234)

That way, in the rare case of referring to a Mozilla, PHP etc. bug number, the original text is not lost.

This sounds reasonable to me, and if that bug number is incorrect it could still be fixed manually. An additional suggestion is to first look for text generated by Bugzilla itself, like the duplicate notice. I find it unlikely that someone would have typed *** This bug has been marked as a duplicate of bug ##### *** on one line by hand, so we could change it there with pretty high certainty. Unfortunately I don't know whether there are any other autogenerated texts; it would be beneficial to handle those first.

After that is done, the bot could add the TXXX link after [Bb]ug *( #)?(\d+)(?! \(T\d+\))\D (this regex would keep it from adding the Phabricator number to numbers that were already converted, like the one in your quote).
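
To make the lookahead idea concrete, here is a small Python sketch. It uses a slight variant of the regex above: the trailing \D is swapped for the lookahead (?!\d), so that no following character is consumed and a reference at the very end of the text still matches. That change, and the names UNCONVERTED_BUG and add_task_numbers, are assumptions made for this example.

```python
import re

# "bug N" or "bug #N" that is not already followed by " (T<digits>)".
# (?!\d) stops the engine from backtracking to a shorter number in order to
# get past the second lookahead.
UNCONVERTED_BUG = re.compile(r"([Bb]ug *(?: #)?(\d+))(?!\d)(?! \(T\d+\))")


def add_task_numbers(text, offset=2000):
    """Append " (T<N + offset>)" to each "bug N" that has no T-number yet."""
    return UNCONVERTED_BUG.sub(
        lambda m: f"{m.group(1)} (T{int(m.group(2)) + offset})", text
    )
```

Because converted references are skipped, running the function again on its own output leaves the text unchanged, which is what makes a re-run of the bot safe for this particular pattern.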

Nemo_bis updated the task description.

In T687#15275, @Qgil wrote:

No, linking directly to comments is just too hard, because Phabricator doesn't give them an attribute based on a position, but on an ID. For instance, @He7d3r's comment above is #10, but look at the actual ID: T687#15271

Not worth the fight.

An ugly but easy solution is to convert them to T12345 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=10345#c123 | #123 ]]. IMO still preferable to leaving broken comment references around.

Potentially incomplete proposal added to initial description.

If someone understands the regexes in http://bzr.mozilla.org/bugzilla/4.4/view/head:/Bugzilla/Template.pm#L234 better and if there are cases missing, please edit/correct the initial description.

In T687#822815, @Tgr wrote:

An ugly but easy solution is to convert them to T12345 [[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=10345#c123 | #123 ]]. IMO still preferable to leaving broken comment references around.

If we do this, maybe it would be better to have a more explicit link for the comment part, such as T12345 ([[ https://old-bugzilla.wikimedia.org/show_bug.cgi?id=10345#c123 | comment #123 ]]) i.e. T12345 (comment #123). We would need to make sure that these links work in the static version in the works (T1198).

chasemp lowered the priority of this task from Low to Lowest.

I'm sad to see no progress here :( This is probably my biggest pain point with Phabricator, as most of the tasks I deal with are ones imported from Bugzilla.

Would I be right in guessing that the few people who have the access to perform this task, are too busy to deal with it?

I dispute the priority being set to "Lowest"; this should have been resolved from Day 0 of Phabricator being the official issue tracker.

@TTO: The priority reflects the order in which the assignee or "the project" intends to resolve this; cf. "Setting task priorities".

@chasemp: If a volunteer (or me) wanted to work on this, what would be needed? Write code with regexes and set up a bot for it?
Could a volunteer work on this task (or if not fully, how much of this task)?

I'm not sure how the bot fits in, but I am also thinking of this as a historical one-time fixup. I never had any plans for an ongoing translator. This is a straight mysql job in the first case which means simpler and more dangerous :)

This really doesn't require me / privs to write the job, only to validate / run it. It could easily be worked on on phab-01 or whatever.

The off-the-cuff breakdown to achieve this for

comments:

  1. Find all issues that were historically BZ using the custom reference field
  2. use the task PHID to find all comment type transactions
  3. use the comment transaction to find the actual comment itself
  4. retrieve the comment content from the db (assuming we only want to mangle comments from before migration)
  5. Run some kind of transform on the content (regex based changing Bug: Foo to Bug: Bar or whatever), and write the output back as the content.
  6. Repeat for every comment on every BZ imported task
  7. Wipe remarkup cache

descriptions:

  1. Find all issues that were historically BZ using the custom reference field
  2. retrieve the task description (is this stored as a transaction now or not, I'm not sure anymore?)
  3. Perform same regexy type logic as above and update description
  4. Repeat for every task from BZ
  5. Wipe remarkup cache

The good news is a basic framework for this would handle all cases, and it's not too hard to write this, just very time consuming to make sure it's not doing something awkward and untoward.

bz9123000 should be part of the list too.

Change 236417 had a related patch set uploaded (by Seb35):
Phabricator extension to add links on Bugzilla "bug NNN" syntax

https://gerrit.wikimedia.org/r/236417

I worked on this feature for the last two days and I just got a working PoC (see figure below and linked Gerrit change #236417). But it comes with two warnings:

  • I didn’t previously know Phabricator from a developer's point of view, so perhaps there is a better way to achieve the desired result,
  • the extension contains a very hacky part that tricks an internal part of the class -- without it, it would only work in smaller contexts such as the comment preview, but not when displaying an entire page with the "bug NNN" syntax -- the next step to improve it is to contact the Phabricator community

I didn’t carefully read this task and related tasks about what syntax should be used for one behaviour or another. The work done on this PoC can be easily adapted to support various syntaxes (e.g. Gerrit links, RT links, Bugzilla comment links, etc.)

Thank you @Seb35! I see that there is some additional discussion in the Gerrit change.

@Seb35, @Aklapper, @mmodell, @chasemp, what do you think about adding this task to DevRel-October-2015, aiming to merge this feature during the next month?

We are not running strict sprints and nothing would happen if we cannot make it during October, but then we would push it for November and at least we would keep our attention until it's done. It's not that we get patches to improve Wikimedia Phabricator every week, and we should support @Seb35 in his first phab-contribution. :)

I appreciate what @Seb35 has accomplished but I wonder how much value there is in this, considering that Bugzilla references are used in a diminishingly small portion of our tasks.

I'm not terribly opposed to merging this; however, I want to make sure it's not going to have much performance impact. I suspect that could be an issue given how much remarkup is used in Phabricator (especially in the live previews of comments, which repeatedly render the text to HTML via the remarkup engine).

I have no technical knowledge to recommend a specific solution. I'm just proposing that we address @Seb35's contribution in the best way.

I originally thought that tasks like this one would be solved by a bot that would change all the references, although that would create a wave of notifications that would surely upset users. If we decide that converting "Bug NNN" on the fly is not an option either because of performance issues, then what we need to do is to decline these tasks to clarify expectations and not make anybody invest their time in cul-de-sacs.

@Qgil: it is possible to go directly to the database and update the markup without generating any alerts, and that would achieve the desired result without a runtime performance impact. However, like I said, I'm not entirely opposed to this solution, and at least it's mostly working now, so it'll be a lot less work to merge this versus developing something that would update all the old content.

I still update bug NNN whenever I see them in task descriptions.
If I'm reading old comments, looking for "See also" etc, it's very frustrating to have to manually do these steps for all these old bug numbers:

  • copy
  • paste into search
  • add 2000
  • add the prefix "T"

(E.g. the 10 instances I had to do just now, whilst rereading T54817)

It would be very, very helpful (for all of the people who try to make sense of pre-2014 content) if we could implement either of the above proposed solutions.

I usually replace bug NNN with {TNNN} for autoexpansion, but the hover-functionality of TNNN (via database-level-replacement, or via Seb35's code) would also work fine.
Please please please and thank you!

I wish I had the time to look at this. I'm going to keep it in mind for a hackathon, since it seems to still be a pain point. I would also direct anyone who was interested in how I would go about it.

I still update bug NNN whenever I see them in task descriptions.
If I'm reading old comments, looking for "See also" etc, it's very frustrating to have to manually do these steps for all these old bug numbers:

  • copy
  • paste into search
  • add 2000
  • add the prefix "T"

(E.g. the 10 instances I had to do just now, whilst rereading T54817)

That's a lot of steps.

I go to https://bugzilla.wikimedia.org/<bug number here>, and let my browser redirect me

I wish I had the time to look at this. I'm going to keep it in mind for a hackathon, since it seems to still be a pain point. I would also direct anyone who was interested in how I would go about it.

Maybe you could write the directions here so people can see the amount of effort / types of skill it requires and decide interest based on that.

When I add a T-number by hand, a backlink is created in the target article. Is there a way to do the conversion in such a way that that still happens? It would be quite useful.

Thanks @chasemp! I guess this wouldn't create the <person> mentioned this in <task> backlinks, right?

This project is selected for the Developer-Wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) of the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header with a title "Description," to your content. Please do so before February 5th, 12:00 pm UTC.

Is there any progress on this?

No (if there was progress it could be seen in this task)