Set up redirects from old bugzilla.wikimedia.org URLs
Closed, Resolved, Public

Description

The patch: https://gerrit.wikimedia.org/r/#/c/166283/

The requirements:

  • Single ticket URLs to map:
    • /bugzilla.wikimedia.org/23223
    • /bugzilla.wikimedia.org/show_bug.cgi?id=23223
    • /bugs.wikimedia.org/23223
    • /bugs.wikimedia.org/show_bug.cgi?id=23223
  • Links that should go to the Phabricator homepage https://phabricator.wikimedia.org/ :
    • //(bugs|bugzilla).wikimedia.org/enter_bug.cgi (which can have additional URL parameters to ignore)
    • //(bugs|bugzilla).wikimedia.org/index.cgi
    • //(bugs|bugzilla).wikimedia.org/query.cgi (which can have additional URL parameters to ignore)
    • //(bugs|bugzilla).wikimedia.org/reports.cgi (which can have additional URL parameters to ignore)
    • //(bugs|bugzilla).wikimedia.org/report.cgi (which can have additional URL parameters to ignore)
    • //(bugs|bugzilla).wikimedia.org/chart.cgi (which can have additional URL parameters to ignore)
    • //(bugs|bugzilla).wikimedia.org/describecomponents.cgi (which can have additional URL parameters to ignore)
    • //(bugs|bugzilla).wikimedia.org/attachment.cgi?id=12345 (rather uncommon plus we won't keep the numbering scheme. Still possible to query for the attachment ID manually in Phabricator though.)
    • //bug-attachment.wikimedia.org/attachment.cgi (which can have additional URL parameters to ignore)
  • //(bugs|bugzilla).wikimedia.org/buglist.cgi (always with additional URL parameters) will redirect to https://old-bugzilla.wikimedia.org/buglist.cgi while keeping its URL parameters intact. The banner (T1234) will tell people that what they see is outdated. More info, just for the record: such URLs can have dozens of URL parameters, e.g. buglist.cgi with a bug_id parameter followed by a comma-separated list of bug IDs (to get a static list of tickets). Queries for a specific status (e.g. open tickets) in specific products/components are common, but we won't have a publicly stored mapping between Bugzilla products/components and Phab projects, as URL parameters are fundamentally different in Phabricator. This also applies to Bugzilla's "saved searches" functionality: people will still be able to see and edit their search parameters in old-bugzilla and try to convert them manually into Phabricator searches.
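The requirements above can be read as a small routing table. What follows is a minimal Python sketch of that logic under stated assumptions, not the PHP from the Gerrit patch. The function name `redirect_target`, the `bug_to_task` lookup callable, and the choice to return `None` when no rule or no task matches are all hypothetical. The read-only host name old-bugzilla.wikimedia.org is taken from the discussion in this task.

```python
import re
from urllib.parse import parse_qs, urlsplit

PHAB_HOME = "https://phabricator.wikimedia.org/"
# Assumption: the read-only Bugzilla lives on this host (see the comments below).
OLD_BUGZILLA = "https://old-bugzilla.wikimedia.org"
BUG_HOSTS = {"bugs.wikimedia.org", "bugzilla.wikimedia.org"}
# Scripts that the description sends to the Phabricator homepage.
HOME_SCRIPTS = {
    "enter_bug.cgi", "index.cgi", "query.cgi", "reports.cgi", "report.cgi",
    "chart.cgi", "describecomponents.cgi", "attachment.cgi",
}


def redirect_target(url, bug_to_task):
    """Return the redirect target for an old Bugzilla URL, or None if no rule applies.

    bug_to_task is a hypothetical callable mapping a bug ID to a task ID (or None).
    """
    parts = urlsplit(url)
    host = parts.hostname
    path = parts.path.lstrip("/")

    if host == "bug-attachment.wikimedia.org":
        return PHAB_HOME if path == "attachment.cgi" else None
    if host not in BUG_HOSTS:
        return None

    # Single-ticket URLs: /23223 or /show_bug.cgi?id=23223
    bug_id = None
    if re.fullmatch(r"[0-9]+", path):
        bug_id = int(path)
    elif path == "show_bug.cgi":
        ids = parse_qs(parts.query).get("id", [])
        if ids and re.fullmatch(r"[0-9]+", ids[0]):
            bug_id = int(ids[0])
    if bug_id is not None:
        task_id = bug_to_task(bug_id)
        return f"{PHAB_HOME}T{task_id}" if task_id is not None else None

    # Search URLs keep their parameters and go to the read-only Bugzilla.
    if path == "buglist.cgi":
        suffix = f"?{parts.query}" if parts.query else ""
        return f"{OLD_BUGZILLA}/buglist.cgi{suffix}"

    if path in HOME_SCRIPTS:
        return PHAB_HOME
    return None
```

Passing the lookup in as a callable keeps the URL matching separate from wherever the bug-to-task mapping is stored, so the patterns can be checked without a database.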

Details

Reference
fl65


This is basically done... just need to coordinate with @chasemp to figure out how we will actually install this in production after the migration.

In T40#7408, @mmodell wrote:

This is basically done... just need to coordinate with @chasemp to figure out how we will actually install this in production after the migration.

Is there some code/patch/textfile dump/whatever to take a look at all the redirect rules?

flimport added a subscriber: Qgil. Oct 2 2014, 9:48 PM
flimport added a subscriber: greg. Oct 2 2014, 9:58 PM
mmodell raised the priority of this task from Normal to High. Oct 6 2014, 7:31 PM

@Aklapper: I'll push up the code for review

flimport added a subscriber: scfc. Oct 7 2014, 3:00 AM
mmodell added a comment (edited). Oct 12 2014, 11:11 AM

https://gerrit.wikimedia.org/r/#/c/166283/

@chasemp: this should be ready for testing on labs.

Aklapper set Security to None.
mmodell lowered the priority of this task from High to Normal. Oct 14 2014, 12:01 AM
jayvdb added a subscriber: jayvdb. Oct 21 2014, 3:29 PM
Stryn added a subscriber: Stryn. Oct 23 2014, 4:57 PM
greg added a comment. Oct 23 2014, 5:18 PM

Dumb question: will anything actually happen/be redirected during the "Bugzilla is in read-only mode for an indeterminate amount of time"?

nope, redirection starts when we move bugzilla.wm.o to old-bugzilla.wm.o? archive-bz.wm.o? up for debate I guess

Qgil added a comment. Oct 23 2014, 6:19 PM

Why do we need to change the URL?

Otherwise, how do you get to it if you redirect people trying to reach the existing URL to Phab?

Qgil added a comment. Oct 23 2014, 7:22 PM

Er... hi, good morning, and sorry for the noise. :)

old-bugzilla.wm.o works for me.

revi added a subscriber: revi. Oct 26 2014, 6:43 AM
Qgil raised the priority of this task from Normal to High.
Aklapper updated the task description. Nov 3 2014, 8:35 PM
Aklapper mentioned this in T625: Bug 1 be bug 1.
Elitre added a subscriber: Elitre. Nov 4 2014, 4:12 PM
Qgil updated the task description. Nov 5 2014, 12:40 PM

@mmodell, can you please share the current status of this task, and your plan to test it before the Bugzilla migration?

@Qgil I will be setting up the redirect scripts on bugzillapreview.wmflabs.org today.

mmodell added a comment (edited). Nov 6 2014, 3:00 AM

Got it set up and working on bugzillapreview:

For bug 50092, original url is:

temporary redirect url is:

and this will redirect to:

Once we migrate the production data we can point bugzilla.wikimedia.org to the phabricator host and then old bugzilla urls will redirect to new phabricator urls.

@Qgil: is this proof of concept adequate for now?

Qgil added a comment. Nov 6 2014, 7:49 AM

It's a first testable step. Thank you!

Would it be possible to test the redirect from a non "https://bugzillapreview.wmflabs.org/..." URL? The jump of domain is an important factor here.

A "bugzillapreview.wmflabs.org/..." label appears at the top of the page, in all pages, i.e. https://bugzillapreview.wmflabs.org/project/board/30/

Qgil added a comment. Nov 6 2014, 1:17 PM

bugzillapreview seems to suffer the same type of JavaScript problems we saw in phab-01 when @mmodell was testing the redirects there.

The dropdown menus don't work, see for instance
https://bugzillapreview.wmflabs.org/dashboard/create/
https://bugzillapreview.wmflabs.org/maniphest/task/edit/435/#

https://bugzillapreview.wmflabs.org/project/board/142/# is unusable, and the formatting is broken.

There are other formatting problems, see "Yesterday" at https://bugzillapreview.wmflabs.org/project/view/142/

PS: phab-01 is broken too in a similar way, and today the Mobile teams are having a demo there. It would be great if it could be fixed.

ssh seems to hang to that server at the moment

ssh is hanging on which server?

@Qgil I could set up a separate domain for the redirects but I need chase to set up the alias or give me permission to do so on bugzillapreview - I currently can't add proxy hostnames to that labs project.

@chasemp: can you grant me permissions on your testing project?

Qgil added a comment (edited). Nov 6 2014, 9:06 PM

All the problems reported at T40#19443 are fixed now, after a good refresh of my browser. Thank you @mmodell!

Just curious: why does this problem happen, and is it something we should expect to appear in the production instance and have to fix when we deploy this feature?

@chasemp: can you grant me permissions on your testing project?

Qgil added a comment (edited). Nov 7 2014, 3:56 PM

Also, what about some stress testing? Theoretically (and please correct me because I'm not the expert here) a solution might work with just a few clicks here and there, but then fail on Tuesday 25 when suddenly we get a wave of anxious Bugzilla users clicking the old links, plus the crawlers that still don't know about Phabricator, etc. That failure meaning that phabricator.wikimedia.org is slow, times out, is brought down... or the redirects break and users never make it here.

I have no qualms with perf testing, but in all honesty this is such a minimal shim of logic that you would have to stress the entire server to get some undesired outcome from this redirection process.

Qgil added a comment. Nov 7 2014, 3:58 PM

OK, I'm happy to read this coming from you. I simply had to ask. :)

Dzahn added a subscriber: Dzahn. Nov 7 2014, 8:25 PM

Mukunda, hi!

so.. did I get it right that the plan is like this ?

  • switch DNS bugzilla.wikimedia.org over to point at the phabricator box itself
  • have a new virtual host bugzilla.wm.org configured there that handles the redirects into phabricator
  • so the old Bugzilla box, zirconium doesn't get any traffic for bugzilla.wikimedia.org anymore
  • the scripts for this are PHP scripts on the phabricator instance
  • there is a backend DB used by those scripts that holds needed information? where is that database server? did you plan for it to be on localhost, on db1001, is it going to be mysql/mariadb or sqlite or ?

is that Apache config / rewrite rules / PHP script already written and in git somewhere?

We cannot emulate Bugzilla's saved searches (and their parameters) in Phab.
One vague idea is to still redirect URLs containing "buglist.cgi" to the read-only Bugzilla (T366) instead of Phabricator so people would still get their (non-updated) results. It is possible from Bugzilla to get to the search parameters via "Edit Search" in Bugzilla so people could manually try to emulate that search via Phabricator's advanced search. HOWEVER, we do not commit to having Bugzilla up forever hence not sure.
Wondering whether to add that to the read-only banner on that site once BZ is read-only hence added a reminder to T366 - definitely would need documentation.

Qgil added a comment. Nov 9 2014, 10:13 PM

Last Friday @mmodell demonstrated how this feature works from a different domain (I just forgot which domain it was). Great!

Redirects from index.cgi, query.cgi, reports.cgi, report.cgi, chart.cgi, and describecomponents.cgi still don't work as specified in the description.

In T40#20019, @Aklapper wrote:

We cannot emulate Bugzilla's saved searches (and their parameters) in Phab.
One vague idea is to still redirect URLs containing "buglist.cgi" to the read-only Bugzilla (T366) instead of Phabricator so people would still get their (non-updated) results.

Since the implementation of this feature is apparently simple, we could offer it in a totally unsupported way for as long as old-bugzilla exists, just a short-term convenience. If @mmodell agrees with this, then we should update the task description accordingly.

It would be great to have all the tests working by Wednesday 12, so we can showcase them in the Go-NoGo meeting.

In T40#20019, @Aklapper wrote:

One vague idea is to still redirect URLs containing "buglist.cgi" to the read-only Bugzilla (T366) instead of Phabricator

This sounds like we are making it pretty complicated: redirecting only _some_ URLs but not all, and having to keep BZ up and running just for some links, with more complex rewrite rules.

Since the implementation of this feature is apparently simple,

could you please outline here if my above assumption is correct? is that the plan? do you need ops to do anything else besides a DNS change? in which database are you going to store the info? has that been mentioned to Sean? are you going to need a dba to get it up in production?

It would be great to have all the tests working by Wednesday 12,

This would be less than 2 working days from now. Does the Apache config exist? Are the PHP scripts puppetized? Do you need help?

In T40#20602, @Qgil wrote:

Last Friday @mmodell demonstrated how this feature works

16:44 < twentyafterfour> mutante: shouldn't need anything special really
16:48 < twentyafterfour> mutante: these are questions for chasemp really...

^ this confuses me, honestly

Qgil added a comment. Nov 11 2014, 1:11 AM
In T40#20602, @Qgil wrote:

Last Friday @mmodell demonstrated how this feature works from a different domain (I just forgot which domain it was). Great!

The domain: http://bugzilla.wmflabs.org

For instance:

Aklapper updated the task description. Nov 12 2014, 1:19 PM

(I replaced redirects to Fact by Report in the description as Fact is a 404 on our instance.)

Aklapper updated the task description. Nov 12 2014, 1:28 PM
Qgil added a comment. Nov 12 2014, 1:59 PM

Proposing WONTFIX for this part (though I'd be really happy to be proven wrong). This also applies to Bugzilla's "saved searches" functionality as it also ends up on buglist.cgi

I agree. In theory people could just paste the old URLs for searches and use "old-bugzilla" instead of "bugzilla", right?

Also users should be able to log in to old-bugzilla while it exists, right? There they can still have their saved searches.

A "click me" list of URLs for testing redirects is available in the table at https://www.mediawiki.org/wiki/Phabricator/versus_Bugzilla#Bugzilla_URLs_and_their_redirects

In T40#21611, @Qgil wrote:

Also users should be able to log in to old-bugzilla while it exists, right?

Yes. Clarified that in T1234 and on the wiki.

I think any bugzilla url that doesn't have a corresponding phabricator equivalent should simply redirect to old-bugzilla.

http://bugzilla.wikimedia.org/$path -> http://old-bugzilla.wikimedia.org/$path

(other than the patterns which explicitly redirect to phabricator)
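The catch-all rule proposed here amounts to swapping the host and leaving the rest of the URL alone. A minimal Python sketch, assuming the explicit Phabricator patterns have already been tried and did not match; the function name is hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit


def fallback_to_old_bugzilla(url):
    """Rewrite bugzilla.wikimedia.org to old-bugzilla.wikimedia.org, keeping path and query."""
    parts = urlsplit(url)
    if parts.hostname != "bugzilla.wikimedia.org":
        return url  # not a Bugzilla URL, leave it untouched
    # Replacing netloc drops any port or userinfo, which these URLs are not expected to carry.
    return urlunsplit(parts._replace(netloc="old-bugzilla.wikimedia.org"))


print(fallback_to_old_bugzilla(
    "http://bugzilla.wikimedia.org/buglist.cgi?product=MediaWiki&bug_status=NEW"
))
# http://old-bugzilla.wikimedia.org/buglist.cgi?product=MediaWiki&bug_status=NEW
```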

Maybe some footer in the banner that notes a redirect may be from a search that is no longer valid?

@Dzahn:

  • switch DNS bugzilla.wikimedia.org over to point at the phabricator box itself

Yes

  • have a new virtual host bugzilla.wm.org configured there that handles the redirects into phabricator

no, there is a wildcard virtual host pointed to phabricator. no need to change virtual host settings.

  • so the old Bugzilla box, zirconium doesn't get any traffic for bugzilla.wikimedia.org anymore

right

  • the scripts for this are PHP scripts on the phabricator instance

yes

  • there is a backend DB used by those scripts that holds needed information? where is that database server? did you plan for it to be on localhost, on db1001, is it going to be mysql/mariadb or sqlite or ?

It's going to be reading from a cross reference table in the phabricator-maniphest database, using the user account that @chasemp already procured for this purpose.

is that Apache config / rewrite rules / PHP script already written and in git somewhere?

https://gerrit.wikimedia.org/r/#/c/166283/5 but I am about to submit a new diff that is production ready, that was just for testing.

Some more details about how this works:

It's all self contained within phabricator. We won't need any dba assistance, chase already arranged for a user account with the minimal access to connect to the phabricator database and read the lookup tables.

The PHP scripts will reside on the Phabricator host; no modification to the Apache config is needed. The only help needed from ops is a DNS change so that bugzilla points to Phabricator's IP.
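To make the cross-reference lookup concrete, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in for the production database server. The `maniphest_customfieldstorage` columns and the labs `fieldIndex` value follow the sample output quoted further down in this task. The `maniphest_task` table with `id` and `phid` columns, the function name, and the task ID 12 are assumptions made for illustration; this is not the PHP from the patch.

```python
import sqlite3

# Labs value quoted in this task; it is a different string on each instance.
FIELD_INDEX = "yERhvoZPNPtM"


def task_id_for_bug(conn, bug_id, field_index=FIELD_INDEX):
    """Return the task ID whose ext-ref custom field holds bug_id, or None."""
    row = conn.execute(
        """
        SELECT t.id
        FROM maniphest_customfieldstorage AS s
        JOIN maniphest_task AS t ON t.phid = s.objectPHID
        WHERE s.fieldIndex = ? AND s.fieldValue = ?
        """,
        (field_index, str(bug_id)),
    ).fetchone()
    return row[0] if row else None


# Stand-in data: the task ID 12 is made up, the other values mirror the labs sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maniphest_task (id INTEGER PRIMARY KEY, phid TEXT);
    CREATE TABLE maniphest_customfieldstorage (
        id INTEGER PRIMARY KEY, objectPHID TEXT, fieldIndex TEXT, fieldValue TEXT
    );
    INSERT INTO maniphest_task VALUES (12, 'PHID-TASK-rmsuz2xuldbmmhpa7zze');
    INSERT INTO maniphest_customfieldstorage VALUES
        (25, 'PHID-TASK-rmsuz2xuldbmmhpa7zze', '456XCt.DWg.X', 'default'),
        (26, 'PHID-TASK-rmsuz2xuldbmmhpa7zze', 'yERhvoZPNPtM', '8');
""")

print(task_id_for_bug(conn, 8))  # 12
print(task_id_for_bug(conn, 9))  # None
```

The query only reads, which fits the minimal-access account described above.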

@Dzahn wrote:
16:44 < twentyafterfour> mutante: shouldn't need anything special really
16:48 < twentyafterfour> mutante: these are questions for chasemp really...
^ this confuses me, honestly

The reason I said these were questions for chase is because he's doing most of the operational stuff for the migration. I just wrote the php scripts and regex patterns.

In T40#21611, @Qgil wrote:

I agree. In theory people could just paste the old URLs for searches and use "old-bugzilla" instead of "bugzilla", right?

only if we keep Bugzilla up as the actual app, instead of just keeping static HTML of old bugs

In T40#22103, @mmodell wrote:

The only help needed from ops is we will need to change dns for bugzilla to point to phabricator's IP.

Thanks for the answers and explanation Mukunda, much appreciated. So that sounds easy enough for us, great:) My last question then would be if it has been considered that there is not only bugzilla.wikimedia.org but also bugs.wikimedia.org and bug-attachment.wikimedia.org. I would switch all 3 of them over to phabricator then. Also, thanks for your review on that Gerrit change.

technically it will need the latter 2 of these 3:

ref: https://gerrit.wikimedia.org/r/#/c/172448/ (DNS change to iridium, that won't work because iridium does not have a public IP)
https://gerrit.wikimedia.org/r/#/c/172469/ (DNS change to misc-web, which should work)
https://gerrit.wikimedia.org/r/#/c/172471/ (varnish change to make it handle bugzilla requests and forward them to iridium)

In T40#22130, @Dzahn wrote:
In T40#21611, @Qgil wrote:

I agree. In theory people could just paste the old URLs for searches and use "old-bugzilla" instead of "bugzilla", right?

only if we keep Bugzilla up as the actual app, instead of just keeping static HTML of old bugs

I don't expect anybody (well, maybe two or three people coming late to the party?) to search in Bugzilla in a few months.
Still we should make clear in statements that old-bugzilla.wm.o will not be around forever (cf. T1198).

Aklapper updated the task description. Nov 14 2014, 2:03 AM

What's the reason not to make the old Bugzilla IDs available forever? At least we should have a table with old and current IDs so that the few outliers are able to find the content of the bug reports.
My consideration that bug reports are citable might be just one example. For example, in the mathoid paper (http://arxiv.org/pdf/1404.6179v1.pdf) we refer to some bug reports:
[15] R. Morris. Bug 54367 - intermittent texvc problems. https://bugzilla.wikimedia.org/show_bug.cgi?id=5436 [Online; accessed 20-March-2014]. 2013.
[16] S. Murugan. Bug 54456 - Failed to parse (Cannot store math image on filesystem.) https://bugzilla.wikimedia.org/show_bug.cgi?id=54456 [Online; accessed 20-March-2014]. 2013.
[17] Netheril96@gmail.com. Option “MathML if possible” doesn’t work. https://bugzilla.wikimedia.org/show_bug.cgi?id=25646 [Online; accessed 20-March-2014]. 2010.

I could imagine that there are other legitimate use cases to refer to old bug reports.

Qgil added a comment. Nov 14 2014, 10:23 AM

@Physikerwelt, old bugzilla IDs / URLs will redirect to the corresponding Phabricator tasks as long as Phabricator exists. And the plan is that those URLs, with "old-bugzilla" instead of "bugzilla", will point first to Bugzilla in read-only mode, and then to a static HTML version (T1198).

The discussion in the last comments is about the rest of URLs for other pages, search queries, etc.

Aklapper updated the task description. Nov 14 2014, 11:54 AM

Urgh. I admit I forgot enter_bug.cgi in the initial description. Sorry for that, fixed now. :(

@chasemp: I need to find the "fieldIndex" for the ext-ref custom field in phabricator. This is a different string on each phabricator instance, on labs the value is yERhvoZPNPtM and I don't know of any automated way to find out what it is in production...

So I need to have you run this query on the phabricator_maniphest database in production:

select * from maniphest_customfieldstorage limit 100;

There should be only two different values in the fieldIndex column, and I need to know the one that corresponds to the numeric values in the fieldValue column. E.g. here is example output from labs:

+-----+--------------------------------+--------------+------------+
| id  | objectPHID                     | fieldIndex   | fieldValue |
+-----+--------------------------------+--------------+------------+
|   1 | PHID-TASK-zu4utf5gle4eolz7zqxo | 456XCt.DWg.X | default    |
|   2 | PHID-TASK-3qm5pqvyb73ecxqjenuz | 456XCt.DWg.X | default    |

... one two skip a few ...

|  25 | PHID-TASK-rmsuz2xuldbmmhpa7zze | 456XCt.DWg.X | default    |
|  26 | PHID-TASK-rmsuz2xuldbmmhpa7zze | yERhvoZPNPtM | 8          | <-- bingo
|  27 | PHID-TASK-pgjr6anuz7a4d3hoc5c4 | 456XCt.DWg.X | default    |

Can you run the query on production and tell me the fieldIndex for the numeric field? This will tell me the key which I need to use to look up the ext-ref custom field when translating bug numbers to maniphest tasks for redirection.

Sorry for the busywork, phabricator just doesn't make this easy to look up and I can't think of another way to do it.
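Once the query output is available, picking out the right key can be done mechanically: group the rows by fieldIndex and keep the index whose values are all numeric. A minimal Python sketch of that check, using rows copied from the labs output above; the function name is hypothetical and the "all values are digits" test is an assumption about how the ext-ref field differs from the other one:

```python
from collections import defaultdict


def numeric_field_indexes(rows):
    """Return the fieldIndex values whose fieldValue entries are all numeric."""
    values_by_index = defaultdict(list)
    for _id, _object_phid, field_index, field_value in rows:
        values_by_index[field_index].append(field_value)
    return sorted(
        index
        for index, values in values_by_index.items()
        if all(v.isascii() and v.isdigit() for v in values)
    )


# Rows copied from the labs output above: (id, objectPHID, fieldIndex, fieldValue).
rows = [
    (1, "PHID-TASK-zu4utf5gle4eolz7zqxo", "456XCt.DWg.X", "default"),
    (2, "PHID-TASK-3qm5pqvyb73ecxqjenuz", "456XCt.DWg.X", "default"),
    (25, "PHID-TASK-rmsuz2xuldbmmhpa7zze", "456XCt.DWg.X", "default"),
    (26, "PHID-TASK-rmsuz2xuldbmmhpa7zze", "yERhvoZPNPtM", "8"),
    (27, "PHID-TASK-pgjr6anuz7a4d3hoc5c4", "456XCt.DWg.X", "default"),
]

print(numeric_field_indexes(rows))  # ['yERhvoZPNPtM']
```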

@mmodell

maybe I'm missing it, here is our query but huh...

https://phabricator.wikimedia.org/P76

Aklapper updated the task description. Nov 14 2014, 9:39 PM
Aklapper closed this task as Resolved. Nov 23 2014, 10:32 PM

Tested again with a fresh private window in a different browser (to not have any XFF leftovers in place).
All redirects listed on https://www.mediawiki.org/wiki/Phabricator/versus_Bugzilla#Redirected_URLs_after_Bugzilla_migration work as expected.

Closing as RESOLVED. Thank you everybody for your work, help, and patience with this!