codfw: (1) phabricator host (backup node)
Closed, Resolved · Public

Description

Labs Project Tested: phabricator
Site/Location: CODFW
Number of systems: 1
Service: phabricator
Networking Requirements: external IP, LVS
Processor Requirements: ?
Memory: 64 GB
Disks: 500 GB
NIC(s):
Partitioning Scheme: one large volume for /srv on raid1
Other Requirements:

Our current phabricator setup lacks redundancy, which causes problems during deployments, kernel upgrades and any hardware-related issues.

Details

It's not currently trivial to recreate phabricator from backups + puppet, so any issues will cause significant downtime.

Given how much we rely on phabricator, it's fairly critical to maintain nearly 100% uptime.

Here are the components involved with phabricator:

host                    domain/service name          Description
iridium.eqiad.wmnet     phabricator.wikimedia.org    Apache + PHP
                        git-ssh.wikimedia.org        sshd + phabricator git integration
                        phd                          phabricator's worker queue
m3-master.eqiad.wmnet   MySQL                        Phabricator's DB Cluster

The database is already redundant, however, all the other critical pieces are running on iridium with no fallback.

Iridium specs

The specs for iridium are fairly generous. It's a 16 core Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz with 64G ram and 2x500G (raid1 mirrored) storage.

Iridium's repository storage looks like this:

Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        9.1G  6.2G  2.4G  73% /
/dev/md2        456G  138G  318G  31% /srv
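
As a rough check of how much headroom the /srv volume has, the figures above can be parsed as follows. This is a minimal sketch: it assumes every size is given in G, and it assumes Use% is used / (used + avail) rounded up to a whole percent, which matches the numbers shown.

```python
import math

# The df output quoted above, copied verbatim.
DF_OUTPUT = """\
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        9.1G  6.2G  2.4G  73% /
/dev/md2        456G  138G  318G  31% /srv
"""


def parse_df(text):
    """Return {mount_point: (used_g, avail_g)}; assumes all sizes end in 'G'."""
    volumes = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        used = float(fields[2].rstrip("G"))
        avail = float(fields[3].rstrip("G"))
        volumes[fields[5]] = (used, avail)
    return volumes


def use_percent(used, avail):
    """Usage as a whole percent, rounded up (assumed to mirror df's Use% column)."""
    return math.ceil(100 * used / (used + avail))


volumes = parse_df(DF_OUTPUT)
srv_used, srv_avail = volumes["/srv"]
srv_percent = use_percent(srv_used, srv_avail)
```

With these figures the repositories occupy 138G, so a backup volume of similar usable size would start out roughly a third full.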

What's needed

In order to have some redundancy, I'd like to start with a backup web server. If we have a machine available with a decent amount of storage then we could also have a repository mirror on the same machine. The repositories will become more important as we migrate from gerrit to differential in the near future.

Even without the repository storage it would still be useful to have a backup web server to handle requests for maniphest. So at minimum all that we need is a small apache+php machine with no other special dependencies.

For a good backup (as opposed to a bare minimum) I think the following specs would be appropriate: 4-8 core CPU, 16-32G ram and 500g of non-mirrored storage.

Load Balancing

Once we have a second phabricator webserver, we could experiment with load balancing as well, though I am primarily interested in redundancy not performance.

Geographical Diversity

It might also make sense to allocate the backup in codfw for the sake of keeping a live mirror of repositories in Dallas. This could be used as a primary host when machines in codfw need to clone repositories. I believe it might even be possible to do 2-way mirroring of git; however, that requires more planning and configuration.

Event Timeline

Restricted Application added subscribers: scfc, Aklapper.

Plus, every deployment involves significant downtime because phabricator services must all be stopped while puppet runs and I manually apply database migrations / schema changes.

A redundant setup was actually suggested during the initial setup of Phabricator, but was decided against because:

  1. Phabricator didn't actually have good software support for scaling / failover using multiple hosts
  2. Spawning a new instance on other hardware (from Puppet) as well as restoring the required local data (from backups) was deemed trivial and mostly automatic

Has this changed since?

mmodell triaged this task as Medium priority. Apr 5 2016, 4:40 PM

@mark: yes, both have changed since, I believe. Some specific points:

  1. Phabricator has improved support for scaling and failover, although I believe it has supported multiple front-end servers since before we started using it.
  2. Our Phabricator setup has gotten significantly more complex. This is partially due to LVS and virtualized git+ssh. We now depend on phabricator for more than just bug tracking.
  3. As it's becoming more mission-critical, even short downtimes are disruptive to people's workflows.
  4. In the coming months, CI will depend on phabricator, thus developers' workflows will be more significantly impacted by any downtime.
  5. Due to 3 and 4, complex maintenance tasks must be done in tiny steps to avoid downtime, which causes a lot of extra work coordinating those tiny steps.
  6. The time it would take to restore a backup of all repositories alone would mean significant downtime to bring up a new phabricator host from scratch.

None of these issues are emergencies; it just seems like the time is right, either now or soon, to have some redundancy.

If hardware resources are scarce, then consider this a low priority request. On the other hand, if there are spare servers sitting idle then I think this would be a good use for one of them.

I'm pretty sure you mean LVS :)

There have been a few tasks and outlines over time for this, but the general threshold for pain I recall is: can we rebuild a new phab server within a day if we have a hardware failure (I would think build time is less in reality as of now, but that's the conversation I recall)? I think the answer atm would still be yes, but that may be too high a pain threshold now. The multiple back ends thing I haven't seen done and I haven't thought through. The previous official hold-off was waiting for https://secure.phabricator.com/book/phabricator/article/cluster/ which is still covered in warnings. Off the top of my head, local repos and phd have some local state if we're talking about full hot/hot concurrency.

A hot/cold setup with a phabricator box like iridium that we could cut over to for (some) maintenance, and less than a workday's downtime for phab in case of catastrophic failure, seems like a sane choice if that's the uptime we are looking for. But that is the crux of the issue.

I'm pretty sure you mean LVS :)

Yes, stupid error. Corrected now, thanks!

A hot/cold setup with a phabricator box like iridium that we could cut over to for (some) maintenance, and less than a workday's downtime for phab in case of catastrophic failure, seems like a sane choice if that's the uptime we are looking for. But that is the crux of the issue.

This is where I would like to start. I don't think we are ready for a load-balanced cluster but a mostly hot spare would be nice and easily achievable without any extra support from upstream.

I'm fairly confident that the procedure would be nothing more complex than this:

  • Stop phd on iridium
  • rsync the repos to the backup
  • start phd on the backup
  • redirect LVS traffic to the backup
  • do whatever maintenance on iridium
  • Finally reverse the procedure to swap back again.
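
The steps above can be sketched as a small dry-run plan. This is only an illustration under stated assumptions: the phd and repository paths and the backup host name are hypothetical, the script only builds command strings and runs nothing, and the LVS redirect is left as a manual step because the mechanism is not settled here.

```python
# Hypothetical paths; adjust to the real deployment.
PHD = "/srv/phab/phabricator/bin/phd"   # assumed location of Phabricator's phd script
REPOS = "/srv/repos"                    # assumed repository directory under /srv


def failover_plan(source, target):
    """Return ordered (description, command) steps to move service from source to target.

    A command of None marks a step that this sketch does not automate.
    """
    return [
        (f"stop phd on {source}", f"ssh {source} {PHD} stop"),
        (f"rsync the repos to {target}",
         f"ssh {source} rsync -a --delete {REPOS}/ {target}:{REPOS}/"),
        (f"start phd on {target}", f"ssh {target} {PHD} start"),
        (f"redirect LVS traffic to {target}", None),
    ]


def maintenance_cycle(primary, backup):
    """Fail over to the backup, do maintenance, then reverse the procedure."""
    steps = failover_plan(primary, backup)
    steps.append((f"do whatever maintenance on {primary}", None))
    steps.extend(failover_plan(backup, primary))
    return steps


if __name__ == "__main__":
    # "backup.example" is a placeholder host name, not a real machine.
    for description, command in maintenance_cycle("iridium.eqiad.wmnet", "backup.example"):
        print(f"{description}: {command or '(manual step)'}")
```

Keeping the plan as data makes it easy to review the order of operations before anything is executed, and the reverse direction reuses the same function with the hosts swapped.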

@chasemp @mark: Phabricator's support for "High Availability" has been making progress recently, see upstream task (T10751) for details.

tl;dr is that database read-only mode and failover to a backup is coming along.

Repository master-master replication appears to be on the roadmap but is not so straightforward to implement, so it may take a bit longer.

mark added a subscriber: BBlack.

I think a backup Phabricator host in codfw would make a lot of sense, and is something we strive for with (nearly) every service, anyway.

  • codfw and eqiad use different IP ranges and different LVS clusters. We can't simply put both backends (from different DCs) behind one LVS cluster. So failover of LVS would probably have to be done using DNS at this point.
  • failover for the web frontend can probably be done by Varnish/the misc cache cluster, but I'll let @BBlack chime in on that.

@jcrespo: Phabricator database clustering support is now documented:

https://secure.phabricator.com/book/phabricator/article/cluster_databases/

I'm going to try it out over the weekend; so far it looks promising.

As a note, m3 (miscellaneous database services - shard number 3) is entirely dedicated to phabricator database needs, and as you can see here:

It already has 3 dedicated nodes: 2 hot-swappable machines in eqiad and one extra redundant geographic replica in codfw. There are also several non-dedicated servers, including 2 delayed slaves (dbstore1001) + analytics, plus weekly backups and point-in-time recovery binary logs. So we have been ready to support multiple frontends for a long time.

Basically from my POV, it breaks down like this:

  • Web Frontend
    • Local resiliency within 1x DC behind varnish:
      • Active-Active: put multiple phab web frontends behind an LVS service like phab.svc.eqiad.wmnet, point varnish at that.
      • Active-Passive: many ways to do this, but we could do it with LVS as above and use confctl to switch which single host gets all the traffic. I'm not sure if there's a few corner cases to solve on the pybal/LVS front there (have we ever done a 2-host active/passive where we can't mix the two transitionally?), but I bet they're easily tractable.
    • X-DC behind varnish:
      • Define both of phab.svc.(eqiad|codfw).wmnet, using whatever DC-local mechanism above.
      • Active-Active: we don't actually support this yet, but will soon for the few apps that can.
      • Active-Passive: already supported, with one-liner hieradata commits to switch
  • git-ssh
    • Local resiliency can be accomplished same way as the web frontend using LVS
    • X-DC would have to be via GeoDNS directly (since this doesn't flow through our HTTPS termination, obviously), which we can make active/passive or active/active.
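
To make the active-passive option concrete, here is a toy model of a pool in which exactly one host receives traffic. It is a sketch only: it does not use confctl, pybal or LVS, and the class name and the backup host name are hypothetical.

```python
class ActivePassivePool:
    """Toy model of a pool where exactly one host receives all traffic."""

    def __init__(self, hosts, active):
        if active not in hosts:
            raise ValueError(f"{active} is not in the pool")
        self.hosts = tuple(hosts)
        self.active = active

    def state(self):
        """Return {host: pooled?}; exactly one entry is True."""
        return {host: host == self.active for host in self.hosts}

    def switch_to(self, host):
        """Make `host` the only pooled host and return the new state."""
        if host not in self.hosts:
            raise ValueError(f"{host} is not in the pool")
        # A single assignment, so there is never a moment with both hosts pooled.
        self.active = host
        return self.state()


# "phab-backup.example" is a placeholder, not a real host.
pool = ActivePassivePool(["iridium.eqiad.wmnet", "phab-backup.example"],
                         active="iridium.eqiad.wmnet")
after_switch = pool.switch_to("phab-backup.example")
```

Storing the active host as one value, rather than a pooled flag per host, is what rules out the transitional state where traffic is mixed across both backends.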

One interesting thing is that phabricator seems to be implementing its own load balancing for repositories. I think the idea is that any front-end node can terminate a request for git, and then it proxies the git request to the right back-end node that actually has a copy of the repository. Currently it looks like every node needs a copy of every repository, but eventually that won't be the case and repos could be spread across several hosts, e.g. 4 hosts with each repo existing on 2 of them.

I don't think this is nearly finished yet but it is an interesting feature.
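
To illustrate the "4 hosts with each repo existing on 2 of them" idea, here is a minimal sketch. It is not how Phabricator does it: the placement uses rendezvous hashing, which is my own choice for the example, and the host and repository names are made up.

```python
import hashlib


def replicas_for(repo, hosts, copies=2):
    """Pick `copies` hosts for a repo by ranking hosts on a per-repo hash."""
    def score(host):
        return hashlib.sha256(f"{repo}:{host}".encode()).hexdigest()
    return sorted(hosts, key=score)[:copies]


def route(repo, hosts, healthy, copies=2):
    """Return the first healthy host holding the repo, or None if none is up.

    Any front-end node can call this to decide where to proxy a git request.
    """
    for host in replicas_for(repo, hosts, copies):
        if host in healthy:
            return host
    return None


# Hypothetical back-end hosts and repository name.
HOSTS = ["repo1", "repo2", "repo3", "repo4"]
holders = replicas_for("mediawiki/core", HOSTS)
target = route("mediawiki/core", HOSTS, healthy=set(HOSTS))
```

Because the placement depends only on the repository and host names, every front-end computes the same answer without shared state, and losing one of a repository's two hosts still leaves a reachable copy.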

@RobH @Papaul do we have a server in codfw that matches "4-8 core CPU, 16-32G ram and 500g of non-mirrored storage." and could be used for this?

RobH added a subscriber: Cmjohnson.

No need to ping Papaul, he doesn't have any involvement in hardware-requests. (It's primarily myself, and if I am out sick, then @Cmjohnson)

For the spare allocation in codfw, we have 5 spare pool systems with dual Intel® Xeon® Processor E5-2640 (2.60GHz/8c), 64GB RAM, and dual 1TB SATA HDD. I propose we use system WMF6405 for this (and I have noted it on my spares tracking sheet). This would seem to accommodate the request for a backup phabricator system.

I'm assigning this task to @mark for his approval of this spares allocation. (I'll also update the task title to more accurately reflect the spares allocation location, codfw.)

@mark: Please approve/deny/comment and assign back to me for followup, thanks!

RobH renamed this task from "We need a backup phabricator front-end node" to "codfw: (1) phabricator host (backup node)". Apr 19 2016, 6:09 PM
RobH moved this task from Backlog to Pending Approval on the hardware-requests board.

@Dzahn
WMF5849 rbf2001 A5 Dell PowerEdge R420 Intel® Xeon® Processor E5-2440 3.00 6 cores Yes 32 GB RAM (2) 500GB SATA
WMF3641 B5 Dell PowerEdge R610 Intel® Xeon® Processor X5650 2.66 6 cores 16GB RAM(2) 250GB SATA
wmf5823 nembus B5 Dell PowerEdge R320 Intel® Xeon® Processor E5-2420 1.90 6 cores 16GB RAM 2 (500 GB) SATA
Note: no warranty on all 3 servers

@Papaul: You shouldn't have gotten pinged for this, as I handle the hardware-requests, you can disregard. Thanks!

It's also an easy mistake to make, since I used to be the onsite, and I handled spare requests then and still do.

Approved.

WMF6405 is allocated for this use. T137838 has been created for the setup/deployment.