I'm working on replacing some of our older SHA1 certificates with SHA256, and in poking at misc-web I noticed we really do have a lot of services on it these days in wikimedia.org: annual, dev, doc, git, gdash, graphite, grafana, parsoid-tests, performance, integration, phabricator, people, releases, bugs, bugzilla, bug-attachment, contacts, datasets, iegreview, ishmael, legalpad, logstash, metrics, noc, old-bugzilla, planet. (Pulled by grepping DNS, as the misc-web config has items that it's not currently serving.)
Right now we have two cp servers assigned to this role, and they have next-business-day parts delivery. I may be overly paranoid, but I wanted to ask if we should increase this pool by one in eqiad, for a total of three systems. I fear a mainboard/controller failure could leave us with a single cp server, which could be a disaster waiting to happen over a long weekend. Since we tend to scale most other clustered services across more than two systems (not always, but usually), is it time to do that here as well?