Mark,
I'm working on replacing some of our older SHA1 certificates with SHA256, and in poking at misc-web I noticed that we really do have a lot of services on it these days in wikimedia.org: annual, dev, doc, git, gdash, graphite, grafana, parsoid-tests, performance, integration, phabricator, people, releases, bugs, bugzilla, bug-attachment, contacts, datasets, iegreview, ishmael, legalpad, logstash, metrics, noc, old-bugzilla, planet. (Pulled by grepping DNS, since the misc-web config has items that it's not currently serving.)
Right now we have two cp servers assigned to this role, and they have next-business-day parts delivery. I may be overly paranoid, but I wanted to ask if we should increase this pool by one in eqiad, for a total of three systems. I fear a mainboard or controller failure could leave us with a single cp server, which could be a disaster waiting to happen over a long weekend. Since we tend to scale most other clustered services across more than two systems (not always, but usually), is it time to do that here as well?
Please advise,