Splitting this into a subtask since this seems like a good place to start :-)
Description
Details
Status | Assigned | Task
---|---|---
Open | None | T297026 Automate maintain-views workflow
Open | fnegri | T300427 Automate maintain-views replica depooling
Resolved | taavi | T346947 Move wiki replicas behind cloudlb
Resolved | taavi | T351087 Migrate cloudlb hosts to nftables
Resolved | jbond | T351094 nftables ignores drange filter for IPv6 if drange only has IPv4 addresses
Event Timeline
One possible solution would be to take inspiration from https://github.blog/2016-08-17-context-aware-mysql-pools-via-haproxy/ and use a haproxy-level health check that lets us easily define a server as depooled. I was briefly considering using a conftool-based solution to manage the pooling on haproxy level but can't think of a way to fit our needs (active/backup, ports, ... etc) into the current host-based etcd schema.
I have an untested implementation of depooling a server from a cluster using set server ... state drain: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/760880/11/cookbooks/sre/wikireplicas/update-views.py#137
Ideally the depooled server would be repooled if the haproxy service was restarted, since the fact that it is depooled in this approach is only in memory and is not tracked via puppet-managed files. If the overall approach seems good I can do some testing.
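For illustration, here is a minimal sketch of draining a server through the HAProxy runtime API over its admin socket. It is not the code from the linked cookbook patch. The socket path and the backend/server names in the usage note are hypothetical, and it assumes the stats socket is configured with admin level.

```python
import socket


def drain_command(backend: str, server: str) -> str:
    """Build the runtime API command that puts one server into drain state."""
    return f"set server {backend}/{server} state drain\n"


def ready_command(backend: str, server: str) -> str:
    """Build the runtime API command that returns one server to service."""
    return f"set server {backend}/{server} state ready\n"


def send_command(socket_path: str, command: str) -> str:
    """Send one command to the HAProxy admin socket and return the reply.

    In non-interactive mode HAProxy closes the connection after answering,
    so reading until EOF collects the whole response.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

Usage would look like send_command("/run/haproxy/admin.sock", drain_command("some-backend", "some-server")), with both the path and the names replaced by whatever the real configuration uses. The state change lives only in HAProxy's memory, which is why a restart brings the server back.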
Coming back to this a year and a bit later, I now think conftool can be made to work here, and is probably the best answer since that's something .
The schema could be something like:
wikireplica-db-analytics:
  clouddb1017.eqiad.wmnet: [s1, s3]
  clouddb1018.eqiad.wmnet: [s2, s7]
wikireplica-db-web:
  clouddb1013.eqiad.wmnet: [s1, s3]
  clouddb1014.eqiad.wmnet: [s2, s7]
The section->port mapping is standard, so that won't cause problems. On the HAProxy side, we generate backends for both replica types with only the hosts on that section, without the backup servers like today. And, on the frontend, we use nbsrv to send traffic to the other backend if there are no available backends on the primary. The HAProxy logic might look something like this:
# These will be generated from etcd/conftool
backend wikireplica-db-analytics-s1
    mode tcp
    listen ....
backend wikireplica-db-web-s1
    mode tcp
    listen ....

# This will be statically generated by Puppet
frontend wikireplica-analytics-s1
    mode tcp
    bind ::3311
    acl use_backup nbsrv(wikireplica-db-analytics-s1) lt 1
    use_backend wikireplica-db-web-s1 if use_backup
    default_backend wikireplica-db-analytics-s1
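To make the generated part more concrete, here is a rough sketch of how the per-section backends could be derived from the proposed schema. It is only an illustration, not the actual confd template. The POOLS data mirrors the example schema, the port rule (sN on 3310 + N) is an assumption extrapolated from the s1/3311 pairing above, and the function names are made up.

```python
# Pool data shaped like the proposed conftool schema (illustrative only).
POOLS = {
    "wikireplica-db-analytics": {
        "clouddb1017.eqiad.wmnet": ["s1", "s3"],
        "clouddb1018.eqiad.wmnet": ["s2", "s7"],
    },
    "wikireplica-db-web": {
        "clouddb1013.eqiad.wmnet": ["s1", "s3"],
        "clouddb1014.eqiad.wmnet": ["s2", "s7"],
    },
}


def section_port(section: str) -> int:
    """Map a section name to its port.

    Assumption: section sN listens on 3310 + N; the thread only shows s1 on 3311.
    """
    return 3310 + int(section[1:])


def backends(pools: dict) -> dict:
    """Invert host -> sections into backend name -> [(host, port), ...]."""
    result = {}
    for pool, hosts in pools.items():
        for host, sections in hosts.items():
            for section in sections:
                name = f"{pool}-{section}"
                result.setdefault(name, []).append((host, section_port(section)))
    return result


def render_backend(name: str, servers: list) -> str:
    """Render one HAProxy backend stanza with a health-checked server per host."""
    lines = [f"backend {name}", "    mode tcp"]
    for host, port in servers:
        lines.append(f"    server {host} {host}:{port} check")
    return "\n".join(lines)
```

Each backend then contains only the hosts currently pooled for that section and replica type, so depooling a host in conftool simply drops its server line on the next render.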
@taavi - Thanks so much, that does look really helpful.
The only other thing I think would be helpful is if we could somehow also remove the SPOF on each of the wikireplica clusters created by the two dbproxy10[18-19] servers.
These two servers still operate individually, each with an haproxy configuration that is, effectively, the inverse of the other.
Your suggestion above would still retain this requirement, wouldn't it? Or have I misunderstood something?
If we could make dbproxy10[18-19] identical, each supporting both the analytics and web clusters, then it would be easier to depool either of these servers for maintenance too.
My change doesn't immediately fix the proxy redundancy issue, but it definitely makes it much easier to solve as all of the backend configuration will already be live on all of the proxies. After the change I propose is live, making the proxies redundant is essentially as simple as changing the existing frontends to bind to the specific service VIP, and adding a new frontend with the reverse analytics/web configuration and the other VIP. After that the proxies would have identical configuration.
Or we could use the opportunity to do both changes at the same time, and also combine it with moving the load balancing to our new cloudlb setup and remove the separate dbproxy hosts and the LVS services in the process.
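As a sketch of the frontend change described above, the two frontends per section could be rendered like this, each bound to its own service VIP and falling back to the other replica type when its primary backend has no servers up. The VIP addresses are placeholder documentation addresses, not the real service IPs, and the function name is made up.

```python
def render_frontend(kind: str, section: str, vip: str, port: int) -> str:
    """Render a frontend that prefers its own backend and falls back to the other type."""
    other = {"analytics": "web", "web": "analytics"}[kind]
    primary = f"wikireplica-db-{kind}-{section}"
    fallback = f"wikireplica-db-{other}-{section}"
    return "\n".join(
        [
            f"frontend wikireplica-{kind}-{section}",
            "    mode tcp",
            f"    bind {vip}:{port}",
            # nbsrv() counts usable servers; below 1 means the primary is empty.
            f"    acl use_backup nbsrv({primary}) lt 1",
            f"    use_backend {fallback} if use_backup",
            f"    default_backend {primary}",
        ]
    )


# Placeholder VIPs (RFC 5737 documentation range), one per service.
VIPS = {"analytics": "192.0.2.10", "web": "192.0.2.11"}

config = "\n\n".join(
    render_frontend(kind, "s1", vip, 3311) for kind, vip in VIPS.items()
)
```

Because both frontends are present on every proxy and differ only in the VIP they bind to, any proxy can serve either service, which is what makes the hosts interchangeable.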
Change 973760 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] Add wiki replica backends to conftool
This seems like an excellent way forward to me.
That would mean:
- integrating the work on this ticket, correct? T346947: Move wiki replicas behind cloudlb
- whilst effectively making this ticket redundant: T322658: Improve LVS config for wikireplicas (dbproxy1018/dbproxy1019)
I'm not too familiar with the cloudlb service at the moment, but I see that they also run haproxy.
Would we still be talking about using the same conftool based modification and config rewriting as you mentioned in T300427#9325905 ?
Is reloading the haproxy configuration upon a change with confd already handled, or is this something that we would have to add?
Correct.
I'm not too familiar with the cloudlb service at the moment, but I see that they also run haproxy.
Would we still be talking about using the same conftool based modification and config rewriting as you mentioned in T300427#9325905 ?
Yes, indeed.
Is reloading the haproxy configuration upon a change with confd already handled, or is this something that we would have to add?
Yes. Confd can be told to run an arbitrary command after writing a file, so it can be configured to run /usr/bin/systemctl reload haproxy.service.
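The write-then-reload pattern can be sketched in Python as below. This is not confd itself, only an illustration of the behaviour: stage the new file, validate it, swap it into place, then run the reload command. The function name and the injectable run parameter are made up for the example. The check command in the usage note relies on haproxy -c -f, which validates a configuration file without starting the service.

```python
import os
import subprocess
import tempfile


def write_and_reload(path, content, check_cmd, reload_cmd, run=subprocess.run):
    """Install new config content and reload the service if anything changed.

    Returns False when the file already has the desired content, True after
    a successful write and reload. check_cmd gets the staged file path
    appended, so a broken config never replaces the live one.
    """
    try:
        with open(path) as current:
            if current.read() == content:
                return False
    except FileNotFoundError:
        pass

    # Stage next to the target so os.replace is an atomic rename.
    fd, staged = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        run(check_cmd + [staged], check=True)
        os.replace(staged, path)
    except BaseException:
        if os.path.exists(staged):
            os.unlink(staged)
        raise

    run(reload_cmd, check=True)
    return True
```

A call might look like write_and_reload("/etc/haproxy/haproxy.cfg", rendered, ["haproxy", "-c", "-f"], ["/usr/bin/systemctl", "reload", "haproxy.service"]), where the config path is an assumption and rendered is the generated configuration text.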
Change 973769 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/homer/public@master] cr-labs: permit cloudlb to wiki replicas
Change 973761 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] Add wiki replicas to cloudlb
Change 973777 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] P:wmcs: wikireplicas: allow access from cloudlb
Change 973769 merged by jenkins-bot:
[operations/homer/public@master] cr-labs: permit cloudlb to wiki replicas
Change 973760 merged by Majavah:
[operations/puppet@production] Add wiki replica backends to conftool
Change 973777 merged by Majavah:
[operations/puppet@production] P:wmcs: wikireplicas: allow access from cloudlb
Change 973761 merged by Majavah:
[operations/puppet@production] Add wiki replicas to cloudlb
Change 976735 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] P:etcd: generate wiki replica pool accounts
Change 976735 merged by Majavah:
[operations/puppet@production] P:etcd: generate wiki replica pool accounts
Change 998356 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] wikireplicas: maintain-views: try depooling host on lock failure
Change 1006492 had a related patch set uploaded (by Majavah; author: Majavah):
[operations/puppet@production] cloudlb: wikireplicas: shutdown sessions on down servers
Change 998356 merged by Majavah:
[operations/puppet@production] wikireplicas: maintain-views: try depooling host on lock failure
Change 1006492 merged by Majavah:
[operations/puppet@production] cloudlb: wikireplicas: shutdown sessions on down servers
@taavi I was wondering what's the status of this task. I see you pushed a few patches to maintain-views in February, what's left?
Killing existing sessions on depooled servers still doesn't work as expected. So what's left is either fixing that functionality on the HAProxy config somehow, or updating maintain-views to have the ability to kill sessions that are holding metadata locks on views that need replacing.