Page MenuHomePhabricator

Integrate dbctl IP changes as part of VLAN changes.
Open, MediumPublic

Description

As discussed with @Volans and @cmooney during the offiste - in order to be able to proceed with the VLAN IP changes for databases we'd need help from the IF team.
Right now changing an IP in a DB means we need to edit that IP on dbctl - eg:

root@cumin1002:~# dbctl instance db2196 get
{
    "db2196": {
        "host_ip": "10.192.23.6",
        "note": "",
        "port": 3306,
        "sections": {
            "x1": {
                "candidate_master": true,
                "percentage": 100,
                "pooled": true,
                "weight": 100
            }
        }
    },
    "tags": "datacenter=codfw"
}

Right now the only way to do so is via edit which means it is not really easy to automate that.

The process I'd envision for this VLAN changes would be

  • Depool the DB (via normal dbctl instance depool)
  • Issue a reimage (as I understood it was the desired method)
  • Run the new cookbook to change dbctl IP address on dbctl
  • The host comes back
  • Repool the DB with normal dbctl instance repool with already the new IP

Event Timeline

We've been discussing with @Ladsgroup that we need to double check if he IPs are really used for dbct /MW or it is just some tech debt. dbctl isn't too old, so if they were added it was probably for a reason, but maybe they're not needed anymore. If that is the case that'd simplify things a lot.

An example can be seen at: https://noc.wikimedia.org/dbconfig/eqiad.json (or codfw.json). At the bottom of the file we have the array of hostname/IP.

It might sound revolutionary but I think mediawiki should not re-implement DNS. All of these IPs should be removed from dbctl and be done via a fast lookup in mw. I don't know how fast DNS lookups are in production but making them fast and long-cached shouldn't be hard

It might sound revolutionary but I think mediawiki should not re-implement DNS. All of these IPs should be removed from dbctl and be done via a fast lookup in mw. I don't know how fast DNS lookups are in production but making them fast and long-cached shouldn't be hard

After looking at this for a while. I still want this to happen but I don't think we can get it done before the network maint :(

Just to make sure I understand, the request here is an easy-to-automate way of dbctl to change the instance IP address?

It probably wouldn't be too hard to add support for editing IP or just any single field in general, and then we could do that from a cookbook + a dbctl commit. Would that meet needs here?

It might sound revolutionary but I think mediawiki should not re-implement DNS. All of these IPs should be removed from dbctl and be done via a fast lookup in mw.

This would be ideal, and is worth looking in to I think.

I don't know how fast DNS lookups are in production but making them fast and long-cached shouldn't be hard

Usually under 10ms based on a bunch of random checks. It would be good to understand what the acceptable limits would be if we went that road.

After looking at this for a while. I still want this to happen but I don't think we can get it done before the network maint :(

I'm happy to work on the dbctl side of things to enable us to do this. However we have no massive urgency in regard to the change, and I am concious we will have similar renumbering to do in codfw rows C-D, and eqiad A-D over the next few years. So perhaps it's worth the upfront effort to make the change now, even if it pushes back the date of the re-IP of these hosts.

I'd prefer to also work with hostnames rather than IPs. I think under 10ms is good enough but this is just a feeling than anything else. Regardless of the work for the IPs, I think having a way to edit dbctl fields via CLI would help us automate some other things (eg: some parts of the master switchover). I guess we could work in parallel to investigate if MW really needs those IPs and on the other hand, get dbctl that new feature :)

joanna_borun triaged this task as Medium priority.

Just to make sure I understand, the request here is an easy-to-automate way of dbctl to change the instance IP address?

It probably wouldn't be too hard to add support for editing IP or just any single field in general, and then we could do that from a cookbook + a dbctl commit. Would that meet needs here?

Actually the idea is that dbctl should not contain the IPs at all. It should look up the IP via DNS, we should store FQDN instead.

Actually the idea is that dbctl should not contain the IPs at all. It should look up the IP via DNS, we should store FQDN instead.

I absolutely agree in principle but that's not something I have the time or capacity to do myself :)

And I think we can do both in parallel

@Marostegui As it turns out, plain old confctl can be used to do this already.

You can for instance do

sudo confctl --object-type=dbconfig-instance select name=db1211 set/host_ip=10.64.16.8
sudo dbctl config commit

If you want to check what you are going to affect first, you can replace the set/... with get instead:

💙cdanis@cumin1002.eqiad.wmnet ~ 🕛☕ confctl --object-type=dbconfig-instance select name=db1211 get     
{"db1211": {"host_ip": "10.64.16.8", "port": 3306, "sections": {"s8": {"percentage": 100, "pooled": true, "weight": 325}}, "note": ""}, "tags": "datacenter=eqiad"}

That's awesome!! Then I guess the cookbook to orchestrate all this can be done? Do we need something else?

I think you should be able to use the existing spicerack interface to confctl to do the set/host_ip=... action -- that should be equivalent to a ConftoolEntity.update call.

I don't think there's any spicerack wrapping of dbctl or any existing cookbooks that do so, but I don't see why you couldn't do a simple subprocess.run to do a commit, probably checking first that dbctl diff returns empty.

I don't see why you couldn't do a simple subprocess.run to do a commit, probably checking first that dbctl diff returns empty.

That's what auto_schema does generally, we need to sit and decide whether we want to continue on this path or provide a dbctl wrapper in spicerack (for general use). I don't care either way.

@Marostegui As it turns out, plain old confctl can be used to do this already.

Right! This works just fine:

entity = spicerack.confctl('dbconfig-instance')
entity.set_and_verify('host_ip', '127.0.0.1', name='db2144')

As for the commit I advocate to add dbctl support in Spicerack but IIRC that requires changes in dbctl as most of its logic is in its CLI part and not exposed as a library, but to be checked.

As for the commit I advocate to add dbctl support in Spicerack but IIRC that requires changes in dbctl as most of its logic is in its CLI part and not exposed as a library, but to be checked.

I agree in principle but I won't have time for that for a while. I've opened T362893: Spicerack support for dbctl to track.

Although I personally think it would be fine to add a very thin interim wrapper around the CLI, perhaps just with two methods: one to check for null diffs, and one to commit. dbctl was written with usage like this in mind; dbctl config diff returns exit codes appropriately to check for diffs, and dbctl config commit supports a --batch argument.

Anyway I think that all that is needed to unblock VLAN migrations has been done or documented on this ticket? Optimistically closing but please re-open if you disagree.

Anyway I think that all that is needed to unblock VLAN migrations has been done or documented on this ticket? Optimistically closing but please re-open if you disagree.

We probably have to decide who's going to implement this. From my chats with Riccardo and Cathal during the summit, my understanding is that I/F would be implementing this, as we (Data-Persistence) don't have much resources (and probably knowledge) to come up with a way to orchestrate this as part of the VLAN operations.
Correct me if I am wrong!

CDanis reassigned this task from CDanis to Volans.

Now that we have dbctl support in Spicerack it should be doable to add the step to modify the IP in dbctl when needed.

Now that we have dbctl support in Spicerack it should be doable to add the step to modify the IP in dbctl when needed.

Thanks Riccardo. How should we approach this work? I saw that Arzhel merged https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1080012
My idea is to start bringing this up with the team so we can start planning or at least assessing the impact and how to tackle this. You think whatever work needs to be done to handle dbctl changes would need to go into those existing cookbooks or should it be a new cookbook that would be triggered by any of them?

We'll probably have to sync up with you (or whoever else will work on this with us from I/F) and see how to start planning this. I don't want us to leave this to the last moment and block the migration :-)

I think we could put the code directly into the move-vlan cookbook, if the host is present in dbctl, update it. I don't see too much use for an update-ip specific cookbook just for databases, but let me know if you see other use cases for it.

In terms of code something like this should do it:

# in __init__()
self.dbctl = spicerack.dbctl()

# in run() insert where is the right time to update the IP
self.update_dbctl()

# and define it like this
def update_dbctl(self):
    instance = self.dbctl.instance.get(self.args.host)
    if instance is None:
        logger.info('Host %s not present in dbctl', self.args.host)

    def update_ip(obj, _section, _group):
        obj.host_ip = self.post_config[4]

    self.dbctl.instance.write_callback(update_ip, (instance,), section=None, group=None)
    logger.info(...)

And something similar for the rollback.

Thanks!
No, I don't really have any strong opinions on either way, it was mostly to know your thoughts on this. I will talk to the team and then maybe we can have a chat ourselves to see how we can start doing it and most importantly...test it :)