Page MenuHomePhabricator

Switch dumps 1.0 processes to use the analytics MariadB replicas (dbstore100[7-9])
Closed, ResolvedPublic

Description

We wish to try using the Analytics MediaWiki replica servers for serving the dumps 1.0 processes, due to the fact that the current setup can cause issues with production workloads when dumps are running.

In order to do so, we will need to integrate the dbstore serves with dbctl and make them the default for the dumps group.
The best outcome would be if we could permit runtime reconfiguration of the destination servers per section, similar to the level of functionality provided by dbctl.

If we identify the dumps 1.0 processes with a suitable environment variable, then we could patch mediawiki-config to look up the DNS SRV records that we have in place for these analytics replicas.

When this new mechanism is adopted, we would potentially be free to remove the dumps load group from the dbctl back-end configuration.

Event Timeline

We should make them run with weight 0 in the general traffic set up, @Ladsgroup that means MW won't check for lag, am I right? (which I am fine with). I just want to make sure we don't send ANY general production traffic there.

Change #1108071 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Add conftool-data for dbstore hosts to dbctl

https://gerrit.wikimedia.org/r/1108071

There are two ways to approach this. We can add it to dbctl but I'd prefer another way. We can hard-code this in mediawiki-config. Currently thanks to Joe's refactor, we can do something like this:

if ( strpos( ClusterConfig::getInstance()->getHostname(), 'mwmaint' ) === 0 ) {

(copy pasted from CommonSettings.php) and then if that's true completely override values for s1 and set only one replica there. That's a bit hacky but much much safer. Specially given that we are not going to depool/repool this host much plus it'll hopefully be removed soon.

But how are we going to approach a host going down here? Plus, if we change hostnames/hosts, we'd need to edit MW?

But how are we going to approach a host going down here? Plus, if we change hostnames/hosts, we'd need to edit MW?

Yes, for anything related to dbstore1008:3311, we would have to manually change mw config and deploy that. like the old times. I know it's not great but first is that it's going to be for only one replica, we had the same thing with labtestwiki for long time but it wasn't much work (granted it was a test db so it wasn't that big of a deal)

But the biggest problem is that mediawiki core is quite lax on groups so there is quite a high chance that we will start serving production traffic from dbstore1008 and the other way around (dumps reading from other replicas and causing issues there as we have had this before many times)

We can add it to dbctl but I'd prefer another way. We can hard-code this in mediawiki-config.

What are the disadvantages to using dbctl?

...plus it'll hopefully be removed soon.

I don't think that we're planning on decommissioning the dbstore* servers themselves any time soon.

It's true that the xml/sql dumps should not longer call them directly soon, but I think that we may still need these servers for the other miscellaneous dumps for a while.

I think I would rather set out using dbctl if possible, even though we only have one replica per section for now. We might decide to invest in further MariaDB capacity for analytics/data-engineering for other projects.

But the biggest problem is that mediawiki core is quite lax on groups so there is quite a high chance that we will start serving production traffic from dbstore1008 and the other way around (dumps reading from other replicas and causing issues there as we have had this before many times)

This is an absolute no-go. We cannot serve production traffic (other than dumps) from these hosts.

We can add it to dbctl but I'd prefer another way. We can hard-code this in mediawiki-config.

What are the disadvantages to using dbctl?

I wrote that above, they are not safe and trigger prod reads.

...plus it'll hopefully be removed soon.

I don't think that we're planning on decommissioning the dbstore* servers themselves any time soon.

It's true that the xml/sql dumps should not longer call them directly soon, but I think that we may still need these servers for the other miscellaneous dumps for a while.

I'm not talking about decommissioning dbstore. I'm talking about either fixing the bug or fully switching off dumps 1.0 which means they won't go through mediawiki at all.

But the biggest problem is that mediawiki core is quite lax on groups so there is quite a high chance that we will start serving production traffic from dbstore1008 and the other way around (dumps reading from other replicas and causing issues there as we have had this before many times)

This is an absolute no-go. We cannot serve production traffic (other than dumps) from these hosts.

I completely agree. We need properly segmented databases for these two purposes and if there are calls made by the dumps processes to database servers outside of the dumps group, we need to track those down and fix them.
We can get the mysql commands executed for each step of the dumps with the debugging commands here: https://wikitech.wikimedia.org/wiki/Dumps/Rerunning_a_job#Rerunning_a_step_without_using_the_python_scripts and I'm happy to look in that.

As for mediawiki core making calls to the dumps group (and therefore dbstore100[7-9]) from processes other than dumps, again I think that we need to find these and shut them down. Have you any specific examples of when this happened, or any suggestions for how to investigate this before making the switch? It would be easy enough to find after making changes with dbctl, but I'm not sure how I would go about trying to find them ahead of time.

Even if we hunt down and fix every usecase, we are just one bug or mistake away from being re-introduced in the future.

The easiest way is to tell mediawiki to ignore dbctl and get completely blind towards prod databases if it's in a snapshot host and avoid adding dbstore into dbctl altogether.

The easiest way is to tell mediawiki to ignore dbctl and get completely blind towards prod databases if it's in a snapshot host and avoid adding dbstore into dbctl altogether.

But we're also concurrently moving these dumps into kubernetes as part of T352650: WE 5.4 KR - Hypothesis 5.4.4 - Q3 FY24/25 - Migrate current-generation dumps to run on kubernetes - so using hostname matching based on snapshot* is not going to work for that. I need to start using containers for running dumps from next week, too. So I need a method that works for both the bare metal and container based dumps.

The easiest way is to tell mediawiki to ignore dbctl and get completely blind towards prod databases if it's in a snapshot host and avoid adding dbstore into dbctl altogether.

But we're also concurrently moving these dumps into kubernetes as part of T352650: WE 5.4 KR - Hypothesis 5.4.4 - Q3 FY24/25 - Migrate current-generation dumps to run on kubernetes - so using hostname matching based on snapshot* is not going to work for that. I need to start using containers for running dumps from next week, too. So I need a method that works for both the bare metal and container based dumps.

We can define an env variable or php constant similar to the existing MW_API. For example for video scalers we have this:

wmf-config/CommonSettings.php:		if ( strpos( $_SERVER['HTTP_HOST'] ?? '', 'videoscaler.' ) === 0 || strpos( $_SERVER['HTTP_HOST'] ?? '', 'shellbox-video.' ) === 0 ) {

We would have to define something for that regardless I think. Probably there are a lot of usecases needing to know whether they are running in the context of dumps or not.

Even if we hunt down and fix every usecase, we are just one bug or mistake away from being re-introduced in the future.

The easiest way is to tell mediawiki to ignore dbctl and get completely blind towards prod databases if it's in a snapshot host and avoid adding dbstore into dbctl altogether.

If it helps, I'd be happy to patch dbstore_multiinstance.pp so that dbstore100[7-9] no longer permit inbound traffic from wikikube servers and the rest of the production network. We would only need it from snapshot* servers and $DSE_KUBEPODS_NET and $ANALYTICS_NETWORKS.

Reedy renamed this task from Switch dumps 1.0 proccess to use the analytics MariadB replicas (dbstore100[7-9]) to Switch dumps 1.0 process to use the analytics MariadB replicas (dbstore100[7-9]).Jan 3 2025, 7:18 PM

I am unsure on how to continue here. On one hand I would prefer to avoid having non production hosts on dbctl (especially if they are multi-instance, as we made a great effort to get rid of them in production).
On the other hand, I am unsure on how to handle a dbstore going down if they are hardcoded into MW (I know we've had this for many years with parsercache) - at the same time if a host goes down, we can't do anything really as there's no redundancy per section at the moment. So it is very likely that we'd only need to touch dbstore hosts whenever we: add/remove/change hosts, which pretty much never happens in between refresh HW cycles (every 5 years).
So the hard-coding starts to make more sense to me now
More thoughts?

I am unsure on how to continue here. On one hand I would prefer to avoid having non production hosts on dbctl (especially if they are multi-instance, as we made a great effort to get rid of them in production).

Personally, I am not yet persuaded by the argument to put a workaround for dumps in mediwiki-config. I'm very happy to listen to more suggestions or look at patches, but I would rather try to avoid a workaround solution simply because it's easier.

It just seems counter-productive to me that we should have a system whose function is to allow us to select certain database server groups for certain workloads, but then we choose not to use it because (as far as I can glean from the discussion above) we don't trust it enough.

....I am unsure on how to handle a dbstore going down if they are hardcoded into MW (I know we've had this for many years with parsercache) - at the same time if a host goes down, we can't do anything really as there's no redundancy per section at the moment.

It's a given that the current lack of redundancy in the back-end servers is the same, regardless of whether we use dbctl and the dumps load group, or a conditional statement in mediawiki-config.

So my feeling is that the best approach would be to use dbctl but consider the dumps group of servers to be entirely managed by the Data-Platform-SRE team. Downtime of these dbstore servers should not affect anything in the public-facing mediawiki and should not page the SRE team. If it does either, that should be considered a problem that needs to be fixed. We can use firewalls to make sure that mediawiki can't talk to this group of servers outside of the dumps context, if that helps, or we could just monitor for any policy violations, to start with.

...it is very likely that we'd only need to touch dbstore hosts whenever we: add/remove/change hosts, which pretty much never happens in between refresh HW cycles (every 5 years).

I don't feel confident in agreeing with this statement, as the Data-Platform is evolving on several fronts. We may decide that we wish to expand our MariaDB footprint in the near future, even though with Dumps 2.0 we intend to reduce the dependency on the dumps load group in mediawiki. I feel that the dbctl integration gives us much more flexibility in working towards these goals, as opposed to hard-coding a workaround in the mediawiki-config.

BTullis renamed this task from Switch dumps 1.0 process to use the analytics MariadB replicas (dbstore100[7-9]) to Switch dumps 1.0 processes to use the analytics MariadB replicas (dbstore100[7-9]).Jan 6 2025, 11:34 AM

I am unsure on how to continue here. On one hand I would prefer to avoid having non production hosts on dbctl (especially if they are multi-instance, as we made a great effort to get rid of them in production).

Personally, I am not yet persuaded by the argument to put a workaround for dumps in mediwiki-config. I'm very happy to listen to more suggestions or look at patches, but I would rather try to avoid a workaround solution simply because it's easier.

It just seems counter-productive to me that we should have a system whose function is to allow us to select certain database server groups for certain workloads, but then we choose not to use it because (as far as I can glean from the discussion above) we don't trust it enough.

The main reason is also the fact that we've worked pretty hard to have dbctl clean of non-production hosts and we'd be introducing them again, which makes our day to day a lot more complex and can mess up with our automations and ability to quickly choose a host for an emergency master switchover.

It's a given that the current lack of redundancy in the back-end servers is the same, regardless of whether we use dbctl and the dumps load group, or a conditional statement in mediawiki-config.

That is the point. That the only reason I can think of for having dbstore hosts in dbctl (keeping in mind that it can make our day to day a lot more complex) is to be able to depool them, but given that there is nothing to pool/depool if a host needs maintenance/goes down....having them in dbctl isn't giving us any benefit.

So my feeling is that the best approach would be to use dbctl but consider the dumps group of servers to be entirely managed by the Data-Platform-SRE team. Downtime of these dbstore servers should not affect anything in the public-facing mediawiki and should not page the SRE team. If it does either, that should be considered a problem that needs to be fixed. We can use firewalls to make sure that mediawiki can't talk to this group of servers outside of the dumps context, if that helps, or we could just monitor for any policy violations, to start with.

Even if they are entirely managed by other team, this is still making our day to day a lot more complex and confusing as we only have production hosts there. See above.

...it is very likely that we'd only need to touch dbstore hosts whenever we: add/remove/change hosts, which pretty much never happens in between refresh HW cycles (every 5 years).

I don't feel confident in agreeing with this statement, as the Data-Platform is evolving on several fronts. We may decide that we wish to expand our MariaDB footprint in the near future, even though with Dumps 2.0 we intend to reduce the dependency on the dumps load group in mediawiki. I feel that the dbctl integration gives us much more flexibility in working towards these goals, as opposed to hard-coding a workaround in the mediawiki-config.

What flexibility does it give you? The approach @Ladsgroup gave at T382947#10429069 would cover for that as we do for videoscalers.

Just to be clear, I am not yet convinced of any approach, I am trying to gather more ideas to see how we can approach this.

The main reason is also the fact that we've worked pretty hard to have dbctl clean of non-production hosts and we'd be introducing them again, which makes our day to day a lot more complex and can mess up with our automations and ability to quickly choose a host for an emergency master switchover.

We're heavily overloading the term production here, which I feel isn't helping. I understand that you're intending it to mean the interactive, public-facing mediwiki projects, as well as the asynchronous job runners etc.
However, by implication, that means that you're referring to dumps as a second-class service; whereas I am trying to consider the dump generation as a production-class service.

So it doesn't seem fair that your team would be able to make use of dbctl for runtime reconfiguration of the database servers for your workloads, whereas our team would have to patch the PHP code any time that we wish to make a change to the way in which our production workloads select their database servers.

I'm happy to help with any changes required to your automation scripts, if that helps. I certainly don't want to make anything unnecessarily complex for you or your team.

....having them in dbctl isn't giving us any benefit.

Granted, with today's configuration, there would be no tangible benefit.
The potential benefit would be that if we wish to add more servers to the dumps group in the near future, to add resilience, or to effect maintenance, then we wouldn't havce to patch and deploy the mediawiki config to do it.

So my feeling is that the best approach would be to use dbctl but consider the dumps group of servers to be entirely managed by the Data-Platform-SRE team. Downtime of these dbstore servers should not affect anything in the public-facing mediawiki and should not page the SRE team. If it does either, that should be considered a problem that needs to be fixed. We can use firewalls to make sure that mediawiki can't talk to this group of servers outside of the dumps context, if that helps, or we could just monitor for any policy violations, to start with.

What flexibility does it give you? The approach @Ladsgroup gave at T382947#10429069 would cover for that as we do for videoscalers.

I checked the code referenced for the videoscalers and I can see how it works, but I'm still not very keen.
This conditional structure only sets a single variable, based on an HTTP header or another environment variable.

For this use case, we would have to override destination database hostnames and ports for all 9 sections and any changes to these values would require a patch and redeploy. It just strikes me as unnecessarily hacky and risky to start down this path.

Admittedly, I don't have a full grasp on how much more complexity it would add to your automations to select a host for automatic master switchover etc. but I can't immediately imagine how it could be more difficult to exclude dbstore* from this mechanism.

Just to be clear, I am not yet convinced of any approach, I am trying to gather more ideas to see how we can approach this.

I am also still happy to be convinced to a mediawiki-config patch is the right way to go, but I don't want to start writing a patch when it doesn't strike me as the right way to go. If someone else wants to write something, I'll happily review it.

We're heavily overloading the term production here, which I feel isn't helping. I understand that you're intending it to mean the interactive, public-facing mediwiki projects, as well as the asynchronous job runners etc.
However, by implication, that means that you're referring to dumps as a second-class service; whereas I am trying to consider the dump generation as a production-class service.

What I am calling production is the site here. If the databases go down the site is down, if dumps go down, the site is up. I do treat dumps as something very important, but not mission critical.

So it doesn't seem fair that your team would be able to make use of dbctl for runtime reconfiguration of the database servers for your workloads, whereas our team would have to patch the PHP code any time that we wish to make a change to the way in which our production workloads select their database servers.

In the last quarter my team made 1829 commits via dbctl. We are heavy users of dbctl and in fact it was developed under our guidance to help our automations and emergencies. Given Amir's answer, I am not sure how many patches you'd need to be sending to MW repo. I know it is very hard to estimate this, especially given that the future of dbstore is uncertain, but it could help if you can elaborate a bit more on what you'd have in mind.

Adding dbstore hosts to dbctl would make our automation a bit harder, especially under emergencies (eg: we'd not be able to promote any host to master, as we'd need to help and identify the hosts that CANNOT be done - right now any could do it, to a certain extent). We've worked extremely hard during the last few years to make production free of snowflakes.

Granted, with today's configuration, there would be no tangible benefit.
The potential benefit would be that if we wish to add more servers to the dumps group in the near future, to add resilience, or to effect maintenance, then we wouldn't havce to patch and deploy the mediawiki config to do it.

Do you have any roadmap/estimation/task where we can see how many or when that would be happening? I think it would help the discussion and see how much work this could be vs how much complexity it can add to our end.

What flexibility does it give you? The approach @Ladsgroup gave at T382947#10429069 would cover for that as we do for videoscalers.

I checked the code referenced for the videoscalers and I can see how it works, but I'm still not very keen.
This conditional structure only sets a single variable, based on an HTTP header or another environment variable.

@Ladsgroup is this fixable from MW? Would it be hard?

For this use case, we would have to override destination database hostnames and ports for all 9 sections and any changes to these values would require a patch and redeploy. It just strikes me as unnecessarily hacky and risky to start down this path.

This would be the same thing in dbctl if you want to add new hosts.
dbctl would require a patch.
Maybe the confusion lays here: dbctl is used for pooling/depooling - and these hosts would not support that as there are not more than one section per host. It wouldn't help any other thing (unless you have an architecture where pooling/depooling is possible).

Admittedly, I don't have a full grasp on how much more complexity it would add to your automations to select a host for automatic master switchover etc. but I can't immediately imagine how it could be more difficult to exclude dbstore* from this mechanism.

I think I partially answered above, let me know if you'd like me to elaborate more.

Thanks again for your helpful responses.

The potential benefit would be that if we wish to add more servers to the dumps group in the near future, to add resilience, or to effect maintenance, then we wouldn't havce to patch and deploy the mediawiki config to do it.

Do you have any roadmap/estimation/task where we can see how many or when that would be happening? I think it would help the discussion and see how much work this could be vs how much complexity it can add to our end.

We don't have any expansion roadmap set out in concrete terms yet, but I can share a few pointers to where we are currently building (or have recently built) upon the analytics mediawiki replicas service.

This work will require a review the capacity and availability of this service. These use cases are in addition to the current ad-hoc use by engineers, researchers, and analytics - as well as the monthly sqoops into HDFS.

Also this slide deck: State of WMF Data Platform Technical Stack 2024 - This was an input to the recent Data Platform Strategy offsite and it mentions the relative importance to us of the analytics mediawiki replica service. As we approach the APP planning window for FY25/26 we might well be looking at how to increase the availability and/or performance characteristics of this service.

When speaking to @wiki_willy recently about the 3-5 year forecast for the Data Platform group, I said that we could not yet tell whether we would expand the dbstore cluster by ~ 100% or whether it would remain at its current size.
Much will depend on how the data platform evolves and I can't speak for all of the stakeholders involved.

Personally, I am not yet persuaded by the argument to put a workaround for dumps in mediwiki-config. I'm very happy to listen to more suggestions or look at patches, but I would rather try to avoid a workaround solution simply because it's easier.

It just seems counter-productive to me that we should have a system whose function is to allow us to select certain database server groups for certain workloads, but then we choose not to use it because (as far as I can glean from the discussion above) we don't trust it enough.

We trust it to separate workloads since in the large scale, small leakage doesn't have any impact. But we don't trust it to separate systems and infrastructure. It's a not a firewall.

....I am unsure on how to handle a dbstore going down if they are hardcoded into MW (I know we've had this for many years with parsercache) - at the same time if a host goes down, we can't do anything really as there's no redundancy per section at the moment.

It's a given that the current lack of redundancy in the back-end servers is the same, regardless of whether we use dbctl and the dumps load group, or a conditional statement in mediawiki-config.

If dbstore goes down, that condition can be removed so it would fallback to dumps group in dbctl.

So my feeling is that the best approach would be to use dbctl but consider the dumps group of servers to be entirely managed by the Data-Platform-SRE team. Downtime of these dbstore servers should not affect anything in the public-facing mediawiki and should not page the SRE team. If it does either, that should be considered a problem that needs to be fixed. We can use firewalls to make sure that mediawiki can't talk to this group of servers outside of the dumps context, if that helps, or we could just monitor for any policy violations, to start with.

Changing mw is much simpler and achieves the same result. I would rather avoid complexity if it's not needed.

...it is very likely that we'd only need to touch dbstore hosts whenever we: add/remove/change hosts, which pretty much never happens in between refresh HW cycles (every 5 years).

I don't feel confident in agreeing with this statement, as the Data-Platform is evolving on several fronts. We may decide that we wish to expand our MariaDB footprint in the near future, even though with Dumps 2.0 we intend to reduce the dependency on the dumps load group in mediawiki. I feel that the dbctl integration gives us much more flexibility in working towards these goals, as opposed to hard-coding a workaround in the mediawiki-config.

I think several discussions is being mixed here:

  • The long-term changes to dbstore requires much bigger discussions and thinking and I honestly think it shouldn't be part of dbctl at all. It should be stored in etcd and part of confctl but should have its own namespace and schema. For example, right now wikireplicas is being depooled and repooled via etcd but it's not part of dbctl. Once we start to actually have more than one host per section so the need for depool/repool comes up, than we can think of designing etcd-based config mgmt.
  • Dumps 2.0 will use mostly hadoop and not querying mw database directly (it'll be through mw APIs for specific cases but again most of the heavy lifting will be on hadoop), so the discussion about dump 2.0 needing dbstore config mgmt is moot
  • Also generally speaking, db reads on mw databases shouldn't happen bypassing mw. If we want to re-architecture the software to allow this. It's a much bigger discussion (e.g. how to avoid coupling non-mw services to mw db schema and how to make sure password hashes are not accessible, etc. etc.)

The main reason is also the fact that we've worked pretty hard to have dbctl clean of non-production hosts and we'd be introducing them again, which makes our day to day a lot more complex and can mess up with our automations and ability to quickly choose a host for an emergency master switchover.

We're heavily overloading the term production here, which I feel isn't helping. I understand that you're intending it to mean the interactive, public-facing mediwiki projects, as well as the asynchronous job runners etc.
However, by implication, that means that you're referring to dumps as a second-class service; whereas I am trying to consider the dump generation as a production-class service.

This is not about production vs non-production. It's about separation of services and concerns. While one is clearly more important than the other both are production but that doesn't mean they should share resource (database being the resource here). This is why we have different db clusters for phabricator or mailman, it's fully production (m3 and m5) but it should be fully separated from wikis databases

So it doesn't seem fair that your team would be able to make use of dbctl for runtime reconfiguration of the database servers for your workloads, whereas our team would have to patch the PHP code any time that we wish to make a change to the way in which our production workloads select their database servers.

Here, I think there is a confusion between dbctl vs confctl. dbctl is for mediawiki and must stay for mw only, you can set up your own depool/repool mgmt via confctl. Exactly like how wikireplicas are operating currently: https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Runbooks/Depool_wikireplicas#Depooling_using_conftool_on_cumin_hosts

I'm happy to help with any changes required to your automation scripts, if that helps. I certainly don't want to make anything unnecessarily complex for you or your team.

....having them in dbctl isn't giving us any benefit.

Granted, with today's configuration, there would be no tangible benefit.
The potential benefit would be that if we wish to add more servers to the dumps group in the near future, to add resilience, or to effect maintenance, then we wouldn't havce to patch and deploy the mediawiki config to do it.

Again, I think we should do this once we start to add more replicas for dbstore but not now. Specially since your next comment confirms this is a much more long-term discussion. Once we get there, it shouldn't use dbctl still.

So my feeling is that the best approach would be to use dbctl but consider the dumps group of servers to be entirely managed by the Data-Platform-SRE team. Downtime of these dbstore servers should not affect anything in the public-facing mediawiki and should not page the SRE team. If it does either, that should be considered a problem that needs to be fixed. We can use firewalls to make sure that mediawiki can't talk to this group of servers outside of the dumps context, if that helps, or we could just monitor for any policy violations, to start with.

What flexibility does it give you? The approach @Ladsgroup gave at T382947#10429069 would cover for that as we do for videoscalers.

I checked the code referenced for the videoscalers and I can see how it works, but I'm still not very keen.
This conditional structure only sets a single variable, based on an HTTP header or another environment variable.

It's not hacky, MW needs to know what context it's running on. There are env/constant variable for videoscalers, CI, API, CLI, jobs, etc.

For this use case, we would have to override destination database hostnames and ports for all 9 sections and any changes to these values would require a patch and redeploy. It just strikes me as unnecessarily hacky and risky to start down this path.

For now, I've been explicitly asking this to be done only for s1 as that's the only place this is causing issues (for now).

Admittedly, I don't have a full grasp on how much more complexity it would add to your automations to select a host for automatic master switchover etc. but I can't immediately imagine how it could be more difficult to exclude dbstore* from this mechanism.

Just to be clear, I am not yet convinced of any approach, I am trying to gather more ideas to see how we can approach this.

I am also still happy to be convinced to a mediawiki-config patch is the right way to go, but I don't want to start writing a patch when it doesn't strike me as the right way to go. If someone else wants to write something, I'll happily review it.

I try to make a POC.

Thanks again for your helpful responses.

The potential benefit would be that if we wish to add more servers to the dumps group in the near future, to add resilience, or to effect maintenance, then we wouldn't havce to patch and deploy the mediawiki config to do it.

Do you have any roadmap/estimation/task where we can see how many or when that would be happening? I think it would help the discussion and see how much work this could be vs how much complexity it can add to our end.

We don't have any expansion roadmap set out in concrete terms yet, but I can share a few pointers to where we are currently building (or have recently built) upon the analytics mediawiki replicas service.

This work will require a review the capacity and availability of this service. These use cases are in addition to the current ad-hoc use by engineers, researchers, and analytics - as well as the monthly sqoops into HDFS.

Also this slide deck: State of WMF Data Platform Technical Stack 2024 - This was an input to the recent Data Platform Strategy offsite and it mentions the relative importance to us of the analytics mediawiki replica service. As we approach the APP planning window for FY25/26 we might well be looking at how to increase the availability and/or performance characteristics of this service.

When speaking to @wiki_willy recently about the 3-5 year forecast for the Data Platform group, I said that we could not yet tell whether we would expand the dbstore cluster by ~ 100% or whether it would remain at its current size.
Much will depend on how the data platform evolves and I can't speak for all of the stakeholders involved.

Thanks for all the information!.
Reading all this, I think it is clear to me that the expansion or re-arch of the dbstore (dumps service) is something for a mediumish/long term future and not something likely to happen in this FY. And it is not yet decided/clear what the architecture will be.

Based on that, I think the amount of needed commits to MW would be just probably one (the initial one) and it would stay like that most likely for a very long time. And the interaction with MW is probably going to be none even if a host goes down (as there is no alternative right now) so I think we should try to go for MW for now - @Ladsgroup offered help to make this happen.

For the future, we can probably discuss the usage of dbctl, but as @Ladsgroup pointed out, given the circumstances I also believe confctl is probably better, and it is working really well for the wikireplicas as far as I know.
How does this sound?

Based on that, I think the amount of needed commits to MW would be just probably one (the initial one) and it would stay like that most likely for a very long time. And the interaction with MW is probably going to be none even if a host goes down (as there is no alternative right now) so I think we should try to go for MW for now - @Ladsgroup offered help to make this happen.

OK, thanks. That sounds fine, then.
I just have one reservation, which is:

For now, I've been explicitly asking this to be done only for s1 as that's the only place this is causing issues (for now).

I'd really rather avoid this, if it's at all possible. My feeling is that it would be much better to switch all of the dumps to use the dbstore servers, rather than make enwiki an exceptional case.

Also, I was wondering whether be practical to use our DNS SRV records to locate the right dbstore host and port.
I can see a reference to where SRV records are queried for etcd servers here, so I wonder if we might be able to use the same mechanism for _s1-analytics._tcp.eqiad.wmnet and the other sections.
What do you think @Ladsgroup? Would this be workable?

For the future, we can probably discuss the usage of dbctl, but as @Ladsgroup pointed out, given the circumstances I also believe confctl is probably better, and it is working really well for the wikireplicas as far as I know.
How does this sound?

Yes, this is fine, too. If we find that we don't need it, then maybe we could even work on deprecating the dump load group

Based on that, I think the amount of needed commits to MW would be just probably one (the initial one) and it would stay like that most likely for a very long time. And the interaction with MW is probably going to be none even if a host goes down (as there is no alternative right now) so I think we should try to go for MW for now - @Ladsgroup offered help to make this happen.

OK, thanks. That sounds fine, then.
I just have one reservation, which is:

For now, I've been explicitly asking this to be done only for s1 as that's the only place this is causing issues (for now).

I'd really rather avoid this, if it's at all possible. My feeling is that it would be much better to switch all of the dumps to use the dbstore servers, rather than make enwiki an exceptional case.

I agree with this - is there anything preventing us from MW side to do this @Ladsgroup

Also, I was wondering whether be practical to use our DNS SRV records to locate the right dbstore host and port.
I can see a reference to where SRV records are queried for etcd servers here, so I wonder if we might be able to use the same mechanism for _s1-analytics._tcp.eqiad.wmnet and the other sections.
What do you think @Ladsgroup? Would this be workable?

For the future, we can probably discuss the usage of dbctl, but as @Ladsgroup pointed out, given the circumstances I also believe confctl is probably better, and it is working really well for the wikireplicas as far as I know.
How does this sound?

Yes, this is fine, too. If we find that we don't need it, then maybe we could even work on deprecating the dump load group

That would be just a nice thing to do and would also simply our day to day a lot and our automation.

Change #1109108 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/mediawiki-config@master] ClusterConfig: add support for dumps trait

https://gerrit.wikimedia.org/r/1109108

Change #1109109 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/mediawiki-config@master] Use a bespoke database configuration for dumps

https://gerrit.wikimedia.org/r/1109109

I've created two patches, that implement a version of what we've said:

  • The first allows to identify dumps deployments via ClusterConfig, as long as either:
    • the SERVERGROUP env variable contains the string "dumps"
    • the file /etc/wikimedia-servergroup contains the string
  • The second, in case the running program is identified as dumps, will query the SRV records for the various dbstore sections, using _$secton-analytics._tcp.eqiad.wmnet, and then repopulate the db data structures using those databases. We just leave the masters in place, and of course external storage/parsercache/x1 are untouched.

I want to improve a couple things, namely:

  • Derive the load for each server from the "weight" in the DNS SRV record, so we allow adding multiple records with different weights
  • Define a new update lbfactory update callback that refreshes the DNS SRV records

With those changes, we will have the same flexibility as with dbctl, it will just need a dns change instead than a cli command.

@BTullis does this look reasonable to you for now?

I've created two patches, that implement a version of what we've said:

@BTullis does this look reasonable to you for now?

Yes, thanks @Joe. That solution looks excellent!
Would you like me to work on a puppet patch to define /etc/wikimedia-servergroup for the snapshot servers, or do you have something in mind already?

Change #1108071 abandoned by Btullis:

[operations/puppet@production] Add conftool-data for dbstore hosts to dbctl

Reason:

Superseded by the approach in https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1109109

https://gerrit.wikimedia.org/r/1108071

Change #1109108 merged by jenkins-bot:

[operations/mediawiki-config@master] ClusterConfig: add support for dumps trait

https://gerrit.wikimedia.org/r/1109108

Mentioned in SAL (#wikimedia-operations) [2025-01-16T08:20:21Z] <oblivian@deploy2002> Started scap sync-world: Backport for [[gerrit:1109108|ClusterConfig: add support for dumps trait (T382947)]]

Mentioned in SAL (#wikimedia-operations) [2025-01-16T08:26:56Z] <oblivian@deploy2002> oblivian: Backport for [[gerrit:1109108|ClusterConfig: add support for dumps trait (T382947)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-01-16T08:32:51Z] <oblivian@deploy2002> Finished scap sync-world: Backport for [[gerrit:1109108|ClusterConfig: add support for dumps trait (T382947)]] (duration: 12m 30s)

Change #1109109 merged by jenkins-bot:

[operations/mediawiki-config@master] Use a bespoke database configuration for dumps

https://gerrit.wikimedia.org/r/1109109

Mentioned in SAL (#wikimedia-operations) [2025-01-20T14:11:15Z] <oblivian@deploy2002> Started scap sync-world: Backport for [[gerrit:1109109|Use a bespoke database configuration for dumps (T382947)]]

Mentioned in SAL (#wikimedia-operations) [2025-01-20T14:16:18Z] <oblivian@deploy2002> oblivian: Backport for [[gerrit:1109109|Use a bespoke database configuration for dumps (T382947)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-01-20T14:30:02Z] <oblivian@deploy2002> Finished scap sync-world: Backport for [[gerrit:1109109|Use a bespoke database configuration for dumps (T382947)]] (duration: 18m 47s)

The patch has been deployed. When I manually added the /etc/wikimedia-servergroup file with content "dumps" to snapshot1011 (immediately removed), I confirmed that it would make mediawiki use the dbstore servers:

$  sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php shell.php --wiki enwiki
> var_dump($wgLBFactoryConf);
...
  ["hostsByName"]=>
  array(138) {
...
    ["dbstore1007.eqiad.wmnet:3314"]=>
    string(16) "10.64.48.87:3314"
    ["dbstore1009.eqiad.wmnet:3318"]=>
    string(17) "10.64.135.22:3318"
    ["dbstore1008.eqiad.wmnet:3311"]=>
    string(17) "10.64.131.23:3311"
    ["dbstore1009.eqiad.wmnet:3316"]=>
    string(17) "10.64.135.22:3316"
    ["dbstore1008.eqiad.wmnet:3315"]=>
    string(17) "10.64.131.23:3315"
    ["dbstore1007.eqiad.wmnet:3312"]=>
    string(16) "10.64.48.87:3312"
    ["dbstore1008.eqiad.wmnet:3317"]=>
    string(17) "10.64.131.23:3317"
}
...
  ["groupLoadsBySection"]=>
  array(8) {
    ["s4"]=>
    array(3) {
      ["api"]=>
      array(1) {
        ["dbstore1007.eqiad.wmnet:3314"]=>
        int(1)
      }
      ["dump"]=>
      array(1) {
        ["dbstore1007.eqiad.wmnet:3314"]=>
        int(1)
      }
      ["vslow"]=>
      array(1) {
        ["dbstore1007.eqiad.wmnet:3314"]=>
        int(1)
      }
    }
...

So now we can just prepare a patch to add that file via puppet on the snapshot hosts and they will automatically switch to use the dbstores. I would recommend, however, to do the following before switching:

  • Raise the TTL on the SRV record entries to 5 minutes
  • Raise the weight to 100 from 1

Change #1113475 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] dumps: Configure snapshot servers with the dumps trait

https://gerrit.wikimedia.org/r/1113475

I have also tried a test dump on snapshot1014 with the following command:

worker ./worker --date 20250122 --skipdone --exclusive --log --configfile /etc/dumps/confs/wikidump.conf.tests --wiki eswiki

This is proceeding well, so it validates that the permissions on the source database are correct, too.

Change #1113505 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/dns@master] Raise the weight of all analytics mariadb replica srv records

https://gerrit.wikimedia.org/r/1113505

So now we can just prepare a patch to add that file via puppet on the snapshot hosts and they will automatically switch to use the dbstores. I would recommend, however, to do the following before switching:

  • Raise the TTL on the SRV record entries to 5 minutes
  • Raise the weight to 100 from 1

Thanks @Joe.
I think that the TTL on these records is already 5 minutes (e.g. here, but please correct me if I have misunderstood.
I am raising the weight of all servers to 100 in https://gerrit.wikimedia.org/r/1113505, as advised. This makes sense, in case we wish to add a lower weight entry, in future.

Change #1113505 merged by Btullis:

[operations/dns@master] Raise the weight of all analytics mariadb replica srv records

https://gerrit.wikimedia.org/r/1113505

@Joe - Sorry to trouble you. I have just discovered a small issue with the patch. I think that we need to set the DEFAULT section to tbe the same as the s3 section. I tried this test on snapshot1014, with the /etc/wikimedia-servergroup file present and containing dumps.

dumpsgen@snapshot1014:/home/btullis$ php /srv/mediawiki/multiversion/MWScript.php shell.php --wiki enwiki
Psy Shell v0.12.7 (PHP 7.4.33 — cli) by Justin Hileman
> var_dump($wgLBFactoryConf["sectionLoads"]);
...
array(8) {
  ["s5"]=>
  array(2) {
    ["db2213"]=>
    int(0)
    ["dbstore1008.eqiad.wmnet:3315"]=>
    int(100)
  }
...
  ["DEFAULT"]=>
  array(1) {
    ["db2209"]=>
    int(0)
  }
  ["s7"]=>
  array(2) {
    ["db2218"]=>
    int(0)
    ["dbstore1008.eqiad.wmnet:3317"]=>
    int(100)
  }
...

The effect is that this works:

./worker --date 20250123 --skipdone --log --configfile /etc/dumps/confs/wikidump.conf.tests --wiki enwiki

But this doesn't work:

./worker --date 20250123 --skipdone --log --configfile /etc/dumps/confs/wikidump.conf.tests --wiki zuwikibooks

The error message is:

ERROR 2005 (HY000): Unknown MySQL server host 'db2209'

Change #1113788 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/mediawiki-config@master] DBRecordCache: handle default section

https://gerrit.wikimedia.org/r/1113788

I've started a dump from snapshot1012 of enwiki with the new dumps trait.

dumpsgen@snapshot1012:/srv/deployment/dumps/dumps/xmldumps-backup$ ./worker --date 20250123 --skipdone --exclusive --log --configfile /etc/dumps/confs/wikidump.conf.dumps:en --wiki enwiki

We can see from the MySQL dashboard and the Host Overview dashboard that this is definitely using the dbstore1008:3311 instance.

I'll keep an eye on progress and hope it completes before the end of the month.

Change #1113475 merged by Btullis:

[operations/puppet@production] dumps: Configure snapshot1012 with the dumps trait

https://gerrit.wikimedia.org/r/1113475

Change #1113788 merged by jenkins-bot:

[operations/mediawiki-config@master] DBRecordCache: handle default section

https://gerrit.wikimedia.org/r/1113788

Mentioned in SAL (#wikimedia-operations) [2025-01-29T08:54:24Z] <oblivian@deploy2002> Started scap sync-world: Backport for [[gerrit:1113788|DBRecordCache: handle default section (T382947)]]

Mentioned in SAL (#wikimedia-operations) [2025-01-29T08:57:36Z] <oblivian@deploy2002> oblivian: Backport for [[gerrit:1113788|DBRecordCache: handle default section (T382947)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-01-29T09:05:04Z] <oblivian@deploy2002> Finished scap sync-world: Backport for [[gerrit:1113788|DBRecordCache: handle default section (T382947)]] (duration: 10m 39s)

Change #1114978 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] dumps: Use the analytics replicas by default for dumps 1.0

https://gerrit.wikimedia.org/r/1114978

Change #1114991 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] dumps: Re-enable the enwiki dumps on snapshot1012

https://gerrit.wikimedia.org/r/1114991

Change #1114978 merged by Btullis:

[operations/puppet@production] dumps: Use the analytics replicas by default for dumps 1.0

https://gerrit.wikimedia.org/r/1114978

Change #1114991 merged by Btullis:

[operations/puppet@production] dumps: Re-enable the enwiki dumps on snapshot1012

https://gerrit.wikimedia.org/r/1114991