Page MenuHomePhabricator

Replace the current multisource analytics-store setup
Closed, ResolvedPublic

Description

Problem

analytics-store (aka dbstore1002), which currently hosts the MediaWiki analytics replicas for all wikis, has an unsustainable setup. Specifically:

  • It runs Ubuntu Trusty, which will hit its end-of-life at the end of April 2019, so we have a hard deadline to move it to a new Debian Stretch setup.
  • It's out of warranty and is showing up some signs of failure (disks in the RAID array breaking, etc..), so it needs to be replaced by a newer one. This means replicating the setup on a new (very beefy) host of course.
  • It currently uses multisource replication to pull data from all the production shards onto a single host. From the user's point of view, it works nicely but on the SRE side it creates some challenges due to the big amount of data stored and replication throughput from the "wiki" production slaves. For example, we are currently storing close to 6TB of data on a single host, replicated with RAID with a data insertion rate that keeps growing over time. Simply throwing disks and buying new bigger hardware is not feasible/scalable in the medium/long term, at some point we will get again to an overloaded host that can't keep up with replication. The cost of maintenance (in terms of people working on it) will be high, more than what it is now (the Data Persistence in SRE is doing an incredible job behind the scenes on a daily basis for dbstore1002).
  • The data it currently contains is unreliable and not equal to production, however because the setup is so different from production it is not easily fixable
  • It mainly uses TokuDB, which has in the past be unreliable, lagging days behind and crashing/returning inconsistent results.

Planned solution

We are planning to migrate dbstore1002 to three new hosts: dbstore1003/4/5. The idea is to split the wiki replicas to multiple hosts, and finally deprecate dbstore1002 (we have a hard deadline of 30 April 2019 since Ubuntu Trusty will be EOLed).

The data persistence team proposed a layout in T210478#4794536. It would move sX sections (so the database groupings listed in s1.dblist, s2.dblist etc..) to their own mysql instance on an assigned dbstore node. For example, all wikis in S5 will be available (i.e. replicated) to a mysql instance on dbstore1003 (with an assigned port that we don't know yet). So joins between schemas belonging to different sX sections will not be possible anymore.

The staging database will likely be assigned to a separate mysql instance, so people will be able to keep using its data. It will still be possible to create tables etc.., but importing data from various wiki databases will need some extra work (dump the data, import it, etc..).

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@mpopov @chelsyx @Neil_P._Quinn_WMF - Hi! Any thoughts about the refresh of dbstore1002?

elukey moved this task from Next Up to In Progress on the Analytics-Kanban board.Feb 26 2018, 4:07 PM
mpopov added a comment.EditedFeb 26 2018, 7:03 PM

I never do cross-wiki joins, so no objections from me either. BUT we do have Maps prevalence metrics calculated on a per-wiki basis so we will need to know which wikis will be on which hosts.

Just wanted to share our use case of querying multiple databases. As part of analyzing the usage of Gapfinder we join wikishared, log, and centralauth databases together. As it stands now, the log database is not on the same database server as wikishared and centralauth, which makes our analysis harder. It would be nice to have those three databases on the same server.

No objection from me, since I never do cross-wiki joins on Mysql. Plus, sqoop works great so far.

Neil_P._Quinn_WMF added a comment.EditedFeb 27 2018, 8:52 AM

@mpopov @chelsyx @Neil_P._Quinn_WMF - Hi! Any thoughts about the refresh of dbstore1002?

As I understand it, this means that data from different wikis would be on different db servers.

That would actually break one of my key workflows. I don't do cross-wiki joins, but I do aggregate data from all wikis into a staging table which I query for global editor metrics. This seems like it would be much harder to do if the data is split up between multiple servers, since I would have to go outside SQL.

I do want to switch these calculations to the Data Lake (T173853), but I won't be able to fully do that until we have edit tag data in the Data Lake (T161149), since tags are essential for calculating global metrics about mobile editing and visual editor adoption. It would also take a decent amount of time to adapt my queries to the Data Lake, but I should be able to prioritize that work if it's the only blocker.

Could we hold off on this change until edit tags are available and I have a few weeks to shift over?

No objection from me, since I never do cross-wiki joins on Mysql. Plus, sqoop works great so far.

Out of curiosity, what do you use sqoop for?

Could we hold off on this change until edit tags are available and I have a few weeks to shift over?

Of course, we are currently not rushing any new system but only asking to our users their use cases to avoid breaking them. Our final (medium/long) term goal would be to move everybody to a Hadoop only solution, removing the need for Mysql completely, but there is a lot of work remaining :)

I'll let you know more precise timings once I'll know them (we need to order hardware, schedule the migration with dbas, etc..) but for the moment you are completely fine.

No objection from me, since I never do cross-wiki joins on Mysql. Plus, sqoop works great so far.

Out of curiosity, what do you use sqoop for?

I'll let @Milimetric to follow up on this so you'll get a better and more complete answer :)

Of course, we are currently not rushing any new system but only asking to our users their use cases to avoid breaking them. Our final (medium/long) term goal would be to move everybody to a Hadoop only solution, removing the need for Mysql completely, but there is a lot of work remaining :)
I'll let you know more precise timings once I'll know them (we need to order hardware, schedule the migration with dbas, etc..) but for the moment you are completely fine.

That sounds great! I totally agree about aiming for a Hadoop-only solution, so I'm glad you're working on getting there :)

we will need to know which wikis will be on which hosts

Most likely, you will have to connect to something like sX-analytics-replica.eqiad.wmnet, where the division is the same one we have on production or labs: https://noc.wikimedia.org/db.php Maybe a metadata database can be set, like in labsdbs?

it would be nice to have those three databases on the same server.

There is a change to actually having that, but I cannot commit to it. For the short term, rather than having no multi-instance, I would prefer to copy individual tables around for special cases in a cron, if it was necessary. We can setup the new hosts in parallel to the existing ones, give a period of testing and then solve individual problems. E.g having wikishared on the same instance is way simpler than having all of x1.

Given the limited budget there is for this set of servers, having multiple instances means that if one breaks, the rest are still up, and not everything breaks, like it happens how (and honestly, it is likely to happen soon given the age of the current server).

Nuria added a comment.Feb 27 2018, 5:03 PM

@Neil_P._Quinn_WMF : " I don't do cross-wiki joins, but I do aggregate data from all wikis into a staging table which I query for global editor metrics"

Could you elaborate on which data do you need to do this that is currently not on data lake?

@Neil_P._Quinn_WMF : " I don't do cross-wiki joins, but I do aggregate data from all wikis into a staging table which I query for global editor metrics"
Could you elaborate on which data do you need to do this that is currently not on data lake?

So I also do this with the user_properties tables for all wikis.

No objection from me, since I never do cross-wiki joins on Mysql. Plus, sqoop works great so far.

Out of curiosity, what do you use sqoop for?

I had to run a complicated query on commonswiki which joins several tables and do a bunch of string matching. It took forever to run on Mysql and some of the tables I used was not on Hadoop (e.g. image). So I sqoop them to Hadoop and use Spark to speed up.

So I sqoop them to Hadoop and use Spark to speed up.

I LIKE YOUR STYYYYYYLE!

Neil_P._Quinn_WMF added a comment.EditedFeb 28 2018, 6:02 AM

Could you elaborate on which data do you need to do this that is currently not on data lake?

Certainly, let me know if I should expand on the explanation from my initial comment 😊:

I do want to switch these calculations to the Data Lake (T173853), but I won't be able to fully do that until we have edit tag data in the Data Lake (T161149), since tags are essential for calculating global metrics about mobile editing and visual editor adoption.

elukey moved this task from Backlog to In Progress on the User-Elukey board.Feb 28 2018, 3:20 PM
Nuria added a comment.Feb 28 2018, 4:00 PM

Ah, sorry, i see. Then -since we plan to add tags next quarter - your should not have to do cross-joins from july onwards,

Ah, sorry, i see. Then -since we plan to add tags next quarter - your should not have to do cross-joins from july onwards,

Excellent, I'm looking forward to it!

elukey moved this task from In Progress to Paused on the Analytics-Kanban board.Mar 15 2018, 11:00 AM
elukey moved this task from In Progress to Stalled on the User-Elukey board.Mar 15 2018, 5:04 PM
elukey edited projects, added Analytics; removed Analytics-Kanban.Apr 6 2018, 7:12 AM

Keeping this task updated with the recent discussions in the Analytics team. During the upcoming fiscal year we'll move forward with the hardware refresh of dbstore1002 (analytics-store) with a three hosts mysql multi instance setup. As written above, this will have consequences like not being able to join multiple databases, etc..

During this year though we (as Analytics team) will try to identify use cases and work with people to move away from mysql, ideally getting to a position in which everybody can work with data on Hadoop only. The idea is not to force people to avoid mysql, but to work with them to improve the Hadoop alternative solution and then offer a valid and suitable replacement.

Eventually (not during the immediate future but down the line) we'd like to use the (soon to be) replacements of dbstore1002 as data ingestion pipeline for the Analytics team, not allowing any other use case (so no direct mysql queries to them unless strictly necessary). As explained above, scaling Hadoop for us is way easier than do the same with mysql, and we are definitely more capable of offering support for any data crunch usage on Hadoop rather than on mysql.

So this is the master plan, if you have any concern please let us know :)

During this year though we (as Analytics team) will try to identify use cases and work with people to move away from mysql, ideally getting to a position in which everybody can work with data on Hadoop only.

I'm generally happy with the move from MariaDB to Hadoop, and I identified my main blocker (T161149) above.

However, there's also a long tail of analysis which gets done rarely and unpredictable and can use any arbitrary part of the data in the application tables. For example, this could be looking at notification patterns (T113664, using the echo_event and echo_notification tables) or skin preference (T147696, using the user_properties tables). So to feel confident that I'd never have to use MySQL, I'd need all the tables periodically loaded into Hadoop (once a month or so would be fine—for snapshot research about feature usage, it's rarely important to have the latest data).

This is exactly the kind of feedback that we'd need during the next months to be able to pull the trigger, we will not of course do anything before that. The goal is to improve the workflows of the analytics team and their users together :)

leila added a comment.Jun 11 2018, 5:20 PM

@elukey I reviewed what you say below and I confirm that it's in-line with our earlier discussions on and off this phabricator task. No concerns from Research end (I do expect us to need doing some joins but the use-cases we could spot were rare enough in Research applications that we decided we will handle those on a case by case basis and at the analysis level.)

@elukey we discussed this at the off-site and you're aware of it. putting it here so others are aware, too. Research will not be heavily affected if you replace dbstore1002 with a multi-instance solution. We will continue monitoring this task and will chime in as needed.

Keeping this task updated with the recent discussions in the Analytics team. During the upcoming fiscal year we'll move forward with the hardware refresh of dbstore1002 (analytics-store) with a three hosts mysql multi instance setup. As written above, this will have consequences like not being able to join multiple databases, etc..
During this year though we (as Analytics team) will try to identify use cases and work with people to move away from mysql, ideally getting to a position in which everybody can work with data on Hadoop only. The idea is not to force people to avoid mysql, but to work with them to improve the Hadoop alternative solution and then offer a valid and suitable replacement.
Eventually (not during the immediate future but down the line) we'd like to use the (soon to be) replacements of dbstore1002 as data ingestion pipeline for the Analytics team, not allowing any other use case (so no direct mysql queries to them unless strictly necessary). As explained above, scaling Hadoop for us is way easier than do the same with mysql, and we are definitely more capable of offering support for any data crunch usage on Hadoop rather than on mysql.
So this is the master plan, if you have any concern please let us know :)

elukey added a comment.EditedDec 3 2018, 6:53 PM

Hi everybody,

I know that you probably will not believe this but we are planning the dbstore1002 migration to three new hosts: dbstore1003/4/5. The idea is to split the wiki replicas to multiple hosts, and finally deprecate dbstore1002 (we have a hard deadline for April 2019 since Ubuntu Trusty will be EOLed). The data persistence team proposed a layout in:

https://phabricator.wikimedia.org/T210478#4794536

The staging database will be migrated as it is to one of the new dbstore nodes. If you want to check the task and let me know your thoughts/suggestions/etc.. it would be really great :)

elukey added a comment.Dec 6 2018, 6:06 PM

To follow up what I wrote (after a chat with the data persistence team):

  • the proposal in T210478#4794536 would move sX sections (so the database groupings listed in s1.dblist, s2.dblist etc..) to their own mysql instance on an assigned dbstore node. For example, all wikis in S5 will be available (i.e. replicated) to a mysql instance on dbstore1003 (with an assigned port that we don't know yet). So joins between schemas belonging to different sX sections will not be possible anymore (we already knew this).
  • the staging database will likely be assigned to a separate mysql instance, so people will be able to keep using its data. It will still be possible to create tables etc.., but importing data from various wiki databases will need some extra work (dump the data, import it, etc..).

Would the above points be ok for everybody? Any special need or use case that is not taken into consideration?

Banyek added a subscriber: Banyek.Dec 7 2018, 3:18 PM
Banyek added a comment.Dec 7 2018, 3:23 PM

@elukey the ports are mapped from 3311 to 3318 along with the section names (eg. s3 will be on 3313, s5 on 3315 etc.)

@elukey where can I see the mappings of wikishared, log, and centralauth to sX?

@bmansurov:

  • log is not anymore on dbstore1002/analytics-store, but you can find it in analytics-slave (db1108)
  • centralauth should be s7
  • wikishared no idea (@Banyek can you help?)
Neil_P._Quinn_WMF added a comment.EditedDec 11 2018, 2:01 AM

To follow up what I wrote (after a chat with the data persistence team):

  • the proposal in T210478#4794536 would move sX sections (so the database groupings listed in s1.dblist, s2.dblist etc..) to their own mysql instance on an assigned dbstore node. For example, all wikis in S5 will be available (i.e. replicated) to a mysql instance on dbstore1003 (with an assigned port that we don't know yet). So joins between schemas belonging to different sX sections will not be possible anymore (we already knew this).
  • the staging database will likely be assigned to a separate mysql instance, so people will be able to keep using its data. It will still be possible to create tables etc.., but importing data from various wiki databases will need some extra work (dump the data, import it, etc..).

Would the above points be ok for everybody? Any special need or use case that is not taken into consideration?

Unfortunately, this would be a big problem for me and likely others in the Product Analytics team. I discussed some of the reasons already in T172410#4005226 and T172410#4272291, although that was more focused on what I'd miss if we deprecated the wiki replicas altogether.

The Data Lake still lacks a lot of data we need frequently, like edit tags (T161149), user email addresses (needed to prepare lists for surveys), user preferences (where, for example, the Growth team is storing users' answers to a new onboarding questionnaire), and relatively real-time edit data. For most of these, we're not just interested in one wiki, so we have to query all the wiki databases on the replicas and aggregate the results. This is common enough that many of us researchers have written our own scripts to do this (e.g. @Halfak's multiquery and my wmfdata.mariadb.multirun).

For analysis that only requires retrieving an aggregate value from each database, with enough effort we could modify our scripts to do this across multiple databases. But for analysis where we build large intermediate tables using insert...select queries (like my editor month table), we could run into even more difficult trying to dump and then reload multiple large datasets.

So splitting up these databases will cause us a major headache by removing a stable older tool when we're still trying to rebuild our process around and secure necessary improvements to newer, often still experimental tools like mediawiki_history and Superset.

I understand this is a reaction to very real performance issues, but could we please keep the current setup until we have a full replacement?

Marostegui removed a subscriber: Marostegui.
Marostegui added a subscriber: Marostegui.

To follow up what I wrote (after a chat with the data persistence team):

  • the proposal in T210478#4794536 would move sX sections (so the database groupings listed in s1.dblist, s2.dblist etc..) to their own mysql instance on an assigned dbstore node. For example, all wikis in S5 will be available (i.e. replicated) to a mysql instance on dbstore1003 (with an assigned port that we don't know yet). So joins between schemas belonging to different sX sections will not be possible anymore (we already knew this).
  • the staging database will likely be assigned to a separate mysql instance, so people will be able to keep using its data. It will still be possible to create tables etc.., but importing data from various wiki databases will need some extra work (dump the data, import it, etc..).

Would the above points be ok for everybody? Any special need or use case that is not taken into consideration?

Unfortunately, this would be a big problem for me and likely others in the Product Analytics team. I discussed some of the reasons already in T172410#4005226 and T172410#4272291, although that was more focused on what I'd miss if we deprecated the wiki replicas altogether.
The Data Lake still lacks a lot of data we need frequently, like edit tags (T161149), user email addresses (needed to prepare lists for surveys), user preferences (where, for example, the Growth team is storing users' answers to a new onboarding questionnaire), and relatively real-time edit data. For most of these, we're not just interested in one wiki, so we have to query all the wiki databases on the replicas and aggregate the results. This is common enough that many of us researchers have written our own scripts to do this (e.g. @Halfak's multiquery and my wmfdata.mariadb.multirun).
For analysis that only requires retrieving an aggregate value from each database, with enough effort we could modify our scripts to do this across multiple databases. But for analysis where we build large intermediate tables using insert...select queries (like my editor month table), we could run into even more difficult trying to dump and then reload multiple large datasets.
So splitting up these databases will cause us a major headache by removing a stable older tool when we're still trying to rebuild our process around and secure necessary improvements to newer, often still experimental tools like mediawiki_history and Superset.
I understand this is a reaction to very real performance issues, but could we please keep the current setup until we have a full replacement?

Thanks a lot for the details Neil, I am reaching out to people beforehand to have this kind of feedback and work with you to avoid as much as possible to impact your work, I hope that eventually we'll reach a good compromise between hardware/performance needs and user needs.

I am going to add some comments about the following question since I think it is the most important one:

I understand this is a reaction to very real performance issues, but could we please keep the current setup until we have a full replacement?

These are the main issues that SRE/Analytics are facing now:

  • dbstore1002 runs Ubuntu Trusty that is going EOL on April 2019, so we have a hard deadline to move it to a new Debian Stretch setup.
  • dbstore1002 is a out of warranty host that is showing up some signs of failure (disks in the RAID array breaking, etc..), so it would need to be replaced by a newer one (with warranty from the vendor). This means replicating the setup on a new (very beefy) host of course.
  • the current "dbstore" setup is called multisource, namely one mysql instance replicating a lot of databases from multiple sources. From the users point of view, it works nicely but on the SRE side it creates some challenges due to the big amount of data stored and replication throughput from the "wiki" production slaves. For example, we are currently storing close to 6TB of data on a single host, replicated with RAID with a data insertion rate that keeps growing over time. Simply throwing disks and buying new bigger hardware is not feasible/scalable in the medium/long term, at some point we will get again to an overloaded host that can't keep up with replication. The cost of maintenance (in terms of people working on it) will be high, more than what it is now (the Data Persistence in SRE is doing an incredible job behind the scenes on a daily basis for dbstore1002).

This is why we are trying to move all the dbstore-like systems to a new model, that can be scalable to multiple nodes very easily. And this is why the dream of the Analytics team is to move everybody (one day) to Hadoop only, that is born be scalable both in term of usage and space offered. As you mentioned we are not there yet, but hopefully working all together we'll reach an acceptable state during 2019 :)

Now let's discuss if we can find a compromise that can work with the need of aggregating data among multiple wikis. One possible solution to the problem could be to have a staging-like database on each mysql instance that will run on the new dbstore nodes, so aggregating results for each "group" of wikis should basically be what you are currently doing now. The extra step would then need to be aggregate these results in one place/report, that should be easier than dumping/moving data around from various wikis. Would this be a reasonable/workable solution or it is still not feasible? Is there any other solution/proposal that you have in mind that we could work together on to see if it can be applied to the new (proposed) architecture?

Thanks :)

@Neil_P._Quinn_WMF Any thoughts about the above proposal? (I know that it will not solve all the problems, just trying to figure out what is the best compromise for the new dbstore architecture right now :)

@elukey, thank you for that detailed explanation! As always, thank you for stewarding our infrastructure; I definitely understand the tight constraints you're working under. We're ready to work with you to find something that works for both of us 😁

For some extra context, back in February, I was the only analyst working on contributors metrics, so I was the only one who identified issues here. Now, since Audiences' main annual plan goal is growing new contributors and content, we're all affected by these issues.

To help you understand our needs, we want to start adding our specific issues as subtasks. Should we attach them directly here, or to a subtask like "Product Analytics needs for multisource replacement"?

  • dbstore1002 runs Ubuntu Trusty that is going EOL on April 2019, so we have a hard deadline to move it to a new Debian Stretch setup.

What does this mean in terms of the migration timeline? Will we be able to use this system through the end of April if necessary, or will you need lead time?

This is why we are trying to move all the dbstore-like systems to a new model, that can be scalable to multiple nodes very easily. And this is why the dream of the Analytics team is to move everybody (one day) to Hadoop only, that is born be scalable both in term of usage and space offered. As you mentioned we are not there yet, but hopefully working all together we'll reach an acceptable state during 2019 :)

This is actually the ideal solution for us too! We'd much rather solve our issues by getting the data available in the Data Lake, even if it requires more up-front time from us. We're mostly to point where we only use these MariaDB replicas when the Data Lake doesn't have the capabilities or data we need, and we're ready to invest time in getting even closer.

Now let's discuss if we can find a compromise that can work with the need of aggregating data among multiple wikis. One possible solution to the problem could be to have a staging-like database on each mysql instance that will run on the new dbstore nodes, so aggregating results for each "group" of wikis should basically be what you are currently doing now. The extra step would then need to be aggregate these results in one place/report, that should be easier than dumping/moving data around from various wikis. Would this be a reasonable/workable solution or it is still not feasible? Is there any other solution/proposal that you have in mind that we could work together on to see if it can be applied to the new (proposed) architecture?

That would probably be an improvement, although I honestly don't think it makes things that much easier since we still need to add a whole new layer to our maintenance scripts. But unless there's some way to setup a staging database that can access and be accessed from all the hosts, I can't really think of anything else that's better; I don't know much about the possibilities and as I mentioned improving the Data Lake is our ideal option!

To help you understand our needs, we want to start adding our specific issues as subtasks. Should we attach them directly here, or to a subtask like "Product Analytics needs for multisource replacement"?

Let's use this task that has already people watching it :)

  • dbstore1002 runs Ubuntu Trusty that is going EOL on April 2019, so we have a hard deadline to move it to a new Debian Stretch setup.

What does this mean in terms of the migration timeline? Will we be able to use this system through the end of April if necessary, or will you need lead time?

I would use the middle of March as soft deadline, since the EOL for Trusty is set for April but I don't see exactly when (so pessimistically I assume beginning, better safe than sorry). It is fine to keep using dbstore1002 up to then.

This is why we are trying to move all the dbstore-like systems to a new model, that can be scalable to multiple nodes very easily. And this is why the dream of the Analytics team is to move everybody (one day) to Hadoop only, that is born be scalable both in term of usage and space offered. As you mentioned we are not there yet, but hopefully working all together we'll reach an acceptable state during 2019 :)

This is actually the ideal solution for us too! We'd much rather solve our issues by getting the data available in the Data Lake, even if it requires more up-front time from us. We're mostly to point where we only use these MariaDB replicas when the Data Lake doesn't have the capabilities or data we need, and we're ready to invest time in getting even closer.

Super :)

Now let's discuss if we can find a compromise that can work with the need of aggregating data among multiple wikis. One possible solution to the problem could be to have a staging-like database on each mysql instance that will run on the new dbstore nodes, so aggregating results for each "group" of wikis should basically be what you are currently doing now. The extra step would then need to be aggregate these results in one place/report, that should be easier than dumping/moving data around from various wikis. Would this be a reasonable/workable solution or it is still not feasible? Is there any other solution/proposal that you have in mind that we could work together on to see if it can be applied to the new (proposed) architecture?

That would probably be an improvement, although I honestly don't think it makes things that much easier since we still need to add a whole new layer to our maintenance scripts. But unless there's some way to setup a staging database that can access and be accessed from all the hosts, I can't really think of anything else that's better; I don't know much about the possibilities and as I mentioned improving the Data Lake is our ideal option!

@Neil_P._Quinn_WMF I have a proposal - would it be ok for you (and your team) to open subtasks for each major issue related to the dbstore replacement asap and then have a quick meeting to discuss how to work together to meet the March deadline? I don't want to rush work before holidays but at the same time it would be great to know what to do sooner rather than later (SRE paranoia I know :)).

Neil_P._Quinn_WMF added a comment.EditedDec 17 2018, 10:26 PM

I would use the middle of March as soft deadline, since the EOL for Trusty is set for April but I don't see exactly when (so pessimistically I assume beginning, better safe than sorry). It is fine to keep using dbstore1002 up to then.

It looks like the official date is actually April 30 ("Ubuntu 14.04 LTS – ESM will become available once Ubuntu 14.04 reaches its End of Life on April 30, 2019."). So perhaps the middle of April is a more accurate soft deadline?

@Neil_P._Quinn_WMF I have a proposal - would it be ok for you (and your team) to open subtasks for each major issue related to the dbstore replacement asap and then have a quick meeting to discuss how to work together to meet the March deadline? I don't want to rush work before holidays but at the same time it would be great to know what to do sooner rather than later (SRE paranoia I know :)).

Yes, that sounds good! I've scheduled a meeting with you for Wednesday, and in the meantime I'll work on writing down our requests.

We discussed this in a team meeting today and we've actually shifted the focus of our requests slightly. Although we definitely agree that the Data Lake is the long-term solution, we still have enough unmet needs that it's not realistic for you to address all of them in the next few months. So we will still need the MariaDB replicas for the foreseeable future.

So, our main need here is not additional data in the Data Lake (although we definitely still want that!) but a tool that allows us to run queries on the replicas without having to think about which wiki is stored on which shard. It seems like we're pretty agnostic about how that's achieved: it could be a MariaDB host that acts as a "magical query forwarder" (in the words of @mpopov), or Python and R packages that figure out internally which shard to use, or something else entirely. This solution wouldn't need to provide for cross-wiki joins or a single staging database accessible from all shards; we are willing to handle those use cases in our own code as long as we have the core ability to run shard-agnostic queries.

Hopefully, this sounds reasonable to you 😁 We can discuss more on Wednesday.

Neil_P._Quinn_WMF renamed this task from Phase out and replace analytics-store (multisource) to Replace the current multisource analytics-store setup.Dec 19 2018, 11:46 PM
Neil_P._Quinn_WMF updated the task description. (Show Details)
jcrespo updated the task description. (Show Details)Jan 10 2019, 1:56 PM
Neil_P._Quinn_WMF added a comment.EditedJan 16 2019, 1:48 AM

Product Analytics met with @elukey and @Milimetric on Monday to further discuss our needs around this changeover (full notes).

Overall, we found that our ideas for a query forwarding tool (T212386) were compatible and Analytics agreed to work on providing such a tool before the shutdown of dbstore1002 (based of course on our help clearly defining how we'd like the tool to work).

So, as a team, we've identified our two main blockers for this transition:

Analytics will doing both, so based on that, we feel comfortable saying goodbye to dbstore1002 by the end of April. Cheers to better infrastructure and fewer headaches for Ops folks! 🎉

Nuria added a comment.Jan 16 2019, 1:45 PM

Analytics agreed to work on providing such a tool before the shutdown of dbstore1002 (based of course on our help clearly defining how we'd like the tool to work).

Let me clarify: our biggest priority is that the hardware gets replaced before end of life. We are going to try to install in the replicas the tools that we now have in production that work across shards, we think those will work but we are not 100% sure until we install them. It might very well be that there is a period in which the hardware has been moved to the multi host setup but yet there is no tool to access shards in a shard agnostic way. Let's please keep this is mind. We will have an update for you guys probably by mid February, until then we will just be working on the replacement.

Neil_P._Quinn_WMF added a comment.EditedJan 16 2019, 6:32 PM

Analytics agreed to work on providing such a tool before the shutdown of dbstore1002 (based of course on our help clearly defining how we'd like the tool to work).

Let me clarify: our biggest priority is that the hardware gets replaced before end of life. We are going to try to install in the replicas the tools that we now have in production that work across shards, we think those will work but we are not 100% sure until we install them. It might very well be that there is a period in which the hardware has been moved to the multi host setup but yet there is no tool to access shards in a shard agnostic way. Let's please keep this is mind. We will have an update for you guys probably by mid February, until then we will just be working on the replacement.

I definitely understand that you can't make any ironclad guarantees when you're under a hard end-of-life constraint and have many responsibilities (that's why I wrote that y'all agreed to work on it 😁).

However, I did understand from @Milimetric and @elukey (who said they were on the same page as you) that y'all had accepted it as a real priority and would try hard to make it happen. @Milimetric literally said he considered it a blocker; I didn't take that as an ironclad guarantee, but I did understand it as a serious commitment to try hard and (for example) try other options if the production script doesn't work out.

Is that accurate? If not, then we really will need to have another meeting directly with you so we can avoid misunderstandings.

Nuria added a comment.Jan 16 2019, 7:49 PM

I definitely understand that you can't make any ironclad guarantees when you're under a hard end-of-life constraint (that's why I wrote that y'all agreed to work on it 😁).

Right, no iron clad guarantees. I clarifying that our top priority is to replace the hardware that is failing continuously (https://phabricator.wikimedia.org/T213670) and other requests we can consider after that work is done.

I definitely understand that you can't make any ironclad guarantees when you're under a hard end-of-life constraint (that's why I wrote that y'all agreed to work on it 😁).

Right, no iron clad guarantees. I clarifying that our top priority is to replace the hardware that is failing continuously (https://phabricator.wikimedia.org/T213670) and other requests we can consider after that work is done.

I definitely agree that dealing with the infrastructure fires are your top priority! No complaints there.

But while I agree that it's not your very top priority, I would like to understand better where it does fit among your many priorities. In particular, I wanted to know whether you agreed with the following summary (assuming for the sake of dicussion that there does not need to be an emergency change to the end-of-April migration timeline because of server failures).

However, I did understand from @Milimetric and @elukey (who said they were on the same page as you) that y'all had accepted it as a real priority and would try hard to make it happen. @Milimetric literally said he considered it a blocker; I didn't take that as an ironclad guarantee, but I did understand it as a serious commitment to try hard and (for example) try other options if the production script doesn't work out.

Is this accurate to you, or should we need to have an in-person meeting to clarify further?

I discussed this with @Nuria in a meeting today and she clarified that T212386 is important work to Analytics Engineering and said the team plans to try installing the production SQL query script on the new hardware once it's set up (probably in mid-February). However, she can't make any commitments about what the team might do after that.

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

The echo tables are on x1. x1 is a separate instance in production, and hence on the new dbstore model.
They are available on the old analytics because it has all the instances mixed.

Echo extension lives in x1, not on enwiki.

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

Echo is available in X1, that has a special shard, reachable via x1-analytics-replica.eqiad.wment. Let me know if it works!

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

The echo tables are on x1. x1 is a separate instance in production, and hence on the new dbstore model.
They are available on the old analytics because it has all the instances mixed.

Echo extension lives in x1, not on enwiki.

I just noticed that the tables related to the Echo extension are (surprisingly) not yet available in the enwiki shard (s1-analytics-replica.eqiad.wmnet), but are in analytics-store.eqiad.wmnet. Is there a page we can refer to to check on parity/status of data availability?

Echo is available in X1, that has a special shard, reachable via x1-analytics-replica.eqiad.wment. Let me know if it works!

Thank you all very much for clarifying! :)

While this may look like an annoyance, we don't usually talk about the things that this change improved:

  • x1 is now finally fully available, while before only certain tables were
  • every time a new wiki was created, it broke dbstore1002
  • If for some reason the new dbstore hosts break, reimport from 0 should take no more than 1-2 hours, increasing the overall uptime (vs multiple days).
  • Lag should be close to 0 at all times

While this may look like an annoyance, we don't usually talk about the things that this change improved:

  • x1 is now finally fully available, while before only certain tables were
  • every time a new wiki was created, it broke dbstore1002
  • If for some reason the new dbstore hosts break, reimport from 0 should take no more than 1-2 hours, increasing the overall uptime (vs multiple days).
  • Lag should be close to 0 at all times

I say all of us buy Luca, Manuel, and Jaime a round of whatever they want next time we're all hanging out. They went through a lot to keep the old system running and migrate to the new one as smoothly as possible.

Yup, thank you DBA crew! Your work is very much appreciated.

Marostegui added a subtask: Restricted Task.Feb 22 2019, 11:54 AM

Indeed many many thanks yall!

elukey moved this task from Stalled to Done on the User-Elukey board.Feb 27 2019, 9:17 AM

Mentioned in SAL (#wikimedia-operations) [2019-03-04T06:46:20Z] <marostegui> Stop MySQL on dbstore1002 for decommission T210478 T172410 T216491 T215589

MySQL has been stopped on dbstore1002 and won't be started again, as this host will be decommissioned

Can we resolve this already? I am guessing there may be many followups, but technically this has been done already? Subtasks can be left open as they don't seem to be hard blockers.

Marostegui closed this task as Resolved.Mar 18 2019, 11:53 AM

Closing as this is done

For the record: Information about how to work with the new setup was added to https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas .