Migrate users to dbstore100[3-5]
Closed, Resolved · Public · 0 Story Points

Description

This is a tracking task to list and manage all the users of dbstore1002 and coordinate the migration to the new dbstore100[3-5] setup.

https://wikitech.wikimedia.org/wiki/Analytics/Data_access#MariaDB_replicas describes what the new setup looks like. For any questions, please follow up with the Analytics team :)

List of users and status of the migration:

  • Research team
  • Scoring team
  • Analytics Report Updater
  • Analytics monthly Sqoop scripts
  • WMDE scripts
  • Product Analytics team

DEPRECATION WARNING: dbstore1002 is going to be decommissioned on March 4th

The Analytics team has been working with the SRE Data Persistence team over the last few months to replace dbstore1002 with three brand new nodes, dbstore100[3-5]. We are moving from a single MySQL instance (multi-source) to a multi-instance environment.
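
To make the practical difference concrete: with a multi-source host, one connection can reach every database, while in a multi-instance environment each section is served by its own instance, so a client first has to work out which host and port to connect to. The sketch below only illustrates that lookup. The host and port assignments, the `Instance` type and the `resolve` helper are all hypothetical; the wikitech page linked above is the place to check the real layout.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    host: str
    port: int


# Hypothetical layout: hosts and ports are illustrative assumptions,
# not the real assignment of sections to dbstore100[3-5].
SECTION_INSTANCES = {
    "s1": Instance("dbstore1003", 3311),
    "s8": Instance("dbstore1005", 3318),
    "staging": Instance("dbstore1005", 3350),
}

# Which section hosts which wiki database (small illustrative subset).
WIKI_SECTIONS = {"enwiki": "s1", "wikidatawiki": "s8"}


def resolve(database: str) -> Instance:
    """Return the instance that serves `database` under the mapping above."""
    # Databases that are not wikis (e.g. staging) are looked up by their own name.
    section = WIKI_SECTIONS.get(database, database)
    try:
        return SECTION_INSTANCES[section]
    except KeyError:
        raise LookupError(f"no instance known for {database!r}") from None
```

A script that used to open one connection to dbstore1002 would instead call something like `resolve("wikidatawiki")` and open one connection per instance it needs.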

For more info please check:

We are planning to decommission the dbstore1002 host (namely stopping MySQL and shutting down the server) on Monday March 4th (EU morning). We have recently been following up with a lot of users to help them migrate to the new environment, so we are reasonably sure that this move will not heavily impact anybody, but if we have overlooked a use case please let us know in https://phabricator.wikimedia.org/T215589. If we don't hear anything before the March 4th deadline, we'll proceed with the host decommission maintenance.

Event Timeline

elukey created this task. Feb 8 2019, 8:39 AM
elukey triaged this task as High priority.
Restricted Application added a subscriber: Aklapper. Feb 8 2019, 8:39 AM

@leila @Halfak Hi! The new dbstore100[3-5] hosts are ready, so I'd ask your teams to start using them and see what's missing, not working, etc. Let me know!

Marostegui edited projects, added User-Marostegui; removed DBA. Feb 8 2019, 8:56 AM
leila added a comment. Feb 8 2019, 7:51 PM

@elukey notified the team.

I haven't used dbstore1002 so all good on my end.

Message to everybody:

Analytics and the Data Persistence team are planning to schedule the official cut off date for the staging database on dbstore1002 for Monday 18th (US holiday) during the EU morning. This is the plan:

  • set the staging database as read only on dbstore1002
  • dump the database and import it to dbstore1005 (takes hours to complete)
  • set dbstore1005 as read/write and as the officially supported staging database
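
As a rough illustration of the three steps above, the sketch below builds the corresponding command lines as strings without running anything. It is only an assumption about how such a cutover could be scripted, not the procedure actually used: the `staging_cutover_plan` helper and the dump file name are hypothetical, connection details such as ports and credentials are left out, and `read_only` is a server-wide variable in MySQL/MariaDB, so the first step affects the whole source instance rather than a single database.

```python
import shlex


def staging_cutover_plan(source: str, target: str,
                         database: str = "staging",
                         dump_file: str = "staging.sql") -> list[str]:
    """Return the ordered shell commands for a dump-and-import cutover.

    Nothing is executed here; the caller decides how (and whether) to run them.
    """
    src, dst = shlex.quote(source), shlex.quote(target)
    db, dump = shlex.quote(database), shlex.quote(dump_file)
    return [
        # 1. Stop writes on the source (read_only is a global flag).
        f"mysql -h {src} -e 'SET GLOBAL read_only = 1'",
        # 2. Dump the database; --databases adds CREATE DATABASE/USE to the dump.
        f"mysqldump -h {src} --single-transaction --databases {db} > {dump}",
        # 3. Import the dump on the target.
        f"mysql -h {dst} < {dump}",
        # 4. Make sure the target accepts writes again.
        f"mysql -h {dst} -e 'SET GLOBAL read_only = 0'",
    ]
```

For example, `staging_cutover_plan("dbstore1002", "dbstore1005")` returns four commands in the order they would need to run; the dump and import steps are the ones expected to take hours.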

Important considerations:

  • dbstore1002 will not be decommissioned or stopped during this maintenance window, and all the databases/tables will remain available. The only difference will be that the staging database will be read-only.
  • we are doing this to allow people to migrate their (stateful) scripts to the new system and avoid losing data. Basically we want to have a clean date/time of the last snapshot taken for staging.

Please let us know if this is too soon or impossible to coordinate/sustain, and we'll find another time. We are also going to send an email to alert people about this change.

Read only time would be around 16h (T210478#4942371)

Hey @elukey! Sorry for the delay in responding. I actually have an issue with this (although I've checked and nobody else on Product Analytics does).

For my monthly movement metrics reporting, I aggregate data from all the wiki databases to build an editor-month table on staging. Then I query that intermediate table to easily calculate global metrics like the number of active editors or mobile edits across all projects.

This change would block this workflow, which I can't replicate on the new setup. I definitely want to move these calculations to the Data Lake, but for some of the calculations (mainly mobile edit numbers) I need change tags, which aren't in mediawiki_history (T161149). I could use the revision_tags_create event stream, but since the tags should arrive in mediawiki_history soon, I'd prefer to wait and just do one rewrite rather than two (although if there's an urgent need, it honestly wouldn't be that bad 😁).
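
For readers unfamiliar with this kind of report, the sketch below shows the shape of the cross-wiki step: per-wiki editor-month rows are merged and then reduced to a global count per month. It is not the actual reporting code. The row layout, the `global_active_editors` helper and the threshold of 5 edits per month are assumptions made for illustration, and it assumes the same user name refers to the same person on every wiki.

```python
from collections import defaultdict
from typing import Iterable

# One row per (wiki, month, user): hypothetical layout of an editor-month table.
Row = tuple[str, str, str, int]  # (wiki, month, user, edit_count)


def global_active_editors(rows: Iterable[Row], min_edits: int = 5) -> dict[str, int]:
    """Count users per month whose edits, summed over all wikis, reach min_edits."""
    # Sum each user's edits across wikis, per month.
    edits: dict[tuple[str, str], int] = defaultdict(int)
    for _wiki, month, user, count in rows:
        edits[(month, user)] += count

    # Count the users who meet the (assumed) activity threshold in each month.
    active: dict[str, int] = defaultdict(int)
    for (month, _user), total in edits.items():
        if total >= min_edits:
            active[month] += 1
    return dict(active)
```

On a single multi-source host this merge can be done inside the database; when each wiki lives on a different instance, the per-wiki rows have to be fetched separately and combined somewhere else, which is the part that no longer fits the existing workflow.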

@Neil_P._Quinn_WMF Hi!
We plan to release the raw change_tags table next month (February snapshot, released at the beginning of March). However, the data will probably not be integrated into mediawiki_history before the following snapshot (it might be, but there is a high risk that it won't). Would it be possible for you to use the raw data to generate your report?

@Neil_P._Quinn_WMF if the answer to the above question is no, I'd proceed anyway with the staging migration, but Monday is probably too soon for you to figure all this out. I'll postpone the migration to Wednesday 20th (for the moment); let's sync when you are online.

Nuria added a subscriber: Nuria. Feb 15 2019, 5:42 PM

Per IRC conversation we are good to proceed with the migration on Monday the 18th. As @JAllemandou mentioned, the next Sqoop run should include the change tag table even if the change tag data is not yet in the denormalized tables.

Yup, that's correct. @elukey offered to create another staging database for me (which would allow me to keep following my current workflow, while still pushing other users to move to the new machines). However, he also pointed out that dbstore1002 is having serious issues (T213670) and so it could potentially fail on me in the middle of my work.

So I've decided I should bite the bullet and move my calculations to the Data Lake now.

Another staging database where? Just to clarify: dbstore1002 will be fully read-only after the migration (MySQL doesn't allow setting read-only at the database level; it is a global flag).

Ah, never mind my comment, you decided to completely move away from dbstore1002 :-)
Thanks!

Mentioned in SAL (#wikimedia-operations) [2019-02-18T05:52:22Z] <marostegui> Set dbstore1002 on read only to start the migration T210478 T215589

For what it's worth, dbstore1002 is now lagging 7 days behind on s8 (wikidatawiki) and the lag keeps growing. I doubt it will ever catch up.

elukey updated the task description. Feb 22 2019, 11:07 AM
Marostegui updated the task description. Feb 22 2019, 11:08 AM
Marostegui updated the task description.

MySQL has been stopped on dbstore1002 and won't be started again, as this host will be decommissioned.

elukey updated the task description. Mar 12 2019, 7:07 AM
elukey set the point value for this task to 0.
elukey closed this task as Resolved.