
Improve eventlogging replication procedure
Closed, ResolvedPublic5 Estimated Story Points

Description

Eventlogging databases (m4 shard): db1046 (m4-master), db1047 (analytics slave 1), dbstore1002 (analytics slave 2), and dbstore2002 (Dallas backup) use a custom replication mechanism for several reasons:

  • Regular mysql replication is too slow and unsuitable for large batches of data
  • Purging is inefficient over the network
  • Especially over WAN, things get very slow
  • If replication stops, it is almost impossible to get them back in sync again
  • Analytics slaves are IO-saturated due to the large number of long-running queries, combined with having data from 8+ shards on a single physical machine (needed to run JOINs)

The current solution is a script (https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/files/mariadb/eventlogging_sync.sh) that lacks several improvements it could have (see the rough sketch after this list), namely:

  • parallel replication of several tables at the same time
  • import and export using LOAD DATA, which is faster than parsing SQL commands
  • Using a 3rd server to offload the process, minimizing the time spent on the mysql servers
  • Using actual temporary files for batches, instead of OS unnamed pipes, which eventually fail because they lock the master for too long
  • Configurable purging
  • Monitoring of the process
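
A rough, hypothetical sketch of what such a sync could look like (this is not the actual eventlogging_sync.sh; host, database and table names are invented, credentials are assumed to live in ~/.my.cnf, and NULL/escaping handling is omitted for brevity):

```bash
#!/bin/bash
# Hypothetical per-table sync run from a third host, with real temporary
# files and parallel workers. All names below are made up for illustration.
MASTER="m4-master.example.org"
SLAVE="analytics-slave.example.org"
DB="log"
TABLES="SchemaA_12345 SchemaB_67890"
TMPDIR="$(mktemp -d)"   # actual temporary files instead of unnamed pipes

sync_table() {
    local t="$1"
    # Export in batch mode (tab-separated rows) to a local file, so the
    # master is only busy for the duration of the SELECT.
    mysql -h "$MASTER" --batch --skip-column-names "$DB" \
        -e "SELECT * FROM $t" > "$TMPDIR/$t.tsv"
    # Import with LOAD DATA instead of parsing one INSERT per row.
    mysql -h "$SLAVE" --local-infile=1 "$DB" \
        -e "LOAD DATA LOCAL INFILE '$TMPDIR/$t.tsv' INTO TABLE $t"
}

# Replicate several tables at the same time.
for t in $TABLES; do
    sync_table "$t" &
done
wait
rm -rf "$TMPDIR"
```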

Event Timeline

jcrespo raised the priority of this task from to Needs Triage.
jcrespo updated the task description. (Show Details)
jcrespo added projects: Analytics, SRE, DBA.
jcrespo subscribed.
Milimetric triaged this task as Medium priority. Jan 21 2016, 6:25 PM
Milimetric set Security to None.
Milimetric moved this task from Incoming to Analytics Query Service on the Analytics board.

@Marostegui ok! So the T125135 auto-increment thing is a very small piece of this larger issue.

Let's see if we can hammer out a way to use regular MySQL replication before we think about making eventlogging_sync.sh better.

Regular mysql replication is too slow and unsuitable for large batches of data

Is the issue large batch inserts, or is the issue just too many inserts? We can do either. IIRC, EventLogging was optimized to do large batch inserts to make things easier on the master. There's no reason we couldn't revert to doing individual inserts, or smaller batch inserts, if that would help. Would it?

@Marostegui ok! So the T125135 auto-increment thing is a very small piece of this larger issue.

Let's see if we can hammer out a way to use regular MySQL replication before we think about making eventlogging_sync.sh better.

It would be great :-)

Regular mysql replication is too slow and unsuitable for large batches of data

Is the issue large batch inserts, or is the issue just too many inserts? We can do either. IIRC, EventLogging was optimized to do large batch inserts to make things easier on the master. There's no reason we couldn't revert to doing individual inserts, or smaller batch inserts, if that would help. Would it?

The issue is normally large batches of inserts (not sure if you also do DELETEs, but those are even harder for MySQL; big DELETEs normally cause replication lag). Normally a high number of INSERTs is fine (as the master will probably have it all in memory), but depending on what "high number" means it could become a problem; I guess we are not at that point just yet.

As Jaime mentions in the original post, using LOAD DATA can also be a benefit here for inserting large amounts of data.
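
To illustrate the point above about big DELETEs causing lag, purges are usually chunked so each statement stays small; a hypothetical sketch (host, database, table, column and retention values are invented):

```bash
#!/bin/bash
# Delete expired rows in small chunks so each statement replicates quickly,
# instead of one huge DELETE that lags the slaves.
HOST="m4-master.example.org"
DB="log"
TABLE="SchemaA_12345"

while true; do
    deleted=$(mysql -h "$HOST" --batch --skip-column-names "$DB" -e "
        DELETE FROM $TABLE WHERE timestamp < NOW() - INTERVAL 90 DAY LIMIT 1000;
        SELECT ROW_COUNT();")
    [ "$deleted" -eq 0 ] && break
    sleep 1   # give replication a chance to catch up between chunks
done
```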

@mforns, can you comment about large DELETEs? Do they happen often? How large are they when they happen?

@Marostegui, would LOAD DATA actually help replication?

@Marostegui, would LOAD DATA actually help replication?

If you need to do massive data imports into the DB, it will help: if you have the file ordered by PK, it can be loaded into the DB a lot faster than parsing all the SQL commands (as Jaime also stated in the original post).

EventLogging is a stream of data. We can do batching because the data is consumed from Kafka, and then inserted into MySQL via a python MySQL client. So we could consume periodically, or wait until N messages are consumed before inserting. Constructing a file and ordering by primary key (I'm not sure what primary key would be, other than an auto-increment id) would be a little hacky. But I'm confused. LOAD DATA is inherently batch, right? Wouldn't that hinder the replication process?

LOAD DATA is a lot faster for bulk-loading lots of data into the DB; there is a lot less overhead from parsing SQL statements and all the processing around that parsing.

This is an interesting blog post from @jcrespo where you can see some performance benchmarks: https://dbahire.com/testing-the-fastest-way-to-import-a-table-into-mysql-and-some-interesting-5-7-performance-results/
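
For illustration only (invented host and table names, and an auto-increment `id` column assumed), exporting ordered by the primary key and bulk-loading on the slave could look like:

```bash
# Export a table ordered by its primary key into a tab-separated file.
mysql -h m4-master.example.org --batch --skip-column-names log \
    -e "SELECT * FROM SchemaA_12345 ORDER BY id" > /tmp/SchemaA_12345.tsv

# Bulk-load it; IGNORE skips rows whose primary key already exists on the slave.
mysql -h analytics-slave.example.org --local-infile=1 log \
    -e "LOAD DATA LOCAL INFILE '/tmp/SchemaA_12345.tsv' IGNORE INTO TABLE SchemaA_12345"
```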

@Ottomata: we do not delete data from eventlogging (other than the purging that should happen after 90 days); the system just inserts batches of records. cc @mforns

purging that should happen after 90 days

How do you implement purging? That surely must run deletes or some kind of updates?

Nuria edited projects, added Analytics-Kanban; removed Analytics.

Let's take advantage of the fact that after the rename we now have auto-increment ids on new tables.
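
For example, with an auto-increment id the sync only needs to copy the rows the slave does not have yet; a hypothetical sketch (host, database and table names are invented, and `id` is assumed to be the auto-increment primary key):

```bash
#!/bin/bash
# Incremental copy built on the auto-increment ids.
MASTER="m4-master.example.org"
SLAVE="analytics-slave.example.org"
DB="log"
TABLE="SchemaA_12345"

# Highest id the slave already has (0 if the table is empty).
max_id=$(mysql -h "$SLAVE" --batch --skip-column-names "$DB" \
    -e "SELECT COALESCE(MAX(id), 0) FROM $TABLE")

# Copy only the rows the slave is missing, in primary-key order.
mysql -h "$MASTER" --batch --skip-column-names "$DB" \
    -e "SELECT * FROM $TABLE WHERE id > $max_id ORDER BY id" > "/tmp/$TABLE.tsv"

mysql -h "$SLAVE" --local-infile=1 "$DB" \
    -e "LOAD DATA LOCAL INFILE '/tmp/$TABLE.tsv' INTO TABLE $TABLE"
```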

Change 345646 had a related patch set uploaded (by Ottomata):
[operations/puppet@production] Improvements to eventlogging_sync.sh script

https://gerrit.wikimedia.org/r/345646

Change 345646 merged by Ottomata:
[operations/puppet@production] Improvements to eventlogging_sync.sh script

https://gerrit.wikimedia.org/r/345646

Change 346541 had a related patch set uploaded (by Ottomata):
[operations/puppet@production] Properly default to master database name when slave database not given

https://gerrit.wikimedia.org/r/346541

Change 346541 merged by Ottomata:
[operations/puppet@production] Properly default to master database name when slave database not given

https://gerrit.wikimedia.org/r/346541

Nuria set the point value for this task to 5.