Migrate MySQLs to use ROW-based replication
Closed, Resolved · Public

Assigned To

None

Authored By

	jcrespo
	Aug 15 2015, 2:48 PM

Description

Row based replication has some important advantages:

In most cases it minimizes replication lag (which we sometimes suffer on some hosts)
It minimizes slave drift, which we are suffering now
It is more prone to break replication if a schema or data difference is detected (fails faster)
It usually ends up with reduced contention, better performance and less locking needed
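
As a rough illustration of the drift and fail-fast points above, here is a toy Python model. It is not MySQL code, and every name in it is hypothetical. A "statement" is re-executed on each host and may pick up host-local state, while a "row event" ships the before and after images of each changed row and refuses to apply when the before image does not match.

```python
def apply_statement(table, local_value):
    """Statement-style change: each host re-executes the change itself.

    local_value stands in for anything that can differ between hosts
    (row order, host-local state, an unsafe non-deterministic function).
    """
    for row_id in table:
        table[row_id] = local_value


def row_events(before, after):
    """Row-style change: one (id, before_image, after_image) per changed row."""
    return [(rid, before[rid], after[rid])
            for rid in sorted(after) if before[rid] != after[rid]]


def apply_row_events(table, events):
    """Apply row images, failing fast if a replica row differs from its before image."""
    for rid, old, new in events:
        if table.get(rid) != old:
            raise RuntimeError(f"row {rid}: expected {old!r}, found {table.get(rid)!r}")
        table[rid] = new


master = {1: 0, 2: 0, 3: 0}
replica_sbr = dict(master)
replica_rbr = dict(master)
already_drifted = {1: 0, 2: 99, 3: 0}

before = dict(master)
apply_statement(master, local_value=100)       # executed on the master
events = row_events(before, master)

apply_statement(replica_sbr, local_value=101)  # re-executed on the replica, silently diverges
apply_row_events(replica_rbr, events)          # the replica copies the master's row images

print(replica_sbr == master)  # False
print(replica_rbr == master)  # True

try:
    apply_row_events(already_drifted, events)
except RuntimeError as err:
    print("replication would stop:", err)
```

The last call shows the "fails faster" behaviour: a replica that had already drifted is detected at apply time instead of diverging further.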

It has some disadvantages:

Increased size of binary logs, which affects both the disk space and the bandwidth needed. In edge cases (large blobs), it could impact binary log performance when written to disk (configuration should be tuned)
It makes it difficult to do one-host-at-a-time schema changes (which is the main way we do them for mediawiki core hosts right now)
It is more prone to break replication if a schema or data difference is detected (fails faster); this could be a curse or a blessing
Performance may not be great if slaves do not have proper primary keys; on the other hand, it can improve performance
It requires an external tool (mysqlbinlog) in order to know the underlying ongoing queries (for example, if they are stuck): changes are not shown in SHOW PROCESSLIST by the system threads
~~Sanitarium requires statement-based binary logs in order to filter rows to labsdbs~~ Not anymore, since triggers can run on the replica side from 10.1+ (row based replication triggers)
It makes it impossible or more difficult to use some tools like pt-table-checksum, especially on multi-tiered setups (for codfw, for labs)
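
To get a feel for the binary log size point above, here is a back-of-the-envelope sketch in Python. It assumes a statement event costs roughly the length of its SQL text, and a row event costs roughly one before image plus one after image per changed row. Event headers and compression are ignored, and the query, row count and row size are made-up illustrative numbers, not measurements.

```python
def statement_event_bytes(sql: str) -> int:
    """Approximate size of a statement event: just the SQL text."""
    return len(sql.encode("utf-8"))


def row_event_bytes(rows_changed: int, avg_row_bytes: int, images_per_row: int = 2) -> int:
    """Approximate size of row events: full before and after image per changed row."""
    return rows_changed * avg_row_bytes * images_per_row


# Hypothetical bulk update touching one million rows of about 120 bytes each.
sql = "UPDATE page SET page_touched = '20150815144800' WHERE page_namespace = 0"
stmt_bytes = statement_event_bytes(sql)
rows_bytes = row_event_bytes(rows_changed=1_000_000, avg_row_bytes=120)

print(f"statement: {stmt_bytes} bytes, row: {rows_bytes} bytes")
print(f"row events are about {rows_bytes // stmt_bytes}x larger")
```

For single-row writes the two formats are of the same order of magnitude under this model; the gap only opens up for statements that touch many rows.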

This ticket is to decide if this change is worth it, how to do it, where (maybe not all servers require it), when and what blockers there are.

Details

	Subject	Repo	Branch	Lines +/-
	Add additional variable binlog_format that can be used on templates	operations/puppet/mariadb	master	+10 -9

Related Objects
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here.

Status	Assigned	Task
		· · ·
Resolved	aaron	T95501 Fix causes of replica lag and get it to under 5 seconds at peak
Resolved	• RobLa-WMF	T112637 RFC: Increase the strictness of mediawiki SQL code and leverage database code blockers for scalability
Resolved	None	T109179 Migrate MySQLs to use ROW-based replication
Resolved	LSobanski	T17441 Some tables lack unique or primary keys, may allow confusing duplicate data
Resolved	jcrespo	T120122 Perform a rolling restart of all MySQL slaves (masters too for those services with low traffic)
Resolved	jcrespo	T121207 Implement slave_run_triggers_for_rbr at sanitarium for labs filtering
Resolved	jcrespo	T122429 Batch updates create slave lag on s3 over WAN
		· · ·

Event Timeline

jcrespo created this task. Aug 15 2015, 2:48 PM

jcrespo raised the priority of this task from to Needs Triage.

jcrespo updated the task description.

jcrespo added projects: Tracking-Neverending, acl*sre-team, DBA.

jcrespo added subscribers: jcrespo, • Springle.

Restricted Application added subscribers: Matanya, Aklapper. Aug 15 2015, 2:48 PM

jcrespo mentioned this in T106647: mariadb multi-source replication glitch with site_identifiers. Aug 15 2015, 2:49 PM

PleaseStand subscribed. Aug 17 2015, 5:35 AM

jcrespo mentioned this in T108033: duplicate key error on db1056. Aug 18 2015, 9:53 AM

jcrespo added a subtask: T17441: Some tables lack unique or primary keys, may allow confusing duplicate data. Aug 19 2015, 8:20 AM

Assigning to @jcrespo as the man for the job

jcrespo updated the task description. Aug 25 2015, 1:04 PM

jcrespo set Security to None.

jcrespo added a parent task: T95501: Fix causes of replica lag and get it to under 5 seconds at peak. Aug 26 2015, 9:23 PM

jcrespo mentioned this in T95501: Fix causes of replica lag and get it to under 5 seconds at peak. Aug 26 2015, 9:37 PM

Legoktm subscribed. Aug 27 2015, 1:23 AM

jcrespo mentioned this in T111371: Potential templatelinks data integrity issue on Tool Labs' enwiki_p. Sep 7 2015, 7:24 PM

• MZMcBride subscribed. Sep 8 2015, 1:41 PM

jcrespo mentioned this in T108255: Enable MariaDB/MySQL's Strict Mode. Sep 13 2015, 5:19 PM

Ricordisamoa subscribed. Sep 15 2015, 7:24 AM

jcrespo added a parent task: T112637: RFC: Increase the strictness of mediawiki SQL code and leverage database code blockers for scalability. Sep 15 2015, 9:42 AM

Platonides subscribed. Sep 18 2015, 1:04 AM

Glaisher subscribed. Sep 30 2015, 11:02 AM

I would like to do a full-scale test of this feature on codfw to validate it and test its configuration (sadly binlog_row_image is not available until 10.1.6).

jcrespo added a subtask: T120122: Perform a rolling restart of all MySQL slaves (masters too for those services with low traffic). Dec 2 2015, 7:48 PM

Change 256660 had a related patch set uploaded (by Jcrespo):
Add additional variable binlog_format that can be used on templates

https://gerrit.wikimedia.org/r/256660

gerritbot added a project: Patch-For-Review. Dec 3 2015, 11:06 AM

Change 256660 merged by Jcrespo:
Add additional variable binlog_format that can be used on templates

https://gerrit.wikimedia.org/r/256660

jcrespo added a subtask: T121207: Implement slave_run_triggers_for_rbr at sanitarium for labs filtering. Dec 11 2015, 10:20 AM

jcrespo removed a project: Patch-For-Review.

Enwiki on codfw is now using ROW-based replication as a test, to check for regressions and to compare its performance with mixed (mostly statement) on eqiad.

jcrespo added a parent task: T122429: Batch updates create slave lag on s3 over WAN. Dec 25 2015, 11:05 AM

jcrespo removed a parent task: T122429: Batch updates create slave lag on s3 over WAN. Feb 4 2016, 11:21 AM

jcrespo added a subtask: T122429: Batch updates create slave lag on s3 over WAN.

jcrespo closed subtask T122429: Batch updates create slave lag on s3 over WAN as Resolved. Feb 5 2016, 1:24 PM

jcrespo mentioned this in T132527: Reassure ourselves about triggers & replication. Apr 15 2016, 6:15 AM

jcrespo closed subtask T120122: Perform a rolling restart of all MySQL slaves (masters too for those services with low traffic) as Resolved. Apr 22 2016, 9:58 AM

Should we continue doing ROW performance testing on codfw while T121207 is being worked on?

jcrespo added a subscriber: Volans. Apr 22 2016, 10:09 AM

jcrespo mentioned this in T105135: Implement mariadb 10.0 masters. Apr 22 2016, 4:24 PM

jcrespo removed jcrespo as the assignee of this task. Apr 22 2016, 5:06 PM

TerraCodes subscribed. Jul 6 2016, 5:27 PM

Danny_B moved this task from Tag to Should be Goal instead on the Tracking-Neverending board. Jul 11 2016, 3:22 PM

• Phabricator_maintenance added a project: Goal. Aug 13 2016, 8:41 PM

• Phabricator_maintenance renamed this task from Migrate MySQLs to use ROW-based replication (tracking) to Migrate MySQLs to use ROW-based replication. Aug 14 2016, 12:20 AM

• Phabricator_maintenance removed a project: Tracking-Neverending.

jcrespo mentioned this in T146444: Improve mediawiki data redaction. Oct 18 2016, 7:23 AM

• Marostegui moved this task from Triage to Meta/Epic on the DBA board. May 31 2017, 7:25 AM

What's the status of this task? The previous comment is from over a year ago.

This is an Epic task and quite hard to achieve in the short or even medium term.
To give you an example, row based replication is quite strict about data drift and can break replication if the data isn't exactly the same on all the tables.
We are right now in the process of checking all the data across the shards (running pt-table-checksum and some in-house scripts) to evaluate (and fix) any data drift we have.
The new labs infra does use row based replication.
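
The general idea behind this kind of drift check can be sketched in Python as below. This is only an illustration of comparing per-chunk checksums between two copies of a table. It is not how pt-table-checksum or the in-house scripts are implemented, and the function names and sample data are hypothetical.

```python
import hashlib


def chunk_checksums(rows, chunk_size):
    """Return {chunk_index: sha1 hex digest} over rows ordered by primary key."""
    ordered = sorted(rows.items())
    sums = {}
    for start in range(0, len(ordered), chunk_size):
        chunk = ordered[start:start + chunk_size]
        sums[start // chunk_size] = hashlib.sha1(repr(chunk).encode("utf-8")).hexdigest()
    return sums


def drifted_chunks(master_rows, replica_rows, chunk_size=2):
    """List the chunk indexes whose checksums differ between the two hosts."""
    m = chunk_checksums(master_rows, chunk_size)
    r = chunk_checksums(replica_rows, chunk_size)
    return sorted(i for i in m.keys() | r.keys() if m.get(i) != r.get(i))


# Toy tables keyed by primary key; the replica differs in one row.
master_rows = {1: "a", 2: "b", 3: "c", 4: "d"}
replica_rows = {1: "a", 2: "b", 3: "c", 4: "X"}

print(drifted_chunks(master_rows, replica_rows))  # [1]
```

Only the chunks reported as different need a row-by-row comparison afterwards, which keeps the expensive part of the check small.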

There are other things we need to evaluate when slowly moving things to row based replication, such as binlog sizes, potential regressions in replication speed (especially for big data changes), etc.

@MZMcBride Note this doesn't depend on us DBAs: changing to row based replication is a one-time change that is instantaneous. However, the application has to work well with it. Mediawiki developers have so far expressed no interest in it, as production itself works more or less OK for them; you can see that from the lack of comments here by mediawiki hackers. There is nothing we can do without that. Please help us convince the application developers that this is worth pursuing.

However, what we can do now is create an intermediate replica that is row-based and make labs use row based replication exclusively. That has already happened (it is not hypothetical, it is real), and it is working, as can be seen on the newly set up servers under the temporary DNS names labsdb-analytics.eqiad.wmnet and labsdb-web.eqiad.wmnet which, as far as I can see, have been kept in sync with production for several months already. You can already use them for testing.

Sorry, I misunderstood the scope of this task. I thought this task was about Wikimedia Labs using row-based replication, not Wikimedia production. I think I'm actually looking for T138967: Labs database replica drift.

In T109179#3362034, @jcrespo wrote:

However, what we can do now is create an intermediate replica that is row-based and make labs use row based replication exclusively. That has already happened (it is not hypothetical, it is real), and it is working, as can be seen on the newly set up servers under the temporary DNS names labsdb-analytics.eqiad.wmnet and labsdb-web.eqiad.wmnet which, as far as I can see, have been kept in sync with production for several months already. You can already use them for testing.

In the context of Wikimedia Labs, the word testing is confusing to me. Isn't all of Labs for testing? It's nice to hear that the data integrity issues we've been having on Wikimedia Labs might soon be resolved.

In T109179#3365630, @MZMcBride wrote:

In the context of Wikimedia Labs, the word testing is confusing to me. Isn't all of Labs for testing? It's nice to hear that the data integrity issues we've been having on Wikimedia Labs might soon be resolved.

This is a pretty serious threadjack, but no, all of Labs/Cloud Services isn't for testing. The deployment-prep project is for testing MediaWiki and related services. The contint project is for running Jenkins workers that perform tests for code. Tool Labs/Toolforge and many other projects are the only "production" environment for the services that they supply. At least one academic paper has been written about the on-wiki impact of tools being off-line. We try very hard not to break things in the infrastructure for this reason. The replica databases are very much treated as services that should be available and stable as much as possible. The common association of 'labs' and 'experiments' is one of the drivers for rebranding Wikimedia Labs as a suite of products under the Wikimedia Cloud Services team.

• Marostegui closed subtask T121207: Implement slave_run_triggers_for_rbr at sanitarium for labs filtering as Resolved. Jul 20 2017, 4:06 PM

I don't think migrating to ROW is something we can actually do now, after seeing the breakage caused when the s5 master died (T180714) while a schema change was ongoing and the replacement host was one running row based replication.

Having ROW on the master makes some specific schema changes (e.g. adding a column) impossible unless they are done directly on the master and allowed to propagate through replication.

Legoktm mentioned this in T174569: Schema change for refactored comment storage. Nov 21 2017, 1:45 AM

We believe that lag has improved since s5 was accidentally migrated to ROW; it also improved on labsdbs, even though they do not have any kind of replication control, unlike production.

Jgreen mentioned this in T183140: switch fundraising database replication from 'mixed' to 'row'. Dec 18 2017, 1:43 PM

jcrespo updated the task description. Dec 18 2017, 1:47 PM

jcrespo updated the task description.

This is important, but not a goal for this quarter: we are still blocked on mediawiki extension maintainers making their code compatible with it. However, all databases (misc, x1, parsercache, es) have meanwhile been migrated to ROW with great success.

There are some issues to solve regarding schema changes, but this is still a desirable change; at least to have the option, even if we go for MIXED or STATEMENT.

jcrespo mentioned this in T202245: Temporary table creation should be allowed when $wgReadOnly is set. Aug 20 2018, 3:10 PM

• Phabricator_maintenance moved this task from Backlog to Acknowledged on the SRE board. Jan 26 2019, 8:15 PM

In T109179#3952629, @jcrespo wrote:

This is important, but not a goal for this quarter: we are still blocked on mediawiki extension maintainers making their code compatible with it. However, all databases (misc, x1, parsercache, es) have meanwhile been migrated to ROW with great success.

There are some issues to solve regarding schema changes, but this is still a desirable change; at least to have the option, even if we go for MIXED or STATEMENT.

We're already on RBR in our 1.19-based environment, so I just wanted to learn more about the outstanding issues blocking this task in case there is something I might be able to help with. If I understand the task graph correctly, the main blockers are tables that do not have a primary key or unique index specified, right? Or is there other work left for application developers? Thanks in advance :)

@TK-999 Please note that this is an infrastructure limitation, which means it is mostly related to Wikimedia servers, not mediawiki. As I see it, our main limitations are:

Compatibility for schema changes: While ROW allows for different schemas between master and replicas, the conditions for that are much more limited (extra columns at the end) than with STATEMENT based replication. Many mediawiki schema changes are not ROW-based "hot" compatible. If you can depool your replicas, or they hold a limited amount of data, this may not affect you. While synchronous hot schema change tools are available, most are too costly/dangerous for us.
Performance: Some writes are small transactions in STATEMENT but a large amount of writes in ROW. Again, if you do not perform such maintenance jobs or operations, or your dataset is smaller, you may not be affected by this.
Some limitation in flexibility for some uses: STATEMENT can be converted to ROW, but not the other way around; some use cases require STATEMENT replication (e.g. maintenance tools)
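
The "extra columns at the end" condition from the first point can be sketched as a small Python check. This is a simplified model with a hypothetical function name: it assumes columns are matched by position, that either side may carry the extra trailing columns, and it ignores column type compatibility entirely.

```python
def rbr_schema_compatible(master_cols, replica_cols):
    """True when the shorter column list is a prefix of the longer one,
    i.e. the shared columns come first, in the same order, and any extra
    columns sit at the end of the table definition."""
    shorter, longer = sorted((list(master_cols), list(replica_cols)), key=len)
    return longer[:len(shorter)] == shorter


# Column added at the end on the replica only: still applies under this model.
print(rbr_schema_compatible(["id", "title"], ["id", "title", "new_col"]))  # True

# Column added in the middle: positions no longer line up.
print(rbr_schema_compatible(["id", "title"], ["id", "new_col", "title"]))  # False
```

Under this model, a one-host-at-a-time change that appends a column can coexist with ROW, while reordering, dropping or inserting columns in the middle cannot.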

As you can see, these are mostly the things mentioned in the summary. We have run metadata servers in ROW by accident and it didn't cause large issues, and we use ROW for all other servers except direct replicas, so it is highly likely that mediawiki would work for your use cases, especially if your setup is smaller and less complex than Wikimedia's.

@jcrespo I would like to close this. I don't think this is doable even in the long term; I would even say this is very long-long-long-long term for sX sections. There are many limitations here that I don't think we can resolve entirely in a reasonable amount of time, mostly:

Schema changes deployment
100% sure that data is consistent across all the hosts across all the wikis
Schema drifts

As you originally created the task, I will leave it up to you.

This ticket is to decide if this change is worth it, how to do it, where (maybe not all servers require it), when and what blockers there are.

In a way this is "done": it was changed in as many places as it could be, especially on cloud, where drift was an issue.

So it is in a state between resolved and declined.

jcrespo changed the task status from Declined to Resolved. Sep 30 2020, 9:38 AM

LSobanski closed subtask T17441: Some tables lack unique or primary keys, may allow confusing duplicate data as Resolved. May 17 2021, 11:09 AM
