
Evaluate WMF's ParserCache database setup for Multi DC
Closed, Resolved · Public

Description

Split out from T133523: Decide how to improve parsercache replication, sharding and HA

Based on a recent multi DC strategy meeting (notes - NDA restricted), it was brought up that the DBAs may want to change the previous strategy set out for ParserCache DB.

The status quo is that they are replicated bi-directionally: writes can happen in either DC, and the dataset may or may not be eventually consistent. (The typical multi-master concerns don't apply since there are only simple key-value operations on these tables using a primary key, and the primary key is a predetermined string, not an auto-increment.) These loose guarantees are acceptable since the values are verified by MW at runtime and discarded accordingly, so it's totally fine for an older "wrong" write to win some race.
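To make the "older write can safely win" point concrete, here is a toy model of a ParserCache-style store: string primary keys, blind last-write-wins overwrites, and validation at read time. The class and key names are illustrative, not the real MediaWiki API.

```python
class KeyValueCache:
    """Toy ParserCache-style store: string primary keys, blind overwrites,
    and runtime validation on read (names are illustrative only)."""

    def __init__(self):
        self.rows = {}  # key -> (value, rev_id)

    def set(self, key, value, rev_id):
        # Last write wins unconditionally, as with a REPLACE by primary key.
        self.rows[key] = (value, rev_id)

    def get(self, key, current_rev_id):
        entry = self.rows.get(key)
        if entry is None:
            return None
        value, rev_id = entry
        # Runtime validation: a stale entry (e.g. one that won a
        # replication race) is treated as a miss and regenerated.
        if rev_id != current_rev_id:
            return None
        return value


cache = KeyValueCache()
cache.set("enwiki:page:42", "html-v2", rev_id=2)
cache.set("enwiki:page:42", "html-v1", rev_id=1)  # older write wins the race
assert cache.get("enwiki:page:42", current_rev_id=2) is None  # stale, discarded
```

The reader never sees the stale value; the worst case is an extra parse, which is why consistency guarantees can be this loose.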

I believe this bi-directional replication was set up specifically to aid in switchovers and to support multi-DC, but I could be wrong; maybe it was set up this way for another reason?

The concern raised by @Kormat is that the current setup causes lag spikes, with the question whether we actually need this or whether a locally kept cache would suffice.

Event Timeline

Krinkle updated the task description.

The concern raised by @Kormat is that the current setup causes lag spikes, […]

Could you elaborate on this? What is the cause and effect we see, and how is it caused or aggravated by having bi-directional replication?

[…] with the question whether we actually need this or whether a locally kept cache would suffice.

My gut feeling is indeed that we do not need this. I think the main reason we have it is for switchovers and disaster recovery, so that our secondary DC has an effectively "warm" standby parser cache.

However, when we serve read requests from both DCs (multi-DC), this probably isn't a concern; it's perhaps even comparable to how we keep other caches local to the DC as well (like Memcached). And unlike Memcached/WANObjectCache, the parser cache does not need to proactively send purges (as far as I know).

The only minor caveat that comes to mind is that garbage collection is a maintenance cron. If the parser cache becomes DC-local, we presumably want to run that cron in all DCs, which would probably make it the first maintenance cron of its kind; the others are only enabled in the primary DC, I think.
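The garbage-collection cron in question is essentially a batched delete of expired rows, which is why it would need to run in each DC once the caches stop replicating. A minimal sketch of such a pass, using an in-memory SQLite table whose layout is only loosely modelled on the production schema:

```python
import sqlite3
import time

# Sketch of a purge-style garbage collection pass, run as a periodic
# maintenance job. Table and column names are illustrative, not the
# production ParserCache schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectcache (keyname TEXT PRIMARY KEY, value BLOB, exptime INTEGER)"
)
now = int(time.time())
conn.execute("INSERT INTO objectcache VALUES ('k1', 'live', ?)", (now + 3600,))
conn.execute("INSERT INTO objectcache VALUES ('k2', 'expired', ?)", (now - 3600,))

# Delete rows whose expiry is in the past. A real job would do this in
# bounded batches to limit lock time and replication impact.
deleted = conn.execute("DELETE FROM objectcache WHERE exptime < ?", (now,)).rowcount
```

With DC-local caches each DC deletes only its own rows, so nothing about the pass itself needs cross-DC coordination; it just has to be scheduled in both places.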

This is a duplicate of T133523

That is quite a broad and open-ended ticket. But we can continue the discussion there, sure. I'll make it a child task for now, and close this when the replication/multi-DC part of it is resolved.

Krinkle closed this task as Declined. Edited Jul 21 2022, 7:48 AM

The topic of ParserCache has not come up once during the last 12 months, and even in the years before that, I believe it was largely my own uncertainty that made me want to evaluate or optimise this.

Also, during the past two years it has at no point been listed as a known blocker on the tracking page (https://www.mediawiki.org/wiki/Wikimedia_Performance_Team/Multi-DC_MediaWiki).

To my knowledge there are no known functional blockers, given that the ParserCache backend is bi-directionally replicated and locally readable, and that the ParserCache interface within MediaWiki is written with those may-be-lagged, eventual-consistency expectations in mind.

To my knowledge there are no known performance issues that would arise as a result of starting to read and write PC from Codfw.

As such, I'm closing this to limit the scope of the initial multi-DC launch. There exist ideas for future improvements, which remain tracked at T133523.

Krinkle reopened this task as Open. Edited Aug 18 2022, 4:42 PM
Krinkle assigned this task to Marostegui.

As part of today's incident review meeting for T315271, it was brought up that while ParserCache is ready for bi-directional replication, it is not actually enabled.

I have no preference at this time for whether it is bi-directional, but as requirements for this task I would say:

  • pc hosts must be writable in both DCs.
  • pc hosts must replicate from eqiad to codfw to improve cache warmness (for now, let's keep the status quo, especially as Codfw has little traffic now).
  • pc hosts must use statement-based replication.

Replicating PC in the other direction as well is optional and to be decided by the DBAs based on what seems easiest to manage (e.g. you might prefer multi-master to match x2 or to avoid thinking of Codfw as read-only; or maybe you prefer not multi-master, as multi-master brings complexity).

If you choose to replicate in one direction only, please verify that things will not break when there are conflicting writes. It is totally fine for writes to be lost or to be overwritten in Eqiad by slightly older values; MW tolerates all of that from PC and will validate and correct.

I believe that with SBR, one replication direction with both DCs doing local writes would work fine, but this is for you to determine and sign off on.
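The one-direction-plus-local-writes topology can be sketched as a small simulation. This is an invented toy model, not the real replication machinery: eqiad ships its write statements to codfw, codfw also takes local writes that are never replicated back, and a replicated statement may clobber a newer codfw-local write.

```python
# Toy model of the proposed topology: eqiad -> codfw statement-based
# replication, with codfw also taking local, unreplicated writes.
# All names and values are illustrative.

eqiad = {}
codfw = {}
replication_log = []  # statements flowing eqiad -> codfw


def write_eqiad(key, value):
    eqiad[key] = value
    replication_log.append((key, value))  # SBR: the statement is shipped


def write_codfw(key, value):
    codfw[key] = value  # local write, never replicated back


def apply_replication():
    while replication_log:
        key, value = replication_log.pop(0)
        codfw[key] = value  # may clobber a newer codfw-local write


write_codfw("page:1", "fresh-codfw-parse")
write_eqiad("page:1", "older-eqiad-parse")
apply_replication()
# The codfw-local value was lost. Because MediaWiki validates ParserCache
# entries at read time and re-parses on a stale hit, this costs at most an
# extra parse, never a correctness failure.
assert codfw["page:1"] == "older-eqiad-parse"
```

This is exactly the class of conflict the comment above says is tolerable; the sign-off question is whether the replication layer itself stays healthy when it happens, not whether MediaWiki does.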

As part of today's incident review meeting for T315271, it was brought up that while ParserCache is ready for bi-directional replication, it is not actually enabled.

I have no preference at this time for whether it is bi-directional, but as requirements for this task I would say:

  • pc hosts must be writable in both DCs.

This is already in place.

  • pc hosts must replicate from eqiad to codfw to improve cache warmness (for now, let's keep the status quo, especially as Codfw has little traffic now).

This is ready, but if it is not in use, I prefer not to enable it until it is going to be used (so we can proceed with maintenance more easily).

  • pc hosts may replicate in the other direction as well; this is optional and to be decided by the DBAs based on what seems easiest to manage.

So right now we do eqiad -> codfw (and the other way around when we have codfw as the primary DC); I'm not sure if you just meant also eqiad <-> codfw. If that is the case, see the above comment: it is fully ready, it is just one command to enable it.

  • pc hosts must replicate from eqiad to codfw to improve cache warmness (for now, let's keep the status quo, especially as Codfw has little traffic now).

Given the x2 outage and the fact that pc doesn't have the modtoken protection (to my knowledge), which makes it even more fragile, I highly recommend simply breaking the replication and letting the two sides grow apart. It is a cache, and this can benefit from cache locality (e.g. East Asian languages being served from codfw) and scale even better. I'm happy for this to happen once codfw is fully active, but it should happen eventually. It would make it much more resilient and much easier to maintain (plus better cache locality, as I said).

If the above model is fine with MW, I would also prefer that rather than having it be multi-master.

  • pc hosts may replicate in the other direction as well, this is optional […]

[…] not sure if you just meant also eqiad <-> codfw. If that is the case, see the above comment: it is fully ready, it is just one command to enable it.

With "in the other direction as well", I indeed meant in addition to, not instead of. So either eqiad>codfw or eqiad<>codfw will work to keep the status quo of Codfw being warm. I leave it up to your team.

My thinking was that bi-di would be preferred, as it matches x2-mainstash (thus fewer novel/unique setups), and because bi-di is what MW expects and documents for SqlBagOStuff (which powers PC and MainStash). But for PC specifically, we tolerate loss of writes and validate the values at runtime; consistency doesn't actually matter there. My main reason for preferring to replicate in at least one direction is that it will help keep the cache warm during roll-out and avoids making more changes at the same time right now.

This is ready, but if it is not in use, I prefer not to enabled it until it is going to be used (so we can proceed with maintenance in an easier way)

We have already begun rolling out traffic to www.mediawiki.org under T279664: Progressive Multi-DC roll out. By leaving this off, I think we are potentially not covering as much during this testing phase as we could. There are no longer any blockers left there. I would say "now" is basically the natural time to start enabling it; otherwise it will become a blocker very soon (TM).

Next step for this task: I'm waiting for someone in your team to close this task with a sign-off that ParserCache is ready to receive read-write traffic in Codfw and Eqiad.


[…] I highly recommend simply breaking the replication and letting the two sides grow apart. It is a cache, and this can benefit from cache locality […]. I'm happy for this to happen once codfw is fully active, but it should happen eventually. […]

If the above model is fine with MW, I would also prefer that rather than having it be multi-master.

This model has previously been what I was advocating as well. I think there is still long-term potential in turning off replication for PC, but I recommend we keep the status quo during Multi-DC; changing it afterward is fine. I do want to note, however, that there are (relatively new) reasons that may inform a future iteration of ParserCache in which we actually do need replication again:

  • deprecation of RESTBase, which leads to an increase in demand for ParserCache via rest.php. The current trajectory is to find a way to accommodate it with a TTL, but long-term it seems desirable to keep the "current" revision around.
  • deployment of Parsoid, which similarly needs to store its version of the HTML output.
  • pageview latency improvements. Optimizations to the JobQueue and RefreshLinks have been made to benefit PC efficiency, but these have come at the cost of making it more likely that logged-in users require a parse during pageviews. One direction I'd like to explore, with some metrics, is what it would cost to eliminate parsing during pageviews entirely, thus only parsing during POSTs and during jobqueue runs (and reducing parser options, as Parsoid desires). In that reality, the backend of PC would need to be replicated. (This isn't yet planned, but it fits the long-term direction after T302623 and T140664.)
  • pc hosts may replicate in the other direction as well; this is optional […]

[…] not sure if you just meant also eqiad <-> codfw. If that is the case, see the above comment: it is fully ready, it is just one command to enable it.

With "in the other direction as well", I indeed meant in addition to, not instead of. So either eqiad>codfw or eqiad<>codfw will work to keep the status quo of Codfw being warm. I leave it up to your team.

So replication from eqiad to codfw is there and has always been. We have no plans to remove it unless we agree (below) that we want to keep both sides without knowing about each other.

My thinking was that bi-di would be preferred, as it matches x2-mainstash (thus fewer novel/unique setups), and because bi-di is what MW expects and documents for SqlBagOStuff (which powers PC and MainStash). But for PC specifically, we tolerate loss of writes and validate the values at runtime; consistency doesn't actually matter there. My main reason for preferring to replicate in at least one direction is that it will help keep the cache warm during roll-out and avoids making more changes at the same time right now.

I would prefer to keep it as it is now (like a normal section), where eqiad -> codfw (or codfw -> eqiad when codfw becomes our primary writable DC).

We have already begun rolling out traffic to www.mediawiki.org under T279664: Progressive Multi-DC roll out. By leaving this off, I think we are potentially not covering as much during this testing phase as we could. There are no longer any blockers left there. I would say "now" is basically the natural time to start enabling it; otherwise it will become a blocker very soon (TM).

Next step for this task: I'm waiting for someone in your team to close this task with a sign-off that ParserCache is ready to receive read-write traffic in Codfw and Eqiad.

If we can leave only eqiad -> codfw, and having codfw -> eqiad isn't a blocker, we are ready. Just let us know when you want to start writing to codfw so we are aware.

Ideally, if we can have both topologies without any replication between them, I would prefer that. But as you noted, maybe that is something to do after testing Multi-DC (it is just a single command from our side).

This model has previously been what I was advocating as well. I think there is still long-term potential in turning off replication for PC, but I recommend we keep the status quo during Multi-DC; changing it afterward is fine.

Definitely; I'm not suggesting we do it right now. Let's talk about it after full deployment of multi-DC.

I do want to note, however, that there are (relatively new) reasons that may inform a future iteration of ParserCache in which we actually do need replication again:

  • deprecation of RESTBase, which leads to an increase in demand for ParserCache via rest.php. The current trajectory is to find a way to accommodate it with a TTL, but long-term it seems desirable to keep the "current" revision around.
  • deployment of Parsoid, which similarly needs to store its version of the HTML output.

These two are actually some of the reasons I think we should switch to per-DC PC. They will increase PC storage drastically, and there is some risk that we run out of space for it again. By doing per-DC PC, we reduce the entries that are not going to be accessed (e.g. entries in East Asian languages will be stored mostly in codfw only, and languages like French or German will reside in eqiad only), which reduces PC storage by a reasonable amount.
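The storage argument can be illustrated with a back-of-the-envelope model: under replication every entry is stored in both DCs, while per-DC caches store only what each DC's own traffic touches. The traffic mix below is entirely invented for illustration.

```python
# Invented traffic mix: Japanese pages hit codfw, French pages hit eqiad,
# and a shared set of English pages is requested from both DCs.
requests = (
    [("jawiki:page:%d" % i, "codfw") for i in range(100)]
    + [("frwiki:page:%d" % i, "eqiad") for i in range(100)]
    + [("enwiki:page:%d" % i, "eqiad") for i in range(30)]
    + [("enwiki:page:%d" % i, "codfw") for i in range(30)]
)

# Replicated setup: every unique key ends up in both DCs.
replicated_rows = 2 * len({key for key, _ in requests})

# Per-DC setup: each DC stores only the keys its own traffic requested.
per_dc = {"eqiad": set(), "codfw": set()}
for key, dc in requests:
    per_dc[dc].add(key)
local_rows = len(per_dc["eqiad"]) + len(per_dc["codfw"])
```

With this mix, 230 unique keys become 460 stored rows when replicated but only 260 when DC-local, with the overlap limited to the keys genuinely requested from both DCs. The real savings depend on how strongly traffic is partitioned by geography.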

@Krinkle I am not sure what else is needed here. Would this work for you?:

If we can leave only eqiad -> codfw, and having codfw -> eqiad isn't a blocker, we are ready. Just let us know when you want to start writing to codfw so we are aware.

Ideally, if we can have both topologies without any replication between them, I would prefer that. But as you noted, maybe that is something to do after testing Multi-DC (it is just a single command from our side).

Multi-DC has been live for a few months now. Two comments up was the DBA approval; I'll close the task as such.