Page MenuHomePhabricator

Support switching change-prop backlog processing to other DC
Closed, DeclinedPublic

Description

Right now switching over the current processing is mostly automatic - when the DC switchover would happen events would start flowing from new active DC and a change-prop instance in an active DC will start picking them up - no intervention needed here.

However, in a real DC failover if eqiad goes offline for real, we also want to reprocess the backlog that's been non processed in eqiad. We have mirrored topics in codfw, but there's no way to instruct change-prop to reprocess something. We need to find a way to do that.

Update 2017-03-15:

  • Timestamp-based indexing is now available in Kafka 0.10.1 and librdkafka.
  • As far as we are aware, the node bindings don't expose this functionality yet.

Next steps towards timestamp support:

  • Upgrade to Kafka 0.10.1 (or 0.10.2)
  • Add support for calling rd_kafka_offsets_for_times in node-rdkafka.

Event Timeline

I've been thinking about this problem today and reading some materials on kafka cross-dc issues, and here's what we could do:

  1. We can support starting change-prop from a specific timestamp per rule - for now it would be achieved by starting from the beginning of the commit log and skipping messages with lower timestamp. It's a bit of an overhead, but wouldn't take long since the consumption is actually pretty fast. After update to kafka 0.10 we'd be able to seek to a specific timestamp in the log. I've prototyped this a bit and it seems totally feasible.
  2. If the unplanned switchover (due to eqiad going down) happens, we would be able to start a second instance of ChangeProp in codfw in a special 'backlog' mode - it will seek to the timestamps where eqiad have left and go through the mirrored topics processing the backlog. This step is manual, but I can't think of a solution that wouldn't require manual intervention.
  3. Now the most interesting question - how would change-prop in codfw know the timestamps where to start for each rule? We can create a new topic, like change-prop.status, mirror it and with each commit write a message to that topic with a consumer group and a timestamp of the message. Currently the rate of commits is 15/s, so that special topic wouldn't be a huge overhead. Then since it's replicated, the manually started backlog-processing change-prop will consume everything from that special mirrored topic from the beginning of the commit log and calculate desired timestamps for each of the rules.

This will work, but the question is whether we really need all this machinery? Normal non-transclusion topics rarely lag more then several messages, and losing a couple of updates in a case of a catastrophic disaster in eqiad would probably be the least of our problems. In case of the planned DC switchover nothing of this is needed because the backlog will be naturally processed by change prop in the DC that resigned the active status.

@GWicke @mobrovac @Eevans Any thoughts? I personally think we can avoid all this complexity and just lose some updates.

I've been thinking about this problem today and reading some materials on kafka cross-dc issues, and here's what we could do:

  1. We can support starting change-prop from a specific timestamp per rule - for now it would be achieved by starting from the beginning of the commit log and skipping messages with lower timestamp. It's a bit of an overhead, but wouldn't take long since the consumption is actually pretty fast. After update to kafka 0.10 we'd be able to seek to a specific timestamp in the log. I've prototyped this a bit and it seems totally feasible.

It seems that the only thing missing for native timestamp based indexing is an upgrade for kafka & librdkafka, and a change in node-rdkafka. So, maybe it is not worth implementing the segment-based work-around.

  1. If the unplanned switchover (due to eqiad going down) happens, we would be able to start a second instance of ChangeProp in codfw in a special 'backlog' mode - it will seek to the timestamps where eqiad have left and go through the mirrored topics processing the backlog. This step is manual, but I can't think of a solution that wouldn't require manual intervention.

Afaik we already have the ability to consume from multiple DCs, so we could perhaps just reconfigure the existing changeprop instance to consume both DCs.

  1. Now the most interesting question - how would change-prop in codfw know the timestamps where to start for each rule? We can create a new topic, like change-prop.status, mirror it and with each commit write a message to that topic with a consumer group and a timestamp of the message. Currently the rate of commits is 15/s, so that special topic wouldn't be a huge overhead.

Yeah, and we don't need to update this *that* often either. Periodic checkpointing should be enough, as double-processing a couple minutes worth up updates is really not problematic.

Another option could be to checkpoint timestamps to etcd. The key-value semantics might be a better fit & easier to use. A downside is the need for another integration, as well as setting up ACLs (per namespace?). @Joe, what is your take on using etcd?

GWicke triaged this task as Medium priority.Aug 8 2017, 10:12 PM

I don't think we will/need to implement this.