I think we can build the index in codfw, and in the future labs replica, much more quickly than we could from a dump. Basically, we just need to run the current Reindexer with the source in one cluster and the sink in the other.
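As a rough illustration of the source/sink split, here is a minimal sketch in Python. It is not the actual Reindexer code: `scan_source` and `bulk_sink` are hypothetical callables standing in for a scan over the source cluster and a bulk write to the sink cluster, and the in-memory stand-ins below exist only so the example runs on its own.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Doc = Dict[str, object]


def batched(docs: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Yield lists of at most `size` documents, preserving order."""
    batch: List[Doc] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def copy_index(scan_source: Callable[[], Iterable[Doc]],
               bulk_sink: Callable[[List[Doc]], None],
               batch_size: int = 100) -> int:
    """Stream every document from the source into the sink in batches.

    Both callables are hypothetical: one reads from the source cluster,
    the other writes a batch to the sink cluster.
    Returns the number of documents copied.
    """
    copied = 0
    for batch in batched(scan_source(), batch_size):
        bulk_sink(batch)
        copied += len(batch)
    return copied


# In-memory stand-ins for the two clusters (assumptions for the demo only).
source_docs: List[Doc] = [{"_id": str(i), "title": f"Page {i}"} for i in range(250)]
sink_docs: Dict[str, Doc] = {}


def fake_sink(batch: List[Doc]) -> None:
    # Writing by _id makes a repeated copy of the same document idempotent.
    for doc in batch:
        sink_docs[str(doc["_id"])] = doc


total = copy_index(lambda: iter(source_docs), fake_sink, batch_size=100)
```

Because the copy is a single streaming loop, the only cross-cluster traffic is the read from one side and the batched write to the other.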
This needs to fit into the basic plan for bootstrapping an additional cluster:
- Create mappings for all wikis in codfw
- Deploy operations/mediawiki-config to also send writes to codfw for one index
- Watch things for a bit
- Turn on a few more, then a few more, then the rest
- Copy/rebuild into the same index that is accepting writes
Additionally, by keeping the reindexing contained in a single process, we could use standard tools like trickle[1] to rate limit our WAN traffic. (The Reindexer can also fork to increase the speed if needed.)
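trickle would shape the process's traffic from the outside. To make the idea of rate limiting concrete, here is a minimal in-process sketch using a token bucket. This is a swapped-in technique for illustration, not how trickle works and not part of the Reindexer. The class name, the byte-based accounting, and the injected `clock` and `sleep` parameters are all assumptions; the fake clock only lets the demo run instantly.

```python
import time
from typing import Callable, List


class TokenBucket:
    """Allow `rate` bytes per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def consume(self, nbytes: float) -> float:
        """Wait until `nbytes` may be sent; return the seconds spent waiting."""
        waited = 0.0
        self._refill()
        if nbytes > self.tokens:
            # Sleep just long enough for the missing tokens to accumulate.
            waited = (nbytes - self.tokens) / self.rate
            self.sleep(waited)
            self._refill()
        self.tokens -= nbytes
        return waited


# Demo with a fake clock (an assumption for the example) so nothing really sleeps.
fake_now: List[float] = [0.0]


def fake_sleep(seconds: float) -> None:
    fake_now[0] += seconds


bucket = TokenBucket(rate=1000.0, capacity=1000.0,
                     clock=lambda: fake_now[0], sleep=fake_sleep)

# Four 500-byte batches at 1000 bytes/s: the first two drain the initial burst,
# and each later batch has to wait for the bucket to refill.
waits = [bucket.consume(500) for _ in range(4)]
```

In a real run each bulk write would call `consume()` with the size of its payload before sending. If the Reindexer forks, every child would need to share one limit, which is one reason an external tool wrapping a single process is simpler.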