If we need to temporarily stop updates to the TTMServer index, it would be useful to be able to freeze writes and have the queued updates executed once the freeze is lifted.
Related change:
- mediawiki/extensions/Translate (master): Add support for freezing writes

Related tasks:
- Resolved · debt · T151324 [epic] System level upgrade for cirrus / elasticsearch
- Resolved · Deskana · T154501 [Epic, Q3 Goal] Upgrade search systems to Elasticsearch 5
- Resolved · Joe · T154658 Prepare and improve the datacenter switchover procedure
- Resolved · dcausse · T132076 TTMServer should support multi-dc configuration
- Resolved · dcausse · T132315 Implement update freeze and/or delays for TTMServerMessageUpdateJob
It's not really causing any problems; it would just be nice to have, I think.
More importantly, the cross-dc feature can be convenient during Elasticsearch upgrades.
Currently the solution is to manually copy the ttm index from one dc to the other and run a full reindex to catch up on translations that fell through the cracks.
My feeling is that it's not a super high priority, but it requires someone who is aware of the problem to run some manual scripts whenever we do a full restart to avoid downtime. It would be nice to have this in order to avoid mistakes: last time I did this I made a stupid mistake that caused a downtime, and that is always a risk when running manual scripts...
To answer your question Nikerabbit (sorry I missed your ping): currently we handle freezing writes by storing a flag in a special cirrus index; when this flag is set, the messages are simply resent to the jobqueue with a backoff delay. Detecting that the cluster is frozen is done with this simple query.
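The behaviour described above can be sketched as a small, self-contained model: a freeze flag (standing in for the flag document in the special cirrus index) plus a job queue that resubmits update jobs with a backoff delay while the flag is set. This is an illustrative Python sketch, not the actual Cirrus or Translate implementation; the class and method names are invented for the example.

```python
from collections import deque


class FreezeAwareJobQueue:
    """Illustrative model of freeze-aware update jobs: while the cluster
    is frozen, jobs are re-queued with a backoff delay instead of being
    written to the index. All names here are hypothetical."""

    def __init__(self, backoff_seconds=60):
        self.backoff_seconds = backoff_seconds
        self.frozen = False   # stands in for the flag in the special cirrus index
        self.queue = deque()  # entries are (job, not_before_timestamp)
        self.index = []       # stands in for the TTM elasticsearch index

    def freeze(self):
        self.frozen = True

    def thaw(self):
        self.frozen = False

    def push(self, job, not_before=0.0):
        self.queue.append((job, not_before))

    def run_one(self, now):
        """Process one job; returns 'written', 'delayed', 'waiting' or None."""
        if not self.queue:
            return None
        job, not_before = self.queue.popleft()
        if now < not_before:
            # Backoff delay has not elapsed yet; put the job back.
            self.queue.append((job, not_before))
            return "waiting"
        if self.frozen:
            # Cluster is frozen: resubmit the same job with a backoff delay.
            self.queue.append((job, now + self.backoff_seconds))
            return "delayed"
        self.index.append(job)
        return "written"
```

While frozen, a job simply cycles through the queue with its delay; once the freeze is lifted, the next run writes it to the index, so no update is lost.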
I think it would make sense to add a new config option in ttmserver pointing ElasticTTMServer to this index so we can reuse it. ttm would not have all the tools to manage this index (to freeze and unfreeze) and would depend on cirrus for this kind of maintenance operation...
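The proposed option might look something like the following. The real MediaWiki configuration is PHP; this is a Python dict purely for illustration, and the `frozenIndexName` key and index name are invented placeholders, not existing options.

```python
# Hypothetical shape of the proposed TTMServer configuration.
# 'frozenIndexName' is an assumed option name: it would point
# ElasticTTMServer at the cirrus index holding the freeze flag,
# while freezing/unfreezing itself stays a cirrus responsibility.
ttmserver_config = {
    "default": {
        "type": "ttmserver",
        "class": "ElasticSearchTTMServer",
        "frozenIndexName": "cirrus-freeze-flag-index",  # placeholder name
    }
}
```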
Overall it's not very difficult; ttm already has a sane foundation to support this (iirc all updates are sent to the jobqueue). Maybe we can find time to work on this early next year as part of the elastic5 upgrade?