Implement update freeze and/or delays for TTMServerMessageUpdateJob
Closed, Resolved · Public

Description

In case we need to temporarily stop updates to the TTMServer index, it would be useful to have a way to freeze the updates for a while and have them be executed after the freeze is over.

Details

	Subject	Repo	Branch	Lines +/-
	Add support for freezing writes	mediawiki/extensions/Translate	master	+280 -6


Related Objects

Status	Assigned	Task
Resolved	debt	T151324 [epic] System level upgrade for cirrus / elasticsearch
Resolved	• Deskana	T154501 [Epic, Q3 Goal] Upgrade search systems to Elasticsearch 5
Resolved	Joe	T154658 Prepare and improve the datacenter switchover procedure
Resolved	dcausse	T132076 TTMServer should support multi-dc configuration
Resolved	dcausse	T132315 Implement update freeze and/or delays for TTMServerMessageUpdateJob

Event Timeline

Nikerabbit created this task. Apr 11 2016, 6:58 AM

Restricted Application added projects: Discovery-ARCHIVED, Discovery-Search. Apr 11 2016, 6:58 AM

@dcausse I think you said CirrusSearch has this feature. Do you have any suggestions for how this could easily be implemented in Translate? A configuration flag? How do you delay the job if a freeze is active?

Nikerabbit moved this task from Backlog to TTMServer on the MediaWiki-extensions-Translate board. Apr 20 2016, 9:43 AM

Nikerabbit removed Nikerabbit as the assignee of this task. Apr 20 2016, 1:29 PM

This ticket is old. Is this still causing a problem?

It's not really causing any problems, but I think it'd be nice to have.
More importantly, the cross-dc feature can be convenient during elasticsearch upgrades.
Currently the solution is to manually copy the ttm index from one dc to another and run a full reindex to catch up on translations that fell through the cracks.
My feeling is that it's not a super high priority, but without it someone aware of the problem has to run some manual scripts when we do a full restart to avoid any downtime. It'd be nice to have to avoid mistakes: last time I did this I made a stupid mistake that caused downtime, which is always a risk when running manual scripts...

To answer your question Nikerabbit (sorry I missed your ping): currently we handle freezing writes by storing a flag in a special cirrus index. When this flag is set, the messages are simply resent to the jobqueue with a backoff delay. We detect that the cluster is frozen with this simple query.
I think it'd make sense to add a new config option in ttmserver to point ElasticTTMServer to this index so we can reuse it. ttm would not have all the tools to handle this index (to freeze and unfreeze) and would depend on cirrus for this kind of maintenance operation...
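
For illustration only, the freeze-and-requeue idea described above could look roughly like the following sketch. It is not the CirrusSearch or Translate code: `UpdateJob`, `DelayedQueue`, `backoff_delay` and `run_update_job` are hypothetical names, the freeze check is passed in as a plain callable instead of a query against the cirrus index, and the exponential growth of the delay is an assumption (the comment above only says "backoff delay").

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class UpdateJob:
    """Hypothetical stand-in for a TTMServer update job."""
    payload: str
    retry_count: int = 0


@dataclass
class DelayedQueue:
    """Toy job queue that only records (delay_seconds, job) pairs."""
    entries: List[Tuple[int, UpdateJob]] = field(default_factory=list)

    def push(self, job: UpdateJob, delay_seconds: int = 0) -> None:
        self.entries.append((delay_seconds, job))


def backoff_delay(retry_count: int, base_seconds: int = 60,
                  max_seconds: int = 3600) -> int:
    """Assumed policy: double the delay on every retry, up to a cap."""
    return min(base_seconds * 2 ** retry_count, max_seconds)


def run_update_job(job: UpdateJob,
                   is_frozen: Callable[[], bool],
                   apply_update: Callable[[str], None],
                   queue: DelayedQueue) -> bool:
    """Apply the update, or push it back to the queue if writes are frozen.

    Returns True when the update was applied, False when it was delayed.
    """
    if is_frozen():
        # Writes are frozen: resend the same payload with a longer delay.
        retry = UpdateJob(job.payload, job.retry_count + 1)
        queue.push(retry, backoff_delay(job.retry_count))
        return False
    apply_update(job.payload)
    return True


if __name__ == "__main__":
    queue = DelayedQueue()
    applied: List[str] = []
    # While frozen, the job is delayed instead of being applied.
    run_update_job(UpdateJob("translation-1"), lambda: True, applied.append, queue)
    delay, retry = queue.entries[0]
    # After the freeze is lifted, the retried job goes through.
    run_update_job(retry, lambda: False, applied.append, queue)
    print(delay, applied)
```

In a real setup `is_frozen` would be whatever check reads the shared flag, so the job code itself stays unaware of where the flag lives.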

Overall it's not very difficult; ttm already has a sane foundation to support this (iirc all updates are sent to the jobqueue). Maybe we can find time to work on this early next year as part of the elastic5 upgrade?

@dcausse All that sounds good to me.

dcausse claimed this task. Jan 31 2017, 10:05 AM

dcausse moved this task from needs triage to Current work on the Discovery-Search board.

dcausse edited projects, added Discovery-Search (Current work); removed Discovery-Search.

dcausse moved this task from Incoming to not in use - please delete on the Discovery-Search (Current work) board.

Change 337558 had a related patch set uploaded (by DCausse):
Add support for freezing writes

https://gerrit.wikimedia.org/r/337558

gerritbot added a project: Patch-For-Review. Feb 14 2017, 10:45 AM

dcausse moved this task from not in use - please delete to Needs review on the Discovery-Search (Current work) board. Feb 14 2017, 10:52 AM

Nikerabbit mentioned this in T158168: Monitor and support TTMServer development work. Feb 15 2017, 10:01 AM

EBernhardson moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. Feb 21 2017, 7:42 PM

Change 337558 merged by jenkins-bot:
Add support for freezing writes