Translate extension makes huge batch INSERTS
Closed, ResolvedPublic1 Estimated Story Points
Actions

Description

From email:
2014-12-18 07:47:11 mw1246 mediawikiwiki: Query affected 19034 rows:
INSERT INTO translate_messageindex (tmi_key,tmi_value) VALUES ('1198:project:About/Page_display_ti...

Note that the same transactions also DELETE about that many rows first.

Can we please get this both batched and moved into the job queue. This kind of insert can cause slave lag. These issue has been there for some time after being pointed out (partial fixes were made, granted). Can this get prioritized a bit more highly?

Details

	Subject	Repo	Branch	Lines +/-
	Try to avoid huge inserts in DatabaseMessageIndex	mediawiki/extensions/Translate	master	+243 -58

Customize query in gerrit

Related Objects
Search...

Status	Assigned	Task
Declined	None	T3268 Database replication lag issues (tracking)
Duplicate	None	T108551 Database locked error while publishing article using CX
Resolved	aaron	T95501 Fix causes of replica lag and get it to under 5 seconds at peak
Resolved	Nikerabbit	T86385 Translate extension makes huge batch INSERTS

Event Timeline

aaron created this task.Jan 10 2015, 12:32 AM

aaron raised the priority of this task from to Needs Triage.

aaron updated the task description. (Show Details)

aaron added subscribers: aaron, Nikerabbit.

Krenair added a project: MediaWiki-extensions-Translate.Jan 10 2015, 12:33 AM

Krenair subscribed.

My reply to that email in 2014-12-19:

Aaron, I recommend you bring this up in SoS to convey the urgency. I am on vacation for weeks still.

In fact, it is was made to work this way long ago due to unreliable job queue: T38274. There is a global to move it back to JobQueue, but that does not do batching. For that we need some kind of locking system to avoid two updates running simultaneously and creating a mess.

This is the first time I hear about the issue with messageindex. The previous reports including the link above have been about messagegroupstats.

Anyone who wants to understand messageindex better should read http://laxstrom.name/blag/2013/12/02/pet-project-optimizing-message-index-to-the-last-byte/

Nikerabbit triaged this task as High priority.Jan 10 2015, 9:13 AM

aaron added a project: Sustainability.Apr 7 2015, 7:48 PM

aaron set Security to None.

Arrbee moved this task from Backlog to Sprint Backlog on the Language-Team board.Jun 4 2015, 9:32 AM

aaron renamed this task from Translate extension make huge batch INSERTS to Translate extension makes huge batch INSERTS.Oct 1 2015, 8:53 AM

aaron updated the task description. (Show Details)

jcrespo subscribed.Oct 1 2015, 8:55 AM

Also, from metawiki:

Query affected 44263 row(s):
query-m: DELETE FROM translate_messageindex [TRX#2d00942ca3e3]

Query affected 44263 row(s):
query-m: INSERT INTO translate_messageindex (tmi_key,tmi_value) VALUES ('X') [TRX#2d00942ca3e3]

Do all the rows actually need to change? Can the actual DELETE/UPDATE/INSERT of actual changes be done by the jobs?

aaron added a parent task: T95501: Fix causes of replica lag and get it to under 5 seconds at peak.Oct 1 2015, 9:30 AM

Some non-exclusive ideas:

Use poolcounter to allow only one message index refresh at a time (so that they don't mess up the data)
Do in-memory diff of old/new and insert / delete only what is needed
Bucketize the index and process one bucket at a time (for smaller inserts and hopefully reduced memory usage; but how to determine the buckets?)
Instead of creating message index from scratch every time, do incremental changes (but complex, easy to get out of sync)

Up to date message index is required for page translation to work (translations for new units wont be allowed until they are in the message index).

You can also use GET_LOCK(). Just make sure that you do BEGIN/COMMIT inside of the lock calls.

aaron raised the priority of this task from High to Unbreak Now!.Oct 22 2015, 2:39 AM

aaron added a project: Blocked-on-Language.

@Nikerabbit, any news?

I am aware of this request and I have discussed it with the team. I cannot promise a timely fix for you due to following reasons:

The fix is not trivial (in fact it still hasn't been even decided how to fix it)
I currently have only 20h per week commitment
I have to support the language team on their already planned work (especially now when some of the engineers are having holidays)

Having said that, I am thinking how to fix this and will work on as soon as I find time.

Translate extension is already doing some in-memory diffing. Easiest fix seems to be to verify that that diffing is working, and replace delete_all+insert with delete+replace.

Nemo_bis subscribed.Oct 25 2015, 1:42 PM

Nikerabbit added a project: LE-CX7-Sprint 2.Oct 27 2015, 7:10 AM

Change 249436 had a related patch set uploaded (by Nikerabbit):
Try to avoid huge inserts in DatabaseMessageIndex

https://gerrit.wikimedia.org/r/249436

gerritbot added a project: Patch-For-Review.Oct 28 2015, 4:39 PM

Nikerabbit moved this task from Backlog to In Review on the LE-CX7-Sprint 2 board.Oct 28 2015, 5:27 PM

Amire80 added a project: Essential-Work.Oct 30 2015, 11:32 AM

Amire80 edited a custom field.

Nikerabbit claimed this task.Oct 30 2015, 3:06 PM

Change 249436 merged by jenkins-bot:
Try to avoid huge inserts in DatabaseMessageIndex

https://gerrit.wikimedia.org/r/249436

ReleaseTaggerBot added a project: MW-1.27-release (WMF-deploy-2015-11-10_(1.27.0-wmf.6)).Nov 4 2015, 11:00 AM

Nikerabbit moved this task from In Review to Done on the LE-CX7-Sprint 2 board.Nov 5 2015, 7:39 AM

How to confirm this is fixed?

In T86385#1786471, @Nemo_bis wrote:

How to confirm this is fixed?

You can search for it in the DBPerformance log in logstash (the entry in the task description is from there).

Nikerabbit mentioned this in T118445: MessageIndex::get does unnecessary work.Nov 12 2015, 8:13 AM

This error no longer appears in logstash after the patch was deployed.

Translate extension makes huge batch INSERTSClosed, ResolvedPublic1 Estimated Story PointsActions

Description

Details

Related ObjectsSearch...

Event Timeline

Translate extension makes huge batch INSERTS
Closed, ResolvedPublic1 Estimated Story Points
Actions

Related Objects
Search...