
Enabling LocalisationUpdate vastly increases CPU activity
Closed, Resolved, Public

Description

CPU usage went way up when enabling LocalisationUpdate:
http://techblog.wikimedia.org/wp-content/uploads/2009/09/broke.png

The empty space in the middle is where bug 20773 killed the site after the extension was disabled; once that was fixed, CPU usage went back up, then returned to normal as the caches were rebuilt.

Not deployable in this state; the cause of the high CPU usage needs to be tracked down and cleared up. Is it breaking the cache infrastructure, or is it simply inefficient to pull extra data from the DBs when we've already got the main localization in CDB files?


Version: unspecified
Severity: enhancement
URL: http://techblog.wikimedia.org/wp-content/uploads/2009/09/broke.png

Details

Reference
bz20774

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:46 PM
bzimport set Reference to bz20774.

Note that l10nupdate's installation triggers invalidation of the l10ncache, causing it to be rebuilt from scratch. Try making a whitespace change to MessagesEn.php and syncing that, then see what the resulting CPU spike looks like. Also, please investigate how well-synchronized the Apaches' clocks are.

(In reply to comment #0)

> Not deployable in this state; the cause of the high CPU usage needs to be
> tracked down and cleared up. Is it breaking the cache infrastructure, or is
> it simply inefficient to pull extra data from the DBs when we've already got
> the main localization in CDB files?

I'll look into the CPU usage. However, debug logs from earlier local test runs show that l10nupdate is not pulling localizations from the DB once all its data is in the l10ncache. Offhand, I think the dependency check may hit the DB, but that shouldn't double CPU usage AFAICT.

Hopefully fixed with the rewrite in r56831.

Basically, the two major culprits were:

  1. the code checking the timestamp of the last update.php run (to determine whether to rebuild the l10ncache) pulled stale data from the slaves, and wasn't smart enough to use queriedTimestamp > expectedTimestamp instead of != (see the first sketch after this list)
  2. the initial update.php run inserted about a million rows into each of the five per-cluster localisation tables, using a separate REPLACE statement for each row; this presumably slowed down replication and made #1 worse (see the second sketch below)
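
To make culprit #1 concrete, here is a minimal PHP sketch (variable names and values are illustrative assumptions, not the actual LocalisationUpdate code) of why comparing with != keeps triggering rebuilds when a lagged slave returns a stale timestamp, while comparing with > does not:

```lang=php
<?php
// Hypothetical illustration of culprit #1; not the real LocalisationUpdate code.

// Timestamp of the last update.php run, as recorded on the master.
$expectedTimestamp = 20090910120000;

// Timestamp read back from a lagged slave, which may still show an older run.
$queriedTimestamp = 20090901080000;

// Buggy check: any stale read looks like a change, so the l10ncache is
// considered out of date and gets rebuilt from scratch on every check.
$rebuildBuggy = ( $queriedTimestamp != $expectedTimestamp ); // true  -> spurious rebuild

// Robust check: only rebuild when the stored timestamp is genuinely newer
// than the one the cache was built against; stale slave data is ignored.
$rebuildFixed = ( $queriedTimestamp > $expectedTimestamp );  // false -> cache left alone

var_dump( $rebuildBuggy, $rebuildFixed );
```

With replication lag the slave can only report an equal or older timestamp, so the > comparison never misfires, while != misfires on every lagged read.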
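
And for culprit #2, a sketch of the difference between one REPLACE per row and batched multi-row REPLACE statements. The table and column names are assumptions for illustration only, and real code would need proper escaping:

```lang=php
<?php
// Hypothetical illustration of culprit #2: batching rows into multi-row
// REPLACE statements instead of issuing one statement per row.

/**
 * Build multi-row REPLACE statements from ( key, lang, value ) tuples.
 *
 * @param array $rows      Each element is array( key, lang, value ).
 * @param int   $batchSize Rows per statement.
 * @return string[]        SQL statements, roughly count($rows)/$batchSize of them.
 */
function buildBatchedReplaces( array $rows, $batchSize = 1000 ) {
	$statements = array();
	foreach ( array_chunk( $rows, $batchSize ) as $chunk ) {
		$values = array();
		foreach ( $chunk as $row ) {
			// Values must be escaped/parameterized in real code; omitted here.
			$values[] = "('{$row[0]}', '{$row[1]}', '{$row[2]}')";
		}
		$statements[] = 'REPLACE INTO l10n_cache (lc_key, lc_lang, lc_value) VALUES '
			. implode( ', ', $values );
	}
	return $statements;
}

// A million rows become roughly a thousand statements instead of a million,
// which is far kinder to the binlog and to slave replication.
```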

LU now uses a file-based storage system.
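
For context, here is a minimal sketch of what file-based (CDB) localisation storage looks like, using PHP's dba extension with its cdb handlers; the filename and message keys are illustrative assumptions, not the actual LU on-disk layout:

```lang=php
<?php
// Hypothetical CDB example using PHP's dba extension (requires cdb support).
// Filename and message keys are illustrative, not the real LU layout.

$file = '/tmp/l10n-en.cdb';

// Write phase: 'cdb_make' builds a new constant database in one pass.
$writer = dba_open( $file, 'n', 'cdb_make' );
dba_insert( 'mainpage', 'Main Page', $writer );
dba_insert( 'search', 'Search', $writer );
dba_close( $writer );

// Read phase: lookups are cheap local file reads, with no DB round trip
// and no replication to fall behind.
$reader = dba_open( $file, 'r', 'cdb' );
echo dba_fetch( 'mainpage', $reader ), "\n"; // prints "Main Page"
dba_close( $reader );
```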

This is now believed to be fixed :)

Doing a more conservative progressive production rollout to confirm this...

Yay! System is much happier now :D