Maniphest T19179

No user-visible lag reports when database slave server has stopped replication slave thread
Closed, Resolved (Public)

Assigned To

Authored By

	• brion
	Jan 27 2009, 6:50 PM

Description

Sometimes a slave server stops replicating, for instance due to some transitory funky error:

   Slave_IO_Running: Yes
  Slave_SQL_Running: No
    Replicate_do_db: 
Replicate_ignore_db: 
         Last_errno: 1205
         Last_error: Error 'Lock wait timeout exceeded; Try restarting transaction' on query. Default database: 'enwiki'. Query: 'UPDATE /* HTMLCacheUpdate::invalidateIDs This flag once ... */  `page` SET page_touched = '20090127180707' WHERE (page_id IN ('14890591'))'

In this case, there's no end-user-visible report of lag, but weird things happen such as a failure to show updated information on Special:Contributions.

After restarting the slave thread, we get a nice big warning like this:

Due to high database server lag, changes newer than 2146 seconds might not be shown in this list.

which is neat. It would be nice to have a similar warning if we're pulling from a server that's outright not replicating... it may be difficult to tell how far behind it is in this case, but even a "we're broken" warning would be nice.

Note that the lag report in the API shows an empty string ("") instead of, say, "0" for this case:
http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=dbrepllag&sishowalldb

whereas the 'lagtop' script reports a 0. Lagtop perhaps should be updated to show a visible warning as well if this is detectable.
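To illustrate the distinction being asked for, here is a minimal sketch of how a status row like the one above could be turned into a user-facing warning. It assumes the row from SHOW SLAVE STATUS has already been fetched into a dict, and it relies on MySQL reporting Seconds_Behind_Master as NULL (None here) when the SQL thread is not running. The function names and the "stopped" message text are hypothetical; this is not MediaWiki's actual code.

```python
from typing import Optional


def replication_lag(status: dict) -> Optional[int]:
    """Return lag in seconds, or None when replication is stopped or unknown.

    `status` is assumed to be one row of SHOW SLAVE STATUS as a dict.
    """
    # If either replication thread is down, any lag number is meaningless.
    if status.get("Slave_IO_Running") != "Yes":
        return None
    if status.get("Slave_SQL_Running") != "Yes":
        return None
    behind = status.get("Seconds_Behind_Master")
    if behind is None:
        return None
    return int(behind)


def lag_warning(status: dict) -> Optional[str]:
    """Build a warning string (hypothetical wording), or None if no lag."""
    lag = replication_lag(status)
    if lag is None:
        # Keep "broken" distinct from "0 seconds behind".
        return ("Replication on this database server is stopped; "
                "recent changes might not be shown in this list.")
    if lag > 0:
        return (f"Due to high database server lag, changes newer than "
                f"{lag} seconds might not be shown in this list.")
    return None
```

The point of returning None rather than 0 for the stopped case is that callers such as the API or lagtop can then show an explicit warning instead of an empty string or a misleading zero.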

Version: unspecified
Severity: enhancement

Details

Reference: bz17179

	Subject	Repo	Branch	Lines +/-
	Added pt-heartbeat support to DatabaseMysqlBase	mediawiki/core	master	+64 -12


Related Objects

		Status	Subtype	Assigned	Task
		Declined		None	T3268 Database replication lag issues (tracking)
		Resolved		aaron	T19179 No user-visible lag reports when database slave server has stopped replication slave thread

Event Timeline

• bzimport raised the priority of this task to Low. · Nov 21 2014, 10:28 PM

• bzimport added a project: MediaWiki-libs-Rdbms.

• bzimport set Reference to bz17179.

• bzimport added a subscriber: Unknown Object (MLST).

• brion created this task. · Jan 27 2009, 6:50 PM

Change 241133 had a related patch set uploaded (by Aaron Schulz):
Added pt-heartbeat support to DatabaseMysqlBase

https://gerrit.wikimedia.org/r/241133

Restricted Application added a subscriber: Aklapper. · Oct 2 2015, 12:20 AM

gerritbot added a project: Patch-For-Review. · Oct 2 2015, 12:20 AM

Change 241133 merged by jenkins-bot:
Added pt-heartbeat support to DatabaseMysqlBase

https://gerrit.wikimedia.org/r/241133

ori mentioned this in rMWff34e3d464be: Added pt-heartbeat support to DatabaseMysqlBase. · Oct 2 2015, 6:44 PM

ReleaseTaggerBot added projects: MW-1.27-release-notes, MW-1.27-release (WMF-deploy-2015-10-06_(1.27.0-wmf.2)). · Oct 2 2015, 7:00 PM

I would consider this resolved thanks to the performance team's patches regarding pt-heartbeat (shown above).
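For readers unfamiliar with why pt-heartbeat addresses the original report: the tool writes a timestamp on the master at a regular interval, and lag is measured on the replica as the current time minus the last replicated timestamp. If the SQL thread stops, the timestamp stops advancing and the computed lag keeps growing, instead of reading as empty or 0. A rough sketch of that arithmetic follows; the function name is hypothetical, the ISO-style timestamp format and the shared-timezone assumption are mine, and the merged patch may do this differently.

```python
from datetime import datetime


def heartbeat_lag_seconds(heartbeat_ts: str, now: datetime) -> float:
    """Lag = current time minus the last heartbeat timestamp seen on the replica.

    Assumes `heartbeat_ts` is an ISO-style string such as
    "2015-10-02T00:00:00.000000" and that `now` is a naive datetime
    in the same timezone as the heartbeat writer.
    """
    written = datetime.fromisoformat(heartbeat_ts)
    # Clamp at zero so small clock skew does not yield negative lag.
    return max(0.0, (now - written).total_seconds())
```

For example, a heartbeat written at 00:00:00 and read at 00:35:46 gives a lag of 2146 seconds, whether or not the replication thread is still running.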

• MZMcBride subscribed.Dec 23 2016, 12:39 AM
