
populateCognatePages.php query keeps timing out while waiting for replication
Open, High, Public

Description

This script is reading from yuewiktionary and writing to extension1.wiktionary_cognate.

Could something be wrong with the abstraction that waits for replication lag when two different clusters are involved?
Or is there actual replag?

addshore@mwmaint1002:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki yuewiktionary
Started processing.
1 rows processed.
Pass finished.
[Tue Jan 22 15:40:54 2019] [hphp] [93233:7f0aad4fc3c0:0:000001] [] SlowTimer [60000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT('0-171966669-4075108480,171966669-171966669-3589226378,171974792-171974792-120056328,180359174-180359174-94123433,180363367-180363367-134174373', 60)
0 rows processed.
Pass finished.
[Tue Jan 22 15:41:54 2019] [hphp] [93233:7f0aad4fc3c0:0:000002] [] SlowTimer [60000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT('0-171966669-4075108480,171966669-171966669-3589226378,171974792-171974792-120058635,180359174-180359174-94123433,180363367-180363367-134174373', 60)
0 rows processed.
Pass finished.
♥[Tue Jan 22 15:42:02 2019] [hphp] [93234:7f0aad4fc3c0:0:000001] [] Lost parent, LightProcess exiting
[Tue Jan 22 15:42:02 2019] [hphp] [93235:7f0aad4fc3c0:0:000001] [] Lost parent, LightProcess exiting

This script has worked in the past with no problem.

The code running the query that times out is:

			$loadBalancerFactory->waitForReplication();
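
To make the logged queries easier to read, here is a small Python sketch. It is a toy model only, not MariaDB's or MediaWiki's implementation and not a diagnosis. It parses the two GTID lists from the log above and shows which replication domain changed between them. It then models, under the assumption that the wait only succeeds once the replica has reached the target in every listed domain, how a wait with a timeout returns -1 when a replica never advances in one of those domains. The function names and the `replica` position are hypothetical.

```python
import time


def parse_gtid_list(text):
    """Parse 'domain-server-seq,...' into {domain_id: seq_no}."""
    position = {}
    for gtid in text.split(","):
        domain, _server, seq = gtid.strip().split("-")
        position[int(domain)] = int(seq)
    return position


def has_reached(replica_pos, target_pos):
    """True once the replica is at or past the target in every target domain."""
    return all(replica_pos.get(d, -1) >= seq for d, seq in target_pos.items())


def toy_gtid_wait(get_replica_pos, target, timeout, poll=0.01):
    """Toy stand-in for the wait: returns 0 when caught up, -1 on timeout."""
    target_pos = parse_gtid_list(target)
    deadline = time.monotonic() + timeout
    while True:
        if has_reached(get_replica_pos(), target_pos):
            return 0
        if time.monotonic() >= deadline:
            return -1
        time.sleep(poll)


# The two GTID lists copied from the slow-query log lines above.
first = (
    "0-171966669-4075108480,171966669-171966669-3589226378,"
    "171974792-171974792-120056328,180359174-180359174-94123433,"
    "180363367-180363367-134174373"
)
second = (
    "0-171966669-4075108480,171966669-171966669-3589226378,"
    "171974792-171974792-120058635,180359174-180359174-94123433,"
    "180363367-180363367-134174373"
)

first_pos = parse_gtid_list(first)
second_pos = parse_gtid_list(second)
changed = sorted(d for d, seq in second_pos.items() if first_pos[d] != seq)
print(changed)  # [171974792]

# Hypothetical replica that only ever advances in one of the five domains.
replica = {171974792: 120058635}
print(toy_gtid_wait(lambda: replica, second, timeout=0.05))  # -1
```

Only the sequence number in domain 171974792 differs between the two logged queries. Whether the real replicas lag in any of the other domains is not something the log shows.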

I last ran this in Feb 2018 with no such issues: https://tools.wmflabs.org/sal/log/AWGZVMQuL7tQ11ghvPqn

Event Timeline

Addshore renamed this task from populateCognatePages.php query keeps timing out while waiting for replecation to populateCognatePages.php query keeps timing out while waiting for replication. Jan 22 2019, 3:44 PM

Marking as High: until this works again, sitelinks to yuewiktionary from other wiktionaries cannot work through Cognate.

Marostegui subscribed.

There is really not much we (DBAs) can do about this particular issue other than T172497. See also T203059#4896539.

As the MediaWiki-libs-Rdbms tag has been removed from this I will make T172497 a blocker of this task.
There is nothing to fix in Cognate AFAIK; this is only an issue with the MediaWiki DB abstraction.

Michael subscribed.

I'm moving this to "Needs investigation" to determine whether solving the parent task (T172497) is still the right way to address this issue, in which case this task should be marked as stalled, or whether a different solution should be found given the discussion on T172497 since 2019.