
Test the impact on wdqs updater performance of disabling values cleanup
Closed, Resolved (Public)

Description

Lag on the updater is rising:

wdqs_lag.png (780×1 px, 146 KB)

References cleanup has already been disabled to investigate another issue, without noticeable negative impact. We should try disabling the values cleanup as well, to see if it helps the updater catch up on the lag.

Event Timeline

Change 585443 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] Disable values cleanup

https://gerrit.wikimedia.org/r/585443

Change 585443 merged by jenkins-bot:
[wikidata/query/rdf@master] Disable values cleanup

https://gerrit.wikimedia.org/r/585443

Mentioned in SAL (#wikimedia-operations) [2020-04-03T12:44:58Z] <dcausse@deploy1001> Started deploy [wdqs/wdqs@23495ae]: deploying wdqs 0.3.17 to wdqs1007: testing T249196

Mentioned in SAL (#wikimedia-operations) [2020-04-03T12:45:41Z] <dcausse@deploy1001> Finished deploy [wdqs/wdqs@23495ae]: deploying wdqs 0.3.17 to wdqs1007: testing T249196 (duration: 00m 43s)

Mentioned in SAL (#wikimedia-operations) [2020-04-07T07:40:16Z] <dcausse@deploy1001> Started deploy [wdqs/wdqs@23495ae]: deploying wdqs 0.3.17 to wdqs2002: T249196

Mentioned in SAL (#wikimedia-operations) [2020-04-07T07:41:44Z] <dcausse@deploy1001> Finished deploy [wdqs/wdqs@23495ae]: deploying wdqs 0.3.17 to wdqs2002: T249196 (duration: 01m 28s)

The lag on wdqs1007 has been absorbed much faster than on the other eqiad nodes.

lag_wdqs1007.png (202×888 px, 53 KB)

Comparing codfw nodes, the lag is a bit better on the node where the cleanup has been removed (wdqs2002, cleanup removed, vs wdqs2003, cleanup enabled), but the impact is not particularly impressive: the lag on wdqs2002 still climbs, relying on back-pressure propagation through maxLag (sawtooth shape).

lag_wdqs2003.png (169×852 px, 54 KB)

Overall, I think this cleanup does not bring much to the system: while this change was enabled for more than 3 days, I didn't see a dramatic climb in the number of triples.
I think opting for more frequent reloads is a much more efficient approach.