
High load on tools labsdb1005
Closed, Resolved (Public)

Description

For the past few days (~7) there has been increased activity on the tools databases (labsdb1005 and labsdb1004), with a higher load average and saturated IOPS on the MySQL partition.
As a result of the increased activity on the master (labsdb1005), the slave labsdb1004 has started to fall behind in replication.

I'll analyse them tomorrow (EU time) to see what is causing the increased load.

Event Timeline

# Top CPU-consuming users during last ~5 hours
s51434: 39973
s51999: 27398
s51059: 14519
s52421:  8331
s51230:  4784
s51211:  4544
s51512:  2869

# Top user-schema writes during last week:
s51211__duplicity_p: 42552582
s51059__cyberbot:    40793948
s51412__data:        27465989
s52946__gns_p:       20450267
s51203__baglama2_p:  17642874
s51138__heritage_p:  14439807
s51230__linkwatcher:  2881069
s52481__scratch_en:   1999084
s52953__librarybase:  1573466
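
To make a tally like the ones above easier to compare, the counts can be ranked and expressed as a share of the listed total. This is only a minimal sketch: the helper name `rank_with_share` is hypothetical, the input is a subset of the figures quoted above, and the percentages are relative to the listed schemas only, not to all writes on the server.

```python
def rank_with_share(tallies):
    """Return (name, count, percent_of_listed_total), largest count first."""
    total = sum(tallies.values())
    if total == 0:
        return []
    ranked = sorted(tallies.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, count, 100.0 * count / total) for name, count in ranked]


# Subset of the weekly per-schema write tallies quoted above.
writes = {
    "s51211__duplicity_p": 42552582,
    "s51059__cyberbot": 40793948,
    "s51412__data": 27465989,
}

for name, count, pct in rank_with_share(writes):
    print(f"{name:<22} {count:>10} {pct:5.1f}%")
```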

After the cyberbot tool was throttled in T131937, CPU usage dropped and the load went down from ~28 to ~13.
Monitoring labsdb1004 to ensure it catches up on replication and stays in sync.
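
A check of this kind could be sketched as follows, assuming one row of `SHOW SLAVE STATUS` has already been fetched into a dict by whatever client is in use. The function name `replica_state` and the 60-second threshold are hypothetical; the column names `Slave_IO_Running`, `Slave_SQL_Running` and `Seconds_Behind_Master` are the standard ones in that output. This is not necessarily how the monitoring was done here.

```python
def replica_state(status, max_lag_seconds=60):
    """Classify one SHOW SLAVE STATUS row (dict of column name -> value).

    max_lag_seconds is an assumed, illustrative threshold.
    """
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    lag = status.get("Seconds_Behind_Master")

    # Seconds_Behind_Master is NULL (None) when replication is not running.
    if not (io_ok and sql_ok) or lag is None:
        return "broken"
    if int(lag) > max_lag_seconds:
        return "lagging"
    return "in_sync"


# Hand-written example rows, not real output from labsdb1004.
print(replica_state({"Slave_IO_Running": "Yes",
                     "Slave_SQL_Running": "Yes",
                     "Seconds_Behind_Master": 5400}))
print(replica_state({"Slave_IO_Running": "Yes",
                     "Slave_SQL_Running": "Yes",
                     "Seconds_Behind_Master": 0}))
```

Polling this at intervals and watching the lag value shrink is one way to confirm that the replica is catching up rather than merely still running.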

labsdb1004 replica back in sync, load on labsdb1005 under control.