s52721__pagecount_stats_p import is making labsdb1005 100% utilized (and lagging its backup slave)
Closed, Resolved · Public

Description

This is not an issue if it is a temporary import, but it is one if it is intended to be a continuous process.

Either provide those on a separate instance or add a pause every X amount of time.

Event Timeline

jcrespo created this task. Dec 9 2015, 11:16 AM
jcrespo raised the priority of this task from to Needs Triage.
jcrespo updated the task description.
jcrespo added projects: Tools, Cloud-VPS.
jcrespo added subscribers: jcrespo, Stigmj.
Restricted Application added a project: Cloud-Services. Dec 9 2015, 11:16 AM
Restricted Application added subscribers: StudiesWorld, Aklapper.
Stigmj added a comment. Dec 9 2015, 2:10 PM

There are six temporary import tasks running right now, processing 6 months of pagecount data (they are at the very end now). There is also a cron job that runs every hour, checks for new pagecount files, and imports them if they exist.

Stigmj added a comment. Dec 9 2015, 2:18 PM

I could insert a sleep in some places, but do you have a recommendation for how long it should be?

@Stigmj tool-labs database management is based on the assumption that everyone behaves responsibly. 50% of the time importing and 50% sleeping is what I would recommend. Also, take a lock for each import so that you only use 1 thread at a time (that way you let other users use the available resources).
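The "50% importing, 50% sleeping" recommendation can be sketched as a simple duty-cycle loop: time each import batch, then sleep for a proportional amount. This is only an illustrative sketch, not the script used in this task. The names `run_with_duty_cycle`, `batches`, and `import_batch` are hypothetical, and the `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.

```python
import time


def run_with_duty_cycle(batches, import_batch, duty=0.5,
                        sleep=time.sleep, clock=time.monotonic):
    """Run import_batch on each batch, pausing so that only a `duty`
    fraction of wall-clock time is spent importing.

    All names here are hypothetical; `import_batch` stands in for
    whatever function loads one pagecount file into the database.
    """
    if not 0 < duty <= 1:
        raise ValueError("duty must be in the range (0, 1]")
    pauses = []
    for batch in batches:
        start = clock()
        import_batch(batch)
        busy = clock() - start
        # With duty=0.5 the pause equals the time spent importing.
        pause = busy * (1 - duty) / duty
        sleep(pause)
        pauses.append(pause)
    return pauses
```

With `duty=0.5`, an import that takes 2 minutes is followed by a 2-minute pause, which leaves the server idle long enough for replication to catch up between batches.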

I have ways to enforce that, but I would prefer to make it the users' responsibility first. To give you an idea of the impact, this is the current CPU usage:

(even though MySQL is not very CPU-hungry and mostly blocks on IOPS)

And its slave, with no load other than replication, is 7 hours behind:

Replication suffers from imports. If needed, we could import separately on the master and the slave.

I will trust that the rate will drop soon, and will give you feedback otherwise.

I have not mentioned it yet, but if this requires specific resources, because it could be useful to more than 1 person and is considered an important contribution, we could try to get separate resources (I cannot guarantee it, of course) so that it does not disrupt other tools' work.

Stigmj added a comment. Dec 9 2015, 5:16 PM

I have put in some random sleeps (between 60 and 600 seconds) between calls to my import script and implemented a lockfile mechanism so that only one instance is active at a time. Hopefully this will ease the load on the DB servers.
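For readers who want to do something similar, here is a minimal sketch of a lockfile plus random sleep, assuming a Unix host where `fcntl.flock` is available. It is not the actual import script from this task: `single_instance`, `random_pause`, and the lock path are hypothetical names chosen for illustration.

```python
import fcntl
import os
import random
import time
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this process obtained the lock, False if another
    instance already holds it. The lock is released on exit."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # Non-blocking exclusive lock: fail at once if already held.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def random_pause(low=60, high=600, sleep=time.sleep):
    """Sleep for a random number of seconds between low and high."""
    delay = random.uniform(low, high)
    sleep(delay)
    return delay
```

A cron-driven wrapper could then look like this (again hypothetical): enter `with single_instance("/path/to/import.lock") as acquired:`, exit immediately if `acquired` is false, and otherwise call the import for each new file with `random_pause()` between calls. Because the kernel releases an `flock` lock when the process exits, a crashed import does not leave a stale lock behind.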

This seems to have worked. Lag is in the 0-10 range, which is acceptable. There is still high CPU usage on labsdb1005, but I think this is now due to other users.

Thank you very much for your collaboration.

jcrespo closed this task as Resolved. Dec 10 2015, 10:35 AM
jcrespo claimed this task.