
Investigate jobs taking too long to complete in maps1001.eqiad
Closed, Resolved (Public)

Description

Some jobs are taking more than 24 hours to complete; see the following image:

image.png (492×1 px, 53 KB)

The Grafana dashboard also shows that the z15 jobs are always active, so tile generation never stops.

Event Timeline

MSantos triaged this task as High priority. Sep 28 2018, 6:14 PM

Maybe we should back off a bit and run z10-z15 once every two days?

That could be a good start, but we still want to increase the OSM replication frequency, so T137939: Increase frequency of OSM replication is still worth investigating.

Change 469329 had a related patch set uploaded (by MSantos; owner: MSantos):
[operations/puppet@production] Decrease OSM update Frequency

https://gerrit.wikimedia.org/r/469329

Change 469329 merged by Alexandros Kosiaris:
[operations/puppet@production] Decrease OSM update Frequency

https://gerrit.wikimedia.org/r/469329

After the OS upgrade, tile generation is showing normal speed. You can compare the last 7 days with the period when the ticket was created:

  • When the issue was reported
    • The graphs show that z15 never stopped being processed
  • Last 7 days
    • Tile generation now stops at some point before starting again on the next iteration

After OS upgrade, tile generation is showing normal speed.

Awesome!