Investigate why this happened and propose a viable fix, since the problem is not seen on the eqiad maps servers.
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
maps: disable replication cron | operations/puppet | production | +2 -2
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Gehel | T224395 Maps[12]004 /srv disk space is critical
Resolved | | Mathew.onipe | T224874 Maps2004 ran into disk space issues again after reimaging with new partitioning scheme
Event Timeline
A closer look at the Postgres database on maps2004 traced the problem to the gis.planet_osm_line table, which is larger than its counterpart in eqiad (maps1004).
We are still investigating why this is so.
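A comparison like the one described above could be sketched as follows. This is a minimal illustration, not the procedure actually used on the task: the query relies on the standard PostgreSQL `pg_statio_user_tables` view and `pg_total_relation_size()` function, while the helper name `oversized_tables`, the 1.5 ratio threshold, and the idea of collecting the sizes into dictionaries are assumptions made for the example.

```python
# Query that could be run on each host (e.g. via psql) to list table sizes.
# It uses only standard PostgreSQL catalog views and functions.
TABLE_SIZE_SQL = """
SELECT relname, pg_total_relation_size(relid) AS total_bytes
FROM pg_catalog.pg_statio_user_tables
ORDER BY total_bytes DESC;
"""


def oversized_tables(suspect, reference, ratio=1.5):
    """Flag tables that are much larger on the suspect host.

    `suspect` and `reference` map table name -> size in bytes, as collected
    with TABLE_SIZE_SQL on each host. `ratio` is a hypothetical threshold.
    Returns (table, suspect_bytes, reference_bytes) tuples, worst first.
    """
    flagged = []
    for table, size in suspect.items():
        ref = reference.get(table)
        if not ref:
            # Table missing (or empty) on the reference host: cannot compare.
            continue
        if size > ratio * ref:
            flagged.append((table, size, ref))
    # Sort by how many times larger the suspect table is.
    return sorted(flagged, key=lambda row: row[1] / row[2], reverse=True)
```

Feeding the function the sizes gathered on maps2004 and maps1004 would list the tables that diverge most between the two hosts, which is a quick way to confirm whether gis.planet_osm_line is the only outlier.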
maps2004 kept logs of the osm2pgsql script for the last 5 days; every run failed during replication because of disk space. It seems that at some point osm2pgsql failed to roll back and did not clean up some records, causing some tables to grow indefinitely. The logs, however, are inconclusive.
Next steps:
- fix the DB with the data available in eqiad (see https://wikitech.wikimedia.org/wiki/Maps#Postgres), or recreate the OSM database with the initial import script
- regenerate tiles from the last osm2pgsql attempt; the list is available at /srv/osm_expire/expire.list.201905282106
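Before regenerating tiles from the expire list, it may help to check how many tiles it covers. The sketch below assumes the file uses the `z/x/y` one-tile-per-line format that osm2pgsql writes for its expire output; the function names are hypothetical, and the actual tile regeneration step is not shown.

```python
def parse_expire_list(lines):
    """Parse `z/x/y` lines into a sorted, de-duplicated list of (z, x, y).

    Assumes one tile per line in `z/x/y` form; blank lines are skipped and
    malformed lines raise ValueError.
    """
    tiles = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split("/")
        if len(parts) != 3:
            raise ValueError("unexpected expire list line: %r" % line)
        z, x, y = (int(p) for p in parts)
        tiles.add((z, x, y))
    return sorted(tiles)


def count_by_zoom(tiles):
    """Return a dict mapping zoom level -> number of tiles at that zoom."""
    counts = {}
    for z, _, _ in tiles:
        counts[z] = counts.get(z, 0) + 1
    return counts
```

For example, opening /srv/osm_expire/expire.list.201905282106 and passing the file object to `parse_expire_list` would give the unique tiles, and `count_by_zoom` would show how much work the regeneration involves per zoom level.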
Change 514090 had a related patch set uploaded (by Mathew.onipe; owner: Mathew.onipe):
[operations/puppet@production] maps: disable replication cron
Change 514090 merged by Gehel:
[operations/puppet@production] maps: disable replication cron
This was traced to problems during osm-initial-script and was resolved by reinitializing OSM.