
tiles.wmflabs.org OSM is outdated
Open, Medium, Public, BUG REPORT

Description

I was doing some maintenance on tiles.wmflabs and I noticed that the osmdb for labs currently doesn't seem to be in sync?

For instance, this tile: https://tiles.wmflabs.org/osm/18/136090/86311.png is freshly generated, yet it shows structures that no longer exist and have been removed from OSM; those structures no longer show in the production version of the tile: https://maps.wikimedia.org/osm-intl/18/136090/86311@2x.png

I know a lot of maps work has been happening, but I'm not entirely sure what is going on exactly, and whether this is part of previous problems, due to the current work, or completely unexpected...

Event Timeline

ping @MSantos, as you are the only one I know who might have some insight into this.

For days I also get a lot of missing tiles (404) especially in higher zoom levels.

If it is no longer maintained, is there a WMF alternative?
https://maps.wikimedia.org/osm-intl has a more conservative visual appearance, and tile servers from OpenStreetMap itself should be avoided for policy reasons...

For days I also get a lot of missing tiles (404) especially in higher zoom levels.

That is T285145. I deleted lots of stuff. The problem is that the current server can't generate tiles quick enough when you go to an area that it doesn't have the tiles for yet, without throwing 404s. Before it would just show you 3 year old tiles.

Majavah added a subscriber: Majavah.

I'm not sure how those are maintained, but as far as I'm aware this is not a Toolforge service.

Majavah renamed this task from Outdated tool forge maps to tiles.wmflabs.org OSM is outdated.Aug 7 2021, 8:26 AM
Majavah removed a subscriber: Majavah.

@Majavah I didn't know we kept separate tags for the DB replicas of Toolforge. Do you know what the project tag for that is?

@Majavah I didn't know we kept separate tags for the DB replicas of Toolforge. Do you know what the project tag for that is?

I am unfortunately not sure. This service seems to live in the maps Cloud VPS project, which is completely separate from Toolforge (tools Cloud VPS project).

No, that's just a specific rendering. This is about the Postgres server with the OSM database, used by both VPS and Toolforge projects. https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#Connecting_to_OSM_via_the_official_CLI_PostgreSQL

I mean, it's infra that no one wants to be responsible for, I get it, but we gotta put some sort of tag on it...

No, that's just a specific rendering. This is about the Postgres server with the OSM database, used by both VPS and Toolforge projects. https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#Connecting_to_OSM_via_the_official_CLI_PostgreSQL

taavi@tools-sgebastion-07:~ $ host osmdb.eqiad.wmnet
osmdb.eqiad.wmnet is an alias for osm.db.svc.eqiad.wmflabs.
osm.db.svc.eqiad.wmflabs has address 172.16.6.105
taavi@tools-sgebastion-07:~ $ host 172.16.6.105
105.6.16.172.in-addr.arpa domain name pointer clouddb1003.clouddb-services.eqiad1.wikimedia.cloud.

That server is in clouddb-services. That's technically cloud-services-team territory, maybe they have an idea what's going on here?

nskaggs triaged this task as Medium priority.Aug 27 2021, 1:27 PM
nskaggs moved this task from Inbox to Needs discussion on the cloud-services-team (Kanban) board.

The OSM database on clouddb1003.clouddb-services.eqiad1.wikimedia.cloud constantly runs out of available connections, which would likely stop automated things from happening to it. I was curious whether it was actually in any kind of functional use, since I'd been seeing issues for ages due to the way people connect to it (and just restarted the database here and there to clear it). The problem easily returns: any misconfigured tool that connects will keep this happening, since the database has no auth for read access inside cloud (which should not be true of any database). If that is part of the system here, I can start by increasing the number of connections from the default. That would be a new requirement, which makes me wonder what is broken.
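For anyone following along, checking connection pressure on a PostgreSQL server like this is straightforward. A minimal dry-run sketch (the `run` wrapper only prints the commands, so nothing here touches a real database; the config-line parser is just a toy helper):

```shell
# Dry-run: print commands instead of executing them.
run() { echo "+ $*"; }

# On the real host one would compare the limit against current usage:
run sudo -u postgres psql -c 'SHOW max_connections;'
run sudo -u postgres psql -c 'SELECT count(*) FROM pg_stat_activity;'

# Toy helper: pull the number out of a postgresql.conf-style line.
parse_max_connections() {
    printf '%s\n' "$1" | sed -n 's/^max_connections[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

If `count(*)` sits at or near `max_connections` whenever the updater fails, the limit (or a connection-leaking client) is the culprit.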

Ok, that's not what's up here. The database is working, but replication isn't, because it doesn't have a state file. I've downloaded that and am manually trying a replication job. I'm going to need to disable the cron, though.

Mentioned in SAL (#wikimedia-cloud) [2021-08-30T21:53:52Z] <bstorm> disable puppet and osm updater script on clouddb1003 T285668

Ok, so far, the script didn't explode like it clearly did in the logs, so that's progress.

Never mind. It just did.
Osm2pgsql failed due to ERROR: Connection to database failed: FATAL: remaining connection slots are reserved for non-replication superuser connections

That's what I've been seeing.

Mentioned in SAL (#wikimedia-cloud) [2021-08-30T22:07:58Z] <bstorm> restarting osmdb on clouddb1003 to try to capture enough connections T285668

The puppetization is not flexible in this area, so I'm trying brute force first and restarting the DB.

Nope, it instantly runs out of available connections.

That's not because of external connections to the db, that's for sure....

root@clouddb1003:~# netstat -npt | grep 5432 | awk '{print $5}' | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | cut -d: -f1 | sort | uniq -c | sort -nr | head

12 172.16.5.154
 1 172.16.6.106

Ok, it works if I set the number of processes lower :) I'll set something in the puppetization.

That said, I find the lack of actual work it did concerning. It's using an internal web proxy that may not be valid for inside Cloud VPS...
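The parallelism knob in question is presumably osm2pgsql's worker-process flag; lowering it keeps the number of simultaneous DB connections under the server's limit. A dry-run sketch (the `run` wrapper only prints the command; the flag value, database name, and change file are illustrative assumptions):

```shell
# Dry-run: print commands instead of executing them.
run() { echo "+ $*"; }

# Fewer worker processes means fewer simultaneous DB connections
# competing for the server's connection slots.
run osm2pgsql --append --slim --number-processes 2 -d gis changes.osc.gz
```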

Before I go and try to commit this change to puppet, @TheDJ have I fixed what I needed to on OSMDB? I'll work up a patch now that the script can at least run without obvious errors. I suspect it needs more work to be doing the right thing.

I'm thinking it's the state file I downloaded. Digging in a bit.

I have a feeling that when this was moved to VMs and set up, at some point replication broke from OSM and now it needs to be brought up to date in a more heavy handed way. Possibly akin to T254014. It could just be a matter of picking an older state file, but at this point, this isn't stuff I've kicked much. This database is weird in that it has a couple of custom databases in it, so whatever we do, we shouldn't drop everything on the server...just the osm bits. That should not matter to the actual OSM database since they are separate.
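For context, the replication state file in question is a small key/value text file. The format below is the real osmosis-style layout (colons in the timestamp are backslash-escaped), though the values are invented; picking an "older state file" means picking one whose timestamp predates the break. A tiny parser for that timestamp:

```shell
# A replication state.txt looks roughly like:
#   sequenceNumber=3549000
#   timestamp=2019-02-22T00\:00\:00Z
# (sequence number and date here are made up for illustration)

# Extract the timestamp, dropping the backslash escapes.
parse_state_timestamp() {
    sed -n 's/^timestamp=//p' "$1" | tr -d '\\'
}
```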

@MSantos any ideas or assistance would be appreciated. I don't think WMCS has ever had much idea how to operate this thing since it wasn't well documented when I tried to save it from certain death.

I'll kick the puppet setup so that I can persist the change to the number of threads, since that clears up *one* error.

Change 715623 had a related patch set uploaded (by Bstorm; author: Bstorm):

[operations/puppet@production] cloud osmdb: set num_threads in the sync job

https://gerrit.wikimedia.org/r/715623

Change 715624 had a related patch set uploaded (by Bstorm; author: Bstorm):

[operations/puppet@production] cloud osmdb: don't use proxy for cloud

https://gerrit.wikimedia.org/r/715624

Yes, likely we will at the very least need to set a state file from before we got out of sync... Determining when that was exactly is probably going to be the hard part, however... I think it was at least about a year or so ago....

I'm thinking about whether there are ways I can identify when.....
And thank you Brooke, for helping out.

Change 715623 merged by Bstorm:

[operations/puppet@production] cloud osmdb: set num_threads in the sync job

https://gerrit.wikimedia.org/r/715623

Change 715624 merged by Bstorm:

[operations/puppet@production] cloud osmdb: don't use proxy for cloud

https://gerrit.wikimedia.org/r/715624

@TheDJ Well, I created the instance on Feb. 22, 2019. If we presume it never was replicating in this setup (which seems a safe guess), I think that's a pretty good day to start from.

Mentioned in SAL (#wikimedia-cloud) [2021-08-31T20:19:14Z] <bstorm> attempting to resync OSMDB back to Feb 21st 2019 T285668

Looking better now: Processing: Node(240k 0.6k/s) Way(0k 0.00k/s) Relation(0 0.00/s)

It's gonna be a while. I've disabled the cron and puppet. It's running in a screen session.
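For reference, the manual steps described here (disable puppet and the cron, run the resync in a screen session) look roughly like this dry-run sketch. The updater script path is an assumption, and the `run` wrapper only prints the commands:

```shell
# Dry-run: print commands instead of executing them.
run() { echo "+ $*"; }

# Stop puppet from re-enabling the cron mid-resync:
run puppet agent --disable 'manual OSM resync, T285668'
# Comment out the osm updater entry so it doesn't race the manual run:
run crontab -l
# Run the long resync detached, so it survives the SSH session:
run screen -dmS osm-resync /usr/local/bin/osm-updater.sh  # path assumed
```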

That did an awful lot. The output is not easily capturable at this point because of how much it dumped to the screen.

The weirdest thing I see is:
Writing dirty tile list (4954K)
node cache: stored: 2743934(100.00%), storage efficiency: 81.38% (dense blocks: 291, sparse nodes: 493857), hit rate: 3.52%

The state file is now claiming things are up to date. I'll re-enable the cron and puppet. @TheDJ do you have a way to check if things are working as expected now? There are old parts of this puppet code that clearly do not mesh with reality still, so it is worth checking.

Since there was a discussion of project tags up there in the back scroll: the correct tag for the OSMDB is Data-Services, moved to the column "maps". Data-Services is also the correct tag for the wikireplicas themselves used in Toolforge, though most tickets are about usage, not the servers themselves, so it's probably ok that nobody really uses that tag except me and a few other people. It's not well documented.

I wonder if the database servers for this could be moved into the maps project so people would have access to run these things who have access to administer the maps system. They are currently in clouddb-services where maps admins cannot access them, so you have to wait for me to fumble around with them :)

Was not able to confirm this yet, will have to look tomorrow.

I've not been able to get it to draw a tile that looks like a current tile. I'm not sure why. I don't have the experience to deal with problems like these, unfortunately. Maybe we need to ask wikitech-l?

I've not been able to get it to draw a tile that looks like a current tile. I'm not sure why. I don't have the experience to deal with problems like these, unfortunately. Maybe we need to ask wikitech-l?

That seems fair. The cloud database server is basically using the production WMF puppet code, but I have very little idea how it all works or about openstreetmaps in general. I'm happy I got something that was definitely broken to stop being broken, but that only buys so much :)

Also, I can grant access to the servers for any WMF maps people, or I could also start working on migrating this to the maps project instead where anyone with maps project access can get in and kick around the database themselves. I don't think WMCS provides much benefit in administering these databases separately since we don't have maps people and had no idea the database was so badly out of sync in the first place. I don't actually know who's doing maps work these days for the foundation, but I would very much like to get help from them.

Mentioned in SAL (#wikimedia-cloud) [2021-09-02T18:52:14Z] <bstorm> removed strange old duplicate cron for osmupdater T285668

@akosiaris I remember asking you about this setup in the past. Do you have any thoughts on what might be wrong here?

I also just found that we are actually missing the coastlines table and the land_polygons table from the original setup, it seems. I can try to recreate those by downloading the dumps mentioned in the puppet setup.

Change 716543 had a related patch set uploaded (by Bstorm; author: Bstorm):

[operations/puppet@production] cloud osmdb: update the filenames in case we re-import the shapefiles

https://gerrit.wikimedia.org/r/716543

@akosiaris I remember asking you about this setup in the past. Do you have any thoughts on what might be wrong here?

Hi! I am quite a bit rusty on this, plus tools have changed since I last messed with this. In my experience the sanest (albeit not fastest) way out of an out-of-sync database was to a) back up the couple of extra user databases, b) drop the database, c) re-import it from scratch using osm2pgsql (assuming that is still the tool being used), which would usually take a day or two, and d) restore the extra user databases dumped in step a). For the duration of all of this, puppet was best left disabled. From that point on, I have no real experience with the maps WMCS infrastructure. I think (!) tiles are generated by apache's mod_tile module, but by now I am grasping at straws.
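The a)-d) procedure above, transcribed as a dry-run shell sketch. The database and file names are assumptions, not the real ones on clouddb1003, and the `run` wrapper only prints each command:

```shell
# Dry-run: print commands instead of executing them.
run() { echo "+ $*"; }

# a) back up the extra user databases
run sudo -u postgres pg_dump -Fc user_db -f user_db.dump
# b) drop the out-of-sync OSM database
run sudo -u postgres dropdb gis
# c) re-import from scratch with osm2pgsql (usually a day or two)
run sudo -u postgres createdb gis
run sudo -u postgres osm2pgsql --create --slim -d gis planet-latest.osm.pbf
# d) restore the user databases backed up in a)
run sudo -u postgres pg_restore --create -d postgres user_db.dump
```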

I also just found that we are actually missing the coastlines table and the land_polygons table from the original setup, it seems. I can try to recreate those by downloading the dumps mentioned in the puppet setup.

That would be the way those are populated. If the files are around, puppet would try and create those tables.

I don't actually know who's doing maps work these days for the foundation, but I would very much like to get help from them.

Maps work for the foundation is being done by the Web and Infrastructure Product department's team. However, I don't think they ever touched the setup in WMCS; as far as I know they are now using different tools, and thus I would not dare to presume they will be able to help much.

In fact, I don't think the foundation ever had a member who had delved beyond the osm syncing step for the maps WMCS project. I don't even know of a frequent channel of communication between the foundation's present and past maps teams and the volunteers running the maps WMCS project.

I was wrong about coastlines and land_polygons, btw. The tables are there. I don't know if they need updating, but they are there. The sync process seems to be working now, but I don't really know how to check. I could do a resync from scratch (from what I can tell, all the tooling is exactly the same as it always was), though I am not sure whether that will help the situation or not.

Does anyone on the task know how to tell if the database is the problem (at this point now that I've theoretically maybe caught it up since it stopped syncing) or if some other bit is?

I was wrong about coastlines and land_polygons, btw. The tables are there. I don't know if they need updating, but they are there.

Coastlines change very slowly so they very rarely need updating. I am not sure about land polygons though. I would expect them to also need updating on the order of years, so definitely not often.

The sync process seems to be working now, but I don't really know how to check.

There's modules/osm/files/osm_sync_lag.sh that should spit out the lag of the server. If that doesn't error out or spit out some nonsensical value, then the DB should be properly synced. It quite possibly is already set up via puppet, and prometheus is scraping it regularly.
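I haven't checked what osm_sync_lag.sh actually does, but conceptually "lag" here is just the distance between the replication state timestamp and now. A toy version of that calculation (the second argument stands in for "now" so it can be exercised offline; requires GNU date):

```shell
# Seconds of replication lag between a state timestamp and a reference
# time, both in ISO 8601 / state.txt format.
state_lag_seconds() {
    echo $(( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ))
}
```

A healthy minutely-replication setup would keep this in the tens or hundreds of seconds, not years.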

Does anyone on the task know how to tell if the database is the problem (at this point now that I've theoretically maybe caught it up since it stopped syncing) or if some other bit is?

Maybe @Kolossos or @dschwen could help with that.

I am late to the party, but tiles.wmflabs.org is not part of the official Maps infrastructure. I'm not sure how its infrastructure works and I have limited time to help you set it up currently (maybe next quarter), but @Bstorm, if you need to sync and want to know what we have available in terms of maps OSM tooling, please ping me for a chat.

We now use imposm3 to import data and it would be interesting to see this project do more fancy stuff, like using OpenMapTiles.

I am late to the party, but tiles.wmflabs.org is not part of the official Maps infrastructure. I'm not sure how its infrastructure works and I have limited time to help you set it up currently (maybe next quarter), but @Bstorm, if you need to sync and want to know what we have available in terms of maps OSM tooling, please ping me for a chat.

We now use imposm3 to import data and it would be interesting to see this project do more fancy stuff, like using OpenMapTiles.

There are two pieces to all this. We've got a maps project in cloud (which tiles.wmflabs.org is hosted on) that is run by volunteers and whoever knows how to make it work, and a Postgres OSM database that is run by my team. That database consumes the same puppet stack as the production WMF ones, as far as I know. Unfortunately, I don't know a thing about maps, but I have plenty of root, so I'm trying to use that for good :)

My questions have partly been about how to make sure the Postgres database this connects to is in good working order, so that I know I'm not the blocker. From there, I'm sure other people here who work on the maps project, like @TheDJ, could use any insight on what to kick to make things start working right. I usually think it is safe to assume that the problem is probably on my end since I haven't paid attention to the database much for a while, which is why I'm trying to sync up the Postgres database. That's just the organizational context. I'll ping you soon!

I am late to the party, but tiles.wmflabs.org is not part of the official Maps infrastructure. I'm not sure how its infrastructure works and I have limited time to help you set it up currently (maybe next quarter), but @Bstorm, if you need to sync and want to know what we have available in terms of maps OSM tooling, please ping me for a chat.

We now use imposm3 to import data and it would be interesting to see this project do more fancy stuff, like using OpenMapTiles.

There are two pieces to all this. We've got a maps project in cloud (which tiles.wmflabs.org is hosted on) that is run by volunteers and whoever knows how to make it work, and a Postgres OSM database that is run by my team. That database consumes the same puppet stack as the production WMF ones, as far as I know.

I don't think that's true anymore (although it's a recent development). While puppet abstracts it a bit so that the same defines and classes are used, maps in production now uses imposm3, while the one in WMCS still uses osm2pgsql to populate the OSM database.

I am late to the party, but tiles.wmflabs.org is not part of the official Maps infrastructure. I'm not sure how its infrastructure works and I have limited time to help you set it up currently (maybe next quarter), but @Bstorm, if you need to sync and want to know what we have available in terms of maps OSM tooling, please ping me for a chat.

We now use imposm3 to import data and it would be interesting to see this project do more fancy stuff, like using OpenMapTiles.

There are two pieces to all this. We've got a maps project in cloud (which tiles.wmflabs.org is hosted on) that is run by volunteers and whoever knows how to make it work, and a Postgres OSM database that is run by my team. That database consumes the same puppet stack as the production WMF ones, as far as I know.

I don't think that's true anymore (although it's a recent development). While puppet abstracts it a bit so that the same defines and classes are used, maps in production now uses imposm3, while the one in WMCS still uses osm2pgsql to populate the OSM database.

@akosiaris is right: the puppet config is backwards compatible, and to use imposm3 you need to opt into it explicitly, so it doesn't change the tooling for tiles.wmflabs.org.

If you need some reference about updating the OSM database using osm2pgsql, look into this task (T254014) for a checklist of procedures; some of them don't apply because they are specific to the PG replicas in the production database. Never mind, you already got it; maybe there's more relevant info at https://wikitech.wikimedia.org/wiki/Maps/OSM_Database

My biggest question is how to tell if replication is working right. I set it to do a full replication from the time it had stopped and now the state file thinks it is current. I can re-import the coastlines table and the polygon one, but how do you know when it is working :)

My biggest question is how to tell if replication is working right. I set it to do a full replication from the time it had stopped and now the state file thinks it is current.

That was always good enough in the past.

I can re-import the coastlines table and the polygon one, but how do you know when it is working :)

What was good enough in the past was having the tables there and knowing the file one downloaded was relatively recent (not that it matters much; coastlines don't change that often, and this had not been syncing since circa 2019, so coastlines and land polygons are the least of the problems). The "app" would pick them up. The one thing that does come to mind is that the app needed to know which tiles had expired, which was accomplished by allowing the tile servers to download, via rsync, a file osm2pgsql exported. With so much time in between, I don't know whether it makes sense in this specific case for the tile servers to use that file, or to just regenerate most tiles from scratch.

The one thing that does come to mind is that the app needed to know which tiles had expired, which was accomplished by allowing the tile servers to download, via rsync, a file osm2pgsql exported. With so much time in between, I don't know whether it makes sense in this specific case for the tile servers to use that file, or to just regenerate most tiles from scratch.

Ahhh! Ok, I did see that the sync generates some kind of set of expired tiles. This also has an rsync server on it. I wonder if the maps project setup changed the server to rsync from back when labsdb1006/7 was shut down...or if the rsync server is still configured as expected? That suggests some places to look. Thanks!
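If the rsync path is still wired up, the tile-server side would look something like this dry-run sketch. `render_expired` is a real mod_tile utility, but the rsync host, module name, file names, and zoom value here are guesses, and the `run` wrapper only prints the commands:

```shell
# Dry-run: print commands instead of executing them.
run() { echo "+ $*"; }

# Pull the expired-tiles list exported by osm2pgsql on the DB host:
run rsync -av rsync://osmdb.eqiad.wmnet/expired_tiles/ /srv/expired/
# Feed it to mod_tile so those tiles get re-rendered:
run sh -c 'render_expired --map=osm --min-zoom=10 < /srv/expired/latest.list'
```

If the maps project still points its rsync at a long-gone labsdb host, the tile servers would simply never learn which tiles went stale, which fits the symptoms.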

OK, finally had some time to take a look again. I looked at planet_osm_nodes. If we do a SELECT * FROM planet_osm_nodes ORDER BY id DESC LIMIT 1; we find the 'newest' node, I'd guess. This returns node 9049471600.

I've learned there is an easy way to find a node (or way, or relation) on the current OSM map: https://www.openstreetmap.org/node/9049471600
This shows "last edited" of Mon, 30 Aug 2021 23:59:34 +0000 (I guess that also means the sync isn't running ;)
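One way to cross-check a single node against the live map is the public OSM API. A dry-run sketch (the `run` wrapper only prints the commands, and `extract_osm_timestamp` is a toy extractor for the `timestamp` field of the API's JSON, not a real OSM tool; the database name `gis` is an assumption):

```shell
# Dry-run: print commands instead of executing them.
run() { echo "+ $*"; }

# Live version of the node, from the public read-only API:
run curl -s https://www.openstreetmap.org/api/0.6/node/9049471600.json
# Does the local database have it at all?
run psql -d gis -c 'SELECT 1 FROM planet_osm_nodes WHERE id = 9049471600;'

# Toy helper: pull the first "timestamp" value out of the API JSON.
extract_osm_timestamp() {
    sed -n 's/.*"timestamp":"\([^"]*\)".*/\1/p'
}
```

Comparing that timestamp against the replication state file gives a quick sanity check on how far behind the database is.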

So I've done some digging on ids, since we don't have timestamps in the tables.
This query gets lists of ids pretty close to the highest id we have in the db and calculates gaps in the ids. They show gaps of about 100 ids max.

SELECT id + 1 AS gap_start,
       next_id - 1 AS gap_end,
       next_id - id AS gap_size
FROM (
    SELECT id, lat, lon, lead(id) OVER (ORDER BY id) AS next_id
    FROM planet_osm_nodes WHERE id > 9048471600 ORDER BY id LIMIT 100000
) nr
WHERE nr.id + 1 <> nr.next_id;

Now take it back further in time, say starting from id 9028100000 (last edited Mon, 23 Aug 2021 08:30:02 +0000):
I'm seeing lots of huge gaps of 10,000 and even 200,000 nodes at times... that seems very high for pretty recent data. This can also be compared to much older data (just decrease the first digit of the id), and you can see that very old data tends to show gaps more in the order of hundreds than hundred thousands.

With some casual binary searching, I'm seeing somewhere in the region of id 6382000000 as a ballpark start of lots of very large gaps, which would be somewhere after 03 Apr 2019 21:20:07 +0000, indicating something went wrong with the sync. It's pretty hard to figure out what exactly, but it's almost as if days got skipped, perhaps?

The data point I'm using to verify the state of things is one in my city that I know was added over the last couple of years: https://www.openstreetmap.org/node/7005849394 (Sun, 24 Nov 2019 10:38:15 +0000).
It's just not in our tables.

I'm not sure what the best approach is. Maybe we just had a state that was too young/old and then had cascading query failures on data when syncing? Maybe we should go back a bit further? Or maybe re-import from scratch?

Ahhh! Ok, I did see that the sync generates some kind of set of expired tiles. This also has an rsync server on it. I wonder if the maps project setup changed the server to rsync from back when labsdb1006/7 was shut down...or if the rsync server is still configured as expected? That suggests some places to look. Thanks!

Pretty sure that was never done, because I didn't do it, and I think I was the last person to touch the tiles server. Maybe @dschwen has it running on the wma server?

Ok, so from what you just said, it sounds to me like the OSMDB needs to be rebuilt, after dumping the appropriate databases, to make sure we don't have gaps, since it is on VMs. That also suggests it is a good time to consider building the service inside the maps project instead of in the special "admin only" space of clouddb-services. I don't know the implications of syncing up the design of this sync with the production one, but that might be worth considering as well.