Right now the update process is (approximately):

while true:
- download data with osmosis
- read osmosis data with osm2pgsql to update db and produce tile expiry list, saving tile list to `/srv/osm_expiry` directory
- run tilerator on files in directory (it only processes new ones)
- sleep

Asynchronously to this, a separate process cleans up old files in `/srv/osm_expiry`.
This means that a tilerator problem (e.g. {T168241: OSM replication stuck on tilerator notification}) stops database updates.
A better strategy is to run two independent loops:

while true:
- download data with osmosis
- read osmosis data with osm2pgsql to update db, saving tile expiry list to some local directory
- create a hard link putting the expiry list in the `/srv/osm_expiry` dir
- clean up old tile expiry lists in local directory
- sleep

while true:
- run tilerator on files in `/srv/osm_expiry` dir
- clean up old files in `/srv/osm_expiry` dir
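
The hard-link hand-off between the two loops could look roughly like the sketch below. It is only an illustration: the function names `publish_expiry_list` and `clean_old_files` are hypothetical, and it assumes the local directory and `/srv/osm_expiry` are on the same filesystem, since hard links cannot cross filesystems.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def publish_expiry_list(local_file: Path, queue_dir: Path) -> Path:
    """Hard-link a finished expiry list into the queue directory.

    The link only appears once the file is complete, so the consumer
    never sees a partially written list.
    """
    queue_dir.mkdir(parents=True, exist_ok=True)
    target = queue_dir / local_file.name
    os.link(local_file, target)
    return target


def clean_old_files(directory: Path, max_age_seconds: float,
                    now: Optional[float] = None) -> List[Path]:
    """Remove regular files whose mtime is older than max_age_seconds."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    removed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()  # drops this name only; other hard links survive
            removed.append(path)
    return removed
```

Because both names point at the same inode, removing the copy in the local directory leaves the queued copy in `/srv/osm_expiry` intact, so each loop can clean up its own directory without coordinating with the other.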
This will mean that a failure in tilerator won't cause osm2pgsql replication to stop.
It will require monitoring the size of `/srv/osm_expiry` to make sure the expiry loop is processing the files correctly.
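
A backlog check for that directory might be sketched as follows. The names `expiry_backlog` and `backlog_is_healthy` and the threshold values are assumptions for illustration, not existing tooling; real limits would depend on how often replication runs.

```python
import time
from pathlib import Path
from typing import Optional


def expiry_backlog(queue_dir: Path, now: Optional[float] = None) -> dict:
    """Summarise pending expiry files: count, total size, age of the oldest."""
    now = time.time() if now is None else now
    stats = [p.stat() for p in queue_dir.iterdir() if p.is_file()]
    oldest_mtime = min((s.st_mtime for s in stats), default=now)
    return {
        "count": len(stats),
        "bytes": sum(s.st_size for s in stats),
        "oldest_age_seconds": max(0.0, now - oldest_mtime),
    }


def backlog_is_healthy(backlog: dict, max_files: int = 100,
                       max_age_seconds: float = 6 * 3600) -> bool:
    """Hypothetical thresholds: too many or too old files suggests a stall."""
    return (backlog["count"] <= max_files
            and backlog["oldest_age_seconds"] <= max_age_seconds)
```

A growing file count or a steadily increasing oldest-file age would indicate that tilerator has stopped consuming the lists while osm2pgsql replication keeps producing them.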
The OSMF do this with replicate and expire-tiles.