
cloudvps: maps project trusty deprecation
Open, Normal, Public


Ubuntu Trusty has not been available for new instances in Cloud VPS since November 2017. However, Trusty reaches end of life (EOL) in 2019, and we need to move to Debian Stretch before that date.

All instances in the maps project need to be upgraded as soon as possible.

The list of affected VMs is:

  • maps-tiles2.maps.eqiad.wmflabs
  • maps-tiles3.maps.eqiad.wmflabs
  • maps-warper2.maps.eqiad.wmflabs
  • maps-wma1.maps.eqiad.wmflabs

Listed administrators are:

More info in openstack browser:


  • figure out the current configuration for the servers
  • create new Debian 9 (Stretch) VMs for each and configure puppet client
    • Created maps-tiles1 instance and a maps-puppetmaster instance to experiment with a puppet config
    • maps-tiles1 is mapped to (previously unused)
    • setup puppet
  • create new manifests for new versions of packages and configurations in puppet and deploy
  • transfer any data that needs to be transferred for proper tile server operation (likely mostly via NFS?)
  • test that all functionality is transferred and that everything works ok
  • shut down old servers, turn on new ones
  • delete old servers
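The data-transfer step in the checklist above could start with something like the following sketch. The host name and tile cache path are assumptions for illustration, not taken from this task; the rsync would be run from the new instance.

```shell
# Hedged sketch: copy tile data from an old Trusty instance to its Stretch
# replacement. Host and path are hypothetical placeholders.
OLD=maps-tiles2.maps.eqiad.wmflabs   # hypothetical source host
TILE_DIR=/srv/tiles                  # hypothetical tile cache location
# -a preserves permissions and timestamps, -H preserves hard links
# (relevant for meta tiles that share data).
RSYNC_CMD="rsync -aH ${OLD}:${TILE_DIR}/ ${TILE_DIR}/"
echo "${RSYNC_CMD}"
```

Adding `--dry-run` on a first pass is a cheap way to sanity-check what would be transferred before committing to the full copy.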

Event Timeline

Krenair created this task.Sep 17 2018, 12:31 PM
Krenair triaged this task as Normal priority.
TheDJ added a comment.Sep 17 2018, 2:45 PM

Also sounds like a good time to trim the access list of any people who are 100% no longer involved in the projects.

Possibly, but that's way out of scope.

maps-wma1.maps.eqiad.wmflabs was upgraded in place long ago! Please remove it from your list.

See (the instance was upgraded to xenial)

In the long term there may be additional issues that come up from the continued use of Ubuntu. Xenial only works today as a side effect of support for Trusty. I am not making a declaration that you must jump over to Debian for this task, but this is something for folks to be aware of.

The virtual instances in Cloud VPS projects generally need to run our basic Puppet code so that things like LDAP authentication, NFS services, and our emergency root privileges work as designed. It is possible (and actually likely) that at some point following the Trusty deprecation project there will be changes in the operations/puppet.git Puppet codebase that will only work on Debian servers. When that happens we will have to figure out how to replace or remove virtual machines running unsupported operating systems.

Hey @aude @Awjrichards @Chippyy @cmarqu @coren @dschwen @jeremyb @Kolossos @MaxSem @Multichill @Nosy @TheDJ! Just a friendly reminder that you should get rid of your Trusty instances as described in . The deadline is 2018-12-18. Please get in contact if you need help. Also, please assign this task to an individual.

Krenair updated the task description. Oct 22 2018, 1:43 PM

Another ping. Deadline is approaching (2018-12-18).

maps-warper2.maps.eqiad.wmflabs is still actively in use. I can fit in some time to upgrade before the deadline, but I'm not sure how easy it would be to upgrade to Debian (if at all). It is also unknown whether the software, which has only been tested in an Ubuntu environment, would work on Debian without significant development time. What are my options?

TheDJ added a comment.Nov 20 2018, 2:46 PM

I'd be willing to help convert the older tiles server instances if someone brings the map-tile-specific knowledge that I lack. This is assuming the software stack will run on newer Debian.

...I am not making a declaration that you must jump over to Debian for this task, but this is something for folks to be aware of.

If I could ask again. I am confused what to do. What are my options with maps-warper2 ? Do I just need to upgrade Ubuntu? The Wikimaps Warper is set up to work with Ubuntu. It might be a significant task to get it working on another OS, rather than a more recent LTS of Ubuntu.

Would running the application within Docker be a workaround?
Would upgrading to latest Ubuntu be a workaround?
Would making sure the system works with Debian be a workaround?

What happens after the 18th of December?

Krenair added a comment.EditedDec 4 2018, 9:10 PM

Would running the application within Docker be a workaround?

A debian host with ubuntu in a container? Should be fine AFAIK.
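As an illustration of that container approach, a minimal sketch might look like the following. The image tag, port, and application details are placeholders, not taken from this task.

```shell
# Hedged sketch: run the Trusty-only application in an Ubuntu 14.04 container
# on a Debian host. Everything below is illustrative.
WORKDIR=$(mktemp -d)
cat > "${WORKDIR}/Dockerfile" <<'EOF'
FROM ubuntu:14.04
# Install the application's dependencies here, e.g.:
# RUN apt-get update && apt-get install -y ruby imagemagick
# COPY . /app
# CMD ["/app/start.sh"]
EOF
# Build and run (requires Docker on the host):
echo "docker build -t maps-warper ${WORKDIR}"
echo "docker run -d -p 8080:8080 maps-warper"
```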

Would upgrading to latest Ubuntu be a workaround?

Possibly, but only temporarily; that's likely to break at some point (if our puppet code simply ceases to run under Ubuntu, for example, the instance would have to be removed). You shouldn't just leave systems be like this; you should replace the servers from time to time, and that's your opportunity to do a distribution change.

Would making sure the system works with Debian be a workaround?

Yes, you could then replace the instances with fresh Debian ones.

Chippyy added a comment.EditedDec 4 2018, 9:35 PM

Okay, so after the 18th of December, the instance won't be deleted? But when the Ubuntu Trusty LTS EOL happens, it probably will be... and it's likely that the Puppet code will force that change to happen before then? So we don't have any firm timescales of when the switch is being pulled (before April 2019) but 18th Dec is to see which ones are not being used and can safely be switched off?

My main concern is that this is more than a one day thing for me, and there's only a few days remaining, so any additional advice would be great.

bd808 added a comment.Dec 4 2018, 9:36 PM

What happens after the 18th of December?

The announced information at says:

In December 2018 (2018-12-18), deadline. Evaluate if Trusty VMs not migrated are actually in use. If not, just delete them. We will help administrators with migration issues.

In practice this means that the cloud-services-team will audit all instances in all projects on or after this date and determine on a case by case basis how to proceed. I am only one person involved in this discussion, but my personal inclination today would be to allow additional time for implementing plans for migration that have been communicated via phabricator tasks like this one.

There is no desire to break things for the Wikimedia community arbitrarily. We set deadlines because they are a useful way to convey importance and urgency to our user community. That being said, this task was filed on 2018-09-17 and did not receive a substantive response until 2018-11-20 (64 days later). This is to some extent a sign that this project is in longer term danger due to a lack of active and/or engaged maintainers. I know from webserver logs that maps-tiles3.maps.eqiad.wmflabs is in very active use by end users all over the internet, but I have not seen a single comment on this ticket from anyone claiming ownership or responsibility for that instance. The maps project has a long history going back to the Toolserver days, but if it has reached a point where it only has end-users and no maintainers that is a problem for the movement.

I've created T211149 to track the maps-warper2.maps.eqiad.wmflabs instance migration.

TheDJ added a comment.Dec 12 2018, 1:09 PM

I have updated  with some information about the current tiles servers, as far as I could deconstruct from looking at the instances. If anyone has any further information, it would be appreciated.

TheDJ added a comment.EditedDec 12 2018, 1:36 PM

It seems the overpass-wiki instance is not in use. It was made by Jotpe in 2015. I've sent out an email via his Wikimedia wiki account. I suggest we delete it if there is no response before the 18th.

TheDJ added a comment.Dec 12 2018, 1:39 PM

And I've sent out an email on maps-l and wikitech-l.

I'm willing to attempt the OS conversion on the VMs, but at this point I don't have shell access to the tools project (apparently infrastructure access is not needed - sorry, I'm new...) and I also need admin access to the maps project. I don't know how to go about getting this, but participating in this discussion was suggested as a means, so I'm just leaving this comment in here for now.

I'm working on redoing the maps-wma1 instance as maps-wma. This involves a region change, and as a consequence the /mnt/nfs/labstore1003-maps directory, which contains my home directory on the old instance, is empty on the new instance. The same goes for the project directory. Will I have to copy everything over? Why is there no home dir?

Why is there no home dir?

For murky historical reasons, mounting NFS on a new VM in that project requires a puppet patch. I'll make one for maps-wma now. Are there others that need the same?

Change 479764 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nfs: add another VM to the Maps nfs mount

Change 479764 merged by Andrew Bogott:
[operations/puppet@production] nfs: add another VM to the Maps nfs mount

@dschwen, if you reboot maps-wma your nfs mounts should be more like what you'd expect now.
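After the reboot, a quick check along these lines would confirm the share is back. The mount point is the one mentioned earlier in this task; the check itself is just an illustration.

```shell
# Illustrative check that the project NFS share is mounted after a reboot.
MOUNTPOINT=/mnt/nfs/labstore1003-maps
if mount | grep -q " ${MOUNTPOINT} "; then
    echo "NFS mount present"
else
    echo "NFS mount missing"
fi
```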

maps-warper2 has been migrated to maps-warper3, and the web proxy ( switched too, with everything seeming to work okay, but I'd like to wait at least a day before we turn off the old instance, just in case...

TheDJ added a comment.Dec 17 2018, 7:59 PM

@Chippyy can you please document the setup somewhere for the future ? That would be super helpful.

TheDJ added a comment.Dec 17 2018, 8:08 PM

@Sasheto hi, I haven't forgotten about your request... Please don't take this the wrong way, but lately there has been quite a LOT of malicious activity, your account is brand new, and I can't find any history of your activity with the projects. At this time that makes me uncomfortable giving you access to this set of servers, as one of them is rather critical and could allow a person to do a lot of harm to Wikimedians. I'm considering how we should approach this.

@TheDJ Thanks for letting me know and no, I'm not taking it personally. You don't find any history because I'm super new and don't have any contributions yet. Let me know if you have something in mind that I could do to gain the trust of this community prior to doing more critical tasks.

@Chippyy can you please document the setup somewhere for the future ? That would be super helpful.

Created T212166 to track this.

TheDJ updated the task description. Dec 18 2018, 3:21 PM
TheDJ added a comment.EditedDec 19 2018, 2:24 PM

Current progress:

  • Created maps-tiles1 with debian stretch
  • Created maps-puppetmaster to be able to experiment with a puppet manifest for the new host
  • Currently stuck on installing puppet master standalone, as it insists on creating /home/gitpuppet, which fails (ACL issue?). Now fixed
  • Figured out more of the dependencies and setup of the old server. Added to the documentation
  • Pondering how to tackle installing mod_tile in the future, as there is no package for this. Can compile from source, but..
  • make install for mod_tile doesn't work for now, as it fails on installing libiniparser. Fixed by installing from /tmp?
  • Wondering what the difference is between puppet httpd and apache2 modules, they seem to be used interchangeably..
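Since there is no Debian package for mod_tile, the source build mentioned above might look roughly like this. The dependency list is an assumption based on upstream build docs, not something confirmed in this task.

```shell
# Hedged sketch: compile mod_tile from source on Stretch.
# The dependency list below is an assumption; adjust as the build demands.
BUILD_DEPS="build-essential autoconf apache2-dev libmapnik-dev libiniparser-dev"
echo "sudo apt-get install -y ${BUILD_DEPS}"
echo "git clone https://github.com/openstreetmap/mod_tile.git"
echo "cd mod_tile && ./autogen.sh && ./configure && make"
echo "sudo make install && sudo make install-mod_tile"
```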

I have shut down the maps-warper2 instance

(I'd like to keep it around for a week or so before deletion though, just in case)

Hi! Thanks for your work on this.

FYI: since the deadline has already passed, we agreed on shutting down the remaining Trusty instances on 2019-01-18. More info at
It would be great if you folks can have the migration done by then.

maps-tiles1 now has access to the pgsql server.

When running the renderer I see:

SELECT ST_SRID("way") AS srid FROM planet_osm_polygon WHERE "way" IS NOT NULL LIMIT 1;
ERROR: permission denied for relation planet_osm_polygon

but the old tiles servers seem to have that problem as well.

Now I need a way to verify that the new renderer is creating proper tiles. I hope to get to that this weekend.

maps-tiles2, maps-tiles3 and maps-wma1 should be shut off today... It looks like maps-wma1 is still critical; what about the other two? Should this be extended, given there is work actively being done on it?

Please, someone provide some estimation of time required for the maps-wma1 instance.

Dalba removed a subscriber: Dalba.Jan 22 2019, 4:42 PM

It's now over a week since more information was requested.

TheDJ added a comment.EditedJan 30 2019, 12:37 PM

I've had 4 hours since Christmas that I was able to spend on this ticket; all 4 were spent on that postgres issue (the patch for which was open for 3 weeks, btw... just sayin').
I still haven't had time to verify whether tiles1 is actually working now. (I suspect it's not, actually.)

For maps-wma1, there is a new maps-wma, but I'm not sure if @dschwen has worked on it since T204506#4824815

TheDJ added a comment.EditedFeb 11 2019, 11:01 PM

Spent a couple of hours. I now have tile rendering working again on tiles3 of the old instance as well as on tiles1 of the new server.

The new server is generating into a separate directory for now, as I wanted to confirm operation. I will work on consolidating the configurations in the next few days and attempt to verify operation of the new server. Then I should be able to let go of the old tiles instances. now shows tiles from the new instance. I'll clean that every day, as I don't want us to double the /data/project usage.
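That daily cleanup could be a small cron job along these lines. The temporary tile path is a placeholder, not the actual directory from this task.

```shell
# Hedged sketch: daily removal of tiles older than one day from the new
# instance's temporary render directory, to keep /data/project usage down.
TILE_TMP=/data/project/tiles-new   # placeholder path
CLEAN_CMD="find ${TILE_TMP} -type f -mtime +0 -delete"
# e.g. as a daily crontab entry at 03:00:
echo "0 3 * * * ${CLEAN_CMD}"
```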

FYI, pretty sure that tile updates were not running for many many months, as @Bstorm suspected in T215560: Can't read from OSM replicas when connecting from Stretch bastion

For maps-wma1, there is a new maps-wma, but I'm not sure if @dschwen has worked on it since T204506#4824815

Yeah, I have not. The home directory has not magically appeared. Do I have to rebuild the VM? I'm confused.

@dschwen the instance needs to be rebooted for it to appear.

TheDJ added a subscriber: Bstorm.Feb 11 2019, 11:07 PM
TheDJ removed a subscriber: Bstorm.

Ok, will do that later. No access from work.

@dschwen I triggered a reboot on it and can confirm the homedirs are mounted on that instance.

TheDJ added a comment.EditedFeb 13 2019, 10:20 AM

Hmm, osm tile rendering seems extremely slow...

Rendering client
Starting 1 rendering threads
Initial startup costs
Rendered 1 tiles in 0.00 seconds (62500.00 tiles/s)

Zoom(0) Now rendering 1 tiles
Rendered 1 tiles in 656.16 seconds (0.00 tiles/s)

Zoom(1) Now rendering 2 tiles
Rendered 2 tiles in 921.96 seconds (0.00 tiles/s)

This is on both old and new tile servers.
Anyone know if that is normal? 15 minutes for 2 tiles at zoom level 1?

11 minutes for z0 is good (Kartotherian does this in ~13).
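For reference, a low-zoom pre-render like the one timed above can be kicked off with mod_tile's render_list. The map style name and thread count below are assumptions for illustration.

```shell
# Hedged sketch: pre-render zoom levels 0-2 through renderd's queue.
MAP=osm   # placeholder style name; use the style configured in renderd.conf
echo "render_list --all --map ${MAP} --min-zoom 0 --max-zoom 2 --num-threads 1"
```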

@MaxSem thanks for confirming!

TheDJ added a comment.Tue, Feb 26, 9:43 PM

Hmm, think I just figured out that none of the cached tiles have been expired since at least early 2015... learning so much...

TheDJ added a comment.Wed, Feb 27, 9:50 PM

I have now set up munin and have a grip on how the clustering of renderd works. Next step is to make tiles1 the new master for renderd and HTTP traffic! Hope to get to that tomorrow evening.

TheDJ added a comment.Tue, Mar 12, 8:12 AM

Tile traffic and tile rendering are now primarily on the new server. I intend to verify this for a few days before shutting down the old servers completely.

Gehel awarded a token.Tue, Mar 12, 2:08 PM
TheDJ added a comment.EditedSat, Mar 16, 4:14 PM

After T218145: maps: take back root owned files/dirs from root_squash protected nfs, old tiles can suddenly be regenerated again, causing an explosion of tile render events, which in turn causes most render jobs to be dropped. I have temporarily disabled serving of hikebike, osm-no-labels and osm-bw to allow the server to slowly catch up a bit on rendering old tiles.

TIL that if you append /status to a tile URL, you can find out when it was last generated, and appending /dirty causes it to be put on the render queue.
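Those mod_tile URL suffixes look like the following in practice. The host name here is a placeholder, not one of the project's actual tile URLs.

```shell
# Illustrative mod_tile URL suffixes; the host is a placeholder.
TILE=https://tiles.example.org/osm/0/0/0.png
echo "curl ${TILE}/status   # reports when this tile was last rendered"
echo "curl ${TILE}/dirty    # queues this tile for re-rendering"
```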

Still seem to have some issues in other spots.
Also still need to take care of adding config files to redirect renderd logging to /var/log/renderd.log and a logrotate for that as well.
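The logrotate part could be a small drop-in along these lines, assuming renderd logs to /var/log/renderd.log as described above. The sketch writes to a temp dir so it is safe to run anywhere; the file would normally live in /etc/logrotate.d/.

```shell
# Hedged sketch of a logrotate entry for renderd. The retention and schedule
# values are assumptions; copytruncate avoids needing renderd to reopen logs.
TMP=$(mktemp -d)
cat > "${TMP}/renderd" <<'EOF'
/var/log/renderd.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
EOF
cat "${TMP}/renderd"
```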

A user on one of the subtasks wrote:

JOSM maintainer here.
Since a few days we're unable to access two imagery layers hosted on '{zoom}/{x}/{y}.png' and '{zoom}/{x}/{y}.png'.
Is it due to this change? Is the service still provided?

TheDJ added a comment.EditedMon, Mar 18, 9:41 AM

Hi @Don-vip

Yes, as you may note from this ticket, this service has mostly been running unmaintained since 2015 (many tiles have not been refreshed since that date either). As we go through several large changes in our cloud platform, these services needed migration. Over the past 3 months, as a volunteer, I've taken on the job of getting familiar with this service and starting to rebuild it. This might lead to reduced availability of maps in the short term as I deal with several large problems in the little time I have available to work on this.

It is my intent that hikebike, osm-bw and osm-no-labels are retained in the long run.