cloudvps: maps project trusty deprecation
Open, Normal, Public

Description

Ubuntu Trusty has not been available for new instances in Cloud VPS since Nov 2017. The EOL of Trusty is approaching in 2019, and we need to move to Debian Stretch before that date.

All instances in the maps project need to be upgraded as soon as possible.

The list of affected VMs is:

  • maps-tiles2.maps.eqiad.wmflabs
  • maps-tiles3.maps.eqiad.wmflabs
  • maps-warper2.maps.eqiad.wmflabs
  • maps-wma1.maps.eqiad.wmflabs

Listed administrators are:

More info in openstack browser: https://tools.wmflabs.org/openstack-browser/project/maps

TODO:

  • figure out the current configuration for the servers
  • create new Debian 9 (Stretch) VMs for each and configure puppet client
    • Created maps-tiles1 instance and a maps-puppetmaster instance to experiment with a puppet config
    • maps-tiles1 is mapped to maps.wmflabs.org (previously unused)
    • setup puppet
  • create new manifests for new versions of packages and configurations in puppet and deploy
  • transfer any data that needs to be transferred for proper tile server operation (likely mostly via NFS?)
  • test that all functionality is transferred and that everything works ok
  • shut down old servers, turn on new ones
  • delete old servers
Krenair created this task. Sep 17 2018, 12:31 PM
Krenair triaged this task as Normal priority.
TheDJ added a comment. Sep 17 2018, 2:45 PM

Also sounds like a good time to trim the access list a bit of any people that are 100% no longer involved in the projects.

possibly but that's way out of scope

maps-wma1.maps.eqiad.wmflabs was upgraded in place long ago! Please remove it from your list.

See https://phabricator.wikimedia.org/T143349 (the instance was upgraded to xenial)

In the long term there may be additional issues that come up from the continued use of Ubuntu. Xenial only works today as a side effect of support for Trusty. I am not making a declaration that you must jump over to Debian for this task, but this is something for folks to be aware of.

The virtual instances in Cloud VPS projects generally need to run our basic Puppet code so that things like LDAP authentication, NFS services, and our emergency root privileges work as designed. It is possible (and actually likely) that at some point following the Trusty deprecation project there will be changes in the operation/puppet.git Puppet codebase that will only work on Debian servers. When that happens we will have to figure out how to replace or remove virtual machines running unsupported operating systems.

Hey @aude @Awjrichards @Chippyy @cmarqu @coren @dschwen @jeremyb @Kolossos @MaxSem @Multichill @Nosy @TheDJ! Just a friendly reminder that you should get rid of your Trusty instances as described in https://wikitech.wikimedia.org/wiki/News/Trusty_deprecation#Cloud_VPS_projects. The deadline is 2018-12-18. Please get in contact if you need help. Also please assign this task to an individual.

Krenair updated the task description. Oct 22 2018, 1:43 PM

Another ping. Deadline is approaching (2018-12-18).

maps-warper2.maps.eqiad.wmflabs is still actively in use. I can fit in some time to upgrade before the deadline, but I'm not sure how easy it would be to upgrade to Debian (if it is possible at all). It is also unknown whether the software, which has only been tested in an Ubuntu environment, would work on Debian without significant development time. What are my options?

TheDJ added a comment. Nov 20 2018, 2:46 PM

I'd be willing to help convert older tiles server instances if someone brings the maptile specific knowledge that I lack. This is assuming that software stack will run on newer debian.

...I am not making a declaration that you must jump over to Debian for this task, but this is something for folks to be aware of.

If I could ask again. I am confused what to do. What are my options with maps-warper2 ? Do I just need to upgrade Ubuntu? The Wikimaps Warper is set up to work with Ubuntu. It might be a significant task to get it working on another OS, rather than a more recent LTS of Ubuntu.

Would running the application within Docker be a workaround?
Would upgrading to latest Ubuntu be a workaround?
Would making sure the system works with Debian be a workaround?

What happens after the 18th of December?

Krenair added a comment. Edited Dec 4 2018, 9:10 PM

Would running the application within Docker be a workaround?

A Debian host with Ubuntu in a container? Should be fine AFAIK.

Would upgrading to latest Ubuntu be a workaround?

Possibly, but only temporarily. That's likely to break at some point (for example, if our Puppet code simply ceases to run under Ubuntu, the instance would have to be removed). You shouldn't just leave systems be like this; you should replace the servers from time to time, and that's your opportunity to do a distribution change.

Would making sure the system works with Debian be a workaround?

Yes, you could then replace the instances with fresh Debian ones.

Chippyy added a comment. Edited Dec 4 2018, 9:35 PM

Okay, so after the 18th of December, the instance won't be deleted? But when the Ubuntu Trusty LTS EOL happens, it probably will be... and it's likely that the Puppet code will force that change to happen before then? So we don't have any firm timescales of when the switch is being pulled (before April 2019) but 18th Dec is to see which ones are not being used and can safely be switched off?

My main concern is that this is more than a one day thing for me, and there's only a few days remaining, so any additional advice would be great.

bd808 added a comment. Dec 4 2018, 9:36 PM

What happens after the 18th of December?

The announced information at https://wikitech.wikimedia.org/wiki/News/Trusty_deprecation says:

In December 2018 (2018-12-18), deadline. Evaluate if Trusty VMs not migrated are actually in use. If not, just delete them. We will help administrators with migration issues.

In practice this means that the cloud-services-team will audit all instances in all projects on or after this date and determine on a case by case basis how to proceed. I am only one person involved in this discussion, but my personal inclination today would be to allow additional time for implementing plans for migration that have been communicated via phabricator tasks like this one.

There is no desire to break things for the Wikimedia community arbitrarily. We set deadlines because they are a useful way to convey importance and urgency to our user community. That being said, this task was filed on 2018-09-17 and did not receive a substantive response until 2018-11-20 (64 days later). This is to some extent a sign that this project is in longer term danger due to a lack of active and/or engaged maintainers. I know from webserver logs that maps-tiles3.maps.eqiad.wmflabs is in very active use by end users all over the internet, but I have not seen a single comment on this ticket from anyone claiming ownership or responsibility for that instance. The maps project has a long history going back to the Toolserver days, but if it has reached a point where it only has end-users and no maintainers that is a problem for the movement.

I've created T211149 to track the maps-warper2.maps.eqiad.wmflabs instance migration.

TheDJ added a comment. Dec 12 2018, 1:09 PM

I have updated https://wikitech.wikimedia.org/wiki/OSM_Tileserver#Technology_stack with some information about the current tiles servers as far as I could deconstruct from looking at the instances. If anyone has any further information, that would be appreciated.

TheDJ added a comment. Edited Dec 12 2018, 1:36 PM

It seems the overpass-wiki instance is not in use. It was created by Jotpe in 2015. I've sent an email via his Wikimedia wiki account. I suggest we delete it if there is no response before the 18th.

TheDJ added a comment. Dec 12 2018, 1:39 PM

And I've sent out an email on maps-l and wikitech-l

I'm willing to attempt the OS conversion on the VMs, but at this point I don't have shell access to the tools project (apparently infrastructure access is not needed; sorry, I'm new...) and I also need admin access to the maps project. I don't know how to go about getting this, but participating in this discussion was suggested as a means, so I'm just leaving this comment in here for now.

I'm working on redoing the maps-wma1 instance as maps-wma. This involves a region change, and as a consequence it seems the /mnt/nfs/labstore1003-maps directory, which contains my home directory on the old instance, is empty on the new instance. Same goes for the project directory. Will I have to copy everything over? Why is there no home dir?

Why is there no home dir?

For murky historical reasons, mounting NFS on a new VM in that project requires a puppet patch. I'll make one for maps-wma now. Are there others that need the same?

Change 479764 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] nfs: add another VM to the Maps nfs mount

https://gerrit.wikimedia.org/r/479764

Change 479764 merged by Andrew Bogott:
[operations/puppet@production] nfs: add another VM to the Maps nfs mount

https://gerrit.wikimedia.org/r/479764

@dschwen, if you reboot maps-wma your nfs mounts should be more like what you'd expect now.
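For anyone wanting to confirm after the reboot whether the NFS share is really mounted (rather than just an empty directory sitting at the mount point), a minimal sketch follows. It assumes /mnt/nfs/labstore1003-maps is itself the mount point, which this thread does not confirm; `describe_mount` is a hypothetical helper name, not part of any Cloud VPS tooling.

```python
import os


def describe_mount(path):
    """Classify a directory as missing, not mounted, mounted but empty,
    or mounted with data, using only standard-library checks."""
    if not os.path.isdir(path):
        return "missing"
    # os.path.ismount compares the path with its parent directory to
    # detect a file system boundary.
    if not os.path.ismount(path):
        return "not mounted"
    if not os.listdir(path):
        return "mounted but empty"
    return "mounted with data"


if __name__ == "__main__":
    # Path taken from the discussion above; treating it as the mount
    # point itself is an assumption.
    print(describe_mount("/mnt/nfs/labstore1003-maps"))
```

A result of "not mounted" would match the empty home and project directories described above; "mounted but empty" would point at a problem on the share itself.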

maps-warper2 has been migrated to maps-warper3 and web proxy (warper.wmflabs.org) switched too, with everything seeming to work okay, but I'd like at least a day before we turn off the old instance just in case...

TheDJ added a comment. Mon, Dec 17, 7:59 PM

@Chippyy can you please document the setup somewhere for the future ? That would be super helpful.

TheDJ added a comment. Mon, Dec 17, 8:08 PM

@Sasheto hi, I haven't forgotten about your request... Please don't take this the wrong way, but lately there has been quite a LOT of malicious activity and your account is brand new, and I can't find any history of your activity with the projects. At this time that makes me uncomfortable giving you access to this set of servers, as one of them is actually rather critical and could allow a person to do a lot of harm to Wikimedians. I'm considering how we should approach this.

@TheDJ Thanks for letting me know and no, I'm not taking it personally. You don't find any history because I'm super new and don't have any contributions yet. Let me know if you have something in mind that I could do to gain the trust of this community prior to doing more critical tasks.

@Chippyy can you please document the setup somewhere for the future ? That would be super helpful.

created T212166 to track this.

TheDJ updated the task description. Tue, Dec 18, 3:21 PM
TheDJ added a comment. Edited Wed, Dec 19, 2:24 PM

Current progress:

  • Created maps-tiles1 with Debian Stretch
  • Created maps-puppetmaster to be able to experiment with a Puppet manifest for the new host
  • Currently stuck on installing puppet master standalone, as it insists on creating /home/gitpuppet, which fails (ACL?). Now fixed
  • Figured out more of the dependencies and setup of the old server. Added to the documentation
  • Pondering how to tackle installing mod_tile in the future, as there is no package for this. Can compile from source, but...
  • make install of mod_tile doesn't work for now, as it fails to install on libiniparser. Fixed by installing from /tmp?
  • Wondering what the difference is between the puppet httpd and apache2 modules; they seem to be used interchangeably

I have shut down the maps-warper2 instance

(I'd like to keep it around for a week or so before deletion though, just in case)

Hi! Thanks for your work on this.

FYI since the deadline already passed, we agreed on shutting down remaining Trusty instances on 2019-01-18. More info at https://wikitech.wikimedia.org/wiki/News/Trusty_deprecation#Cloud_VPS_projects
It would be great if you folks can have the migration done by then.

maps-tiles1 now has access to the pgsql server.

When running the renderer I see:

  SELECT ST_SRID("way") AS srid FROM planet_osm_polygon WHERE "way" IS NOT NULL LIMIT 1;
  ERROR: permission denied for relation planet_osm_polygon

but the old tiles servers seem to have that problem as well.

Now I need a way to verify that the new renderer is creating proper tiles. I hope to be able to get to that this weekend.
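One possible first step for that verification, sketched under assumptions: the snippet below checks that the bytes of a rendered tile form a structurally valid PNG header with the expected dimensions. That tiles are 256x256 PNG files is an assumption (it is the common convention for OSM tile servers, but nothing in this task confirms it for this setup), and `check_tile` is a hypothetical helper name. It does not check that the image content is a sensible map; comparing a few tiles visually against the old server would still be needed.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_tile(data, expected_size=256):
    """Return (ok, reason) for the raw bytes of a tile.

    Only the PNG signature and the IHDR chunk are inspected; the image
    data itself is not decoded.
    """
    # 8 signature bytes + IHDR chunk (4 length + 4 type + 13 data + 4 CRC)
    if len(data) < 33:
        return False, "too short to be a PNG"
    if data[:8] != PNG_SIGNATURE:
        return False, "missing PNG signature"
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        return False, "first chunk is not a valid IHDR"
    width, height = struct.unpack(">II", data[16:24])
    (crc,) = struct.unpack(">I", data[29:33])
    # The chunk CRC covers the chunk type and the chunk data.
    if zlib.crc32(data[12:29]) != crc:
        return False, "IHDR checksum mismatch"
    if (width, height) != (expected_size, expected_size):
        return False, f"unexpected size {width}x{height}"
    return True, "ok"
```

Usage would be along the lines of reading a tile file written by the renderer (or fetched over HTTP) in binary mode and passing the bytes to `check_tile`.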