Deprecate precise instances in Labs by 2017-03-31
Closed, Resolved (Public)

Description

(not counting Tools and deployment-prep, which should have their own plans).

We'd like to shut down all precise instances once LTS support for them ends in March / April 2017. Plan what to do and communicate to make sure people are aware. Users should be encouraged to upgrade to Debian Jessie.

Still present:

Tenant | Instance | Can be deleted? (per whom) | Status

Resolved:

Tenant | Instance | Can be deleted? (per whom) | Status
analytics | limn1 | yes (on 3/31) | shutdown in https://phabricator.wikimedia.org/T146308#3134829
bots | wm-bot | asked T143349#2946792 and T157838 | deleted, done
integration | integration-slave-precise-1002 | T158652 - Zend 5.3 for CI T143349#2683793 | deleted
integration | integration-slave-precise-1012 | T158652 | deleted
integration | integration-slave-precise-1011 | T158652 | deleted
account-creation-assistance | accounts-db2 | T143349#2963643 | deleted
contributors | contributors-metrics | T143349#2962882 | deleted
editor-engagement | mwui | T154616 | shutdown as per ticket discussion
editor-engagement | docs | T157701 | deleted
fastcci | fastcci-master | done by upgrade, do not delete! | upgraded, resolved
fastcci | fastcci-worker1 | done by upgrade, do not delete! | upgraded, resolved
huggle | huggle | T157710 | upgraded in place
integration | integration-publisher | T156064 | Replaced by a Jessie one
maps | maps-warper | T159846 | Replaced, done
maps | maps-wma1 | done by upgrade, do not delete! | upgraded, resolved
maps | maps-tiles1 | T157708 | presumed unused. Shut down, delete on 3/31
mediahandler-tests | mediahandler-tests-static | T143349#2921022 and https://wikitech.wikimedia.org/wiki/User_talk:Rillke#Labs_instance_mediahandler-tests-static.mediahandler-tests.eqiad.wmflabs | shutdown; delete on 3/31
openstack | labs-vmbuilder-precise | yes | We need to keep this until we're 100% sure we'll never need another precise base image
otrs | otrs-test2 | yes (as per akosiaris and filippo) | shutdown; delete on 3/31
signwriting | signwriting-ase-wiki | yes (as per Slevinski) | shutdown, delete on 3/31
social-tools | social-tools1 | | deleted
snuggle | snuggle-en | T158970 | deleted
sugarcrm | officetools | T128819#2965217 not yet | deleted
utrs | utrs-primary | T143349#2965177 and T159737 | deleted
visualeditor | towtruck | | shutoff
wikidata-dev | wikidata-suggester | T143349#2963021 | removed
wikidata-dev | wikidata-wdq-mm | T143349#2963902 (delete after 1/31) | shutdown
wikisource-dev | wikisource-dev | | shutdown. Discussion at https://meta.wikimedia.org/wiki/User_talk:Billinghurst#Labs_project_.27wikisource-dev.27
wikisource-tools | wsexport | | shutdown
wikistream | wikistream-web | | shutoff, delete 3/31
wildcat | dannyb-large | | deleted
wildcat | dannyb-test | | deleted
wildcat | dannyb | yes | shutdown, delete at end of March
wlmjudging | wlm-mysql-master | | shutdown -- asking leila
wlmjudging | wlm-apache1 | | shutdown -- asking leila

Notices sent:

Dec 19th https://lists.wikimedia.org/pipermail/labs-announce/2016-December/000193.html
Feb 27 https://lists.wikimedia.org/pipermail/labs-announce/2017-February/000211.html
March 6 https://lists.wikimedia.org/pipermail/labs-announce/2017-March/000213.html
March 14 https://lists.wikimedia.org/pipermail/labs-announce/2017-March/000219.html
March 27 https://lists.wikimedia.org/pipermail/labs-announce/2017-March/000225.html

Related Objects

Status | Subtype | Assigned | Task
Resolved | | Andrew |
Resolved | | Andrew |
Resolved | | yuvipanda |
Resolved | | coren |
Resolved | | Andrew |
Resolved | | yuvipanda |
Resolved | | yuvipanda |
Resolved | | scfc |
Resolved | | bd808 |
Resolved | | bd808 |
Resolved | | bd808 |
Resolved | | Andrew |
Resolved | | yuvipanda |
Resolved | | madhuvishy |
Resolved | | zhuyifei1999 |
Resolved | | MusikAnimal |
Resolved | | bd808 |
Resolved | | bd808 |
Resolved | | bd808 |
Resolved | | yuvipanda |
Resolved | | coren |
Resolved | | Andrew |
Resolved | | bd808 |
Resolved | | bd808 |
Duplicate | | None |
Resolved | | bd808 |
Resolved | | Giftpflanze |
Resolved | | bd808 |
Resolved | | Andrew |
Resolved | | hashar |
Resolved | | Rillke |
Resolved | | dschwen |
Resolved | | hashar |
Resolved | | TParis |
Resolved | | Chippyy |
Resolved | | Petrb |
Resolved | | Andrew |
Resolved | | chasemp |

Event Timeline

For CI, we need to keep some Precise instances which are used to run Zend PHP 5.3 tests. That is until MediaWiki 1.23 LTS reaches end of life in May 2017 (lifecycle).

@hashar, is there some kind of back- or forward-port to Trusty that would address this issue?

Not without significant effort from ourselves, see T103786#2613662 and T143349#2565923.

About Zend 5.3 / CI:

Context

Candidates were https://github.com/phpenv/phpenv and https://github.com/CHH/phpenv, but they are not packaged for Debian, are barely maintained, and would add a lot of cruft to CI. Instead we split jobs across different distributions; the current shards are:

Distro | PHP Flavor
Precise | Zend 5.3
Trusty | Zend 5.5
Jessie | Zend 5.6, Zend 7.0 and HHVM

For Zend 7.0 we use the packages from https://deb.sury.org, which are co-installable from PHP 5.5 upward (iirc). So on Jessie we have /usr/bin/php5, /usr/bin/php7 and /usr/bin/hhvm. We have a php script that picks whatever version is passed to the CI build via a PHP_BIN env variable.
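
The selection step described above could look roughly like the following. This is a minimal sketch in Python, not the actual CI wrapper: the function name `resolve_php_binary`, the `php5` default, and the assumption that `PHP_BIN` holds a bare binary name are all hypothetical. Only the three paths come from the comment.

```python
import os

# Binaries the comment names as installed side by side on Jessie.
KNOWN_BINARIES = {
    "php5": "/usr/bin/php5",
    "php7": "/usr/bin/php7",
    "hhvm": "/usr/bin/hhvm",
}


def resolve_php_binary(env=None, default="php5"):
    """Return the interpreter path selected by PHP_BIN (hypothetical helper).

    Assumes PHP_BIN holds a bare name such as "php7"; the default used when
    the variable is unset is also an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    name = env.get("PHP_BIN", default)
    try:
        return KNOWN_BINARIES[name]
    except KeyError:
        # Fail loudly rather than silently testing with the wrong interpreter.
        raise ValueError(f"unsupported PHP_BIN value: {name!r}") from None
```

A real wrapper would then exec the returned binary with the job's arguments; failing on an unknown value keeps a misconfigured job from passing against an unintended PHP version.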

Eventually we will co-install Zend 5.5 on Jessie as well (T144959) and phase out Trusty instances entirely, especially since Wikimedia production has been doing the same.

The PHP shell script and sharding have a huge advantage: they are ridiculously easy for us to maintain. Set the version you want in CI, invoke the wrapper, which is a few lines of code: success. The oddity is that we have to do some routing to map Zend 5.3 jobs to Precise and Zend 5.5 ones to Trusty.
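
To make the routing concrete, here is a small sketch of the flavor-to-distro mapping taken from the table above. It is illustrative only: the names `FLAVOR_TO_DISTRO`, `distro_for` and `flavors_lost_without` are hypothetical and do not reflect how the Jenkins configuration actually expresses this.

```python
# Distro shard for each PHP flavor, as listed in the table above.
FLAVOR_TO_DISTRO = {
    "Zend 5.3": "Precise",
    "Zend 5.5": "Trusty",
    "Zend 5.6": "Jessie",
    "Zend 7.0": "Jessie",
    "HHVM": "Jessie",
}


def distro_for(flavor):
    """Return the distribution whose CI instances should run this flavor."""
    try:
        return FLAVOR_TO_DISTRO[flavor]
    except KeyError:
        raise ValueError(f"no shard configured for {flavor!r}") from None


def flavors_lost_without(distro):
    """List the flavors that lose CI coverage if a distro's instances go away."""
    return sorted(f for f, d in FLAVOR_TO_DISTRO.items() if d == distro)
```

Under this mapping, `flavors_lost_without("Precise")` returns only Zend 5.3, which is why deleting the Precise instances and dropping Zend 5.3 from CI are the same decision.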

I did try a few times to compile Zend 5.3 on Jessie, stripping down some extensions and borrowing patches here and there. But I am illiterate in C and the PHP build chain, so that did not go too far.

Plan as of now

CI envisioned porting Zend PHP 5.3 support to Trusty or Jessie back in June 2015. Eventually, after much discussion and the work described above (php wrapper, packages from sury.org), I declined the idea. My assumption is that it is not worth the trouble for just the six months or so we would have used them (November 2016 - May 2017).

The end of life of MediaWiki 1.23 is May 2017, which would be during the European Hackathon. At that point:

  • the REL1_23 branches will be closed
  • all the CI related configuration for REL1_23 removed
  • the php53 Jenkins configuration removed
  • the Precise instances deleted

Other solutions?

I understand that Precise has to be dropped due to the end of quarter on 03/31 and, most importantly, Ubuntu dropping support for it in April. It is surely annoying that MediaWiki 1.23 support ends just a month later (in May). So from there I guess the choices are:

  • Phase out Precise / drop PHP 5.3, leaving us without Zend 5.3 CI from April 1st to the May EOL of 1.23. If a new 1.23 security release has to be cut, figure out a manual strategy to test MediaWiki.
  • Keep the Precise instances as-is till EOL. They can use a stripped down set of puppet manifests so the rest can be purged of Precise references.
  • Figure out a way to forward port Zend 5.3 to Trusty/Jessie in a way that is co-installable with the PHP flavors we have (eg /usr/bin/php53 /usr/share/php53 etc). That might work or not depending on lib versions and whatever MediaWiki code ends up relying on.

My preference goes toward keeping the Precise instances for an extra month and a half and phasing them out the moment 1.23 is EOL.

I think we should simply drop 5.3 from the CI tests, then. I wasn't aware that the PHP versions had to be co-installable, which makes a custom 5.3 build for trusty a far more complicated endeavour.

Or you could simply declare that a 1.23 security update released between April and May is no longer supported with PHP 5.3. That makes a lot of sense, since there won't be any distros with 5.3 under security support around either (precise EOLed, squeeze EOLed, RHEL6 now provides official 5.6 packages via the rh-php56 package). And PHP hasn't provided updates for 5.3.x for a long time either.

My personal order of preference would be (best -> worst):

  1. Manually test 1.23 releases with PHP 5.3 (e.g. in a local VM or Vagrant or something); fix PHP 5.3 incompatibilities -if any- right before releasing a tarball. How big is the rate of change in the 1.23 release anyway and how many tarball releases do we anticipate making in that 2-3 month timeframe?
  2. Drop PHP 5.3 support from 1.23, as there is not going to be any security-support for it anyway by anyone.
  3. Keep -and firewall off- the precise instances, keeping them as-is, no puppet
  4. Keep -and firewall off- the precise instances, pointed at a puppetmaster with a fork of our tree that keeps precise support
  5. Stop the support of 1.23 early (I don't see why that is better than dropping PHP 5.3 support and it breaks a promise we previously made)
  6. Forward-port PHP 5.3 to trusty (a lot of work, especially given how CI is set up, for very little gain)
  7. Keep -and firewall off- the precise instances; keep precise support in the prod/Labs puppet tree.

I'm strongly in favor of #1. At this point, 1.23 is likely only to have security updates that need to be manually tested and skip the CI pipeline anyway.

My personal order of preference would be (best -> worst):

  1. Manually test 1.23 releases with PHP 5.3 (e.g. in a local VM or Vagrant or something); fix PHP 5.3 incompatibilities -if any- right before releasing a tarball. How big is the rate of change in the 1.23 release anyway and how many tarball releases do we anticipate making in that 2-3 month timeframe?

Current release is 1.23.15, there will probably be a 1.23.16 before MW LTS EOL, but unsure on timing (before Precise EOL not sure). I don't foresee there being two more releases though, so at worst it is (probably) one.

I'm strongly in favor of #1. At this point, 1.23 is likely only to have security updates that need to be manually tested and skip the CI pipeline anyway.

Let's just do that, and we can do it now even. 5.3 testing isn't so useful to the 1.23 branch (the testing is indeed mostly manual and the diff of backported stuff is minimal). Then we don't have to break back-compat by removing support, declaring an early EOL, or keep any of the precise instances.

I think we're all in agreement here?

I think we should simply drop 5.3 from the CI tests, then. I wasn't aware that the PHP versions had to be co-installable, which makes a custom 5.3 build for trusty a far more complicated endeavour.

I am not sure we really need this: I think we could in theory create special CI slaves to which we could send the php 5.3 jobs where php 5.3 is installed (manually? from a ppa? whatever). It's a bit of puppet work but I am ready to bet, not too much.

Esp. if we're willing to use third-party packages (from a ubuntu ppa).

Having said that, your other remark stays: php 5.3 is EOL since forever according to PHP itself, and no major linux distro will support it beyond April. So it makes sense to drop it from testing after that.

Option #1 is fine with me. Otherwise, @hashar, if you want to go with _joe_'s suggestion I'm available to help wrangle puppet for special-purpose testing nodes.

I regard the fate of the CI instances as in @hashar's hands for now. The releng team will discuss this issue tomorrow and report back here with a plan.

I synced with @thcipriani and @demon about it. We are going to drop the Zend 5.3 support from CI entirely. On the basis that:

  • mediawiki/core security patches are all tested manually, and the currently running tests offer barely any support
  • the few other mediawiki extensions only have the PHP linter, and a Zend 5.5 linter would be good enough

Change 337205 had a related patch set uploaded (by Dzahn):
labs_vagrant: drop precise support

https://gerrit.wikimedia.org/r/337205

Change 337207 had a related patch set uploaded (by Dzahn):
toollabs: drop precise-related monitoring check

https://gerrit.wikimedia.org/r/337207

@hashar, does that mean I can schedule deletion of integration-slave-precise-*? And, if so, would you prefer that I delete them immediately, or postpone until late March?

Later in March. I am busy refactoring Jenkins puppet manifests this week and probably will be next week as well.

I have to figure out all the CI configuration changes that it involves and draft an announcement. Will track progress via the child task T158652.

Once they are no longer needed I will delete the Precise instances and update this task.

Email nag sent to labs-announce on 2017-02-27

Can i have a review for these?

https://gerrit.wikimedia.org/r/#/c/337207/ (drop icinga monitoring check called "tools-checker-grid-start-precise")

https://gerrit.wikimedia.org/r/337205 (drop precise support in vagrant)

Can i have a review for these?

https://gerrit.wikimedia.org/r/#/c/337207/ (drop icinga monitoring check called "tools-checker-grid-start-precise")

We don't want to merge this until 4/1 do we?

I'm in the middle of a move from Hawaii to Texas, Deltaquad would be the
best poc. But if it isn't moved by March 1st then I'll take care of it.

Vr
TParis

Thanks @TParis, I'll mark it as such

@TParis any news here? I notice the instance is still active.

bcb320a5-75ab-4bc3-a4db-b96c618e5a7b | utrs-primary | utrs | ACTIVE | - | Running | public=10.68.16.229, 208.80.155.172

Andrew updated the task description.

Change 337205 abandoned by Dzahn:
labs_vagrant: drop precise support

Reason:
i tried

https://gerrit.wikimedia.org/r/337205

Ricordisamoa renamed this task from Deprecate precise instances in Labs by 03/31/2017 to Deprecate precise instances in Labs by 2017-03-31. Mar 10 2017, 4:00 PM

Status update for integration

Child task is T158652

I have migrated all php53 jobs from the Precise instances to php55 jobs on Trusty instances.

All three slaves have been disabled in Jenkins.

On March 20th, if everything is fine, I will shut down and delete all three instances.

A note that the appointed time grows nigh, and this is quickly becoming the most mysterious item left on the list:

wildcat | dannyb | no | Andrew working with Danny on migration

ping @Danny_B

I have finally deleted all three Precise instances from the integration labs project and updated the task detail to reflect it. The sub task T158652 is still open pending puppet patches, but that is not a concern for this task.

@Danny_B you are the last man standing for precise :) The rest are operational overhead AFAIK. Can you respond here with a time you and @Andrew can get together on this?

thank you @hashar

You are welcome. And thank you, Cloud-Services team, for having babysat and pushed for that Precise phase-out sprint \o/

Andrew claimed this task.

I've just removed all of the shutdown Precise instances.

A few instances remain that have Precise base images but I've reconfirmed that they've all been upgraded in place and are running Trusty or Xenial. They are:

fastcci: fastcci-master (ACTIVE)
fastcci: fastcci-worker1 (ACTIVE)
huggle: huggle (ACTIVE)
maps: maps-wma1 (ACTIVE)
tools: tools-mail (ACTIVE)
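
For anyone wanting to repeat that reconfirmation, one possible check is sketched below. The comment does not say how it was actually done; this assumes each instance exposes Ubuntu's /etc/lsb-release and that its contents have already been fetched as text. The helper names `parse_lsb_release` and `still_on_precise` are hypothetical.

```python
def parse_lsb_release(text):
    """Parse KEY=value lines in the style of /etc/lsb-release into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not a KEY=value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip().strip('"')
    return info


def still_on_precise(text):
    """True if the release file reports Ubuntu Precise (12.04)."""
    info = parse_lsb_release(text)
    return (
        info.get("DISTRIB_CODENAME", "").lower() == "precise"
        or info.get("DISTRIB_RELEASE", "").startswith("12.04")
    )
```

Checking the running system rather than the base image name matters here, because an instance upgraded in place keeps its original Precise image label.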

Change 337207 abandoned by Andrew Bogott:
toollabs: drop precise-related monitoring check

Reason:
Looks like this was already taken care of by ff423bdb3f6f7287b30be36804aea725ca9908fa

https://gerrit.wikimedia.org/r/337207