These hosts are still on precise and need to migrate to jessie. This would also involve a PostgreSQL update from 9.1 to 9.4.
Description
Details
| Status | Assigned | Task |
|---|---|---|
| Resolved | Dzahn | T123525 reduce amount of remaining Ubuntu 12.04 (precise) systems in production |
| Resolved | jcrespo | T123731 Migrate labsdb1005/1006/1007 to jessie |
| Resolved | Marostegui | T157358 labsdb1005 (mysql) maintenance for reimage |
| Duplicate | None | T157359 labsdb1006/1007 (postgresql) maintenance |
Event Timeline
1005 as well:
labsdb1005.eqiad.wmnet: True
labsdb1006.eqiad.wmnet: True
labsdb1007.eqiad.wmnet: True
One thought: I believe we have an influx of new labsdb things coming. This may sort itself out without a lot of in-place shuffling.
There is indeed a replacement for labsdb100[123] about to arrive. However, there are no short-term plans for these, as they have lower impact.
labsdb1005.eqiad.wmnet already has a jessie slave, so it should be possible to schedule downtime for it soon.
Not much is happening for the postgres slaves.
Setting as stalled, though next steps look like this:
- Flip tools master from labsdb1005 to labsdb1004
- Decommission labsdb1005
Not sure about postgresql/osm steps and if the osm roles are jessie-ready yet, @akosiaris perhaps?
Change 318520 had a related patch set uploaded (by Jcrespo):
labsdb-toolsdb: Cleaning up tls certificates
We need to schedule a downtime to do this move from labsdb1005 to labsdb1004. This should be a very short window of actual outage.
Need to sort out impact of this maint.
@yuvipanda I think the asks here, if you could think about them, are:
- Who do we notify that toolsdb is going down?
- What are the impacts of toolsdb going down (just generally)?
-1- assume labs-announce is fine
-2- I'm not sure. Are things using toolsdb going to be OK? Assuming things reconnect on failure (probably a bad assumption), it's a small window of disruption, but I'm looking for input. There may be little we can do to shore that up other than a verbose announcement.
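On the "reconnect on failure" point, here is a minimal sketch of what client-side retrying could look like, assuming a tool written in Python. The `call_with_retry` helper and the `connect` callable are hypothetical names for illustration, not part of any existing tool or library, and a real database driver may raise its own error type rather than `ConnectionError`.

```python
import time


def call_with_retry(connect, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    `connect` is a hypothetical zero-argument callable that opens the
    database connection and raises ConnectionError while the server is down.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            # Do not sleep after the final attempt; just give up.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Tools that wrap their connection setup this way would ride out a short failover window; tools that connect once at startup and never retry would need a restart.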
I wonder if it'll be better to do this next quarter. We've already done a few bits of pretty disruptive maintenance, and have one coming up next week.
If not next quarter, how about 2nd week of December?
> I wonder if it'll be better to do this next quarter.
I am OK with next quarter; let's set a time. I have worked around the 5.5 support in puppet, so this is no longer a blocker.
Just one comment: note that in theory this maintenance is not disruptive.
Early January is here and the 15th is coming up fast -- @yuvipanda rightfully mentioned above that this will need a (presumably advance) notice to labs-announce, so… friendly ping :)
+1. Let's meet at some point to organize the details of how to do it (there are several possibilities) and send an announcement.
I didn't manage to send out the announcement due to unforeseen personal issues. I'll send it out now after checking with jynus.
Update: Since I'll be travelling on the 25th, I'm going to push this out to early February instead. I'll ping @jcrespo when he's back from vacation next week to put a solid date on it and make an announcement.
Change 337775 had a related patch set uploaded (by Yuvipanda):
tools: Make DNS point to labsdb1004 and not 1005
I see that labsdb1005/1006/1007 are all either re-installed or down. They don't show up as precise anymore when checking with salt.
Is this resolved (besides a decom subtask maybe?)?
Dzahn: the "reinstall as jessie" part is done, but the setup of the passive replica is not 100% complete. It will take one commit to fix it and some extra time for the reimport, but there is no way to revert it anymore. I just got distracted with more important ongoing issues.
Change 343670 had a related patch set uploaded (by Jcrespo):
[operations/puppet] Change osm master to be labsdb1007 on configuration
Change 343670 merged by Jcrespo:
[operations/puppet] Change osm master to be labsdb1007 on configuration