Downgrade pfw1-codfw to Junos 23.4R2-S3
Closed, Resolved (Public)

Description

pfw1-codfw currently runs 24.2R1.17, but some gNMI features are not working (see https://grafana.wikimedia.org/d/b0794d9e-cbc6-4fb6-aa24-fda6ce7c226c/gnmi-features-coverage?orgId=1).
On the other hand, pfw1-eqiad is working as expected on 23.4.

Could you sync up with the frack team to downgrade pfw1-codfw to a known working version?

Event Timeline

@Jgreen @Dwisehaupt When do you think is best for me to work on this? Thank you.

Papaul triaged this task as Medium priority. May 14 2025, 12:43 AM

We have a frack maintenance week starting 5/27. If this is something that can wait that long, let's plan it for early that week.

Yeah, I don't think it's super urgent, so that should be fine.

@ayounsi @cmooney Since I am out that week, can someone take this over, or should it wait until I am back? Thanks.

cmooney added a subscriber: Papaul.

Of course yeah, I can take care of it.

Just pinging on this. Maintenance week is this week and we are ok for the work to happen when you are ready. Morning to midday UTC is extra good for us as most of the US folks will still be asleep.

cmooney lowered the priority of this task from Medium to Low. Jun 4 2025, 11:57 AM

Dallas, my apologies, I missed this note last week; for some reason my mental calendar had this pencilled in for mid-June. We'll need to pick it up in the next maintenance window instead, I think. It's not terribly urgent: the benefit on our side is just to standardise the version with eqiad, and there is no pressing functional issue.

Just a note on this one: we hit another issue with gNMI stats on this device.

In a change made a few days ago we added the "openconfig:" prefix to the BGP stats collection, which is needed for our Nokia devices to tell them to use the OpenConfig models. For all our other Juniper devices, including pfw1-eqiad, this works fine.

However, pfw1-codfw is now failing to return the stats because of this. So there's more reason to downgrade - I guess during the February maintenance window?
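
To illustrate the idea of choosing the BGP collection path per vendor, here is a minimal Python sketch. It is not the actual puppet or gnmic configuration: the function names, the `NEEDS_ORIGIN` set, the example path, and the host `nokia-example` are all hypothetical, and it assumes that only Nokia devices need the "openconfig:" prefix.

```python
# Hypothetical sketch only; the real logic lives in the puppet gnmic config.

# Illustrative OpenConfig BGP path (assumption: the real collection path may differ).
BGP_PATH = "/network-instances/network-instance/protocols/protocol/bgp"

# Assumption: vendors that need an explicit origin to select the OpenConfig models.
NEEDS_ORIGIN = {"nokia"}


def bgp_path_for(vendor: str, base_path: str = BGP_PATH) -> str:
    """Return the subscription path, prefixed with 'openconfig:' where required."""
    if vendor.lower() in NEEDS_ORIGIN:
        return f"openconfig:{base_path}"
    return base_path


def group_targets(targets: dict[str, str]) -> dict[str, list[str]]:
    """Group hosts by the path they should be subscribed with."""
    groups: dict[str, list[str]] = {}
    for host, vendor in sorted(targets.items()):
        groups.setdefault(bgp_path_for(vendor), []).append(host)
    return groups


if __name__ == "__main__":
    # 'nokia-example' is a made-up hostname used purely for illustration.
    example = {"pfw1-codfw": "juniper", "pfw1-eqiad": "juniper", "nokia-example": "nokia"}
    for path, hosts in group_targets(example).items():
        print(path, hosts)
```

Grouping by path rather than by vendor keeps the two collection targets easy to merge again once every device accepts the same path.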

Change #1198515 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] gnmic: split bgp collection targets for Juniper/Nokia

https://gerrit.wikimedia.org/r/1198515

Change #1198515 merged by Cathal Mooney:

[operations/puppet@production] gnmic: split bgp collection targets for Juniper/Nokia

https://gerrit.wikimedia.org/r/1198515

@cmooney We are clear for you to do this. We have a maint window coming up 2025-11-03 - 2025-11-07 if you want to do it quickly. Otherwise we are ok with it in the February window 2026-02-02 - 2026-02-06. Just let us know when works best.

@Dwisehaupt Hello, yes, we can do this during the maintenance window in November. Is there a day you prefer that week? Thank you.

@Papaul How does Wednesday 11/5 work? That would be good for us. Thursday would be a good alternative also.

@Dwisehaupt Yes, Wednesday 11/5 is OK with me. Let's do 10:00 am CT. Thank you.

Icinga downtime and Alertmanager silence (ID=0cf41cdd-05b0-49da-ab43-1e9132f58a47) set by pt1979@cumin2002 for 2:00:00 on 1 host(s) and their services with reason: pfw1a/b-codfw

pfw1-codfw

Both firewalls are now running Junos 23.4R2-S5.5. Thanks to @Jgreen and @Dwisehaupt.
Closing this task now.

Change #1202205 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] gnmic config: merge juniper and nokia bgp collection again

https://gerrit.wikimedia.org/r/1202205

Change #1202205 merged by Cathal Mooney:

[operations/puppet@production] gnmic config: merge juniper and nokia bgp collection again

https://gerrit.wikimedia.org/r/1202205