Upgrade netbox-next to 2.9 series
Closed, Resolved, Public

Event Timeline

crusnov created this task. Oct 26 2020, 5:09 PM
Restricted Application added a subscriber: Aklapper. Oct 26 2020, 5:09 PM
crusnov triaged this task as Medium priority. Oct 26 2020, 5:09 PM

Change 636464 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/software/netbox-deploy@master] Upgrade Netbox to 2.9.7

https://gerrit.wikimedia.org/r/636464

Mentioned in SAL (#wikimedia-operations) [2020-11-24T18:44:36Z] <crusnov@deploy1001> Started deploy [netbox/deploy@88f61d0]: Test deploy of 2.9.9 to netbox-next T266488

Mentioned in SAL (#wikimedia-operations) [2020-11-24T18:45:46Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@88f61d0]: Test deploy of 2.9.9 to netbox-next T266488 (duration: 01m 09s)

Mentioned in SAL (#wikimedia-operations) [2020-11-24T18:45:52Z] <crusnov@deploy1001> Started deploy [netbox/deploy@88f61d0]: Test deploy of 2.9.9 to netbox-next T266488 p2

Mentioned in SAL (#wikimedia-operations) [2020-11-24T18:45:59Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@88f61d0]: Test deploy of 2.9.9 to netbox-next T266488 p2 (duration: 00m 05s)

Change 643354 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/puppet@production] netbox: Adjust settings for supporting Netbox 2.9 series

https://gerrit.wikimedia.org/r/643354

Mentioned in SAL (#wikimedia-operations) [2020-11-24T23:48:51Z] <crusnov@deploy1001> Started deploy [netbox/deploy@0362a12]: Test deploy of 2.9.10 to netbox-next T266488

Mentioned in SAL (#wikimedia-operations) [2020-11-24T23:50:42Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@0362a12]: Test deploy of 2.9.10 to netbox-next T266488 (duration: 01m 51s)

Mentioned in SAL (#wikimedia-operations) [2020-11-24T23:50:48Z] <crusnov@deploy1001> Started deploy [netbox/deploy@0362a12]: Test deploy of 2.9.10 to netbox-next T266488 p2

Mentioned in SAL (#wikimedia-operations) [2020-11-24T23:50:54Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@0362a12]: Test deploy of 2.9.10 to netbox-next T266488 p2 (duration: 00m 05s)

Volans added a comment. (Edited) Nov 25 2020, 9:34 AM

What's the status on this? Were the db migrations applied cleanly?

This morning I've found:

  • Puppet disabled on netbox-dev2001 without a reason (I personally can guess it but others would have no idea, and anyway it's far from ideal keeping Puppet disabled overnight).
    • I've changed the reason to Upgrading Netbox - chaomodus - T266488
  • netbox-next with DEBUG=True in Django, that should never be the case for a publicly accessible website.
    • I've reverted it back to False and restarted uwsgi.
  • I might be wrong but it seems to me that netbox-next data was not refreshed from prod as agreed before the upgrade. If that's true I see a few problems:
    • We didn't fully test the upgrade with the data we have in production, and Arzhel has imported quite a lot of data recently.
    • We can't update netbox-next data from production until we upgrade production too (because of schema changes), unless we re-run the upgrade, which would run the schema changes again (but potentially differently from how it will be done in production).
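
As an illustration of the DEBUG point above, here is a minimal sketch of a check that could be run against a settings file before exposing an instance. It assumes the setting appears as a plain top-level `DEBUG = ...` assignment; the function name `read_debug_flag` and the sample text are hypothetical and not taken from the netbox-next configuration.

```python
import ast


def read_debug_flag(settings_source: str):
    """Return the literal value of a top-level DEBUG assignment, or None if absent."""
    tree = ast.parse(settings_source)
    value = None
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "DEBUG":
                    # literal_eval only accepts plain literals such as True/False
                    value = ast.literal_eval(node.value)
    return value


# Hypothetical settings snippet, not the real netbox-next configuration.
sample = "ALLOWED_HOSTS = ['*']\nDEBUG = True\n"
if read_debug_flag(sample):
    print("DEBUG is enabled; set it to False before exposing the site")
```

The file is parsed rather than imported, so the check does not execute the settings module.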

What's the status on this? Were the db migrations applied cleanly?

DB migrations eventually applied cleanly, yes.

This morning I've found:

  • Puppet disabled on netbox-dev2001 without a reason (I personally can guess it but others would have no idea, and anyway it's far from ideal keeping Puppet disabled overnight).
    • I've changed the reason to Upgrading Netbox - chaomodus - T266488

Thanks for updating the message.

  • netbox-next with DEBUG=True in Django, that should never be the case for a publicly accessible website.
    • I've reverted it back to False and restarted uwsgi.

That's reasonable.

  • I might be wrong but it seems to me that netbox-next data was not refreshed from prod as agreed before the upgrade. If that's true I see a few problems:
    • We didn't fully test the upgrade with the data we have in production, and Arzhel has imported quite a lot of data recently.
    • We can't update netbox-next data from production until we upgrade production too (because of schema changes), unless we re-run the upgrade, which would run the schema changes again (but potentially differently from how it will be done in production).

No, it was not refreshed from prod, but it is trivial for us to update it from the prod db and then migrate again.

Change 636464 abandoned by CRusnov:
[operations/software/netbox-deploy@master] Upgrade Netbox to 2.9.7

Reason:
we've skipped ahead and merged that change already

https://gerrit.wikimedia.org/r/636464

Just to be abundantly clear about the current state of affairs:

  • puppet is disabled on netbox-dev2001. This is necessary for changes to the configuration needed to run Netbox 2.9, until this puppet patch is merged: https://gerrit.wikimedia.org/r/c/operations/puppet/+/643354
    • I have replicated the changes this patch makes on netbox-dev2001 in anticipation of those changes in Puppet, including installing Redis.
  • I have updated the database from a dump I made a short while ago and ran the migrations; they ran cleanly.
Running migrations:
  Applying auth.0012_alter_user_first_name_max_length... OK
  Applying circuits.0019_nullbooleanfield_to_booleanfield... OK
  Applying virtualization.0015_vminterface... OK
  Applying ipam.0037_ipaddress_assignment... OK
  Applying virtualization.0016_replicate_interfaces...
    Replicating 193 VM interfaces...
    Replicating assigned objects...
      0/193 (0%)
 OK
  Applying dcim.0107_component_labels... OK
  Applying dcim.0108_add_tags... OK
  Applying dcim.0109_interface_remove_vm... OK
  Applying dcim.0110_virtualchassis_name... OK
  Applying dcim.0111_component_template_description... OK
  Applying dcim.0112_standardize_components... OK
  Applying dcim.0113_nullbooleanfield_to_booleanfield... OK
  Applying dcim.0114_update_jsonfield... OK
  Applying dcim.0115_rackreservation_order... OK
  Applying dcim.0116_rearport_max_positions... OK
  Applying extras.0043_report... OK
  Applying extras.0044_jobresult... OK
  Applying extras.0045_configcontext_changelog... OK
  Applying extras.0046_update_jsonfield... OK
  Applying extras.0047_tag_ordering... OK
  Applying users.0007_proxy_group_user... OK
  Applying users.0008_objectpermission... OK
  Applying users.0009_replicate_permissions... OK
  Applying secrets.0009_secretrole_drop_users_groups... OK
  Applying users.0010_update_jsonfield... OK
  Applying virtualization.0017_update_jsonfield... OK
  • I haven't verified the data very extensively as a result of the migrations, but some spot checks on VMs show they have the expected interfaces.
  • I believe we are good to go to continue testing the rest of the scripts and integrations. For the record I've been using the csv dumper as a test of the API (and it wasn't ridiculously slow, luckily). It works fine but likely needs updates for additional records.
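
For anyone re-running this, a rough sketch of how a saved copy of migration output like the one above could be scanned for steps that never reported OK. It assumes only the line shapes visible in this log ("Applying <name>... OK", with the OK sometimes arriving on a later line); the function name `unfinished_migrations` is hypothetical.

```python
import re
from typing import List

# Matches "Applying <name>..." with an optional trailing OK on the same line.
APPLYING = re.compile(r"^\s*Applying (?P<name>\S+?)\.\.\.\s*(?P<status>OK)?\s*$")


def unfinished_migrations(log_text: str) -> List[str]:
    """Return migrations that were started but never reported OK."""
    pending = None  # migration whose OK may arrive on a later line
    unfinished = []
    for line in log_text.splitlines():
        match = APPLYING.match(line)
        if match:
            if pending is not None:
                # A new migration started before the previous one reported OK.
                unfinished.append(pending)
            pending = None if match.group("status") else match.group("name")
        elif line.strip() == "OK":
            pending = None
    if pending is not None:
        unfinished.append(pending)
    return unfinished


sample = """\
  Applying ipam.0037_ipaddress_assignment... OK
  Applying virtualization.0016_replicate_interfaces...
    Replicating 193 VM interfaces...
 OK
  Applying dcim.0107_component_labels... OK
"""
print(unfinished_migrations(sample))
```

An empty list only means every step printed OK; it says nothing about the correctness of the migrated data, so spot checks are still needed.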

@crusnov what's the timeline for the upgrade?
We can't leave the host with Puppet disabled any longer. Apart from not getting any changes in the base system, which might include security fixes, after a week it gets automatically evicted from PuppetDB and hence from Icinga, and becomes a ghost host.
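
To make the deadline concrete, a small sketch of the arithmetic. The one-week cutoff is taken from the comment above and treated here as exactly seven days, which is an assumption; the function name and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the comment above: eviction happens after one week.
EVICTION_AFTER = timedelta(days=7)


def time_until_eviction(disabled_at: datetime, now: datetime) -> timedelta:
    """Return the time left before the assumed one-week cutoff (negative if past)."""
    return disabled_at + EVICTION_AFTER - now


# Hypothetical timestamps for illustration only.
disabled_at = datetime(2020, 11, 24, 18, 0, tzinfo=timezone.utc)
now = datetime(2020, 11, 30, 18, 0, tzinfo=timezone.utc)
remaining = time_until_eviction(disabled_at, now)
if remaining <= timedelta(0):
    print("host is past the assumed eviction cutoff")
else:
    print(f"about {remaining.days} day(s) left before the assumed cutoff")
```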

As soon as we can merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/643354 we can re-enable puppet. This patch should work on existing instances as well.

Mentioned in SAL (#wikimedia-operations) [2020-12-14T18:03:42Z] <crusnov@deploy1001> Started deploy [netbox/deploy@2fc439e]: Redeploy Netbox 2.8 to netbox-next T266488 p1

Mentioned in SAL (#wikimedia-operations) [2020-12-14T18:04:15Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@2fc439e]: Redeploy Netbox 2.8 to netbox-next T266488 p1 (duration: 00m 33s)

Mentioned in SAL (#wikimedia-operations) [2020-12-14T18:04:18Z] <crusnov@deploy1001> Started deploy [netbox/deploy@2fc439e]: Redeploy Netbox 2.8 to netbox-next T266488 p2

Mentioned in SAL (#wikimedia-operations) [2020-12-14T18:04:23Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@2fc439e]: Redeploy Netbox 2.8 to netbox-next T266488 p2 (duration: 00m 05s)

Change 649436 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/puppet@production] netbox: Add only non-2.8 compatible setting for Netbox

https://gerrit.wikimedia.org/r/649436

Just to be clear about the deploy plan:

  • We will merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/643354/
  • We will complete porting scripts and things to 2.9 [next day or so]

[in one go when deploying, early next week]

Change 643354 merged by CRusnov:
[operations/puppet@production] netbox: Adjust settings for supporting Netbox 2.9 series

https://gerrit.wikimedia.org/r/643354

This production plan is completed up to deploying 2.9 to production servers. We need to test script changes on 2.9 on -next now that the Puppet patch has been deployed.

jbond added a subscriber: jbond. Mon, Jan 4, 4:18 PM

I notice that https://gerrit.wikimedia.org/r/643354 is merged; however, Puppet is still disabled on netbox-dev2001. Puppet has now been disabled long enough that the node has been purged from PuppetDB, which is now causing /usr/local/sbin/check-cumin-aliases to fail as there are no longer any nodes in the netbox-canary alias. Can we re-enable Puppet on netbox-dev2001?

Ah, my mistake. Yes, it can be re-enabled. I shall do so now.

Change 655040 had a related patch set uploaded (by Volans; owner: Volans):
[operations/software/netbox-extras@master] dns: migrate script to Netbox 2.9+

https://gerrit.wikimedia.org/r/655040

crusnov added a comment. (Edited) Mon, Jan 11, 7:30 PM

(comment moved to appropriate 'live' ticket)

crusnov closed this task as Resolved. Mon, Jan 11, 7:48 PM
crusnov moved this task from Patches / Reviews / WIP to Complete on the netbox board.

We have completed pre-development and we're ready to move production onto the upgrade path.

Change 655040 merged by CRusnov:
[operations/software/netbox-extras@master] dns: migrate script to Netbox 2.9+

https://gerrit.wikimedia.org/r/655040

Change 649436 merged by CRusnov:
[operations/puppet@production] netbox: Add only non-2.8 compatible setting for Netbox

https://gerrit.wikimedia.org/r/649436