
Netbox: generate CSV backups
Open, High, Public

Description

Currently we perform PostgreSQL backups of Netbox, but if a wrong edit is made it would be hard to find the right values for a quick manual revert without restoring the whole DB and potentially losing changes made by others.
In addition to the DB backup, we could also back up the data in CSV form, using the export-to-CSV function in Netbox.
A script in the netbox-deploy repo that uses the already existing token should do the job and should be fairly simple to add.

Caveat: for the DCIM devices we should use the custom "export all fields" CSV method instead of the default one.

Things to be decided:

  • which objects to export (all?)
  • how frequently to perform the backup
  • which directory structure to use
    • For this my suggestion would be something like:
netbox-csv-backups/
    2019-05-14/
        dcim.devices.csv
        dcim.sites.csv
        ....
  • how/when to rotate/compress the files
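
The suggested layout can be sketched as a small path helper. This is only an illustration of the proposed structure, not code from netbox-deploy: the function name `dump_path` and its parameters are hypothetical.

```python
from datetime import date
from pathlib import Path


def dump_path(root: Path, day: date, app: str, model: str) -> Path:
    """Return <root>/<YYYY-MM-DD>/<app>.<model>.csv (hypothetical helper)."""
    # date.isoformat() yields e.g. "2019-05-14", so directory names sort
    # chronologically when sorted as plain strings.
    return root / day.isoformat() / f"{app}.{model}.csv"


if __name__ == "__main__":
    print(dump_path(Path("netbox-csv-backups"), date(2019, 5, 14), "dcim", "devices"))
```

ISO-formatted directory names keep any later rotation logic simple, since the oldest dump is always the first entry in a sorted listing.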

Details

Related Gerrit Patches:
[operations/puppet@production] netbox: Enable CSV dump rotations.
[operations/software/netbox-deploy@master] Add new dumpbackup.py script

Event Timeline

Volans created this task. May 14 2019, 3:47 PM
Restricted Application added a subscriber: Aklapper. May 14 2019, 3:47 PM
Volans triaged this task as Normal priority. May 14 2019, 3:47 PM

Just to follow up on this: I spent some time trying to figure out how to initiate a template-based export by hitting a URL. There seems to be no way to do it through the API, and as far as I can tell hitting the URL endpoint doesn't work with token authentication.

That said, it would be relatively trivial to export things through the API, so I propose that we proceed by implementing the same functionality via the API instead.
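
A minimal sketch of what an API-based export could look like, assuming the standard Netbox REST behaviour of paginated JSON responses (`results` plus a `next` URL) and `Authorization: Token ...` headers. It is not the dumpbackup.py script itself; the helper names `fetch_json`, `iter_results` and `rows_to_csv` are made up for illustration, and nested values are simply serialized as JSON strings.

```python
import csv
import io
import json
import urllib.request
from typing import Callable, Iterable, Iterator, TextIO


def fetch_json(url: str, token: str) -> dict:
    """Fetch one page from the API using token authentication."""
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_results(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Follow the 'next' links of a paginated listing and yield every object."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page["results"]
        url = page.get("next")


def rows_to_csv(rows: Iterable[dict], out: TextIO) -> int:
    """Write the rows as CSV and return how many were written."""
    rows = list(rows)
    if not rows:
        return 0
    # Use the union of all keys so that sparse objects still fit the header.
    fields = sorted({key for row in rows for key in row})
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for row in rows:
        # Nested objects (dicts/lists) are flattened to JSON strings.
        writer.writerow(
            {k: json.dumps(v) if isinstance(v, (dict, list)) else v for k, v in row.items()}
        )
    return len(rows)


if __name__ == "__main__":
    # Offline demonstration with fake pages instead of a real Netbox instance.
    pages = {
        "p1": {"results": [{"id": 1, "name": "a"}], "next": "p2"},
        "p2": {"results": [{"id": 2, "name": "b"}], "next": None},
    }
    buf = io.StringIO()
    rows_to_csv(iter_results("p1", pages.__getitem__), buf)
    print(buf.getvalue())
```

Against a real instance, `fetch` would be something like `lambda url: fetch_json(url, token)`, with the first URL pointing at the listing endpoint of the object type being dumped.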

Change 518166 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/software/netbox-deploy@master] Add new dumpbackup.py script

https://gerrit.wikimedia.org/r/518166

Change 518166 merged by CRusnov:
[operations/software/netbox-deploy@master] Add new dumpbackup.py script

https://gerrit.wikimedia.org/r/518166

Mentioned in SAL (#wikimedia-operations) [2019-08-13T18:31:47Z] <crusnov@deploy1001> Started deploy [netbox/deploy@367ca84]: Update Netbox to v2.6.1-wmf3 affects: T223292

Mentioned in SAL (#wikimedia-operations) [2019-08-13T18:32:23Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@367ca84]: Update Netbox to v2.6.1-wmf3 affects: T223292 (duration: 00m 36s)

Mentioned in SAL (#wikimedia-operations) [2019-08-13T18:32:31Z] <crusnov@deploy1001> Started deploy [netbox/deploy@367ca84]: Update Netbox to v2.6.1-wmf3 affects: T223292

Mentioned in SAL (#wikimedia-operations) [2019-08-13T18:33:09Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@367ca84]: Update Netbox to v2.6.1-wmf3 affects: T223292 (duration: 00m 43s)

Mentioned in SAL (#wikimedia-operations) [2019-08-13T18:33:39Z] <crusnov@deploy1001> Started deploy [netbox/deploy@367ca84]: Update Netbox to v2.6.1-wmf3 affects: T223292 (fix perms)

Mentioned in SAL (#wikimedia-operations) [2019-08-13T18:33:49Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@367ca84]: Update Netbox to v2.6.1-wmf3 affects: T223292 (fix perms) (duration: 00m 09s)

crusnov closed this task as Resolved. Aug 13 2019, 6:43 PM

This has now been fully deployed and tested. It is automated.

faidon reopened this task as Open. Fri, Oct 18, 11:59 AM
faidon raised the priority of this task from Normal to High.

It looks like this just dumps files flat, without keeping any archives. We've already lost a bunch of history unfortunately :(

The original description stated:

netbox-csv-backups/
    2019-05-14/
        dcim.devices.csv
        dcim.sites.csv
        ...

…and I think that's a much better idea. Let's do that ASAP.

crusnov added a comment. Edited Fri, Oct 18, 11:07 PM

You are correct that this was the spec; however, for a time Riccardo and I had agreed that using bacula for history would be preferable.

Based on your feedback after its release, I have implemented a rotation strategy for this. I need to revisit the patch, but it has been signed off on, so I will get it merged this coming week.

Ping! I'd like to start killing old entries from esams, but I'd like to make sure we have them backed up first.

Ah roger, just need a quick deploy and done.

Mentioned in SAL (#wikimedia-operations) [2019-10-25T14:27:07Z] <crusnov@deploy1001> Started deploy [netbox/deploy@690f9ae]: deploy netbox scripts T223292

Mentioned in SAL (#wikimedia-operations) [2019-10-25T14:28:09Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@690f9ae]: deploy netbox scripts T223292 (duration: 01m 02s)

Mentioned in SAL (#wikimedia-operations) [2019-10-25T14:30:48Z] <crusnov@deploy1001> Started deploy [netbox/deploy@690f9ae]: deploy netbox scripts (netbox2001) T223292

Mentioned in SAL (#wikimedia-operations) [2019-10-25T14:30:53Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@690f9ae]: deploy netbox scripts (netbox2001) T223292 (duration: 00m 05s)

Mentioned in SAL (#wikimedia-operations) [2019-10-25T14:31:45Z] <crusnov@deploy1001> Started deploy [netbox/deploy@690f9ae]: deploy netbox scripts (netbox2001) -T223292

Mentioned in SAL (#wikimedia-operations) [2019-10-25T14:32:29Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@690f9ae]: deploy netbox scripts (netbox2001) -T223292 (duration: 00m 44s)

Mentioned in SAL (#wikimedia-operations) [2019-10-25T16:04:08Z] <crusnov@deploy1001> Started deploy [netbox/deploy@0f4c92d]: deploy netbox scripts update (netbox2001) T223292

Mentioned in SAL (#wikimedia-operations) [2019-10-25T16:04:51Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@0f4c92d]: deploy netbox scripts update (netbox2001) T223292 (duration: 00m 43s)

Mentioned in SAL (#wikimedia-operations) [2019-10-25T16:05:50Z] <crusnov@deploy1001> Started deploy [netbox/deploy@0f4c92d]: deploy netbox scripts update (netbox1001) T223292

Change 545123 had a related patch set uploaded (by CRusnov; owner: CRusnov):
[operations/puppet@production] netbox: Enable CSV dump rotations.

https://gerrit.wikimedia.org/r/545123

Change 545123 merged by CRusnov:
[operations/puppet@production] netbox: Enable CSV dump rotations.

https://gerrit.wikimedia.org/r/545123

Mentioned in SAL (#wikimedia-operations) [2019-10-25T16:19:20Z] <crusnov@deploy1001> Finished deploy [netbox/deploy@0f4c92d]: deploy netbox scripts update (netbox1001) T223292 (duration: 13m 31s)

Oke doke, rotations are in place. Just a note: the old pre-rotation dumps are still backed up in bacula, so they remain available for historical data. From now on we'll have several timestamped historical dumps, in addition to their being backed up to bacula.

After running the numbers and looking at the way the rotations work: the currently deployed version dumps 24 times a day and keeps 16 of them, which doesn't seem that useful. So there is a patch, https://gerrit.wikimedia.org/r/#/c/operations/software/netbox-deploy/+/546241, that changes the script to overwrite the daily directory named after today's date (so it'll dump repeatedly to 2019-10-25 until 2019-10-26) and to rotate after 365 such dumps. This seems more useful in general.

Also, it is trivial to create a persistent dump, for example on the occasion of a major change to Netbox, by doing a manual dump and renaming it to something like "esams_purge-2019-10-25". This preserves it from rotation, since the rotate script only looks at directories that start with 20*.
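
To illustrate the behaviour described in the last two comments, here is a rough sketch of such a rotation, not the deployed script: it keeps the newest `keep` directories whose names start with "20" and never touches anything else, so a renamed dump survives. The function name `rotate` and its parameters are assumptions.

```python
import shutil
from pathlib import Path
from typing import List


def rotate(root: Path, keep: int = 365, prefix: str = "20") -> List[str]:
    """Delete the oldest dated dump directories, keeping the newest `keep`.

    Only directories whose name starts with `prefix` are considered, so a
    manually renamed dump such as "esams_purge-2019-10-25" is left alone.
    Returns the names of the removed directories.
    """
    # ISO date names (YYYY-MM-DD) sort chronologically as plain strings.
    dated = sorted(
        p for p in root.iterdir() if p.is_dir() and p.name.startswith(prefix)
    )
    doomed = dated[:-keep] if keep > 0 else dated
    for path in doomed:
        shutil.rmtree(path)
    return [p.name for p in doomed]
```

With one directory per day and `keep=365`, this retains roughly a year of daily dumps, plus any persistent dumps that were renamed out of the "20" namespace.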

As I've not yet fully understood the use case for those files, given that AFAIK most of them cannot be re-imported as-is into Netbox, it's hard for me to give feedback on the frequency of the backups and their retention.
If I have to ballpark it while keeping it simple, then the standard "hourly for a week, daily for the rest of the retention period" might be a good compromise.
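
One possible reading of "hourly for a week, daily for the rest" is sketched below, purely as an illustration. It assumes that "daily" means keeping the newest dump of each calendar day and that the overall retention is 365 days (borrowed from the rotation patch above); neither was decided in this task, and `select_kept` is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def select_kept(
    stamps: Iterable[datetime],
    now: datetime,
    hourly_days: int = 7,       # assumption: "a week" of hourly dumps
    retention_days: int = 365,  # assumption: total retention period
) -> List[datetime]:
    """Return the dump timestamps that the policy would keep, oldest first."""
    hourly_cutoff = now - timedelta(days=hourly_days)
    oldest_allowed = now - timedelta(days=retention_days)
    kept = []
    seen_days = set()
    # Walk from newest to oldest so the newest dump of each old day wins.
    for ts in sorted(stamps, reverse=True):
        if ts < oldest_allowed:
            continue  # beyond the retention period: drop
        if ts >= hourly_cutoff:
            kept.append(ts)  # within the last week: keep every dump
        elif ts.date() not in seen_days:
            seen_days.add(ts.date())
            kept.append(ts)  # older: keep one dump per calendar day
    return sorted(kept)
```

Anything not returned by `select_kept` would be a candidate for deletion; how that maps onto directory names is left open here.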