- Write a mass import / mass update script so that back changes and out of sync changes can be synchronized with Netbox.
|Open||crusnov||T213114 Q3 2018/19 Goal: TEC6: Build automated workflows for server provisioning (Tracking Task)|
|Resolved||crusnov||T215229 Keep Ganeti VMs synchronized in Netbox|
gnt-instance list --reason="crusnov - starting netbox sync dev" -o name,os,status,mac,ip,be/maxmem,vcpus,disk.sizes,tags,sda_size,sdb_size --separator=' ' gives a list of instances in a machine-parsable format with all of the pertinent data for Netbox.
Also of note: that command exits with status 1 on non-master machines, so a deployment option would be to push the crontab or whatever to every Ganeti host and check the return status of that command to determine whether the host is the master. If it is not, the script would simply exit rather than attempting a sync.
Internal API equivalent: cl.Query(ganeti.constants.QR_INSTANCE, ['name','os','status','mac','ip','be/maxmem','vcpus','disk.sizes','tags'], ganeti.qlang.MakeSimpleFilter('name', None)).data where cl is initialized from ganeti.cli.GetClient
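A minimal sketch of the "run everywhere, bail out on non-masters" idea, assuming the exit status 1 behaviour described above. The helper names (parse_instances, list_instances) are hypothetical, the field list is the one from the command above minus the sda_size/sdb_size columns, and --no-headers is added here so that the output contains only data rows.

```python
import subprocess

# Fields requested from gnt-instance list, in output order.
FIELDS = ["name", "os", "status", "mac", "ip",
          "be/maxmem", "vcpus", "disk.sizes", "tags"]


def parse_instances(output, fields=FIELDS, separator=" "):
    """Turn separator-delimited gnt-instance output into a list of dicts."""
    instances = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        values = line.split(separator)
        if len(values) != len(fields):
            raise ValueError("unexpected column count in line: %r" % line)
        instances.append(dict(zip(fields, values)))
    return instances


def list_instances(fields=FIELDS, separator=" "):
    """Return parsed instances, or None when this host is not the master.

    Assumption: exit status 1 means "not the master node", as observed
    above; any other non-zero status is treated as a real error.
    """
    cmd = ["gnt-instance", "list", "--no-headers",
           "-o", ",".join(fields), "--separator=" + separator]
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    if result.returncode == 1:
        return None
    result.check_returncode()
    return parse_instances(result.stdout, fields, separator)
```

A cron job or timer on every Ganeti host could call list_instances() and exit quietly when it returns None.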
It raises ganeti.errors.OpPrereqError: ("This is not the master node, please connect to node 'ganeti1003.eqiad.wmnet' and rerun the command", 'wrong_input') on non-master nodes.
I had a conversation with Alex about this. His suggestion is to write the sync script for hosting on the netbox instances, and consume rapi from ganeti01.svc.*.wmnet, to be run periodically to sync the state into Netbox. Hooks would be right out of the picture. This seems like a good avenue.
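The periodic sync itself boils down to comparing the state reported by Ganeti with what Netbox already holds. A rough sketch of that comparison follows; plan_sync and the attribute names in the dicts are hypothetical and not taken from any existing script.

```python
def plan_sync(ganeti_vms, netbox_vms):
    """Compare two {name: attributes} mappings and plan the changes.

    Returns (to_create, to_update, to_delete):
      to_create: VMs present in Ganeti but missing from Netbox
      to_update: per-VM dict of attributes whose values differ
      to_delete: sorted names present in Netbox but gone from Ganeti
    """
    to_create = {name: attrs for name, attrs in ganeti_vms.items()
                 if name not in netbox_vms}
    to_delete = sorted(set(netbox_vms) - set(ganeti_vms))

    to_update = {}
    for name, attrs in ganeti_vms.items():
        if name not in netbox_vms:
            continue
        current = netbox_vms[name]
        changed = {key: value for key, value in attrs.items()
                   if current.get(key) != value}
        if changed:
            to_update[name] = changed
    return to_create, to_update, to_delete
```

Keeping the comparison separate from the API calls means the plan can be logged or dry-run before anything is written to Netbox.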
So the procedure:
- Open rapi port to netmon*
- Add read-only user to ganeti's rapi authentication stuff http://docs.ganeti.org/ganeti/master/html/rapi.html#users-and-passwords
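Once the port is open and a read-only user exists, fetching the instance list over rapi could look roughly like the sketch below. It assumes the default rapi port (5080), HTTP basic authentication, and the /2/instances resource with bulk=1. The function names and the credentials are placeholders.

```python
import base64
import json
import urllib.request


def build_instances_request(host, user, password, port=5080):
    """Build an authenticated GET request for the bulk instance list."""
    url = "https://{}:{}/2/instances?bulk=1".format(host, port)
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Accept", "application/json")
    return request


def fetch_instances(request, timeout=10):
    """Send the request and decode the JSON list of instances."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

On the Netbox host this would be called with the rapi service name for the relevant site and the read-only credentials, for example fetch_instances(build_instances_request(host, "netbox-ro", password)). TLS verification uses the system trust store here, so the rapi certificate has to be trusted by the Netbox host.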
For the deploy of the sync script:
- Add the script to scripts/ in the netbox-deploy and add pynetbox to the freeze-requirements.sh
- Add a timer unit to systemd on netbox master host (using a puppet if to only deploy to master). See timer examples:
10:14:47 <volans> icinga/templates/initscripts/update-etcd-mw-config-lastindex.timer.systemd.erb
10:15:05 <volans> modules/icinga/templates/initscripts/update-etcd-mw-config-lastindex.systemd.erb
- Make sure that the timer unit from puppet happens after the scap pull in puppet.
One additional niggle once ownership is worked out: how to change the -b parameter to gnt-rapi. It is set in /etc/defaults/ganeti and sets the listen address for the rapi daemon, which is currently 127.0.0.1. I'm not sure where this is configured, since I don't see it being set up in Puppet.
One ongoing discussion we've been having about managing authorization tokens in Netbox is how to track where changes are coming from. Currently the general idea is to have one read-only and one read-write token used for production in Puppet, so that regenerating a token would be as easy as creating a new one and changing it in Puppet, at which point all consumers of the Netbox API are updated. The major downside of this is that we cannot track precisely which script is interacting with or making changes through the Netbox API. The initial idea was to generate a separate token for each usage, but Netbox doesn't appear to track which token is used for any given API call, only the user ID, so this seems sort of pointless (it is conceivable to patch Netbox to track and expose this information). The only definitive way is to make separate users for each application, but the management overhead seems a bit ridiculous and is not preferred. Another option may be to add a changelog parameter to the API and have that recorded and exposed in the extras_objectchange record. I guess the big question is: what level of tracking is desired? This is out of scope for now but will become pertinent as more scripts start changing Netbox's contents.
One thing that is missing is the physical devices that belong to a cluster, see https://netbox.wikimedia.org/virtualization/clusters/3/
It's probably something that this script should take care of IMHO. Thoughts?
I don't expect that to change all that often, but I agree that the script could take that into account (there is an API for those devices, of course). Now that it's in place it should be straightforward to modify.