Site/Location: eqiad
Number of systems: 1
Service: Airflow - WMDE
Networking Requirements: internal IP - analytics vlan
Processor Requirements: 4 vCPUs
Memory: 8 GB of RAM
Disks: 100 GB
Other Requirements: none
Description
Status | Subtype | Assigned | Task
---|---|---|---
Open | | Manuel | T342331 [EPIC] Set up a sustainable tech stack for Wikidata Analytics
Resolved | | Stevemunene | T340648 [Airflow] Setup Airflow instance for WMDE
Resolved | | Stevemunene | T342424 eqiad: 1 VM request for WMDE Airflow
Event Timeline
Verifying cluster availability and resources via:
stevemunene@cumin1001:~$ sudo cookbook -d sre.ganeti.resource-report eqiad
DRY-RUN: Executing cookbook sre.ganeti.resource-report with args: ['eqiad']
DRY-RUN: START - Cookbook sre.ganeti.resource-report
+-------+-------+-----------+----------+-----------+---------+-----------+
| Group | Nodes | Instances | MFree    | MFree avg | DFree   | DFree avg |
+-------+-------+-----------+----------+-----------+---------+-----------+
| A     | 7     | 37        | 260.4GiB | 37.2GiB   | 13.4TiB | 1.9TiB    |
| B     | 6     | 33        | 207.5GiB | 34.6GiB   | 9.8TiB  | 1.6TiB    |
| C     | 7     | 36        | 265.3GiB | 37.9GiB   | 13.1TiB | 1.9TiB    |
| D     | 6     | 34        | 230.0GiB | 38.3GiB   | 10.4TiB | 1.7TiB    |
+-------+-------+-----------+----------+-----------+---------+-----------+
DRY-RUN: END (PASS) - Cookbook sre.ganeti.resource-report (exit_code=0)
Using group B based on the results.
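As a rough illustration of reading the report against the request (8 GB of RAM, 100 GB of disk), the sketch below parses the table rows and lists the groups whose per-node averages cover the requested size. The function names (`to_gib`, `parse_report`, `groups_that_fit`) are hypothetical, the comparison treats GB and GiB as equal for simplicity, and the check is only an assumed sanity test; it does not reproduce the reasoning behind the choice of group B or any logic inside the cookbook.

```python
import re

# Table rows copied from the resource-report output above (borders omitted).
REPORT = """\
| Group | Nodes | Instances | MFree    | MFree avg | DFree   | DFree avg |
| A     | 7     | 37        | 260.4GiB | 37.2GiB   | 13.4TiB | 1.9TiB    |
| B     | 6     | 33        | 207.5GiB | 34.6GiB   | 9.8TiB  | 1.6TiB    |
| C     | 7     | 36        | 265.3GiB | 37.9GiB   | 13.1TiB | 1.9TiB    |
| D     | 6     | 34        | 230.0GiB | 38.3GiB   | 10.4TiB | 1.7TiB    |
"""

UNITS_IN_GIB = {"GiB": 1.0, "TiB": 1024.0}


def to_gib(value):
    """Convert a string such as '37.2GiB' or '1.6TiB' to a float in GiB."""
    match = re.fullmatch(r"([\d.]+)(GiB|TiB)", value)
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    return float(match.group(1)) * UNITS_IN_GIB[match.group(2)]


def parse_report(text):
    """Return one dict per data row, keyed by the header names."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.startswith("|")
    ]
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def groups_that_fit(text, memory_gib, disk_gib):
    """List groups whose average free memory and disk per node cover the request."""
    return [
        row["Group"]
        for row in parse_report(text)
        if to_gib(row["MFree avg"]) >= memory_gib
        and to_gib(row["DFree avg"]) >= disk_gib
    ]


print(groups_that_fit(REPORT, memory_gib=8, disk_gib=100))
```

By this measure all four groups have room for the requested VM, so the averages alone do not single out one group.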
Created the VM with:
sudo cookbook sre.ganeti.makevm --vcpus 4 --memory 8 --disk 100 --network analytics --os buster --cluster eqiad --group B an-airflow1007
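To make the mapping from the request fields to the command line explicit, here is a minimal sketch that assembles the same invocation from named values. The helper `makevm_command` is hypothetical and only formats a string; the flags are the ones that appear in the command above, and nothing here is part of the cookbook itself.

```python
import shlex


def makevm_command(hostname, vcpus, memory_gb, disk_gb, network, os_name, cluster, group):
    """Build the makevm invocation as a shell-safe string (illustrative helper only)."""
    args = [
        "sudo", "cookbook", "sre.ganeti.makevm",
        "--vcpus", str(vcpus),        # Processor Requirements: 4 vCPUs
        "--memory", str(memory_gb),   # Memory: 8 GB of RAM
        "--disk", str(disk_gb),       # Disks: 100 GB
        "--network", network,         # Networking Requirements: analytics vlan
        "--os", os_name,
        "--cluster", cluster,         # Site/Location: eqiad
        "--group", group,             # group chosen from the resource report
        hostname,
    ]
    return shlex.join(args)


command = makevm_command(
    hostname="an-airflow1007",
    vcpus=4,
    memory_gb=8,
    disk_gb=100,
    network="analytics",
    os_name="buster",
    cluster="eqiad",
    group="B",
)
print(command)
```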
Both makevm and the reimage succeeded:
Reimage completed:
    - an-airflow1007 (**PASS**)
        - Removed from Puppet and PuppetDB if present
        - Deleted any existing Puppet certificate
        - Removed from Debmonitor if present
        - Forced PXE for next reboot
        - Host rebooted via gnt-instance
        - Host up (Debian installer)
        - Set boot media to disk
        - Host up (new fresh buster OS)
        - Generated Puppet certificate
        - Signed new Puppet certificate
        - Run Puppet in NOOP mode to populate exported resources in PuppetDB
        - Found Nagios_host resource for this host in PuppetDB
        - Downtimed the new host on Icinga/Alertmanager
        - First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202308141054_stevemunene_2724910_an-airflow1007.out
        - configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
        - Rebooted
        - Automatic Puppet run was successful
        - Forced a re-check of all Icinga services for the host
        - Icinga status is optimal
        - Icinga downtime removed
        - Updated Netbox data from PuppetDB
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-airflow1007.eqiad.wmnet with OS buster
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-airflow1007.eqiad.wmnet
The VM is online and reachable, so I am resolving this task.