
Site: eqiad|codfw VM request for Kafka Burrow Lag monitoring
Closed, Resolved · Public

Description

Site/Location: eqiad and codfw
Number of systems: 2 (one for each DC)
Service: Kafka Burrow
Networking Requirements: internal IP, no specific requirement
Processor Requirements: 2 or 4
Memory: 8GB
Disks: 50/100GB
Other Requirements: None

Reference: https://phabricator.wikimedia.org/T187805

Event Timeline

elukey created this task. Feb 21 2018, 4:05 PM
Restricted Application added a subscriber: Aklapper. Feb 21 2018, 4:05 PM
Dzahn added a subscriber: Dzahn. Feb 21 2018, 8:48 PM

What host names do we want to use?

Dzahn added a comment. Feb 21 2018, 8:48 PM

(because DNS needs to exist before anything else, and then the VMs can be created)

Dzahn added a comment. Feb 21 2018, 8:56 PM

How about "kafkamon", akin to "netmon"?

Change 413281 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] introduce kafkamon1001/2001

https://gerrit.wikimedia.org/r/413281

Change 413283 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] partman: add kafkamon[1-2]00[0-9]

https://gerrit.wikimedia.org/r/413283

Change 413281 merged by Dzahn:
[operations/dns@master] introduce kafkamon1001/2001

https://gerrit.wikimedia.org/r/413281

Change 413283 merged by Dzahn:
[operations/puppet@production] partman: add kafkamon[1-2]00[0-9]

https://gerrit.wikimedia.org/r/413283
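
As a rough illustration of which host names the partman pattern in the change title above would cover, here is a minimal sketch. It assumes the pattern `kafkamon[1-2]00[0-9]` is interpreted as a shell-style glob matched against the short host name; the helper name `matches_partman_pattern` is hypothetical and is not part of the puppet repository.

```python
from fnmatch import fnmatchcase

# Pattern copied from the change title; treating it as a shell-style glob
# is an assumption made for this sketch.
PATTERN = "kafkamon[1-2]00[0-9]"


def matches_partman_pattern(hostname: str, pattern: str = PATTERN) -> bool:
    """Return True if the short host name matches the glob pattern.

    fnmatchcase is case-sensitive and matches the whole string.
    """
    return fnmatchcase(hostname, pattern)


if __name__ == "__main__":
    # kafkamon1001 and kafkamon2001 are the hosts from this task;
    # the other names are made-up counterexamples.
    for host in ("kafkamon1001", "kafkamon2001", "kafkamon3001", "kafkamon1010"):
        print(host, matches_partman_pattern(host))
```

Under that assumption, the pattern covers the two new VMs and leaves room for up to ten hosts per site (suffixes 000 through 009 in each of the 1xxx and 2xxx ranges) without a further partman change.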

Mentioned in SAL (#wikimedia-operations) [2018-02-23T04:53:35Z] <mutante> ganeti: creating new VM kafkamon1001 - vcpus=2,memory=8g,disk=60G, row_A eqiad (T187901)

Mentioned in SAL (#wikimedia-operations) [2018-02-23T04:56:27Z] <mutante> ganeti: ganeti2004 - creating new VM kafkamon2001 - vcpus=2,memory=8g,disk=60G, row_A codfw (T187901)

Dzahn added a comment. Feb 23 2018, 5:29 AM

Fri Feb 23 05:18:56 2018 - INFO: - device disk/0: 100.00% done, 0s remaining (estimated)
Fri Feb 23 05:18:57 2018 - INFO: Instance kafkamon1001.eqiad.wmnet's disks are in sync
Fri Feb 23 05:18:57 2018 - INFO: Waiting for instance kafkamon1001.eqiad.wmnet to sync disks
Fri Feb 23 05:18:57 2018 - INFO: Instance kafkamon1001.eqiad.wmnet's disks are in sync
..
Fri Feb 23 05:23:28 2018 - INFO: - device disk/0: 100.00% done, 0s remaining (estimated)
Fri Feb 23 05:23:28 2018 - INFO: Instance kafkamon2001.codfw.wmnet's disks are in sync
Fri Feb 23 05:23:28 2018 - INFO: Waiting for instance kafkamon2001.codfw.wmnet to sync disks
Fri Feb 23 05:23:28 2018 - INFO: Instance kafkamon2001.codfw.wmnet's disks are in sync

Change 413670 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP: add MACs for kafkamon1001/2001

https://gerrit.wikimedia.org/r/413670

Change 413670 merged by Dzahn:
[operations/puppet@production] DHCP: add MACs for kafkamon1001/2001

https://gerrit.wikimedia.org/r/413670

Mentioned in SAL (#wikimedia-operations) [2018-02-23T05:40:14Z] <mutante> ganeti1004 - initial startup of kafkamon1001 - booting to PXE, installing stretch (T187901)

Mentioned in SAL (#wikimedia-operations) [2018-02-23T05:58:03Z] <mutante> puppetmaster1001 - signing puppet certs for kafkamon1001/kafkamon2001 - initial puppet runs, adding as role spare (T187901)

Dzahn closed this task as Resolved. Feb 23 2018, 6:23 AM
Dzahn claimed this task.
  • created VMs
  • installed with stretch
  • signed puppet certs on master, added to site with role(test)
  • confirmed added in Icinga

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=kafkamon

The VMs exist and the request is done. From here, the details continue on T187805.

Nice work. Thanks!

Indeed, thank you!