
esams,ulsfo,eqsin: one VM request each for install_servers
Open, Medium, Public

Description

In T244390, a VM was requested to replace the old install servers with a "light" variant (DHCP/TFTP, but no APT repo). The same is now requested for the 3 edge sites / POPs, which don't have their own install servers so far.

So 3 VMs in total: one in eqsin, one in ulsfo, one in esams. Note that we have not had Ganeti VMs with public IPs at these sites before.

T242602 was the ticket for the planning and T252526 is for the implementation.

install3001.wikimedia.org

Labs Project Tested: n/a
Site/Location: ESAMS
Number of systems: 1
Service: install_server
Networking Requirements: public
Processor Requirements: 1
Memory: 1G
Disks: 20G
Other Requirements: net-ops, ACL / dhcp-helper config changes

The VM will be used as an install_server (TFTP, DHCP, ...)

install4001.wikimedia.org

Labs Project Tested: n/a
Site/Location: ULSFO
Number of systems: 1
Service: install_server
Networking Requirements: public
Processor Requirements: 1
Memory: 1G
Disks: 20G
Other Requirements: net-ops, ACL / dhcp-helper config changes

The VM will be used as an install_server (TFTP, DHCP, ...)

install5001.wikimedia.org

Labs Project Tested: n/a
Site/Location: EQSIN
Number of systems: 1
Service: install_server
Networking Requirements: public
Processor Requirements: 1
Memory: 1G
Disks: 20G
Other Requirements: net-ops, ACL / dhcp-helper config changes

The VM will be used as an install_server (TFTP, DHCP, ...)

Event Timeline

Dzahn created this task. Jun 1 2020, 1:15 PM
Restricted Application added a subscriber: Aklapper. Jun 1 2020, 1:15 PM
Dzahn updated the task description. Jun 1 2020, 1:17 PM
Dzahn added a subscriber: Muehlenhoff.

Change 599883 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] add IPs for installservers in POPs

https://gerrit.wikimedia.org/r/599883

Restricted Application added a project: Operations. Jun 1 2020, 7:40 PM
Dzahn triaged this task as Medium priority. Jun 4 2020, 9:19 AM
akosiaris added subscribers: ayounsi, akosiaris.

I just had a quick look at the 3 PoP Ganeti clusters, and it seems they aren't ready to serve VMs with public IPs. /etc/network/interfaces lacks the "public" interface that the main DC clusters have.

A quick look at asw2-ulsfo and asw2-esams also shows that the switch ports for those servers aren't set up to carry the public VLAN in those PoPs either.

We should first fix these if we want to have those VMs in public IP space.

Sure I can do it, but do they need internet access? DHCP/TFTP shouldn't need internet access afaik? Are there other services running on them?

BBlack added a subscriber: BBlack. Jun 9 2020, 3:54 PM

@ayounsi - Yes, we're going to have some outbound recursive DNS needs from some ganeti-hosted services

ayounsi claimed this task. Jun 9 2020, 3:59 PM
Dzahn added a comment. Jun 9 2020, 4:01 PM

Sure I can do it, but do they need internet access? DHCP/TFTP shouldn't need internet access afaik? Are there other services running on them?

The reason the new "light" install servers in eqiad/codfw got public IPs was that they also run the squid proxies. It isn't clear to me yet whether we will also add those in the POPs.

Mentioned in SAL (#wikimedia-operations) [2020-06-10T06:53:00Z] <XioNoX> trunk public vlan to ulsfo ganeti hosts - T254157

Mentioned in SAL (#wikimedia-operations) [2020-06-10T07:16:20Z] <XioNoX> trunk public vlan to eqsin ganeti hosts - T254157

Mentioned in SAL (#wikimedia-operations) [2020-06-10T07:26:50Z] <XioNoX> trunk public vlan to esams ganeti hosts - T254157

ayounsi reassigned this task from ayounsi to akosiaris. Jun 10 2020, 7:28 AM
ayounsi removed a project: netops.

All yours!

@Dzahn: @akosiaris configured public interfaces on the Ganeti hosts. Once the Ganeti clusters are rebooted (which I'm currently handling), you can create VMs with a public IP. I'm already done with the ulsfo Ganeti cluster, so feel free to give install4001.wikimedia.org a shot.

ayounsi removed a subscriber: ayounsi. Mon, Jun 15, 7:18 AM

The Ganeti clusters in esams and eqsin have also been rebooted, so they should be ready for instances with public IPs now as well.

Dzahn claimed this task. Mon, Jun 15, 2:48 PM

Change 599883 merged by Dzahn:
[operations/dns@master] add IPs for installservers in POPs

https://gerrit.wikimedia.org/r/599883

Dzahn added a comment. Fri, Jun 19, 3:28 PM

added to DNS:

install3001.wikimedia.org has address 91.198.174.63
install3001.wikimedia.org has IPv6 address 2620:0:862:1:91:198:174:63

install4001.wikimedia.org has address 198.35.26.12
install4001.wikimedia.org has IPv6 address 2620:0:863:1:198:35:26:12

install5001.wikimedia.org has address 103.102.166.13
install5001.wikimedia.org has IPv6 address 2001:df2:e500:1:103:102:166:13
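
The six records above look like they follow a pattern: the decimal octets of the IPv4 address reappear as the last four groups of the IPv6 address. A minimal Python sketch of that observation follows. The helper name `embed_ipv4` and the idea of passing the first four IPv6 groups as a string are assumptions made for illustration; this is not taken from the DNS repository or its tooling.

```python
import ipaddress


def embed_ipv4(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Append the decimal IPv4 octets as the last four IPv6 groups.

    `prefix` is assumed to be the first four groups, e.g. "2620:0:862:1".
    Each octet (0-255) is written as decimal digits, which always form a
    valid hexadecimal group.
    """
    v4 = ipaddress.IPv4Address(ipv4)  # raises ValueError on malformed input
    groups = str(v4).split(".")
    return ipaddress.IPv6Address(prefix + ":" + ":".join(groups))


# Prefixes and addresses copied from the DNS output in this comment.
RECORDS = {
    "install3001.wikimedia.org": ("2620:0:862:1", "91.198.174.63"),
    "install4001.wikimedia.org": ("2620:0:863:1", "198.35.26.12"),
    "install5001.wikimedia.org": ("2001:df2:e500:1", "103.102.166.13"),
}

if __name__ == "__main__":
    for name, (prefix, v4) in RECORDS.items():
        print(name, v4, embed_ipv4(prefix, v4))
```

This only checks that the listed records are self-consistent; the authoritative values are whatever the merged DNS change contains.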

Dzahn updated the task description. Fri, Jun 19, 3:35 PM
Dzahn added a comment. Fri, Jun 19, 3:43 PM

feel free to give install4001.wikimedia.org a shot.

Thanks! Just did. On the first try I followed the docs and checked Netbox for the row name. In ULSFO it is row "1" (not a letter like in eqiad/codfw), so I tried --network public ulsfo_1, but the cookbook told me to just use --network public ulsfo (and likewise eqsin and esams without a row suffix).

so:

dzahn@cumin1001:~$ sudo cookbook sre.ganeti.makevm --vcpus 1 --memory 1 --disk 20 --network public ulsfo install4001.wikimedia.org

Ready to create Ganeti VM install4001.wikimedia.org in the ganeti01.svc.ulsfo.wmnet cluster on row 1 with 1 vCPUs, 1GB of RAM, 20GB of disk in the public network.
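
Since all three requests in the task description share the same specs, the invocations can be generated from one small table. The Python sketch below rebuilds the command line shown above. Only the ulsfo command is taken from this ticket; the esams and eqsin lines are extrapolated from the remark that those sites also take no row suffix, and the `VmRequest` class and `makevm_command` helper are hypothetical names. The cookbook's arguments may differ in other versions.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class VmRequest:
    """One VM request from the task description (hypothetical structure)."""
    fqdn: str
    site: str            # passed without a row suffix at the edge sites
    vcpus: int = 1
    memory_gb: int = 1
    disk_gb: int = 20
    network: str = "public"


def makevm_command(req: VmRequest) -> str:
    """Build the command string in the same argument order used above."""
    args = [
        "sudo", "cookbook", "sre.ganeti.makevm",
        "--vcpus", str(req.vcpus),
        "--memory", str(req.memory_gb),
        "--disk", str(req.disk_gb),
        "--network", req.network,
        req.site,
        req.fqdn,
    ]
    return shlex.join(args)


REQUESTS = [
    VmRequest("install3001.wikimedia.org", "esams"),  # extrapolated
    VmRequest("install4001.wikimedia.org", "ulsfo"),  # from this ticket
    VmRequest("install5001.wikimedia.org", "eqsin"),  # extrapolated
]

if __name__ == "__main__":
    for request in REQUESTS:
        print(makevm_command(request))
```

The script only prints the commands, so they can be reviewed before anyone runs them on a cumin host.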

Change 606718 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site/DHCP: add install4001.wikimedia.org

https://gerrit.wikimedia.org/r/606718

Change 606718 merged by Dzahn:
[operations/puppet@production] site/DHCP: add install4001.wikimedia.org

https://gerrit.wikimedia.org/r/606718

Change 606720 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP: configure install2003 as next-server for install4001

https://gerrit.wikimedia.org/r/606720

Change 606720 merged by Dzahn:
[operations/puppet@production] DHCP: configure install2003 as next-server for install4001

https://gerrit.wikimedia.org/r/606720

Dzahn added a comment. Edited Fri, Jun 19, 5:15 PM

Creating the VM worked fine. Installing the OS on install4001 has not worked yet though.

DHCP worked right away, but serving the installer did not. I then changed the "next-server" for install4001 to install2003 (just like bast4001 has it set in the DHCP config). After that I could see it serving lpxelinux.0, but that's where it stops, and the console stays empty.

Jun 19 16:07:16 install2003 dhcpd[22103]: DHCPREQUEST for 198.35.26.12 (208.80.153.51) from aa:00:00:6d:c7:59 via 198.35.26.3
Jun 19 16:07:16 install2003 dhcpd[22103]: DHCPACK on 198.35.26.12 to aa:00:00:6d:c7:59 via 198.35.26.3
Jun 19 16:07:16 install2003 atftpd[19167]: Serving lpxelinux.0 to 198.35.26.12:58560
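
To make it easier to see how far a client gets during PXE boot, the three log lines above can be reduced to a list of completed stages. The Python sketch below does that with regular expressions written against exactly these lines. The stage names and the `boot_stages` helper are made up for illustration, and the patterns are assumptions that may not match other dhcpd or atftpd log formats.

```python
import re

# Log excerpt copied from install2003 as shown above.
LOG = """\
Jun 19 16:07:16 install2003 dhcpd[22103]: DHCPREQUEST for 198.35.26.12 (208.80.153.51) from aa:00:00:6d:c7:59 via 198.35.26.3
Jun 19 16:07:16 install2003 dhcpd[22103]: DHCPACK on 198.35.26.12 to aa:00:00:6d:c7:59 via 198.35.26.3
Jun 19 16:07:16 install2003 atftpd[19167]: Serving lpxelinux.0 to 198.35.26.12:58560
"""

# Hypothetical stage names mapped to patterns for the lines above.
STAGES = [
    ("dhcp_request", re.compile(r"dhcpd\[\d+\]: DHCPREQUEST for (?P<ip>[\d.]+)")),
    ("dhcp_ack", re.compile(r"dhcpd\[\d+\]: DHCPACK on (?P<ip>[\d.]+)")),
    ("tftp_served", re.compile(r"atftpd\[\d+\]: Serving \S+ to (?P<ip>[\d.]+):\d+")),
]


def boot_stages(log: str, client_ip: str) -> list[str]:
    """Return the stages seen for client_ip, in log order."""
    seen = []
    for line in log.splitlines():
        for name, pattern in STAGES:
            match = pattern.search(line)
            if match and match.group("ip") == client_ip:
                seen.append(name)
    return seen


if __name__ == "__main__":
    print(boot_stages(LOG, "198.35.26.12"))
```

For this excerpt the last stage found is the TFTP transfer of lpxelinux.0, which matches the description that the boot stalls right after it.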

On install2003, the matching ferm rules are in place:

ACCEPT     udp  --  198.35.26.0/28       anywhere             udp dpt:bootps
ACCEPT     tcp  --  198.35.26.0/28       anywhere             tcp dpt:http
ACCEPT     tcp  --  198.35.26.0/28       anywhere             tcp dpt:http-alt
ACCEPT     udp  --  198.35.26.0/28       anywhere             udp dpt:tftp
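
As a quick sanity check that these rules apply to the new VM, the source range can be compared against the addresses from the log. The Python sketch below uses the standard `ipaddress` module. It only checks subnet membership for the source column shown above; it does not evaluate ports, protocols, or rule ordering, and the constant names are illustrative.

```python
import ipaddress

# Source range from the ACCEPT rules listed above.
ALLOWED_SOURCE = ipaddress.ip_network("198.35.26.0/28")

# Addresses taken from the DHCP log for install4001.
CLIENT = ipaddress.ip_address("198.35.26.12")  # address handed to the VM
RELAY = ipaddress.ip_address("198.35.26.3")    # the "via" address in the log


def covered(addr, network) -> bool:
    """True if addr falls inside the rule's source network."""
    return addr in network


if __name__ == "__main__":
    for label, addr in (("client", CLIENT), ("relay", RELAY)):
        print(label, addr, covered(addr, ALLOWED_SOURCE))
```

A /28 spans 198.35.26.0 through 198.35.26.15, so both addresses fall inside the permitted source range. That is consistent with the rules not being the obvious cause of the stalled boot, though it does not rule out other filtering along the path.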

Change 601342 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: add new POP install servers with insetup role

https://gerrit.wikimedia.org/r/601342