
rack/setup/install new kafka nodes kafka-jumbo100[1-6]
Closed, Resolved · Public

Description

This task will track the setup and installation of 6 new kafka nodes in eqiad, ordered on procurement task T161636.

Hostname Review: The existing kafka systems were NOT renamed to a normal standard. They were analytics10XX hosts that were simply renamed to kafka10XX, which leaves a large range of unused kafka hostnames. kafka100[1-3] are in use, as are kafka101[2348] and kafka102[02]. I'd suggest we either fill out the remainder of the hostnames or simply jump straight to kafka1023. I've assigned this to @Ottomata for his review. If the older kafka machines are going away, this is easier to answer, and @RobH suggests kafka100[4-9].

Racking Proposal: The existing Kafka nodes are not racked consistently: some are in the analytics VLAN and others in the private VLAN. These new nodes will go into the normal private VLAN, spread across all 4 rows, with no two of the 6 new nodes sharing a rack.

Since the hostnames still have to be decided, please do the on-site steps with only the asset tag in use for now.

kafka-jumbo1001:

  • - receive in system on procurement task T161636
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet/salt accept/initial run
  • - handoff for service implementation

kafka-jumbo1002:

  • - receive in system on procurement task T161636
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet/salt accept/initial run
  • - handoff for service implementation

kafka-jumbo1003:

  • - receive in system on procurement task T161636
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet/salt accept/initial run
  • - handoff for service implementation

kafka-jumbo1004:

  • - receive in system on procurement task T161636
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet/salt accept/initial run
  • - handoff for service implementation

kafka-jumbo1005:

  • - receive in system on procurement task T161636
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet/salt accept/initial run
  • - handoff for service implementation

kafka-jumbo1006:

  • - receive in system on procurement task T161636
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet/salt accept/initial run
  • - handoff for service implementation

Event Timeline


Change 368186 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding dns entries for kafka-jumbo100[1-6] T167992

https://gerrit.wikimedia.org/r/368186

I'd ask, if possible, to pause naming and configuration of these hosts, since @Ottomata and I are thinking about the optimal solution for this cluster. It might be possible to migrate one Kafka broker at a time (deprecate an old one and replace it with one from the new batch), in which case it would be really handy to keep the current nomenclature (kafka1012 currently means the Kafka broker with id 12). This is not a strict requirement, and we already got Rob's feedback that kafka-jumbo100[1-6] would be better, but we'd need some time to think about it (a couple of days).

Thanks and sorry!

So we had a discussion about this earlier in IRC, and after that agreed to document some of it here. @elukey has done so above.

My stance is that the oldest hardware servers in a cluster should (when possible) have the lowest sequence numbers. If that is not possible, it is best to keep ALL systems from the same age group within a single sequence range. Example: if kafka1009, kafka1011, and kafka1019 were to become kafka-jumbo, I'd recommend they be kafka-jumbo100[123], and these newer systems be kafka-jumbo100[4-9]. If that isn't possible (because we don't know how many old kafka hosts will be renamed to kafka-jumbo), then I'd call these new hosts kafka-jumbo100[1-6], and the older hosts should all move into hostname numbers in direct sequence, kafka-jumbo100[789], not kafka-jumbo1009, kafka-jumbo1011, kafka-jumbo1019.

I think the above is clear, please advise if not!

The hostnames for all our other clusters follow the above standard. Analytics varied from it (without my knowing about it at the time) with the analytics-to-kafka hostname change, and it's caused some confusion. Even if it takes more work for the initial migration, falling in line with the rest of the clusters seems like a better plan to me than continuing to have these analytics clusters vary from all the others. (This has been a regular issue with analytics naming, etc.) When hosts are in a sequence, we know they will age out at roughly the same rate, and they are usually also grouped by similar specifications within that sequence.

@elukey & @Ottomata are already aware of the above ^ I'm just echoing it to the task for record keeping.

Since kafka1012->kafka1022 are going to be decommissioned and kafka-jumbo is, from our point of view, a completely new cluster (that may share old Kafka broker ids), I'd be in favor of sticking with convention and calling the nodes kafka-jumbo100[1-6]. The mapping between broker id and node name will be less obvious for us (e.g. 12 -> kafka-jumbo1001), but it will be better from the procurement point of view.

Going to wait for Andrew's thoughts :)

Change 368186 abandoned by Cmjohnson:
Adding dns entries for kafka-jumbo100[1-6] T167992

Reason:
abandoning this until naming has been figured out

https://gerrit.wikimedia.org/r/368186

If we decide to keep broker ids (seeming less likely, I will test some stuff in labs this week), then I think we should keep the node numbers as they are. Otherwise, we'll def start with 1001. But! Let's not fight that fight until we decide to do that.

Alright! Luca and I have tested some things, and discussed this migration a little more. We're going to stick with the original plan of spinning this up as a new separate cluster, and then migrating clients over one by one.

So! We can proceed. These can be installed as kafka-jumbo100[1-6]. Thanks @Cmjohnson !

Note about disk config: we are going for a 12-disk RAID10 data partition plus a RAID1 root (on separate disks). I can work on it if you guys are busy; otherwise I'll be happy to review the partman recipe :)

@elukey: I'm happy to help with partman, but I want to confirm:

These systems have dual 1TB OS disks, which will be placed in a raid1 for the OS. The 12 disks for DATA will be placed in a raid10. These systems come with hardware raid controllers, so I would NOT use software raid.

I'd suggest that all the systems be set up with hardware raid as follows:

2 * 1 TB SFF as raid1 SDA
12 * 4 TB LFF as raid10 SDB

Then the partitioning recipe just needs to set up a single / on sda and put all your data in an LVM on sdb; sound right?
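In other words (an illustrative sketch only; partman does this during the install, and the VG/LV names below are assumptions, not the final recipe), the data side would be roughly equivalent to:

pvcreate /dev/sdb1                  # PV on the hardware-RAID10 virtual disk
vgcreate vg-data /dev/sdb1          # VG/LV names are assumptions
lvcreate -l 90%VG -n srv vg-data    # leave some extents unallocated for growth
mkfs.ext4 /dev/vg-data/srv
mount /dev/vg-data/srv /srv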

Sounds good to me; for some reason I was under the impression that we preferred sw raid over hw-controlled raid, which is why I mentioned it. Never mind, the partman recipe will be simpler :)

I'm not sure of a reason to prefer software raid other than ease of management. Likely performance is better with HW raid.

@elukey: So we do prefer sw raid over hw raid when purchasing servers. However, servers in this particular chassis (R730xd) have to have a full hw raid controller installed. Since it's already there, we use it; the performance of this particular raid controller is better than software raid.

Change 369725 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt and production dns for kafka-jumbo100[1-6] T167992

https://gerrit.wikimedia.org/r/369725

FYI: These nodes should be installed with Debian Stretch.

@elukey: So we do prefer sw raid over hw raid when purchasing servers. However, servers in this particular chassis (R730xd) have to have a full hw raid controller installed. Since it's already there, we use it; the performance of this particular raid controller is better than software raid.

Thanks for explaining, completely trust your judgement so no issues from my side :)

Change 369725 merged by Cmjohnson:
[operations/dns@master] Adding mgmt and production dns for kafka-jumbo100[1-6] T167992

https://gerrit.wikimedia.org/r/369725

Hello people, any timeline for these hosts? I don't mean to pressure, I'd just like to know the timing so I can organize/schedule all the work over the next weeks :)

@Cmjohnson Heyaaa, we are pretty ready and excited to start working with these. Can you let us know when they'll be worked on?

Thank you!

@Ottomata okay, understood. I will get them going as soon as I can; they are in my being-worked-on queue with a few other things: https://phabricator.wikimedia.org/tag/ops-eqiad/

The issue with 1004 has been resolved; assigning to @RobH to do the installs.

Change 373328 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] kafka-jumbo install params

https://gerrit.wikimedia.org/r/373328

Change 373328 abandoned by RobH:
kafka-jumbo install params

https://gerrit.wikimedia.org/r/373328

Change 373357 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] setting kafka-jumbo100[1-6].eqiad.wmnet dns

https://gerrit.wikimedia.org/r/373357

Change 373357 merged by RobH:
[operations/dns@master] setting kafka-jumbo100[1-6].eqiad.wmnet dns

https://gerrit.wikimedia.org/r/373357

Ok, kafka-jumbo1001 has odd issues.

It is confirmed to have the correct MAC address in DHCP, and DNS is right. The VLAN is correct, and I can see the DHCP request come in on the correct subnet/vlan. I'm not sure why it is getting no free leases.

I've moved on to the rest of the systems, which so far boot fine via DHCP. However, there is no working partman recipe for a hardware raid setup like this. I've created a kafka-jumbo.cfg recipe and I'm tweaking it now.

So far, I have it booting, installing the OS to the sda raid1, and then putting a large LVM across sdb. It's so far failing to mount /srv on sdb. Still working on it.

So we solved the issue with partman and I was able to install the OS on kafka-jumbo100[12], but the other nodes fail to PXE boot (they hang after selecting the boot option, AFAICS). Is there anything else to configure on them to proceed with the OS/puppet/etc. deployment?

All hosts up with OS installed and puppet/salt running.

elukey updated the task description.

Change 376336 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Apply kafka::jumbo::broker on new kafka-jumbo100* hosts

https://gerrit.wikimedia.org/r/376336

Change 376336 merged by Ottomata:
[operations/puppet@production] Apply kafka::jumbo::broker on new kafka-jumbo100* hosts

https://gerrit.wikimedia.org/r/376336

Change 376339 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Un-apply kafka role -- these should be stretch, not jessie! :/

https://gerrit.wikimedia.org/r/376339

Change 376340 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Install kafka-jumbo as Stretch

https://gerrit.wikimedia.org/r/376340

Change 376339 merged by Ottomata:
[operations/puppet@production] Un-apply kafka role -- these should be stretch, not jessie! :/

https://gerrit.wikimedia.org/r/376339

Change 376340 merged by Ottomata:
[operations/puppet@production] Install kafka-jumbo as Stretch

https://gerrit.wikimedia.org/r/376340

Script wmf_auto_reimage was launched by otto on neodymium.eqiad.wmnet for hosts:

['kafka-jumbo1001.eqiad.wmnet', 'kafka-jumbo1002.eqiad.wmnet', 'kafka-jumbo1003.eqiad.wmnet', 'kafka-jumbo1004.eqiad.wmnet', 'kafka-jumbo1005.eqiad.wmnet', 'kafka-jumbo1006.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201709061953_otto_25477.log.

Completed auto-reimage of hosts:

['kafka-jumbo1001.eqiad.wmnet', 'kafka-jumbo1002.eqiad.wmnet', 'kafka-jumbo1003.eqiad.wmnet', 'kafka-jumbo1004.eqiad.wmnet', 'kafka-jumbo1005.eqiad.wmnet', 'kafka-jumbo1006.eqiad.wmnet']

Of which those FAILED:

set(['kafka-jumbo1001.eqiad.wmnet'])

Change 376377 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Add debug notifies to figure out error message in prod

https://gerrit.wikimedia.org/r/376377

Change 376377 merged by Ottomata:
[operations/puppet@production] Add debug notifies to figure out error message in prod

https://gerrit.wikimedia.org/r/376377

Change 376379 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Debugging

https://gerrit.wikimedia.org/r/376379

Change 376379 merged by Ottomata:
[operations/puppet@production] Debugging

https://gerrit.wikimedia.org/r/376379

Change 376407 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Allow new kafka-jumbo hosts to talk to zookeeper on conf*

https://gerrit.wikimedia.org/r/376407

Change 376407 merged by Ottomata:
[operations/puppet@production] Allow new kafka-jumbo hosts to talk to zookeeper on conf*

https://gerrit.wikimedia.org/r/376407

Change 376428 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Add kafka rack (row) awareness configs

https://gerrit.wikimedia.org/r/376428

Change 376428 merged by Ottomata:
[operations/puppet@production] Add kafka rack (row) awareness configs

https://gerrit.wikimedia.org/r/376428

elukey@kafka-jumbo1001:/usr/share/jmxtrans$ source /etc/default/jmxtrans
elukey@kafka-jumbo1001:/usr/share/jmxtrans$ ./jmxtrans.sh start
elukey@kafka-jumbo1001:/usr/share/jmxtrans$ OpenJDK 64-Bit Server VM warning: ignoring option PermSize=384m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=384m; support was removed in 8.0
MaxTenuringThreshold of 16 is invalid; must be between 0 and 15
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Replacing 16 with 15 in jmxtrans.sh made everything work again. Since this file is copied over from the deb package, it should only be a matter of changing our stretch-wikimedia package.
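For reference, the manual workaround on each host amounts to a one-line substitution (hedged sketch; the proper fix is the package change below):

# Sketch of the manual workaround; path taken from the session above.
sudo sed -i 's/MaxTenuringThreshold=16/MaxTenuringThreshold=15/' /usr/share/jmxtrans/jmxtrans.sh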

Change 376663 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/debs/jmxtrans@master] jmxtrans.sh: reduce MaxTenuringThreshold to 15

https://gerrit.wikimedia.org/r/376663

Applied the following to all the nodes to remove the placeholder logical volumes:

root@kafka-jumbo1001:/home/elukey# lvremove /dev/vg-flex/root-placeholder
Do you really want to remove active logical volume vg-flex/root-placeholder? [y/n]: y
  Logical volume "root-placeholder" successfully removed
root@kafka-jumbo1001:/home/elukey# lvremove /dev/vg-data/srv-placeholder
Do you really want to remove active logical volume vg-data/srv-placeholder? [y/n]: y
  Logical volume "srv-placeholder" successfully removed
root@kafka-jumbo1001:/home/elukey# pvs
  PV         VG      Fmt  Attr PSize   PFree
  /dev/sda2  vg-flex lvm2 a--  926.34g 93.13g
  /dev/sdb1  vg-data lvm2 a--   21.83t  2.73t
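(If the freed extents are later needed by the data volume, something like the following would grow it; this is purely illustrative and assumes the data LV is vg-data/srv with an ext4 filesystem:)

# Illustrative only: grow the (assumed) vg-data/srv LV and resize its
# filesystem into the freed extents.
lvextend -r -l +100%FREE vg-data/srv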

I could be wrong but from cr1/cr2 eqiad the hosts seem to be in the Analytics VLAN, and they shouldn't be:

elukey@re0.cr1-eqiad> show route kafka-jumbo1006.eqiad.wmnet

inet.0: 649448 destinations, 3244861 routes (649322 active, 0 holddown, 132 hidden)
Restart Complete
+ = Active Route, - = Last Active, * = Both

10.64.53.0/24      *[Direct/0] 24w1d 21:25:02
                    > via ae4.1023

{master}
elukey@re0.cr1-eqiad> show configuration interfaces ae4.1023
description "Subnet analytics1-d-eqiad";
vlan-id 1023;
family inet {
    filter {
        input analytics-in4;
    }

@RobH, @Cmjohnson: can you guys double check? If I am right what is the procedure to move those hosts out of the Analytics VLAN? (I guess new IPs + reimage?)
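(A quick host-side spot check is also possible, purely as an illustration: an inet address inside 10.64.53.0/24, i.e. analytics1-d-eqiad, would confirm it.)

# Host-side spot check (illustrative): a match here means the host still
# has an address in the Analytics VLAN subnet.
ip -4 addr show | grep 'inet 10.64.53'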

@Ottomata: let's also remember to whitelist the jumbo IPs in the Analytics VLAN firewall rules, otherwise hosts like analytics1003 will not be able to contact them.

Change 376663 merged by Elukey:
[operations/debs/jmxtrans@master] jmxtrans.sh: reduce MaxTenuringThreshold to 15

https://gerrit.wikimedia.org/r/376663

@Ottomata: I merged https://gerrit.wikimedia.org/r/#/c/376663 but then realized that the master/debian branches are a bit weird: master contains the debian directory and it is out of sync with the debian branch. From what I can see it seems that the current debian package is built from the debian branch, but it's difficult to say. If you have time, can you double-check and let me know your opinion? With my current understanding I'd simply cherry-pick https://gerrit.wikimedia.org/r/#/c/376663 to debian and build HEAD from copper.

Assigning back to Chris as discussed on IRC: we need to move the Kafka Jumbo hosts out of the analytics VLAN and then reimage them (I will take care of that step).

Change 377329 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Updating dns entries for kafka-jumbo100[1-6] to reflect change in vlan T167992

https://gerrit.wikimedia.org/r/377329

Change 377329 merged by Cmjohnson:
[operations/dns@master] Updating dns entries for kafka-jumbo100[1-6] to reflect change in vlan T167992

https://gerrit.wikimedia.org/r/377329

@elukey: updated dns entries and switch ports to reflect vlan-private1-row-eqiad

Script wmf_auto_reimage was launched by volans on sarin.codfw.wmnet for hosts:

['kafka-jumbo1002.eqiad.wmnet', 'kafka-jumbo1003.eqiad.wmnet', 'kafka-jumbo1004.eqiad.wmnet', 'kafka-jumbo1005.eqiad.wmnet', 'kafka-jumbo1006.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201709120902_volans_9858.log.

Completed auto-reimage of hosts:

['kafka-jumbo1002.eqiad.wmnet', 'kafka-jumbo1003.eqiad.wmnet', 'kafka-jumbo1004.eqiad.wmnet', 'kafka-jumbo1005.eqiad.wmnet', 'kafka-jumbo1006.eqiad.wmnet']

Of which those FAILED:

set(['kafka-jumbo1003.eqiad.wmnet', 'kafka-jumbo1002.eqiad.wmnet', 'kafka-jumbo1005.eqiad.wmnet', 'kafka-jumbo1004.eqiad.wmnet', 'kafka-jumbo1006.eqiad.wmnet'])

For the record, they were reimaged correctly; the new reimage script hit a small bug in the post-reimage part. I've already re-run it for the "failed" hosts to complete the post-reimage steps.

Change 377417 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] network::constants: update IP addresses of the new Kafka hosts

https://gerrit.wikimedia.org/r/377417

Change 377417 merged by Elukey:
[operations/puppet@production] network::constants: update IP addresses of the new Kafka hosts

https://gerrit.wikimedia.org/r/377417

Proposed new term for the analytics-in4 filter on cr1/cr2 eqiad:

term kafka {
    from {
        destination-address {
            10.64.0.175/32;
            10.64.0.176/32;
            10.64.16.99/32;
            10.64.32.159/32;
            10.64.32.160/32;
            10.64.48.117/32;
        }
        protocol tcp;
        destination-port 9092;
    }
    then accept;
}

EDIT: Just checked and a kafka term is already present (with kafka1012->1022) so it should be as easy as executing these:

set firewall family inet filter analytics-in4 term kafka from destination-address 10.64.0.175/32
set firewall family inet filter analytics-in4 term kafka from destination-address 10.64.0.176/32
set firewall family inet filter analytics-in4 term kafka from destination-address 10.64.16.99/32
set firewall family inet filter analytics-in4 term kafka from destination-address 10.64.32.159/32
set firewall family inet filter analytics-in4 term kafka from destination-address 10.64.32.160/32
set firewall family inet filter analytics-in4 term kafka from destination-address 10.64.48.117/32

LGTM. Minor nitpick: I love comments noting which hostname each IP maps to, e.g.

term puppet {
        from {
            destination-address {
                /* puppetmaster1001 */
                10.64.16.73/32;
                /* puppetmaster2001 */
                10.192.0.27/32;
            }
            protocol tcp;
            destination-port 8140;
        }
        then accept;
}
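Applied to the kafka term above, that would look roughly like the following (the IP-to-hostname mapping shown is only assumed from the list order and needs to be double-checked):

term kafka {
    from {
        destination-address {
            /* kafka-jumbo1001 (assumed mapping) */
            10.64.0.175/32;
            /* kafka-jumbo1002 (assumed mapping) */
            10.64.0.176/32;
            /* kafka-jumbo1003 (assumed mapping) */
            10.64.16.99/32;
            /* kafka-jumbo1004 (assumed mapping) */
            10.64.32.159/32;
            /* kafka-jumbo1005 (assumed mapping) */
            10.64.32.160/32;
            /* kafka-jumbo1006 (assumed mapping) */
            10.64.48.117/32;
        }
        protocol tcp;
        destination-port 9092;
    }
    then accept;
}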

Cleaned up the placeholder LVM partitions; the next steps are:

  1. decide on a TLS port for the Kafka cluster and whitelist it in the analytics vlan and ferm firewalls
  2. try the Prometheus JMX exporter as an alternative to jmxtrans (see the sketch after this list); if feasible, go for it, otherwise just rebuild jmxtrans with the last commit in master.
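A minimal sketch of how the Prometheus JMX exporter attaches to the broker JVM (the jar path, port, and config file name here are assumptions, not the values that will land in operations/puppet):

# Sketch only: the exporter runs as a javaagent inside the Kafka broker
# JVM rather than as a separate process like jmxtrans; port and paths assumed.
export KAFKA_OPTS="-javaagent:/usr/share/java/prometheus/jmx_prometheus_javaagent.jar=7800:/etc/kafka/jmx_exporter.yaml"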

Change 377753 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] [WIP] role::kafka::jumbo::broker: enable Prometheus JMX monitoring

https://gerrit.wikimedia.org/r/377753

Change 378876 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kafka::broker: add the monitoring_enabled option

https://gerrit.wikimedia.org/r/378876

Change 378876 merged by Elukey:
[operations/puppet@production] profile::kafka::broker: add the monitoring_enabled option

https://gerrit.wikimedia.org/r/378876