This is the second Redis cluster. It should match the similar cluster in eqiad; see https://racktables.wikimedia.org/index.php?page=object&tab=default&object_id=1026
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
let rbf200x hosts be Ubuntu for now | operations/puppet | production | +0 -4
rbf2001: use eth2 MAC for DHCP | operations/puppet | production | +1 -1
Status | Assigned | Task
---|---|---
Resolved | Joe | T86894 Set up the mediawiki application layer in codfw
Resolved | Joe | T86887 Setup redis clusters in codfw
Resolved | Dzahn | T86898 Check that the redis roles can be applied in codfw, set up puppet.
Resolved | Dzahn | T86897 Procure and setup rbf2001-2002
Resolved | Papaul | T86940 label & setup drac/basic setings for rbf2001 & rbf2002
Resolved | Papaul | T88380 reclaim rbf2002/WMF5833 back to spare, allocate WMF5845 as rbf2002
Event Timeline
So these are dual cpu (4 core) systems with 32GB of memory. My spare systems in codfw are slightly better, but will work.
Allocating:
Dell PowerEdge R420, Dual Intel Xeon E5-2440, 32 GB Memory, (2) 500GB Disks
rbf2001
wmf5849
a5-codfw
rbf2002
wmf5833
b5-codfw
rbf2001 is installed and ready for service implementation
rbf2002 is having install issues detecting disks, and I need to further troubleshoot the installation.
Daniel has set up rbf2001 for service implementation via the linked task: https://phabricator.wikimedia.org/T86898
dzahn@iron:~$ ssh root@rbf2001.mgmt
root@rbf2001.mgmt's password:
dzahn@iron:~$ ssh root@rbf2002.mgmt
ssh: Could not resolve hostname rbf2002.mgmt: Name or service not known
^ hmm? odd? at a quick glance i see it in DNS zones though.
i tried to install Debian on rbf2001 and the installer claims:
┌────────────┤ [!!] Download debconf preconfiguration file ├────────────┐
│                                                                       │
│ Malformed IP address                                                  │
│ The IP address you provided is malformed. It should be in the form    │
│ x.x.x.x where each 'x' is no larger than 255 (an IPv4 address), or a  │
│ sequence of blocks of hexadecimal digits separated by colons (an IPv6 │
│ address).
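The rule quoted in the installer dialog (dotted IPv4 with each part at most 255, or colon-separated hexadecimal blocks for IPv6) can be reproduced with Python's standard `ipaddress` module. This is only a sketch of that kind of check, not the installer's own code, and the sample values are illustrative. The thread does not show which value the installer actually rejected; the point is that anything other than an address literal, such as a hostname or an empty string, fails a check like this.

```python
import ipaddress


def is_well_formed_ip(text: str) -> bool:
    """Return True if text parses as an IPv4 or IPv6 address literal."""
    try:
        ipaddress.ip_address(text.strip())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Illustrative inputs only; none of these is known to be the rejected value.
    for candidate in ["10.192.0.33", "::1", "10.192.0.256", "rbf2001.codfw.wmnet", ""]:
        print(repr(candidate), is_well_formed_ip(candidate))
```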
the change was just adding the options for jessie:
The DNS issues for rbf2001 and rbf2002 mgmt have been fixed.
However, rbf2002.mgmt is on 10.193.2.118, and it's not responding to ping or ssh (via either FQDN or direct IP). I've reopened the blocking ticket (T88380) for repair of the rbf2002 mgmt interface settings and connection.
Well, we know that the install worked in Ubuntu before (since I had installed ubuntu on rbf2001). I'm not sure what issue would arise for its production DNS, as it all appears correct.
That being said, I did clear out all negatively cached entries on the recursors, perhaps try again?
I don't see the change on iron yet. I'll check again later.
dzahn@iron:~$ host rbf2001.mgmt
rbf2001.mgmt.codfw.wmnet has address 10.193.2.116
rbf2001.mgmt.codfw.wmnet has address 10.193.2.118
dzahn@iron:~$ host rbf2002.mgmt
Host rbf2002.mgmt not found: 3(NXDOMAIN)
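The `host` output above shows two separate symptoms: rbf2001.mgmt answers with two addresses (one of them 10.193.2.118, the address given earlier for rbf2002.mgmt), and rbf2002.mgmt does not resolve at all. A minimal sketch of spotting both from pasted output follows. The function names are made up for illustration, the parsing assumes the exact line format shown in the paste, and treating more than one A record as a problem is an assumption that fits a single management interface but not DNS in general.

```python
import re
from collections import defaultdict

# Output lines copied from the paste above (prompts removed).
HOST_OUTPUT = """\
rbf2001.mgmt.codfw.wmnet has address 10.193.2.116
rbf2001.mgmt.codfw.wmnet has address 10.193.2.118
Host rbf2002.mgmt not found: 3(NXDOMAIN)
"""


def summarize(output):
    """Collect addresses per name and the names that did not resolve."""
    addresses = defaultdict(list)
    missing = []
    for line in output.splitlines():
        found = re.match(r"^(\S+) has address (\S+)$", line)
        if found:
            addresses[found.group(1)].append(found.group(2))
            continue
        not_found = re.match(r"^Host (\S+) not found: \d+\(\w+\)$", line)
        if not_found:
            missing.append(not_found.group(1))
    return dict(addresses), missing


def find_problems(output):
    """Report names with more than one address and names that do not resolve."""
    addresses, missing = summarize(output)
    problems = []
    for name, addrs in addresses.items():
        if len(addrs) > 1:
            problems.append(f"{name} has {len(addrs)} A records: {', '.join(addrs)}")
    for name in missing:
        problems.append(f"{name} does not resolve")
    return problems


if __name__ == "__main__":
    for problem in find_problems(HOST_OUTPUT):
        print(problem)
```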
Thanks, i tried again and it did change, but to this:
┌──────────┤ [!!] Download debconf preconfiguration file ├──────────┐
│                                                                   │
│ Failed to run preseeded command                                   │
│ Execution of preseeded command "wget -O /tmp/early_command        │
│ http://apt.wikimedia.org/autoinstall/scripts/early_command && sh  │
│ /tmp/early_command" failed with exit code 10.                     │
│                                                                   │
│     <Go Back>                                      <Continue>     │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘
let's figure this out after the weekend
The issue is still unchanged. I attempted another reinstall of rbf2001 and:
│ Malformed IP address                                                  │
│ The IP address you provided is malformed. It should be in the form    │
│ x.x.x.x where each 'x' is no larger than 255 (an IPv4 address), or a  │
│ sequence of blocks of hexadecimal digits separated by colons (an IPv6 │
│ address). Please try again.                                           │
after the installer detects link on eth2 and eth3
BusyBox v1.22.1 (Debian 1:1.22.0-15) built-in shell (ash)
Enter 'help' for a list of built-in commands.

~ # ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq qlen 1000
    link/ether 90:b1:1c:2d:85:70 brd ff:ff:ff:ff:ff:ff
    inet 10.192.0.33/22 scope global eth3
       valid_lft forever preferred_lft forever
3: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq qlen 1000
    link/ether 90:b1:1c:2d:85:71 brd ff:ff:ff:ff:ff:ff
host rbf2001 {
    hardware ethernet 90:B1:1C:2D:85:70;
    fixed-address rbf2001.codfw.wmnet;
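Comparing the two pastes above: the DHCP entry lists 90:B1:1C:2D:85:70, which the installer shell reports as eth3, while eth2 carries 90:b1:1c:2d:85:71. A small sketch for mapping a DHCP `hardware ethernet` value back to an interface name from `ip a s` output is below. The helper names are hypothetical, the sample text is trimmed from the paste, and the comparison is case-insensitive because the two sources use different letter case.

```python
import re

# Trimmed from the installer shell paste above.
IP_OUTPUT = """\
2: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq qlen 1000
    link/ether 90:b1:1c:2d:85:70 brd ff:ff:ff:ff:ff:ff
3: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq qlen 1000
    link/ether 90:b1:1c:2d:85:71 brd ff:ff:ff:ff:ff:ff
"""


def interface_macs(output):
    """Map interface name to lower-cased MAC address from `ip a s` text."""
    macs = {}
    current = None
    for line in output.splitlines():
        header = re.match(r"^\d+:\s+([^:]+):", line)
        if header:
            current = header.group(1)
            continue
        link = re.match(r"^\s+link/ether\s+([0-9a-fA-F:]{17})", line)
        if link and current:
            macs[current] = link.group(1).lower()
    return macs


def interface_for_mac(output, mac):
    """Return the interface whose MAC equals mac (case-insensitive), or None."""
    wanted = mac.lower()
    for name, candidate in interface_macs(output).items():
        if candidate == wanted:
            return name
    return None


if __name__ == "__main__":
    # MAC taken from the DHCP host entry quoted above.
    print(interface_for_mac(IP_OUTPUT, "90:B1:1C:2D:85:70"))
```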
Change 196138 had a related patch set uploaded (by Dzahn):
rbf2001: use eth2 MAC for DHCP
i tried to use eth2 and got: "Network autoconfiguration failed. Your network is probably not using the DHCP protocol."
Change 196624 had a related patch set uploaded (by Dzahn):
let rbf200x hosts be Ubuntu for now
reinstalled rbf2001 with trusty, re-enabled in icinga:
https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?host=rbf2001&style=detail&nostatusheader
rbf2001 and rbf2002 are both up and running, but with trusty for now (while we can still investigate the problem with jessie on the related rdf2xxx hosts and their ticket).