
Rack and set up ms-fe100[5-8]
Closed, Resolved · Public

Description

ms-fe1005-8

  • receive in normally on parent task T149867
  • rack location A5/C8
  • create DNS entries for the internal production IP address, and mgmt entries for both asset tag and hostname
  • set up BIOS and DRAC
  • update switch port info
  • install_server module update
  • install OS
  • sign/accept puppet and salt keys
  • hand off to @Filippo for service implementation

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jan 11 2017, 4:01 PM

Change 331625 had a related patch set uploaded (by Cmjohnson):
Adding mgmt and production dns entries for ms-fe100[5-7] T155095

https://gerrit.wikimedia.org/r/331625

Change 331625 merged by Cmjohnson:
Adding mgmt and production dns entries for ms-fe100[5-7] T155095

https://gerrit.wikimedia.org/r/331625

Cmjohnson updated the task description. · Jan 18 2017, 6:55 PM

@fgiunchedi
ms-fe1005 and ms-fe1006 are both on asw2-a5 and the switch is not seeing them. Both are set up exactly the same as ms-fe1007 and ms-fe1008, using the same cables, which were all new, and they are confirmed to be in the right ports. The issue is with the switch.

ms-fe1007 installed correctly
ms-fe1008 installs but is having an issue finding the OS

Loading Linux 4.4.0-3-amd64 ...
Loading initial ramdisk ...
Loading, please wait...
mdadm: No devices listed in conf file were found.
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/024d544f-fdfc-4574-9b26-7a20ac0eafb4 does not exist. !
modprobe: module ehci-orion not found in modules.dep

BusyBox v1.22.1 (Debian 1:1.22.0-9+deb8u1) built-in shell (ash)
Enter 'help' for a list of built-in commands.

/bin/sh: can't access tty; job control turned off
(initramfs)
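
The initramfs message above suggests checking `root=` and `rootdelay=` in `/proc/cmdline`. As a rough illustration only, the sketch below splits a kernel command line into key/value pairs so those two parameters can be inspected. The example command line is hypothetical (only the UUID is taken from the ALERT line above); on a real host the string would come from `/proc/cmdline`.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into a {key: value} mapping.

    Bare flags such as "ro" map to None. Only the first "=" separates
    key from value, so "root=UUID=..." keeps "UUID=..." as the value.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# Hypothetical command line for illustration; not copied from ms-fe1008.
example = (
    "BOOT_IMAGE=/vmlinuz-4.4.0-3-amd64 "
    "root=UUID=024d544f-fdfc-4574-9b26-7a20ac0eafb4 ro quiet"
)

params = parse_cmdline(example)
print("root      =", params.get("root"))
print("rootdelay =", params.get("rootdelay"))  # None means it was not set
```

If `root=` points at the expected UUID and the device still does not appear, the problem is more likely that the underlying device (here, an md array) was never assembled than a wrong boot argument.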

Thanks Chris! It looks like the ms-fe1008 installer issue is an instance of T149845 (Something is wrong with installer root disk stuff), for which we don't have a root cause yet. I was able to fix it by manually assembling the arrays; subsequent reboots should be fine.

(initramfs) cat /etc/mdadm/mdadm.conf 
HOMEHOST <system>
ARRAY /dev/md/0  metadata=1.2 UUID=925aef19:f00631ee:69b9e385:bcb94dee name=ms-fe1008:0
ARRAY /dev/md/1  metadata=1.2 UUID=2aea563c:bcf5a59d:bd595c3c:bf02ddac name=ms-fe1008:1
(initramfs) mdadm --assemble /dev/md/0
mdadm: /dev/md/0 has been started with 2 drives.
(initramfs) mdadm --assemble /dev/md/1
mdadm: /dev/md/1 has been started with 2 drives.
(initramfs) exit
/dev/md0: clean, 41944/60981248 files, 4176884/243913216 blocks
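
The manual recovery above runs `mdadm --assemble` once per `ARRAY` line in `mdadm.conf`. A minimal sketch of that mapping, assuming the config text has already been read into a string: it only builds the command lines as a dry run and does not execute anything. The function names `array_devices` and `assemble_commands` are made up for this example.

```python
# Config text copied from the initramfs session above.
MDADM_CONF = """\
HOMEHOST <system>
ARRAY /dev/md/0  metadata=1.2 UUID=925aef19:f00631ee:69b9e385:bcb94dee name=ms-fe1008:0
ARRAY /dev/md/1  metadata=1.2 UUID=2aea563c:bcf5a59d:bd595c3c:bf02ddac name=ms-fe1008:1
"""


def array_devices(conf_text):
    """Return the device path of every ARRAY line in mdadm.conf text."""
    devices = []
    for line in conf_text.splitlines():
        fields = line.split()
        # An ARRAY line has the keyword first and the device path second.
        if len(fields) >= 2 and fields[0].upper() == "ARRAY":
            devices.append(fields[1])
    return devices


def assemble_commands(conf_text):
    """Build one 'mdadm --assemble <device>' argument list per array."""
    return [["mdadm", "--assemble", dev] for dev in array_devices(conf_text)]


# Dry run: print the commands instead of running them.
for cmd in assemble_commands(MDADM_CONF):
    print(" ".join(cmd))
```

The printed commands match the two typed by hand in the session above. Running them for real would need root privileges and the `mdadm` binary, which is why this sketch stops at printing.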

Re: the asw2-a5 switch, I do indeed see both xe-0/0/16 and xe-0/0/15 with their physical link down at the moment.

I was able to confirm the servers and NICs were good, and ms-fe1005 and ms-fe1006 are now up and accessible.

Cmjohnson reassigned this task from Cmjohnson to fgiunchedi. · Feb 6 2017, 6:43 PM

@fgiunchedi these are all yours. Let me know if you have any issues.

Change 340721 had a related patch set uploaded (by Filippo Giunchedi):
[operations/puppet] Provision ms-fe100[5-8]

https://gerrit.wikimedia.org/r/340721

Change 340721 merged by Filippo Giunchedi:
[operations/puppet] Provision ms-fe100[5-8]

https://gerrit.wikimedia.org/r/340721

fgiunchedi renamed this task from Rack and set up ms-fe100[5-7] to Rack and set up ms-fe100[5-8]. · Mar 2 2017, 3:03 PM
fgiunchedi moved this task from Backlog to Doing on the User-fgiunchedi board.

Change 343837 had a related patch set uploaded (by Filippo Giunchedi):
[operations/puppet] hieradata: use ms-fe100[5-8] as swift memcache

https://gerrit.wikimedia.org/r/343837

Change 343837 merged by Filippo Giunchedi:
[operations/puppet] hieradata: use ms-fe100[5-8] as swift memcache

https://gerrit.wikimedia.org/r/343837

Change 343840 had a related patch set uploaded (by Filippo Giunchedi):
[operations/puppet] swift: decom ms-fe100[1-4]

https://gerrit.wikimedia.org/r/343840

Change 343840 merged by Filippo Giunchedi:
[operations/puppet] swift: decom ms-fe100[1-4]

https://gerrit.wikimedia.org/r/343840

fgiunchedi closed this task as Resolved. · Mar 21 2017, 10:38 AM

Done