
Rack and Initial setup db1074-79
Closed, Resolved (Public)

Description

db1074-79
@jcrespo please comment on whether it's okay to add 4 to row A2 and 2 to C2.

  • - rack and update racktables
  • - add mgmt dns entries (both asset tag and hostname)
  • - setup and test bios/ilom/redirection
  • - setup raid10 of disks in hardware raid with stripe length of 256 KB
  • - switch ports setup (description/enable/vlan)
  • - add production dns entries (internal vlan)
  • - update install_server module for system (dhcp and netboot entries) standard db partitioning
  • - install OS - jessie
  • - sign/accept puppet/salt keys
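
As a rough illustration of the RAID step in the checklist above, the following is a minimal sketch of how a 256 KB stripe maps logical offsets onto mirror pairs in a generic RAID 10 layout. The disk count, disk size, and the function names are assumptions made for illustration; the task does not state them, and real controllers may lay out data differently.

```python
STRIPE_SIZE = 256 * 1024  # 256 KB, the stripe length named in the checklist


def _check_disk_count(disk_count):
    # RAID 10 stripes across mirrored pairs, so it needs an even count >= 4.
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")


def raid10_usable_bytes(disk_count, disk_bytes):
    """Usable capacity: half of the raw capacity, since every block is mirrored."""
    _check_disk_count(disk_count)
    return (disk_count // 2) * disk_bytes


def mirror_pair_for_offset(offset, disk_count, stripe_size=STRIPE_SIZE):
    """Index of the mirror pair holding the given logical byte offset."""
    _check_disk_count(disk_count)
    pairs = disk_count // 2
    return (offset // stripe_size) % pairs


# Hypothetical array of 4 disks: consecutive 256 KB chunks alternate between pairs.
for chunk in range(4):
    print(chunk, mirror_pair_for_offset(chunk * STRIPE_SIZE, disk_count=4))
```

A larger stripe means a sequential read stays on one mirror pair for longer before moving to the next one.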

Details

Related Gerrit Patches:
operations/puppet : production : Correct comment with vlan name
operations/dns : master : s/labdsdb1008/labsdb1008/ for the non-mgmt ip

Event Timeline

Cmjohnson created this task. Mar 3 2016, 5:31 PM
Restricted Application added a subscriber: Aklapper. Mar 3 2016, 5:31 PM

@jcrespo I can fit all of these into rack. They're 1u instead of 2u.

Important! One of those will be a replacement for labsdb1002 (I do not know if we need to name it labsdb1008, or just substitute it transparently).

For the rest, at least 2 rows are required. Those will eventually replace a subset of db1001-db1030 and be used for shards s2 and s3.

Okay, having one set for replacing labsdb1002 is fine. labsdb1002 is in row C and I had 2 scheduled to go in the same rack so it will work out fine. We will name it labsdb1008.

I am going to rack
2 in row A
2 in row B
2 in row C (one of these will replace labsdb1002)
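
The constraints discussed above (one host in row C to replace labsdb1002, and the remaining hosts spread over at least 2 rows) can be checked mechanically. This is only a sketch; the function name and the plan format (row mapped to host count) are hypothetical and not part of any existing tooling.

```python
def check_rack_plan(plan, replacement_row, min_rows_for_rest=2):
    """Return a list of problems found in a {row: host_count} racking plan."""
    problems = []
    rest = dict(plan)
    if rest.get(replacement_row, 0) < 1:
        problems.append(f"no host in row {replacement_row} for the replacement")
    else:
        # One host in this row is reserved for the replacement.
        rest[replacement_row] -= 1
    rows_used = [row for row, count in rest.items() if count > 0]
    if len(rows_used) < min_rows_for_rest:
        problems.append(
            f"remaining hosts span {len(rows_used)} row(s), "
            f"need at least {min_rows_for_rest}"
        )
    return problems


# The plan from this task: 2 hosts each in rows A, B and C.
print(check_rack_plan({"A": 2, "B": 2, "C": 2}, replacement_row="C"))  # prints []
```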

Volans updated the task description. Mar 4 2016, 3:13 PM
Volans added a subscriber: Volans.

FYI, I just made 2 minor edits to the description:

  • added the stripe size to the RAID step as a reminder
  • replaced the reference to es20* with db1074-79

Cmjohnson updated the task description. Mar 7 2016, 10:39 PM

Check the db1077.mgmt password; I can't access it.

Cmjohnson updated the task description. Mar 8 2016, 6:04 PM

fixed the mgmt issue for db1077

db1074, db1075 and db1076 installed without an issue and are now ssh accessible. db1077 and db1078 did not install correctly, and when I access them via palladium I am put at this prompt:

~ # pwd
/
~ #

jcrespo mentioned this in Unknown Object (Task). Mar 10 2016, 8:39 AM

@jcrespo check potential issue with firewall for carbon installation (dhcp) on labs-support vlan.

Change 276716 had a related patch set uploaded (by Jcrespo):
[WIP]Fix labs-support vlan dhcp-install config on eqiad

https://gerrit.wikimedia.org/r/276716

Thank you, @jcrespo. I have checked the issues with the help of Mortiz, and I believe it is not a firewall issue but a lack of dhcp offerings to that vlan, which https://gerrit.wikimedia.org/r/276716 should fix.

@Cmjohnson please check the correctness of the ip ranges and vlan name, as I purely guessed them based on indirect config. If they are correct, I will apply the change and run puppet on carbon again, and that should fix the issues with the booting.
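
A quick way to sanity-check guessed ip ranges like these is to confirm that each host address falls inside the subnet the dhcp config declares. The sketch below uses Python's standard `ipaddress` module; the subnet and addresses are documentation placeholders (RFC 5737), not the real labs-support values, and the function name is hypothetical.

```python
import ipaddress


def addresses_outside_subnet(subnet, addresses):
    """Return the addresses that do not belong to the given subnet."""
    network = ipaddress.ip_network(subnet)
    return [a for a in addresses if ipaddress.ip_address(a) not in network]


# Placeholder values only; substitute the vlan's real range and host IPs.
print(addresses_outside_subnet("192.0.2.0/24", ["192.0.2.10", "198.51.100.7"]))
```

An empty result means every listed host would be covered by a dhcp subnet declaration for that range.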

Change 276718 had a related patch set uploaded (by Jcrespo):
s/labdsdb1008/labsdb1008/ for the non-mgmt ip

https://gerrit.wikimedia.org/r/276718

Change 276718 merged by Jcrespo:
s/labdsdb1008/labsdb1008/ for the non-mgmt ip

https://gerrit.wikimedia.org/r/276718

Change 276716 merged by Alexandros Kosiaris:
Correct comment with vlan name

https://gerrit.wikimedia.org/r/276716

The issue was, in fact, the typo fixed in https://gerrit.wikimedia.org/r/276718. labsdb1008 should have been already installed.
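
The fix in that change is a plain substitution, s/labdsdb1008/labsdb1008/. A minimal sketch of the same substitution in Python follows; the zone-file line and its address are made up for illustration and are not taken from the actual operations/dns repository.

```python
import re


def fix_hostname_typo(line):
    """Replace the misspelled hostname with the intended one."""
    return re.sub(r"\blabdsdb1008\b", "labsdb1008", line)


# Hypothetical record; the address is a documentation placeholder.
record = "labdsdb1008    1H  IN A    192.0.2.18"
print(fix_hostname_typo(record))
```

A typo like this leaves the intended hostname without a non-mgmt record, which would be consistent with the failed install described in this thread.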

db1074, db1075 and db1076 all had puppet certs signed and salt-keys added.
labsdb1008 had puppet certs signed and salt-keys added.

db1077 and db1078 did not install correctly and need troubleshooting.

Cmjohnson closed this task as Resolved. Mar 17 2016, 9:25 PM
Cmjohnson updated the task description.

@jcrespo db1077 and db1078 are finished with install. Resolving task.