- name mirror1001 (discussed via PM with @RobH and @Cmjohnson)
- setup network ports
- mgmt & production dns
- install_module update - dhcp lease file
- install_module update - partitioning and netboot
- install OS (Jessie)
- puppet/salt key acceptance
- service implementation (hand off to @faidon for this step)
Description
Status | Assigned | Task
---|---|---
Resolved | RobH | T137117 Replace/refresh carbon
Unknown Object (Task) | |
Unknown Object (Task) | |
Resolved | Cmjohnson | T139171 Rack/setup sodium (carbon/mirror server replacement)
Event Timeline
This will require 10GbE, and also will be an apt-mirror, not our actual apt-server. (That wording was my mistake in earlier tasks.)
So it won't replace carbon entirely, but some of its service(s).
The details on what this system will do are on T137117. I suggest the following:
- 10GbE rack
- use element name (don't forget to parse site.pp for ganeti VMs using element names)
- rack anywhere in rows a-c where there is a free 10GbE rack (avoid D for its imminent switch stack upgrade)
- public vlan
- raid10 with jessie
No need to — I wouldn't expect more of those. An element name for it would be fine IMHO.
Mgmt and production DNS completed. I added to public vlan and assigned both ipv4 and ipv6.
Only thing missing at this point is preferred partitioning. Please let me know and I will update and install.
Confirmed that PXE is enabled on the 10G NIC and disabled the 1G NICs in the BIOS. The server is connected via fiber w/ SFP+ modules; I'm going to remove those and use a DAC cable instead.
So, this was simply a case of a misconfigured VLAN on the switch. I fixed that and, with another small hack[1], managed to make the server install.
However, it is currently impossible for the server to boot off the disk: the BIOS simply doesn't list the virtual disk as an option in the boot sequence. Both @RobH and I tried a few different things.
The symptoms seem similar to this:
https://arstechnica.com/civis/viewtopic.php?f=21&t=1316257
I tried upgrading the controller's BIOS from 25.4.0.0015 to 25.4.1.0004 and after that tried downgrading to 25.2.2-0004 (and factory resetting the controller, as well as recreating the VD — as RAID5). None of these had any effect.
The next step would be to contact Dell about this — @Cmjohnson could you take care of this? Thanks!
1: The hack would be adding modprobe.blacklist=tg3 after ixgbe.allow_unsupported_sfp=1 and before console=ttyS1,115200n8 in carbon's /srv/tftpboot/jessie-installer/pxelinux.cfg/ttyS1-115200.
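To make the ordering in that footnote concrete, here is a minimal sketch of the edit, done on a string instead of by hand. The `append` value below is a hypothetical excerpt containing only the two arguments named above; the real line in the pxelinux config has more arguments that are not shown in this task. The helper name `insert_after` is made up for illustration.

```python
def insert_after(cmdline: str, anchor: str, new_arg: str) -> str:
    """Return cmdline with new_arg placed directly after anchor."""
    args = cmdline.split()
    if new_arg in args:
        return cmdline  # already present, nothing to do
    idx = args.index(anchor)  # raises ValueError if the anchor is missing
    args.insert(idx + 1, new_arg)
    return " ".join(args)


# Hypothetical excerpt of the kernel arguments (assumption, not the real file).
append = "ixgbe.allow_unsupported_sfp=1 console=ttyS1,115200n8"

patched = insert_after(append, "ixgbe.allow_unsupported_sfp=1", "modprobe.blacklist=tg3")
print(patched)
# ixgbe.allow_unsupported_sfp=1 modprobe.blacklist=tg3 console=ttyS1,115200n8
```

Blacklisting tg3 keeps the installer from loading the driver for the onboard 1G NICs, so only the ixgbe (10G) interface is seen during the install.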
A work order to replace the system board has been issued. Congratulations: Work Order SR933837812 was successfully submitted.
A new system board has been confirmed. Dell will be sending a tech out to me next week.
Your appointment has been scheduled for : 12:00 PM-05:00 PM , Wednesday, August 03, 2016.
PowerEdge Expandable RAID Controller BIOS
Copyright(c) 2014 LSI Corporation
Press <Ctrl><R> to Run Configuration Utility
HA -0 (Bus 1 Dev 0) PERC H730 Mini
FW package: 25.2.2-0004
0 Non-RAID Disk(s) found on the host adapter
0 Non-RAID Disk(s) handled by BIOS
1 Virtual Drive(s) found on the host adapter.
0 Virtual Drive(s) handled by BIOS
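The telling part of the banner above is the mismatch between "found on the host adapter" and "handled by BIOS". A rough sketch of how that check could be scripted against captured console output follows; the function names are hypothetical, the regular expression only covers the four count lines shown here, and reading "found but not handled" as "not offered for boot" is this thread's interpretation, not documented controller behaviour.

```python
import re

# Matches lines such as "1 Virtual Drive(s) found on the host adapter."
COUNT_LINE = re.compile(
    r"(\d+)\s+(Non-RAID Disk|Virtual Drive)\(s\)\s+"
    r"(found on the host adapter|handled by BIOS)"
)


def bios_handled_counts(banner: str) -> dict:
    """Map each device kind to its 'found' and 'handled' counts."""
    counts = {}
    for number, kind, where in COUNT_LINE.findall(banner):
        slot = counts.setdefault(kind, {"found": 0, "handled": 0})
        key = "found" if where.startswith("found") else "handled"
        slot[key] = int(number)
    return counts


def kinds_not_handled(banner: str) -> list:
    """Device kinds where the adapter found more devices than the BIOS handles."""
    return [
        kind
        for kind, c in bios_handled_counts(banner).items()
        if c["found"] > c["handled"]
    ]


BANNER = """\
0 Non-RAID Disk(s) found on the host adapter
0 Non-RAID Disk(s) handled by BIOS
1 Virtual Drive(s) found on the host adapter.
0 Virtual Drive(s) handled by BIOS
"""

print(kinds_not_handled(BANNER))  # ['Virtual Drive']
```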
Requested a new RAID Controller. Found that we're not the only one w/this problem https://arstechnica.com/civis/viewtopic.php?f=21&t=1316257
Replaced the broken cable; during POST I am still getting the same message that the VD is not handled by the BIOS.
Created a new work order to have a technician come to the data center and troubleshoot.
Spoke with Dell support technician Robert Thaler today. We went over some things that were already done, and he's also stumped by the issue. He did state that there have been numerous issues with the 4k drives. Dell's suggestion is to set the controller to HBA mode and use software RAID.
The RAID controller is actually useful and expensive. They sold us this system, in this configuration, with a RAID controller (w/ a BBU) and those specific disks. Can you circle back with them and demand they fix this for us? Cc: @RobH (in case we need to involve our sales rep too)
The Dell tech did look and told me there are non-4k 6TB disks we could use:
http://accessories.ap.dell.com/sna/productdetail.aspx?c=au&l=en&s=dhs&cs=audhs1&sku=400-ALDU
Should we talk with Dell about exchanging them?
We'll need our Dell reps looped in, as they did sell us this config and it should work. In addition, we had to buy cables, since one broke diagnosing an issue that they saddled us with.
I've chatted with Chris about this via IRC. He is documenting all the steps taken by tech support into a cohesive email to send over to our Dell reps so they can get involved and solve this issue for us.
With regard to swapping for 4TB disks: I have no preference, except that Dell sold us the config. They'll need to eat the costs of the swaps and need to be held accountable for selling us a bad configuration.
@faidon Would you be okay with 4TB disks instead of the 6TB disks we have now or would you want to go w/ SW raid?
4x4TB + HWRAID would be preferable. In any case Dell should refund us the difference.
I wonder why @RobH and @Cmjohnson are talking about 4TB disks. The current problems are caused by 4k (6TB) disks, and the accessories link given by Cmjohnson mentions a non-4k 6TB disk.
Have you been discussing the use of 4TB disks via another way (e.g. IRC), or is there confusion between 4TB (non-4k?) and 6TB non-4k disks?
@Southparkfan The issue is with the 4k and 512e disks. However, in order to replace them with something other than 4k and 512e disks we will need to reduce the capacity from 6TB to 4TB. I hope that clears up any confusion.
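For anyone following along on the 4k / 512e distinction: on a Linux host it can be read from the logical and physical block sizes that the kernel exposes under `/sys/block/<device>/queue/`. The sketch below is only illustrative; the function names are made up, and behind a hardware RAID controller the OS reports the virtual disk, which may not reflect the member disks' native format.

```python
from pathlib import Path


def classify_sector_format(logical: int, physical: int) -> str:
    """Name the sector format for a logical/physical block size pair."""
    if logical == 4096 and physical == 4096:
        return "4Kn"   # 4k native
    if logical == 512 and physical == 4096:
        return "512e"  # 4k physical sectors, 512-byte emulation
    if logical == 512 and physical == 512:
        return "512n"  # 512-byte native
    return "unknown"


def read_block_sizes(device: str, sysfs_root: str = "/sys/block") -> tuple:
    """Read (logical, physical) block sizes for a device such as 'sda'."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical


print(classify_sector_format(512, 4096))   # 512e
print(classify_sector_format(4096, 4096))  # 4Kn
```

On a live system, `classify_sector_format(*read_block_sizes("sda"))` would give the format for `sda`; the device name is an example only.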
Still working on getting the disks replaced w/out any costs to us and possibly a refund. This is the latest message:
Chris,
Base on your request below return request order number 935064839, To return this order we submitted a FSR (Financial Services Request) as it is out of policy.
An FSR does not guarantee that the order will be approved for the return. Moreover, I will do all my effort to help on this request.
Request Id 9726888.
I have created a Service Request for this issue. Feel free to contact me if there is any question or doubt.
Regards,
Carlos Brown
Customer Care Analyst
Dell | Dell Business Operations
Carlos_Brown@DellTeam.com
Received the new disks from Dell, installed them, and set them to RAID; the new disks are now handled by the BIOS.
Thanks Chris. I installed the system, reconfigured BIOS etc.; system is installed and up and running now.