Actually we need to close this task and open a separate task about the
disk. A different issue should get a different task.
Sep 13 2019
Sep 12 2019
I did notice that the SSDs are different types:
The new SSD is a DC3320 series
The old SSD is a DC3610 series
@wiki_willy not really, but I reseated it anyway. As far as I can tell, everything looks normal in the BIOS. I did swap the 2 disks. @Gehel try again please.
Sep 10 2019
The PDU has been swapped and the new PDUs are in Netbox. @RobH can you help with the setup for the serial console, please?
Sep 9 2019
This got lost in the shuffle... will work on it this week. @Jclark-ctr can you contact HPE support and open a ticket, please?
Sep 6 2019
Sep 5 2019
@Ottomata the on-site work is done. They will need updated production DNS, but all are moved and connected.
@Andrew the new MAC is in an earlier update. The server is moved and connected to the new port, and the RAID config is completed. It needs the DHCP file updated and is ready for you to re-image.
B0:26:28:29:6A:E0
@Andrew this is ready for you to re-image
Sep 4 2019
@Jclark-ctr Please set up the iDRAC and add the mgmt DNS. Let me know if you have any issues or questions. I also need the switch ports.
Sep 3 2019
@Ottomata All the servers are moved, and all of them except cloudvirtan1003 are connected to the switch in the correct vlan. @Jclark-ctr if you are still around, can you verify that cloudvirtan is connected to the switch in rack d7 xe-7/0/20, please?
@Ottomata Do you still need the 2nd port now that you're not doing the cloud thing? If so which vlan?
Aug 30 2019
@wiki_willy negative, we do not have any spare BBUs lying around.
Updated the iDRAC and RAID firmware.
Aug 29 2019
@Jclark-ctr please rack 1 each in B2/B4/B7 and update Netbox.
It looks like the mgmt is locked out, and this server will require a hard reboot and flea power drain. Please let me know when it's safe to turn the server off for 5-10 mins.
Aug 27 2019
@Jclark-ctr Can you move these servers as evenly as you can into racks B2, B4 and B7, and cable them with 10G DAC cables and the mgmt cable, please? Then update Netbox and this task with their locations and the port numbers you connected the servers to.
@Andrew This server will require a physical move to B2, B4 or B7. I will do this one last. I am working on cabling 1021/1022 and updating the RAID config so you can re-image.
@Jclark-ctr Can you run 10G DAC cables in rack B7. Connect to the 10G ports on the server but do not plug into the switch. Be sure to label each cable.
@Jclark-ctr Can you run 10G DAC cables in rack B4. Connect to the 10G ports on the server but do not plug into the switch. Be sure to label each cable.
This is the reason the task was declined: I verified that the failed disk is indeed 1.9TB, but it is an SSD. The original order and the disk caddy label both show an Intel 1.6TB SSD S3610. Assigning to @wiki_willy
@Marostegui Replaced the disk with one of the few remaining used spares. I did notice 2 more disks are starting to fail... you may want to speed up the decom process.
Aug 23 2019
I replaced the failed disk
The ticket was declined by Dell, stating that the disks we have installed are not original to the server. This requires me to investigate.
Finished the iDRAC setup. On-site work is complete.
Aug 22 2019
@Jclark-ctr did you add this to the tracking sheet?
Aug 21 2019
Board arrived DOA...need another one
The disk was replaced, but from what I can tell the RAID configuration is not accepting the new disk. The RAID utility shows that all the disks are good, but the RAID is missing a disk. This may need the RAID config manually updated and a re-install. Let me know.
@Bstorm can you try rebooting the server to see if the disks get back to the correct order? I know that works for analytics. Please try that... I do have a disk, but I'm not sure which disk is bad.
Aug 20 2019
@Jclark-ctr Please wipe and remove these servers from the rack and update the task -- assign it back to me once done please.
Can you wipe this server and remove it from the rack as soon as you can? We need the space.
@Jclark-ctr has this been done? We need the space in rack B2, so please make this a priority item. Thanks!
@elukey the site specific portion is complete if you want to take over from here
@Marostegui I had a used disk on-site and replaced it... it's currently in rebuild.
Swapped the DIMM B3 with A3 and B7 with A7. Powered on and cleared the log. Let's see if the errors return or change.
A ticket has been placed with Dell
Another ticket has been placed with Dell
Aug 16 2019
Dell approved my ticket. I talked to the technician today and he will be
out Monday morning to replace the motherboard.
Aug 15 2019
I will add that this server is out of warranty and would require a motherboard replacement if it is the NIC. We typically do not do this after the warranty period, and the host should be decommissioned.
- I checked the network switch and the port shows up/up meaning that link from the server to the network switch is up
ge-3/0/17 up up elastic1017
@Marostegui I see a potential issue with B3 as well. I will need to do a DIMM swap A -> B side and see if the errors stay with the DIMM or are the CPU. Let's schedule this for early next week, please. Tuesday 1400UTC?
cloudcephosd1001 10.65.2.177
cloudcephosd1002 10.65.2.178
cloudcephosd1003 10.65.2.179
Submitted the ticket with Dell. We will see what happens
Aug 14 2019
@Jclark-ctr can you add asset tags and enter these servers into Netbox (T222916 is the procurement task). Leave them on the floor and the rack information blank in netbox until we know for sure where they're going. Once done, please re-assign back to Rob
@Jclark-ctr can you add asset tags and enter these servers into Netbox (T221698 is the procurement task). Leave them on the floor and the rack information blank in netbox until we know for sure where they're going. Once done, please re-assign back to Rob
+an-conf1001 1H IN A 10.65.5.118
+an-conf1002 1H IN A 10.65.5.119
+an-conf1003 1H IN A 10.65.5.120
Please check to make sure that the power cables are fully seated. Update the task and let me know if I need to order a new PSU.
ganeti1019 10.65.5.114
ganeti1020 10.65.5.115
ganeti1021 10.65.5.116
ganeti1022 10.65.5.117
@Jclark-ctr Mgmt IPs that need to be set up on the iDRAC
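For reference, a minimal sketch of how the ganeti mgmt host/IP pairs above could be rendered as A records in the same `name 1H IN A ip` form used for an-conf1001-1003. The `a_record` helper, the `1H` TTL default and the use of bare hostnames are assumptions for illustration only; this is not the actual DNS tooling.

```python
import ipaddress

def a_record(host: str, ip: str, ttl: str = "1H") -> str:
    """Format one zone-file A record line (hypothetical helper).

    The address is validated first, so a typo raises ValueError
    instead of ending up in the zone file.
    """
    addr = ipaddress.IPv4Address(ip)
    return f"{host} {ttl} IN A {addr}"

# Host/IP pairs copied from the list in this update.
mgmt_hosts = {
    "ganeti1019": "10.65.5.114",
    "ganeti1020": "10.65.5.115",
    "ganeti1021": "10.65.5.116",
    "ganeti1022": "10.65.5.117",
}

records = [a_record(host, ip) for host, ip in mgmt_hosts.items()]
print("\n".join(records))
```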
Aug 13 2019
@Jclark-ctr Please rack 4 of the servers from the same ganeti stack in row D and label them as ganeti1019-1022. Please update netbox, and provide access switch port info.
Disks replaced. Please re-open and ping me if the disk fails.
Please rack, label and cable these servers with the racking locations above. Add them to Netbox, and make sure the status is set to planned and the asset tag/SN is ALL CAPS. Please update the task with which network ports each server is attached to on the access switch.
Aug 8 2019
This doesn't really tell me anything about the bad disk. I am not able to ssh into the host for more details. I will create a ticket and hope that there is something Dell can use in their TSR.
The ticket was approved. The new SSD should arrive today or tomorrow.
This is odd, I am not getting a link light on the raid controller connections.
@Jclark-ctr Please wipe logstash1004 and 1005 and then remove from rack and update netbox and the google tracking sheet.
https://docs.google.com/spreadsheets/d/1JhjeV3cXfIzIyekJrnA2nNFFDGTT4SeLmyAFvDa4HmA/edit#gid=2026042311