Wed, Jun 19
I updated the switch config to private1-d. Both servers are currently off and ready for installs. Assigning to @RobH to install.
Dell is sending me a new RAID card, cables and backplane. Sorry it took so long; I had to call them after they denied my second request.
I thought I had a ticket for this open with HPE but it doesn't look that way. I will take care of it ASAP
Closing this for now, let me know if there is another issue. Keep in mind this server is out of warranty
The DIMM has been reseated and swapped to the opposite sides.
Mon, Jun 17
these have been racked
servers are set up and have been added to the tracking sheet
servers are ready as spares and in tracking sheet
@ayounsi I'd rather not move the servers. I racked them based on the instructions and they're already in racks and set up.
@Marostegui: do they all go to the cloud vlan? If they do, then 1020 and 1021 are in row D, and that support-cloud vlan is not available on row D yet. I need Arzhel to copy the vlan over.
Assigning to @ayounsi to add cloud-support1-d-eqiad. Once that is done, the vlan for dbproxy1020 and 1021 will need to be set up. Switch port descriptions are done.
Tue, Jun 11
They declined my ticket, saying I didn't isolate the problem well enough.
This server accepts all the racadm commands successfully. I verified on-site that these things actually happened
@Andrew what parts? There is nothing that suggests that it is CPU on the server side of things. I reseated and moved the DIMM and that error has not returned. It may very well have been a poorly seated DIMM. I checked dmesg and do not see any more errors related to memory or CPU. Try putting it back into production and let's see if anything comes back. Unfortunately, I need to demonstrate and prove there is a problem for Dell to do anything, and right now I do not have anything to give them.
This is a duplicate task; declining.
This server's SSDs are not part of the original build and are not under HP warranty. They are Intel SSDs that I believe came from restbase1001-1003. Assigning to @RobH to order new SSDs.
I found a spare disk and added it back; it's now online.
@Gehel you will need to take the server offline for a day so I can reseat the DIMM. The server logs do not indicate any memory errors. If you want to downtime it for Wednesday or Thursday let me know.
@Marostegui that log entry may have been old. The server has both power supplies connected and does not report any current errors. Resolving the task.
Mon, Jun 10
@RobH this disk will need to be ordered outside of the warranty. These servers were shipped without disks; the procurement task states that the disk from RBDEV1001-1003 will be used. They are 800GB Intel SSDs.
This has been completed
This server is out of warranty and @elukey said to skip replacing the disk. If the status changes and this needs to be done, please re-open the task.
declining this for now since it's out of warranty and the disk has not failed
I updated with the service pack and powered on. Reassigning to @Marostegui.
Fri, Jun 7
The motherboard was replaced and the server is back up
Thu, Jun 6
The server is out of warranty and we will need to order more 600GB disks.
The HP technician will be here June 7 @1000 Ashburn time.
You have successfully submitted request SR991779294.
Wed, Jun 5
Update on this server. I have updated all of the f/w, including the RAID card. I am able to isolate the problem to slot 0 right now. I moved the disks around and they do not report any errors; only the slot does. I have blown out the RAID several times and re-configured it, but the error keeps coming back. I have reseated the RAID card as well.
The bbu has been replaced.
Good afternoon! db1091: I do have a spare bbu, but that spare has been helpful the last year or so. HP is slow to send out the batteries; they can take days to get because of their slow response time and then having to ship batteries via ground transportation only. If I use it for this server then I am not able to quickly change out the bbu on something that may be more important in the future. The call is yours since you have the most BBU issues.
Tue, Jun 4
Fri, May 31
@jcrespo the server is back on and I am able to reach the mgmt interface.
@greg the disks have been added and assigned to you
Thu, May 30
Swapped DIMM B3 with DIMM A3 and cleared the log.
Swapped DIMM A5 with DIMM B5 and cleared the racadm log.
Wed, May 29
A ticket has been created with HP for a replacement: 5338974144
Steps i have taken
this server is out of warranty. @RobH should we order a new battery?
@elukey I do not have any 4TB disks left over in eqiad. If I understand your comment correctly you are saying it's okay to ignore this for now.
Tue, May 28
May 16 2019
I don't think DC-Ops is holding this task up any longer.
lvs1013 idrac is configured and connected to all ports and all switches
lvs1014 idrac is configured and is connected to all the switches
May 13 2019
asw2-a6-eqiad Asset tag mismatch for s/n PE3717440136: WMF7322 (Accounting) vs. WMF7232 (Netbox) - Correct asset tag is WMF7322
asw2-a7-eqiad Asset tag mismatch for s/n TA3717230855: WMF7323 (Accounting) vs. WMF7233 (Netbox) - Correct asset tag is WMF7323
asw2-a8-eqiad Asset tag mismatch for s/n PE3717440010: WMF7324 (Accounting) vs. WMF7234 (Netbox) - Correct asset tag is WMF7324
asw2-b5-eqiad Asset tag mismatch for s/n PE3717430036: WMF7329 (Accounting) vs. WMF7239 (Netbox) - Correct asset tag is WMF7329
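A check like the four mismatches above could be scripted. This is only a rough sketch, assuming both inventories can be exported as plain serial-number-to-asset-tag mappings; the function name `find_tag_mismatches` and the two dictionaries are hypothetical and do not reflect how Netbox or the accounting sheet are actually queried. The sample data is taken from the four entries above.

```python
def find_tag_mismatches(accounting, netbox):
    """Return (serial, accounting_tag, netbox_tag) for serials whose tags differ.

    Both arguments are hypothetical {serial_number: asset_tag} mappings.
    Serials missing from netbox are skipped rather than reported.
    """
    mismatches = []
    for serial, acct_tag in sorted(accounting.items()):
        nb_tag = netbox.get(serial)
        if nb_tag is not None and nb_tag != acct_tag:
            mismatches.append((serial, acct_tag, nb_tag))
    return mismatches


# Sample data copied from the mismatch reports above.
accounting = {
    "PE3717440136": "WMF7322",
    "TA3717230855": "WMF7323",
    "PE3717440010": "WMF7324",
    "PE3717430036": "WMF7329",
}
netbox = {
    "PE3717440136": "WMF7232",
    "TA3717230855": "WMF7233",
    "PE3717440010": "WMF7234",
    "PE3717430036": "WMF7239",
}

for serial, acct, nb in find_tag_mismatches(accounting, netbox):
    print(f"Asset tag mismatch for s/n {serial}: {acct} (Accounting) vs. {nb} (Netbox)")
```

Keying on the serial number rather than the switch name means a renamed device would still be matched.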
May 9 2019
New disk ordered. You have successfully submitted request SR990443425.
May 7 2019
@RobH these are currently powered down
I drained the flea power and you should not have any issues with the idrac. The server is still out of warranty, so there is not much I can do about the RAID at this point in time. I did disable the network switch port. If you need access again please let me know and I will enable the port.
May 6 2019
I now have h/w log entries. I will need the server to be taken offline so I can relocate the DIMM and check to see if the error follows. Unfortunately there are steps that need to be done to get Dell to replace it.
I created a task for this with HPE.
@moborvac I haven't had a chance to get to them until this week. I should be able to get them racked and the on-site done this week.
May 2 2019
@RobH all the servers are racked and on-site work has been completed. Some are off and some are in a state that just needs a reboot.
@Andrew the disk has been replaced, all yours to install