User Details
- User Since
- Dec 18 2014, 3:39 PM (218 w, 1 d)
- Availability
- Available
- LDAP User
- Papaul
- MediaWiki User
- Unknown
Thu, Feb 21
CyrusOne checked the readings from the fiber patch panel in A8 and got the same readings, so they are still going to run some tests outside the cage.
@fgiunchedi is it possible to depool this server for me to do a firmware upgrade before I resolve the task?
disk replaced
Tue, Feb 19
The network cable was plugged back in after the disk replacement. Should be good now.
db2089 upgrade complete
Upgrade
BIOS from 2.4.3 to 2.9.1
IDRAC from 2.40. to 2.61
Can db2089 be depooled, please, if it is not already? Thanks
Thu, Feb 14
Upgrade
BIOS from 2.4.3 to 2.9.1
IDRAC from 2.40. to 2.60
@Marostegui this can be done anytime today. Just let me know when the server is down. Thanks
@Marostegui in most cases the "CPU1/CPU2 machine check error detected" message is caused by an outdated BIOS. I recommend that we first update the BIOS. The system BIOS is currently at 2.4.3 and there is a newer version (2.9.1) from 11/02/2019. After this we can check some settings in the BIOS under the BIOS profile.
Complete
Tue, Feb 12
Mon, Feb 11
The previous hardware was already returned last Thursday (see comment on Feb 7). We can resolve this task.
Fri, Feb 8
Checked the temperature in the rack; all looks good. Added blanks to the rack since we have only 8 servers in that rack. Leaving the task open for another week.
complete
Thu, Feb 7
Old server has been shipped out. Shipping information below.
Wed, Feb 6
second NIC configuration
Put back sda in the server.
- Removed sda from the server
- Booted the server
- Server booted without a problem
Disk replaced, server didn't boot up.
Tue, Feb 5
Mon, Feb 4
@fgiunchedi I replaced the problematic server with the new one Dell shipped to me. The OS is installed and the first Puppet run is done. I will proceed with the disk wipe on the old server on Wednesday before shipping it back to Dell. Let me know if you have any questions.
I put the bad disk back and booted the system, and it booted into the OS with no problem. It looks like, as @jcrespo and others mentioned on IRC, GRUB is installed only on /dev/sda, which is the disk that needs to be replaced. We need to fix this issue first so that I can replace the disk.
Disk with serial number WMAYP0E607DT has been replaced. The server cannot find a boot device and cannot boot into the OS after the disk replacement.
Can you please update this task with which disk failed? Thanks
Removed old puppet cert for ms-be2047.codfw.wmnet
Updated Netbox with the new serial number
Received replacement server
Fri, Jan 25
@Andrew there is no RAID controller on the new servers. They all have 2x200GB SSDs.
@Andrew can you also specify on this task in which VLAN eth1 needs to be for cloudvirt200[1-3]. Thanks
@Andrew for all those new servers, should I use labvirt-ssd.cfg for partman?
@Marostegui disk replacement complete
Thu, Jan 24
Jan 23 2019
Jan 15 2019
Looks like the mgmt switch froze; I had to unplug the power and plug it back in. The switch is back up.
Jan 14 2019
This is complete. All servers are ready to be shipped out.
This is complete. All servers are ready to be shipped out.
Jan 9 2019
papaul@asw-c-codfw> show chassis environment | match Power
Power FPC 1 Power Supply 0 OK
FPC 1 Power Supply 1 OK
FPC 2 Power Supply 0 OK
Jan 8 2019
BIOS from 2.4.3 to 2.8.0
IDRAC from 2.40 to 2.61
Jan 7 2019
Disk replacement complete
Jan 4 2019
Jan 3 2019
Dec 21 2018
Dec 20 2018
Dell just called me. They will be shipping a new system, which will arrive by the first week of January.
The power cable got loose as well, maybe while working on asw-b8-codfw on Tuesday. System is back up.
Loose power cable. System is back up.
Dec 19 2018
The Redundancy Policy on this system was set to "Not Redundant", while on the other working system it was set to "Redundant", so we changed the setting for this system to "Redundant" as well. Monitoring the system again.
Firmware upgrade complete
fpc2-fpc8 xe-2/0/41 and xe-2/0/42
fpc7-fpc8 xe-7/0/43 and xe-7/0/44
Dec 18 2018
Dec 17 2018
Before
papaul@asw-c-codfw> show interfaces descriptions | match "ge-1/0/1[0-2]"
ge-1/0/10 up down elastic2013
ge-1/0/11 up down elastic2014
ge-1/0/12 up down elastic2015
The problem happened again twice after replacing CPU1.
connected to scs-c1-codfw on port 48
Complete
fpc2-fpc8 connection xe-2/0/41 and xe-2/0/42
fpc7-fpc8 connection xe-7/0/43 and xe-7/0/44
CPU 1 has been replaced. I also cleared the log. The system is back up and I will be monitoring it once again.
Dec 14 2018
Dell will be shipping 1 new CPU by Monday.
Dec 13 2018
This can be resolved then, since I am done with it.
papaul@asw-a-codfw# run show interfaces ge-5/0/8 descriptions
Interface Admin Link Description
ge-5/0/8 down down DISABLED
This is complete
Dec 12 2018
same error again at 22:47
@Marostegui no need to close the task. It can be assigned to @RobH so he can keep track.
Dec 11 2018
@Marostegui any reason why production DNS is still showing for pc2004?