labstore1005: PCIe link training failure error on boot
Open, High, Public

Description

labstore1005 has displayed this error on boot. It booted normally after pressing F1 but it's unclear if the system will reboot successfully without manual intervention.

UEFI0067: A PCIe link training failure is observed in PCIe Slot 6 and the link
is disabled.
Do one of the following: 1) Turn off the input power to the system and turn on
again. 2) Update the PCIe device firmware. If the issue persists, contact your
service provider.

Available Actions:
F1 to Continue and Retry Boot Order
F2 for System Setup (BIOS)
F10 for LifeCycle Controller
- Enable/Configure iDRAC
- Update or Backup/Restore Server Firmware
- Help Install an Operating System
F11 for Boot Manager

Event Timeline

herron created this task. Jun 30 2017, 1:52 AM
Restricted Application added a subscriber: Aklapper. Jun 30 2017, 1:52 AM
herron updated the task description. Jun 30 2017, 1:53 AM
Andrew added a subscriber: Andrew.

I tagged dc-ops because... have y'all ever seen something like this?

bd808 added a subscriber: bd808. Jun 30 2017, 2:07 AM

http://www.dell.com/support/manuals/us/en/04/dell-opnmang-sw-v8.1/EEMI_13G_v1.2-v1/UEFI-Event-Messages?guid=GUID-823669E3-2D7B-41B5-85F1-AF7A6BC11ACC&lang=en-us

UEFI0067

Message
    A PCIe link training failure is observed in arg1 and device link is disabled. 
Arguments
    arg1 = PCIe device 
Detailed Description
    A PCIe link failure is observed in the PCIe device identified in the message and device link is disabled. 
Recommended Response Action
    Do one of the following: 1) Turn off the input power to the system and turn on again. 2) Update the PCIe device firmware. If the issue persists, contact your service provider. 
Category
    System Health (UEFI = UEFI Event) 
Severity
    Severity 1 (Critical)

We did another reboot to downgrade the kernel back to 4.3 and the error happened again.

chasemp triaged this task as High priority. Edited Jul 3 2017, 5:47 PM
chasemp added subscribers: Christopher, chasemp.

> I tagged dc-ops because... have y'all ever seen something like this?

This server has failed to come back on its own after a reboot a few times, so we are a bit scared of it atm :)

@Cmjohnson I pinged the wrong Chris before :) As of this moment labstore1005 is the standby, so if you have time to look at this it would be great, Chris. Thanks.

Bstorm added a subscriber: Bstorm.

This is pretty old. We'll have to reboot it again to know if this is still happening. I suspect it actually isn't.
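
For whoever does that reboot: besides watching the console for UEFI0067, it may help to look at the negotiated PCIe links from the OS once the host is back up. The sketch below is only an illustration, not something that was run on labstore1005. It assumes a Linux kernel that exposes the `current_link_width`, `max_link_width`, `current_link_speed` and `max_link_speed` attributes under `/sys/bus/pci/devices`, and the names `read_attr` and `link_report` are made up for this example. Note that a device whose link was disabled by the firmware may not enumerate at all, so a card missing from the output is also a signal.

```python
from pathlib import Path


def read_attr(dev: Path, name: str):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        return None


def link_report(root="/sys/bus/pci/devices"):
    """Collect PCIe link width and speed for every device that exposes them.

    `root` defaults to the usual sysfs location (an assumption about the host);
    it is a parameter so the function can be pointed at a test directory.
    """
    base = Path(root)
    if not base.is_dir():
        return []  # not Linux, or sysfs not mounted
    report = []
    for dev in sorted(base.iterdir()):
        cur_w = read_attr(dev, "current_link_width")
        max_w = read_attr(dev, "max_link_width")
        if cur_w is None or max_w is None:
            continue  # attribute not exposed for this device
        report.append({
            "address": dev.name,
            "current_width": cur_w,
            "max_width": max_w,
            "current_speed": read_attr(dev, "current_link_speed"),
            "max_speed": read_attr(dev, "max_link_speed"),
            # Flag any link that trained at a width other than its maximum.
            "degraded": cur_w != max_w,
        })
    return report


if __name__ == "__main__":
    for entry in link_report():
        flag = "DEGRADED" if entry["degraded"] else "ok"
        print(entry["address"],
              "width", entry["current_width"], "/", entry["max_width"],
              "speed", entry["current_speed"], "/", entry["max_speed"],
              flag)
```

Mapping "PCIe Slot 6" from the error message to a PCI address is left out here because it depends on the platform. Comparing the output before and after a reboot is probably the simplest way to spot a card that has dropped off.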