[fm-discuss] "PCE-E fabric" and "interconnect;lfu-l" errors
Tarik Soydan - Sun BOS Software
Tarik.Soydan at Sun.COM
Thu Apr 10 09:46:40 PDT 2008
Eric Sun wrote:
> Scott;
>
> Below is the log. Any info is appreciated.
>
> Eric
>
>
> =====
>
> SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
> EVENT-TIME: 0x47e88814.0xa220262 (0xafd46d33c7d)
> PLATFORM: SUNW,Sun-Blade-T6320, CSN: -, HOSTNAME: gd65-19-1
> SOURCE: SunOS, REV: 5.10 glendale-on10-nightly_nightly:08/07/2007
> DESC: Errors have been detected that require a reboot to ensure system
> integrity. See http://www.sun.com/msg/SUNOS-8000-0G for more information.
> AUTO-RESPONSE: Solaris will attempt to save and diagnose the error telemetry
> IMPACT: The system will sync files, save a crash dump if needed, and reboot
> REC-ACTION: Save the error summary below in case telemetry cannot be saved
>
> ereport.io.fire.fabric ena=afd46c71b2e0a001 detector=[ version=0
> scheme="dev"
> device-path="/pci@0/pci@0/pci@c/network@0" ] req_id=7a00 device_id=105e
> vendor_id=8086 rev_id=6 dev_type=0 cap_off=e0 aer_off=100 sts_reg=4010
> sts_sreg=0 pcix_sts_reg=0 pcix_bdg_sts_reg=0 dev_sts_reg=4 aer_ce=0
> aer_ue=40000 aer_sev=62011 aer_ctr=12 aer_h1=a002000 aer_h2=8200
> aer_h3=7a010870 aer_h4=0 saer_ue=0 saer_sev=0 saer_ctr=0 saer_h1=0
> saer_h2=0 saer_h3=0 saer_h4=0 remainder=2 severity=40
>
The network device /pci@0/pci@0/pci@c/network@0 detected a malformed
completion packet, according to the AER registers and the header log
registers. It must have been doing some DMA operation. I don't know what's
"malformed" about the packet, but for some reason the network device didn't
like it.
PCI Status Reg (sts_reg=4010) = signalled system error
AER_UE (aer_ue=40000) = malformed TLP
I would suspect the upstream switch device /pci@0/pci@0/pci@c.
I would also expect there to be a fault diagnosed to that effect.
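
For anyone who wants to repeat the decode by hand, here is a minimal Python
sketch of it. The bit names come from the PCI Status register and the PCIe
AER Uncorrectable Error Status register definitions. The helper names
(set_bits, decode_header_log) are made up for illustration, and the header
decode assumes the ereport carries the raw Header Log registers, with TLP
header byte 0 in bits 31:24 of aer_h1. Treat the result as a reading aid,
not a diagnosis.

```python
# Error-related bit names; only a subset of each register is listed.
PCI_STATUS_BITS = {
    4: "Capabilities List",
    8: "Master Data Parity Error",
    11: "Signaled Target Abort",
    12: "Received Target Abort",
    13: "Received Master Abort",
    14: "Signaled System Error",
    15: "Detected Parity Error",
}
AER_UE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
}
COMPLETION_STATUS = {
    0: "Successful Completion",
    1: "Unsupported Request",
    2: "Configuration Request Retry Status",
    4: "Completer Abort",
}


def set_bits(value, names):
    """Return the names of all bits set in value (unnamed bits as 'bit N')."""
    return [names.get(bit, f"bit {bit}") for bit in range(32) if (value >> bit) & 1]


def decode_header_log(h1, h2, h3):
    """Decode a 3DW TLP header; assumes header byte 0 is in bits 31:24 of h1."""
    tlp_type = (h1 >> 24) & 0x1F
    info = {
        "fmt": (h1 >> 29) & 0x3,
        "type": tlp_type,
        "is_completion": tlp_type == 0b01010,
        "length_dw": h1 & 0x3FF,
    }
    if info["is_completion"]:
        info["status"] = COMPLETION_STATUS.get((h2 >> 13) & 0x7, "reserved")
        info["byte_count"] = h2 & 0xFFF
        info["requester_id"] = h3 >> 16
        info["tag"] = (h3 >> 8) & 0xFF
    return info


# Values taken from the network device's ereport in this thread.
sts_reg, aer_ue, aer_sev = 0x4010, 0x40000, 0x62011
print(set_bits(sts_reg, PCI_STATUS_BITS))
print(set_bits(aer_ue, AER_UE_BITS))
# An uncorrectable error is fatal when its bit is also set in aer_sev.
print("fatal" if aer_ue & aer_sev else "non-fatal")
print(decode_header_log(0x0A002000, 0x8200, 0x7A010870))
```

Under those assumptions the header log decodes as a completion whose
requester ID is 7a01, i.e. the same bus and device as the detector
(req_id=7a00) but function 1.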
> ereport.io.fire.fabric ena=afd46c71b2e0a001 detector=[ version=0
> scheme="dev"
> device-path="/pci@0/pci@0/pci@c" ] req_id=360 device_id=8548 vendor_id=10b5
> rev_id=aa dev_type=60 cap_off=68 aer_off=fb4 sts_reg=10 sts_sreg=0
> pcix_sts_reg=0 pcix_bdg_sts_reg=0 dev_sts_reg=0 aer_ce=0 aer_ue=0
> aer_sev=62030 aer_ctr=1ff aer_h1=0 aer_h2=0 aer_h3=0 aer_h4=0 saer_ue=0
> saer_sev=0 saer_ctr=0 saer_h1=0 saer_h2=0 saer_h3=0 saer_h4=0 remainder=1
> severity=1
>
>
No errors.
> ereport.io.fire.fabric ena=afd46c71b2e0a001 detector=[ version=0
> scheme="dev"
> device-path="/pci@0/pci@0" ] req_id=200 device_id=8548 vendor_id=10b5
> rev_id=aa dev_type=50 cap_off=68 aer_off=fb4 sts_reg=10 sts_sreg=0
> pcix_sts_reg=0 pcix_bdg_sts_reg=0 dev_sts_reg=0 aer_ce=0 aer_ue=0
> aer_sev=62030 aer_ctr=1ff aer_h1=0 aer_h2=0 aer_h3=0 aer_h4=0 saer_ue=0
> saer_sev=0 saer_ctr=0 saer_h1=0 saer_h2=0 saer_h3=0 saer_h4=0 remainder=0
> severity=1
>
>
No errors.
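
The "No errors." readings for the two switch ports can be cross-checked the
same way for all three ereports. The sketch below is only an illustration:
parse_ereport and error_bits are hypothetical helper names, and the masks
are my assumption of which bits in each register report errors (for
sts_reg, the Capabilities List bit 0x10 is deliberately excluded, since it
is not an error). It tolerates the ">" quoting and values that a mailer
wrapped onto the next line.

```python
import re

# Masks selecting only the error-reporting bits of each ereport field.
ERROR_MASKS = {
    "sts_reg": 0xF900,      # PCI Status bits 8 and 11-15
    "sts_sreg": 0xF900,     # Secondary Status, same layout
    "dev_sts_reg": 0x000F,  # PCIe Device Status: CE/NFE/FE/UR detected
    "aer_ce": 0xFFFFFFFF,
    "aer_ue": 0xFFFFFFFF,
    "saer_ue": 0xFFFFFFFF,
}


def parse_ereport(text):
    """Parse name=hexvalue pairs from a quoted, mail-wrapped ereport."""
    body = " ".join(line.lstrip("> ").strip() for line in text.splitlines())
    body = re.sub(r"=\s+", "=", body)  # rejoin values wrapped after '='
    pairs = re.findall(r"(\w+)=([0-9a-fA-F]+)(?=\s|$)", body)
    return {name: int(value, 16) for name, value in pairs}


def error_bits(fields):
    """Return the error bits that are set, keyed by register name."""
    return {
        name: fields[name] & mask
        for name, mask in ERROR_MASKS.items()
        if fields.get(name, 0) & mask
    }


# Excerpts of two ereports from this thread, wrapped as a mailer might.
NIC = """\
> vendor_id=8086 rev_id=6 dev_type=0 cap_off=e0 aer_off=100 sts_reg=4010
>
> sts_sreg=0 pcix_sts_reg=0 pcix_bdg_sts_reg=0 dev_sts_reg=4 aer_ce=0 aer_ue=
> 40000 aer_sev=62011 aer_ctr=12
"""
SWITCH = """\
> rev_id=aa dev_type=60 cap_off=68 aer_off=fb4 sts_reg=10 sts_sreg=0
> pcix_sts_reg=0 pcix_bdg_sts_reg=0 dev_sts_reg=0 aer_ce=0 aer_ue=0 aer_sev=
> 62030 aer_ctr=1ff
"""

for label, text in (("network device", NIC), ("switch port", SWITCH)):
    found = error_bits(parse_ereport(text))
    print(label, {name: hex(bits) for name, bits in found.items()} or "no errors")
```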
> panic[cpu40]/thread=2a10c289cc0: Fatal error has occured in: PCIe fabric.
>
> 000002a10c2b1c40 px:px_err_panic+174 (0, 1337c00, 2a10c2b1cf0, 41,
> 2a10c2b1cf1, 0)
> %l0-3: 0000000000000034 00000000018fe000 0000000000000000 0000000000000001
> %l4-7: 00000000018fe000 0000000000000000 0000000001846c00 ffffffffffffffff
> 000002a10c2b1d50 px:px_err_fabric_intr+b8 (300008e9e00, 33, 7a00,
> 300008d0210, 300008e9f50, 7a00000000000000)
> %l0-3: 0000000000000001 0000060005ac4000 0000000000000020 0000060005bdccc0
> %l4-7: 0000000000000041 0000000000000000 0000000000000000 0000000000000001
> 000002a10c2b1e40 px:px_msiq_intr+1c0 (300008e7ce8, 300008d0210, 132c9bc,
> 0, 300008e14f0, 60001c2a3e0)
> %l0-3: 0000000000000000 000002a10c2b1f10 0000000000000000 0000000000000003
> %l4-7: 000002a10c2b1f40 00000600025fc000 0000000000000000 0000000000000033
> 000002a10c2b1f50 unix:current_thread+170 (16, 10000000000,
> fffffefffffffeff, fffffefffffffeff, 0, 12)
> %l0-3: 000000000100985c 000002a10c289021 000000000000000e 000000000000003a
> %l4-7: ffffffffffffffff 0000000000000000 0000000000000000 000002a10c2898d0
> 000002a10c289970 unix:cpu_halt+114 (30001b32000, 10000000000, 184d100,
> 28, 30001b32000, 60005b0b35c)
> %l0-3: 0000000000000016 00000300005afa80 00000300005afa80 0000000000000000
> %l4-7: 0000000000000000 0000000000000000 0000000000000001 0000000000000001
> 000002a10c289a20 unix:idle+128 (1819c00, 0, 30001b32000,
> ffffffffffffffff, 29, 1818c00)
> %l0-3: 0000060005b0b338 000000000000001b 0000000000000000 ffffffffffffffff
> %l4-7: 0000060005b0b338 ffffffffffffffff 000000000184d100 000000000103afbc
>
> syncing file systems...
> ======
> Scott Davenport wrote:
>
>> On Wed, 2008-04-09 at 10:56, Eric Sun wrote:
>>
>>
>>> Hi,
>>>
>>> Recently we got some errors from N2/VF systems. If anyone on this list
>>> could give a pointer on where to take the FA further, it would be
>>> appreciated.
>>>
>>> 1. On N2, Glendale, the system panicked due to "PCIe fabric". On this
>>> system, the FEM (fabric express module) is not installed.
>>>
>>>
>> I don't know the specific layout of this system, but the T2
>> (Niagara-2) has a built-in PIU. And if it's connected to at
>> least one PLX switch, that constitutes a fabric.
>>
>> Do you have more info you can share? The panic itself would produce
>> a general FMA message (SUNOS-8000-0G, IIRC) - basically saying the
>> system had to panic. But if the panic was due to a HW problem, I would
>> expect subsequent telemetry and another diagnosis. Look for FMA
>> message codes in the /var/adm/messages file, check 'fmadm faulty',
>> or run an 'fmdump'.
>>
>>
>>
>>
>>> 2. On VF/Batoka, FMA logged
>>> "fault.asic.ultraSPARC-T2plus.interconnect.lfu-f". Any suggestion as
>>> to where FMA documentation on this error can be found?
>>>
>>>
>> Each fault has a corresponding message id. That is printed to
>> the console and /var/adm/messages when the diagnosis is
>> issued. This particular one is http://www.sun.com/msg/SUN4V-8001-UH.
>> But in short you've had a single lane failure on a VF coherency
>> channel.
>>
>> -- scott
>> http://blogs.sun.com/sdaven
>>
>>
>>
>> _______________________________________________
>> fm-discuss mailing list
>> fm-discuss at opensolaris.org
>>
>>