[fm-discuss] many fmd errors on T2000
Mike Gerdts
mgerdts at gmail.com
Tue Jun 20 17:41:45 PDT 2006
On 6/20/06, Gavin Maltby <Gavin.Maltby at sun.com> wrote:
> My pcie-enabled colleague says:
>
> These are correctable errors on the PCIe link (receiver errors and Bad TLPs).
> Sounds like a bad link. The transactions are being retried successfully (for now).
>
> Actually the spec specifically says that if the Physical Layer detects a
> receiver error, then the Link Layer must not also report the error as a bad TLP,
> so there looks to be a chip error handling bug too.
>
> All those ereports should have resulted in some diagnosis. What does
> 'fmadm faulty' report, and what does 'fmdump -v' show?
Could it be related to this?
http://sunsolve.sun.com/search/document.do?assetkey=1-26-102099-1
To avoid the described issues, the following mandatory patches must be applied:
* 118822-23 or later
* 119578-10 or later
* 121236-01 or later
* 121265-01 or later
* 119981-05 or later
* 120824-03 or later
* 120849-02 or later
* 118918-09 or later
along with the following entries that must be set in the "/etc/system" file:
set segkmem_lpsize=0x400000
set pcie:pcie_aer_ce_mask=0x1
Presumably the patches aren't the issue, but missing /etc/system entries could be.
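
For reference, here is a rough sketch of what those two /etc/system values work out to. It assumes (I have not checked the driver source) that pcie_aer_ce_mask is written straight into the PCIe AER Correctable Error Mask register, so the bit positions below are the ones from the PCIe AER capability layout. The helper names parse_system_set and masked_errors are made up for illustration.

```python
# Bit positions in the PCIe AER Correctable Error Status/Mask registers.
# ASSUMPTION: the Solaris tunable pcie_aer_ce_mask maps directly onto
# the Correctable Error Mask register.
AER_CE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
}


def parse_system_set(line):
    """Parse a 'set [module:]variable=value' line as found in /etc/system.

    Hypothetical helper: returns (name, integer value).
    """
    body = line.strip()
    if not body.startswith("set "):
        raise ValueError("not a set directive: %r" % line)
    name, _, value = body[4:].partition("=")
    # Base 0 lets int() accept both decimal and 0x-prefixed hex values.
    return name.strip(), int(value.strip(), 0)


def masked_errors(mask):
    """Return the names of the correctable errors a mask value would suppress."""
    return [label for bit, label in sorted(AER_CE_BITS.items())
            if mask & (1 << bit)]


if __name__ == "__main__":
    name, mask = parse_system_set("set pcie:pcie_aer_ce_mask=0x1")
    print(name, hex(mask), masked_errors(mask))

    name, size = parse_system_set("set segkmem_lpsize=0x400000")
    print(name, size // (1024 * 1024), "MB")
```

If that assumption holds, 0x1 masks only bit 0 (Receiver Error). Bad TLP is bit 6, so it would still be reported with this setting. The segkmem_lpsize value 0x400000 is 4 MB.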
Mike
--
Mike Gerdts
http://mgerdts.blogspot.com/