[fm-discuss] FMA changes for UltraSPARC I revival
Gavin Maltby
gavin.maltby at sun.com
Mon May 15 07:06:03 PDT 2006
On 05/02/06 19:35, Rainer Orth wrote:
> I've just raised a couple of questions about the proposed UltraSPARC I
> revival. One of the issues affects an FMA header, so Jonathan Adams
> suggested to raise it here:
>
> http://mail.opensolaris.org/pipermail/opensolaris-code/2006-May/000822.html
>
> * When UltraSPARC I support is resurrected, it seems to be correct to
> rename /usr/include/sys/fm/cpu/UltraSPARC-II.h to UltraSPARC.h (or
> UltraSPARC-I.h) since this file already refers to both CPU types.
>
> What do you think about such a change?
US-I and US-II are architecturally nearly identical, especially from the
error-reporting point of view that fault management is concerned with.
In many places "UltraSPARC-II" is used to refer to both UltraSPARC-I and
UltraSPARC-II. So I wouldn't bother splitting or renaming this file:
it is already consistent with common practice, and any change
would suggest a greater level of support for US-I/II in fault management
than there is.
> Btw., speaking of US-I revival, there has been a question in the past about
> proper FMA support for US-I/II CPUs, and Mike Shapiro posted an excellent
> overview about what would be required for this:
>
> http://www.opensolaris.org/jive/thread.jspa?threadID=1427&tstart=0
>
> Is anyone already working on this? It would be really helpful for revived
> US-I support as well.
Certainly the FMA group in Sun isn't working on that, or planning to anytime
soon. The SPARC platform software people also are not working on it
to my knowledge - most development activity there is concentrated on
current/upcoming SPARC offerings.
Following on from Mike's email referenced above, the conversion to ereports
for a telemetry flow would be pretty simple: most support code is common
already, and most of the work would be in determining what info should
go in each ereport, which involves getting up to speed on the error
architecture for US-I/US-II and then copying what was done for US-III*/IV*.
The harder work lies in any diagnosis engine that is required. I'd have a
go at writing eversholt rules for it (as we did for amd64) rather than
any dumbed-down C diagnosis engine. Most diagnosis is pretty simple, especially
with the built-in eversholt features. But there are some trickier bits,
such as handling errors resulting from "signalling ECC", in which ECC
errors are deliberately introduced to prevent the use of bad data - likely
doable in eversholt, but it needs some care.
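To illustrate why "signalling ECC" complicates diagnosis, here is a toy
sketch in Python (not eversholt, and not the real UltraSPARC scheme). It
assumes a made-up XOR check byte in place of real ECC and a hypothetical
marker pattern SIGNAL_MASK; the only point is that a diagnosis has to tell a
deliberately signalled error apart from a fresh one before blaming hardware.

```python
# Toy model only: an XOR check byte stands in for real ECC, and
# SIGNAL_MASK is a hypothetical marker, not an actual UltraSPARC syndrome.
SIGNAL_MASK = 0xA5


def check_bits(data):
    """Compute the toy check byte: XOR of all data bytes."""
    c = 0
    for b in data:
        c ^= b
    return c


def store(data, poisoned=False):
    """Return (data, ecc). If the data is already known to be bad,
    deliberately corrupt the check byte with the marker pattern."""
    ecc = check_bits(data)
    if poisoned:
        ecc ^= SIGNAL_MASK
    return data, ecc


def syndrome(word):
    """Difference between the recomputed and the stored check byte."""
    data, ecc = word
    return check_bits(data) ^ ecc


def classify(word):
    """Separate clean data, deliberately signalled errors, and new errors."""
    s = syndrome(word)
    if s == 0:
        return "ok"
    if s == SIGNAL_MASK:
        return "signalled"   # derived from an earlier error; look upstream
    return "new-error"       # candidate for diagnosis at this location


good = store(b"\x01\x02")
poisoned = store(b"\x01\x02", poisoned=True)
corrupted = (b"\x01\x03", good[1])   # data changed after the check byte was stored

print(classify(good), classify(poisoned), classify(corrupted))
```

In this toy model only the "new-error" case would point at the location
where the error was observed; the "signalled" case has to be traced back to
whatever detected the original bad data.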
I'd be happy to help and advise anyone wanting to implement this - I have the
background in both SPARC error handling and FMA software.
Cheers
Gavin