[fm-discuss] Proposal: libtopo enumeration of power supplies and fans via IPMI
Rob Johnston
Robert.Johnston at Sun.COM
Thu Jan 10 13:32:54 PST 2008
Scott Davenport wrote:
> On Mon, 2008-01-07 at 12:49 -0800, Rob Johnston wrote:
>> Hello and happy new year!
>>
>> What follows is a proposal that Eric Schrock and I have been working on
>> to define a mechanism to enumerate power supplies and fans on platforms
>> that support IPMI. This is a first step in the larger Sensor
>> Abstraction Layer project.
>>
>> Comments and questions are encouraged.
>>
>> --------------------------------------------------------------------
>>
>> 1. DESCRIPTION
>>
>> The Solaris FMA framework is designed to diagnose failures in system
>> components. Currently these components are discovered by probing the
>> hardware visible to Solaris via standard OS paths (I/O, CPU, DIMMs,
>> etc.). However, there exists a set of components that are crucial to the
>> ongoing health of the system that have no connection visible to Solaris.
>> The most common components, and the most likely to encounter failures,
>> are power supplies and fans.
>>
>> On low-end hardware, these components are often not observable, and it
>> is the responsibility of the user to manually detect component failure,
>> or run custom (Windows) software to observe the system. Higher-end
>> systems (such as the x4000 series shipped by Sun) have a service
>> processor that manages the physical components and sensors in the
>> system. Some systems (such as SPARC) have a custom communications
>> mechanism between the OS and the SP, but the industry standard is IPMI
>> (Intelligent Platform Management Interface). Solaris already has the
>> ability to communicate with the SP over the baseboard management
>> controller (/dev/bmc), and a basic library (libipmi) already exists.
>>
>> Integrating support for power supplies and fans within FMA is an
>> important step in bringing all hardware topology enumeration and
>> diagnosis under a single infrastructure. Without this ability, users
>> must manage a separate OS instance (on the SP) with different
>> configuration, separate management, and separate notification
>> mechanisms.
>>
>> This proposal adds basic enumeration support for power supplies and fans
>> on platforms supporting IPMI. It does not include the ability to
>> diagnose PSU or fan failures, nor does it provide a way to read
>> environmental sensors (fan speed, etc.) for these components. This
>> functionality will be provided by a future project.
>>
>>
>> 2. TOPOLOGY CHANGES
>>
>> On x86 systems, the root of the hc topology tree is hc:///motherboard=0
>> (though bay nodes can exist at the root level as well). It doesn't make
>> sense to have physical components like fans underneath the motherboard,
>> nor does it make sense to have them directly at the root level. Future
>> projects will add sensors that monitor the chassis itself, and the
>> components are contained within the chassis, so a new root hc node is
>> created:
>>
>> hc:///chassis=0
>>
>> There is only ever a single chassis. Within IPMI, fans and PSUs can be
>> grouped together into domains that represent a logical unit (typically a
>> FRU). While uncommon for power supplies, this is quite common for fan
>> modules or fan trays that contain multiple fans. Therefore a
>> multi-level topology will be created of the form:
>>
>> hc:///chassis=0/psu=0
>> hc:///chassis=0/psu=1
>> hc:///chassis=0/powermodule=0
>> hc:///chassis=0/powermodule=0/psu=0
>> hc:///chassis=0/powermodule=0/psu=1
>>
>> hc:///chassis=0/fan=0
>> hc:///chassis=0/fan=1
>> hc:///chassis=0/fanmodule=0
>> hc:///chassis=0/fanmodule=0/fan=0
>> hc:///chassis=0/fanmodule=0/fan=1
>
> Rob,
>
> One other question here: I'm unclear on what will create the
> 'chassis' node? Driven by the xml map file? Can another enumerator
> create this, then have ipmi enumerate items beneath it?
>
> Thanks,
> -scott
Hi Scott,
The chassis node is being statically created in i86pc-hc-topology.xml. I've
attached preliminary copies of the XML files from my workspace to give you a
better idea of what we're doing. These also include a first pass at replacing
the <propset> element and the "set" attribute of the <range> element with a
more generic <set> element.
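As a rough illustration of the node layout quoted above (this is not taken
from the attached XML files or from libtopo itself), here is a small Python
sketch that generates the same hc paths. The helper names hc_fmri and
enumerate_nodes are hypothetical, and the counts are simply the ones used in
the example from the proposal.

```python
def hc_fmri(path):
    """Render a list of (name, instance) pairs as an hc-scheme path string."""
    return "hc:///" + "/".join(f"{name}={inst}" for name, inst in path)


def enumerate_nodes(parent, name, count, module_name=None, per_module=()):
    """Yield paths for `count` standalone nodes directly under `parent`,
    then for each grouping module and the nodes contained in it.

    per_module lists how many nodes each module contains, e.g. [2] means
    one module holding two nodes.
    """
    for i in range(count):
        yield parent + [(name, i)]
    for m, contained in enumerate(per_module):
        module = parent + [(module_name, m)]
        yield module
        for i in range(contained):
            yield module + [(name, i)]


# The single chassis root node from the proposal.
chassis = [("chassis", 0)]

# Two standalone fans plus one fan module containing two fans.
fan_paths = [hc_fmri(p)
             for p in enumerate_nodes(chassis, "fan", 2, "fanmodule", [2])]

# Two standalone PSUs plus one power module containing two PSUs.
psu_paths = [hc_fmri(p)
             for p in enumerate_nodes(chassis, "psu", 2, "powermodule", [2])]

for line in psu_paths + fan_paths:
    print(line)
```

Instance numbers restart at 0 inside each module, which is why fan=0 appears
both directly under the chassis and under fanmodule=0.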
rob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: i86pc-hc-topology.xml
Type: text/xml
Size: 2824 bytes
Desc: not available
Url : http://mail.opensolaris.org/pipermail/fm-discuss/attachments/20080110/bfd0cfff/attachment-0004.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fan-hc-topology.xml
Type: text/xml
Size: 32889 bytes
Desc: not available
Url : http://mail.opensolaris.org/pipermail/fm-discuss/attachments/20080110/bfd0cfff/attachment-0005.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: psu-hc-topology.xml
Type: text/xml
Size: 3181 bytes
Desc: not available
Url : http://mail.opensolaris.org/pipermail/fm-discuss/attachments/20080110/bfd0cfff/attachment-0006.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: chip-hc-topology.xml
Type: text/xml
Size: 8396 bytes
Desc: not available
Url : http://mail.opensolaris.org/pipermail/fm-discuss/attachments/20080110/bfd0cfff/attachment-0007.xml