[fm-discuss] Proposal: libtopo enumeration of power supplies and fans via IPMI
Scott Davenport
Scott.Davenport at Sun.COM
Thu Jan 10 13:26:41 PST 2008
On Mon, 2008-01-07 at 12:49 -0800, Rob Johnston wrote:
> Hello and happy new year!
>
> What follows is a proposal that Eric Schrock and I have been working on
> to define a mechanism to enumerate power supplies and fans on platforms
> that support IPMI. This is a first step in the larger Sensor
> Abstraction Layer project.
>
> Comments and questions are encouraged.
>
> --------------------------------------------------------------------
>
> 1. DESCRIPTION
>
> The Solaris FMA framework is designed to diagnose failures in system
> components. Currently these components are discovered by probing the
> hardware visible to Solaris via standard OS paths (I/O, CPU, DIMMs,
> etc). However, there exists a set of components that are crucial to the
> ongoing health of the system that have no connection visible to Solaris.
> The most common components, and the most likely to encounter failures,
> are power supplies and fans.
>
> On low-end hardware, these components are often not observable, and it
> is the responsibility of the user to manually detect component failure,
> or run custom (Windows) software to observe the system. Higher-end
> systems (such as the x4000 series shipped by Sun) have a service
> processor that manages the physical components and sensors in the
> system. Some systems (such as SPARC) have a custom communications
> mechanism between the OS and the SP, but the industry standard is IPMI
> (Intelligent Platform Management Interface). Solaris already has the
> ability to communicate with the SP over the baseboard management
> controller (/dev/bmc), and a basic library (libipmi) already exists.
>
> Integrating support for power supplies and fans within FMA is an
> important step in bringing all hardware topology enumeration and
> diagnosis under a single infrastructure. Without this ability, users
> must manage a separate OS instance (on the SP) with different
> configuration, separate management, and separate notification
> mechanisms.
>
> This proposal adds basic enumeration support for power supplies and fans
> on platforms supporting IPMI. It does not include the ability to
> diagnose PSU or fan failures, nor does it provide a way to read
> environmental sensors (fan speed, etc) for these components. This
> functionality will be provided by a future project.
>
>
> 2. TOPOLOGY CHANGES
>
> On x86 systems, the root of the hc topology tree is hc:///motherboard=0
> (though bay nodes can exist at the root level as well). It doesn't make
> sense to have physical components like fans underneath the motherboard,
> nor does it make sense to have them directly at the root level. Future
> projects will add sensors that monitor the chassis itself, and the
> components are contained within the chassis, so a new root hc node is
> created:
>
> hc:///chassis=0
>
> There is only ever a single chassis. Within IPMI, fans and PSUs can be
> grouped together into domains that represent a logical unit (typically a
> FRU). While uncommon for power supplies, this is quite common for fan
> modules or fan trays that contain multiple fans. Therefore a
> multi-level topology will be created of the form:
>
> hc:///chassis=0/psu=0
> hc:///chassis=0/psu=1
> hc:///chassis=0/powermodule=0
> hc:///chassis=0/powermodule=0/psu=0
> hc:///chassis=0/powermodule=0/psu=1
>
> hc:///chassis=0/fan=0
> hc:///chassis=0/fan=1
> hc:///chassis=0/fanmodule=0
> hc:///chassis=0/fanmodule=0/fan=0
> hc:///chassis=0/fanmodule=0/fan=1
Rob,
One other question here: what will create the 'chassis' node? Will it
be driven by the XML map file? Could another enumerator create it and
then have ipmi enumerate items beneath it?
Thanks,
-scott
> The IPMI components are technically 'cooling' elements, not fans. For
> the systems which currently support Solaris and IPMI, only fans are
> supported. In the future, we may be able to detect non-fan cooling
> elements by examining the set of associated sensors (such as a
> tachometer) and inferring the type of cooling element.
>
> With IPMI, we know all components, even if a component is not currently
> present. To allow management software to detect empty component slots,
> the FMRIs will always be enumerated, but the is_present method will
> return false if the component is not currently present.
>
>
> 3. DYNAMIC ENUMERATION
>
> A new common libtopo module, ipmi, will be provided that will do dynamic
> enumeration of IPMI components. While currently only supported on x86
> systems, any system supporting IPMI should work, so the module will be
> present on all architectures. If future SPARC platforms support IPMI
> over /dev/bmc, then everything should "just work".
>
> IPMI has the unusual property that the world is defined solely by
> 'sensor data records' (which may be sensors, FRUs, etc). Instead
> of iterating over entities (the IPMI term for components), one instead
> iterates over all SDR records and infers an entity's existence based on
> the sensor records that refer to it. The logic to handle this will be
> kept within libipmi, and the ipmi enumerator will iterate over all
> discovered entities for any 'power domain', 'power supply', 'cooling
> domain', or 'cooling unit' entities. Using IPMI entity association
> records, libipmi will have already organized these into the appropriate
> hierarchy.
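
A minimal sketch of the "infer entities from the records that refer to
them" idea described above. Everything here is hypothetical: the types,
the function name, and the enum values are placeholders for illustration,
not libipmi interfaces and not the real IPMI entity ID codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder entity types; NOT the numeric IDs from the IPMI spec. */
typedef enum {
	ENT_OTHER = 0,
	ENT_POWER_DOMAIN,
	ENT_POWER_SUPPLY,
	ENT_COOLING_DOMAIN,
	ENT_COOLING_UNIT
} ent_type_t;

/* The part of a record that names the entity it belongs to. */
typedef struct sdr_ref {
	ent_type_t sr_type;
	uint8_t sr_instance;
} sdr_ref_t;

/* An entity whose existence was inferred from one or more records. */
typedef struct entity {
	ent_type_t e_type;
	uint8_t e_instance;
	int e_nrecords;		/* records that referred to this entity */
} entity_t;

/*
 * Walk every record and collect the distinct power/cooling entities
 * referenced. Returns the number of entities written to 'out', which
 * never exceeds 'max'.
 */
size_t
infer_entities(const sdr_ref_t *recs, size_t nrecs, entity_t *out, size_t max)
{
	size_t n = 0;

	assert(recs != NULL || nrecs == 0);
	assert(out != NULL || max == 0);

	for (size_t i = 0; i < nrecs; i++) {
		size_t j;

		if (recs[i].sr_type == ENT_OTHER)
			continue;	/* not a power or cooling entity */

		/* Have we already seen this (type, instance) pair? */
		for (j = 0; j < n; j++) {
			if (out[j].e_type == recs[i].sr_type &&
			    out[j].e_instance == recs[i].sr_instance)
				break;
		}
		if (j == n) {
			if (n == max)
				continue;	/* no room left; drop it */
			out[n].e_type = recs[i].sr_type;
			out[n].e_instance = recs[i].sr_instance;
			out[n].e_nrecords = 0;
			n++;
		}
		out[j].e_nrecords++;
	}
	return (n);
}
```

The sketch stops at discovery. Building the parent/child hierarchy from
entity association records would be a second pass over the same records.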
>
> The default label for each entity will be based on the entity id and the
> entity instance number (which is globally unique). These labels may or
> may not correspond to the labels on the chassis, but under a correct
> IPMI implementation they will be roughly correct, and there will be a
> means to override them on a per-platform basis (see below). For
> components with a FRU locator record, it may be possible to assign a
> label matching the FRU name, such as 'ft0.fm1.fru', though it's unclear
> if this is any better (the naming is entirely up to the SP, and the
> '.fru' extension is just a convention used by the current SP
> firmware).
>
> Each component that is directly under the chassis will be assigned a FRU
> matching its resource. Components within an association will default to
> the FRU of their parent, unless they have associated FRU locator
> records, in which case they will have a distinct FRU matching their
> resource.
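
The FRU defaulting rule above can be written down as a small function.
This is only a sketch under my reading of the paragraph; comp_t and
comp_fru() are made-up names, and FMRIs are shown as plain strings for
brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of an enumerated component. */
typedef struct comp {
	const char *c_resource;		/* this component's own FMRI */
	const struct comp *c_parent;	/* enclosing module; NULL if the */
					/* component sits under the chassis */
	bool c_has_fru_locator;		/* has its own FRU locator record */
} comp_t;

/*
 * A component directly under the chassis, or one with its own FRU
 * locator record, is its own FRU. Anything else inherits the FRU of
 * its parent.
 */
const char *
comp_fru(const comp_t *c)
{
	assert(c != NULL && c->c_resource != NULL);

	if (c->c_parent == NULL || c->c_has_fru_locator)
		return (c->c_resource);
	return (comp_fru(c->c_parent));
}
```

For a fan module at hc:///chassis=0/fanmodule=0 holding two fans, a fan
without a locator record reports the module as its FRU, while a fan with
one reports itself.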
>
> The sensors associated with the entity will be used to determine
> presence as described in the IPMI specification.
>
> 4. STATIC ENUMERATION
>
> It would be nice if dynamic enumeration were enough to model any system
> supporting IPMI. Unfortunately, as is the case with most platform
> technologies (such as SMBIOS), complete support for enumeration is
> hampered by limitations of the specification as well as the
> implementation. With a proper implementation of the IPMI spec, it is
> possible to enumerate all the components, though attaching semantic
> meaning to them (labels, failure sensors, etc) is only possible in some
> cases.
>
> On top of this, most platforms have an IPMI implementation that leaves
> something to be desired. A common problem is the lack of entity
> association records, so fans that should be part of a logical module
> (even if correctly represented via SDR records) are not associated with
> one another. Other problems include presence sensors that reference
> incorrect entities, missing or incorrect FRU locator records, etc.
>
> To compensate for both of these problems, libtopo will support
> dynamic enumeration, static enumeration, and static assignment of
> sensors and properties to dynamically discovered entities.
>
>
> 5. LIBIPMI DETAILS
>
> As part of this work, libipmi will be expanded in several different
> capacities, mostly related to parsing SDR records and representing
> entities.
>
> The SDR infrastructure will be expanded to support all possible SDR
> record types (compact sensor, full sensor, entity association, etc).
> The code will also be simplified to separate out the SDR name (when
> available) from the record, since constructing this value is non-trivial
> and should not be left to the consumer.
>
> New interfaces for gathering sensor readings based on a compact or full
> SDR record will be introduced. This consists mainly of a large number
> of #defines, code to transform readings based on the linearization
> function, and parsing the sensor units. Some of this infrastructure
> will not be fully used until future sensor work is complete, but enough
> of it is needed at this point (namely parsing sensor-specific state
> masks) to warrant its inclusion as part of this project.
>
> Based on this new infrastructure, libipmi will be enhanced to have a
> native notion of entities, even though these do not exist as such in the IPMI
> specification. The library will scan the SDR records, detect referenced
> entities, group sensors with associated entities, and parse entity
> association records to create a hierarchy of entities. This will also
> include a function to detect entity presence.
>
> This isolates the details of IPMI entities (of which there are many) to
> within libipmi, simplifying the topo enumerator and allowing other
> software to be developed on top of it. One of these pieces of software
> will be a private utility under /usr/lib/fm, 'ipmitopo', which will
> display all IPMI entities (id, type, presence) and sensors associated
> with each entity (reading, state, type, etc). This tool is not designed
> to replace the open source 'ipmitool' and exists solely to debug the
> IPMI topo implementation by leveraging the same code used by libtopo.
>
>
> 6. LIBTOPO ENHANCEMENTS
>
> To make the implementation of this project possible, a handful of
> extensions to both the libtopo enumerator module API and XML schema are
> necessary.
>
> Currently it is not possible to register module methods on nodes that
> are statically enumerated via XML map files. Typically, node methods
> are registered onto a node by the enumerator module after the node is
> bound to the topology. However, since statically enumerated nodes
> aren't created by the enumerator module, this registration doesn't occur.
>
> While there will be cases where we will be forced to statically define
> PSU and fan topologies via XML, these nodes still need to support the
> node methods that are implemented by the ipmi enumerator module. In
> order to allow these methods to be registered on statically defined
> nodes, the topo_modops_t struct will be extended with a new operation
> (tmo_meth_reg) as shown below:
>
>     typedef int topo_meth_reg_f(topo_mod_t *, tnode_t *);
>
>     typedef struct topo_modops {
>         topo_enum_f *tmo_enum;          /* enumeration op */
>         topo_release_f *tmo_release;    /* resource release op */
>         topo_meth_reg_f *tmo_meth_reg;  /* method registration op */
>     } topo_modops_t;
>
> The tmo_meth_reg operation will be optional. Enumerator modules
> which implement this operation will register the appropriate set of
> methods on the topo node that is passed in.
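
One way the framework side of an optional tmo_meth_reg could look is
sketched below. The sketch_* types are reduced stand-ins I made up so
the fragment compiles on its own; they are not the libtopo definitions,
and static_node_register() is a hypothetical name, not a proposed
interface.

```c
#include <assert.h>
#include <stddef.h>

typedef struct sketch_mod sketch_mod_t;		/* stands in for topo_mod_t */

typedef struct sketch_node {			/* stands in for tnode_t */
	int sn_nmethods;			/* methods registered so far */
} sketch_node_t;

typedef int sketch_meth_reg_f(sketch_mod_t *, sketch_node_t *);

typedef struct sketch_modops {
	sketch_meth_reg_f *smo_meth_reg;	/* mirrors tmo_meth_reg; */
						/* NULL if not implemented */
} sketch_modops_t;

/*
 * Called for a node created from an XML map file. A module that does
 * not supply the op is left alone, which is what keeps existing
 * modules working unchanged.
 */
int
static_node_register(const sketch_modops_t *ops, sketch_mod_t *mod,
    sketch_node_t *node)
{
	assert(ops != NULL && node != NULL);

	if (ops->smo_meth_reg == NULL)
		return (0);
	return (ops->smo_meth_reg(mod, node));
}

/* Example op, such as an ipmi-like module might supply. */
int
example_meth_reg(sketch_mod_t *mod, sketch_node_t *node)
{
	(void) mod;
	node->sn_nmethods++;	/* pretend one method was registered */
	return (0);
}
```

Treating a NULL op as success rather than an error is an assumption on
my part; the proposal only says the operation is optional.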
>
> To provide a connection between this new operation and nodes that are
> statically defined in XML, the syntax of the <node> element will be
> extended to include a new optional "mod" attribute. The value of this
> attribute should be set to the name of an enumerator module, whose methods
> should be registered on that node. Below is an example usage of this
> new attribute:
>
>     <range name='fan' min='0' max='2'>
>         <node instance='0' mod='ipmi'>
>             . . .
>         </node>
>     </range>
>
> Additionally, the syntax of the <range> element will also be extended to
> allow a new "set" attribute. The intention is to allow for conditional
> enumeration of a range of nodes based on the platform type. This is
> analogous to the conditional specification of properties which is
> currently supported via the <propset> element. Below is an example
> usage of this new attribute:
>
>     <range name='fanmodule' min='0' max='4'
>         set='Sun-Fire-X4500|Sun-Fire-X4540'>
>         . . .
>     </range>
>
> In the example above, the <range> element (and all children elements
> within) will only be parsed and evaluated if the machine's platform type
> matches one of the platforms specified by the "set" attribute's value.
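
For illustration, the "set" check could be as simple as the function
below. It assumes the value is a '|'-separated list and that a platform
matches only on an exact, case-sensitive name; neither detail is spelled
out in the proposal, and platform_in_set() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if 'platform' is exactly equal to one of the
 * '|'-separated alternatives in 'set'.
 */
bool
platform_in_set(const char *set, const char *platform)
{
	assert(set != NULL && platform != NULL);

	size_t plen = strlen(platform);
	const char *p = set;

	while (*p != '\0') {
		size_t len = strcspn(p, "|");	/* length of this alternative */

		if (len == plen && strncmp(p, platform, len) == 0)
			return (true);

		p += len;
		if (*p == '|')
			p++;			/* step over the separator */
	}
	return (false);
}
```

With set='Sun-Fire-X4500|Sun-Fire-X4540', both platform names match and
a prefix such as 'Sun-Fire-X45' does not.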
>
> All of the above extensions will be backwards compatible with any
> existing map files and enumerator modules.
>
>
> 7. FUTURE WORK
>
> This proposal lays the groundwork for a variety of future work under the
> auspices of the FMA Sensor Framework.
>
> The next step will be to include fan and PSU diagnosis. This requires
> representing failure sensors within libtopo using the facility nodes
> proposed as part of the sensor framework. These sensors are then read
> by a sensor-transport module that has a 1:1 correspondence between
> ereports and faults.
>
> This will serve as a proof of concept for facility nodes and prepare
> the way for the larger sensor and alert framework, while providing the
> greatest immediate benefit. Future work will include representing
> analog sensors in libtopo, developing an environmental monitor,
> detecting fan and PSU hotplug, and creating a persistent alert
> framework.
>
>
> 8. REFERENCES
>
> "IPMI v2.0 rev. 1.0 specification markup for IPMI v2.0/v1.5 errata
> revision 3"
>
> http://www.intel.com/design/servers/ipmi/pdf/IPMIv2_0_rev1_0_E3_markup.pdf
>
> Sensor Abstraction Layer OpenSolaris Project
>
> http://www.opensolaris.org/os/project/sensors/
>
> Libtopo documentation: FMD Programmer's Reference, Chapter 9
>
> http://www.opensolaris.org/os/community/fm/FMDPRM.pdf
>