[tesla-dev] Project Proposal - FB-DIMM Idle Power Saving on Intel 5000P/5000V/5000Z MCH Chipsets
Liu, Jiang
jiang.liu at intel.com
Tue Mar 18 07:07:12 PDT 2008
Stan Studzinski <mailto:stan.studzinski at sun.com> wrote:
> Liu, Jiang wrote:
>> Hi Stan,
>> Currently the Intel 5000 series MCH supports two types of power
>> saving: putting memory into a low power state and shutting down unused
>> memory. The first type, putting memory into a low power state, is
>> based on the memory controller's thermal protection mechanism and
>> works as follows. When the entire system is predicted to be idle for
>> a relatively long time, the driver will trigger the hardware thermal
> Hi Liu,
>
> I am just curious: is there any rough estimate of how much energy LPS
> would save us over time, let's say on a 4-CPU system with 8 GB of
> memory when the system is idle about 50% of the time?
> When you say idle, do you mean that no CPU accesses any memory? We
> do have some activity on idle systems, like clock interrupts (100 per
> sec) etc., which will execute code from an ISR. Would that be affected?
Yes, idle means that no CPU accesses any memory. The periodic clock
interrupt is an obstacle to the current FB-DIMM power saving mechanism
and will reduce the power saving effect. But I heard there is some
ongoing work on a tickless timer and a power-state-aware scheduler. When
these enhancements are ready, the impact of the periodic clock interrupt
may be alleviated or removed.
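
To illustrate why the periodic tick matters, here is a minimal sketch of
how one might estimate the time memory could spend throttled. It is only
a model under stated assumptions: the function name, the ISR duration and
the minimum-idle threshold (min_idle_us) are hypothetical and do not come
from the MCH documentation; only the tick rate and the "several
microseconds" transition time are taken from this thread.

```python
def lps_residency(idle_fraction, ticks_per_sec, isr_us, transition_us, min_idle_us):
    """Estimate the fraction of wall-clock time memory could stay throttled.

    All parameters are illustrative assumptions:
      idle_fraction  - share of time the system is otherwise idle (0..1)
      ticks_per_sec  - periodic clock interrupts per second
      isr_us         - time spent in the tick ISR, in microseconds
      transition_us  - time to enter or leave the throttled state
      min_idle_us    - hypothetical threshold: the driver only throttles
                       when the predicted idle stretch is at least this long
    """
    period_us = 1_000_000 / ticks_per_sec
    # Longest uninterrupted idle stretch between two ticks.
    gap_us = period_us - isr_us
    if gap_us < min_idle_us:
        # The tick fires before idle ever looks "long enough".
        return 0.0
    # Each gap pays for one entry and one exit transition.
    throttled_us = max(0.0, gap_us - 2 * transition_us)
    return idle_fraction * throttled_us / period_us


def energy_saved_wh(residency, full_power_w, lps_power_w, hours):
    """Energy saved in watt-hours; both power figures must be measured."""
    return residency * (full_power_w - lps_power_w) * hours
```

With a tickless timer, ticks_per_sec would drop during idle and the gap
between wakeups would grow, so the threshold is met more often.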
>> protection mechanism to throttle memory access requests, which has
>> an effect similar to reducing the frequency. When the system is in the
>> thermal protection state, all memory contents are preserved and the
>> system can access all memory, but with reduced bandwidth and power
>> consumption.
>> When
>>
> OK, this "transition" to LPS is done by hardware.
> We can execute code out of RAM. That would allow us to have a driver
> which would turn memory back on (via the memory controller) to operate
> at full speed.
>> the system wakes up from idle, the driver will restore the normal
>> memory operation mode. The transition time is about several
>> microseconds. This kind of memory power saving operates on the whole
>> memory instead of part of
>>
> Since LPS applies to the whole memory, it would probably not be that
> practical to have this with NUMA or xVM.
Yes, it's hard to support xVM with this power saving mechanism. And it's
only supported on Intel 5000 series MCHs, which don't support a NUMA
architecture.
>> it, and is a feature of the Intel 5000 series MCH. Future memory
>> controllers may support this type of power saving automatically in
>> hardware, without software assistance.
>>
> That would be nice!
>> The second type of memory power saving mechanism is shutting
>> down unused memory components. All memory contents are lost
>> after shutdown, so it needs the OS's help to free up all memory in a
>> specific range before shutting down the corresponding memory
>> components. Enabling this type of power saving requires much more
>> work. The OS needs to monitor memory usage and provide a mechanism
>> to free or move memory in a specific range; this ability may also be
>> needed by memory hotplug.
>>
> I agree, the above would require changes to the memory allocator, some
> of which already exist on SPARC platforms to support DR. For xVM we
> would have to come up with a framework allowing a guest OS to free some
> of its memory and pass this info to xVM.
Aha, SPARC already supports such a feature, great news! Could you give me
info/docs on DR? I know little about SPARC.
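
As a rough illustration of the check the second scheme needs before a
memory component is shut down, here is a minimal sketch. It assumes the
OS can enumerate allocated page frame numbers; the function names and the
page-set representation are hypothetical and are not the Solaris DR
interface.

```python
def pages_blocking_shutdown(allocated_pages, start, end):
    """Return the allocated page numbers inside [start, end), sorted.

    allocated_pages is an illustrative stand-in for whatever bookkeeping
    the memory allocator keeps; these pages must be freed or moved before
    the range can be powered off, because contents are lost on shutdown.
    """
    return sorted(p for p in allocated_pages if start <= p < end)


def can_power_off(allocated_pages, start, end):
    """A range is safe to shut down only if nothing in it is in use."""
    return not pages_blocking_shutdown(allocated_pages, start, end)
```

For example, with pages {3, 10, 42} allocated, the range [0, 8) is blocked
by page 3, while [16, 32) could be shut down immediately.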
>
>> It could provide full power management of memory components
>> conforming to the Solaris PM system by adopting this type of memory
>> power saving mechanism. It will become more important and useful in
>> NUMA systems, where some nodes with CPU and memory controller may be
>> entirely shut down when the system is under low load. To sum up, the
>> first type of memory power saving mechanism is much easier to
>> implement and takes effect when the entire system is idle. The second
>> type of memory power saving mechanism could provide us a cpupm-like
>> "mempm" with full power management capability in current and future
>> systems, but with much more work.
>>
> It would be interesting to try the first scheme to see how it works in
> practice.
I'll try to implement a prototype and do some basic tests for the first
scheme.
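
The control flow of the first scheme, as described in this thread, could
be sketched as below. This is not the prototype itself: the class, the
write_throttle callback (standing in for the MCH register access) and
the min_idle_us threshold are all hypothetical names chosen for
illustration.

```python
class IdleThrottle:
    """Toggle memory throttling around predicted-long idle periods."""

    def __init__(self, write_throttle, min_idle_us):
        # write_throttle: hypothetical callable taking True (throttle on)
        # or False (restore normal operation).
        self._write = write_throttle
        self._min_idle_us = min_idle_us
        self.throttled = False

    def on_idle_enter(self, predicted_idle_us):
        # Only throttle when the whole system is expected to stay idle
        # long enough to be worth the transition.
        if not self.throttled and predicted_idle_us >= self._min_idle_us:
            self._write(True)
            self.throttled = True

    def on_idle_exit(self):
        # Restore normal memory operation on wakeup; contents were
        # preserved, only bandwidth was reduced.
        if self.throttled:
            self._write(False)
            self.throttled = False
```

Keeping the throttled flag in the driver avoids redundant register writes
when idle periods are too short to qualify.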
>
> Stan
>>
>> Stan Studzinski <mailto:stan.studzinski at sun.com> wrote:
>>
>>> On Wed, 12 Mar 2008, Liu, Jiang wrote:
>>>
>>>
>>>> All,
>>>>
>>>> I would like to propose a project to enable FB-DIMM idle power
>>>> saving on Intel 5000P/5000V/5000Z MCH chipsets.
>>>> Intel 5000 series MCH chipsets have a thermal protection
>>>> mechanism which can be used to reduce FB-DIMM power consumption
>>>> when system is idle. This power saving mechanism has been proven
>>>> to be very useful in real IDC. On the other hand, platforms
>>>> based on Intel 5000 series MCH will continue to ship until the
>>>> end of 2009 plus 4-5 years' lifetime. So enabling FB-DIMM
>>>> power saving feature can make systems based on 5000 series MCH
>>>> more power efficient and save money for customers.
>>>> Any thoughts or suggestions?
>>>
>>> Hi Gerry,
>>>
>>> From your proposal it sounds like FB-DIMM power consumption can
>>> be reduced when the system is idle.
>>> How does this work? From your previous emails, this feature applies
>>> to the whole memory, which means that one cannot turn off specified
>>> chunks of memory. Is this correct? Does this mean that for memory
>>> in a "lower power state" (or LPS) all data is preserved? What
>>> I understand so far is that if the system tries to access LPS memory,
>>> the software has to enable memory (via a register) and then memory
>>> can be accessed again.
>>>
>>> When the system is idle and the memory is off we can't execute code
>>> from RAM?
>>>
>>> Stan
>>>
>>>>
>>>> Liu Jiang (Gerry)
>>>> Senior Software Engineer
>>>> OpenSolaris, OTC
>>>> Tel: (8610)82611515-1643
>>>> iNet: 8758-1643