[indiana-discuss] System freeze in snv_106
Brian Ruthven - Sun UK
Brian.Ruthven at Sun.COM
Thu Feb 5 06:44:46 PST 2009
Hi Chris,
If you are talking about the delay before the clock thread declares that
the system has hung, then yes, this is tunable. The variable is
snoop_interval, and its value is expressed in microseconds. The default
value is 50000000 (50 seconds). See
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/conf/param.c
for the definitions of SNOOP_INTERVAL_DEFAULT and snoop_interval.
For example, put the following lines in /etc/system and reboot:
set snooping=1
set snoop_interval = 10000000
This will set the deadman timer to 10 seconds and enable the deadman
checking code in clock.c. I would not advise setting either variable on
a live system using mdb.
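Since the value is in microseconds, it is easy to drop or add a zero. As a rough sketch (the function name and constants below are hypothetical helpers for illustration, not part of Solaris), the two /etc/system lines can be generated from a timeout given in seconds:

```python
MICROSECONDS_PER_SECOND = 1_000_000
MIN_TIMEOUT_SECONDS = 1  # minimum value mentioned in this post


def etc_system_lines(timeout_seconds):
    """Return the /etc/system lines that enable the deadman timer.

    Hypothetical helper: it only formats text and does not touch
    /etc/system itself.
    """
    if timeout_seconds < MIN_TIMEOUT_SECONDS:
        raise ValueError("snoop_interval must be at least 1 second")
    interval_us = round(timeout_seconds * MICROSECONDS_PER_SECOND)
    return [
        "set snooping=1",
        f"set snoop_interval={interval_us}",
    ]


if __name__ == "__main__":
    # Print the lines for a 10 second deadman timeout.
    for line in etc_system_lines(10):
        print(line)
```

Review the printed lines before adding them to /etc/system by hand; the sketch deliberately does not edit the file.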
The value is a balance between getting the system back quickly and
accidentally tripping the deadman timer during a period of inactivity
caused by hardware intervention (e.g. repeated correction of a system bus
event by the hardware). You can set it to whatever you want (minimum value
is 1 second) for debugging purposes, but watch out for false hits if you
are too aggressive with the setting.
Also bear in mind that with a large system which takes 15 minutes to
reboot anyway, 50 seconds is not a lot :-)
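To make the trade-off concrete, here is a toy model of the mechanism described in the quoted text below: a high-priority timer fires periodically and checks whether the clock has moved. This is only a sketch under stated assumptions, not the code in clock.c; the class name, the method name, and the idea of expressing the limit as a count of stalled checks are all hypothetical.

```python
class DeadmanSketch:
    """Toy model of a deadman check (hypothetical, not the Solaris code)."""

    def __init__(self, max_stalled_checks):
        # Number of consecutive checks with no clock progress that we
        # tolerate before declaring the system hung (assumed parameter).
        self.max_stalled_checks = max_stalled_checks
        self.last_seen_tick = None
        self.stalled_checks = 0

    def check(self, current_tick):
        """Call once per timer firing; return True if the system looks hung."""
        if current_tick != self.last_seen_tick:
            # The clock progressed, so reset the stall counter.
            self.last_seen_tick = current_tick
            self.stalled_checks = 0
            return False
        # The clock did not move since the previous check.
        self.stalled_checks += 1
        return self.stalled_checks >= self.max_stalled_checks


if __name__ == "__main__":
    deadman = DeadmanSketch(max_stalled_checks=3)
    # The clock advances twice, then stalls at tick 2.
    for tick in [1, 2, 2, 2, 2]:
        if deadman.check(tick):
            print(f"clock stuck at tick {tick}: would panic here")
```

A smaller limit catches a real hang sooner, but a short stall (such as the hardware intervention mentioned above) is then more likely to be mistaken for one.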
Hope that helps,
Brian
Chris Ridd wrote:
>>>> Eric Saxe has an in-depth example at http://blogs.sun.com/esaxe/entry/debugging_solaris_scheduling_problems_and
>>>>
>>> Nothing detailed then? :-)
>>>
>> basically, a very-high priority timer is set up, and all that's done
>> when it fires is check whether the system clock (which is driven by
>> a lower-priority interrupt) has progressed. If the clock does not
>> progress for a given number of times the timer fires, we assume the
>> machine is hung and panic the machine.
>>
>
> Is the given number (50?) tunable at all? It feels like it might make
> sense to have smaller values on a server (non-responsiveness means the
> company is losing money) and higher values on a workstation (non-
> responsiveness means I go and have a cup of tea). Different values,
> anyway.
>
--
Brian Ruthven Sun Microsystems UK
Solaris Revenue Product Engineering Tel: +44 (0)1252 422 312
Sparc House, Guillemont Park, Camberley, GU17 9QG