[fm-discuss] panic[cpu0]/thread=ffffff000961dc60: Unrecoverable Machine-Check Exception
Gavin Maltby
Gavin.Maltby at Sun.COM
Mon Jan 19 16:51:26 PST 2009
Hi,
Ed Kaczmarek wrote:
>
>>
>> Changing that with kmdb involves setting deferred breakpoints. We'll cheat
>> by first disabling everything and setting what we want in /etc/system:
>>
>> 1) Boot into kmdb as before (add -kd to unix line in grub). At the prompt
>> utter 'cmi_no_init/W1' then ':c' to continue. We'll boot without loading
>> any cpu module support, and that should get you booted I think (if not
>> there are bigger problems)
>
> I got big problems then...
>
> Welcome to kmdb
> kmdb: unable to determine terminal type: assuming `vt100'
> Loaded modules: [ unix krtld genunix ]
> [0]> cmi_no_init/W1
> cmi_no_init: 0 = 0x1
Good news for me - that switches off just about all my code :-)
> [0]> :c
> SunOS Release 5.11 Version snv_106 64-bit
> Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
> Use is subject to license terms.
> WARNING: Time-of-day chip unresponsive; dead batteries?
> Configuring /dev
> \
A few more tests to try
1) Is this a dual-core system? If so, could you force use of
   just a single cpu as follows and see if we boot ok:

       boot kmdb (-kd on unix line in grub)
       use_mp/W0
       :c

2) We can force the NB watchdog to be disabled when the cpu
   module is loaded (we're getting that far because we
   saw the machine check details). We have to use
   a deferred breakpoint for this:

       boot kmdb
       ::bp cpu_ms.AuthenticAMD.15`ao_ms_init
       :c

   When the breakpoint hits on the boot cpu you'll return to kmdb. Now

       ao_nb_watchdog_policy/W1
       :z
       :c

   That sets policy AO_NB_WDOG_DISABLE, which will unconditionally disable
   the watchdog. The :z clears breakpoints so we don't hit them on other cpus.

3) If the BIOS is enabling the watchdog, Solaris does not touch it by default.
   We can force Solaris to apply its chosen watchdog rate (longest possible
   timeout) with AO_NB_WDOG_ENABLE_FORCE_RATE:

       boot kmdb
       ::bp cpu_ms.AuthenticAMD.15`ao_ms_init
       :c

   When the breakpoint hits on the boot cpu you'll return to kmdb. Now

       ao_nb_watchdog_policy/W3
       :z
       :c

4) Now here's a real stab in the dark. If your BIOS offers an option
   to present your SATA disks as AHCI devices (rather than the old
   and busted IDE mode), make sure that is set. I guess ata may still
   be involved if you have an IDE DVD drive, but we may get further.
   I'm not sure whether the path to the disk devices will change if you
   do this, in which case the machine may fail to boot for other reasons!
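
If one of the kmdb experiments above helps, the same setting could
presumably be made persistent in /etc/system instead of being typed at
every boot. This is only a sketch: the watchdog lines reuse the
module-qualified syntax of the ao_nb_watchdog_policy=0 line quoted further
down, with the values used in tests 2 and 3, and the use_mp line assumes
that variable can be set from /etc/system on this build, which has not
been checked here. Enable only one watchdog policy line at a time.

```
* Test 1 equivalent: run on a single cpu
* (assumption: use_mp is settable from /etc/system on this build)
set use_mp=0

* Test 2 equivalent: policy 1, AO_NB_WDOG_DISABLE
set cpu_ms\.AuthenticAMD\.15:ao_nb_watchdog_policy=1

* Test 3 equivalent: policy 3, AO_NB_WDOG_ENABLE_FORCE_RATE
* (left commented out; use it instead of the policy=1 line, not with it)
* set cpu_ms\.AuthenticAMD\.15:ao_nb_watchdog_policy=3
```

Lines starting with '*' are comments in /etc/system. If the machine will
not come up far enough to edit the file, the kmdb route above is the only
option.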
Thanks
Gavin
>
>
> I searched thru BIOS screens for any mention of any watchdog timer
> settings. None.
>
>
>>
>> 2) Append the following to /etc/system:
>>
>> set cpu_ms\.AuthenticAMD\.15:ao_nb_watchdog_policy=0
>>
>> 3) Reboot normally
>>
>> That will leave the watchdog as the BIOS had it, and I suspect it's
>> off by default, while leaving other functionality operational.
>>
>> I think ata has necessitated this workaround on one or two other
>> motherboards before now. I don't know the true root cause.
>>
>> Gavin
>