[dtrace-discuss] dtrace: processing aborted: Abort due to systemic unresponsiveness - again

Jon Haslam Jonathan.Haslam at Sun.COM
Fri Dec 15 13:56:27 PST 2006


Hi,

Yes, this looks to be an issue. I've had the same problem on some
dual core M2 systems and so has a colleague.

By the look of it, our high-res timer gets severely out of whack every
now and then, which causes DTrace to think that things have hung. I'll
let you know when I know more.
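
To illustrate the failure mode, here is a rough model of a deadman-style
check. It is a sketch only, not the kernel code: the function names and
the 10-second threshold are assumptions made up for the example. The
check compares the current timer reading with the last time the tracing
state was marked alive, and gives up if the gap exceeds the timeout. If
the timer reading jumps forward, the gap looks huge even though nothing
has actually stalled:

```shell
#!/bin/bash
# Hypothetical model of a deadman check; names and values are
# illustrative assumptions, not taken from the DTrace source.
# All times are in nanoseconds.
TIMEOUT_NS=10000000000   # assumed 10 s threshold

# deadman_expired NOW LAST_ALIVE
# Succeeds (exit status 0) if the gap between the two readings
# exceeds TIMEOUT_NS.
deadman_expired() {
    [ $(( $1 - $2 )) -gt "$TIMEOUT_NS" ]
}

# check NOW LAST_ALIVE
# Prints the verdict for one pair of readings.
check() {
    if deadman_expired "$1" "$2"; then
        echo "abort"
    else
        echo "ok"
    fi
}

check 2000000000 1000000000     # 1 s since the last update
check 61000000000 1000000000    # timer reading jumped ahead by 60 s
```

The second call is the interesting one on an idle box: nothing has
stalled, but a bad timer reading is indistinguishable from a long stall.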

Jon.

>Hello,
>
>We have a 2-way (4-core) Opteron server (x220M2):
>
>bash-3.00# uname -a
>SunOS e2 5.10 Generic_118855-19 i86pc i386 i86pc
>bash-3.00# psrinfo -v
>Status of virtual processor 0 as of: 12/14/2006 14:08:12
>  on-line since 12/13/2006 11:41:29.
>  The i386 processor operates at 2613 MHz,
>	and has an i387 compatible floating point processor.
>Status of virtual processor 1 as of: 12/14/2006 14:08:12
>  on-line since 12/13/2006 11:41:34.
>  The i386 processor operates at 2613 MHz,
>	and has an i387 compatible floating point processor.
>Status of virtual processor 2 as of: 12/14/2006 14:08:12
>  on-line since 12/13/2006 11:41:36.
>  The i386 processor operates at 2613 MHz,
>	and has an i387 compatible floating point processor.
>Status of virtual processor 3 as of: 12/14/2006 14:08:12
>  on-line since 12/13/2006 11:41:38.
>  The i386 processor operates at 2613 MHz,
>	and has an i387 compatible floating point processor.
>
>I was trying to catch syscalls and got an unexpected message:
>bash-3.00# dtrace -n 'syscall::: { @[execname] = count (); }'
>dtrace: description 'syscall::: ' matched 454 probes
>dtrace: processing aborted: Abort due to systemic unresponsiveness
>
>Because the server is not busy and is still being prepared for production,
>I was surprised, so I ran dtrace again while watching vmstat. The vmstat
>output is below:
>
>[...]
>
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  0  0  0  0  668  160  316  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  2  2  1  1  671  130  268  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  3  3  1  1  738  250  326  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  2  2  1  1  690  145  284  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  0  0  0  0  651  161  285  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  0  0  0  0  687  284  316  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  0  0  0  0  634  146  268  0  0 100
> 0 0 0 14546888 7392452 0 0  0  0  0  0  0  0  0  0  0  644  150  271  0  0 100
> 0 0 0 14546284 7391632 143 288 0 0 0 0  0  0  0  0  0  692  910  342  0  0 100
> 0 0 0 14459332 7312120 228 3332 0 0 0 0 0  0  0  0  0  884 2050  293  5  4 91		[1]
> 0 0 0 14459332 7312116 0 1  0  0  0  0  0  0  0  0  0  865  204  286  0  1 99
> kthr      memory            page            disk          faults      cpu
> r b w   swap  free  re  mf pi po fr de sr cd cd m2 m3   in   sy   cs us sy id
> 0 0 0 14459332 7312116 0 0  0  0  0  0  0  0  0  0  0  889  232  306  0  0 100
> 0 0 0 14459332 7312116 0 0  0  0  0  0  0  1  1  0  0  938  326  353  0  0 100
> 0 0 0 14546272 7391616 0 1  0  0  0  0  0  0  0  0  0 53402 1389 1109 4  3 93		[2]
> 0 0 0 14546272 7391616 148 362 0 0 0 0  0  1  1  1  1 1691 3243 1915  3  1 96
> 0 0 0 14545648 7390776 0 0  0  0  0  0  0  0  0  0  0  635   97  264  0  0 100
> 0 0 0 14545648 7390776 1 3  0  0  0  0  0  0  0  0  0  687  210  340  0  0 100
> 0 0 0 14545648 7390772 0 0  0  0  0  0  0  0  0  0  0  718  289  380  0  0 100
> 0 0 0 14545648 7390772 0 0  0  0  0  0  0  0  0  0  0  676  224  322  0  0 100
> 0 0 0 14545648 7390772 0 0  0  0  0  0  0  0  0  0  0  641  126  278  0  0 100
> 0 0 0 14545648 7390772 0 0  0  0  0  0  0  0  0  0  0  642  127  263  0  0 100
>[...]
>
>[1] marks when I started dtrace and [2] marks when I got the
>"unresponsiveness" message.
>
>I have read the relevant thread:
>http://www.opensolaris.org/jive/thread.jspa?messageID=15073&
>and am aware that:
>- enabling destructive actions (-w)
>or
>- tuning the deadman parameters below
>    dtrace_deadman_user
>    dtrace_deadman_interval
>    dtrace_deadman_timeout
>can be helpful. I agree that all of them are useful when a server is really
>busy, but I wouldn't expect this behaviour on an idle server!
>
>Is there any way to solve the problem without these tweaks? I would like to
>understand the nature of the problem better.
>
>Regards
>przemol