[mdb-discuss] Re: [dtrace-discuss] examine dtrace behaviour with mdb
Jon Haslam
Jonathan.Haslam at Sun.COM
Thu Feb 22 01:36:26 PST 2007
Hi Peter,
> Hi!
>
> I want to see how the syscall instrumentation works at the assembly
> level, similar to this:
You're looking in the wrong place - you need to look at how the
syscall provider works for that. DTrace instruments system calls
by (basically) modifying the function pointer in the sysent table.
For example, with a write(2) call, instead of calling the write() function
directly via the sysent table, we first modify the original entry
to call into the dtrace_systrace_syscall() function. From there
we then call dtrace_probe() (which is the centre of the universe
w.r.t. DTrace) and also carry out the original call itself (which is
stored in the systrace_sysent array). You can follow along by referencing
these two source files:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/sysent.c
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/dtrace/systrace.c
Here's an example of how the write(2) system call is instrumented:
1) The sysent table entry before instrumentation:
> sysent::array struct sysent 10 | ::print struct sysent
[chop]
{
sy_narg = '\003'
sy_flags = 0x2
sy_call = 0
sy_lock = 0
sy_callc = write
}
2) Execute `dtrace -n syscall::write:entry`
3) The sysent table entry when instrumented:
[chop]
{
sy_narg = '\003'
sy_flags = 0x2
sy_call = 0
sy_lock = 0
sy_callc = dtrace_systrace_syscall
}
4) Here's what the systrace_sysent entry looks like for the write()
call:
> *systrace_sysent::array struct systrace_sysent 10 | ::print -t struct systrace_sysent
[chop]
{
dtrace_id_t stsy_entry = 0xb263
dtrace_id_t stsy_return = 0
int (*)() stsy_underlying = write
}
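If it helps, the swap shown in steps 1) to 4) can be modelled in a few
lines of user-land C. This is only a rough sketch and not the code in
systrace.c: every model_* name, the table size, the three-argument
signature and the global standing in for the per-thread syscall number
are assumptions made up for illustration. Only the field names
(sy_callc, stsy_entry, stsy_return, stsy_underlying) and the probe id
0xb263 are taken from the mdb output above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-argument handler type; the real handlers differ. */
typedef long (*model_call_t)(long, long, long);

/* Cut-down stand-ins for struct sysent and struct systrace_sysent. */
struct model_sysent { model_call_t sy_callc; };
struct model_systrace_sysent {
    unsigned stsy_entry;          /* entry probe id, 0 = not enabled */
    unsigned stsy_return;         /* return probe id, 0 = not enabled */
    model_call_t stsy_underlying; /* the saved original handler */
};

#define MODEL_NSYSCALL 8   /* arbitrary size for the sketch */
#define MODEL_SYS_WRITE 4  /* slot used for write in this model */

static struct model_sysent sysent_tbl[MODEL_NSYSCALL];
static struct model_systrace_sysent systrace_tbl[MODEL_NSYSCALL];
static int model_cur_sysnum; /* assumption: stands in for per-thread state */
static int probe_fired;      /* counts calls to the fake dtrace_probe() */

/* Fake write(): just reports that all bytes were written. */
static long model_write(long fd, long buf, long len)
{
    (void)fd; (void)buf;
    return len;
}

/* Fake dtrace_probe(): the real one runs the enabled actions. */
static void model_probe(unsigned id, long a0, long a1, long a2)
{
    (void)id; (void)a0; (void)a1; (void)a2;
    probe_fired++;
}

/* Plays the role of dtrace_systrace_syscall(): probe, then real call. */
static long model_systrace_syscall(long a0, long a1, long a2)
{
    struct model_systrace_sysent *sy = &systrace_tbl[model_cur_sysnum];
    long rval;

    if (sy->stsy_entry != 0)
        model_probe(sy->stsy_entry, a0, a1, a2);
    rval = sy->stsy_underlying(a0, a1, a2);
    if (sy->stsy_return != 0)
        model_probe(sy->stsy_return, rval, rval, 0);
    return rval;
}

/* Enabling: remember the original handler, then redirect the table slot. */
static void model_enable(int sysnum, unsigned entry_id)
{
    systrace_tbl[sysnum].stsy_underlying = sysent_tbl[sysnum].sy_callc;
    systrace_tbl[sysnum].stsy_entry = entry_id;
    sysent_tbl[sysnum].sy_callc = model_systrace_syscall;
}

/* Disabling: put the original handler back into the table slot. */
static void model_disable(int sysnum)
{
    sysent_tbl[sysnum].sy_callc = systrace_tbl[sysnum].stsy_underlying;
    systrace_tbl[sysnum].stsy_entry = 0;
}

/* What the trap handler does in this model: one indirect call. */
static long model_dispatch(int sysnum, long a0, long a1, long a2)
{
    model_cur_sysnum = sysnum;
    return sysent_tbl[sysnum].sy_callc(a0, a1, a2);
}

/* Walks through steps 1) to 3); returns how many probes fired. */
static int model_demo(void)
{
    sysent_tbl[MODEL_SYS_WRITE].sy_callc = model_write;      /* step 1 */
    model_enable(MODEL_SYS_WRITE, 0xb263);                   /* step 2 */
    assert(sysent_tbl[MODEL_SYS_WRITE].sy_callc ==
        model_systrace_syscall);                             /* step 3 */
    (void)model_dispatch(MODEL_SYS_WRITE, 1, 0, 11);
    model_disable(MODEL_SYS_WRITE);
    return probe_fired;
}
```

Calling model_demo() from a main() of your own runs the whole sequence.
The thing to notice is that no instruction is patched anywhere: the
only change is one pointer in a table, which is why disassembling
write() or the user program shows nothing different.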
The example you give below is slightly confused as it uses the fbt
provider to instrument the ufs_write() entry point and then it looks
at a user-land application to inspect the instrumentation. If you want
to look at how user-land instrumentation is done then you'd need to
use the pid provider (Adam has several excellent presentations
and blog entries covering that which go into enough detail to
give you a nose bleed).
If you have it you might want to take a quick look at the DTrace
introduction chapter in McDougall, Mauro and Gregg's excellent
Performance, Observability and Debugging book (the second volume
in the second edition of Solaris Internals). It aims to
give a quick overview of how DTrace is put together. Obviously,
the DTrace user guide should be read as well.
Cheers.
Jon.
>> ufs_write::dis -n 3
>>
> ufs_write: save %sp, -0x110, %sp
> ufs_write+4: stx %i4, [%sp + 0x8bf]
> ufs_write+8: mov %i0, %i5
> ufs_write+0xc: ldx [%i0 + 0x10], %i4
>
>
>> ufs_write::dis -n 3
>>
> ufs_write: ba,a +0x19814c <0x14c95dc>
> ufs_write+4: stx %i4, [%sp + 0x8bf]
> ufs_write+8: mov %i0, %i5
> ufs_write+0xc: ldx [%i0 + 0x10], %i4
>
>
>> ufs_write+0x19814c::dis
>>
> 0x14c95b4: sethi %hi(0x1331000), %g1
> 0x14c95b8: call +0x79ebc0e8 <dtrace_probe>
> 0x14c95bc: or %g1, 0xc8, %o7
> 0x14c95c0: sethi %hi(0x4000), %o0
> 0x14c95c4: or %o0, 0x98, %o0
> 0x14c95c8: mov 0x300, %o1
> 0x14c95cc: call +0x79ebc0d4 <dtrace_probe>
> 0x14c95d0: mov %i0, %o2
> 0x14c95d4: ret
> 0x14c95d8: restore
> ---
> 0x14c95dc: save %sp, -0x110, %sp
> 0x14c95e0: sethi %hi(0x4000), %o0
> 0x14c95e4: or %o0, 0x99, %o0
> 0x14c95e8: mov %i0, %o1
> 0x14c95ec: mov %i1, %o2
> 0x14c95f0: mov %i2, %o3
> 0x14c95f4: mov %i3, %o4
> 0x14c95f8: mov %i4, %o5
> 0x14c95fc: sethi %hi(0x1331400), %g1
> 0x14c9600: call +0x79ebc0a0 <dtrace_probe>
> 0x14c9604: or %g1, 0x8c, %o7
>
> So, to examine this, I wrote a program, which makes a system call:
> #include <unistd.h>
> int main(int argc, char *argv[]) {
> write(0,"helloworld\n",11);
> return 0;
> }
>
> So, I started to examine it with mdb:
> mdb ./syscall
>
>> main:b
>> :r
>>
> mdb: stop at main
> mdb: target stopped at:
> main: save %sp, -0x68, %sp
>
>> .::dis
>>
> main: save %sp, -0x68, %sp
> main+4: st %i0, [%fp + 0x44]
> main+8: st %i1, [%fp + 0x48]
> main+0xc: sethi %hi(0x10c00), %o1
> main+0x10: or %o1, 0x90, %o1
> main+0x14: clr %o0
> main+0x18: call +0x100ac <PLT:write>
> main+0x1c: mov 0xb, %o2
> main+0x20: clr [%fp - 0x4]
> main+0x24: clr %i0
> main+0x28: ret
> main+0x2c: restore
> main+0x30: clr %i0
> main+0x34: ret
> main+0x38: restore
>
> Okay, the syscall is there, and dtrace instruments it if I turn on the
> syscall::write:entry probe.
>
> When I try to examine write itself I get the same results in the
> instrumented and non-instrumented cases (I followed the branches, and it
> is the same after that too):
>
>> main+0x100ac::dis
>>
> PLT:exit: sethi %hi(0xf000), %g1
> PLT:exit: ba,a -0x40 <PLT:>
> PLT:exit: nop
> PLT:_exit: sethi %hi(0x12000), %g1
> PLT:_exit: ba,a -0x4c <PLT:>
> PLT:_exit: nop
> PLT:write: sethi %hi(0x15000), %g1
> PLT:write: ba,a -0x58 <PLT:>
> PLT:write: nop
> PLT:_get_exit_frame_monitor: sethi %hi(0x18000), %g1
> PLT:_get_exit_frame_monitor: ba,a -0x64 <PLT:>
>
> I tried to ::step the program through the instrumentation, but when the
> probe is on, it consistently crashes at one instruction (with this, at
> some point I should run into dtrace_probe).
>
> How can I see the effect of system call instrumentation at assembly
> level? Maybe it would be easier if I could compile a static binary. I am
> using nevada build 56 on sparc.
>
> Peter
>