[hpcdev-discuss] Static DTrace Probes for Grid Engine
Daniel Templeton
Dan.Templeton at Sun.COM
Mon Aug 20 17:34:11 PDT 2007
All,
I've been working on replacing the rmon facility in Grid Engine 6.1 with
user-land static DTrace probes for systems running Solaris Nevada. It
actually works really well, and I suspect it's slightly faster than the
unmodified 6.1 binaries, as the DTrace probes use some tricks to be
faster than the regular "if" statements that rmon uses. Because I'm
using user-land static probes, though, it only works with Solaris Nevada
builds and distros. It will not work with Solaris 10.
With my changes, there's a new script in the util directory called
dld.sh. dld.sh lets you attach to a running qmaster, scheduler,
execution daemon, or shadow daemon and print debugging output. There is
no need to restart the daemon. There is no penalty to the daemon when
the debugging output is not being printed. All that good DTrace stuff.
The old method of restarting the daemon also works, but the process is a
little different. See the header of the dld.sh script for details. The
debug output from dld.sh is the same as the debug output you get today.
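
To illustrate what attaching to a running daemon involves, here is a
minimal sketch that only prints a dtrace command line for a given PID.
It is not the contents of dld.sh. The provider name "sge", the probe
argument layout, and the short names other than "qmaster" are
hypothetical, chosen only for this example.

```shell
# Sketch: build (but do not run) a dtrace command that attaches to a
# running Grid Engine daemon. Provider "sge" is a hypothetical name.

daemon_binary() {
    # Map a short daemon name to the Grid Engine binary name.
    case "$1" in
        qmaster)   echo sge_qmaster ;;
        scheduler) echo sge_schedd ;;
        execd)     echo sge_execd ;;
        shadowd)   echo sge_shadowd ;;
        *)         return 1 ;;
    esac
}

attach_command() {
    # $1 = short daemon name, $2 = PID of the already running daemon.
    # Unknown daemon names are rejected before anything is printed.
    daemon_binary "$1" >/dev/null || return 1
    # $target expands inside dtrace to the PID given with -p, so the
    # daemon keeps running and is never restarted.
    printf "dtrace -q -p %s -n '%s'\n" "$2" \
        'sge$target::: { printf("%s\n", copyinstr(arg0)); }'
}
```

A call such as attach_command qmaster "$(pgrep -x sge_qmaster)" would
print the command for the running qmaster; running that command
requires root.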
Attached is a diff patch for 6.1 to get my changes. Most of the changes
are in the supporting scripts and make files. Very little source code
changed. After you have applied the diff patch, here's the process to
see the new debug output:
1) Build the system normally, i.e. "aimk".
2) Build special DTrace daemon binaries with "aimk -dtrace". The
binaries will be built in a NEVADA* build directory. The -dtrace switch
will cause aimk to only build the four main daemon binaries.
3) Install the system normally, i.e. "distinst -local -all".
4) Install the special DTrace binaries with "aimk -local -onlydaemons
snv-(amd64|sparc64|x86)". The binaries will be installed in the sol-*
directory, not a snv-* directory, for simplicity's sake.
5) Start the daemons normally, i.e.
"$SGE_ROOT/$SGE_CELL/common/sgemaster;$SGE_ROOT/$SGE_CELL/common/sgeexecd".
6) Set a debug level normally, e.g. "source $SGE_ROOT/util/dl.csh; dl 2".
7) Connect to the qmaster with "$SGE_ROOT/util/dld.sh qmaster".
8) Watch the debug output scroll by... Should look familiar.
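
Steps 1 through 4 above can be sketched as a small dry-run script. This
is only an illustration: SNV_ARCH and DRY_RUN are hypothetical variable
names, and the sketch assumes it is started from the Grid Engine source
directory with aimk and distinst reachable on the PATH.

```shell
# Dry-run sketch of build and install steps 1-4.
# SNV_ARCH and DRY_RUN are hypothetical names used only here.

SNV_ARCH="${SNV_ARCH:-snv-amd64}"   # target directory name for step 4
DRY_RUN="${DRY_RUN:-1}"             # 1 = print only, 0 = really run

run_step() {
    # Echo the command, then run it unless we are in dry-run mode.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" || return 1
    fi
}

build_and_install() {
    run_step aimk || return 1                                # step 1
    run_step aimk -dtrace || return 1                        # step 2
    run_step distinst -local -all || return 1                # step 3
    run_step aimk -local -onlydaemons "$SNV_ARCH" || return 1  # step 4
}
```

Calling build_and_install with the defaults prints the four commands in
order without running them; setting DRY_RUN=0 runs each one and stops at
the first failure.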
The DTrace probes obey the debug level settings as set by the dl
scripts. Otherwise, you'd get swamped in a ton of useless output. In
order to use dld.sh, you must set the debug level with the dl scripts,
just like you'd do normally. Also note that you must be root to run
dld.sh as only root is allowed to run the dtrace command.
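
The two preconditions above (root, and a debug level already set) could
be checked before calling dld.sh with something like the following
sketch. It is not part of the patch, and it assumes the dl scripts
export the level in SGE_DEBUG_LEVEL; treat that variable name as an
assumption.

```shell
# Pre-flight check before running dld.sh (illustrative sketch only).
# Assumption: the dl scripts export the level as SGE_DEBUG_LEVEL.

can_run_dld() {
    # $1 = effective uid, $2 = current debug level setting (may be empty)
    if [ "$1" != "0" ]; then
        echo "dld.sh must be run as root" >&2
        return 1
    fi
    if [ -z "$2" ]; then
        echo "no debug level set; run dl first" >&2
        return 2
    fi
    return 0
}

# Typical call:
#   can_run_dld "$(id -u)" "$SGE_DEBUG_LEVEL" && "$SGE_ROOT/util/dld.sh" qmaster
```

Passing the uid and level as arguments, rather than reading them inside
the function, keeps the check easy to try out without being root.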
The dual build process is a little bulky. The reason for it is that
DTrace has to modify every object file that uses user-land static
probes. To avoid having to completely rewrite the build process, I only
applied the modifications to the four main daemons. They're the only
ones for which the DTrace probes make sense anyway.
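
For context, the object-file modification mentioned above is done by
"dtrace -G", which post-processes the listed object files before the
final link. The sketch below only prints such a command line; the
provider file name sge_probes.d and the object names are hypothetical
placeholders, not names taken from the patch.

```shell
# Sketch of the extra build step that user-land static probes need.
# "sge_probes.d" and the object file names are hypothetical.

dtrace_g_command() {
    # $1 = provider definition (.d file), remaining args = object files.
    # Prints the dtrace -G invocation that would post-process them.
    provider="$1"
    shift
    if [ "$#" -eq 0 ]; then
        echo "need at least one object file" >&2
        return 1
    fi
    echo "dtrace -G -s $provider $*"
}

# Example:
#   dtrace_g_command sge_probes.d qmaster_a.o qmaster_b.o
```

Every object file that contains a probe has to appear on that command
line, which is why limiting the change to the four main daemons keeps
the build changes small.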
If this is something that is generally agreed to be useful, I'll talk to
Andy and Joachim about merging it into the main trunk. Please do send
me feedback.
Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ybbs.zip
Type: application/zip
Size: 9186 bytes
Desc: not available
Url : http://mail.opensolaris.org/pipermail/hpcdev-discuss/attachments/20070820/3458fd4f/attachment.zip