[dtrace-discuss] [networking-discuss] Re: DTrace Network Providers, take 2

Darren.Reed at Sun.COM Darren.Reed at Sun.COM
Mon Jul 2 19:33:45 PDT 2007


Brendan Gregg - Sun Microsystems wrote:

>G'Day Darren,
>
>On Fri, Jun 29, 2007 at 05:27:01PM -0700, Darren.Reed at sun.com wrote:
>  
>
>>Brendan Gregg - Sun Microsystems wrote:
>>
>>    
>>
>>>G'Day Darren,
>>>
>>>On Fri, Jun 22, 2007 at 09:01:26PM -0700, Darren.Reed at Sun.COM wrote:
>>>
>>>      
>>>
>>>
>>>Thoughts:
>>>
>>>Each probe addition adds 1 to 5 nops, and a few movs of recently
>>>accessed data which we would expect to still be cached. Given a maximum
>>>packet rate per CPU per second of 200,000, the execution overhead of some
>>>nops and cache movs per packet should be negligible. The effect on i-cache
>>>is harder to estimate, and is probably only best checked through
>>>measurements.
>>>
>>>In summary, the execution time of the probes is negligible when considering
>>>the per CPU packet rates; the i-cache effect looks negligble, and from the
>>>tests was more affected by code layout than the existance or non-existance
>>>of probes.
>>>
>>>      
>>>
>>I'm not really worried about the data cache, just the i-cache and
>>the CPU pipeline.  Throwing in some NOPs and otherwise redundant
>>instructions every now and then doesn't seem like too big of a sin,
>>but filling an entire i-cache row or two with NOPs seems...wasteful.
>>Depending on when the CPU decides to throw away a NOP, during the
>>pipeline decoding of the instruction would by and large determine
>>what it means to have 2-3 vs 6-9 of them from 1 vs 3 dtrace probes
>>next to each other.  This may sound silly, but I'd have less of an
>>issue about this if the same probes were further away from each
>>other :)  However this really is micro-optimisation.
>>    
>>
>
>3 DTrace probes next to each other is certainly questionable. I've recently
>dropped one set of the IP probes so that it is now 2, and put a webrev
>of this with just the send/receive probes at the end of,
>http://www.opensolaris.org/os/community/dtrace/NetworkProvider/Prototype2/
>
>I wasn't certain that having both ip::: and ipv4/ipv6::: providers was
>the right way to go, and put the webrev and website out to see how it looked.
>I've also been writing scripts using either to see how they felt. The ip:::
>provider is more in the DTrace mould (compare with the io::: provider), and
>so I've dropped the ipv4/ipv6 providers from the recent webrev (ip::: now
>has identical functionality).
>
>[...]
>  
>
>>>I'd suggest this strategy.
>>>
>>>1) For development/troubleshooting probes, use fbt. 
>>>...
>>>
>>>      
>>>
>>I agree with this.
>>But having seen the sdt:::ip-xmit-v4 probe, wouldn't it be
>>nice if you could present that as fbt::ip-xmit-v4:entry but
>>pass in ire->ire_nce as arg2 instead of ire?
>>    
>>
>
>I would have thought that the existing (ire_t *)ire as args[2] was already
>ideal, as users could refer to args[2]->ire_nce as needed. It might be
>a different story if the kernel wasn't CTF'd and these wern't already
>casted.
>  
>

Indeed and I suspect that this hook is simply "ease of use"
in debugging and could be a candidate for removing..


>>>We can't - DTrace doesn't support aliasing like that.
>>>
>>>      
>>>
>>Can I have this added to the list of new things I'd like to
>>see dtrace be able to support, along with the DTRACE_IF? :)
>>
>>I don't know how this would look, maybe:
>>
>>DTRACE_DEF(ip__hook__abc, m, ip, stq_ill);
>>DTRACE_VIEW1(ip__hook__abc, ip__physical__out__end, mblk_t *);
>>DTRACE_VIEW2(ip__hook__abc, ip_send, mblk_t *, void_ip_t *);
>>DTRACE_VIEW3(ip__hook__abc, ip4__send, mblk_t *, ipha_t *, ill_t *);
>>
>>or maybe something more complex so that you could create a hook
>>alias that included just the mblk_t and the ill_t but still only
>>have one hook in the code path.
>>
>>The general story is we have a collection of values that we want
>>to export out through dtrace and we'd like to have a way to see
>>them using different filters (or views.)
>>
>>If dtrace can handle multiple people using the same probe then
>>technically it should be able to cope with multiple people
>>using the same probe but with different aliases to present the
>>values caught by the dtrace hook in different ways.
>>    
>>
>
>Cool idea. I don't think there is a need to do this right now based on
>performance testing so far, but it is good to have options to try like
>this than not to. And both DTRACE_IF and DTRACE_VIEW are options that
>could be added later without changing the exported stable provider
>interface.
>  
>

Yes :)


>It would take some moderate work to get this done; and when complete, there
>would still be more non-enabled probe overhead in other places - like the
>mib probes, that wouldn't benifit from this approach.
>  
>

Understood.  Should I RFE this or is there some other virtual
whiteboard on opensolaris.org for new things to add to dtrace?


>>Another thought is can these dtrace probe views be defined with
>>dtrace(1m) rather than in the kernel?
>>
>>This would mean that "dtrace -l" might not provide an exhaustive
>>list of all the probes available (if it only consults the kernel.)
>>
>>So, if you wanted to get the ip-send dtrace probe, maybe in your
>>dtrace script you would include some special library that knows
>>to use the ip4-physical-out-end (with extra args) and then to
>>apply some sort of condition and change the types of the args.
>>
>>So my dtrace script might be:
>>
>>#!/usr/sbin/dtrace -Fs
>>
>>include <provider/ip>
>>ip:::ip-send{
>>...
>>}
>>
>>If dtrace could be evolved to do that then we could define base
>>probes, such as the ip4-physical-out-start and layer on top of
>>that new probes without having to deliver any new kernel modules.
>>    
>>
>
>Yes, that would be another line of attack.
>
>...
>
>There might be an easier way to drop the duplicate probe overhead, which
>currently looks like,
>
>   14223                 DTRACE_PROBE4(ip4__physical__out__start,
>   14224                     ill_t *, NULL, ill_t *, stq_ill,
>   14225                     ipha_t *, ipha, mblk_t *, mp);
>   14226                 FW_HOOKS(ipst->ips_ip4_physical_out_event,
>   14227                     ipst->ips_ipv4firewall_physical_out,
>   14228                     NULL, stq_ill, ipha, mp, mpip, ipst);
>   14229                 DTRACE_PROBE1(ip4__physical__out__end, mblk_t *,
>   14230                     mp);
>   14231                 if (mp == NULL)
>   14232                         goto drop;
>   14233 
>   14234                 DTRACE_IP5(send, mblk_t *, mp, void_ip_t *, ipha,
>   14235                     ill_t *, stq_ill, ipha_t *, ipha, ip6_t *, NULL);
>
>Why have sdt:::ip4-physical-out-start and sdt:::ip4-physical-out-end anyway?
>  
>

For -start, the point is to have a dtrace probe that is particular to the
location of the hook in the stack relative to packet processing.

For -end, it is useful to see the mp after the FW_HOOKS() is complete.
Is it NULL?  Has it been changed?  Both of these are possible here.
Or in other words, the complexity of using dtrace to achieve something
meaningful is why there are two hooks.

Plus, none of the standard fbt things will show us what happens to mp on
the successful return of the call to hook_run.

In the scripts that you added, yes, you've achieved the same thing,
but it's not a trivial exercise.  You need to (a) know that you can work
this way with dtrace and (b) know what's inside the args.  Recommending
that people take that approach isn't something we should be doing a lot
of if they end up using interfaces that aren't stable.

Or to put it another way, it it was possible to use a probes that
presented just as simple an interface but implemented it using
other means then sure, the probes could be eliminated.

Or to use the syntax from above, I would be just as happy if I could
write a probe like this:

dtrace -I provider/pfh -n 'pfh:::ip4-physical-out-end{...}'

where "-I" is an "include" function, and dtrace somehow built the
ip4-physical-out-end using fbt probes like you did, why do I care
how the probe is implemented?  A consideration that needs to be
taken into account is the performance cost of this approach.

A danger in relying on fbt is that it only works when the function
call works.  So, for example, if ip4-physical-out-start/end were made
to depend on fbt::hook_run, if that function doesn't get called then
there's no physical-out-start/end.  While the -end probe is only
really useful in the presence of hook run being called, the other is
not.

Darren



More information about the dtrace-discuss mailing list