2009/235 dladm Possible Values List

Mike Shapiro mws at sun.com
Tue Apr 14 10:14:17 PDT 2009


On Tue, Apr 14, 2009 at 10:08:58AM -0700, Garrett D'Amore wrote:
> Michael Shapiro wrote:
> >This case currently does not address the issue of how an administrator
> >or layered software determines the optimal large MTU, as opposed to
> >the maximum MTU.  The two are not always the same.  For example, on
> >Neptune (nxge), the maximum is 9000, but the optimal large MTU is 8150,
> >because of the size of the DMA transfers the card does in hardware.
> >This case needs to address this issue explicitly, by either:
> >
> >(a) Defining an additional interface by which the optimal value
> >    can be returned from the driver as another attribute, OR
> >
> >(b) Making extremely clear to driver writers that if a large MTU
> >    size less than the maximum is more optimal than the maximum,
> >    that the optimal size should be returned by this interface.
> >
> >My preference is for option (a), but others should weigh in.
> >
> >-Mike
> >
> >  
> This feels like a "hardware tuning" element.

That's because it is a hardware tuning element :)

> MTU configuration generally shouldn't need to worry about page sizes and 
> such.  An extra DMA transfer is usually in the "noise" as far as 
> overheads of network processing are concerned.

Not if you're trying to maximize performance of a heavily
loaded server, which is precisely when this matters most.
 
> More specifically, large MTUs are intended, as I see it, to minimize the 
> effect of per-packet overheads found in NIC hardware, switches, routers, 
> and most especially *hosts*.  (I.e. the TCP/IP stack overheads.)   I 
> suspect that because of these overheads, that in general the largest MTU 
> you can configure is always "optimal", even if the hardware has to 
> perform some extra DMA transfers to make them happen.
> 
> Unless some specific real world tests (e.g. TCP throughput or UDP stress 
> tests) show otherwise, I'm disinclined to believe that there is any 
> reason an administrator would need to know about the underlying hardware 
> DMA limitations, or that the "optimum" value is anything other than the 
> largest supported value.
> 
>    -- Garrett

It makes a rather large difference in the absolute performance
numbers we achieve on the 7000 series.  When fully loaded, that
system is CPU-bound with all 16 cores running at 100%, so cutting
down on transfers and extraneous packet processing makes a huge
difference.

Here's the actual point: the knowledge of this optimal size is a
function of the hardware and the driver.  Our customers spend a lot
of time complaining that the out-of-the-box tuning of networking on
Solaris is sub-optimal and that it is hard to work out how to do
better.  Jumbo MTU isn't an area we can easily enable by default,
but we can make it a lot easier to achieve the fastest setting.
That knowledge belongs in the driver source, not on some out-of-date
wiki page where everyone has to waste more time googling around
and experimenting to figure it out.
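
To make option (a) concrete, here is a rough sketch of what I have in
mind.  Every name in it (link_mtu_attr_t, lma_max_mtu, lma_optimal_mtu,
link_preferred_mtu) is hypothetical and is not part of dladm or any
existing driver interface; the only real data are the two nxge values
quoted earlier in this thread.  It assumes the driver reports 0 when it
has no opinion about an optimal size.

```c
#include <stdint.h>

/* Hypothetical per-link MTU attributes; not an existing dladm or driver type. */
typedef struct link_mtu_attr {
	uint32_t lma_max_mtu;		/* largest MTU the hardware accepts */
	uint32_t lma_optimal_mtu;	/* fastest large MTU, 0 if unknown (assumption) */
} link_mtu_attr_t;

/* Values quoted in this thread for Neptune (nxge). */
static const link_mtu_attr_t nxge_mtu_attr = { 9000, 8150 };

/*
 * Pick the MTU to configure when the administrator asks for the fastest
 * jumbo setting: prefer the driver's optimal value when it is present
 * and does not exceed the maximum, otherwise fall back to the maximum.
 */
static uint32_t
link_preferred_mtu(const link_mtu_attr_t *attr)
{
	if (attr->lma_optimal_mtu != 0 &&
	    attr->lma_optimal_mtu <= attr->lma_max_mtu)
		return (attr->lma_optimal_mtu);
	return (attr->lma_max_mtu);
}
```

With the nxge values above, link_preferred_mtu() picks 8150 rather than
9000, and a driver that reports no optimal value simply gets its maximum.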

-Mike

-- 
Mike Shapiro, Sun Microsystems Open Storage / Fishworks. blogs.sun.com/mws/


