IP Datapath Refactoring [PSARC/2009/331 FastTrack timeout 06/09/2009]
James Carlson
james.d.carlson at sun.com
Thu Jun 4 11:43:27 PDT 2009
Erik Nordmark writes:
> James Carlson wrote:
> > I'd suggest both route(1M) and route(7P). The latter is what routing
> > protocol authors are supposed to be reading.
>
> OK.
> Do you want to see draft man pages? (It would essentially be my "two
> things" text above.)
No need; the discussion has cleared this up for me.
> > Yep; that's how BSD works.
> >
> > It has RTF_CLONING for the former route, to indicate that when you
> > match it, you need to create a cloned route, and RTF_CLONED to mark
> > the entries that were created by the cloning process.
> >
> > http://www.daemon-systems.org/man/route.8.html
>
> And I think RTF_CLONED maps to 'W' if I don't misremember. Would it make
> sense to use 'W' instead of 'C' for Solaris in this context?
The table that I know about is in here:
http://cvsweb.netbsd.org/cgi-bin/cvsweb.cgi/~checkout~/src/usr.bin/netstat/netstat.1?rev=1.51&content-type=text/plain
... and it maps RTF_CLONING to 'C' and RTF_CLONED to 'c'. I don't
know of a flag that maps to 'W', but I guess it's possible that was
done on some system.
Lining up with BSD would be a nice thing, if possible, but if we have
to be different, I wouldn't complain too much.
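
For illustration only, here is a rough sketch of how a netstat-style flags
column could render those two flags the BSD way. The 'C' and 'c' letters are
from the table above; the numeric bit values and the flags_to_string name are
placeholders I made up for the example, not taken from any Solaris or BSD
header.

```python
# Sketch of a netstat-style flag renderer.
# The bit values below are illustrative placeholders; real values come
# from the system's route header and differ between platforms.
RTF_UP = 0x1
RTF_GATEWAY = 0x2
RTF_HOST = 0x4
RTF_CLONING = 0x100   # matching this route creates a cloned route
RTF_CLONED = 0x2000   # this route was created by the cloning process

# Order here determines the order of letters in the output column.
FLAG_LETTERS = [
    (RTF_UP, "U"),
    (RTF_GATEWAY, "G"),
    (RTF_HOST, "H"),
    (RTF_CLONING, "C"),
    (RTF_CLONED, "c"),
]


def flags_to_string(flags):
    """Return the letter string for a route's flag bits."""
    return "".join(letter for bit, letter in FLAG_LETTERS if flags & bit)


if __name__ == "__main__":
    print(flags_to_string(RTF_UP | RTF_CLONING))           # parent route
    print(flags_to_string(RTF_UP | RTF_HOST | RTF_CLONED))  # cloned entry
```

Keeping the two letters distinct by case is what lets a reader tell the
parent route from the entries cloned off it.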
> Note that right now the refactor-gate implementation for local
> connections doesn't have visibility to the port numbers when it does the
> ECMP behavior. We'd need to fix that to get better spreading.
I'm not sure I follow. I thought that the selection mechanism for
local connections was described as round-robin. How would port
numbers or any ECMP get involved with that?
(I agree that if you're doing ECMP, the more flow-identifying stuff
you can get in there, the better. But that might be tending towards
design review ...)
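
To make the "flow-identifying stuff" point concrete, here is a minimal sketch
of hash-based next-hop selection. It is not what the refactor-gate code does;
pick_next_hop, the key layout, and the choice of SHA-256 are all assumptions
for the example. Without port numbers, every connection between the same two
addresses hashes to the same next hop; with them, flows between the same pair
can spread out.

```python
import hashlib


def pick_next_hop(next_hops, src, dst, sport=None, dport=None):
    """Pick a next hop by hashing the flow identifiers (hypothetical helper).

    If the ports are not visible, only the address pair feeds the hash,
    so all flows between src and dst land on the same next hop.
    """
    key = f"{src}|{dst}"
    if sport is not None and dport is not None:
        key += f"|{sport}|{dport}"
    digest = hashlib.sha256(key.encode("ascii")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


if __name__ == "__main__":
    hops = ["gw-a", "gw-b"]  # made-up next-hop names
    # Address-only hashing: one next hop regardless of how many flows.
    print(pick_next_hop(hops, "10.0.0.1", "10.0.0.2"))
    # With ports: different flows may pick different next hops.
    for sport in range(40000, 40004):
        print(sport, pick_next_hop(hops, "10.0.0.1", "10.0.0.2", sport, 80))
```

The same flow always maps to the same next hop, which keeps packets of one
connection in order, while the extra inputs give the hash more to spread on.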
> > What other code (besides the TCP/IP stack) sends M_MULTIDATA? It was
> > a private interface, and I don't know of any other users.
> >
> > If we're disabling multidata, then why not start the process of
> > removing this old stuff? We could at least start the notification
> > process on the MDT contracts.
>
> Not enough hours in a day, and I don't want to spend those few hours on
> debating whether or not Cassini is the best Ethernet driver ever.
> (As far as I know Cassini is the only driver that uses multidata.)
At least to me, that's not quite the point. We're _breaking_ that
previous feature by permanently disabling it. I think that may well
be a good thing -- I can believe that we get acceptably good
performance without the complication of MDT -- but I don't think it's
useful to say that those things are still "supported" but simply never
work.
Would we say we "support" LSO but then never allow anyone to use it?
I think doing this cleanly means obsoleting those old interfaces
properly.
--
James Carlson, Solaris Networking <james.d.carlson at sun.com>
Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677