IP Datapath Refactoring [PSARC/2009/331 FastTrack timeout 06/09/2009]
James Carlson
james.d.carlson at sun.com
Fri Jun 5 06:16:36 PDT 2009
Erik Nordmark writes:
> > The table that I know about is in here:
> >
> > http://cvsweb.netbsd.org/cgi-bin/cvsweb.cgi/~checkout~/src/usr.bin/netstat/netstat.1?rev=1.51&content-type=text/plain
> >
> > ... and it maps RTF_CLONING to 'C' and RTF_CLONED to 'c'. I don't
> > know of a flag that maps to 'W', but I guess it's possible that was
> > done on some system.
>
> Ah - I looked at freebsd at
> http://www.freebsd.org/cgi/man.cgi?query=netstat&apropos=0&sektion=0&manpath=FreeBSD+7.2-RELEASE&format=html
> which has different flags:
> C RTF_CLONING Generate new routes on use
> c RTF_PRCLONING Protocol-specified generate new routes on use
> W RTF_WASCLONED Route was generated as a result of cloning
Interesting ... I guess it's not surprising that there's mutation
going on here, though it's puzzling how that RTF_PRCLONING would be
used.
> > Lining up with BSD would be a nice thing, if possible, but if we have
> > to be different, I wouldn't complain too much.
>
> Which BSD seems to be the key question ;-)
>
> I'll stick with 'C'.
OK.
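For reference, the FreeBSD letter mapping quoted above amounts to a small lookup table. The following is only a sketch: the EX_RTF_* names and their bit values are placeholders made up for illustration (they are not copied from any system's route.h), and format_flags() is a hypothetical helper, not netstat's actual code.

```c
#include <stddef.h>

/* Placeholder bit values for illustration only (assumptions). */
#define EX_RTF_CLONING   0x1UL  /* generate new routes on use */
#define EX_RTF_PRCLONING 0x2UL  /* protocol-specified cloning */
#define EX_RTF_WASCLONED 0x4UL  /* route was produced by cloning */

struct flag_letter {
    unsigned long flag;
    char letter;
};

/* Letters as listed in the FreeBSD 7.2 netstat man page quoted above. */
static const struct flag_letter flag_table[] = {
    { EX_RTF_CLONING,   'C' },
    { EX_RTF_PRCLONING, 'c' },
    { EX_RTF_WASCLONED, 'W' },
};

/* Write one letter per set flag into buf, always NUL-terminating. */
static void
format_flags(unsigned long flags, char *buf, size_t buflen)
{
    size_t n = 0;
    size_t i;

    if (buflen == 0)
        return;
    for (i = 0; i < sizeof (flag_table) / sizeof (flag_table[0]); i++) {
        if ((flags & flag_table[i].flag) != 0 && n + 1 < buflen)
            buf[n++] = flag_table[i].letter;
    }
    buf[n] = '\0';
}
```

Under the NetBSD table, by contrast, 'c' would stand for RTF_CLONED and there would be no 'W' entry at all, which is why "which BSD" matters.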
> >> Note that right now the refactor-gate implementation for local
> >> connections doesn't have visibility to the port numbers when it does the
> >> ECMP behavior. We'd need to fix that to get better spreading.
> >
> > I'm not sure I follow. I thought that the selection mechanism for
> > local connections was described as round-robin. How would port
> > numbers or any ECMP get involved with that?
>
> Where is it described as round-robin?
> In my email where I said "currently the kernel only does some form of
> round robin for default routes"?
You said this:
    The project extends the kernel's ability to handle multiple routes for the same
    prefix; currently the kernel only does some form of round robin for default
    routes and the project extends that to all off-link routes (default, prefix, and
    host routes).
That text appears to say directly that you're extending the existing
round-robin behavior seen for default routes to all off-link routes.
Is that not true?
> The issue is that we don't want the implementation to pick a different
> route just because some unrelated route change caused a need to
> revalidate the ire cached for the connection. We do this by having a
> predictable implementation. My point was that that algorithm can be
> improved.
OK. Hashed IDs instead of round-robin is certainly fine by me. If
you go back through the thread, I was just trying to find out exactly
what the feature was (and wasn't) supposed to do. Nothing more.
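To make the distinction concrete, here is a rough sketch of hash-based selection, as opposed to round-robin. It is not the refactor-gate code: struct flow_key, flow_hash(), and ecmp_pick() are hypothetical names, IPv4-sized addresses are assumed for brevity, and FNV-1a is just a convenient stand-in for whatever hash the implementation actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection key; field names are assumptions. */
struct flow_key {
    uint32_t src_addr;
    uint32_t dst_addr;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Fold the low nbytes of v into an FNV-1a hash, one byte at a time. */
static uint32_t
fnv1a_step(uint32_t h, uint32_t v, int nbytes)
{
    int i;

    for (i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 16777619u;         /* FNV prime */
    }
    return (h);
}

static uint32_t
flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;   /* FNV offset basis */

    h = fnv1a_step(h, k->src_addr, 4);
    h = fnv1a_step(h, k->dst_addr, 4);
    h = fnv1a_step(h, k->src_port, 2);
    h = fnv1a_step(h, k->dst_port, 2);
    return (h);
}

/*
 * Pick an index among nroutes equal-cost routes.  The result depends
 * only on the key and nroutes, so revalidating the cached entry yields
 * the same route as long as the set of candidates is unchanged.
 */
static size_t
ecmp_pick(const struct flow_key *k, size_t nroutes)
{
    if (nroutes == 0)
        return (0);
    return (flow_hash(k) % nroutes);
}
```

A round-robin counter would hand back a different index on each revalidation; a hash over the connection's identifiers does not. Including the port numbers in the key is what would give the better spreading mentioned earlier in the thread. Note that a plain modulus still reshuffles connections when the number of candidate routes itself changes.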
> > At least to me, that's not quite the point. We're _breaking_ that
> > previous feature by permanently disabling it. I think that may well
> > be a good thing -- I can believe that we get acceptably good
> > performance without the complication of MDT -- but I don't think it's
> > useful to say that those things are still "supported" but simply never
> > work.
>
> In which sense is it a feature? It is just a performance trick with a
> contract private interface. Given that the trick now costs more in
> support complexity than its benefits, why are we not free to stop using
> that trick?
The IP stack has a contract extended to 'ce' to use it.
> > Would we say we "support" LSO but then never allow anyone to use it?
>
> Do we claim to support MDT?
Sure. For example, see:
http://docs.sun.com/app/docs/doc/817-0404/appendixa-46?a=view
That's still not really the point here.
> I don't see what that would mean given that
> the interfaces are contract private, hence no ISV/IHV should use them.
I doubt they do. I think you're missing the point I'm making.
> > I think doing this cleanly means obsoleting those old interfaces
> > properly.
>
> I do not see any harm in keeping the MDT interfaces in place so that the
> Cassini driver binaries continue to load into the kernel and work properly.
> Once Cassini is EOLed we can remove all vestiges of the MDT from the
> system, but removing it earlier implies additional cost to the business
> (maintaining another version of the Cassini driver) which doesn't seem
> warranted.
>
> But if you are going to block this case on this issue, then I would be
> forced to go spend the time to start the EO*L process.
I'm not asking about removal of the functions. I agree that'd be
quite foolish. It'd break at least the 'ce' driver and do so in
violation of the agreed-on contract and for no apparent benefit. But,
as I said, I'm not asking for that at all. (Nor do I understand why
anyone would think I'd ask for breakage like that or that by asking
questions I'd somehow "block this case." I never said any such
things.)
What I am asking for is *merely* documentation. Please explicitly say
that this project retires those previous projects (the ARC cases
themselves). You could do so by:
- Explicitly stating that the interfaces specified in those previous
projects are now Obsolete.
- Going through the process of notifying the contract holders that
the interfaces are going away.
That's all. It shouldn't be this hard.
--
James Carlson, Solaris Networking <james.d.carlson at sun.com>
Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677