replication of stuff in /usr/gnu
Garrett D'Amore
garrett at damore.org
Fri Jul 6 13:26:23 PDT 2007
Stephen Hahn wrote:
> * Garrett D'Amore <garrett at damore.org> [2007-07-06 11:54]:
>
>> Stephen Hahn wrote:
>>
>>> * Garrett D'Amore <garrett at damore.org> [2007-07-06 10:59]:
>>>
>>>> Stephen Hahn wrote:
>>>>
>>>>> * Garrett D'Amore <garrett at damore.org> [2007-07-03 15:12]:
>>>>> (Although you find --help and --version pointless, many *ix users now
>>>>> expect commands to process these options.)
>>>>>
>>>>>
>>>> I would expect that to be true for commands that actually have some
>>>> non-trivial purpose. E.g. for GNU grep this argument holds water. But
>>>> for "true" to have --version and --help? It seems a bit of a stretch.
>>>>
>>>> If we expect *all* commands to support this, then we should just
>>>> integrate that into our own command set. But I think that this is not a
>>>> realistic approach.
>>>>
>>>>
>>> (It seems this week that all threads lead to CLIP.) The CLIP spec
>>> specifically asks for help and version options in 6.1 and 6.2. That
>>> is, all long option commands should have --version and --help, and
>>> that short option ("getopt-compliant") commands should support -V and
>>> -?.
>>>
>>> So, we've already elected that approach.
>>>
>> Even for commands that take *no* options? That's a bit of a surprise!
>>
>
> The CLIP classification appears to place a command with no options in
> the "getopt-compliant" class.
>
I should review that spec then, I suppose. There may be a conflict here
with POSIX and backwards compatibility: some of the commands listed take
an optional argument but do not use getopt, which would lead to
different treatment of options like -?.
>
>>>> 3) in the case of the GNU variants, I submit that the GNU variants of
>>>> these trivial utilities actually have a negative impact on
>>>> performance... the GNU utilities are clearly *larger* than the Sun
>>>> versions, and I can only imagine that this has an increased negative
>>>> impact on cache, etc.
>>>>
>>>>
>>> But, to run the GNU variant, you must actually change your path to
>>> invoke it--presumably meaning that you were willing to trade absolute
>>> performance for a known variant (used more widely than the historic
>>> Solaris version in most cases).
>>>
>> *BUT*, I may have good reasons to have GNU variants in my path first...
>> because I prefer the GNU versions (for one reason or another... in my
>> *particular* case I don't, but someone else might) because they offer
>> different functionality.
>>
>> If this is the case, should I also pay a performance penalty for these
>> other commands which have no difference? (In particular I'm thinking
>> about /bin/true and similar commands which may be called from shell
>> scripts.)
>>
>
> This might be an argument for correct minimization boundaries, not for
> exclusion from the largest set of offered components. (As an aside,
> the shells have direct support for most of the commands we're
> discussing...)
>
Yes, I understand that most shells do have builtins for some of the
commands.

I don't think there is *any* argument for installing the largest
possible set of 3rd party components. There *is* an argument for
installing the largest set of such components where they will offer an
EOU enhancement. I think you're not getting this point... some of these
utilities offer *no* benefit to Solaris or in portability to users
coming from GNU systems. If there is no benefit, then why bother
installing them in the first place?
>> Note on shared systems (such as Sun Ray servers), the cache impact
>> affects *multiple* users, the performance impact is not limited to just
>> the user using the GNU program.
>>
>
> Again, this scenario argues against any additions to the system,
> because any potential application changes the working set for whatever
> cache you'd like to examine.
>
It argues against *senseless* additions. When there is value added,
then by all means, we should add away. But when there is no conceivable
benefit, then why are we doing it?
>
>> I agree that the impact is *small*, but why pay *any* such cost if there
>> is no benefit in doing so?
>>
>
> Because the path we take to arrive at this hypothetical "zero cost"
> outcome was substantially more expensive than others. That appears to
> be the crux of our disagreement: you believe that developer attention
> should be focussed on refining this kind of integration; I don't.
> There are larger problems to tackle.
>
Why? Replacement of the binaries with symbolic links, or outright
removal, would be a very, very cheap engineering effort. Probably less
than 1 hour of engineering time. In fact, this argument has probably
burned more time than the engineering effort involved!
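
A minimal sketch of the symbolic-link replacement mentioned above, in
Python. The helper name replace_with_symlink and the scratch paths
(modelled on /usr/bin/true and /usr/gnu/bin/true) are illustrative
assumptions, not part of any actual packaging change; the demo only
touches a temporary directory.

```python
import os
import tempfile
from pathlib import Path


def replace_with_symlink(duplicate: Path, native: Path) -> None:
    """Replace the file at `duplicate` with a symbolic link to `native`."""
    if not native.exists():
        raise FileNotFoundError(native)
    # Build the link under a temporary name, then rename it over the
    # duplicate so the path is never missing in between.
    tmp = duplicate.with_name(duplicate.name + ".tmp-link")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    os.symlink(native, tmp)
    os.replace(tmp, duplicate)


def demo() -> Path:
    """Exercise the helper in a scratch directory (no real system paths)."""
    root = Path(tempfile.mkdtemp())
    native = root / "usr" / "bin" / "true"            # stand-in for the native binary
    duplicate = root / "usr" / "gnu" / "bin" / "true"  # stand-in for the duplicate
    for p in (native, duplicate):
        p.parent.mkdir(parents=True)
        p.write_text("#!/bin/sh\nexit 0\n")
    replace_with_symlink(duplicate, native)
    return duplicate
```

Calling demo() returns the path of the former duplicate, which is now a
link to the native copy. A real change would be made in the package
definition rather than on a live system.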
>
>>>> The cavalier attitude that "it's only 200k" is symptomatic of a larger
>>>> problem, which is that certain developers have stopped caring about
>>>> performance, size, etc... the assumption is that Moore's Law overcomes
>>>> sloppy engineering.
>>>>
>>>>
>>> Actually, I don't feel that any of my reasoning in this space is
>>> cavalier or sloppy...
>>>
>>>
>> The 200k argument certainly was, IMO, cavalier. Just because you have
>> infinite disk space (and other system resources) to burn doesn't mean
>> everyone else does.
>>
>
> I know of no supported system on which we would consider installing
> SUNWgnu-coreutils where 200k is a factor. I know of no hypothetical
> target system where installing SUNWgnu-coreutils couldn't be omitted
> to achieve footprint goals.
>
If everyone takes this attitude, then pretty soon all those 200k's add
up. *That* is the cavalier attitude I'm having trouble with. Sort of
like "think globally, act locally". If everyone leaves their lights on
because "that way I won't have to turn them on when I walk into the
room, and besides, it's only 40W", then we have a statewide energy
crisis. It's not all that different a situation with developers and
integrators putting everything *and* the kitchen sink on the media.

Eventually we're also going to hit some boundary, where stuff doesn't
fit on a DVD. That 200k might be the difference between a single DVD
and a multiple DVD installation in the future.

Again, please try to think outside of just the single instance of 200k.
Of course, if everyone says 'I don't care about 200k', then we get to
where we are... bloated systems that struggle to run on 500MHz systems,
and where I can no longer perform a default installation on a 4GB disk.
>
>> I'm saying that this is just one more straw on the camel's back.
>>
>> For a large number of system utilities, I agree that the GNU versions
>> offer an EOU enhancement. But I'm also saying that there is a
>> significant set of GNU utilities for which this is *not* true.
>>
>> Understand, please, that GNU coreutils is intended for use on systems
>> that do not otherwise have these utilities. But for systems like
>> Solaris, which have a native version of commands like /bin/true,
>> providing a 2nd version of the same command, which offers no difference
>> in functionality, seems largely wasteful to me.
>>
>
> I understand that, but I don't think it's a balanced assessment of
> waste.
>
Please explain, in one or two sentences, how having two versions of the
true utility (or logname if you prefer, or pick one of the others I've
identified) can be described as anything other than pointless waste.

By the way, have you considered that having two versions means that
someone has to sustain both versions? This includes QA, packaging,
etc. Putting binaries on the system is not free in terms of human cost,
even if you ignore machine resources.
>
>>>> But then again, I've had in the past year to make Linux work on an 8MB
>>>> system, and had to develop a thin-client application that fit within
>>>> 512K of flash. And more recently, I've been working on IP forwarding
>>>> performance, where each extra branch in IP costs about .1% to .2% hit in
>>>> the number of packets per second that the system can forward.
>>>>
>>>> Waste is still bad. Moore's Law notwithstanding. I will still tend to
>>>> hunt down (and destroy, as much as possible) bloat where I find it,
>>>> particularly where that bloat serves no useful purpose.
>>>>
>>>>
>>> I suspect strongly there are more rewarding veins of waste to mine.
>>>
>>>
>> Probably true. But yours is also low-hanging. And more to the point, I
>> hope to prevent *further* growth here.
>>
>> I would really, really like to see justification for *any* new utility
>> added to the system. Where there is different functionality that users
>> or layered software will notice, then I agree the EOU probably
>> overrides. But where there is no difference, then I'd argue against
>> wasting the system resources.
>>
>
> Well, we'll be having this discussion again, I suppose, as examination
> of this kind for "upstream integration" cases seems in itself wasteful.
>
Maybe. But if you don't know what you're integrating, or why you are
integrating it, then how are you testing it? How do you even know it
works? I think you've elided a major cost of integration of software in
your analysis.
> Perhaps you could come up with some guidelines for easier
> minimization, rather than causing each proposed supported component to
> be subject to some (testable?) assertions about waste.
>
I do believe that a guideline is very simple. If there is some tangible
benefit to users or to Sun in having multiple versions of a utility,
then that is justification enough. But when there isn't any tangible
benefit, then we should avoid duplication.

In any case, the minimization lines are wrong here. I can readily see
cases where someone will want GNU diff, or perhaps some other utility
that is part of coreutils, but also be extremely sensitive to disk
space considerations.

I've recently been through the pain of trying to get a Solaris system to
fit within 2GB. It couldn't be done easily; it took a lot of time to
figure out what I could safely remove, and what I couldn't, without
impacting the system's usability as a host for running the NIC driver
test suites. (I didn't need graphics, etc.) I spent time manually
identifying what to remove, in some cases 200k at a time! So I'm
particularly sensitive here.
-- Garrett