replication of stuff in /usr/gnu
Garrett D'Amore
garrett at damore.org
Fri Jul 6 10:56:15 PDT 2007
Stephen Hahn wrote:
> * Garrett D'Amore <garrett at damore.org> [2007-07-03 15:12]:
>
>> I happened to notice that we ship two different versions of a few very
>> trivial programs:
>>
>> -r-xr-xr-x 1 root bin 7860 Jun 17 14:37 /usr/bin/yes
>> -r-xr-xr-x 1 root bin 19616 Jun 14 13:06 /usr/gnu/bin/yes
>>
>>
>> -r-xr-xr-x 1 root bin 7708 Jun 17 14:21 /usr/bin/true
>> -r-xr-xr-x 1 root bin 15100 Jun 14 13:06 /usr/gnu/bin/true
>>
>> -r-xr-xr-x 1 root bin 7888 Jun 17 14:36 /usr/bin/logname
>> -r-xr-xr-x 1 root bin 19748 Jun 14 13:06 /usr/gnu/bin/logname
>>
>> -r-xr-xr-x 1 root bin 8084 Jun 17 14:21 /usr/bin/pwd
>> -r-xr-xr-x 1 root bin 24176 Jun 14 13:06 /usr/gnu/bin/pwd
>>
>> -r-xr-xr-x 1 root bin 7948 Jun 17 14:20 /usr/bin/hostid
>> -r-xr-xr-x 1 root bin 19692 Jun 14 13:06 /usr/gnu/bin/hostid
>>
>> The GNU versions only differ in that they accept a pointless --help and
>> --version option.
>>
>> And there are a few where the GNU differences are so trivial that it
>> hardly seems worth shipping a separate binary:
>>
>> nohup: The Sun version is a superset of the GNU version
>>
>> -r-xr-xr-x 72 root bin 8100 Jun 17 14:21 /usr/bin/nohup
>> -r-xr-xr-x 1 root bin 23032 Jun 14 13:06 /usr/gnu/bin/nohup
>>
>> cksum: The only difference is the type of whitespace separating the
>> checksums (tabs vs. spaces)
>>
>>
>> It strikes me as a really unfortunate thing to deliver two versions of
>> the same programs, when there is no differentiating functionality, or
>> even command line options.
>>
>> Is there anything we can do to clean this up, and to avoid it in the future?
>>
>
> (Although you find --help and --version pointless, many *ix users now
> expect commands to process these options.)
>
I would expect that to be true for commands that actually have some
non-trivial purpose. E.g. for GNU grep this argument holds water. But
for "true" to have --version and --help? It seems a bit of a stretch.
If we expect *all* commands to support this, then we should just
integrate that into our own command set. But I think that this is not a
realistic approach.

> The current choice to ship GNU command variants was made to economize
> utility developer time, in that it's a questionable use of time to
> enhance our commands to be supersets of their Solaris-historic,
> XPG4/XPG6, and GNU behaviours,

Uh, I disagree with that statement. I've specifically made changes to
code to eliminate redundancy between /usr/bin and /usr/xpg4, when the
/usr/xpg4 version was clearly a superset of /usr/bin.

Hey, I'm all for choice, when the choice actually has some meaningful
value. I just don't think this is the case here.

> rather than to ship the upstream
> versions in a separate location. During the discussion that led up to
> 2007/048, it was noted that deviations from the standard GNU coreutils
> installation pattern would also lead to dissatisfied populations; as a
> result, we elected to minimize the differences in our installation (to
> omitting the su(1M) variant and following the "alternative
> environment" pattern for /usr/xpg[46]/bin). People assembling
> distributions have the SUNWgnu-coreutils package boundary to base
> their decision on, or can repackage the SFW proto area to their
> liking.
>
> I suppose my question is "what engineering quality are you trying to
> optimize?", since your example involves 12 pathnames and ~200k of disk
>
Having multiple versions of commands, beyond the waste of disk and
pathnames, causes other issues:
1) increased build times.
2) increased variability in the environment... making debugging and
diagnosis of bugs, and maintenance, incrementally more complex
3) in the case of the GNU variants, I submit that the GNU variants of
these trivial utilities actually have a negative impact on
performance... the GNU utilities are clearly *larger* than the Sun
versions, and I can only imagine that this has an increased negative
impact on cache, etc.
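
To put rough numbers on the size difference, the byte counts from the
listing quoted earlier in this message can be totalled. The sketch below
is only an illustration: it uses the six pairs for which sizes were given
(cksum is left out because no sizes were listed for it), and the names
SIZES and totals are made up for the example.

```python
# Byte sizes copied from the ls listing quoted earlier in this message:
# name -> (/usr/bin size, /usr/gnu/bin size).
# SIZES and totals() are hypothetical names used only for this sketch.
SIZES = {
    "yes": (7860, 19616),
    "true": (7708, 15100),
    "logname": (7888, 19748),
    "pwd": (8084, 24176),
    "hostid": (7948, 19692),
    "nohup": (8100, 23032),
}


def totals(sizes):
    """Return (sun_total, gnu_total) in bytes for the given size table."""
    sun_total = sum(sun for sun, _ in sizes.values())
    gnu_total = sum(gnu for _, gnu in sizes.values())
    return sun_total, gnu_total


if __name__ == "__main__":
    sun_total, gnu_total = totals(SIZES)
    # Per-command sizes and the GNU/Sun size ratio.
    for name, (sun, gnu) in SIZES.items():
        print(f"{name:8s} {sun:6d} {gnu:6d} {gnu / sun:4.1f}x")
    print(f"{'total':8s} {sun_total:6d} {gnu_total:6d} "
          f"{gnu_total / sun_total:4.1f}x")
```

This only compares on-disk file sizes; it says nothing by itself about
cache behaviour or run time, which would have to be measured separately.
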
The cavalier attitude that "it's only 200k" is symptomatic of a larger
problem, which is that certain developers have stopped caring about
performance, size, etc... the assumption is that Moore's Law overcomes
sloppy engineering. I reject that attitude; this is how we got into the
situation that we are in today: I used to be able to boot and use
productively a 33MHz 486 with 32MB of RAM. Today, I need at least
512MB and a 1GHz CPU to have the same responsiveness that I used to
enjoy back on that 486 15 years ago. And fundamentally, I'm not doing
much different... running an Xserver, a few xterms, and a few vi sessions.
(Yes, there's some new eye candy on the desktop, new hardware support,
etc. But a lot of what is occupying my system resources ... memory,
cache, disk, and CPU cycles ... is code that qualifies as "bloat".)

But then again, in the past year I've had to make Linux work on an 8MB
system, and had to develop a thin-client application that fit within
512K of flash. And more recently, I've been working on IP forwarding
performance, where each extra branch in IP costs about a 0.1% to 0.2%
hit in the number of packets per second that the system can forward.

Waste is still bad, Moore's Law notwithstanding. I will still tend to
hunt down (and destroy, as much as possible) bloat where I find it,
particularly where that bloat serves no useful purpose.

> space. In any case, sfwnv-discuss is the correct place to have the
> conversation, since we're continuing to integrate OSS tools and
> utilities of this kind.
>
I've added them, but for now I've left PSARC-ext on the distribution
list, because I think these kinds of considerations certainly warrant
ARC review. Certainly, they've taken an interest in reducing /usr/xpg4
vs. /usr/bin redundancy in the past, and reducing /usr/gnu vs. /usr/bin
redundancy seems similarly scoped to me.
-- Garrett