PSARC/2008/603 ELF objects to adopt GNU-style Versym indexes
Ali Bahrami
Ali.Bahrami at sun.com
Mon Sep 22 16:43:39 PDT 2008
I am sponsoring the following self-reviewed case for myself.
I believe it qualifies for self review:
- The versioning extension described is compatible
with that already supported by Solaris for objects
produced by the GNU ld (gld), and preserves full
backward compatibility.
- The changes to the pvs command line (new -I option,
and support for multiple -N options) follow the
same changes made previously for elfdump, under
PSARC/2007/247 Add -I option to elfdump
-----
Release Binding:                         Patch/Micro
pvs -I option:                           Committed
pvs multiple -N options:                 Committed
ELF Versym sections used to index        Committed
    dependency versions (GNU extension)
Reserved ELF Vernaux struct vna_other    Committed
    field used to record Versym index
    (GNU extension)
VER_FLG_INFO flag, defined for ELF       Committed
    Vernaux struct vna_flags field
---------------------------------------------------------------------------
ELF object format changes
-------------------------
Traditional ELF symbol versioning, as originally implemented
by Solaris, uses three ELF sections:
    SHT_SUNW_verdef
        Enumerate the versions defined by the object.
        Each version is assigned a "version index", starting
        at 1. Index 0 is reserved for non-versioned symbols.
    SHT_SUNW_verneed
        Enumerate the versions needed from dependencies
        (sharable objects) loaded at runtime. No index is
        assigned to these versions.
    SHT_SUNW_versym
        An array of indexes, one per symbol from the
        dynamic symbol table (.dynsym), mapping each
        symbol to one of the Verdef items. External
        symbols are given an index of 0.
At runtime, the runtime linker (ld.so.1) verifies that the
versions exported by the objects (Verdef) being loaded satisfy
the requirements of the other objects (Verneed).
The developers of the GNU link editor (known as gld under Solaris)
adopted this scheme, but made three backward compatible extensions
to it:
(1) The Versym also contains indexes to Verneed records,
recording which object/version contributed the external
symbol at link time. These indexes start with the next
value following the final Verdef index.
To do this, they needed to tag each Verneed record with
an index value. However, the ELF Vernaux structure (used
to record needed items) does not have an index field.
The GNU developers therefore decided to use the vna_other
field from the Vernaux structure --- a field that the original
Solaris developers had defined as a placeholder for future
extensions, and which we currently define as always containing
the value 0.
(2) The top bit of the Versym value is no longer part of the
index, but is used as a "hidden bit" to prevent binding
to the symbol.
(3) Multiple implementations of a given symbol, contained in
different versions, are allowed. These are created using
special assembler pseudo ops (issued via gcc asm directives),
and are encoded in the symbol name using '@' characters.
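As a rough illustration of extensions (1) and (2), the following
Python sketch splits a Versym value into its hidden bit and its
index, and then classifies the index. It assumes 16-bit Versym
entries, so that the "top bit" is 0x8000. The function name and the
last_verdef_index parameter are hypothetical, introduced only for
this example.

```python
VERSYM_HIDDEN = 0x8000      # top bit of a 16-bit Versym value
VERSYM_INDEX_MASK = 0x7FFF  # remaining bits hold the version index


def decode_versym(value, last_verdef_index):
    """Return (hidden, index, kind) for one Versym entry.

    last_verdef_index is the final index assigned to a Verdef
    record; any larger index refers to a Verneed record.
    """
    hidden = bool(value & VERSYM_HIDDEN)
    index = value & VERSYM_INDEX_MASK
    if index == 0:
        kind = "unversioned"
    elif index <= last_verdef_index:
        kind = "verdef"
    else:
        kind = "verneed"
    return hidden, index, kind


print(decode_versym(0x8003, last_verdef_index=2))  # (True, 3, 'verneed')
```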
We have previously made changes to our system to accommodate
GNU objects containing these extensions, so that such objects
will run under Solaris:
6516665 The link-editors should be more resilient against
gcc's symbol versioning
6565476 rtld symbol version check prevents GNU ld binary
from running
6577462 Additional improvements needed to handling of
gcc's symbol versioning
As a result of these changes, objects containing the GNU extensions
listed above run under Solaris. As a practical matter, this also
makes items (1) and (2) part of the definition of Solaris ELF objects:
- The meaning of the ELF Vernaux vna_other field, which was
previously reserved and undefined, is now fixed. It means
"needed version index when non-zero, and ignored otherwise".
- The top bit of the Versym index is a "hidden bit".
To date, neither of these features is used by the Solaris
link-editor, but their meaning is fixed nonetheless.
We have observed that GNU objects that use the Versym to reference
Verneed definitions have a valuable observability benefit --- given
an undefined symbol to be supplied by a dependency at runtime,
the Versym index can be used to identify which dependency is expected
to supply the symbol at runtime, and which version in that dependency
contains that symbol.
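The lookup described above can be sketched in Python as follows.
This is only a model of the idea, not the code used by elfdump or
pvs: the Verneed data is represented as plain tuples, and the
index values and the second version name are made up for the
example.

```python
def build_index_map(verneed):
    """Map each non-zero vna_other index to (dependency, version).

    verneed is a list of (dependency, [(version_name, vna_other), ...]).
    """
    index_map = {}
    for dependency, aux_entries in verneed:
        for version_name, vna_other in aux_entries:
            if vna_other != 0:  # zero means no index was recorded
                index_map[vna_other] = (dependency, version_name)
    return index_map


def expected_provider(versym_value, index_map):
    """Return (dependency, version) for a Versym value, or None."""
    index = versym_value & 0x7FFF  # drop the hidden bit
    return index_map.get(index)


# Hypothetical contents of one object's Verneed section.
verneed = [("libc.so.1", [("SYSVABI_1.3", 2), ("SUNWprivate_1.1", 3)])]
index_map = build_index_map(verneed)
print(expected_provider(2, index_map))
```

Given a Versym value of 2 for an undefined symbol, the lookup
returns the libc.so.1 / SYSVABI_1.3 pair recorded above.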
We also note that, having made the GNU-related changes listed above,
we could record this information in our native Solaris objects
without requiring additional changes to our object format,
and in so doing, gain the same observability benefit without
altering how the system behaves at runtime.
This case therefore modifies the Solaris linkers in the
following ways:
- Verneed entries now use the vna_other field to record
Versym indexes for each needed version, starting with the
first index following the final Verdef index.
- The Versym indexes for undefined symbols now reference
Verneed records, using the appropriate indexes recorded
in the vna_other field of the corresponding needed item.
- Previously, a Solaris object would only contain a Versym
section if a Verdef section was present. Now, a Versym is
generated for any object that has either a Verdef or a Verneed
section.
- In the past, only Verneed items that require runtime
validation by the runtime linker (ld.so.1) would be
recorded in an object. With these changes, the link-editor
must now create a Verneed item for every version referenced
by the Versym array. Not all of these items actually require
runtime validation. The link-editor has therefore been modified
to set the new VER_FLG_INFO flag on Verneed records that are
purely informational, and ld.so.1 modified to skip validation
for such records. Note that no harm would be done if ld.so.1
did validate these records --- VER_FLG_INFO is simply an
optimization.
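The linker changes listed above can be modeled with a short Python
sketch. It is a simplified illustration under stated assumptions,
not the link-editor or ld.so.1 implementation: the Vernaux class
carries only the fields discussed here, the function names and the
must_validate set are hypothetical, and the numeric value given to
VER_FLG_INFO is a placeholder for the example.

```python
from dataclasses import dataclass

VER_FLG_INFO = 0x4  # placeholder value, used for illustration only


@dataclass
class Vernaux:
    name: str
    vna_flags: int = 0
    vna_other: int = 0


def assign_indexes(last_verdef_index, needed, must_validate):
    """Link-editor side: give every needed version a Versym index.

    needed is a list of (dependency, [version names]).
    must_validate is the set of (dependency, version) pairs that
    require runtime validation; all others are informational.
    """
    next_index = last_verdef_index + 1
    verneed = []
    for dependency, versions in needed:
        aux = []
        for name in versions:
            required = (dependency, name) in must_validate
            flags = 0 if required else VER_FLG_INFO
            aux.append(Vernaux(name, flags, next_index))
            next_index += 1
        verneed.append((dependency, aux))
    return verneed


def versions_to_validate(verneed):
    """Runtime side: skip records that are purely informational."""
    return [(dep, a.name)
            for dep, aux in verneed
            for a in aux
            if not (a.vna_flags & VER_FLG_INFO)]
```

With a final Verdef index of 2, the first needed version receives
index 3, the next receives 4, and so on. Only the versions listed
in must_validate are returned by versions_to_validate().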
The pvs command is also modified:
- pvs now accepts multiple -N options, and gains a new -I
option. Both of these changes are made to bring it in line
with the conventions previously established by the elfdump
command.
- When used with an object that contains Versym indexes
for Verneed items, pvs can now display the symbols from
each version of each dependency.
A revised manpage for pvs is included in the case materials.
---------------------------------------------------------------------------
Impact on OSnet size
--------------------
There are two ways in which the changes described here can
increase the size of an object:
1) If it causes additional entries to be added
to the Verneed section, due to the Versym section
needing to reference a version that would otherwise
not be needed.
2) If the object does not already have a Versym,
then these changes will cause the link-editor
to add one. The cost of this is the size of a section
header, plus 2 bytes per symbol found in the dynamic
symbol table (.dynsym). The sharable libraries
found in Solaris nearly always already have Versym sections,
so this applies mainly to executables.
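The cost in item (2) is simple to estimate. The Python sketch
below does so, assuming the standard ELF section header sizes of
40 bytes for 32-bit objects and 64 bytes for 64-bit objects. It
ignores the section name string and any alignment padding, and the
function name is hypothetical.

```python
SHDR_SIZE = {32: 40, 64: 64}  # sizeof(Elf32_Shdr), sizeof(Elf64_Shdr)
VERSYM_ENTRY_SIZE = 2         # bytes per .dynsym symbol


def added_versym_bytes(dynsym_count, elf_class=32):
    """Approximate bytes added when a Versym section is created."""
    return SHDR_SIZE[elf_class] + VERSYM_ENTRY_SIZE * dynsym_count


print(added_versym_bytes(100, elf_class=32))  # 240
print(added_versym_bytes(100, elf_class=64))  # 264
```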
These are both very small, so I expected going in that this work
would not significantly alter the size of the objects shipped
with Solaris. To verify this after the implementation was complete,
I did two full nightly builds of the OSnet. One build was of
a completely stock workspace with no modifications. The other
was in a workspace containing my changes, updated to the same
revision as the stock workspace. After the builds were complete,
I ran the command
/usr/gnu/bin/du -sb root_i386
from within the proto subdirectory of each workspace. The
results are:
stock: 403131517
new: 403940816
This represents an increase of 0.2% in size as a result of adopting
these changes.
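The percentage can be checked from the two du figures with a few
lines of Python. The variable names are mine, and the numbers are
those reported above.

```python
stock = 403131517  # bytes, stock workspace
new = 403940816    # bytes, workspace with these changes

delta = new - stock
percent = 100.0 * delta / stock

print(delta)              # 809299
print(round(percent, 1))  # 0.2
```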
---------------------------------------------------------------------------
Examples
--------
(These examples appear in the revised pvs manpage)
With the changes covered by this case, the link-editor
records where it expects undefined symbols to be resolved
from at runtime. We can use elfdump, or pvs, to examine
this information.
For instance, imagine that you wanted to know where the
link-editor expects the ldd command to find the printf()
function at runtime. The following pvs command tells
us that it is expected to come from version SYSVABI_1.3
in libc.so.1:
% pvs -ors /usr/bin/ldd | grep ' printf'
/usr/bin/ldd - libc.so.1 (SYSVABI_1.3): printf;
Similarly, we might ask for the list of symbols that are
expected to be provided by SYSVABI_1.3 in libc.so.1:
% pvs -s -N 'libc.so.1 (SYSVABI_1.3)' /usr/bin/ldd
libc.so.1 (SYSVABI_1.3):
_exit;
strstr;
printf;
__fpstart;
strncmp;
lseek;
strcmp;
getopt;
execl;
close;
fflush;
wait;
strerror;
putenv;
sprintf;
getenv;
open;
perror;
fork;
strlen;
geteuid;
access;
setlocale;
atexit;
fprintf;
exit;
read;
malloc;
This information reflects what the link-editor knew when
it constructed the object (in this case, ldd). It is
important to note that the user may use interposition to
change what actually happens at runtime.
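For scripting, the one-line output shown in the first example can
be picked apart with a regular expression. The Python sketch below
handles only the line shape shown in that example, and should not
be read as a description of the full pvs output format. The
function name is hypothetical.

```python
import re

# Matches lines like:
#   /usr/bin/ldd - libc.so.1 (SYSVABI_1.3): printf;
PVS_LINE = re.compile(
    r"^(?P<object>\S+) - (?P<dependency>\S+) "
    r"\((?P<version>[^)]+)\):\s+(?P<symbol>\S+);$"
)


def parse_pvs_line(line):
    """Return a dict of the fields in one output line, or None."""
    match = PVS_LINE.match(line.strip())
    return match.groupdict() if match else None


print(parse_pvs_line("/usr/bin/ldd - libc.so.1 (SYSVABI_1.3): printf;"))
```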