2008/688 Sun Cluster TCP/IP Hooks Update
James Carlson
james.d.carlson at sun.com
Wed Dec 3 08:06:47 PST 2008
I'm restarting the timer on this fast-track for Huafeng Lu and the Sun
Cluster team. The changes from the last go-around include removing
the version number string and adding a netstack ID and a flexible void
* argument for future expansion, and the contract (contract-01) has
been updated. The timer is set to 12/10/2008.
Introduction
"TCP/IP hooks for Clustering" (PSARC 1997/314) introduced several
TCP/IP hooks for the Sun Cluster product, and the contract is
1997/314:contract-1.new. These hooks are called by the Solaris
TCP/IP code at certain places. Now, to support client-side shared
address (RFE 6717519), we need to modify one of these hooks,
cl_inet_connect(). In addition, two more arguments (netstackid_t,
void *) are added to the signatures of all hooks to allow Cluster to
support exclusive-IP zones and to accommodate future signature changes.
Release Binding
A patch/micro release binding is asserted. This will require the
release of coordinated patches for Solaris and Sun Cluster.
Interfaces
Exported Interface      Stability                  Comments
==================      =========                  ========
cl_inet_connect         Removed (was Contracted)   PSARC 1997/314
cl_inet_connect2        Contr. Project Private     function pointer
cl_inet_isclusterwide   Contr. Project Private     function pointer
cl_inet_ipident         Contr. Project Private     function pointer
cl_inet_getspi          Contr. Project Private     function pointer
cl_inet_checkspi        Contr. Project Private     function pointer
cl_inet_deletespi       Contr. Project Private     function pointer
cl_inet_idlesa          Contr. Project Private     function pointer
cl_inet_listen          Contr. Project Private     function pointer
cl_inet_unlisten        Contr. Project Private     function pointer
cl_inet_disconnect      Contr. Project Private     function pointer
cl_tcp_walk_list        Contr. Project Private     function pointer
cl_inet_bind            Contr. Project Private     function pointer
cl_inet_unbind          Contr. Project Private     function pointer
Detailed Description
Currently, a shared address is used only at the server side. Server
applications are started on Cluster nodes, which reply to requests
coming from external clients. This proposal (RFE 6717519) aims to
allow Cluster nodes to use shared addresses as client addresses to
talk to external servers. Note: this proposal only handles TCP and
UDP; SCTP is beyond its scope.
For TCP, several Cluster nodes bind to the same shared address, then
connect to external applications. The cl_inet_connect() hook is
called at connect time (for both outgoing and incoming connection
requests) so that the Cluster software can arrange for the packets of
these connections to be dispatched to the right Cluster code. This
case adds a return value to the hook, indicating whether the Cluster
software succeeded in handling the connection, and an input parameter
that tells the Cluster software whether the connection is outgoing or
incoming.
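The connect-time call described above might look roughly like the
following sketch. It is illustrative only: the typedefs are user-space
stand-ins for the Solaris kernel types, tcp_notify_cluster() and
sample_connect_hook() are hypothetical names, and treating a return
value of 0 as success is an assumption that this case does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Stand-ins for Solaris kernel types so the sketch builds in user space. */
typedef int netstackid_t;
typedef enum { B_FALSE, B_TRUE } boolean_t;

/* Hook pointer with the cl_inet_connect2 signature; NULL until Cluster sets it. */
int (*cl_inet_connect2)(netstackid_t stack_id, uint8_t protocol,
    boolean_t is_outgoing, sa_family_t addr_family,
    uint8_t *laddrp, in_port_t lport,
    uint8_t *faddrp, in_port_t fport,
    void *args) = NULL;

/* Hypothetical Cluster-side hook: records the direction, returns a canned result. */
boolean_t sample_last_direction = B_FALSE;
int sample_hook_result = 0;

int
sample_connect_hook(netstackid_t stack_id, uint8_t protocol,
    boolean_t is_outgoing, sa_family_t addr_family,
    uint8_t *laddrp, in_port_t lport,
    uint8_t *faddrp, in_port_t fport, void *args)
{
	(void)stack_id; (void)protocol; (void)addr_family;
	(void)laddrp; (void)lport; (void)faddrp; (void)fport; (void)args;
	sample_last_direction = is_outgoing;
	return (sample_hook_result);
}

/*
 * Hypothetical caller-side helper for an IPv4 TCP connection.
 * Assumption: 0 means the Cluster software handled the connection;
 * any other value is a failure that the caller should propagate.
 */
int
tcp_notify_cluster(netstackid_t stack_id, boolean_t is_outgoing,
    uint8_t *laddrp, in_port_t lport, uint8_t *faddrp, in_port_t fport)
{
	if (cl_inet_connect2 == NULL)
		return (0);	/* no hook registered: nothing to do */

	/* args is passed as NULL here; it is reserved for future use. */
	return ((*cl_inet_connect2)(stack_id, IPPROTO_TCP, is_outgoing,
	    AF_INET, laddrp, lport, faddrp, fport, NULL));
}
```

The same helper would be called with B_TRUE on the active-open path and
with B_FALSE when an incoming connection request is accepted.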
The situation for UDP is a bit different. Since there is no real
connection in UDP and the same socket can be used to send to
multiple destinations, the major task is to register the <client_IP_addr,
Cluster_node_ID> association with the Cluster software. Such registration
is performed at connect(), sendto() and sendmsg(), which call
the cl_inet_connect() hook at the appropriate time. For performance reasons,
sendto() and sendmsg() call the hook only when the destination is new.
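One way to picture the "only when the destination is new" rule is a
per-socket record of the last destination registered. The sketch below
is an assumption about how that could work, not the Solaris
implementation: udp_last_dest_t, udp_register_dest() and
stub_register() are hypothetical, the stub stands in for the real hook
call, and comparing both address and port is likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical per-socket record of the last destination registered. */
typedef struct udp_last_dest {
	int		valid;		/* non-zero once a destination is stored */
	size_t		addr_len;	/* 4 for IPv4, 16 for IPv6 */
	uint8_t		faddr[16];
	in_port_t	fport;
} udp_last_dest_t;

/* Counts how often the stand-in for the Cluster hook is invoked. */
int hook_calls = 0;

/* Stand-in for the hook call; always reports success (0). */
int
stub_register(const uint8_t *faddrp, size_t addr_len, in_port_t fport)
{
	(void)faddrp; (void)addr_len; (void)fport;
	hook_calls++;
	return (0);
}

/*
 * Called on each send.  Invokes the hook only if the destination
 * differs from the one remembered for this socket.
 */
int
udp_register_dest(udp_last_dest_t *last, const uint8_t *faddrp,
    size_t addr_len, in_port_t fport)
{
	int rc;

	if (addr_len == 0 || addr_len > sizeof (last->faddr))
		return (-1);

	/* Fast path: same destination as last time, skip the hook. */
	if (last->valid && last->fport == fport &&
	    last->addr_len == addr_len &&
	    memcmp(last->faddr, faddrp, addr_len) == 0)
		return (0);

	rc = stub_register(faddrp, addr_len, fport);
	if (rc == 0) {
		/* Remember the destination only after it was registered. */
		memcpy(last->faddr, faddrp, addr_len);
		last->addr_len = addr_len;
		last->fport = fport;
		last->valid = 1;
	}
	return (rc);
}
```

A single remembered destination keeps the common case (repeated sends
to one peer) cheap, but a socket that alternates between two peers
would call the hook on every send; a real implementation might keep
more state.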
The cl_inet_connect() hook will also be renamed to cl_inet_connect2()
to avoid confusing older versions of Cluster. After the renaming, the old
name cl_inet_connect is obsolete and will no longer be called from Solaris.
Currently the Cluster product is limited to shared-IP zones, but in the
future Cluster will also be useful with exclusive-IP zones. To support
this, an instance identifier (netstackid_t) is added to the signatures of
all Cluster hooks. In addition, to avoid further signature changes, a
void * argument is also added to accommodate any other parameters that
may be needed.
The new signatures of the hooks are listed below. The first argument
(netstackid_t stack_id) and the last (void *args) are added by this case.
cl_inet_connect2 also has one more new argument, "boolean_t is_outgoing".
int (*cl_inet_isclusterwide)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp, void *args);

uint32_t (*cl_inet_ipident)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp, uint8_t *faddrp,
    void *args);

void (*cl_inet_getspi)(netstackid_t, uint8_t, uint8_t *, size_t,
    void *args);

int (*cl_inet_checkspi)(netstackid_t, uint8_t, uint32_t,
    void *args);

void (*cl_inet_deletespi)(netstackid_t, uint8_t, uint32_t,
    void *args);

void (*cl_inet_idlesa)(netstackid_t, uint8_t, uint32_t, sa_family_t,
    in6_addr_t, in6_addr_t, void *args);

void (*cl_inet_listen)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp,
    in_port_t lport, void *args);

void (*cl_inet_unlisten)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp,
    in_port_t lport, void *args);

int (*cl_inet_connect2)(netstackid_t stack_id, uint8_t protocol,
    boolean_t is_outgoing, sa_family_t addr_family,
    uint8_t *laddrp, in_port_t lport,
    uint8_t *faddrp, in_port_t fport,
    void *args);

void (*cl_inet_disconnect)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp,
    in_port_t lport, uint8_t *faddrp,
    in_port_t fport, void *args);

int cl_tcp_walk_list(netstackid_t stack_id,
    int (*callback)(cl_tcp_info_t *, void *), void *arg);

void (*cl_inet_bind)(netstackid_t stack_id, uchar_t protocol,
    sa_family_t addr_family, uint8_t *laddrp, in_port_t lport,
    void *args);

void (*cl_inet_unbind)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp, in_port_t lport,
    void *args);
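To illustrate how the new stack_id argument could be used on the
Cluster side, here is a minimal sketch of a function matching the
cl_inet_isclusterwide signature. Everything in it beyond the signature
is hypothetical: the typedef is a user-space stand-in for the kernel
type, the shared-address table and sample_isclusterwide() are invented
for the example, and the assumption is that a non-zero return means
"this local address is a shared address in that IP instance".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Stand-in for the Solaris kernel type so the sketch builds in user space. */
typedef int netstackid_t;

/* Hypothetical table of shared IPv4 addresses, keyed by IP instance. */
typedef struct shared_addr {
	netstackid_t	stack_id;
	uint8_t		addr[4];
} shared_addr_t;

static const shared_addr_t shared_addrs[] = {
	{ 0, { 192, 0, 2, 10 } },	/* instance 0 (e.g. shared-IP) */
	{ 3, { 192, 0, 2, 20 } },	/* some exclusive-IP instance */
};

/*
 * Hypothetical hook body: returns 1 if laddrp is a shared address in
 * the given IP instance, 0 otherwise.  Only IPv4 is handled here.
 */
int
sample_isclusterwide(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp, void *args)
{
	size_t i;

	(void)protocol;
	(void)args;	/* reserved for future use; ignored in this sketch */

	if (addr_family != AF_INET || laddrp == NULL)
		return (0);

	for (i = 0; i < sizeof (shared_addrs) / sizeof (shared_addrs[0]); i++) {
		if (shared_addrs[i].stack_id == stack_id &&
		    memcmp(shared_addrs[i].addr, laddrp, 4) == 0)
			return (1);
	}
	return (0);
}

/* The hook pointer, set here to the sample implementation. */
int (*cl_inet_isclusterwide)(netstackid_t stack_id, uint8_t protocol,
    sa_family_t addr_family, uint8_t *laddrp, void *args) =
    sample_isclusterwide;
```

Because the lookup is keyed on stack_id, the same address can be a
shared address in one IP instance and an ordinary address in another.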
Out of Scope
Note that the actual implementations of these hooks are in the Cluster
code, not in the Solaris code. The Solaris TCP/IP code only declares
and calls them. For this project, the Cluster team is responsible
for changing the implementation of the cl_inet_connect2() hook in
the Sun Cluster code.