2007/347 NFS/RDMA - Transport Version Update (email vote)

Sebastien Roy Sebastien.Roy at sun.com
Wed Apr 9 12:04:30 PDT 2008


PSARC members,

This case (2007/347) went through inception on 08/15/2007, and the committee 
agreed to hold an email vote after the remaining issues were addressed rather 
than require the project team to come back for commitment.  The project team is 
ready and has provided updates, as detailed in the following paragraphs. 
This message is a call to PSARC members to vote on this case by replying to 
this message.

I've placed all materials for the case in the commitment.materials directory. 
The following files have been updated since inception:

* 20-questions.txt: Patch binding is being requested rather than Micro.  Micro 
was a typo in the original materials.

* The three IETF drafts have been updated and reflect the most recent versions 
produced by the nfsv4 working group.  According to the project team, they will 
soon be in IETF last call.

The project team (Spencer Shepler) has also issued the following text as 
responses to the outstanding issues from inception:

-------

 > wes-1   20q18: you mention performance exporting tmpfs, but have
 >     measurements been done on NFS backed by persistent storage
 >     (zfs, ufs, etc.,)?  data to support the assertion that the
 >     extra complexity is still worth it when you're limited by
 >     real storage performance would help.
 >

The following are simple iozone throughput measurements
in a reasonable hardware configuration.
IB stands for the NFS/RDMA/InfiniBand implementation, while
IPoIB represents NFS/IP/InfiniBand.  Both were run over InfiniBand
so that the network hardware is comparable.

Server: Thumper with a 44-disk pool
Client: X2200M2 (2 x 2.6 GHz AMD with 4 GB memory)
-------------------------------------------------------------------
iozone, 4 threads, NFS cached I/O path

                 write throughput    read throughput
                 ----------------    ---------------
ZFS, IB:           289 MB/sec          685 MB/sec
Tmpfs, IB:         378 MB/sec          881 MB/sec
ZFS, IPoIB:        178 MB/sec          700 MB/sec
Tmpfs, IPoIB:      192 MB/sec          631 MB/sec
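
To make the comparison easier to read, the figures in the table can be
turned into IB-to-IPoIB ratios. The following is a minimal sketch that only
does arithmetic on the numbers reported above; the dictionary layout and the
function name rdma_ratio are illustrative and not part of the case materials.

```python
# Throughput figures (MB/sec) copied from the iozone table above.
throughput = {
    ("ZFS", "IB"): {"write": 289, "read": 685},
    ("Tmpfs", "IB"): {"write": 378, "read": 881},
    ("ZFS", "IPoIB"): {"write": 178, "read": 700},
    ("Tmpfs", "IPoIB"): {"write": 192, "read": 631},
}


def rdma_ratio(fs, op):
    """Return NFS/RDMA (IB) throughput divided by IPoIB throughput."""
    return throughput[(fs, "IB")][op] / throughput[(fs, "IPoIB")][op]


# Print one ratio per file system and operation.
for fs in ("ZFS", "Tmpfs"):
    for op in ("write", "read"):
        print(f"{fs:5s} {op:5s}: {rdma_ratio(fs, op):.2f}x")
```

On these figures the write ratios come out at roughly 1.6x (ZFS) and 2.0x
(Tmpfs), while the ZFS read ratio is slightly below 1.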


Remember that the NFS/RDMA mechanism will reduce CPU overhead
with respect to data copies and allow for better overall
system scalability.


 >
 > wes-2    20q19: what happens if a "version 0" client contacts a "version 1"
 >     server; do you fall back to NFS over IP, does the mount hang, does
 >     something catch fire, etc?
 >

If a version 0 client contacts a version 1 server, the client
will revert to TCP/IB without intervention from the administrator.
If a version 1 client contacts a version 0 server, the client's
mount will hang.  The "proto=tcp" mount option can be used, however,
to allow the client to mount in this case.
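
The version interactions described above can be summarized as a small
decision table. This is only a sketch of the stated behavior, not the
implementation: the function name and return strings are hypothetical, and
two points are assumptions rather than statements from the project team
(that matching versions use NFS/RDMA, and that "proto=tcp" selects TCP in
every combination, not just the version 1 client to version 0 server case).

```python
def mount_outcome(client_version, server_version, proto_tcp=False):
    """Illustrative summary of the version 0 / version 1 interactions."""
    if client_version not in (0, 1) or server_version not in (0, 1):
        raise ValueError("only transport versions 0 and 1 are covered")
    if proto_tcp:
        # "proto=tcp" is stated for the v1 client / v0 server case;
        # assumed here to force TCP in all combinations.
        return "tcp"
    if client_version == server_version:
        # Assumption: matching versions negotiate NFS/RDMA normally.
        return "rdma"
    if client_version == 0 and server_version == 1:
        # Stated: the client reverts to TCP without administrator action.
        return "tcp-fallback"
    # Remaining case, stated: v1 client against v0 server hangs the mount.
    return "hang"


for client in (0, 1):
    for server in (0, 1):
        print(f"client v{client}, server v{server}: "
              f"{mount_outcome(client, server)}")
```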

As mentioned at inception, the deployment of version 0 has been
minimal to non-existent.  Even though there isn't a complete
"fall back" available for the version 1 client to version 0 server,
the impact is expected to be minimal.

 >
 > wes-3    which of NFS_RDMA_Design.pdf and NFS_RDMA_Design_Version_1_1.pdf
 >     is newer?  (I didn't notice any difference in context other than
 >     "version 1.1" in one copy).
 >


Version 1.1 had an update to the addressing mechanism being used.
No changes to that design have been made since inception.

-------

-Seb


