2007/347 NFS/RDMA - Transport Version Update (email vote)
Sebastien Roy
Sebastien.Roy at sun.com
Wed Apr 9 12:04:30 PDT 2008
PSARC members,
This case (2007/347) went through inception on 08/15/2007, and the committee
agreed to hold an email vote after the remaining issues were addressed rather
than require the project team to come back for commitment. The project team is
ready and has provided the updates detailed below.
This message is a call to PSARC members to vote on this case by replying to this
message.
I've placed all materials for the case in the commitment.materials directory.
The following files have been updated since inception:
* 20-questions.txt: Patch binding is being requested rather than Micro. Micro
was a typo in the original materials.
* The three IETF drafts have been updated and reflect the most recent versions
produced by the nfsv4 working group. According to the project team, they will
soon be in IETF last call.
The project team (Spencer Shepler) has also issued the following text as
responses to the outstanding issues from inception:
-------
> wes-1 20q18: you mention performance exporting tmpfs, but have
> measurements been done on NFS backed by persistent storage
> (zfs, ufs, etc.,)? data to support the assertion that the
> extra complexity is still worth it when you're limited by
> real storage performance would help.
>
The following represents simple iozone throughput measurements
in a reasonable hardware configuration.
IB denotes the NFS/RDMA/InfiniBand implementation, whereas
IPoIB denotes NFS/IP/InfiniBand. Both run over InfiniBand so that
the network hardware is comparable.
Server: Thumper with a 44-disk pool
Client: X2200M2 (2 x 2.6 GHz AMD with 4 GB memory)
-------------------------------------------------------------------
iozone, 4 threads, NFS cached I/O path
                write throughput    read throughput
                ----------------    ---------------
ZFS, IB:        289 MB/sec          685 MB/sec
Tmpfs, IB:      378 MB/sec          881 MB/sec
ZFS, IPoIB:     178 MB/sec          700 MB/sec
Tmpfs, IPoIB:   192 MB/sec          631 MB/sec
Remember that the NFS/RDMA mechanism also reduces the CPU overhead
of data copies and allows for better overall system scalability.
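The throughput figures above are easier to compare as ratios. The
following is a minimal sketch that divides each IB figure by the
matching IPoIB figure. The numbers are copied from the table; the
names THROUGHPUT, rdma_ratio and ratio_table are illustrative only
and are not part of the project materials.

```python
# Throughput in MB/sec, copied from the iozone table in this message.
# Keys are (file system, transport); the layout is an assumption made
# for this sketch only.
THROUGHPUT = {
    ("ZFS", "IB"): {"write": 289, "read": 685},
    ("Tmpfs", "IB"): {"write": 378, "read": 881},
    ("ZFS", "IPoIB"): {"write": 178, "read": 700},
    ("Tmpfs", "IPoIB"): {"write": 192, "read": 631},
}


def rdma_ratio(fs, op):
    """Return IB throughput divided by IPoIB throughput for one case."""
    return THROUGHPUT[(fs, "IB")][op] / THROUGHPUT[(fs, "IPoIB")][op]


def ratio_table():
    """Return the IB/IPoIB ratio for every file system and operation."""
    return {
        (fs, op): rdma_ratio(fs, op)
        for fs in ("ZFS", "Tmpfs")
        for op in ("write", "read")
    }


if __name__ == "__main__":
    for (fs, op), ratio in ratio_table().items():
        print(f"{fs:5s} {op:5s} IB/IPoIB = {ratio:.2f}")
```

On these figures, ZFS writes are roughly 1.6 times faster over IB
than over IPoIB, while ZFS reads are about equal on both transports.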
>
> wes-2 20q19: what happens if a "version 0" client contacts a "version 1"
> server; do you fall back to NFS over IP, does the mount hang, does
> something catch fire, etc?
>
If a version 0 client contacts a version 1 server, the client
will revert to TCP/IB without intervention from the administrator.
If a version 1 client contacts a version 0 server, the client's
mount will hang. The "proto=tcp" mount option can be used, however,
to allow the client to mount in this case.
As mentioned at inception, the deployment of version 0 has been
minimal to non-existent. Even though no complete "fall back" is
available for a version 1 client contacting a version 0 server,
the impact is expected to be minimal.
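The compatibility behavior described in this response can be
summarized as a small decision function. This is a hypothetical
model for illustration, not code from the implementation. The
function name mount_outcome and the result strings are invented
here, and the assumption that matching versions mount over RDMA is
implied by, rather than stated in, the text above.

```python
def mount_outcome(client_version, server_version, proto_tcp=False):
    """Model the mount result for a client/server transport version pair.

    Hypothetical sketch: versions are 0 or 1, and proto_tcp stands for
    the client mounting with the "proto=tcp" option.
    """
    for version in (client_version, server_version):
        if version not in (0, 1):
            raise ValueError(f"unknown transport version: {version}")

    if proto_tcp:
        # The administrator forced TCP, so RDMA versions do not matter.
        return "mounts over TCP (forced by proto=tcp)"
    if client_version == server_version:
        # Assumption: matching versions negotiate RDMA normally.
        return "mounts over RDMA"
    if client_version == 0:
        # Version 0 client, version 1 server: automatic fallback.
        return "client reverts to TCP without administrator action"
    # Version 1 client, version 0 server: no automatic fallback.
    return "mount hangs"


if __name__ == "__main__":
    for client in (0, 1):
        for server in (0, 1):
            print(client, server, mount_outcome(client, server))
```

The only combination that needs administrator action is a version 1
client against a version 0 server, where passing proto_tcp=True
models the "proto=tcp" workaround.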
>
> wes-3 which of NFS_RDMA_Design.pdf and NFS_RDMA_Design_Version_1_1.pdf
> is newer? (I didn't notice any difference in context other than
> "version 1.1" in one copy).
>
Version 1.1 had an update to the addressing mechanism being used.
No changes to that design have been made since inception.
-------
-Seb