[zfs-discuss] [nfs-discuss] NFS, ZFS & ESX
Roch
Roch.Bourbonnais at Sun.COM
Wed Jul 8 02:32:42 PDT 2009
erik.ableson writes:
> Comments in line.
>
> On 7 Jul 09, at 19:36, Dai Ngo wrote:
>
> > Without any tuning, the default TCP window size and send buffer size
> > for NFS connections is around 48KB, which is not optimal for bulk
> > transfer. However, the 1.4MB/s write seems to indicate something else
> > is seriously wrong.
>
> My sentiment as well.
>
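As a rough check on the window-size point, here is a minimal Python sketch of the usual window/RTT ceiling for a single TCP connection. The round-trip times are assumed values for illustration, not measurements from this setup, and the function names are hypothetical.

```python
# Back-of-envelope check of the 48KB window figure quoted above.
# The RTT values are assumptions for illustration, not measurements
# from this setup.

def window_limited_throughput(window_bytes, rtt_seconds):
    """Upper bound on throughput for one TCP connection, in bytes/s."""
    return window_bytes / rtt_seconds


def implied_rtt(window_bytes, throughput_bytes_per_s):
    """RTT that would be needed for the window alone to explain a rate."""
    return window_bytes / throughput_bytes_per_s


WINDOW = 48 * 1024   # ~48KB default window mentioned in the thread
OBSERVED = 1.4e6     # 1.4MB/s NFS write rate reported in the thread

if __name__ == "__main__":
    for rtt_ms in (0.2, 1.0, 5.0):   # assumed LAN round-trip times
        mb_s = window_limited_throughput(WINDOW, rtt_ms / 1000) / 1e6
        print(f"RTT {rtt_ms} ms -> ceiling {mb_s:.1f} MB/s")
    rtt_needed_ms = implied_rtt(WINDOW, OBSERVED) * 1000
    print(f"RTT needed to explain 1.4MB/s: {rtt_needed_ms:.0f} ms")
```

Even with a generous assumed round trip of a few milliseconds, the ceiling sits well above the observed rate, which is consistent with the remark that the window size alone does not explain 1.4MB/s.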
> > iSCSI performance was good, so the network connection seems to be OK
> > (assuming it's 1GbE).
>
> Yup - I'm running at wire speed on the iSCSI connections.
>
> > What do your mount options look like?
>
> Unfortunately, ESX doesn't give any control over mount options.
>
> > I don't know what the datastore browser does for copying files, but have
> > you tried the vanilla 'cp' command?
>
> The datastore browser copy command is just a wrapper for cp from what
> I can gather. All types of copy operations to the NFS volume, even
> from other machines, top out at this speed. The NFS/iSCSI connections
> are in a separate physical network so I can't easily plug anything
> into it for testing other mount options from another machine or OS.
> I'll try from another VM to see if I can't force a mount with the
> async option to see if that helps any.
>
> > You can also try NFS performance using tmpfs, instead of ZFS, to
> > make sure the NIC, protocol stack, and NFS are not the culprit.
>
> From what I can observe, it appears that the sync commands issued
> over the NFS stack are slowing down the process, even with a
> reasonable number of disks in the pool.
>
> What I was hoping for was the same behavior (albeit slightly risky) of
> having writes cached to RAM and then dumped out in an optimal manner
> to disk, as per the local behavior where you see the flush to disk
> operations happening on a regular cycle. I think that this would be
> doable with an async mount, but I can't set this on the server side
> where it would be used by the servers automatically.
>
> Erik
>
I wouldn't do this; it sounds like you want to have
zil_disable.
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
If you do, then be prepared to unmount or reboot all clients of
the server in case of a crash in order to clear their
corrupted caches.
This is in no way a ZIL problem nor a ZFS problem.
http://blogs.sun.com/roch/entry/nfs_and_zfs_a_fine
And most NFS appliance providers will use some form of write-accelerating
device to try to make the NFS experience closer to local filesystem
behavior.
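
To get a feel for how much of the gap comes from commit-on-every-write semantics rather than from the pool, a minimal Python sketch such as the following compares syncing after each block with syncing once at the end. The function name, block size, and block count are arbitrary illustrative choices, and the default target directory is an assumption. It only approximates locally what a synchronous NFS client asks of the server; it does not reproduce what ESX does.

```python
import os
import tempfile
import time


def timed_write(path, block_count, block_size, sync_each):
    """Write block_count blocks to path and return elapsed seconds.

    With sync_each=True every block is forced to stable storage before
    the next one is written; otherwise a single fsync is done at the end.
    """
    block = b"\0" * block_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(block_count):
            f.write(block)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: the system temp directory; point this at the
    # filesystem you actually want to compare.
    target_dir = tempfile.gettempdir()
    count, size = 256, 64 * 1024   # illustrative sizes only
    for sync_each in (True, False):
        path = os.path.join(target_dir, f"syncprobe_{int(sync_each)}.dat")
        elapsed = timed_write(path, count, size, sync_each)
        os.remove(path)
        rate = count * size / elapsed / 1e6
        print(f"sync_each={sync_each}: {rate:.1f} MB/s")
```

Run against a pool without a dedicated log device, the per-block sync case is the one that a write-accelerating device is meant to help.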
-r
> > erik.ableson wrote:
> >> OK - I'm at my wit's end here as I've looked everywhere to find
> >> some means of tuning NFS performance with ESX into returning
> >> something acceptable using osol 2008.11. I've eliminated
> >> everything but the NFS portion of the equation and am looking for
> >> some pointers in the right direction.
> >>
> >> Configuration: PE2950 dual-processor Xeon, 32GB RAM with an MD1000 using a
> >> zpool of 7 mirror vdevs. ESX 3.5 and 4.0. Pretty much a vanilla
> >> install across the board, no additional software other than the
> >> Adaptec StorMan to manage the disks.
> >>
> >> local performance via dd - 463MB/s write, 1GB/s read (8GB file)
> >> iSCSI performance - 90MB/s write, 120MB/s read (800MB file from a VM)
> >> NFS performance - 1.4MB/s write, 20MB/s read (800MB file from the
> >> Service Console, transfer of an 8GB file via the datastore browser)
> >>
> >> I just found the tool latencytop which points the finger at the ZIL
> >> (tip of the hat to Lejun Zhu). Ref: <http://www.infrageeks.com/zfs/nfsd.png>
> >> & <http://www.infrageeks.com/zfs/fsflush.png>. Log file:
> >> <http://www.infrageeks.com/zfs/latencytop.log>
> >>
> >> Now I can understand that there is a performance hit associated
> >> with this feature of ZFS for ensuring data integrity, but this
> >> drastic a difference makes no sense whatsoever. The pool is capable
> >> of handling natively (at worst) 120*7 IOPS and I'm not even seeing
> >> enough to saturate a USB thumb drive. This still doesn't answer why
> >> the read performance is so bad either. According to latencytop,
> >> the culprit would be genunix`cv_timedwait_sig rpcmod`svc
> >>
> >> From my searching it appears that there's no async setting for the
> >> osol nfsd, and ESX does not offer any mount controls to force an
> >> async connection. Other than putting in an SSD as a ZIL (which
> >> still strikes me as overkill for basic NFS services) I'm looking
> >> for any information that can bring me up to at least reasonable
> >> throughput.
> >>
> >> Would a dedicated 15K SAS drive help the situation by moving the
> >> ZIL traffic off to a dedicated device? Significantly? This is the
> >> sort of thing that I don't want to do without some reasonable
> >> assurance that it will help since you can't remove a ZIL device
> >> from a pool at the moment.
> >>
> >> Hints and tips appreciated,
> >>
> >> Erik
> >> _______________________________________________
> >> nfs-discuss mailing list
> >> nfs-discuss at opensolaris.org
> >
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss